summaryrefslogtreecommitdiff
path: root/Objects/unicodeobject.c
Commit message (Collapse)AuthorAgeFilesLines
* bpo-39939: Add str.removeprefix and str.removesuffix (GH-18939)sweeneyde2020-04-221-0/+57
| | | | | Added str.removeprefix and str.removesuffix methods and corresponding bytes, bytearray, and collections.UserString methods to remove affixes from a string if present. See PEP 616 for a full description.
* bpo-40268: Remove a few pycore_pystate.h includes (GH-19510)Victor Stinner2020-04-141-2/+3
|
* bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509)Victor Stinner2020-04-141-4/+4
| | | | | | | Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET() for consistency with _PyThreadState_GET() and to have a shorter name (help to fit into 80 columns). Add also "assert(tstate != NULL);" to the function.
* bpo-40268: Add _PyInterpreterState_GetConfig() (GH-19492)Victor Stinner2020-04-131-7/+9
| | | | | | | | Don't access PyInterpreterState.config member directly anymore, but use new functions: * _PyInterpreterState_GetConfig() * _PyInterpreterState_SetConfig() * _Py_GetConfig()
* bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. ↵Serhiy Storchaka2020-04-121-3/+3
| | | | (GH-19472)
* bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode ↵Serhiy Storchaka2020-04-111-133/+162
| | | | data. (GH-19345)
* bpo-40170: Add _PyIndex_Check() internal function (GH-19426)Victor Stinner2020-04-081-1/+2
| | | | | | | | | Add _PyIndex_Check() function to the internal C API: fast inlined verson of PyIndex_Check(). Add Include/internal/pycore_abstract.h header file. Replace PyIndex_Check() with _PyIndex_Check() in C files of Objects and Python subdirectories.
* bpo-37388: Don't check encoding/errors during finalization (GH-19409)Victor Stinner2020-04-071-0/+6
| | | | | | | | | str.encode() and str.decode() no longer check the encoding and errors in development mode or in debug mode during Python finalization. The codecs machinery can no longer work on very late calls to str.encode() and str.decode(). This change should help to call _PyObject_Dump() to debug during late Python finalization.
* bpo-40130: _PyUnicode_AsKind() should not be exported. (GH-19265)Serhiy Storchaka2020-04-011-49/+46
| | | | | Make it a static function, and pass known attributes (kind, data, length) instead of the PyUnicode object.
* Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985)Inada Naoki2020-03-141-35/+0
| | | | | | | * Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)" This reverts commit c7ad974d341d3edb6b9d2a2dcae4d3d4794ada6b. * Update unicodeobject.h
* bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)Inada Naoki2020-03-141-0/+35
| | | Co-authored-by: Victor Stinner <vstinner@python.org>
* bpo-39573: Finish converting to new Py_IS_TYPE() macro (GH-18601)Andy Lester2020-03-041-2/+2
|
* bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327)Inada Naoki2020-02-271-25/+73
| | | Avoid using temporary bytes object.
* closes bpo-39684: Combine two if/thens and squash uninit var warning. (GH-18565)Andy Lester2020-02-201-8/+3
|
* bpo-39500: Fix compile warnings in unicodeobject.c (GH-18519)Hai Shi2020-02-171-2/+2
|
* bpo-35081: Move bytes_methods.h to the internal C API (GH-18492)Victor Stinner2020-02-121-1/+1
| | | | | Move the bytes_methods.h header file to the internal C API as pycore_bytes_methods.h: it only contains private symbols (prefixed by "_Py"), except of the PyDoc_STRVAR_shared() macro.
* bpo-39605: Remove a cast that causes a warning. (GH-18473)Benjamin Peterson2020-02-111-1/+1
|
* closes bpo-39605: Fix some casts to not cast away const. (GH-18453)Andy Lester2020-02-111-15/+15
| | | | | | | | | | | | | | | gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either: Adding the const to the type cast, as in: - return _PyUnicode_FromUCS1((unsigned char*)s, size); + return _PyUnicode_FromUCS1((const unsigned char*)s, size); or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in: - PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno); + PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno); These changes will not change code, but they will make it much easier to check for errors in consts
* bpo-39245: Switch to public API for Vectorcall (GH-18460)Petr Viktorin2020-02-111-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The bulk of this patch was generated automatically with: for name in \ PyObject_Vectorcall \ Py_TPFLAGS_HAVE_VECTORCALL \ PyObject_VectorcallMethod \ PyVectorcall_Function \ PyObject_CallOneArg \ PyObject_CallMethodNoArgs \ PyObject_CallMethodOneArg \ ; do echo $name git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g" done old=_PyObject_FastCallDict new=PyObject_VectorcallDict git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g" and then cleaned up: - Revert changes to in docs & news - Revert changes to backcompat defines in headers - Nudge misaligned comments
* bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397)Victor Stinner2020-02-111-14/+33
| | | | PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the string is not ready.
* bpo-39573: Use Py_TYPE() macro in Objects directory (GH-18392)Victor Stinner2020-02-071-4/+4
| | | Replace direct access to PyObject.ob_type with Py_TYPE().
* bpo-39573: Add Py_SET_REFCNT() function (GH-18389)Victor Stinner2020-02-071-2/+2
| | | | Add a Py_SET_REFCNT() function to set the reference counter of an object.
* Add PyInterpreterState.fs_codec.utf8 (GH-18367)Victor Stinner2020-02-051-46/+47
| | | | | | Add a fast-path for UTF-8 encoding in PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize(). Add _PyUnicode_FiniEncodings() helper function for _PyUnicode_Fini().
* bpo-39542: Simplify _Py_NewReference() (GH-18332)Victor Stinner2020-02-031-1/+5
| | | | | | | | | * Remove _Py_INC_REFTOTAL and _Py_DEC_REFTOTAL macros: modify directly _Py_RefTotal. * _Py_ForgetReference() is no longer defined if the Py_TRACE_REFS macro is not defined. * Remove _Py_NewReference() implementation from object.c: unify the two implementations in object.h inline function. * Fix Py_TRACE_REFS build: _Py_INC_TPALLOCS() macro has been removed.
* bpo-38631: Avoid Py_FatalError() in unicodeobject.c (GH-18281)Victor Stinner2020-01-301-23/+28
| | | | | Replace Py_FatalError() calls with _PyErr_WriteUnraisableMsg(), _PyObject_ASSERT_FAILED_MSG() or Py_UNREACHABLE() in unicode_dealloc() and unicode_release_interned().
* Fix compiler warning in Objects/unicodeobject.c (GH-17440)Pablo Galindo2019-12-021-1/+1
|
* bpo-38896: Remove PyUnicode_ClearFreeList() function (GH-17354)Victor Stinner2019-11-231-9/+0
| | | | Remove PyUnicode_ClearFreeList() function: the Unicode free list has been removed in Python 3.3.
* bpo-38858: Call _PyUnicode_Fini() in Py_EndInterpreter() (GH-17330)Victor Stinner2019-11-221-16/+19
| | | Py_EndInterpreter() now clears the filesystem codec.
* bpo-28029: Make "".replace("", s, n) returning s for any n != 0. (GH-16981)Serhiy Storchaka2019-10-301-1/+4
|
* bpo-38409: Fix grammar in str.strip() docstring (GH-16682)Zachary Ware2019-10-091-2/+2
|
* bpo-36389: _PyObject_CheckConsistency() available in release mode (GH-16612)Victor Stinner2019-10-071-42/+42
| | | | | | | | | | | | | | | | | | | | | bpo-36389, bpo-38376: The _PyObject_CheckConsistency() function is now also available in release mode. For example, it can be used to debug a crash in the visit_decref() function of the GC. Modify the following functions to also work in release mode: * _PyDict_CheckConsistency() * _PyObject_CheckConsistency() * _PyType_CheckConsistency() * _PyUnicode_CheckConsistency() Other changes: * _PyMem_IsPtrFreed(ptr) now also returns 1 if ptr is NULL (equals to 0). * _PyBytesWriter_CheckConsistency() now returns 1 and is only used with assert(). * Reorder _PyObject_Dump() to write safe fields first, and only attempt to render repr() at the end.
* bpo-38353: Cleanup includes in the internal C API (GH-16548)Victor Stinner2019-10-021-1/+2
| | | | Use forward declaration of types to avoid includes in the internal C API. Add also comment to justify other includes.
* bpo-38236: Dump path config at first import error (GH-16300)Victor Stinner2019-09-231-9/+10
| | | | Python now dumps path configuration if it fails to import the Python codecs of the filesystem and stdio encodings.
* bpo-37206: Unrepresentable default values no longer represented as None. ↵Serhiy Storchaka2019-09-141-5/+5
| | | | | | | (GH-13933) In ArgumentClinic, value "NULL" should now be used only for unrepresentable default values (like in the optional third parameter of getattr). "None" should be used if None is accepted as argument and passing None has the same effect as not passing the argument at all.
* Fix unused variable and signed/unsigned warnings (GH-15537)Raymond Hettinger2019-08-271-0/+6
|
* bpo-36311: Fixes decoding multibyte characters around chunk boundaries and ↵Steve Dower2019-08-211-6/+10
| | | | improves decoding performance (GH-15083)
* bpo-37483: add _PyObject_CallOneArg() function (#14558)Jeroen Demeyer2019-07-041-6/+4
|
* bpo-37388: Add PyUnicode_Decode(str, 0) fast-path (GH-14385)Victor Stinner2019-06-261-0/+4
| | | Add a fast-path to PyUnicode_Decode() for size equals to 0.
* bpo-37388: Development mode check encoding and errors (GH-14341)Victor Stinner2019-06-261-5/+63
| | | | | | | | | In development mode and in debug build, encoding and errors arguments are now checked on string encoding and decoding operations. Examples: open(), str.encode() and bytes.decode(). By default, for best performances, the errors argument is only checked at the first encoding/decoding error, and the encoding argument is sometimes ignored for empty strings.
* bpo-24214: Fixed the UTF-8 and UTF-16 incremental decoders. (GH-14304)Serhiy Storchaka2019-06-251-3/+7
| | | | | | | * The UTF-8 incremental decoders fails now fast if encounter a sequence that can't be handled by the error handler. * The UTF-16 incremental decoders with the surrogatepass error handler decodes now a lone low surrogate with final=False.
* bpo-37348: optimize decoding ASCII string (GH-14283)Inada Naoki2019-06-241-34/+51
| | | `_PyUnicode_Writer` is a relatively complex structure. Initializing it is significant overhead when decoding short ASCII string.
* bpo-36710: Use tstate in pylifecycle.c (GH-14249)Victor Stinner2019-06-201-1/+3
| | | | In pylifecycle.c: pass tstate argument, rather than interp argument, to functions.
* bpo-36974: tp_print -> tp_vectorcall_offset and tp_reserved -> tp_as_async ↵Jeroen Demeyer2019-05-301-6/+6
| | | | | | | | | (GH-13464) Automatically replace tp_print -> tp_vectorcall_offset tp_compare -> tp_as_async tp_reserved -> tp_as_async
* remove unnecessary tp_dealloc (GH-13647)Inada Naoki2019-05-291-7/+1
|
* bpo-36763: Implement the PEP 587 (GH-13592)Victor Stinner2019-05-271-31/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add a whole new documentation page: "Python Initialization Configuration" * PyWideStringList_Append() return type is now PyStatus, instead of int * PyInterpreterState_New() now calls PyConfig_Clear() if PyConfig_InitPythonConfig() fails. * Rename files: * Python/coreconfig.c => Python/initconfig.c * Include/cpython/coreconfig.h => Include/cpython/initconfig.h * Include/internal/: pycore_coreconfig.h => pycore_initconfig.h * Rename structures * _PyCoreConfig => PyConfig * _PyPreConfig => PyPreConfig * _PyInitError => PyStatus * _PyWstrList => PyWideStringList * Rename PyConfig fields: * use_module_search_paths => module_search_paths_set * module_search_path_env => pythonpath_env * Rename PyStatus field: _func => func * PyInterpreterState: rename core_config field to config * Rename macros and functions: * _PyCoreConfig_SetArgv() => PyConfig_SetBytesArgv() * _PyCoreConfig_SetWideArgv() => PyConfig_SetArgv() * _PyCoreConfig_DecodeLocale() => PyConfig_SetBytesString() * _PyInitError_Failed() => PyStatus_Exception() * _Py_INIT_ERROR_TYPE_xxx enums => _PyStatus_TYPE_xxx * _Py_UnixMain() => Py_BytesMain() * _Py_ExitInitError() => Py_ExitStatusException() * _Py_PreInitializeFromArgs() => Py_PreInitializeFromBytesArgs() * _Py_PreInitializeFromWideArgs() => Py_PreInitializeFromArgs() * _Py_PreInitialize() => Py_PreInitialize() * _Py_RunMain() => Py_RunMain() * _Py_InitializeFromConfig() => Py_InitializeFromConfig() * _Py_INIT_XXX() => _PyStatus_XXX() * _Py_INIT_FAILED() => _PyStatus_EXCEPTION() * Rename 'err' PyStatus variables to 'status' * Convert RUN_CODE() macro to config_run_code() static inline function * Remove functions: * _Py_InitializeFromArgs() * _Py_InitializeFromWideArgs() * _PyInterpreterState_GetCoreConfig()
* bpo-36946: Fix possible signed integer overflow when handling slices. (GH-13375)Zackery Spytz2019-05-171-1/+2
| | | | | | | The final addition (cur += step) may overflow, so use size_t for "cur". "cur" is always positive (even for negative steps), so it is safe to use size_t here. Co-Authored-By: Martin Panter <vadmium+py@gmail.com>
* bpo-36594: Fix incorrect use of %p in format strings (GH-12769)Zackery Spytz2019-05-061-3/+3
| | | In addition, fix some other minor violations of C99.
* bpo-36775: _PyCoreConfig only uses wchar_t* (GH-13062)Victor Stinner2019-05-021-72/+242
| | | | | | | | | | | | | | | | | _PyCoreConfig: Change filesystem_encoding, filesystem_errors, stdio_encoding and stdio_errors fields type from char* to wchar_t*. Changes: * PyInterpreterState: replace fscodec_initialized (int) with fs_codec structure. * Add get_error_handler_wide() and unicode_encode_utf8() helper functions. * Add error_handler parameter to unicode_encode_locale() and unicode_decode_locale(). * Remove _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideString() to _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideStringFromString() to _PyCoreConfig_DecodeLocale().
* bpo-36775: Add _PyUnicode_InitEncodings() (GH-13057)Victor Stinner2019-05-021-0/+97
| | | | | | Move get_codec_name() and initfsencoding() from pylifecycle.c to unicodeobject.c. Rename also "init" functions in pylifecycle.c.
* bpo-36775: Add _Py_FORCE_UTF8_FS_ENCODING macro (GH-13056)Victor Stinner2019-05-021-2/+2
| | | | | | | Add _Py_FORCE_UTF8_LOCALE and _Py_FORCE_UTF8_FS_ENCODING macros to avoid factorize "#if defined(__ANDROID__) || defined(__VXWORKS__)" and "#if defined(__APPLE__)". Cleanup also config_init_fs_encoding().