summaryrefslogtreecommitdiff
path: root/Objects/unicodeobject.c
Commit message (Collapse)AuthorAgeFilesLines
* bpo-39573: Use Py_TYPE() macro in Objects directory (GH-18392)Victor Stinner2020-02-071-4/+4
| | | Replace direct access to PyObject.ob_type with Py_TYPE().
* bpo-39573: Add Py_SET_REFCNT() function (GH-18389)Victor Stinner2020-02-071-2/+2
| | | | Add a Py_SET_REFCNT() function to set the reference counter of an object.
* Add PyInterpreterState.fs_codec.utf8 (GH-18367)Victor Stinner2020-02-051-46/+47
| | | | | | Add a fast-path for UTF-8 encoding in PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize(). Add _PyUnicode_FiniEncodings() helper function for _PyUnicode_Fini().
* bpo-39542: Simplify _Py_NewReference() (GH-18332)Victor Stinner2020-02-031-1/+5
| | | | | | | | | * Remove _Py_INC_REFTOTAL and _Py_DEC_REFTOTAL macros: modify directly _Py_RefTotal. * _Py_ForgetReference() is no longer defined if the Py_TRACE_REFS macro is not defined. * Remove _Py_NewReference() implementation from object.c: unify the two implementations in object.h inline function. * Fix Py_TRACE_REFS build: _Py_INC_TPALLOCS() macro has been removed.
* bpo-38631: Avoid Py_FatalError() in unicodeobject.c (GH-18281)Victor Stinner2020-01-301-23/+28
| | | | | Replace Py_FatalError() calls with _PyErr_WriteUnraisableMsg(), _PyObject_ASSERT_FAILED_MSG() or Py_UNREACHABLE() in unicode_dealloc() and unicode_release_interned().
* Fix compiler warning in Objects/unicodeobject.c (GH-17440)Pablo Galindo2019-12-021-1/+1
|
* bpo-38896: Remove PyUnicode_ClearFreeList() function (GH-17354)Victor Stinner2019-11-231-9/+0
| | | | Remove PyUnicode_ClearFreeList() function: the Unicode free list has been removed in Python 3.3.
* bpo-38858: Call _PyUnicode_Fini() in Py_EndInterpreter() (GH-17330)Victor Stinner2019-11-221-16/+19
| | | Py_EndInterpreter() now clears the filesystem codec.
* bpo-28029: Make "".replace("", s, n) returning s for any n != 0. (GH-16981)Serhiy Storchaka2019-10-301-1/+4
|
* bpo-38409: Fix grammar in str.strip() docstring (GH-16682)Zachary Ware2019-10-091-2/+2
|
* bpo-36389: _PyObject_CheckConsistency() available in release mode (GH-16612)Victor Stinner2019-10-071-42/+42
| | | | | | | | | | | | | | | | | | | | | bpo-36389, bpo-38376: The _PyObject_CheckConsistency() function is now also available in release mode. For example, it can be used to debug a crash in the visit_decref() function of the GC. Modify the following functions to also work in release mode: * _PyDict_CheckConsistency() * _PyObject_CheckConsistency() * _PyType_CheckConsistency() * _PyUnicode_CheckConsistency() Other changes: * _PyMem_IsPtrFreed(ptr) now also returns 1 if ptr is NULL (equals to 0). * _PyBytesWriter_CheckConsistency() now returns 1 and is only used with assert(). * Reorder _PyObject_Dump() to write safe fields first, and only attempt to render repr() at the end.
* bpo-38353: Cleanup includes in the internal C API (GH-16548)Victor Stinner2019-10-021-1/+2
| | | | Use forward declaration of types to avoid includes in the internal C API. Add also comment to justify other includes.
* bpo-38236: Dump path config at first import error (GH-16300)Victor Stinner2019-09-231-9/+10
| | | | Python now dumps path configuration if it fails to import the Python codecs of the filesystem and stdio encodings.
* bpo-37206: Unrepresentable default values no longer represented as None. ↵Serhiy Storchaka2019-09-141-5/+5
| | | | | | | (GH-13933) In ArgumentClinic, value "NULL" should now be used only for unrepresentable default values (like in the optional third parameter of getattr). "None" should be used if None is accepted as argument and passing None has the same effect as not passing the argument at all.
* Fix unused variable and signed/unsigned warnings (GH-15537)Raymond Hettinger2019-08-271-0/+6
|
* bpo-36311: Fixes decoding multibyte characters around chunk boundaries and ↵Steve Dower2019-08-211-6/+10
| | | | improves decoding performance (GH-15083)
* bpo-37483: add _PyObject_CallOneArg() function (#14558)Jeroen Demeyer2019-07-041-6/+4
|
* bpo-37388: Add PyUnicode_Decode(str, 0) fast-path (GH-14385)Victor Stinner2019-06-261-0/+4
| | | Add a fast-path to PyUnicode_Decode() for size equals to 0.
* bpo-37388: Development mode check encoding and errors (GH-14341)Victor Stinner2019-06-261-5/+63
| | | | | | | | | In development mode and in debug build, encoding and errors arguments are now checked on string encoding and decoding operations. Examples: open(), str.encode() and bytes.decode(). By default, for best performances, the errors argument is only checked at the first encoding/decoding error, and the encoding argument is sometimes ignored for empty strings.
* bpo-24214: Fixed the UTF-8 and UTF-16 incremental decoders. (GH-14304)Serhiy Storchaka2019-06-251-3/+7
| | | | | | | * The UTF-8 incremental decoders fails now fast if encounter a sequence that can't be handled by the error handler. * The UTF-16 incremental decoders with the surrogatepass error handler decodes now a lone low surrogate with final=False.
* bpo-37348: optimize decoding ASCII string (GH-14283)Inada Naoki2019-06-241-34/+51
| | | `_PyUnicode_Writer` is a relatively complex structure. Initializing it is significant overhead when decoding short ASCII string.
* bpo-36710: Use tstate in pylifecycle.c (GH-14249)Victor Stinner2019-06-201-1/+3
| | | | In pylifecycle.c: pass tstate argument, rather than interp argument, to functions.
* bpo-36974: tp_print -> tp_vectorcall_offset and tp_reserved -> tp_as_async ↵Jeroen Demeyer2019-05-301-6/+6
| | | | | | | | | (GH-13464) Automatically replace tp_print -> tp_vectorcall_offset tp_compare -> tp_as_async tp_reserved -> tp_as_async
* remove unnecessary tp_dealloc (GH-13647)Inada Naoki2019-05-291-7/+1
|
* bpo-36763: Implement the PEP 587 (GH-13592)Victor Stinner2019-05-271-31/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add a whole new documentation page: "Python Initialization Configuration" * PyWideStringList_Append() return type is now PyStatus, instead of int * PyInterpreterState_New() now calls PyConfig_Clear() if PyConfig_InitPythonConfig() fails. * Rename files: * Python/coreconfig.c => Python/initconfig.c * Include/cpython/coreconfig.h => Include/cpython/initconfig.h * Include/internal/: pycore_coreconfig.h => pycore_initconfig.h * Rename structures * _PyCoreConfig => PyConfig * _PyPreConfig => PyPreConfig * _PyInitError => PyStatus * _PyWstrList => PyWideStringList * Rename PyConfig fields: * use_module_search_paths => module_search_paths_set * module_search_path_env => pythonpath_env * Rename PyStatus field: _func => func * PyInterpreterState: rename core_config field to config * Rename macros and functions: * _PyCoreConfig_SetArgv() => PyConfig_SetBytesArgv() * _PyCoreConfig_SetWideArgv() => PyConfig_SetArgv() * _PyCoreConfig_DecodeLocale() => PyConfig_SetBytesString() * _PyInitError_Failed() => PyStatus_Exception() * _Py_INIT_ERROR_TYPE_xxx enums => _PyStatus_TYPE_xxx * _Py_UnixMain() => Py_BytesMain() * _Py_ExitInitError() => Py_ExitStatusException() * _Py_PreInitializeFromArgs() => Py_PreInitializeFromBytesArgs() * _Py_PreInitializeFromWideArgs() => Py_PreInitializeFromArgs() * _Py_PreInitialize() => Py_PreInitialize() * _Py_RunMain() => Py_RunMain() * _Py_InitializeFromConfig() => Py_InitializeFromConfig() * _Py_INIT_XXX() => _PyStatus_XXX() * _Py_INIT_FAILED() => _PyStatus_EXCEPTION() * Rename 'err' PyStatus variables to 'status' * Convert RUN_CODE() macro to config_run_code() static inline function * Remove functions: * _Py_InitializeFromArgs() * _Py_InitializeFromWideArgs() * _PyInterpreterState_GetCoreConfig()
* bpo-36946: Fix possible signed integer overflow when handling slices. (GH-13375)Zackery Spytz2019-05-171-1/+2
| | | | | | | The final addition (cur += step) may overflow, so use size_t for "cur". "cur" is always positive (even for negative steps), so it is safe to use size_t here. Co-Authored-By: Martin Panter <vadmium+py@gmail.com>
* bpo-36594: Fix incorrect use of %p in format strings (GH-12769)Zackery Spytz2019-05-061-3/+3
| | | In addition, fix some other minor violations of C99.
* bpo-36775: _PyCoreConfig only uses wchar_t* (GH-13062)Victor Stinner2019-05-021-72/+242
| | | | | | | | | | | | | | | | | _PyCoreConfig: Change filesystem_encoding, filesystem_errors, stdio_encoding and stdio_errors fields type from char* to wchar_t*. Changes: * PyInterpreterState: replace fscodec_initialized (int) with fs_codec structure. * Add get_error_handler_wide() and unicode_encode_utf8() helper functions. * Add error_handler parameter to unicode_encode_locale() and unicode_decode_locale(). * Remove _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideString() to _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideStringFromString() to _PyCoreConfig_DecodeLocale().
* bpo-36775: Add _PyUnicode_InitEncodings() (GH-13057)Victor Stinner2019-05-021-0/+97
| | | | | | Move get_codec_name() and initfsencoding() from pylifecycle.c to unicodeobject.c. Rename also "init" functions in pylifecycle.c.
* bpo-36775: Add _Py_FORCE_UTF8_FS_ENCODING macro (GH-13056)Victor Stinner2019-05-021-2/+2
| | | | | | | Add _Py_FORCE_UTF8_LOCALE and _Py_FORCE_UTF8_FS_ENCODING macros to avoid factorize "#if defined(__ANDROID__) || defined(__VXWORKS__)" and "#if defined(__APPLE__)". Cleanup also config_init_fs_encoding().
* bpo-36389: Add _PyObject_CheckConsistency() function (GH-12803)Victor Stinner2019-04-121-49/+44
| | | | | | Add a new _PyObject_CheckConsistency() function which can be used to help debugging. The function is available in release mode. Add a 'check_content' parameter to _PyDict_CheckConsistency().
* bpo-36549: str.capitalize now titlecases the first character instead of ↵Kingsley M2019-04-121-1/+1
| | | | uppercasing it (GH-12804)
* bpo-24214: Fixed the UTF-8 incremental decoder. (GH-12603)Serhiy Storchaka2019-03-301-0/+3
| | | | The bug occurred when the encoded surrogate character is passed to the incremental decoder in two chunks.
* bpo-36312: Fix decoders for some code pages. (GH-12369)Serhiy Storchaka2019-03-201-5/+16
|
* bpo-36356: Release Unicode interned strings on Valgrind (#12431)Victor Stinner2019-03-191-15/+42
| | | | | | | | | | | When Python is compiled with Valgrind support, release Unicode interned strings at exit in _PyUnicode_Fini(). * Rename _Py_ReleaseInternedUnicodeStrings() to unicode_release_interned() and make it private. * unicode_release_interned() is now called from _PyUnicode_Fini(): it must be called with a running Python thread state for TRASHCAN, it cannot be called from pymain_free(). * Don't display statistics on interned strings at exit anymore
* bpo-36301: Error if decoding pybuilddir.txt fails (GH-12422)Victor Stinner2019-03-191-2/+11
| | | | | | | | Python initialization now fails if decoding pybuilddir.txt configuration file fails at startup. _PyPathConfig_Calculate() now reports memory allocation failure and decoding error on decoding pybuilddir.txt content from UTF-8/surrogateescape.
* bpo-36297: remove "unicode_internal" codec (GH-12342)Inada Naoki2019-03-181-102/+0
|
* bpo-35713: Split _Py_InitializeCore into subfunctions (GH-11650)Victor Stinner2019-01-221-1/+0
| | | | | | | | | | | | | | * Split _Py_InitializeCore_impl() into subfunctions: add multiple pycore_init_xxx() functions * Preliminary sys.stderr is now set earlier to get an usable sys.stderr ealier. * Move code into _Py_Initialize_ReconfigureCore() to be able to call it from _Py_InitializeCore(). * Split _PyExc_Init(): create a new _PyBuiltins_AddExceptions() function. * Call _PyExc_Init() earlier in _Py_InitializeCore_impl() and new_interpreter() to get working exceptions earlier. * _Py_ReadyTypes() now returns _PyInitError rather than calling Py_FatalError(). * Misc code cleanup
* bpo-35713: Rework Python initialization (GH-11647)Victor Stinner2019-01-221-14/+18
| | | | | | | | | | | | | | | | | | | * The PyByteArray_Init() and PyByteArray_Fini() functions have been removed. They did nothing since Python 2.7.4 and Python 3.2.0, were excluded from the limited API (stable ABI), and were not documented. * Move "_PyXXX_Init()" and "_PyXXX_Fini()" declarations from Include/cpython/pylifecycle.h to Include/internal/pycore_pylifecycle.h. Replace "PyAPI_FUNC(TYPE)" with "extern TYPE". * _PyExc_Init() now returns an error on failure rather than calling Py_FatalError(). Move macros inside _PyExc_Init() and undefine them when done. Rewrite macros to make them look more like statement: add ";" when using them, add "do { ... } while (0)". * _PyUnicode_Init() now returns a _PyInitError error rather than call Py_FatalError(). * Move stdin check from _PySys_BeginInit() to init_sys_streams(). * _Py_ReadyTypes() now returns a _PyInitError error rather than calling Py_FatalError().
* bpo-35552: Fix reading past the end in PyUnicode_FromFormat() and ↵Serhiy Storchaka2019-01-121-3/+9
| | | | | | PyBytes_FromFormat(). (GH-11276) Format characters "%s" and "%V" in PyUnicode_FromFormat() and "%s" in PyBytes_FromFormat() no longer read memory past the limit if precision is specified.
* bpo-35560: Remove assertion from format(float, "n") (GH-11288)Xtreak2019-01-071-1/+1
| | | | | Fix an assertion error in format() in debug build for floating point formatting with "n" format, zero padding and small width. Release build is not impacted. Patch by Karthikeyan Singaravelan.
* bpo-35636: Remove redundant check in unicode_hash(). (GH-11402)animalize2019-01-021-10/+1
| | | | _Py_HashBytes() does the check for empty string.
* bpo-35444: Unify and optimize the helper for getting a builtin object. ↵Serhiy Storchaka2018-12-111-2/+3
| | | | | | | | (GH-11047) This speeds up pickling of some iterators. This fixes also error handling in pickling methods when fail to look up builtin "getattr".
* bpo-35365: Use a wchar_t* buffer in the code page decoder. (GH-10837)Serhiy Storchaka2018-12-041-60/+52
|
* bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)Serhiy Storchaka2018-12-031-5/+4
|
* bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759)Victor Stinner2018-11-281-7/+5
| | | | | | | | | Fix memory leak in PyUnicode_EncodeLocale() and PyUnicode_EncodeFSDefault() on error handling. Changes: * Fix unicode_encode_locale() error handling * Fix test_codecs.LocaleCodecTest
* bpo-33954: Fix compiler warning in _PyUnicode_FastFill() (GH-10737)Victor Stinner2018-11-271-1/+4
| | | | | | | 'data' argument of unicode_fill() is modified, so it must not be constant. Add more assertions to unicode_fill(): check the maximum character value.
* bpo-33012: Fix invalid function cast warnings with gcc 8. (GH-6749)Serhiy Storchaka2018-11-271-1/+1
| | | | | | Fix invalid function cast warnings with gcc 8 for method conventions different from METH_NOARGS, METH_O and METH_VARARGS excluding Argument Clinic generated code.
* bpo-33954: Fix _PyUnicode_InsertThousandsGrouping() (GH-10623)Victor Stinner2018-11-261-98/+165
| | | | | | | | | | | | Fix str.format(), float.__format__() and complex.__format__() methods for non-ASCII decimal point when using the "n" formatter. Changes: * Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires a _PyUnicodeWriter object for the buffer and a Python str object for digits. * Rename FILL() macro to unicode_fill(), convert it to static inline function, add "assert(0 <= start);" and rework its code.
* bpo-35059: Cast void* to PyObject* (GH-10650)Victor Stinner2018-11-221-3/+6
| | | Don't pass void* to Python macros: use _PyObject_CAST().