path: root/Objects/unicodeobject.c
Commit message [Author, Age, Files, Lines]
* Fix the internals of our hash functions to use unsigned values during hash computation, as the overflow behavior of signed integers is undefined. [Gregory P. Smith, 2012-12-10, 1 file, -1/+1]
|\ | | NOTE: This change is smaller than the one in 3.2, as much of this cleanup had already been done. I added the comment that my change in 3.2 added so that the code would match up. Otherwise this just adds or synchronizes appropriate UL designations on some constants to be pedantic. In practice we require compiling everything with -fwrapv, which forces overflow to be defined as two's complement, but this keeps the code cleaner for checkers or in the case where someone has compiled it without -fwrapv or their compiler's equivalent. Found by Clang trunk's Undefined Behavior Sanitizer (UBSan). Cleanup only - no functionality or hash values change.
| * Fix the internals of our hash functions to use unsigned values during hash computation, as the overflow behavior of signed integers is undefined. [Gregory P. Smith, 2012-12-10, 1 file, -1/+1]
| | In practice we require compiling everything with -fwrapv, which forces overflow to be defined as two's complement, but this keeps the code cleaner for checkers or in the case where someone has compiled it without -fwrapv or their compiler's equivalent. Found by Clang trunk's Undefined Behavior Sanitizer (UBSan). Cleanup only - no functionality or hash values change.
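The commit above relies on unsigned arithmetic wrapping around in a well-defined way. As a rough illustration only, the following Python sketch emulates that wraparound by masking every intermediate result to 64 bits. The function name `wrapping_hash`, the 64-bit width and the multiplier are assumptions chosen for the example; this is not the code in unicodeobject.c.

```python
# Hypothetical illustration: emulate the modulo-2**64 wraparound that
# unsigned C arithmetic guarantees. Python ints never overflow, so the
# wraparound is made explicit with a mask.
MASK64 = (1 << 64) - 1  # assumed width of the unsigned hash type


def wrapping_hash(data: bytes) -> int:
    """Multiplicative hash over bytes with explicit 64-bit wraparound."""
    if not data:
        return 0
    x = (data[0] << 7) & MASK64
    for byte in data:
        # Multiply, then reduce modulo 2**64, as an unsigned type would.
        x = ((1000003 * x) & MASK64) ^ byte
    x ^= len(data)
    return x


if __name__ == "__main__":
    print(hex(wrapping_hash(b"unicodeobject")))
```

Because every step is reduced modulo 2**64, the result never depends on how a signed type would behave on overflow, which is the property the commit message is after.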
* | (Merge 3.2) Issue #16416: On Mac OS X, operating system data are now always encoded/decoded to/from UTF-8/surrogateescape, instead of the locale encoding (which may be ASCII if no locale environment variable is set), to avoid inconsistencies with the os.fsencode() and os.fsdecode() functions, which already use UTF-8/surrogateescape. [Victor Stinner, 2012-12-03, 1 file, -4/+5]
|\ \ | |/
| * Issue #16416: On Mac OS X, operating system data are now always encoded/decoded to/from UTF-8/surrogateescape, instead of the locale encoding (which may be ASCII if no locale environment variable is set), to avoid inconsistencies with the os.fsencode() and os.fsdecode() functions, which already use UTF-8/surrogateescape. [Victor Stinner, 2012-12-03, 1 file, -4/+5]
| |
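For readers unfamiliar with the UTF-8/surrogateescape pairing named in Issue #16416, here is a small sketch of how the error handler lets undecodable bytes survive a round trip. It uses only the standard codec machinery. The helper name `roundtrip` and the sample bytes are made up for the example.

```python
from typing import Tuple

# Sample input (hypothetical): valid UTF-8 for "é" followed by a stray
# 0xFF byte, which is never valid in UTF-8.
raw = b"caf\xc3\xa9-\xff.txt"


def roundtrip(data: bytes) -> Tuple[str, bytes]:
    """Decode and re-encode bytes with UTF-8 and surrogateescape."""
    # Undecodable bytes are mapped to lone surrogates U+DC80..U+DCFF.
    text = data.decode("utf-8", "surrogateescape")
    # Encoding with the same handler turns those surrogates back into
    # the original bytes, so no information is lost.
    back = text.encode("utf-8", "surrogateescape")
    return text, back


if __name__ == "__main__":
    text, back = roundtrip(raw)
    print(ascii(text))
    print(back == raw)
```

The lossless round trip is what makes this combination suitable for operating system data such as file names, which may contain arbitrary bytes.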
* | Issue #16215: Fix potential double memory free in str.replace(). [Antoine Pitrou, 2012-11-17, 1 file, -0/+2]
| | Patch by Serhiy Storchaka.
* | #8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. [Ezio Melotti, 2012-11-04, 1 file, -6/+4]
| | Patch by Serhiy Storchaka, tests by Ezio Melotti.
* | merge 3.2 (#16369) [Benjamin Peterson, 2012-10-30, 1 file, -0/+6]
|\ \ | |/
| * initialize more global type objects (closes #16369) [Benjamin Peterson, 2012-10-30, 1 file, -0/+6]
| |
| * Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting. [Mark Dickinson, 2012-10-28, 1 file, -2/+2]
| |
* | Issue #14783: Merge changes from 3.2. [Chris Jerdonek, 2012-10-07, 1 file, -1/+2]
|\ \ | |/
| * Issue #14783: Improve int() docstring and also str(), range(), and slice(). [Chris Jerdonek, 2012-10-07, 1 file, -1/+2]
| | This commit rewrites the docstring for int() to incorporate the documentation changes made in issue #16036. It also switches the docstrings for int(), str(), range(), and slice() to use multi-line signatures.
* | Issue #16096: Fix several occurrences of potential signed integer overflow. [Mark Dickinson, 2012-10-06, 1 file, -14/+9]
| | Thanks Serhiy Storchaka.
* | #16127: remove outdated references to narrow builds. Patch by Serhiy Storchaka. [Ezio Melotti, 2012-10-05, 1 file, -10/+4]
| |
* | Fix PyUnicode_Format(): return NULL if PyUnicode_READY(uformat) failed [Victor Stinner, 2012-10-05, 1 file, -1/+3]
| | This error cannot occur in practice: PyUnicode_FromObject() always returns a "ready" string.
* | Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings). [Antoine Pitrou, 2012-09-23, 1 file, -3/+4]
|\ \ | |/ | | Patch by Serhiy Storchaka.
| * Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings). [Antoine Pitrou, 2012-09-23, 1 file, -2/+26]
| | Patch by Serhiy Storchaka.
* | Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t. [Antoine Pitrou, 2012-09-20, 1 file, -9/+6]
| | Patch by Serhiy Storchaka.
* | Issue #15900: Fixed reference leak in PyUnicode_TranslateCharmap() [Christian Heimes, 2012-09-11, 1 file, -6/+5]
| |
* | Fixed memory leak in error branch of formatfloat(). CID 719687 [Christian Heimes, 2012-09-10, 1 file, -1/+3]
| |
* | Fix C++-style comment (xlc compilation failure) [Antoine Pitrou, 2012-09-02, 1 file, -1/+1]
| |
* | merge 3.2 (#15801) [Benjamin Peterson, 2012-08-28, 1 file, -2/+1]
|\ \ | |/
| * use the stricter PyMapping_Check (closes #15801) [Benjamin Peterson, 2012-08-28, 1 file, -2/+1]
| |
* | Issue #15728: Fix leak in PyUnicode_AsWideCharString(). Found by Coverity. [Stefan Krah, 2012-08-19, 1 file, -1/+3]
| |
* | Merge str docstring fix from 3.2 [Nick Coghlan, 2012-08-16, 1 file, -4/+8]
|\ \ | |/
| * Fix str docstring [Nick Coghlan, 2012-08-16, 1 file, -4/+8]
| |
| * Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling. [Antoine Pitrou, 2012-07-21, 1 file, -31/+21]
| | Patch by Serhiy Storchaka.
* | Use correct types for ASCII_CHAR_MASK integer constants. [Mark Dickinson, 2012-07-07, 1 file, -2/+2]
| |
* | Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels. [Antoine Pitrou, 2012-06-16, 1 file, -15/+48]
| | Patch by Serhiy Storchaka.
* | _copy_characters(): move debug code to the top to avoid noisy #ifdef [Victor Stinner, 2012-06-16, 1 file, -26/+23]
| | And don't use assert() anymore if check_maxchar is set: return -1 on error instead.
* | Fix PyUnicode_GetSize(): Don't replace _PyUnicode_Ready() exception [Victor Stinner, 2012-06-16, 1 file, -2/+3]
| |
* | Fix a compiler warning in _copy_characters() and remove debug code [Victor Stinner, 2012-06-16, 1 file, -10/+1]
| |
* | Oops, fix my previous change on _copy_characters() [Victor Stinner, 2012-06-16, 1 file, -2/+2]
| |
* | Fix unicode_adjust_maxchar(): catch PyUnicode_New() failure [Victor Stinner, 2012-06-16, 1 file, -1/+2]
| |
* | Fix "%f" format of str%args if the result is not an ASCII or latin1 string [Victor Stinner, 2012-06-16, 1 file, -17/+19]
| |
* | Remove debug code [Victor Stinner, 2012-06-16, 1 file, -8/+0]
| |
* | Optimize _PyUnicode_FastCopyCharacters() when maxchar(from) > maxchar(to) [Victor Stinner, 2012-06-16, 1 file, -55/+75]
| |
* | unicodeobject.c: Remove debug code [Victor Stinner, 2012-06-16, 1 file, -14/+0]
| |
* | Issue #15026: utf-16 encoding is now significantly faster (up to 10x). [Antoine Pitrou, 2012-06-15, 1 file, -47/+33]
| | Patch by Serhiy Storchaka.
* | Rearrange code to beat an optimizer bug affecting Release x64 on Windows with VS2010sp1 [Kristján Valur Jónsson, 2012-06-06, 1 file, -12/+10]
| |
* | Issue #14993: Use standard "unsigned char" instead of an unsigned char bitfield [Victor Stinner, 2012-06-04, 1 file, -10/+10]
| |
* | Issue #14909: A number of places were using PyMem_Realloc() APIs and PyObject_GC_Resize() with incorrect error handling. [Kristjan Valur Jonsson, 2012-05-31, 1 file, -2/+4]
| | In case of errors, the original object would be leaked. This checkin fixes those cases.
* | Issue #14744: Fix compilation on Windows (part 2) [Victor Stinner, 2012-05-29, 1 file, -1/+1]
| |
* | Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) [Victor Stinner, 2012-05-29, 1 file, -97/+265]
| |   * Formatting string, int, float and complex use the _PyUnicodeWriter API. It avoids a temporary buffer in most cases.
| |   * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just keep a reference to the string if the output is only composed of one string
| |   * Disable overallocation when formatting the last argument of str%args and str.format(args)
| |   * Overallocation allocates at least 100 characters: add min_length attribute to the _PyUnicodeWriter structure
| |   * Add new private functions: _PyUnicode_FastCopyCharacters(), _PyUnicode_FastFill() and _PyUnicode_FromASCII()
| | The speed up is around 20% on average.
* | Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. [Antoine Pitrou, 2012-05-15, 1 file, -198/+79]
| | Patch by Serhiy Storchaka.
* | Silence VS 2010 signed/unsigned warnings. [Martin v. Löwis, 2012-05-15, 1 file, -2/+5]
| |
* | Fix refleaks introduced by 83da67651687. [Antoine Pitrou, 2012-05-12, 1 file, -2/+8]
| |
* | Fix logic error introduced by 83da67651687. [Antoine Pitrou, 2012-05-12, 1 file, -2/+2]
| |
* | simplify by shortcutting when the kind of the needle is larger than the haystack [Benjamin Peterson, 2012-05-11, 1 file, -21/+11]
| |
* | Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka. [Antoine Pitrou, 2012-05-10, 1 file, -474/+165]
| |
* | Rename unicode_write_t structure and its methods to "_PyUnicodeWriter" [Victor Stinner, 2012-05-09, 1 file, -16/+16]
| |