path: root/Objects/unicodeobject.c
Commit message [Author, Age; Files, Lines]
* Issue #26200: Added Py_SETREF and replaced Py_XSETREF with Py_SETREF in places where Py_DECREF was used. [Serhiy Storchaka, 2016-04-10; 1 file, -4/+4]
|\
| * Issue #26200: Added Py_SETREF and replaced Py_XSETREF with Py_SETREF in places where Py_DECREF was used. [Serhiy Storchaka, 2016-04-10; 1 file, -4/+4]
| |
* | Issue #22570: Renamed Py_SETREF to Py_XSETREF. [Serhiy Storchaka, 2016-04-06; 1 file, -4/+4]
|\ \
| |/
| * Issue #22570: Renamed Py_SETREF to Py_XSETREF. [Serhiy Storchaka, 2016-04-06; 1 file, -4/+4]
| |
* | Issue #26494: Fixed crash on iterating exhausting iterators. [Serhiy Storchaka, 2016-03-30; 1 file, -1/+1]
|\ \
| |/
| |     Affected classes are generic sequence iterators, iterators of str, bytes, bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding views and os.scandir() iterator.
| * Issue #26494: Fixed crash on iterating exhausting iterators. [Serhiy Storchaka, 2016-03-30; 1 file, -1/+1]
| |     Affected classes are generic sequence iterators, iterators of str, bytes, bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding views and os.scandir() iterator.
* | Merge 3.5 [Victor Stinner, 2016-03-01; 1 file, -9/+12]
|\ \
| |/
| * Issue #26464: Fix unicode_fast_translate() again [Victor Stinner, 2016-03-01; 1 file, -9/+12]
| |     Initialize i variable if the string is non-ASCII.
* | Merge 3.5 [Victor Stinner, 2016-03-01; 1 file, -3/+4]
|\ \
| |/
| * Fix str.translate() [Victor Stinner, 2016-03-01; 1 file, -3/+4]
| |     Issue #26464: Fix str.translate() when the string is ASCII and the first replacement removes a character, but the next replacement uses a non-ASCII character or a string longer than 1 character. Regression introduced in Python 3.5.0.
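The input shape described in Issue #26464 can be sketched from Python as follows. This is a minimal sketch, assuming an interpreter that already contains the fix. The sample string and mapping tables are invented for this example and are not taken from the commit or its tests.

```python
# Hypothetical inputs: an ASCII string whose first mapped character is
# deleted, while a later one maps to non-ASCII text or to a longer string.
text = "abc"

# 'a' is removed (mapped to None), then 'b' maps to a non-ASCII character.
non_ascii = text.translate({ord("a"): None, ord("b"): "\u00e9"})

# 'a' is removed, then 'b' maps to a string longer than one character.
longer = text.translate({ord("a"): None, ord("b"): "xyz"})

# Expected results on a fixed interpreter.
assert non_ascii == "\u00e9c"
assert longer == "xyzc"
```

Both calls start with a deletion and then need a replacement that is non-ASCII or longer than one character, which is the combination the commit message names.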
* | Merge 3.5 [Victor Stinner, 2016-01-27; 1 file, -0/+2]
|\ \
| |/
| * Fix resize_compact() [Victor Stinner, 2016-01-27; 1 file, -0/+2]
| |     Issue #26217: resize_compact() must set wstr_length to 0 after freeing the wstr string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().
* | Issue #20440: More use of Py_SETREF. [Serhiy Storchaka, 2015-12-27; 1 file, -1/+1]
|\ \
| |/
| |     This patch is manually crafted and contains changes that couldn't be handled automatically.
| * Issue #20440: More use of Py_SETREF. [Serhiy Storchaka, 2015-12-27; 1 file, -1/+1]
| |     This patch is manually crafted and contains changes that couldn't be handled automatically.
* | Issue #25923: Added more const qualifiers to signatures of static and private functions. [Serhiy Storchaka, 2015-12-25; 1 file, -3/+3]
| |
* | Issue #25923: Added the const qualifier to static constant arrays. [Serhiy Storchaka, 2015-12-25; 1 file, -6/+6]
| |
* | Issue #20440: Massive replacement of unsafe attribute-setting code with the special macro Py_SETREF. [Serhiy Storchaka, 2015-12-24; 1 file, -8/+4]
|\ \
| |/
| * Issue #20440: Massive replacement of unsafe attribute-setting code with the special macro Py_SETREF. [Serhiy Storchaka, 2015-12-24; 1 file, -8/+4]
| |
* | Issues #25890, #25891, #25892: Removed unused variables in Windows code. [Serhiy Storchaka, 2015-12-18; 1 file, -1/+0]
| |     Reported by Alexander Riccio.
* | Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. [Serhiy Storchaka, 2015-12-03; 1 file, -0/+5]
|\ \
| |/
| * Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. [Serhiy Storchaka, 2015-12-03; 1 file, -0/+5]
| |\
| | * Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. [Serhiy Storchaka, 2015-12-03; 1 file, -0/+5]
| | |
* | | merge 3.5 (#25630) [Benjamin Peterson, 2015-11-15; 1 file, -0/+1]
|\ \ \
| |/ /
| * | make the PyUnicode_FSConverter cleanup set the decrefed argument to NULL (closes #25630) [Benjamin Peterson, 2015-11-15; 1 file, -0/+1]
| | |
* | | Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it into STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independently without special preconditions. [Serhiy Storchaka, 2015-11-14; 1 file, -17/+16]
| | |
* | | Issue #25523: Merge a-to-an corrections from 3.5. [Serhiy Storchaka, 2015-11-02; 1 file, -2/+2]
|\ \ \
| |/ /
| * | Issue #25523: Merge a-to-an corrections from 3.4. [Serhiy Storchaka, 2015-11-02; 1 file, -2/+2]
| |\ \
| | |/
| | * Issue #25523: Further a-to-an corrections. [Serhiy Storchaka, 2015-11-02; 1 file, -2/+2]
| | |
* | | Issue #25353: Optimize unicode escape and raw unicode escape encoders to use the new _PyBytesWriter API. [Victor Stinner, 2015-10-12; 1 file, -44/+69]
| | |
* | | Writer APIs: use empty string singletons [Victor Stinner, 2015-10-12; 1 file, -9/+18]
| | |     Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the empty bytes/Unicode string if the string is empty.
* | | Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual characters. [Victor Stinner, 2015-10-09; 1 file, -32/+40]
| | |     Cleanup unicode_encode_ucs1():
| | |     * Rename repunicode to rep
| | |     * Clear rep object on error
| | |     * Factorize code between bytes and unicode path
* | | Add _PyBytesWriter_WriteBytes() to factorize the code [Victor Stinner, 2015-10-09; 1 file, -5/+3]
| | |
* | | _PyBytesWriter: simplify code to avoid "prealloc" parameters [Victor Stinner, 2015-10-09; 1 file, -30/+28]
| | |     Subtract preallocated bytes from min_size before calling _PyBytesWriter_Prepare().
* | | Issue #25318: Fix backslashreplace() [Victor Stinner, 2015-10-09; 1 file, -1/+1]
| | |     Fix code to estimate the needed space.
* | | Issue #25318: Avoid sprintf() in backslashreplace() [Victor Stinner, 2015-10-09; 1 file, -7/+18]
| | |     Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors(). Also add unit tests for non-BMP characters.
* | | Issue #25318: Move _PyBytesWriter to bytesobject.c [Victor Stinner, 2015-10-09; 1 file, -210/+0]
| | |     Also declare the private API in bytesobject.h.
* | | Optimize backslashreplace error handler [Victor Stinner, 2015-10-09; 1 file, -49/+144]
| | |     Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Also optimize the backslashreplace error handler for ASCII and Latin1 encoders.
| | |     Use the new _PyBytesWriter API to optimize these error handlers for the encoders. This avoids creating an exception and calling the slow implementation of the error handler.
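The Python-level behavior of the two error handlers named in Issue #25318 can be sketched as follows. This is a minimal sketch that only shows what the handlers produce. It does not measure the optimization, and the sample strings are assumptions chosen for this example.

```python
# Sample text (an assumption for this sketch): one Latin-1 character,
# one BMP character outside Latin-1, and one non-BMP character.
text = "\u00e9\u20ac\U0001f600"

# backslashreplace substitutes a Python-style escape for each
# character the target encoding cannot represent.
ascii_bs = text.encode("ascii", "backslashreplace")
latin1_bs = text.encode("latin-1", "backslashreplace")

# xmlcharrefreplace substitutes a decimal XML character reference.
ascii_xml = text.encode("ascii", "xmlcharrefreplace")

# UTF-8 can encode every code point except lone surrogates, so a lone
# surrogate is what reaches the error handler in the UTF-8 encoder.
utf8_bs = "\udc80".encode("utf-8", "backslashreplace")
utf8_xml = "\udc80".encode("utf-8", "xmlcharrefreplace")

assert ascii_bs == b"\\xe9\\u20ac\\U0001f600"
assert latin1_bs == b"\xe9\\u20ac\\U0001f600"
assert ascii_xml == b"&#233;&#8364;&#128512;"
assert utf8_bs == b"\\udc80"
assert utf8_xml == b"&#56448;"
```

The non-BMP character is included because a later commit in this log mentions adding unit tests for non-BMP characters in backslashreplace().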
* | | Issue #25318: Add _PyBytesWriter API [Victor Stinner, 2015-10-09; 1 file, -69/+247]
| | |     Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation.
| | |     Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers.
| | |     unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.
* | | Issue #25301: Fix compatibility with ISO C90 [Victor Stinner, 2015-10-05; 1 file, -1/+5]
| | |
* | | Issue #25301: The UTF-8 decoder is now up to 15 times as fast for error handlers: ``ignore``, ``replace`` and ``surrogateescape``. [Victor Stinner, 2015-10-05; 1 file, -9/+39]
| | |
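What the three decoder error handlers named in Issue #25301 do with an invalid byte can be sketched as follows. This is a minimal sketch of the observable behavior only. It says nothing about the speedup, and the input bytes are an assumption chosen for this example.

```python
# 0xFF can never appear in valid UTF-8, so it always reaches the
# error handler. The surrounding bytes are plain ASCII.
data = b"a\xffb"

ignored = data.decode("utf-8", "ignore")            # invalid byte dropped
replaced = data.decode("utf-8", "replace")          # U+FFFD substituted
escaped = data.decode("utf-8", "surrogateescape")   # byte kept as U+DCFF

assert ignored == "ab"
assert replaced == "a\ufffdb"
assert escaped == "a\udcffb"

# surrogateescape is reversible: encoding with the same handler
# restores the original bytes.
assert escaped.encode("utf-8", "surrogateescape") == data
```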
* | | Fix _PyUnicodeWriter_PrepareKind() [Victor Stinner, 2015-10-03; 1 file, -7/+18]
| | |     Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that _PyUnicodeWriter_PrepareKind() correctly handles a read-only buffer: copy the buffer.
* | | Issue #24848: Fixed bugs in UTF-7 decoding of malformed data: [Serhiy Storchaka, 2015-10-02; 1 file, -9/+12]
|\ \ \
| |/ /
| | |     1. Non-ASCII bytes were accepted after shift sequence.
| | |     2. A low surrogate could be emitted in case of error in high surrogate.
| | |     3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
| * | Issue #24848: Fixed bugs in UTF-7 decoding of malformed data: [Serhiy Storchaka, 2015-10-02; 1 file, -9/+12]
| |\ \
| | |/
| | |     1. Non-ASCII bytes were accepted after shift sequence.
| | |     2. A low surrogate could be emitted in case of error in high surrogate.
| | |     3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
| | * Issue #24848: Fixed bugs in UTF-7 decoding of malformed data: [Serhiy Storchaka, 2015-10-02; 1 file, -9/+12]
| | |     1. Non-ASCII bytes were accepted after shift sequence.
| | |     2. A low surrogate could be emitted in case of error in high surrogate.
* | | Make _PyUnicode_TranslateCharmap() symbol private [Victor Stinner, 2015-10-01; 1 file, -1/+1]
| | |     unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().
* | | Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. [Victor Stinner, 2015-10-01; 1 file, -2/+5]
| | |     Patch co-written with Serhiy Storchaka.
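The four encoder error handlers named in Issue #25267 differ only in how they treat a lone surrogate, which is the one kind of code point the UTF-8 encoder rejects. The following is a minimal sketch of that behavior. It does not benchmark anything, and the sample string is an assumption chosen for this example.

```python
# A lone low surrogate (U+DC80) between two ASCII characters.
text = "a\udc80b"

ignored = text.encode("utf-8", "ignore")            # surrogate dropped
replaced = text.encode("utf-8", "replace")          # '?' substituted
escaped = text.encode("utf-8", "surrogateescape")   # U+DC80 -> byte 0x80
passed = text.encode("utf-8", "surrogatepass")      # encoded as 3 bytes

assert ignored == b"ab"
assert replaced == b"a?b"
assert escaped == b"a\x80b"
assert passed == b"a\xed\xb2\x80b"

# Without an error handler the default "strict" mode raises instead.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
assert strict_failed
```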
* | | Optimize ascii/latin1+surrogateescape encoders [Victor Stinner, 2015-09-29; 1 file, -0/+16]
| | |     Issue #25227: Optimize ASCII and latin1 encoders with the ``surrogateescape`` error handler: the encoders are now up to 3 times as fast.
| | |     Initial patch written by Serhiy Storchaka.
* | | Issue #25227: Cleanup unicode_encode_ucs1() error handler [Victor Stinner, 2015-09-24; 1 file, -9/+13]
| | |     * Change limit type from unsigned int to Py_UCS4, to use the same type as the "ch" variable (a Unicode character).
| | |     * Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE
| | |     * Add some newlines for readability
* | | Issue #24870: revert unwanted change [Victor Stinner, 2015-09-22; 1 file, -43/+9]
| | |     Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(
* | | Issue #25207, #14626: Fix my commit. [Victor Stinner, 2015-09-22; 1 file, -9/+43]
| | |     It doesn't work to use "#define XXX defined(YYY)" and then "#ifdef XXX" to check YYY.