| Commit message | Author | Age | Files | Lines |
| |\
| |
| |
| | |
in places where Py_DECREF was used.
|
| | |
| |
| |
| | |
in places where Py_DECREF was used.
|
| |\ \
| |/ |
|
| | | |
|
| |\ \
| |/
| |
| |
| |
| | |
Affected classes are generic sequence iterators, iterators of str, bytes,
bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding
views and os.scandir() iterator.
|
| | |
| |
| |
| |
| |
| | |
Affected classes are generic sequence iterators, iterators of str, bytes,
bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding
views and os.scandir() iterator.
|
| |\ \
| |/ |
|
| | |
| |
| |
| | |
Initialize i variable if the string is non-ASCII.
|
| |\ \
| |/ |
|
| | |
| |
| |
| |
| |
| | |
Issue #26464: Fix str.translate() when the string is ASCII and the first
replacement removes a character, but a later replacement uses a non-ASCII
character or a string longer than 1 character. Regression introduced in Python 3.5.0.
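The case described above can be reproduced from Python. This is a minimal sketch, not the regression test from the patch: the input string and both translation tables are made up for illustration. The results shown are what an interpreter with the fix produces.

```python
# Hypothetical inputs chosen to match the description: an ASCII string whose
# first replacement deletes a character and whose next replacement is either
# non-ASCII or longer than one character.
table_non_ascii = {ord("a"): None, ord("b"): "\xe9"}
table_long = {ord("a"): None, ord("b"): "xyz"}

result_non_ascii = "abc".translate(table_non_ascii)
result_long = "abc".translate(table_long)

print(ascii(result_non_ascii))  # '\xe9c'
print(result_long)              # xyzc
```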
|
| |\ \
| |/ |
|
| | |
| |
| |
| |
| | |
Issue #26217: resize_compact() must set wstr_length to 0 after freeing the wstr
string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().
|
| |\ \
| |/
| |
| |
| | |
This patch is manually crafted and contains changes that couldn't be handled
automatically.
|
| | |
| |
| |
| |
| | |
This patch is manually crafted and contains changes that couldn't be handled
automatically.
|
| | |
| |
| |
| | |
private functions.
|
| | | |
|
| |\ \
| |/
| |
| | |
macro Py_SETREF.
|
| | |
| |
| |
| | |
macro Py_SETREF.
|
| | |
| |
| |
| | |
Reported by Alexander Riccio.
|
| |\ \
| |/ |
|
| | |\ |
|
| | | | |
|
| |\ \ \
| |/ / |
|
| | | |
| | |
| | |
| | | |
(closes #25630)
|
| | | |
| | |
| | |
| | |
| | | |
STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independently
without special preconditions.
|
| |\ \ \
| |/ / |
|
| | |\ \
| | |/ |
|
| | | | |
|
| | | |
| | |
| | |
| | | |
the new _PyBytesWriter API.
|
| | | |
| | |
| | |
| | |
| | | |
Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the
empty bytes/Unicode string if the string is empty.
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual
characters.
Cleanup unicode_encode_ucs1():
* Rename repunicode to rep
* Clear rep object on error
* Factor out the code shared by the bytes and unicode paths
|
| | | | |
|
| | | |
| | |
| | |
| | |
| | | |
Subtract the preallocated bytes from min_size before calling
_PyBytesWriter_Prepare().
|
| | | |
| | |
| | |
| | | |
Fix code to estimate the needed space.
|
| | | |
| | |
| | |
| | |
| | |
| | | |
Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().
Also add unit tests for non-BMP characters.
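For reference, the escape widths that the backslashreplace handler is expected to produce can be checked from Python. This is only a sketch of the observable behaviour, not the unit tests added by the commit. The sample characters are arbitrary, and `codecs.backslashreplace_errors` is used here as the Python-level counterpart of PyCodec_BackslashReplaceErrors().

```python
import codecs

# Arbitrary sample characters: Latin-1 range, BMP, and non-BMP.
latin1_escape = "\xe9".encode("ascii", "backslashreplace")
bmp_escape = "\u20ac".encode("ascii", "backslashreplace")
non_bmp_escape = "\U0001f600".encode("ascii", "backslashreplace")

print(latin1_escape)   # b'\\xe9'
print(bmp_escape)      # b'\\u20ac'
print(non_bmp_escape)  # b'\\U0001f600'

# Calling the generic error handler directly gives the same escape text,
# together with the position at which encoding should resume.
error = UnicodeEncodeError("ascii", "\U0001f600", 0, 1, "illustrative error")
replacement, resume_at = codecs.backslashreplace_errors(error)
print(replacement, resume_at)  # \U0001f600 1
```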
|
| | | |
| | |
| | |
| | | |
Also declare the private API in bytesobject.h.
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in
UTF-8 encoder. Also optimize the backslashreplace error handler for ASCII and
Latin1 encoders.
Use the new _PyBytesWriter API to optimize these error handlers for the
encoders. This avoids creating an exception and calling the slow implementation
of the error handler.
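The optimization should not change results, only speed. As a sketch of the behaviour being optimized, the snippet below runs the two handlers on the UTF-8 encoder, which only fails on lone surrogates, and backslashreplace on the Latin1 encoder. The sample strings are made up for illustration.

```python
# A lone surrogate is the only kind of code point the UTF-8 encoder rejects.
text = "a\udc80b"

utf8_backslash = text.encode("utf-8", "backslashreplace")
utf8_xmlcharref = text.encode("utf-8", "xmlcharrefreplace")

# Latin1 rejects anything above U+00FF, for example the euro sign.
latin1_backslash = "\u20ac".encode("latin-1", "backslashreplace")

print(utf8_backslash)    # b'a\\udc80b'
print(utf8_xmlcharref)   # b'a&#56448;b'
print(latin1_backslash)  # b'\\u20ac'
```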
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a new private API to optimize Unicode encoders. It uses a small buffer
allocated on the stack and supports overallocation.
Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable
overallocation for the UTF-8 encoder with error handlers.
unicode_encode_ucs1(): initialize collend to collstart+1 to avoid checking the
current character twice: we already know that it is not ASCII.
|
| | | | |
|
| | | |
| | |
| | |
| | | |
handlers: ``ignore``, ``replace`` and ``surrogateescape``.
|
| | | |
| | |
| | |
| | |
| | |
| | | |
Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that
_PyUnicodeWriter_PrepareKind() correctly handles a read-only buffer: copy the
buffer.
|
| |\ \ \
| |/ /
| | |
| | |
| | |
| | |
| | | |
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
|
| | |\ \
| | |/
| | |
| | |
| | |
| | |
| | | |
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
|
| | | |
| | |
| | |
| | |
| | | |
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
|
| | | |
| | |
| | |
| | | |
unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().
|
| | | |
| | |
| | |
| | |
| | | |
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
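The entry above does not say which codec is involved. Assuming the UTF-8 encoder for illustration, the four listed handlers behave as follows on a lone surrogate (the sample string is arbitrary):

```python
# Assumption: UTF-8 encoder; the log entry itself does not name the codec.
text = "a\udc80b"

results = {
    handler: text.encode("utf-8", handler)
    for handler in ("ignore", "replace", "surrogateescape", "surrogatepass")
}

for handler, encoded in results.items():
    print(handler, encoded)
# ignore b'ab'
# replace b'a?b'
# surrogateescape b'a\x80b'
# surrogatepass b'a\xed\xb2\x80b'
```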
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | | |
Issue #25227: Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.
Initial patch written by Serhiy Storchaka.
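For context, this is the round trip the ``surrogateescape`` handler provides and that the optimized encoders speed up. It is a minimal sketch with a made-up byte string, and it makes no claim about the speedup itself.

```python
# Hypothetical input: bytes that are not valid ASCII.
raw = b"caf\xe9"

# Decoding smuggles the undecodable byte 0xE9 through as the lone surrogate U+DCE9.
text = raw.decode("ascii", "surrogateescape")

# Encoding with the same handler restores the original byte, for both encoders.
back_ascii = text.encode("ascii", "surrogateescape")
back_latin1 = text.encode("latin-1", "surrogateescape")

print(ascii(text))  # 'caf\udce9'
print(back_ascii)   # b'caf\xe9'
print(back_latin1)  # b'caf\xe9'
```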
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | | |
* Change limit type from unsigned int to Py_UCS4, to use the same type as the
"ch" variable (a Unicode character).
* Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE
* Add some newlines for readability
|
| | | |
| | |
| | |
| | | |
Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(
|
| | | |
| | |
| | |
| | |
| | | |
It doesn't work to use "#define XXX defined(YYY)" and then "#ifdef XXX"
to check YYY.
|