|  | Commit message | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| | This was mostly written by Travis Oliphant.
I've inspected it all; Neal Norwitz and MvL have also looked at it
(in an earlier incarnation). | 
| | 
| 
| 
| | (reviewed by Neal Norwitz) | 
| | 
| 
| 
| 
| | Add (int) casts to silence compiler warnings.
Raise Python exceptions for overflows. | 
| | |  | 
| | |  | 
| | 
| 
| 
| | Convert Py_ssize_t using PyInt_FromSsize_t | 
| | |  | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In C++, it's an error to pass a string literal to a char* function
without a const_cast().  Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc.  Predictably, there was a large set of functions that
needed to be fixed as a result of these changes.  The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKeywords() a const char *kwlist[].
One cast was required as a result of the changes:  A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *. | 
| | 
| 
| 
| 
| 
| 
| 
| | [ 1327110 ] wrong TypeError traceback in generator expressions
by removing the code that can stomp on the users' TypeError raised by the
iterable argument to ''.join() -- PySequence_Fast (now?) gives a perfectly
reasonable message itself.  Also, a couple of tests. | 
| | 
| 
| 
| | Will backport | 
| | 
| 
| 
| | enabled. | 
| | 
| 
| 
| 
| | subclasses should be substituted as-is and not have tp_str called on
them. | 
| | 
| 
| 
| 
| | unicode instance if the argument is not an instance of basestring and
calling __str__ on the argument returns a unicode instance. | 
| | 
| 
| 
| | Hex longs now print with lowercase letters like their int counterparts. | 
| | 
| 
| 
| | * Speed-up str.count() by using memchr() to fly between first char matches. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | * Speed-up "x in y" where x has more than one character.
The existing code made excessive calls to the expensive memcmp() function.
The new code uses memchr() to rapidly find a start point for memcmp().
In addition to knowing that the first character is a match, the new code
also checks that the last character is a match.  This significantly reduces
the incidence of false starts (saving memcmp() calls and making quadratic
behavior less likely).
Improves the timings on:
    python -m timeit -r7 -s"x='a'*1000" "'ab' in x"
    python -m timeit -r7 -s"x='a'*1000" "'bc' in x"
Once this code has proven itself, then string_find_internal() should refer
to it rather than running its own version.  Also, something similar may
apply to unicode objects. | 
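The filtering idea in the message above can be sketched in Python. This is an illustration only, not the C code from the commit: the function name `contains` is hypothetical, `bytes.find()` with a single byte value stands in for memchr(), and the slice comparison stands in for memcmp().

```python
def contains(haystack: bytes, needle: bytes) -> bool:
    """Hypothetical sketch of the first/last-character filter."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return True
    if m > n:
        return False
    first, last = needle[0], needle[-1]
    last_start = n - m  # last index where needle could still fit
    start = 0
    while start <= last_start:
        # Jump to the next occurrence of the first byte (memchr() stand-in).
        i = haystack.find(first, start, last_start + 1)
        if i < 0:
            return False
        # Check the last byte before paying for the full comparison
        # (memcmp() stand-in); this rejects most false starts cheaply.
        if haystack[i + m - 1] == last and haystack[i:i + m] == needle:
            return True
        start = i + 1
    return False
```

With `x = b'a' * 1000`, a search for `b'ab'` finds a first-byte match at every position, but the last-byte check rejects each one without a full comparison.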
| | 
| 
| 
| | This should go on whatever bugfix branches the other fetches up on. | 
| | 
| 
| 
| 
| | _PyString_Resize() readied strings for mutation but did not invalidate
the cached hash value. | 
| | 
| 
| 
| 
| 
| 
| | (Patch contributed by Nick Coghlan.)
Now joining string subtypes will always return a string.
Formerly, if there were only one item, it was returned unchanged. | 
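A short check of the single-item case, written for a modern Python 3 interpreter, where `str` is the Unicode type rather than the Python 2 byte string this commit changed. `MyStr` is a hypothetical subclass introduced only for this illustration.

```python
class MyStr(str):
    """Hypothetical str subclass used only for this illustration."""


parts = [MyStr("only")]
joined = "".join(parts)

# The single item is not handed back unchanged: the result is a plain str.
result_type = type(joined)
```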
| | 
| 
| 
| 
| 
| | hack: it would resize *interned* strings in-place!  This occurred because
their reference counts do not have their expected value -- stringobject.c
hacks them.  Mea culpa. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | interning were not clear here -- a subclass could be mutable, for
example -- and had bugs.  Explicitly interning a subclass of string
via intern() will raise a TypeError.  Internal operations that attempt
to intern a string subclass will have no effect.
Added a few tests to test_builtin that include the old buggy code and
verify that calls like PyObject_SetAttr() don't fail.  Perhaps these
tests should have gone in test_string. | 
| | |  | 
| | 
| 
| 
| 
| | methods on string and unicode objects. Added unicode.decode()
which was missing for no apparent reason. | 
| | 
| 
| 
| | places it's just noise. | 
| | |  | 
| | 
| 
| 
| | separators on str.split() and str.rsplit(). | 
| | 
| 
| 
| 
| 
| | bit by checking the value of UCHAR_MAX in Include/Python.h.  There was a
check in Objects/stringobject.c.  Remove that.  (Note that we don't define
UCHAR_MAX if it's not defined as the old test did.) | 
| | 
| 
| 
| 
| | SF feature request #801847.
Original patch was written by Sean Reifschneider. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | and left shifts.  (Thanks to Kalle Svensson for SF patch 849227.)
  This addresses most of the remaining semantic changes promised by
  PEP 237, except for repr() of a long, which still shows the trailing
  'L'.  The PEP appears to promise warnings for operations that
  changed semantics compared to Python 2.3, but this is not
  implemented; we've suffered through enough warnings related to
  hex/oct literals and I think it's best to be silent now. | 
| | |  | 
| | 
| 
| 
| | This closes SF bug #827260. | 
| | 
| 
| 
| | Backported to 2.3. | 
| | 
| 
| 
| 
| 
| | Adding missing support for '%F'.
Will backport to 2.3.1. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | * Doc - add doc for when functions were added
 * UserString
 * string object methods
 * string module functions
'chars' is used for the last parameter everywhere.
These changes will be backported, since part of the changes
have already been made, but they were inconsistent. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| | instead of raising a TypeError. (From SF patch #710127)
Add tests to verify this is fixed.
Add various tests for '%c' % int. | 
| | 
| 
| 
| 
| 
| 
| 
| | types.  The special handling for these can now be removed from save_newobj().
Add some testing for this.
Also add support for setting the 'fast' flag on the Python Pickler class,
which suppresses use of the memo. | 
| | 
| 
| 
| 
| 
| | and unicode
Patch by Christopher Blunck. | 
| | 
| 
| 
| 
| | a single character.  Shaves another 10% off the running time by avoiding
the lg2(N) loops and cache effects for the other cases. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Christian Tismer pointed out the high cost of the loop overhead and
function call overhead for 'c' * n where n is large.  Accordingly,
the new code only makes lg2(n) loops.
Interestingly, 'c' * 1000 * 1000 ran a bit faster with old code.  At some
point, the loop and function call overhead became cheaper than invalidating
the cache with lengthy memcpys.  But for more typical sizes of n, the new
code runs much faster and for larger values of n it runs only a bit slower. | 
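The doubling strategy behind the lg2(n) loop count can be sketched in Python. This is an illustration under assumptions, not the C implementation: the name `repeat` is hypothetical, and bytearray slice assignment stands in for memcpy().

```python
def repeat(unit: bytes, n: int) -> bytes:
    """Hypothetical sketch: build unit * n with about lg2(n) copies."""
    size = len(unit) * n
    if size <= 0:
        return b""
    buf = bytearray(size)
    buf[:len(unit)] = unit        # seed the buffer with one copy
    done = len(unit)              # number of bytes filled so far
    while done < size:
        # Copy the filled prefix onto the end, doubling the filled
        # region each pass (the final pass may copy less).
        chunk = min(done, size - done)
        buf[done:done + chunk] = buf[:chunk]
        done += chunk
    return bytes(buf)
```

Each pass copies a block as large as everything written so far, which is why the copies grow long for very large n, the case the message describes as running slightly slower.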
| | 
| 
| 
| 
| | Python 2.2.x backport candidate. (This bug has been around since
Python 1.6.) | 
| | 
| 
| 
| 
| 
| 
| | Obtain cleaner code and a system-wide
performance boost by using the fast, pre-parsed
PyArg_Unpack function instead of the PyArg_ParseTuple
function, which is driven by a format string. | 
| | |  | 
| | |  | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | When mwh added extended slicing, strings and unicode became mappings.
Thus, dict was set which prevented an error when doing:
	newstr = 'format without a percent' % string_value
This fix raises an exception again when the format string contains no
format specifiers and % is applied to a string value. | 
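The restored behaviour can be observed with a small check, written for a modern Python 3 interpreter rather than the release this commit targeted. The helper name `try_format` is hypothetical.

```python
def try_format(template, value):
    """Hypothetical helper: apply % and report a TypeError as text."""
    try:
        return template % value
    except TypeError as exc:
        return "TypeError: %s" % exc


# No format specifiers, but a string argument is supplied: an error is raised.
result = try_format("format without a percent", "string_value")

# A template that does consume its argument formats normally.
ok = try_format("%s!", "x")
```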
| | |  |