author Pauli Virtanen <pav@iki.fi> 2009-12-06 12:21:15 +0000
committer Pauli Virtanen <pav@iki.fi> 2009-12-06 12:21:15 +0000
commit 03f0ace600624ffbb13467e6fdf4e112fc6d714f
tree 7124eda182cebd954715426194416156a1d4fcde
parent 5f8e25452e8aff0f74b3cbf8aca4e87c8c41cc23
doc: update Py3K notes
-rw-r--r-- doc/Py3K.txt 780
1 files changed, 475 insertions, 305 deletions
diff --git a/doc/Py3K.txt b/doc/Py3K.txt
index 631b5ffb7..92918f27b 100644
--- a/doc/Py3K.txt
+++ b/doc/Py3K.txt
@@ -1,10 +1,17 @@
-******************************************
-Notes on making the transition to Python 3
-******************************************
+.. -*-rst-*-
+
+*********************************************
+Developer notes on the transition to Python 3
+*********************************************
+
+:date: 2009-12-05
+:author: Charles R. Harris
+:author: Pauli Virtanen
General
=======
+
Resources
---------
@@ -13,12 +20,15 @@ Information on porting to 3K:
- http://wiki.python.org/moin/cporting
- http://wiki.python.org/moin/PortingExtensionModulesToPy3k
+
Git trees
---------
- http://github.com/pv/numpy-work/commits/py3k
+- http://github.com/cournape/numpy/commits/py3_bootstrap
- http://github.com/illume/numpy3k/commits/work
+
Prerequisites
-------------
@@ -28,184 +38,501 @@ compatible version. Its 3K SVN branch, however, works quite well:
- http://python-nose.googlecode.com/svn/branches/py3k
-Semantic changes
-================
+Known semantic changes on Py2
+=============================
+
+As a side effect, the Py3 adaptation has caused the following semantic
+changes that are visible on Py2.
+
+* There are no known semantic changes.
-We make the following semantic changes:
-* division: integer division is by default true_divide, also for arrays
+Known semantic changes on Py3
+=============================
-* Unicode field names are no longer supported in Py2,
- and Byte field names will not be supported in Py3.
+The following semantic changes have been made on Py3:
+
+* Division: integer division is by default true_divide, also for arrays.
+
+* Dtype field names are Unicode.
+
+* Only unicode dtype field titles are included in fields dict.
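As a quick illustration of the division change listed above, the following sketch shows the Py3 behaviour with plain Python integers rather than arrays. It only demonstrates the language-level rule and assumes a Py3 interpreter.

```python
# Py3 semantics: "/" is true division, "//" is floor division.
import operator

true_result = 7 / 2      # true_divide
floor_result = 7 // 2    # floor_divide

# The operator module exposes the same two operations by name.
same_true = operator.truediv(7, 2)
same_floor = operator.floordiv(7, 2)
```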
+
+.. todo::
+
+ Check for any other changes ... This we want in the end to include
+ in the release notes, and also in a "how to port" document.
Python code
===========
-What we do now
---------------
-
2to3 in setup.py
+----------------
+
+Currently, setup.py calls 2to3 automatically to convert Python sources
+to Python 3 ones, and stores the results under::
+
+ build/py3k
+
+Only changed files will be re-converted when setup.py is called a second
+time, making development much faster.
+
+Currently, this seems to handle most (all?) of the necessary Python
+code conversion.
+
+Not all of the 2to3 transformations are appropriate for all files.
+In particular, 2to3 seems to be quite trigger-happy in replacing e.g.
+``unicode`` by ``str``, which causes problems in ``defchararray.py``.
+For files that need special handling, add entries to
+``tools/py3tool.py``.
- Currently, setup.py calls 2to3 automatically to convert Python sources
- to Python 3 ones, and stores the results under::
+.. todo::
- build/py3k
+ Should we be a good citizen and use ``lib2to3`` instead?
- Only changed files will be re-converted when setup.py is called a second
- time, making development much faster.
+.. todo::
+
+ Do we want to get rid of this hack in the long run?
- Currently, this seems to handle most (all?) of the necessary Python
- code conversion.
numpy.compat.py3k
+-----------------
- There are some utility functions needed for 3K compatibility in
- ``numpy.compat.py3k`` -- they can be imported from ``numpy.compat``.
- More can be added as needed.
+There are some utility functions needed for 3K compatibility in
+``numpy.compat.py3k`` -- they can be imported from ``numpy.compat``:
+- bytes: bytes constructor
+- asbytes: convert string to bytes (no-op on Py2)
+- getexception: get current exception (see below)
+- isfileobj: detect Python file objects
-Syntax changes
---------------
+More can be added as needed.
+
+
+Exception syntax
+----------------
-Code that wants to cater for both Python2 and Python3 needs to take
-at least the following into account:
+Syntax change: "except FooException, bar:" -> "except FooException as bar:"
-1) "except FooException, bar:" -> "except FooException as bar:"
+Code that wants to cater for both Py2 and Py3 should do something like::
-2) "from localmodule import foo"
+ try:
+ spam
+ except SpamException:
+ exc = getexception()
- Syntax for relative imports has changed and is incompatible between
- Python 2.4 and Python 3. The only way seems to use absolute imports
- throughout.
+2to3 also takes care of this, however.
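A minimal sketch of the pattern above. The helper `current_exception` is a hypothetical stand-in for `getexception` (it is not copied from `numpy.compat`), and `SpamException` is a made-up exception class. The only assumption is that `sys.exc_info()` is available, which holds on both Py2 and Py3.

```python
import sys


class SpamException(Exception):
    """Made-up exception type for this example."""


def current_exception():
    # Hypothetical stand-in for getexception(): return the exception
    # instance that is currently being handled.
    return sys.exc_info()[1]


def run():
    try:
        raise SpamException("spam went wrong")
    except SpamException:
        # No "as" or "," syntax is needed, so this parses on Py2.4 and Py3.
        exc = current_exception()
        return str(exc)


message = run()
```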
-3) "print foo, bar" -> "print(foo, bar)"
- Print is no longer a statement.
+Relative imports
+----------------
+
+The new relative import syntax,
+
+ from . import foo
+
+is not available on Py2.4, so we can't simply use it.
+
+Using absolute imports everywhere is probably OK, if they just happen
+to work.
+
+2to3, however, converts the old syntax to new syntax, so as long as we
+use the hack, it takes care of most parts.
+
+
+Print
+-----
+
+The Print statement changed to a builtin function in Py3.
+
+It can probably be replaced in general by something like::
+
+ print("%s %s %s" % (a, b, c))
+
+and in any case, there shouldn't be many print() statements in a
+library as low-level as Numpy is. When writing to a file, `file.write`
+should also be preferred to print.
+
+types module
+------------
+
+The following items were removed from `types` module in Py3:
+
+- StringType (Py3: `bytes` is equivalent, to some degree)
+- InstanceType (Py3: ???)
+- IntType (Py3: no equivalent)
+- LongType (Py3: equivalent `int`, which is the former `long`)
+- FloatType (Py3: equivalent `float`)
+- BooleanType (Py3: equivalent `bool`)
+- ComplexType (Py3: equivalent `complex`)
+- UnicodeType (Py3: equivalent `str`)
+- BufferType (Py3: more-or-less equivalent `memoryview`)
+
+In ``numerictypes.py``, the "common" types were replaced by their
+plain equivalents, and `IntType` was dropped.
+
+.. todo::
+
+ BufferType should probably be replaced with `memoryview` in most places.
+ This was currently changed in a couple of places.
C Code
======
-What has been done so far, and some known TODOs
------------------------------------------------
+
+NPY_PY3K
+--------
+
+A #define in config.h, defined when building for Py3.
+
+.. todo::
+
+ Currently, this is generated as a part of the config.
+ Is this sensible (we could also use PY_MAJOR_VERSION)?
+
private/npy_3kcompat.h
+----------------------
- Convenience macros for Python 3 support.
- New ones that need to be added should be added in this file.
+Convenience macros for Python 3 support:
-ob_type etc.
+- PyInt -> PyLong on Py3
+- PyString -> PyBytes on Py3
+- PyUString -> PyUnicode on Py3 and PyString on Py2
+- PyBytes on Py3
+- Py_SIZE et al., for older Python versions
+- PyFile compatibility on Py3
+- PyObject_Cmp, convenience comparison function on Py3
+
+Any new ones that need to be added should be added in this file.
+
+.. todo::
+
+ Remove PyString_* eventually -- having a call to one of these in Numpy
+ sources is a sign of an error...
+
+
+ob_type, ob_size
+----------------
+
+These use Py_SIZE, etc. macros now. The macros are also defined in
+npy_3kcompat.h for the Python versions that don't have them natively.
- These use Py_SIZE, etc. macros now. The macros are also defined in
- npy_3kcompat.h for the Python versions that don't have them natively.
PyNumberMethod
+--------------
+
+The structures have been converted to the new format:
+
+- number.c
+- scalartypes.c.src
+- scalarmathmodule.c.src
+
+The slots nb_divide, nb_long, nb_oct, nb_hex, and nb_inplace_divide
+have gone away. The slot nb_int is what nb_long used to be, nb_divide
+is now nb_floor_divide, and nb_inplace_divide is now
+nb_inplace_floor_divide.
+
+These have simply been #ifdef'd out on Py3.
+
+.. todo::
+
+ Check if semantics of the methods have changed
+
+.. todo::
+
+ We will also have to make sure the
+ *_true_divide variants are defined. This should also be done for
+ python < 3.x, but that introduces a requirement for the
+ Py_TPFLAGS_HAVE_CLASS in the type flag.
+
+
+PyBuffer
+--------
+
+PyBuffer usage is widely spread in multiarray:
+
+1) The void scalar makes use of buffers
+2) Multiarray has methods for creating buffers etc. explicitly
+3) Arrays can be created from buffers etc.
+4) The .data attribute of an array is a buffer
+
+Py3 introduces the PEP 3118 buffer protocol as the *only* protocol,
+so we must implement it.
+
+The exporter parts of the PEP 3118 buffer protocol are currently
+implemented in ``buffer.c`` for arrays, and in ``scalartypes.c.src``
+for generic array scalars. The generic array scalar exporter, however,
+doesn't currently produce format strings, which needs to be fixed.
+
+Currently, the format string and some of the memory is cached in the
+PyArrayObject structure. This is partly needed because of Python bug #7433.
+
+From the consumer side, the new buffer protocol is mostly backward
+compatible with the old one, so little needs to be done here to retain
+basic functionality. However, we *do* want to make use of the new
+features, at least in `multiarray.frombuffer` and maybe in `multiarray.array`.
- The structures have been converted to the new format.
+Since there is a native buffer object in Py3, the `memoryview`, the
+`newbuffer` and `getbuffer` functions are removed from `multiarray` in
+Py3: their functionality is taken over by the new `memoryview` object.
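To give a feel for what a PEP 3118 consumer sees, here is a small sketch using only the standard library. `array.array` stands in for an exporter (it is not an ndarray), and the values are arbitrary.

```python
import array

# array.array exports the PEP 3118 buffer interface.
exporter = array.array("i", [1, 2, 3])

view = memoryview(exporter)

fmt = view.format          # struct-style format string of one item
shape = view.shape         # tuple of dimensions
itemsize = view.itemsize   # bytes per item
raw = view.tobytes()       # copy of the underlying memory as bytes

# The view shares memory with the exporter, so writes go through.
view[0] = 42
first = exporter[0]
```

The `format`, `shape` and `itemsize` attributes are the pieces a consumer such as `multiarray.frombuffer` would need in order to pick a dtype and shape.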
- TODO: check if semantics of the methods have changed
+.. todo::
-PyBuffer_*
+ Implement support for consuming new buffer objects.
+ Probably in multiarray.frombuffer? Perhaps also in multiarray.array?
- These parts have been replaced with stub code, marked by #warning XXX
+.. todo::
- TODO: implement the new buffer protocol: for scalars and arrays
+ make ndarray shape and strides natively Py_ssize_t
- - generate format strings from dtype
- - parse format strings?
- - Py_Ssize_t for strides and shape?
+.. todo::
+
+ Revise the decision on where to cache the format string -- dtype
+ would be a better place for this.
+
+.. todo::
+
+ There's some buffer code in numarray/_capi.c that needs to be addressed.
+
+.. todo::
+
+ Does altering the PyArrayObject structure require bumping the ABI?
- TODO: decide what to do with the fact that PyMemoryView object is not
- stand-alone. Do we need a separate "dummy" object?
PyString
+--------
+
+There is no PyString in Py3, everything is either Bytes or Unicode.
+Unicode is also preferred in many places, e.g., in __dict__.
+
+There are two issues related to the str/bytes change:
+
+1) Return values etc. should prefer unicode
+2) The 'S' dtype
+
+This entry discusses return values etc. only, the 'S' dtype is a
+separate topic.
+
+All uses of PyString in Numpy should be changed to one of
+
+- PyBytes: one-byte character strings in Py2 and Py3
+- PyUString (defined in npy_3kcompat.h): PyString in Py2, PyUnicode in Py3
+- PyUnicode: UCS in Py2 and Py3
+
+In many cases the conversion only entails replacing PyString with
+PyUString.
+
+PyString is currently defined to PyBytes in npy_3kcompat.h, so that
+things build. This definition will be removed when Py3 support is
+finished.
+
+Where *_AsStringAndSize is used, more care needs to be taken, as
+encoding Unicode to Bytes may be needed. If this cannot be avoided, the
+encoding should be ASCII, unless there is a very strong reason to do
+otherwise. In particular, I don't believe we should silently fall back to
+UTF-8 -- raising an exception may be a better choice.
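The policy suggested above can be sketched in Python. `to_ascii_bytes` is a hypothetical helper, not an existing Numpy function. It shows strict ASCII encoding that raises an exception instead of silently falling back to UTF-8.

```python
def to_ascii_bytes(text):
    # Hypothetical helper: encode Unicode to Bytes using strict ASCII.
    # Non-ASCII input raises UnicodeEncodeError rather than being
    # silently encoded as UTF-8.
    if isinstance(text, bytes):
        return text
    return text.encode("ascii")


ok = to_ascii_bytes("float64")

try:
    to_ascii_bytes("caf\u00e9")
    failed = False
except UnicodeEncodeError:
    failed = True
```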
+
+Exceptions should use PyUnicode_AsUnicodeEscape -- this should result
+in an ASCII-clean string that is appropriate for the exception
+message.
+
+Some specific decisions that have been made so far:
+
+* descriptor.c: dtype field names are UString
+
+ At some places in Numpy code, there are some guards for Unicode field
+ names. However, the dtype constructor accepts only strings as field names,
+ so we should assume field names are *always* UString.
- PyString is currently defined to PyBytes in npy_3kcompat.h.
- This definition will go away in the end.
+* descriptor.c: field titles can be arbitrary objects.
+ If they are UString (or, on Py2, Bytes or Unicode), insert them into the fields dict.
- All instances of PyString must be converted to one of:
+* descriptor.c: dtype strings are Unicode.
- - PyBytes: byte character strings in Py2 and Py3
- - PyUnicode: unicode strings in Py2 and Py3
- - PyUString: unicode in Py3, byte string in Py2
+* descriptor.c: datetime tuple contains Bytes only.
- Decisions:
+* repr() and str() should return UString
- * field names are UString
+* comparison between Unicode and Bytes is not defined in Py3
- * field titles can be arbitrary objects.
- If they are Unicode, insert to fields dict.
+* Type codes in numerictypes.typeInfo dict are Unicode
- * dtype strings are Unicode.
+* Func name in errobj is Bytes (should be forced to ASCII)
- * datetime tuple contains Unicode.
+.. todo::
- * Exceptions should preferably be ASCII-only -> use AsUnicodeEscape
+ tp_doc -- it's a char* pointer, but what is the encoding?
+ Check esp. lib/src/_compiled_base
+.. todo::
- TODO: Are exception strings bytes or unicode? What about tp_doc?
+ ufunc names -- again, what's the encoding?
- Fix lib/src/_compiled_base accordingly.
+.. todo::
- TODO: I have a feeling that we should avoid PyUnicode_AsUTF8EncodedString
- wherever possible...
+ Replace all occurrences of PyString by PyBytes, PyUnicode, or PyUString.
- TODO: Replace all occurrences of String by Bytes, Unicode or UString,
- to ensure that we have made a conscious choice for each case in Py3K.
+.. todo::
- #define PyBytes -> PyString for Python 2 in npy_3kcompath.h
+ Finally, remove the convenience PyString #define from npy_3kcompat.h
- Finally remove the PyString -> PyBytes defines from npy_3kcompat.h
- This is probably the *easiest* way to make sure all of
- the string/unicode transition has been audited.
+.. todo::
+
+ Revise errobj decision?
+
+.. todo::
+
+ Check that non-UString field names are not accepted anywhere.
+
+
+PyUnicode
+---------
+
+PyUnicode in Py3 is pretty much as it was in Py2, except that it is
+now the only "real" string type.
+
+In Py3, Unicode and Bytes are not comparable, i.e., 'a' != b'a'. Numpy
+comparison routines were changed to act in the same way, leaving
+comparison between Unicode and Bytes undefined.
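The Py3 behaviour that the Numpy routines are meant to mirror can be checked directly at the interpreter. This sketch uses only built-in types and assumes Python is run without the `-b` flags.

```python
# Py3: equality between str and bytes is simply False ...
equal = ("a" == b"a")
not_equal = ("a" != b"a")

# ... and ordering comparisons are not defined at all.
try:
    "a" < b"a"
    ordering_defined = True
except TypeError:
    ordering_defined = False
```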
+
+.. todo::
+
+ Check that indeed all comparison routines were changed.
+
+
+Fate of the 'S' dtype
+---------------------
+
+"Strings" in Py3 are now Unicode, so it would make sense to
+re-associate Numpy's dtype letter 'S' with Unicode, and introduce
+a separate letter for Bytes.
+
+The Bytes dtype can probably not be wholly dropped -- there may be
+some use for 1-byte character strings in e.g. genetics?
+
+.. todo::
+
+ 'S' dtype should be aliased to 'U'. One of the two should be deprecated.
+
+.. todo::
+
+ All dtype code should be checked for usage of *_STRINGLTR.
+
+.. todo::
+
+ A new 'bytes' dtype? Should the type code be 'y'?
+
+.. todo::
+
+ Catch all worms that come out of the can because of this change.
+ In any case, I guess many of the current failures in our test suite
+ are because code 'S' does not correspond to the `str` type.
+
+.. todo::
+
+ Currently, in parts of the code, both Bytes and Unicode strings
+ are classified as "strings", and share some of the code paths.
+
+ It should probably be checked if preferring Unicode for Py3 requires
+ changing some of these parts.
- The String/Unicode transition is simply too dangerous to handle
- by a blanket replacement.
PyInt
+-----
+
+There is no limited-range integer type any more in Py3. It makes no
+sense to inherit Numpy ints from Py3 ints.
+
+Currently, the following is done:
+
+1) Numpy's integer types no longer inherit from Python integer.
+2) int is taken to be dtype-equivalent to NPY_LONG
+3) ints are converted to NPY_LONG
+
+PyInt methods are currently replaced by PyLong, via macros in npy_3kcompat.h.
+
+Dtype decision rules were changed accordingly, so that Numpy understands
+that Py3 int translates to NPY_LONG as far as dtypes are concerned.
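The following sketch shows why a decision is needed: Py3 `int` is unbounded, while NPY_LONG corresponds to a fixed-size C long. `fits_in_c_long` is a hypothetical helper written for this illustration. It uses `ctypes` to query the size of a C long on the current platform.

```python
import ctypes

# Number of bits in a C long on this platform.
LONG_BITS = 8 * ctypes.sizeof(ctypes.c_long)
LONG_MIN = -(2 ** (LONG_BITS - 1))
LONG_MAX = 2 ** (LONG_BITS - 1) - 1


def fits_in_c_long(value):
    # Hypothetical helper: True if a Python int is representable
    # as a C long without overflow.
    return LONG_MIN <= value <= LONG_MAX


small = fits_in_c_long(1)
huge = fits_in_c_long(2 ** 256)
```

Values such as `2**256` are perfectly valid Py3 ints, but cannot be stored in any fixed-size integer dtype, which is what the dtype decision rules have to account for.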
- PyInt is currently replaced by PyLong, via macros in npy_3kcompat.h
+.. todo::
- Dtype decision rules were changed accordingly, so Numpy understands
- Python int to be dtype-compatible with NPY_LONG.
+ Decide on
- TODO: Decide on
+ * what is: array([1]).dtype
+ * what is: array([2**40]).dtype
+ * what is: array([2**256]).dtype
+ * what is: array([1]) + 2**40
+ * what is: array([1]) + 2**256
- ... what is: array([1]).dtype
- ... what is: array([2**40]).dtype
- ... what is: array([2**256]).dtype
- ... what is: array([1]) + 2**40
- ... what is: array([1]) + 2**256
+ i.e. dtype casting rules. It seems to <pv> that we will want to
+ fix the dtype of Python 3 int to be the machine integer size,
+ despite the fact that the actual Python 3 object is not fixed-size.
- ie. dtype casting rules. It seems to <pv> that we will want to
- fix the dtype of Python 3 int to be the machine integer size,
- despite the fact that the actual Python 3 object is not fixed-size.
+.. todo::
+
+ Audit the automatic dtype decision -- did I plug all the cases?
- TODO: Audit the automatic dtype decision -- did I plug all the cases?
Divide
+------
+
+The Divide operation is no more.
+
+Calls to PyNumber_Divide were replaced by FloorDivide or TrueDivide,
+as appropriate.
+
+The PyNumberMethods entry is #ifdef'd out on Py3, see above.
+
- The Divide operation is no more.
+tp_compare, PyObject_Compare
+----------------------------
- So we change array(1) / 10 == array(0.1)
+The compare method has vanished, and is replaced with richcompare.
+We just #ifdef the compare methods out on Py3.
-tp_compare
+New richcompare methods were implemented for:
- The compare method has vanished.
+* flagsobject.c
- TODO: ensure that all types that had only tp_compare have also
- tp_richcompare.
+On the consumer side, we have a convenience wrapper in npy_3kcompat.h
+providing PyObject_Cmp also on Py3.
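The idea behind such a wrapper can be sketched in Python: a cmp-style result is rebuilt from rich comparisons. `three_way_compare` is a hypothetical name, and this is not a transcription of the C code in npy_3kcompat.h.

```python
def three_way_compare(a, b):
    # Hypothetical helper: emulate the old cmp(a, b) using only
    # rich comparisons; returns -1, 0 or 1.
    if a < b:
        return -1
    if a > b:
        return 1
    if a == b:
        return 0
    raise TypeError("objects are not comparable")


results = [three_way_compare(1, 2),
           three_way_compare(2, 2),
           three_way_compare(3, 2)]
```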
-pickles
+.. todo::
- It is not possible to support Python 2 pickles in Python 3.
+ Ensure that all types that had only tp_compare have also
+ tp_richcompare.
- This is because Python 2 strings pickle to Python 3 unicode objects,
- which causes problems if there are non-ascii characters there.
- The errors are raised before reaching any Numpy code, so we just can't
- preserve backwards compatibility here.
+
+Pickling
+--------
+
+The ndarray and dtype __setstate__ were modified so that Py3 can read
+pickles created by Py2: they need to accept a Unicode endian
+character and Unicode data, since that's what Py2 str is unpickled to
+in Py3.
+
+An encoding assumption is required for backward compatibility: the user
+must do
+
+ loads(f, encoding='latin1')
+
+to successfully read pickles created by Py2.
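A small sketch of why the encoding argument matters. The byte string below is a hand-written protocol 0 pickle of the Py2 `str` value `'abc\xff'`, in the form Py2 would emit it. It is an illustrative payload, not an actual Numpy pickle.

```python
import pickle

# Protocol-0 pickle of the Py2 str 'abc\xff' (illustrative payload).
py2_pickle = b"S'abc\\xff'\np0\n."

# latin1 maps every byte 0-255 to the code point with the same value,
# so no information is lost when the Py2 str is decoded.
text = pickle.loads(py2_pickle, encoding="latin1")
recovered = text.encode("latin1")

# With the default encoding (ASCII) the non-ASCII byte cannot be decoded.
try:
    pickle.loads(py2_pickle)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False
```

Because the latin1 round trip is lossless, code receiving the Unicode data can recover the original bytes with `encode("latin1")`.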
+
+.. todo::
+
+ Forward compatibility? Is it even possible?
+ For sure, we are not knowingly going to store data in PyUnicode,
+ so probably the only way for forward compatibility is to implement
+ a custom Unpickler for Py2?
+
+.. todo::
+
+ If forward compatibility is not possible, aim to store also the endian
+ character as Bytes...
PyTypeObject
@@ -218,11 +545,6 @@ keep in mind.
1) Because the first three slots are now part of a struct some compilers issue
warnings if they are initialized in the old way.
- In practice, it is necessary to use the Py_TYPE, Py_SIZE, Py_REFCNT
- macros instead of accessing ob_type, ob_size and ob_refcnt
- directly. These are defined for backward compatibility in
- private/npy_3kcompat.h
-
2) The compare slot has been made reserved in order to preserve binary
compatibility while the tp_compare function went away. The tp_richcompare
function has replaced it and we need to use that slot instead. This will
@@ -233,142 +555,22 @@ keep in mind.
bogus. They are not supposed to be explicitly initialized and were out of
place in any case because an extra base slot was added in python 2.6.
-Because of these facts it was thought better to use #ifdefs to bring the old
-initializers up to py3k snuff rather than just fill the tp_richcompare slot.
-They also serve to mark the places where changes have been made. The new form
-is shown below. Note that explicit initialization can stop once none of the
+Because of these facts it is better to use #ifdefs to bring the old
+initializers up to py3k snuff rather than just fill the tp_richcompare
+slot. They also serve to mark the places where changes have been
+made. Note that explicit initialization can stop once none of the
remaining entries are non-zero, because zero is the default value that
variables with non-local linkage receive.
-NPY_NO_EXPORT PyTypeObject Foo_Type = {
-#if defined(NPY_PY3K)
- PyVarObject_HEAD_INIT(0,0)
-#else
- PyObject_HEAD_INIT(0)
- 0, /* ob_size */
-#endif
- "numpy.foo" /* tp_name */
- 0, /* tp_basicsize */
- 0, /* tp_itemsize */
- /* methods */
- 0, /* tp_dealloc */
- 0, /* tp_print */
- 0, /* tp_getattr */
- 0, /* tp_setattr */
-#if defined(NPY_PY3K)
- (void *)0, /* tp_reserved */
-#else
- 0, /* tp_compare */
-#endif
- 0, /* tp_repr */
- 0, /* tp_as_number */
- 0, /* tp_as_sequence */
- 0, /* tp_as_mapping */
- 0, /* tp_hash */
- 0, /* tp_call */
- 0, /* tp_str */
- 0, /* tp_getattro */
- 0, /* tp_setattro */
- 0, /* tp_as_buffer */
- 0, /* tp_flags */
- 0, /* tp_doc */
- 0, /* tp_traverse */
- 0, /* tp_clear */
- 0, /* tp_richcompare */
- 0, /* tp_weaklistoffset */
- 0, /* tp_iter */
- 0, /* tp_iternext */
- 0, /* tp_methods */
- 0, /* tp_members */
- 0, /* tp_getset */
- 0, /* tp_base */
- 0, /* tp_dict */
- 0, /* tp_descr_get */
- 0, /* tp_descr_set */
- 0, /* tp_dictoffset */
- 0, /* tp_init */
- 0, /* tp_alloc */
- 0, /* tp_new */
- 0, /* tp_free */
- 0, /* tp_is_gc */
- 0, /* tp_bases */
- 0, /* tp_mro */
- 0, /* tp_cache */
- 0, /* tp_subclasses */
- 0, /* tp_weaklist */
- 0, /* tp_del */
- 0 /* tp_version_tag (2.6) */
-};
-
-checklist of types having tp_compare but no tp_richcompare
-
-1) multiarray/flagsobject.c
-
-PyNumberMethods
----------------
-
-Types with tp_as_number defined
-
-1) multiarray/arrayobject.c
-
-The slots np_divide, np_long, np_oct, np_hex, and np_inplace_divide
-have gone away. The slot np_int is what np_long used to be, tp_divide
-is now tp_floor_divide, and np_inplace_divide is now
-np_inplace_floor_divide. We will also have to make sure the
-*_true_divide variants are defined. This should also be done for
-python < 3.x, but that introduces a requirement for the
-Py_TPFLAGS_HAVE_CLASS in the type flag.
-
-/*
- * Number implementations must check *both* arguments for proper type and
- * implement the necessary conversions in the slot functions themselves.
-*/
-PyNumberMethods foo_number_methods = {
- (binaryfunc)0, /* nb_add */
- (binaryfunc)0, /* nb_subtract */
- (binaryfunc)0, /* nb_multiply */
- (binaryfunc)0, /* nb_remainder */
- (binaryfunc)0, /* nb_divmod */
- (ternaryfunc)0, /* nb_power */
- (unaryfunc)0, /* nb_negative */
- (unaryfunc)0, /* nb_positive */
- (unaryfunc)0, /* nb_absolute */
- (inquiry)0, /* nb_bool, nee nb_nonzero */
- (unaryfunc)0, /* nb_invert */
- (binaryfunc)0, /* nb_lshift */
- (binaryfunc)0, /* nb_rshift */
- (binaryfunc)0, /* nb_and */
- (binaryfunc)0, /* nb_xor */
- (binaryfunc)0, /* nb_or */
- (unaryfunc)0, /* nb_int */
- (void *)0, /* nb_reserved, nee nb_long */
- (unaryfunc)0, /* nb_float */
- (binaryfunc)0, /* nb_inplace_add */
- (binaryfunc)0, /* nb_inplace_subtract */
- (binaryfunc)0, /* nb_inplace_multiply */
- (binaryfunc)0, /* nb_inplace_remainder */
- (ternaryfunc)0, /* nb_inplace_power */
- (binaryfunc)0, /* nb_inplace_lshift */
- (binaryfunc)0, /* nb_inplace_rshift */
- (binaryfunc)0, /* nb_inplace_and */
- (binaryfunc)0, /* nb_inplace_xor */
- (binaryfunc)0, /* nb_inplace_or */
- (binaryfunc)0, /* nb_floor_divide */
- (binaryfunc)0, /* nb_true_divide */
- (binaryfunc)0, /* nb_inplace_floor_divide */
- (binaryfunc)0, /* nb_inplace_true_divide */
- (unaryfunc)0 /* nb_index */
-};
-
PySequenceMethods
-----------------
Types with tp_as_sequence defined
-1) multiarray/descriptor.c
-2) multiarray/scalartypes.c.src
-3) multiarray/arrayobject.c
+* multiarray/descriptor.c
+* multiarray/scalartypes.c.src
+* multiarray/arrayobject.c
PySequenceMethods in py3k are binary compatible with py2k, but some of the
slots have gone away. I suspect this means some functions need redefining so
@@ -387,16 +589,21 @@ PySequenceMethods foo_sequence_methods = {
(ssizeargfunc)0 /* sq_inplace_repeat */
};
+.. todo::
+
+ Check semantics of the PySequence methods.
+
+
PyMappingMethods
----------------
Types with tp_as_mapping defined
-1) multiarray/descriptor.c
-2) multiarray/iterators.c
-3) multiarray/scalartypes.c.src
-4) multiarray/flagsobject.c
-5) multiarray/arrayobject.c
+* multiarray/descriptor.c
+* multiarray/iterators.c
+* multiarray/scalartypes.c.src
+* multiarray/flagsobject.c
+* multiarray/arrayobject.c
PyMappingMethods in py3k look to be the same as in py2k. The semantics
of the slots needs to be checked.
@@ -407,47 +614,10 @@ PyMappingMethods foo_mapping_methods = {
(objobjargproc)0 /* mp_ass_subscript */
};
+.. todo::
-PyBuffer
---------
-
-Parts involving the PyBuffer_* likely require the most work, and they
-are widely spread in multiarray:
-
-1) The void scalar makes use of buffers
-2) Multiarray has methods for creating buffers etc. explicitly
-3) Arrays can be created from buffers etc.
-4) The .data attribute of an array is a buffer
-
-There are two things to note in 3K:
-
-1) The buffer protocol has changed. It is also now quite complicated,
- and implementing it properly requires several pieces.
-
-2) There is no PyBuffer object any more. Instead, a MemoryView
- object is present, but it always must piggy-pack on another existing
- object.
-
-Currently, what has been done is:
-
-1) Replace protocol implementations with stubs that either raise errors
- or offer limited functionality.
+ Check semantics of the PyMapping methods.
-2) Replace PyBuffer usage by PyMemoryView where possible.
-
-3) ... and where not possible, use stubs that raise errors.
-
-What likely needs to be done is:
-
-1) Implement a simple "stub" compatibility buffer object
- the memoryview can piggy-pack on.
-
-
-PyNumber_Divide
----------------
-
-This function has vanished -- needs to be replaced with PyNumber_TrueDivide
-or FloorDivide.
PyFile
------
@@ -458,50 +628,37 @@ Many of the PyFile items have disappeared:
2) PyFile_AsFile
3) PyFile_FromString
-Compatibility wrappers for these are now in private/npy_3kcompat.h
+Most importantly, in Py3 there is no way to extract a FILE* pointer
+from the Python file object. There are, however, new PyFile_* functions
+for writing and reading data from the file.
+Temporary compatibility wrappers that return a `fdopen` file pointer
+are in private/npy_3kcompat.h. However, this is an unsatisfactory
+approach, since the FILE* pointer returned by `fdopen` cannot be freed
+as `fclose` on it would also close the underlying file.
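The ownership problem can be illustrated at the Python level. This is only a sketch: duplicating the descriptor with `os.dup` is an assumption made here to show one way of avoiding the shared-close issue, not something these notes prescribe, and the temporary file is purely illustrative.

```python
import os
import tempfile

with tempfile.TemporaryFile() as original:
    original.write(b"abc")
    original.flush()

    # Wrap a *duplicate* descriptor, so that closing the wrapper
    # does not close the descriptor owned by `original`.
    dup_fd = os.dup(original.fileno())
    with os.fdopen(dup_fd, "rb") as wrapper:
        wrapper.seek(0)
        data = wrapper.read()

    # The original file object is still usable here.
    still_open = not original.closed
    original.seek(0)
    reread = original.read()
```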
-PyString
---------
-
-PyString was removed, and needs to be replaced either by PyBytes or PyUnicode.
-The plan of attack currently is:
-
-1) The 'string' array dtype will be replaced by Bytes
-2) The 'unicode' array dtype will stay Unicode
-3) dtype fields names can be *either* Bytes or Unicode
+.. todo::
-Some compatibility wrappers are defined in private/npy_3kcompat.h,
-redefining essentially String as Bytes.
+ Adapt all Numpy I/O to use the PyFile_* methods or the low-level
+ IO routines. In any case, it's unlikely that C stdio can be used any more.
-However, at least following points need still to be audited:
+ Perhaps using PyFile_* would make e.g. numpy.tofile to a gzip file work?
-1) PyObject_Str -> it now returns unicodes
-2) tp_doc -> char* string, but is it in unicode or what?
-
-RO
---
+READONLY
+--------
The RO alias for READONLY is no more.
+These were replaced, as READONLY is present also on Py2.
+
Py_TPFLAGS_CHECKTYPES
---------------------
This has vanished and is always on in Py3K.
-
-PyInt
------
-
-There is no limited-range integer type any more in Py3K.
-
-Currently, the plan is the following:
-
-1) Numpy's integer types no longer inherit from Python integer.
-2) Convert Longs to integers, if their size is small enough and known.
-3) Otherwise, use long longs.
+It is currently #ifdef'd out for Py3.
PyOS
@@ -512,3 +669,16 @@ Deprecations:
1) PyOS_ascii_strtod -> PyOS_double_from_string;
curiously enough, PyOS_ascii_strtod is not only deprecated but also
causes segfaults
+
+
+PyInstance
+----------
+
+There are some checks for PyInstance in ``common.c`` and ``ctors.c``.
+
+Currently, ``PyInstance_Check`` is just #ifdef'd out for Py3. This is,
+quite likely, not the correct thing to do.
+
+.. todo::
+
+ Do the right thing for PyInstance checks.