doc: update Py3K notes

author: Pauli Virtanen <pav@iki.fi> 2009-12-06 12:21:15 +0000
committer: Pauli Virtanen <pav@iki.fi> 2009-12-06 12:21:15 +0000
commit: 03f0ace600624ffbb13467e6fdf4e112fc6d714f (patch)
tree: 7124eda182cebd954715426194416156a1d4fcde
parent: 5f8e25452e8aff0f74b3cbf8aca4e87c8c41cc23 (diff)
download: numpy-03f0ace600624ffbb13467e6fdf4e112fc6d714f.tar.gz
1 files changed, 475 insertions, 305 deletions
diff --git a/doc/Py3K.txt b/doc/Py3K.txt
index 631b5ffb7..92918f27b 100644
--- a/doc/Py3K.txt
+++ b/doc/Py3K.txt
@@ -1,10 +1,17 @@
-******************************************
-Notes on making the transition to Python 3
-******************************************
+.. -*-rst-*-
+
+*********************************************
+Developer notes on the transition to Python 3
+*********************************************
+
+:date: 2009-12-05
+:author: Charles R. Harris
+:author: Pauli Virtanen
 
 General
 =======
 
+
 Resources
 ---------
 
@@ -13,12 +20,15 @@ Information on porting to 3K:
 - http://wiki.python.org/moin/cporting
 - http://wiki.python.org/moin/PortingExtensionModulesToPy3k
 
+
 Git trees
 ---------
 
 - http://github.com/pv/numpy-work/commits/py3k
+- http://github.com/cournape/numpy/commits/py3_bootstrap
 - http://github.com/illume/numpy3k/commits/work
 
+
 Prerequisites
 -------------
 
@@ -28,184 +38,501 @@ compatible version. Its 3K SVN branch, however, works quite well:
 - http://python-nose.googlecode.com/svn/branches/py3k
 
 
-Semantic changes
-================
+Known semantic changes on Py2
+=============================
+
+As a side effect, the Py3 adaptation has caused the following semantic
+changes that are visible on Py2.
+
+* There are no known semantic changes.
 
-We make the following semantic changes:
 
-* division: integer division is by default true_divide, also for arrays
+Known semantic changes on Py3
+=============================
 
-* Unicode field names are no longer supported in Py2,
-  and Byte field names will not be supported in Py3.
+The following semantic changes have been made on Py3:
+
+* Division: integer division is by default true_divide, also for arrays.
+
+* Dtype field names are Unicode.
+
+* Only Unicode dtype field titles are included in the fields dict.
+
+.. todo::
+
+   Check for any other changes. In the end, we want to include this list
+   in the release notes, and also in a "how to port" document.
 
 
 Python code
 ===========
 
 
-What we do now
---------------
-
 2to3 in setup.py
+----------------
+
+Currently, setup.py calls 2to3 automatically to convert Python sources
+to Python 3 ones, and stores the results under::
+
+    build/py3k
+
+Only changed files will be re-converted when setup.py is called a second
+time, making development much faster.
+
+Currently, this seems to handle most (all?) of the necessary Python
+code conversion.
+
+Not all of the 2to3 transformations are appropriate for all files.
+In particular, 2to3 seems to be quite trigger-happy in replacing e.g.
+``unicode`` by ``str``, which causes problems in ``defchararray.py``.
+For files that need special handling, add entries to
+``tools/py3tool.py``.
 
-    Currently, setup.py calls 2to3 automatically to convert Python sources
-    to Python 3 ones, and stores the results under::
+.. todo::
 
-        build/py3k
+   Should we be a good citizen and use ``lib2to3`` instead?
 
-    Only changed files will be re-converted when setup.py is called a second
-    time, making development much faster.
+.. todo::
+
+   Do we want to get rid of this hack in the long run?
 
-    Currently, this seems to handle most (all?) of the necessary Python
-    code conversion.
 
 numpy.compat.py3k
+-----------------
 
-    There are some utility functions needed for 3K compatibility in
-    ``numpy.compat.py3k`` -- they can be imported from ``numpy.compat``.
-    More can be added as needed.
+There are some utility functions needed for 3K compatibility in
+``numpy.compat.py3k`` -- they can be imported from ``numpy.compat``:
 
+- bytes: bytes constructor
+- asbytes: convert string to bytes (no-op on Py2)
+- getexception: get current exception (see below)
+- isfileobj: detect Python file objects
 
-Syntax changes
---------------
+More can be added as needed.
+
+
+Exception syntax
+----------------
 
-Code that wants to cater for both Python2 and Python3 needs to take
-at least the following into account:
+Syntax change: "except FooException, bar:" -> "except FooException as bar:"
 
-1) "except FooException, bar:" -> "except FooException as bar:"
+Code that wants to cater for both Py2 and Py3 should do something like::
 
-2) "from localmodule import foo"
+    try:
+       spam
+    except SpamException:
+       exc = getexception()
 
-   Syntax for relative imports has changed and is incompatible between
-   Python 2.4 and Python 3. The only way seems to use absolute imports
-   throughout.
+This is also taken care of by 2to3, however.
 
-3) "print foo, bar" -> "print(foo, bar)"
 
-   Print is no longer a statement.
+Relative imports
+----------------
+
+The new relative import syntax::
+
+    from . import foo
+
+is not available on Py2.4, so we can't simply use it.
+
+Using absolute imports everywhere is probably OK, if they just happen
+to work.
+
+2to3, however, converts the old syntax to new syntax, so as long as we
+use the hack, it takes care of most parts.
+
+
+Print
+-----
+
+The print statement became a builtin function in Py3.
+
+It can probably be replaced in general by something like::
+
+    print("%s %s %s" % (a, b, c))
+
+and in any case, there shouldn't be many print() statements in a
+library as low-level as Numpy is. When writing to a file, `file.write`
+should also be preferred to print.
+
+types module
+------------
+
+The following items were removed from the `types` module in Py3:
+
+- StringType    (Py3: `bytes` is equivalent, to some degree)
+- InstanceType  (Py3: ???)
+- IntType       (Py3: no equivalent)
+- LongType      (Py3: equivalent `int`)
+- FloatType     (Py3: equivalent `float`)
+- BooleanType   (Py3: equivalent `bool`)
+- ComplexType   (Py3: equivalent `complex`)
+- UnicodeType   (Py3: equivalent `str`)
+- BufferType    (Py3: more-or-less equivalent `memoryview`)
+
+In ``numerictypes.py``, the "common" types were replaced by their
+plain equivalents, and `IntType` was dropped.
+
+.. todo::
+
+   BufferType should probably be replaced with `memoryview` in most places.
+   This was currently changed in a couple of places.
 
 
 C Code
 ======
 
-What has been done so far, and some known TODOs
------------------------------------------------
+
+NPY_PY3K
+--------
+
+A #define in config.h, defined when building for Py3.
+
+.. todo::
+
+   Currently, this is generated as a part of the config.
+   Is this sensible (we could also use PY_MAJOR_VERSION)?
+
 
 private/npy_3kcompat.h
+----------------------
 
-    Convenience macros for Python 3 support.
-    New ones that need to be added should be added in this file.
+Convenience macros for Python 3 support:
 
-ob_type etc.
+- PyInt -> PyLong on Py3
+- PyString -> PyBytes on Py3
+- PyUString -> PyUnicode on Py3 and PyString on Py2
+- PyBytes on Py3
+- Py_SIZE et al., for older Python versions
+- PyFile compatibility on Py3
+- PyObject_Cmp, convenience comparison function on Py3
+
+Any new ones that need to be added should be added in this file.
+
+.. todo::
+
+   Remove PyString_* eventually -- having a call to one of these in Numpy
+   sources is a sign of an error...
+
+
+ob_type, ob_size
+----------------
+
+These use Py_SIZE, etc. macros now.  The macros are also defined in
+npy_3kcompat.h for the Python versions that don't have them natively.
 
-    These use Py_SIZE, etc. macros now.  The macros are also defined in
-    npy_3kcompat.h for the Python versions that don't have them natively.
 
 PyNumberMethod
+--------------
+
+The structures have been converted to the new format:
+
+- number.c
+- scalartypes.c.src
+- scalarmathmodule.c.src
+
+The slots nb_divide, nb_long, nb_oct, nb_hex, and nb_inplace_divide
+have gone away. The slot nb_int is what nb_long used to be, nb_divide
+is now nb_floor_divide, and nb_inplace_divide is now
+nb_inplace_floor_divide.
+
+These have simply been #ifdef'd out on Py3.
+
+.. todo::
+
+   Check if semantics of the methods have changed
+
+.. todo::
+
+   We will also have to make sure the
+   *_true_divide variants are defined. This should also be done for
+   python < 3.x, but that introduces a requirement for the
+   Py_TPFLAGS_HAVE_CLASS in the type flag.
+
+
+PyBuffer
+--------
+
+PyBuffer usage is widespread in multiarray:
+
+1) The void scalar makes use of buffers
+2) Multiarray has methods for creating buffers etc. explicitly
+3) Arrays can be created from buffers etc.
+4) The .data attribute of an array is a buffer
+
+Py3 introduces the PEP 3118 buffer protocol as the *only* protocol,
+so we must implement it.
+
+The exporter parts of the PEP 3118 buffer protocol are currently
+implemented in ``buffer.c`` for arrays, and in ``scalartypes.c.src``
+for generic array scalars. The generic array scalar exporter, however,
+doesn't currently produce format strings, which needs to be fixed.
+
+Currently, the format string and some of the memory is cached in the
+PyArrayObject structure. This is partly needed because of Python bug #7433.
+
+From the consumer side, the new buffer protocol is mostly backward
+compatible with the old one, so little needs to be done here to retain
+basic functionality. However, we *do* want to make use of the new
+features, at least in `multiarray.frombuffer` and maybe in `multiarray.array`.
 
-    The structures have been converted to the new format.
+Since there is a native buffer object in Py3, the `memoryview`, the
+`newbuffer` and `getbuffer` functions are removed from `multiarray` in
+Py3: their functionality is taken over by the new `memoryview` object.
 
-    TODO: check if semantics of the methods have changed
+.. todo::
 
-PyBuffer_*
+   Implement support for consuming new buffer objects.
+   Probably in multiarray.frombuffer? Perhaps also in multiarray.array?
 
-    These parts have been replaced with stub code, marked by #warning XXX
+.. todo::
 
-    TODO: implement the new buffer protocol: for scalars and arrays
+   Make ndarray shape and strides natively Py_ssize_t.
 
-          - generate format strings from dtype
-          - parse format strings?
-          - Py_Ssize_t for strides and shape?
+.. todo::
+
+   Revise the decision on where to cache the format string -- dtype
+   would be a better place for this.
+
+.. todo::
+
+   There's some buffer code in numarray/_capi.c that needs to be addressed.
+
+.. todo::
+
+   Does altering the PyArrayObject structure require bumping the ABI?
 
-    TODO: decide what to do with the fact that PyMemoryView object is not
-          stand-alone. Do we need a separate "dummy" object?
 
 PyString
+--------
+
+There is no PyString in Py3; everything is either Bytes or Unicode.
+Unicode is also preferred in many places, e.g., in __dict__.
+
+There are two issues related to the str/bytes change:
+
+1) Return values etc. should prefer unicode
+2) The 'S' dtype
+
+This entry discusses return values etc. only; the 'S' dtype is a
+separate topic.
+
+All uses of PyString in Numpy should be changed to one of
+
+- PyBytes: one-byte character strings in Py2 and Py3
+- PyUString (defined in npy_3kcompat.h): PyString in Py2, PyUnicode in Py3
+- PyUnicode: UCS in Py2 and Py3
+
+In many cases the conversion only entails replacing PyString with
+PyUString.
+
+PyString is currently defined to PyBytes in npy_3kcompat.h, to make
+things build. This definition will be removed when Py3 support is
+finished.
+
+Where *_AsStringAndSize is used, more care needs to be taken, as
+encoding Unicode to Bytes may be needed. If this cannot be avoided, the
+encoding should be ASCII, unless there is a very strong reason to do
+otherwise. In particular, I don't believe we should silently fall back
+to UTF-8 -- raising an exception may be a better choice.
+
+Exceptions should use PyUnicode_AsUnicodeEscapeString -- this should
+result in an ASCII-clean string that is appropriate for the exception
+message.
+
+Some specific decisions that have been made so far:
+
+* descriptor.c: dtype field names are UString
+
+  In some places in the Numpy code, there are guards for Unicode field
+  names. However, the dtype constructor accepts only strings as field names,
+  so we should assume field names are *always* UString.
 
-    PyString is currently defined to PyBytes in npy_3kcompat.h.
-    This definition will go away in the end.
+* descriptor.c: field titles can be arbitrary objects.
+  If they are UString (or, on Py2, Bytes or Unicode), insert them into the fields dict.
 
-    All instances of PyString must be converted to one of:
+* descriptor.c: dtype strings are Unicode.
 
-    - PyBytes: byte character strings in Py2 and Py3
-    - PyUnicode: unicode strings in Py2 and Py3 
-    - PyUString: unicode in Py3, byte string in Py2
+* descriptor.c: datetime tuple contains Bytes only.
 
-    Decisions:
+* repr() and str() should return UString
 
-    * field names are UString
+* comparison between Unicode and Bytes is not defined in Py3
 
-    * field titles can be arbitrary objects.
-      If they are Unicode, insert to fields dict.
+* Type codes in numerictypes.typeInfo dict are Unicode
 
-    * dtype strings are Unicode.
+* Func name in errobj is Bytes (should be forced to ASCII)
 
-    * datetime tuple contains Unicode.
+.. todo::
 
-    * Exceptions should preferably be ASCII-only -> use AsUnicodeEscape
+   tp_doc -- it's a char* pointer, but what is the encoding?
+   Check esp. lib/src/_compiled_base
 
+.. todo::
 
-    TODO: Are exception strings bytes or unicode? What about tp_doc?
+   ufunc names -- again, what's the encoding?
 
-          Fix lib/src/_compiled_base accordingly.
+.. todo::
 
-    TODO: I have a feeling that we should avoid PyUnicode_AsUTF8EncodedString
-          wherever possible...
+   Replace all occurrences of PyString by PyBytes, PyUnicode, or PyUString.
 
-    TODO: Replace all occurrences of String by Bytes, Unicode or UString,
-          to ensure that we have made a conscious choice for each case in Py3K.
+.. todo::
 
-	  #define PyBytes -> PyString for Python 2 in npy_3kcompath.h
+   Finally, remove the convenience PyString #define from npy_3kcompat.h
 
-          Finally remove the PyString -> PyBytes defines from npy_3kcompat.h
-          This is probably the *easiest* way to make sure all of
-          the string/unicode transition has been audited.
+.. todo::
+
+   Revise errobj decision?
+
+.. todo::
+
+   Check that non-UString field names are not accepted anywhere.
+
+
+PyUnicode
+---------
+
+PyUnicode in Py3 is pretty much as it was in Py2, except that it is
+now the only "real" string type.
+
+In Py3, Unicode and Bytes are not comparable, i.e., 'a' != b'a'.  Numpy
+comparison routines were changed to act in the same way, leaving
+comparison between Unicode and Bytes undefined.
+
+.. todo::
+
+   Check that indeed all comparison routines were changed.
+
+
+Fate of the 'S' dtype
+---------------------
+
+"Strings" in Py3 are now Unicode, so it would make sense to
+re-associate Numpy's dtype letter 'S' with Unicode, and introduce
+a separate letter for Bytes.
+
+The Bytes dtype can probably not be wholly dropped -- there may be
+some use for 1-byte character strings in e.g. genetics?
+
+.. todo::
+
+   'S' dtype should be aliased to 'U'. One of the two should be deprecated.
+
+.. todo::
+
+   All dtype code should be checked for usage of *_STRINGLTR.
+
+.. todo::
+
+   A new 'bytes' dtype? Should the type code be 'y'?
+
+.. todo::
+
+   Catch all worms that come out of the can because of this change.
+   In any case, I guess many of the current failures in our test suite
+   are because code 'S' does not correspond to the `str` type.
+
+.. todo::
+
+   Currently, in parts of the code, both Bytes and Unicode strings
+   are classified as "strings", and share some of the code paths.
+
+   It should probably be checked if preferring Unicode for Py3 requires
+   changing some of these parts.
 
-          The String/Unicode transition is simply too dangerous to handle
-          by a blanket replacement. 
 
 PyInt
+-----
+
+There is no limited-range integer type any more in Py3.  It makes no
+sense to inherit Numpy ints from Py3 ints.
+
+Currently, the following is done:
+
+1) Numpy's integer types no longer inherit from Python integer.
+2) int is taken to be dtype-equivalent to NPY_LONG
+3) ints are converted to NPY_LONG
+
+PyInt methods are currently replaced by PyLong, via macros in npy_3kcompat.h.
+
+Dtype decision rules were changed accordingly, so that Numpy understands
+that Py3 int translates to NPY_LONG as far as dtypes are concerned.
 
-    PyInt is currently replaced by PyLong, via macros in npy_3kcompat.h
+.. todo::
 
-    Dtype decision rules were changed accordingly, so Numpy understands
-    Python int to be dtype-compatible with NPY_LONG.
+   Decide on
 
-    TODO: Decide on
+   * what is: array([1]).dtype
+   * what is: array([2**40]).dtype
+   * what is: array([2**256]).dtype
+   * what is: array([1]) + 2**40
+   * what is: array([1]) + 2**256
 
-          ... what is: array([1]).dtype
-          ... what is: array([2**40]).dtype
-          ... what is: array([2**256]).dtype
-          ... what is: array([1]) + 2**40
-          ... what is: array([1]) + 2**256
+   i.e. dtype casting rules. It seems to <pv> that we will want to
+   fix the dtype of Python 3 int to be the machine integer size,
+   despite the fact that the actual Python 3 object is not fixed-size.
 
-          ie. dtype casting rules. It seems to <pv> that we will want to
-          fix the dtype of Python 3 int to be the machine integer size,
-          despite the fact that the actual Python 3 object is not fixed-size.
+.. todo::
+
+   Audit the automatic dtype decision -- did I plug all the cases?
 
-    TODO: Audit the automatic dtype decision -- did I plug all the cases?
 
 Divide
+------
+
+The Divide operation is no more.
+
+Calls to PyNumber_Divide were replaced by FloorDivide or TrueDivide,
+as appropriate.
+
+The PyNumberMethods entry is #ifdef'd out on Py3, see above.
+
 
-    The Divide operation is no more.
+tp_compare, PyObject_Compare
+----------------------------
 
-    So we change array(1) / 10 == array(0.1)
+The compare method has vanished, and is replaced with richcompare.
+We just #ifdef the compare methods out on Py3.
 
-tp_compare
+New richcompare methods were implemented for:
 
-    The compare method has vanished.
+* flagsobject.c
 
-    TODO: ensure that all types that had only tp_compare have also
-          tp_richcompare.
+On the consumer side, we have a convenience wrapper in npy_3kcompat.h
+providing PyObject_Cmp also on Py3.
 
-pickles
+.. todo::
 
-    It is not possible to support Python 2 pickles in Python 3.
+   Ensure that all types that had only tp_compare also have
+   tp_richcompare.
 
-    This is because Python 2 strings pickle to Python 3 unicode objects,
-    which causes problems if there are non-ascii characters there.
-    The errors are raised before reaching any Numpy code, so we just can't
-    preserve backwards compatibility here.
+
+Pickling
+--------
+
+The ndarray and dtype __setstate__ were modified so that Py2 pickles
+can be loaded on Py3: they need to accept a Unicode endian
+character and Unicode data, since that's what Py2 str is unpickled to
+in Py3.
+
+An encoding assumption is required for backward compatibility: the user
+must do::
+
+    loads(f, encoding='latin1')
+
+to successfully read pickles created by Py2.
+
+.. todo::
+
+   Forward compatibility? Is it even possible?
+   For sure, we are not knowingly going to store data in PyUnicode,
+   so probably the only way for forward compatibility is to implement
+   a custom Unpickler for Py2?
+
+.. todo::
+
+   If forward compatibility is not possible, aim to store also the endian
+   character as Bytes...
 
 
 PyTypeObject
@@ -218,11 +545,6 @@ keep in mind.
 1) Because the first three slots are now part of a struct some compilers issue
    warnings if they are initialized in the old way.
 
-   In practice, it is necessary to use the Py_TYPE, Py_SIZE, Py_REFCNT
-   macros instead of accessing ob_type, ob_size and ob_refcnt
-   directly.  These are defined for backward compatibility in
-   private/npy_3kcompat.h
-
 2) The compare slot has been made reserved in order to preserve binary
    compatibily while the tp_compare function went away. The tp_richcompare
    function has replaced it and we need to use that slot instead. This will
@@ -233,142 +555,22 @@ keep in mind.
    bogus. They are not supposed to be explicitly initialized and were out of
    place in any case because an extra base slot was added in python 2.6.
 
-Because of these facts it was thought better to use #ifdefs to bring the old
-initializers up to py3k snuff rather than just fill the tp_richcompare slot.
-They also serve to mark the places where changes have been made. The new form
-is shown below. Note that explicit initialization can stop once none of the
+Because of these facts it is better to use #ifdefs to bring the old
+initializers up to py3k snuff rather than just fill the tp_richcompare
+slot.  They also serve to mark the places where changes have been
+made. Note that explicit initialization can stop once none of the
 remaining entries are non-zero, because zero is the default value that
 variables with non-local linkage receive.
 
 
-NPY_NO_EXPORT PyTypeObject Foo_Type = {
-#if defined(NPY_PY3K)
-    PyVarObject_HEAD_INIT(0,0)
-#else
-    PyObject_HEAD_INIT(0)
-    0,                                          /* ob_size */
-#endif
-    "numpy.foo"                                 /* tp_name */
-    0,                                          /* tp_basicsize */
-    0,                                          /* tp_itemsize */
-    /* methods */
-    0,                                          /* tp_dealloc */
-    0,                                          /* tp_print */
-    0,                                          /* tp_getattr */
-    0,                                          /* tp_setattr */
-#if defined(NPY_PY3K)
-    (void *)0,                                  /* tp_reserved */
-#else
-    0,                                          /* tp_compare */
-#endif
-    0,                                          /* tp_repr */
-    0,                                          /* tp_as_number */
-    0,                                          /* tp_as_sequence */
-    0,                                          /* tp_as_mapping */
-    0,                                          /* tp_hash */
-    0,                                          /* tp_call */
-    0,                                          /* tp_str */
-    0,                                          /* tp_getattro */
-    0,                                          /* tp_setattro */
-    0,                                          /* tp_as_buffer */
-    0,                                          /* tp_flags */
-    0,                                          /* tp_doc */
-    0,                                          /* tp_traverse */
-    0,                                          /* tp_clear */
-    0,                                          /* tp_richcompare */
-    0,                                          /* tp_weaklistoffset */
-    0,                                          /* tp_iter */
-    0,                                          /* tp_iternext */
-    0,                                          /* tp_methods */
-    0,                                          /* tp_members */
-    0,                                          /* tp_getset */
-    0,                                          /* tp_base */
-    0,                                          /* tp_dict */
-    0,                                          /* tp_descr_get */
-    0,                                          /* tp_descr_set */
-    0,                                          /* tp_dictoffset */
-    0,                                          /* tp_init */
-    0,                                          /* tp_alloc */
-    0,                                          /* tp_new */
-    0,                                          /* tp_free */
-    0,                                          /* tp_is_gc */
-    0,                                          /* tp_bases */
-    0,                                          /* tp_mro */
-    0,                                          /* tp_cache */
-    0,                                          /* tp_subclasses */
-    0,                                          /* tp_weaklist */
-    0,                                          /* tp_del */
-    0                                           /* tp_version_tag (2.6) */
-};
-
-checklist of types having tp_compare but no tp_richcompare
-
-1) multiarray/flagsobject.c
-
-PyNumberMethods
----------------
-
-Types with tp_as_number defined
-
-1) multiarray/arrayobject.c
-
-The slots np_divide, np_long, np_oct, np_hex, and np_inplace_divide
-have gone away. The slot np_int is what np_long used to be, tp_divide
-is now tp_floor_divide, and np_inplace_divide is now
-np_inplace_floor_divide. We will also have to make sure the
-*_true_divide variants are defined. This should also be done for
-python < 3.x, but that introduces a requirement for the
-Py_TPFLAGS_HAVE_CLASS in the type flag.
-
-/*
- * Number implementations must check *both* arguments for proper type and
- * implement the necessary conversions in the slot functions themselves.
-*/
-PyNumberMethods foo_number_methods = {
-    (binaryfunc)0,                              /* nb_add */
-    (binaryfunc)0,                              /* nb_subtract */
-    (binaryfunc)0,                              /* nb_multiply */
-    (binaryfunc)0,                              /* nb_remainder */
-    (binaryfunc)0,                              /* nb_divmod */
-    (ternaryfunc)0,                             /* nb_power */
-    (unaryfunc)0,                               /* nb_negative */
-    (unaryfunc)0,                               /* nb_positive */
-    (unaryfunc)0,                               /* nb_absolute */
-    (inquiry)0,                                 /* nb_bool, nee nb_nonzero */
-    (unaryfunc)0,                               /* nb_invert */
-    (binaryfunc)0,                              /* nb_lshift */
-    (binaryfunc)0,                              /* nb_rshift */
-    (binaryfunc)0,                              /* nb_and */
-    (binaryfunc)0,                              /* nb_xor */
-    (binaryfunc)0,                              /* nb_or */
-    (unaryfunc)0,                               /* nb_int */
-    (void *)0,                                  /* nb_reserved, nee nb_long */
-    (unaryfunc)0,                               /* nb_float */
-    (binaryfunc)0,                              /* nb_inplace_add */
-    (binaryfunc)0,                              /* nb_inplace_subtract */
-    (binaryfunc)0,                              /* nb_inplace_multiply */
-    (binaryfunc)0,                              /* nb_inplace_remainder */
-    (ternaryfunc)0,                             /* nb_inplace_power */
-    (binaryfunc)0,                              /* nb_inplace_lshift */
-    (binaryfunc)0,                              /* nb_inplace_rshift */
-    (binaryfunc)0,                              /* nb_inplace_and */
-    (binaryfunc)0,                              /* nb_inplace_xor */
-    (binaryfunc)0,                              /* nb_inplace_or */
-    (binaryfunc)0,                              /* nb_floor_divide */
-    (binaryfunc)0,                              /* nb_true_divide */
-    (binaryfunc)0,                              /* nb_inplace_floor_divide */
-    (binaryfunc)0,                              /* nb_inplace_true_divide */
-    (unaryfunc)0                                /* nb_index */
-};
-
 PySequenceMethods
 -----------------
 
 Types with tp_as_sequence defined
 
-1) multiarray/descriptor.c
-2) multiarray/scalartypes.c.src
-3) multiarray/arrayobject.c
+* multiarray/descriptor.c
+* multiarray/scalartypes.c.src
+* multiarray/arrayobject.c
 
 PySequenceMethods in py3k are binary compatible with py2k, but some of the
 slots have gone away. I suspect this means some functions need redefining so
@@ -387,16 +589,21 @@ PySequenceMethods foo_sequence_methods = {
     (ssizeargfunc)0                             /* sq_inplace_repeat */
 };
 
+.. todo::
+
+   Check semantics of the PySequence methods.
+
+
 PyMappingMethods
 ----------------
 
 Types with tp_as_mapping defined
 
-1) multiarray/descriptor.c
-2) multiarray/iterators.c
-3) multiarray/scalartypes.c.src
-4) multiarray/flagsobject.c
-5) multiarray/arrayobject.c
+* multiarray/descriptor.c
+* multiarray/iterators.c
+* multiarray/scalartypes.c.src
+* multiarray/flagsobject.c
+* multiarray/arrayobject.c
 
 PyMappingMethods in py3k look to be the same as in py2k. The semantics
 of the slots needs to be checked.
@@ -407,47 +614,10 @@ PyMappingMethods foo_mapping_methods = {
     (objobjargproc)0                        /* mp_ass_subscript */
 };
 
+.. todo::
 
-PyBuffer
---------
-
-Parts involving the PyBuffer_* likely require the most work, and they
-are widely spread in multiarray:
-
-1) The void scalar makes use of buffers
-2) Multiarray has methods for creating buffers etc. explicitly
-3) Arrays can be created from buffers etc.
-4) The .data attribute of an array is a buffer
-
-There are two things to note in 3K:
-
-1) The buffer protocol has changed.  It is also now quite complicated,
-   and implementing it properly requires several pieces.
-
-2) There is no PyBuffer object any more. Instead, a MemoryView
-   object is present, but it always must piggy-pack on another existing
-   object.
-
-Currently, what has been done is:
-
-1) Replace protocol implementations with stubs that either raise errors
-   or offer limited functionality.
+   Check semantics of the PyMapping methods.
 
-2) Replace PyBuffer usage by PyMemoryView where possible.
-
-3) ... and where not possible, use stubs that raise errors.
-
-What likely needs to be done is:
-
-1) Implement a simple "stub" compatibility buffer object 
-   the memoryview can piggy-pack on.
-
-
-PyNumber_Divide
----------------
-
-This function has vanished -- needs to be replaced with PyNumber_TrueDivide
-or FloorDivide.
 
 PyFile
 ------
@@ -458,50 +628,37 @@ Many of the PyFile items have disappeared:
 2) PyFile_AsFile
 3) PyFile_FromString
 
-Compatibility wrappers for these are now in private/npy_3kcompat.h
+Most importantly, in Py3 there is no way to extract a FILE* pointer
+from the Python file object. There are, however, new PyFile_* functions
+for writing and reading data from the file.
 
+Temporary compatibility wrappers that return a `fdopen` file pointer
+are in private/npy_3kcompat.h.  However, this is an unsatisfactory
+approach, since the FILE* pointer returned by `fdopen` cannot be freed:
+calling `fclose` on it would also close the underlying file.
 
-PyString
---------
-
-PyString was removed, and needs to be replaced either by PyBytes or PyUnicode.
-The plan of attack currently is:
-
-1) The 'string' array dtype will be replaced by Bytes
-2) The 'unicode' array dtype will stay Unicode
-3) dtype fields names can be *either* Bytes or Unicode
+.. todo::
 
-Some compatibility wrappers are defined in private/npy_3kcompat.h,
-redefining essentially String as Bytes.
+   Adapt all Numpy I/O to use the PyFile_* methods or the low-level
+   IO routines. In any case, it's unlikely that C stdio can be used any more.
 
-However, at least following points need still to be audited:
+   Perhaps using PyFile_* would make numpy.tofile work with e.g. a gzip file?
 
-1) PyObject_Str -> it now returns unicodes
-2) tp_doc -> char* string, but is it in unicode or what?
 
-
-RO
---
+READONLY
+--------
 
 The RO alias for READONLY is no more.
 
+All uses of RO were replaced by READONLY, which is also present on Py2.
+
 
 Py_TPFLAGS_CHECKTYPES
 ---------------------
 
 This has vanished and is always on in Py3K.
 
-
-PyInt
------
-
-There is no limited-range integer type any more in Py3K.
-
-Currently, the plan is the following:
-
-1) Numpy's integer types no longer inherit from Python integer.
-2) Convert Longs to integers, if their size is small enough and known.
-3) Otherwise, use long longs.
+It is currently #ifdef'd out for Py3.
 
 
 PyOS
@@ -512,3 +669,16 @@ Deprecations:
 1) PyOS_ascii_strtod -> PyOS_double_from_string;
    curiously enough, PyOS_ascii_strtod is not only deprecated but also
    causes segfaults
+
+
+PyInstance
+----------
+
+There are some checks for PyInstance in ``common.c`` and ``ctors.c``.
+
+Currently, ``PyInstance_Check`` is just #ifdef'd out for Py3. This is,
+quite likely, not the correct thing to do.
+
+.. todo::
+
+   Do the right thing for PyInstance checks.
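
The "Exception syntax" section added by this patch recommends fetching the active exception through a helper instead of binding it in the ``except`` clause, so that the same source parses on both Py2 and Py3. A minimal sketch of that pattern follows. The local ``getexception`` below is a stand-in written for this illustration (it assumes the helper simply returns the exception currently being handled), and ``parse_int`` is a hypothetical example function.

```python
import sys


def getexception():
    # Stand-in for the compat helper described in the notes (assumption):
    # return the exception currently being handled, without relying on
    # either "except X, e" (Py2 only) or "except X as e" (Py2.6+ and Py3).
    return sys.exc_info()[1]


def parse_int(text):
    # Hypothetical example function, used only for illustration.
    try:
        return int(text)
    except ValueError:
        exc = getexception()
        return "failed: %s" % type(exc).__name__
```

Calling ``parse_int("12")`` returns the integer, while ``parse_int("spam")`` goes through the ``except`` branch and reports the exception type obtained from the helper.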
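
The "Pickling" section added by this patch says that Py2 pickles can only be read on Py3 under an encoding assumption. The sketch below shows why ``latin1`` is the safe choice, using the standard ``pickle`` module on Py3. The byte string ``PY2_PICKLE`` is a hand-written protocol-2 pickle assumed to match what Py2 would emit for the one-byte ``str`` value ``'\xe9'``; the function names are hypothetical.

```python
import pickle

# Hand-written protocol-2 pickle of the Py2 byte string '\xe9'
# (opcodes: PROTO 2, SHORT_BINSTRING of length 1, BINPUT 0, STOP).
PY2_PICKLE = b"\x80\x02U\x01\xe9q\x00."


def load_py2_pickle(data):
    # latin1 maps every byte 0-255 to the code point with the same value,
    # so no byte of a Py2 str can fail to decode.
    return pickle.loads(data, encoding="latin1")


def default_encoding_fails(data):
    # The default encoding is ASCII, which rejects bytes above 0x7f.
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False
```

Because the decoded text preserves the original byte values one-to-one, code such as ``__setstate__`` can recover the raw bytes afterwards by encoding the string back to ``latin1``.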