-rw-r--r--  doc/neps/npy-format.rst  23
-rw-r--r--  numpy/lib/format.py      13
2 files changed, 33 insertions, 3 deletions
diff --git a/doc/neps/npy-format.rst b/doc/neps/npy-format.rst
index f7bb799f2..bf88c3fee 100644
--- a/doc/neps/npy-format.rst
+++ b/doc/neps/npy-format.rst
@@ -199,6 +199,21 @@ bytes of the array. Consumers can figure out the number of bytes by
 multiplying the number of elements given by the shape (noting that shape=()
 means there is 1 element) by dtype.itemsize.
 
+Format Specification: Version 1.0
+---------------------------------
+
+The version 1.0 format only allowed the array header to have a
+total size of 65535 bytes. This can be exceeded by structured
+arrays with a large number of columns. The version 2.0 format
+extends the header size to 4 GiB. `numpy.save` will automatically
+save in 2.0 format if the data requires it, else it will always use
+the more compatible 1.0 format.
+
+The description of the fourth element of the header therefore has
+become:
+
+    The next 4 bytes form a little-endian unsigned int: the length
+    of the header data HEADER_LEN.
 
 Conventions
 -----------
@@ -269,13 +284,15 @@ the file format.
 Implementation
 --------------
 
-The current implementation is included in the 1.0.5 release of numpy.
-
-    http://github.com/numpy/numpy/blob/v1.5.0/numpy/lib/format.py
+The version 1.0 implementation was first included in the 1.0.5 release of
+numpy, and remains available. The version 2.0 implementation was first
+included in the 1.9.0 release of numpy.
 
 Specifically, the file format.py in this directory implements the
 format as described here.
 
+    http://github.com/numpy/numpy/blob/master/numpy/lib/format.py
+
 References
 ----------
 
diff --git a/numpy/lib/format.py b/numpy/lib/format.py
index 67da0d6d1..4ff0a660f 100644
--- a/numpy/lib/format.py
+++ b/numpy/lib/format.py
@@ -128,6 +128,19 @@ Consumers can figure out the number of bytes by multiplying the number
 of elements given by the shape (noting that ``shape=()`` means there is 1
 element) by ``dtype.itemsize``.
 
+Format Version 2.0
+------------------
+
+The version 1.0 format only allowed the array header to have a total size of
+65535 bytes. This can be exceeded by structured arrays with a large number of
+columns. The version 2.0 format extends the header size to 4 GiB.
+`numpy.save` will automatically save in 2.0 format if the data requires it,
+else it will always use the more compatible 1.0 format.
+
+The description of the fourth element of the header therefore has become:
+"The next 4 bytes form a little-endian unsigned int: the length of the header
+data HEADER_LEN."
+
 Notes
 -----
 The ``.npy`` format, including reasons for creating it and a comparison of
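
The difference between the two versions comes down to the width of the HEADER_LEN field, which can be illustrated with a minimal sketch. It is not the code in ``format.py``: the names ``read_header_len``, ``build_stream`` and ``HEADER_LEN_FORMAT`` are hypothetical, and the sketch assumes the layout the format documentation describes (a 6-byte magic string, one byte each for the major and minor version, then HEADER_LEN as a 2-byte field in 1.0 or a 4-byte field in 2.0, both little-endian).

```python
import io
import struct

MAGIC = b"\x93NUMPY"  # 6-byte magic string at the start of every .npy file

# struct formats for the HEADER_LEN field, keyed by major version:
# 1.x uses a 2-byte little-endian unsigned short (max 65535),
# 2.x uses a 4-byte little-endian unsigned int (max 2**32 - 1, about 4 GiB).
HEADER_LEN_FORMAT = {1: "<H", 2: "<I"}


def read_header_len(fp):
    """Return ((major, minor), HEADER_LEN) from a binary stream at offset 0."""
    prefix = fp.read(len(MAGIC) + 2)
    if len(prefix) != len(MAGIC) + 2 or not prefix.startswith(MAGIC):
        raise ValueError("stream does not start with the .npy magic string")
    major, minor = prefix[-2], prefix[-1]
    try:
        fmt = HEADER_LEN_FORMAT[major]
    except KeyError:
        raise ValueError(f"unsupported format version {major}.{minor}") from None
    raw = fp.read(struct.calcsize(fmt))
    if len(raw) != struct.calcsize(fmt):
        raise ValueError("truncated HEADER_LEN field")
    (header_len,) = struct.unpack(fmt, raw)
    return (major, minor), header_len


def build_stream(major, header):
    """Hypothetical helper: assemble magic, version, HEADER_LEN and header bytes."""
    length = struct.pack(HEADER_LEN_FORMAT[major], len(header))
    return io.BytesIO(MAGIC + bytes([major, 0]) + length + header)


# Illustrative, unpadded header text; real files pad it for alignment.
header = b"{'descr': '<i8', 'fortran_order': False, 'shape': (3,), }\n"

v1 = read_header_len(build_stream(1, header))
v2 = read_header_len(build_stream(2, header))
print(v1, v2)
```

Both calls report the same header length; only the number of bytes used to store it differs. Packing a length above 65535 with ``"<H"`` raises ``struct.error``, which is the limit the 2.0 format lifts.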