path: root/numpy/lib/npyio.py
Commit message [Author, Age, Files, Lines]
* BUG: properly handle tuple keys in NpZFile.__getitem__ (#23757) [Nathan Goldbaum, 2023-05-12, 1 file, -1/+1]
|   * BUG: properly handle tuple keys in NpZFile.__getitem__
|   * TST: test tuple rendering specifically.
|
|   Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
* EHN: add __contains__() to np.lib.npyio.NpzFile [f380cedric, 2023-04-25, 1 file, -0/+3]
|   NpzFile inherits from collections.abc.Mapping, which provides __contains__(). However, that inherited implementation calls __getitem__(), which can be slow because it performs file decompression on success.
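The membership test described in this entry can be sketched as follows. This is only an illustration: the in-memory archive and the array names `a` and `b` are assumptions made for the example, not part of the commit.

```python
import io

import numpy as np

# Build a small .npz archive in memory (the names "a" and "b" are illustrative).
buf = io.BytesIO()
np.savez(buf, a=np.arange(3), b=np.ones(2))
buf.seek(0)

with np.load(buf) as npz:
    has_a = "a" in npz               # membership test on the archive
    has_missing = "missing" in npz   # a name that was never saved
    names = sorted(npz.files)        # names of the stored arrays
```

The `in` test only needs to answer whether a name is present, which is why a dedicated `__contains__` can avoid loading the array itself.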
* ENH: ``__repr__`` for NpzFile object (#23357) [Ganesh Kathiresan, 2023-04-06, 1 file, -0/+16]
|   Improves the repr to include information about the arrays contained, e.g.:
|
|       >>> npzfile = np.load('arr.npz')
|       >>> npzfile
|       NpzFile 'arr.npz' with keys arr_0, arr_1, arr_2, arr_3, arr_4...
|
|   closes #23319
|
|   Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
* MAINT: Fix reference roles of ast [yuki, 2023-03-25, 1 file, -5/+5]
|
* MAINT, DOC: string_ → bytes_ and unicode_ → str_ [Dimitri Papadopoulos, 2023-02-10, 1 file, -1/+1]
|
* API: Raise EOFError when trying to load past the end of a `.npy` file (#23105) [Noé Rubinstein, 2023-01-27, 1 file, -0/+5]
|   Currently, the following code:
|
|   ```
|   import numpy as np
|
|   with open('foo.npy', 'wb') as f:
|       for i in range(np.random.randint(10)):
|           np.save(f, 1)
|
|   with open('foo.npy', 'rb') as f:
|       while True:
|           np.load(f)
|   ```
|
|   Will raise:
|
|   ```
|   ValueError: Cannot load file containing pickled data when allow_pickle=False
|   ```
|
|   While there is no pickled data in the file.
* ENH: Improve array function overhead by using vectorcall [Sebastian Berg, 2023-01-17, 1 file, -28/+5]
|   This moves dispatching for `__array_function__` into a C-wrapper. This helps speed for multiple reasons:
|
|   * Avoids one additional dispatching function call to C
|   * Avoids the use of `*args, **kwargs` which is slower.
|   * For simple NumPy calls we can stay in the faster "vectorcall" world
|
|   This speeds up things generally a little, but can speed things up a lot when keyword arguments are used on lightweight functions, for example::
|
|       np.can_cast(arr, dtype, casting="same_kind")
|
|   is more than twice as fast with this.
|
|   There is one alternative in principle to get best speed: We could inline the "relevant argument"/dispatcher extraction. That changes behavior in an acceptable but larger way (passes default arguments). Unless the C-entry point seems unwanted, this should be a decent step in the right direction even if we want to do that eventually, though.
|
|   Closes gh-20790
|   Closes gh-18547 (although not quite sure why)
* BUG: np.loadtxt cannot load text file with quoted fields separated by whitespace (#22906) [dmbelov, 2023-01-01, 1 file, -0/+8]
|   Fix issue with `delimiter=None` and quote character not working properly (not using whitespace delimiter mode).
|
|   Closes gh-22899
* Merge pull request #22393 from seberg/npy_header [Matti Picus, 2022-10-07, 1 file, -6/+26]
|\
| |   MAINT: Ensure graceful handling of large header sizes
| * MAINT: Ensure graceful handling of large header sizes [Sebastian Berg, 2022-10-06, 1 file, -6/+26]
| |   This ensures graceful handling of large header files. Unfortunately, it may be a bit inconvenient for users, thus the new kwarg and the work-around of also accepting allow-pickle.
| |
| |   See also the documentation here: https://docs.python.org/3.10/library/ast.html#ast.literal_eval
* | DOC: Use versionchanged and add in note about newline chars. [Ross Barnowski, 2022-10-04, 1 file, -3/+7]
| |
* | DOC: Update delimiter param description. [Ross Barnowski, 2022-10-03, 1 file, -1/+2]
|/
|   Explicitly state that only single-character delimiters are supported.
* DOC: Improve `converters` parameter description for loadtxt (#22254) [Ross Barnowski, 2022-09-28, 1 file, -6/+4]
|   * DOC: Make converters param description more concise.
|
|     A wording proposal to hopefully make the description of the converters parameter of loadtxt more clear, and direct readers to the example section.
|
|   * DOC: Combine both suggestions for param descr.
* DOC: Add versionchanged for converter callable behavior. [Ross Barnowski, 2022-07-19, 1 file, -0/+5]
|
* DOC: Clarify loadtxt input cols requirement (#21861) [Pranab Das, 2022-07-02, 1 file, -2/+14]
|   Also add an example to illustrate how usecols can be used to read a file with varying number of fields.
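A hedged sketch of the `usecols` technique mentioned in this entry (it is not the docstring example itself; the sample text is made up). Rows have 3, 4, and 5 fields, so only columns present in every row are requested.

```python
from io import StringIO

import numpy as np

# Rows with a varying number of fields (sample data invented for illustration).
text = StringIO("1 2 3\n4 5 6 7\n8 9 10 11 12\n")

# Request only columns that exist in every row.
first_two = np.loadtxt(text, dtype=int, usecols=(0, 1))
```

Without `usecols`, the differing row lengths would make the input unreadable as a rectangular array.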
* DOC: mention changes to `max_rows` behaviour in `np.loadtxt` (#21854) [Pranab Das, 2022-06-27, 1 file, -2/+9]
|   * DOC: mention changes to `max_rows` behaviour
|   * Clarify how line counting works in max_rows
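The line counting referred to here can be illustrated with a small sketch. It assumes the behaviour documented for NumPy 1.23 and later, where comment lines and empty lines do not count towards `max_rows`; the sample text is invented.

```python
from io import StringIO

import numpy as np

# One comment line, one blank line, and three data rows (sample data).
text = StringIO("# header\n1 2\n\n3 4\n5 6\n")

# Assumes NumPy >= 1.23: only lines that contain data count towards max_rows.
rows = np.loadtxt(text, dtype=int, max_rows=2)
```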
* Remove deprecated iterates [Brigitta Sipocz, 2022-05-17, 1 file, -20/+0]
|
* Add space after argument name [Oscar Gustafsson, 2022-04-03, 1 file, -4/+4]
|
* Merge pull request #20580 from seberg/add-npreadtext [Matti Picus, 2022-02-08, 1 file, -325/+427]
|\
| |   ENH: Move `loadtxt` to C for much better speed
| * Add two new examples of converters to docstring examples [Ross Barnowski, 2022-02-07, 1 file, -0/+19]
| |   - Floats with underscores
| |   - Floats + hex floats.
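The first of these cases can be sketched as below. The input string is illustrative; the point is that Python's built-in `float` accepts underscores as digit separators, so it can be passed as the converter.

```python
from io import StringIO

import numpy as np

# "100_000" uses an underscore as a digit separator (sample input).
s = StringIO("1 2.7 100_000")

# Python's float() understands the underscore, so use it as the converter.
values = np.loadtxt(s, converters=float)
```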
| * Handle delimiter as bytes. [Ross Barnowski, 2022-01-28, 1 file, -1/+3]
| |
| * Add test for empty string as control characters. [Ross Barnowski, 2022-01-28, 1 file, -0/+5]
| |   Includes comments param, which is handled on the Python side.
| * TST: Some tests for control character collisions. [Ross Barnowski, 2022-01-28, 1 file, -6/+7]
| |   Adds some tests for the behavior of control characters, e.g. comments, delimiter and quotechar, when they have the same value. At this stage, these tests are more to frame the discussion about what the behavior should be, not to test what it currently is.
| |
| |   I personally think raising an exception is correct for most of these situations, though it's worth noting that np.loadtxt currently doesn't for most of these corner cases (and seems to randomly assign precedence to delimiter over comments or vice versa depending on the values).
| * Add quotechar to examples. [Ross Barnowski, 2022-01-18, 1 file, -1/+21]
| |
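A sketch of what a `quotechar` example can look like, in the spirit of the ones this commit added to the docstring. The sample text and the field names `label` and `value` are assumptions for illustration.

```python
from io import StringIO

import numpy as np

# The quoted labels contain both the delimiter and the comment character.
s = StringIO('"alpha, #42", 10.0\n"beta, #64", 2.0\n')
dtype = np.dtype([("label", "U12"), ("value", float)])

# quotechar keeps the quoted text together as a single field.
table = np.loadtxt(s, dtype=dtype, delimiter=",", quotechar='"')
```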
| * Update and add converters examples. [Ross Barnowski, 2022-01-18, 1 file, -1/+33]
| |
| * BUG: Fix loadtxt no data warning stacklevel [Sebastian Berg, 2022-01-14, 1 file, -1/+1]
| |
| * DOC: Remove outdated loadtxt TODOs from code [Sebastian Berg, 2022-01-14, 1 file, -3/+5]
| |
| * MAINT: Use skiplines rather than skiprows internally throughout [Sebastian Berg, 2022-01-14, 1 file, -6/+6]
| |   Skiplines is simply the clearer name, since "rows" makes more sense for output rows (which implies, for example, that a line is not empty).
| * MAINT: Move usecol handling to C and support more than integer cols [Sebastian Berg, 2022-01-14, 1 file, -17/+4]
| |   Of course to actually use that many columns you need A LOT of memory right now. Each field stores at least a UCS4 NUL character, but the field is padded enough to require 16 bytes. We always parse a full row, so that requires 20 bytes per field... (i.e. 32 GiB RAM is not enough to test this :)).
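The memory estimate can be checked with a few lines of arithmetic. One assumption is made here that the commit message does not state: "integer" is taken to mean a 32-bit signed C `int`, so the first column count beyond it is 2**31.

```python
# Figure quoted in the commit message above.
BYTES_PER_FIELD = 20

# Assumption (not stated in the source): "integer" means a 32-bit signed
# C int, so the smallest column count that no longer fits is 2**31.
COLUMNS_BEYOND_INT32 = 2**31

GIB = 2**30
row_bytes = COLUMNS_BEYOND_INT32 * BYTES_PER_FIELD
row_gib = row_bytes / GIB  # memory needed to parse one full row, in GiB
```

Under that assumption a single row needs 40 GiB, which is consistent with the remark that 32 GiB of RAM is not enough.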
| * Add warning on empty file + tests. [Ross Barnowski, 2022-01-14, 1 file, -6/+7]
| |
| * Add UserWarning when reading no data. [Ross Barnowski, 2022-01-14, 1 file, -0/+7]
| |
| * Add tests for quote+multichar comments. [Ross Barnowski, 2022-01-14, 1 file, -2/+2]
| |   Also correct exception message.
| * Rename quotechar param and update docstring. [Ross Barnowski, 2022-01-14, 1 file, -6/+13]
| |
| * ENH: Reject empty string as control character [Sebastian Berg, 2022-01-14, 1 file, -25/+31]
| |   `None` is forced instead in all cases (mainly applies to comments). This is not really a change in behaviour: It was always utterly broken.
| |
| |   The one weird thing about it is that `delimiter=None` means "any whitespace", while `quote=None` and `comments=None` mean that no quote/comment character exists at all.
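The asymmetry between the two `None` values can be seen in a short sketch. The sample text is invented; `delimiter` is left at its default of `None` (split on any whitespace), while `comments=None` disables comment handling so that `#` is kept as data.

```python
from io import StringIO

import numpy as np

# "#" appears inside the fields and must not start a comment (sample data).
text = StringIO("a#1 b#2\n")

# delimiter defaults to None (any whitespace); comments=None means that
# no comment character exists at all.
fields = np.loadtxt(text, dtype=str, comments=None)
```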
| * MAINT: Address Tylers review comments [Sebastian Berg, 2022-01-14, 1 file, -3/+0]
| |   (Mainly revising the doc strings)
| * STY: Fix some style issues (mainly long lines) [Sebastian Berg, 2022-01-14, 1 file, -12/+14]
| |   Note that one of the long lines is a link that cannot be split reasonably.
| * ENH: Allow a single converter to be used for all columns [Sebastian Berg, 2022-01-14, 1 file, -12/+14]
| |   This is always used if it is a callable.
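A minimal sketch of a single callable applied to every column. The hexadecimal input is illustrative, and `functools.partial(int, base=16)` is just one convenient choice of callable.

```python
import functools
from io import StringIO

import numpy as np

# Every field is a hexadecimal integer literal (sample data).
s = StringIO("0xDE 0xAD\n0xC0 0xDE")

# One callable instead of a {column: callable} mapping: it is applied to
# all columns.
parse_hex = functools.partial(int, base=16)
table = np.loadtxt(s, converters=parse_hex)
```

The result keeps the default float dtype because only the conversion step was replaced, not the output type.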
| * ENH: Move npreadtext into NumPy for faster text reading [Sebastian Berg, 2022-01-14, 1 file, -311/+323]
| |   This replaces `np.loadtxt` with the new textreader. The file has a few minor cleanups compared to the npreadtext version.
| |
| |   npreadtext was started by Warren Weckesser for inclusion in NumPy and then very heavily modified by me (Sebastian Berg) to improve it and slim it down slightly. Some parts of this code are inspired by or even taken from the pandas parser (mainly the integer parsers are fairly verbatim still).
| |
| |   Co-authored-by: Warren Weckesser <warren.weckesser@gmail.com>
* | DOC: lib/io.py was renamed to lib/npyio.py [Matthias Bussonnier, 2022-01-28, 1 file, -1/+1]
| |   In 44118aedbac7c1c4465443ec23d104a83b9a24f9 (2010), so this docs example would raise a `ValueError`.
* | MAINT, DOC: fix new typos detected by codespell [Dimitri Papadopoulos, 2022-01-12, 1 file, -1/+1]
|/
* ENH: add ndmin to `genfromtxt` behaving the same as `loadtxt` (#20500) [Ivan Gonzalez, 2021-12-16, 1 file, -19/+48]
|
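A short sketch of the `ndmin` parameter this entry adds to `genfromtxt`. It assumes a NumPy release that includes the change; the one-line input is illustrative.

```python
from io import StringIO

import numpy as np

# A single line of data normally comes back as a 1-D array.
flat = np.genfromtxt(StringIO("1 2 3"))

# With ndmin=2 the result keeps a row axis, as loadtxt does.
row = np.genfromtxt(StringIO("1 2 3"), ndmin=2)
```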
* BUG: Fix types of errors raised by genfromtxt (#20389) [André Elimelek de Weber, 2021-12-03, 1 file, -11/+10]
|
* DOC: updated file object docstring [Arushi Sharma, 2021-10-26, 1 file, -1/+1]
|
* DOC: updated docstring for binary file object [Arushi Sharma, 2021-10-21, 1 file, -1/+2]
|
* MAINT: lib: Check that the dtype given to fromregex is structured. [warren, 2021-09-22, 1 file, -8/+8]
|   In fromregex, add a check that verifies that the given dtype is a structured datatype. This avoids confusing error messages that can occur when the given data type is not structured.
|
|   Also tweaked the code in the Examples section.
|
|   Closes gh-8891.
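For reference, a call that satisfies the check looks roughly like this. The text, the pattern, and the field names `num` and `key` are illustrative; what matters is that the dtype is structured, with one field per regex group.

```python
from io import StringIO

import numpy as np

# Each line holds a number followed by a three-character key (sample data).
text = StringIO("1312 foo\n1534 bar\n444 qux")
regexp = r"(\d+)\s+(...)"  # two groups, so the dtype needs two fields

# A structured dtype: one named field per regex group.
records = np.fromregex(text, regexp, [("num", np.int64), ("key", "S3")])
```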
* DOC: Typos found by codespell [Dimitri Papadopoulos, 2021-09-21, 1 file, -1/+1]
|
* MAINT: revise OSError aliases (IOError, EnvironmentError) [Mike Taves, 2021-09-02, 1 file, -4/+6]
|
* Merge pull request #19725 from anntzer/loadtxt-fh-closing [Matti Picus, 2021-08-26, 1 file, -14/+12]
|\
| |   MAINT: Use a contextmanager to ensure loadtxt closes the input file.
| * MAINT: Use a contextmanager to ensure loadtxt closes the input file. [Antony Lee, 2021-08-22, 1 file, -14/+12]
| |   This seems easier to track than a giant try... finally.
| |
| |   Also move the `fencoding` initialization to within the contextmanager, in the rather unlikely case an exception occurs during the call to `getpreferredencoding`.
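The general pattern can be sketched in plain Python. This is not the code from the commit: `count_data_lines` is a hypothetical helper, and the choice of `contextlib.nullcontext` for caller-owned objects is an assumption made for the illustration.

```python
import contextlib
import io


def count_data_lines(source):
    """Hypothetical helper: count non-blank lines in a path or file object."""
    if isinstance(source, str):
        # We opened the file ourselves, so the with-block must close it.
        ctx = open(source, "r")
    else:
        # The caller owns the file object, so leave it open on exit.
        ctx = contextlib.nullcontext(source)
    with ctx as fh:
        return sum(1 for line in fh if line.strip())


buf = io.StringIO("1 2\n\n3 4\n")
n_lines = count_data_lines(buf)
still_open = not buf.closed
```

A single `with` block covers both cases, which is the readability gain over a long `try ... finally`.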
* | MAINT: Avoid use of confusing compat aliases. [Antony Lee, 2021-08-24, 1 file, -2/+2]
| |   As of Py3, np.compat.unicode == str, but that's not entirely obvious (it could correspond to some numpy dtype too), so just use plain str. Likewise for np.compat.int.
| |
| |   Tests are intentionally left unchanged, as they can be considered as implicitly testing the np.compat.py3k interface as well.