path: root/rdflib/plugins/parsers
Commit message    Author    Age    Files    Lines
* fix: eliminate bare `except:` (#2350)    Iwan Aucamp    2023-04-12    2    -4/+4
|
| Replace bare `except:` with `except Exception`. There are some cases where it can be narrowed further, but this is already an improvement over the current situation.
|
| This is somewhat pursuant to eliminating [flakeheaven](https://github.com/flakeheaven/flakeheaven), as it no longer supports the latest version of flake8 [[ref](https://github.com/flakeheaven/flakeheaven/issues/132)]. It is also just the right thing to do, as bare `except:` clauses can cause problems.
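To illustrate the change described above, here is a minimal sketch (not code from rdflib; `parse_int` and `swallow_with_exception_clause` are hypothetical helpers). A bare `except:` behaves like `except BaseException:`, so it also swallows `KeyboardInterrupt` and `SystemExit`, whereas `except Exception` lets those propagate.

```python
def parse_int(text):
    """Return int(text), or None if the text cannot be converted."""
    try:
        return int(text)
    except Exception:  # narrower than a bare `except:`; ValueError would be narrower still
        return None


def swallow_with_exception_clause(exc):
    """Raise `exc` and report whether an `except Exception` clause catches it."""
    try:
        raise exc
    except Exception:
        return "caught"


print(parse_int("42"))
print(parse_int("not a number"))
print(swallow_with_exception_clause(ValueError("boom")))

# KeyboardInterrupt derives from BaseException, not Exception, so it escapes.
try:
    swallow_with_exception_clause(KeyboardInterrupt())
except KeyboardInterrupt:
    print("KeyboardInterrupt propagated")
```

Narrowing further to the specific exception type (for example `ValueError` in `parse_int`) is usually preferable where the failure mode is known.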
* refactor: eliminate inheritance from object (#2339)    Iwan Aucamp    2023-04-10    4    -7/+7
|
| This change removes the redundant inheritance from `object` (i.e. `class Foo(object): pass`), which is a relic from Python 2 and is no longer needed in Python 3.
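A small sketch of why the explicit base class is redundant in Python 3 (`Foo` and `Bar` are illustrative names only): both spellings produce a class whose method resolution order ends in `object`.

```python
class Foo(object):  # Python 2 era spelling
    pass


class Bar:  # equivalent Python 3 spelling
    pass


# Both classes implicitly or explicitly derive from `object`.
print(Foo.__mro__ == (Foo, object))
print(Bar.__mro__ == (Bar, object))
```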
* build(deps-dev): bump black from 22.12.0 to 23.1.0 (#2248)    dependabot[bot]    2023-03-11    5    -12/+2
|
* fix: small InputSource related issues (#2255)    Iwan Aucamp    2023-03-11    1    -18/+19
|
| I have added a bunch of tests for `InputSource` handling, checking most kinds of input source with most parsers. During this, I detected the following issues that I fixed:
|
| - `rdflib.util._iri2uri()` was URL quoting the `netloc` parameter, but this is wrong and the `idna` encoding already takes care of special characters. I removed the URL quoting of `netloc`.
| - HexTuple parsing was handling the input source in a way that would only work for some input sources, and not raising errors for other input sources. I changed the input source handling to be more generic.
| - `rdflib.parser.create_input_source()` incorrectly used `file.buffer` instead of `source.buffer` when dealing with IO stream sources.
|
| Other changes with no runtime impact include:
|
| - Changed the HTTP mocking stuff in test slightly to accommodate serving arbitrary files, as I used this in the `InputSource` tests.
| - Don't use Google in tests, as we keep getting `urllib.error.HTTPError: HTTP Error 429: Too Many Requests` from it.
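The `netloc` point above can be sketched with the standard library alone. This is not the code of `rdflib.util._iri2uri()`; the host names are made up, and the snippet only shows that IDNA encoding already yields an ASCII host, while percent-quoting a `netloc` can mangle delimiters such as the port colon.

```python
from urllib.parse import quote

# IDNA encoding turns a non-ASCII host name into plain ASCII.
host = "bücher.example"
idna_host = host.encode("idna").decode("ascii")
print(idna_host)

# Quoting the already-encoded host changes nothing...
print(quote(idna_host) == idna_host)

# ...but quoting a netloc that carries a port escapes the colon.
netloc = idna_host + ":8080"
print(quote(netloc))
```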
* feat: add parser type hints (#2232)    Iwan Aucamp    2023-03-05    8    -263/+521
|
| Add type hints to:
|
| - `rdflib/parser.py`
| - `rdflib/plugins/parser/*.py`
| - some JSON-LD utils
| - `rdflib/exceptions.py`.
|
| This is mainly because the work I'm doing to fix <https://github.com/RDFLib/rdflib/issues/1844> is touching some of this parser stuff and the type hints are useful to avoid mistakes.
|
| No runtime changes are included in this PR.
* Fix type errors resulting from new mypy (#2161)    Iwan Aucamp    2022-11-19    1    -1/+1
|
| New mypy version is reporting new errors.
|
| In the long run we need to switch to poetry so we can better control this.
* feat: Add type hints to rdflib.graph (#2080)    Iwan Aucamp    2022-08-23    1    -1/+2
|
| More or less complete type hints for the rdflib.graph module.
|
| Other changes:
|
| - Improved/simplified type hints in `rdflib.store` and store plugins.
| - Add type ignores for various type errors that occur with the type hints.
|
| This is split-off from <https://github.com/RDFLib/rdflib/pull/1850>.
|
| This PR does not change runtime behaviour.
* fix: always parse HexTuple files as utf-8 (#2070)    Iwan Aucamp    2022-08-07    1    -1/+1
|
| Always parse HexTuple files as utf-8. This was the intent anyway, as evidenced by the code that raises a warning if the encoding provided for a HexTuple file is something other than utf-8 or None:
|
| https://github.com/RDFLib/rdflib/blob/cfa418074b27b12aac905ba266b002a237c5ff4c/rdflib/plugins/parsers/hext.py#L73-L79
|
| Not adding any tests, as this code is already tested and an XFAIL is removed in this patch.
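As a rough sketch of the idea (this is not the code in `hext.py`), a line-delimited JSON stream such as HexTuples can be decoded with an explicit `encoding="utf-8"` instead of the platform default. The sample row and the `read_rows` helper are assumptions made for illustration.

```python
import io
import json

# One made-up HexTuples-style row: a JSON array of six strings, UTF-8 encoded.
raw = (
    '["http://example.com/s", "http://example.com/p", "café", '
    '"http://www.w3.org/2001/XMLSchema#string", "", ""]\n'
).encode("utf-8")


def read_rows(binary_stream):
    """Decode a binary stream as UTF-8 and parse each non-empty line as JSON."""
    text = io.TextIOWrapper(binary_stream, encoding="utf-8")
    return [json.loads(line) for line in text if line.strip()]


rows = read_rows(io.BytesIO(raw))
print(rows[0][2])
```

Passing the encoding explicitly avoids depending on the locale's default encoding, which is the kind of mismatch the fix above guards against.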
* fix: import xml.sax.handler from the right place (#2041)    Iwan Aucamp    2022-07-20    1    -2/+1
|
| Change the import of `xml.sax.handler` in the TriX parser so that it imports from `xml.sax` and not from `xml.sax.saxutils`.
|
| Importing from `xml.sax.saxutils` causes mypy to fail, but it is also wrong, as there is no documented `handler` in `xml.sax.saxutils`.
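For reference, a minimal sketch of the documented import path in use. `CountingHandler` is a hypothetical handler, not part of the TriX parser.

```python
import xml.sax
from xml.sax import handler  # documented location of the `handler` module


class CountingHandler(handler.ContentHandler):
    """Count the start tags seen while parsing."""

    def __init__(self):
        super().__init__()
        self.elements = 0

    def startElement(self, name, attrs):
        self.elements += 1


counter = CountingHandler()
xml.sax.parseString(b"<a><b/><b/></a>", counter)
print(counter.elements)
```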
* revert: fix: import xml.sax.handler from the right place    Iwan Aucamp    2022-07-20    1    -1/+2
|
| This reverts commit 1740214b591eb0f3e57fc6c6b63da2b29f7ae946.
|
| Was working on the wrong branch.
|
| Refs: 1740214b591eb0f3e57fc6c6b63da2b29f7ae946
* fix: import xml.sax.handler from the right place    Iwan Aucamp    2022-07-20    1    -2/+1
|
| Change the import of `xml.sax.handler` in the TriX parser so that it imports from `xml.sax` and not from `xml.sax.saxutils`.
|
| Importing from `xml.sax.saxutils` causes mypy to fail, but it is also wrong, as there is no documented `handler` in `xml.sax.saxutils`.
* docs: fix sphinx nitpicky issues (#2036)    Iwan Aucamp    2022-07-18    3    -6/+16
|
| Enable nitpicky mode for Sphinx and fix all warnings and errors that occur when running with nitpicky enabled.
|
| Other changes:
|
| - Add a tox environment for building docs (-docs). This is so we can test building docs on various versions of python, as there seem to be some differences in warnings between different versions. This tox environment is enabled for linux CI builds.
| - Change readthedocs to use python 3.9, as earlier versions do not handle `@typing.overload` with type aliases.
| - Fixes https://github.com/RDFLib/rdflib/issues/1878
* More type hints for `rdflib.graph` and related (#1853)    Iwan Aucamp    2022-05-26    1    -3/+3
|
| This patch primarily adds more type hints for `rdflib.graph`, but also adds type hints to some related modules in order to work with the new type hints for `rdflib.graph`.
|
| I'm mainly doing this as a baseline for adding type hints to `rdflib.store`.
|
| I have created type aliases to make it easier to type everything consistently and to make type hints easier to change in the future. The type aliases are private, however (i.e. `_`-prefixed), and should be kept as such for now.
|
| This patch only contains typing changes and does not change runtime behavior.
|
| Broken off from https://github.com/RDFLib/rdflib/pull/1850
* Fix trix parser to allow lowercase `trix`, add tests (#1966)    Graham Higgins    2022-05-25    1    -2/+2
|
| Changed TriX parser to allow `trix` and `TriX`
|
| The RDFLib TriX parser currently only accepts TriX documents conforming to the [Nokia-published XSD](https://web.archive.org/web/20040821093939/http://swdev.nokia.com/trix/trix-1.0.xsd) which specifies (the mixed-case) `TriX`, contradicting the [W3C-published XSD spec for TriX](https://www.w3.org/2004/03/trix/trix-1/trix-1.0.xsd) which specifies (the lower-case) `trix`. We should accept both.
|
| We were also a bit light on a TriX test suite for exercising the parser, so I recruited some TriX test fixtures from NG4J and Jena and fleshed out the TriX test suite, using the W3C "Manifest"-style approach.
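The acceptance rule described above can be sketched as follows. This is not the RDFLib TriX parser (which is SAX-based); `is_trix_root` is a hypothetical helper, and the namespace URI is an assumption taken from the W3C schema location quoted in the commit message.

```python
import xml.etree.ElementTree as ET

# Assumed TriX namespace; verify against the schema before relying on it.
TRIX_NS = "http://www.w3.org/2004/03/trix/trix-1/"

# Accept both the mixed-case (Nokia XSD) and lower-case (W3C XSD) root names.
ACCEPTED_ROOTS = {f"{{{TRIX_NS}}}TriX", f"{{{TRIX_NS}}}trix"}


def is_trix_root(document: str) -> bool:
    """Return True if the document's root element is `TriX` or `trix`."""
    return ET.fromstring(document).tag in ACCEPTED_ROOTS


print(is_trix_root(f'<TriX xmlns="{TRIX_NS}"/>'))
print(is_trix_root(f'<trix xmlns="{TRIX_NS}"/>'))
print(is_trix_root(f'<TRIX xmlns="{TRIX_NS}"/>'))
```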
* Remove testing and debug code from rdflib    Iwan Aucamp    2022-04-19    1    -31/+0
|
| This patch removes code from `rdflib/` that does not seem like it belongs in `rdflib/`. Most of it is related to doctest; some of it belongs in `test/` and was moved to `test/test_misc/test_collection.py`; and yet more of it seems to just be there for debugging purposes, though it would possibly be better to put that in a separate place if it is needed again, or to debug using tests if possible.
|
| Other changes:
|
| - Removed an invocation of `rdflib.util.test` from `test_util.py`. This seems like an attempt to invoke doctest; however, pytest takes care of that, so it is not needed.
* text: fix pytest config    Iwan Aucamp    2022-04-18    1    -1/+1
|
| pytest was using config from `tox.ini` preferentially and ignoring config from `setup.cfg`. As a side effect, doctests were not running on code/docstrings in `rdflib/`.
|
| The reason why some pytest config was in `tox.ini` instead of `setup.cfg` was because of these issues:
|
| - https://github.com/pypa/pip/issues/5182
| - https://github.com/pytest-dev/pytest/issues/3062
|
| As a compromise to fix this I have opted for moving all pytest config to `pyproject.toml`:
|
| - https://docs.pytest.org/en/stable/reference/customize.html#pyproject-toml
|
| This seems sensible, as `pyproject.toml` is standardized by PEPs and eventually most things from `setup.cfg` will end up in there anyway.
|
| Also:
|
| - remove the pytest ignore on `test/translate_algebra`, as tests in there have been running and passing for some time.
| - fixed path to test data in `rdflib/plugins/parsers/nquads.py`.
* [pre-commit.ci] auto fixes from pre-commit.com hooks    pre-commit-ci[bot]    2022-04-15    6    -42/+37
|
| for more information, see https://pre-commit.ci
* Merge branch 'master' into jsonld_conneg    Natanael Arndt    2022-03-16    5    -37/+31
|\
| * Add isort (#1689)    eggplants    2022-02-21    2    -17/+14
| |
| | * add: isort configure file
| | * fix: isort
| |
| |   $ isort .
| |
| | * add: isort to dev deps
| | * add: isort to CI
| | * fix: move .isort.cfg into setup.cfg
| | * fix: re-formatted
| | * fix: isort target path
| | * Use pre-commit to check isort
| |
| |   pre-commit CI can auto fix this, and this way we can independently evaluate the formatting of the code from the validity of the code.
| |
| | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| * Merge pull request #1683 from aucampia/iwana-20220122T0119-term_typing    Iwan Aucamp    2022-01-30    1    -1/+1
| |\
| | |
| | | Merging pull request as it contains only type changes and has one review.
| | * Add typing to rdflib.term    Iwan Aucamp    2022-01-22    1    -1/+1
| | |
| | | This adds as much typing as possible to `rdflib.term`.
| | |
| | | Other changes:
| | |
| | | - Added back `warn_unused_ignores`. I actually thought this was enabled, but I forgot I disabled it because of some issue on python 3.10.
| | | - Disabled `warn_unused_ignores` only for `rdflib.plugin`. There is an ignore in this module which is not needed on python 3.10; this is the most targeted way I can think of for now to avoid having that fail the type checking.
| | | - Removed unused type ignores.
| | |
| | | This changeset includes no runtime changes.
| * | fix: format with black    eggplants    2022-01-24    1    -2/+2
| |/
| |
| | $ black --config black.toml .
| * Eliminate the use of `str.translate` in unquoting    Iwan Aucamp    2022-01-16    1    -2/+2
| |
| | `str.translate` is not faster than dict lookup for this case, which is a one char lookup.
| |
| | For more info see https://github.com/RDFLib/rdflib/pull/1663#issuecomment-1013923380
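A rough way to check a claim like this on your own machine is sketched below. The `ESCAPES` mapping and both helper functions are illustrative assumptions, not the rdflib code, and timings will vary by interpreter and hardware.

```python
import timeit

# Illustrative one-character escape mapping (not the table used by rdflib).
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}
TABLE = str.maketrans(ESCAPES)


def via_dict(ch: str) -> str:
    """Resolve a single escape character with a plain dict lookup."""
    return ESCAPES[ch]


def via_translate(ch: str) -> str:
    """Resolve a single escape character with str.translate."""
    return ch.translate(TABLE)


if __name__ == "__main__":
    for name, func in (("dict", via_dict), ("translate", via_translate)):
        seconds = timeit.timeit(lambda: func("n"), number=200_000)
        print(f"{name}: {seconds:.3f}s")
```

Both helpers return the same result for every key in the mapping, so the comparison isolates lookup cost rather than behaviour.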
| * Fixed the handling of escape sequences in the ntriples and nquads parsers.    Iwan Aucamp    2022-01-15    1    -7/+6
| |
| | These parsers will now correctly handle strings like `"\\r"`.
| |
| | The time it takes for these parsers to parse strings with escape sequences will be increased, and the increase will be correlated with the number of escape sequences that occur in a string. For strings with many escape sequences, parsing seems to be almost 4 times slower.
| |
| | Also:
| |
| | - Add graph variant test scaffolding. Multiple files representing the same graph can now easily be tested to be isomorphic by just adding them in `test/variants`.
| | - Add more things to `testutils.GraphHelper`, including some methods that do asserts with better messages. Also include some tests for GraphHelper.
| | - Add some extra files to test_roundtrip, set the default identifier when parsing, and change verbose flag to rather be based on debug logging.
| | - move one test from `test/test_issue247.py` to variants.
| | - Fix problems with `.editorconfig` which prevent it from working properly.
| | - Add xfail tests for a couple of issues
| |
| |   This includes xfails for the following issues:
| |
| |   - https://github.com/RDFLib/rdflib/issues/1216
| |   - https://github.com/RDFLib/rdflib/issues/1649
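To show why a string like `"\\r"` is tricky, here is a minimal sketch (not the implementation in the rdflib parsers; `unescape` and `naive_unescape` are hypothetical). Decoding all escapes in a single left-to-right pass keeps an escaped backslash followed by `r` from being misread as a carriage return, which chained `str.replace` calls can get wrong.

```python
import re

# Single-character escapes plus \uXXXX and \UXXXXXXXX forms.
_SIMPLE = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
           '"': '"', "'": "'", "\\": "\\"}
_ESCAPE = re.compile(r'\\(?:([tbnrf"\'\\])|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8}))')


def unescape(text: str) -> str:
    """Decode escapes in one pass so each backslash is consumed exactly once."""
    def repl(match):
        simple, u4, u8 = match.groups()
        if simple is not None:
            return _SIMPLE[simple]
        return chr(int(u4 or u8, 16))
    return _ESCAPE.sub(repl, text)


def naive_unescape(text: str) -> str:
    """Chained replaces: the second pass re-reads output of the first."""
    return text.replace("\\\\", "\\").replace("\\r", "\r")


source = r"\\r"  # three characters: backslash, backslash, r
print(repr(unescape(source)))        # backslash followed by r
print(repr(naive_unescape(source)))  # wrongly becomes a carriage return
```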
| * Merge pull request #1656 from RDFLib/roundtrip_hext    Nicholas Car    2022-01-15    1    -2/+11
| |\
| | |
| | | Allow hext to participate in RDF format roundtripping
| | * Update rdflib/plugins/parsers/hext.py    Nicholas Car    2022-01-15    1    -1/+1
| | |
| | | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| | * Update rdflib/plugins/parsers/hext.py    Nicholas Car    2022-01-10    1    -1/+1
| | |
| | | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| | * Update rdflib/plugins/parsers/hext.py    Nicholas Car    2022-01-10    1    -3/+1
| | |
| | | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| | * Update rdflib/plugins/parsers/hext.py    Nicholas Car    2022-01-10    1    -3/+1
| | |
| | | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| | * Update rdflib/plugins/parsers/hext.py    Nicholas Car    2022-01-10    1    -1/+1
| | |
| | | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| | * Update rdflib/plugins/parsers/hext.py    Nicholas Car    2022-01-10    1    -1/+1
| | |
| | | Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
| | * allow hext to participate in RDF format roundtripping    nicholascar    2022-01-09    1    -7/+20
| | |
| * | Don't update `SUFFIX_FORMAT_MAP` in `plugins/parsers/jsonld.py`    Iwan Aucamp    2022-01-08    1    -11/+0
| |/
| |
| | `jsonld` will already be in `SUFFIX_FORMAT_MAP`, so the code being removed here should have no effect.
| |
| | There are already tests for this and the tests would fail if the removed code did anything, see:
| |
| | https://github.com/RDFLib/rdflib/blob/b2fdaf5a1f45c09694dbd8925ab6b6dee84436b4/test/test_parse_file_guess_format.py#L23-L34
* | Merge branch 'master' into jsonld_conneg    Nicholas Car    2022-01-03    3    -105/+149
|\ \
| |/
| * Merge pull request #1522 from aucampia/iwana-20211124T2220-more_typing    Nicholas Car    2022-01-02    2    -103/+146
| |\
| | |
| | | Add typing for parsers
| | * Add typing for parsers    Iwan Aucamp    2021-12-29    2    -103/+146
| | |
| | | This changeset includes no runtime changes in rdflib.
| * | Add some type annotations to the JSON-LD code    Iwan Aucamp    2021-12-29    1    -1/+2
| |/
| |
| | This patch contains no runtime changes.
| * Merge branch 'RDFLib:master' into fix-issue1216-join-error    Graham Higgins    2021-12-28    1    -3/+3
| |\
| * | Revert error-raising change, enable Exception to be raised.    Graham Higgins    2021-12-27    1    -1/+1
| | |
* | | Merge branch 'master' into jsonld_conneg    Nicholas Car    2021-12-29    1    -3/+3
|\ \ \
| | |/
| |/|
| * | Fix `self.line` typos in call to BadSyntax.    Graham Higgins    2021-12-24    1    -3/+3
| |/
| |
| | Fix for issue #821 "Invalid URI crashes without BadSyntax error"
* | Merge remote-tracking branch 'origin/master' into jsonld_conneg    Iwan Aucamp    2021-12-28    4    -19/+92
|\ \
| |/
| * style fixes only    nicholascar    2021-12-16    1    -2/+1
| |
| * more Flake8 improvements    nicholascar    2021-12-11    1    -10/+1
| |
| * fixing MyPy errors [hextuples]    nicholascar    2021-12-09    1    -0/+8
| |
| * more Flake8 improvements    nicholascar    2021-12-07    1    -2/+2
| |
| * ignore Flake Error W505 as soon it won't be considered an error    nicholascar    2021-12-07    1    -2/+2
| |
| * backing all files    nicholascar    2021-12-07    3    -26/+26
| |
| * Flake8 improvements    nicholascar    2021-12-07    4    -33/+29
| |
| * blacked parsers & serializers    nicholascar    2021-12-07    4    -27/+28
| |