path: root/rdflib/plugins
Commit message | Author | Age | Files | Lines
* fix: eliminate bare `except:` (#2350)Iwan Aucamp2023-04-125-9/+9
| | | | | | | | | | | Replace bare `except:` with `except Exception`, there are some cases where it can be narrowed further, but this is already an improvement over the current situation. This is somewhat pursuant to eliminating [flakeheaven](https://github.com/flakeheaven/flakeheaven), as it no longer supports the latest version of flake8 [[ref](https://github.com/flakeheaven/flakeheaven/issues/132)]. But it also is just the right thing to do as bare exceptions can cause problems.
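A minimal sketch of why the bare form is risky, not code taken from RDFLib (all function names here are hypothetical): a bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which derive from `BaseException` rather than `Exception`.

```python
def swallow_everything(action):
    """Bare except: also catches KeyboardInterrupt and SystemExit."""
    try:
        action()
    except:  # noqa: E722 - deliberately bare, to show the problem
        return "swallowed"
    return "ok"


def swallow_errors_only(action):
    """except Exception: lets KeyboardInterrupt and SystemExit propagate."""
    try:
        action()
    except Exception:
        return "swallowed"
    return "ok"


def interrupt():
    """Simulate the user pressing Ctrl+C inside the guarded call."""
    raise KeyboardInterrupt()


def fail():
    """Simulate an ordinary runtime error."""
    raise ValueError("ordinary error")
```

With these definitions, `swallow_everything(interrupt)` hides the interrupt, while `swallow_errors_only(interrupt)` lets it reach the caller; both still handle `fail`.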
* fix: correct imports and `__all__` (#2340)Iwan Aucamp2023-04-122-0/+6
| | | | | | | | Disable [`implicit_reexport`](https://mypy.readthedocs.io/en/stable/config_file.html#confval-implicit_reexport) and eliminate all errors reported by mypy after this. This helps ensure that import statements import from the right module and that the `__all__` variable is correct.
* feat: add optional `target_graph` argument to `Graph.cbd` and use it for DESCRIBE queries (#2322)Matt Goldberg2023-04-111-1/+1
| | | | | | | | | Add optional keyword only `target_graph` argument to `rdflib.graph.Graph.cbd` and use this new argument in `evalDescribeQuery`. This makes it possible to compute a concise bounded description without creating a new graph to hold the result, and also without potentially having to copy it to another final graph. Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
* refactor: eliminate inheritance from object (#2339)Iwan Aucamp2023-04-1012-17/+17
| | | | | This change removes the redundant inheritance from `object` (i.e. `class Foo(object): pass`) that is no longer needed in Python 3 and is a relic from Python 2.
* refactor: narrow imports (#2338)Iwan Aucamp2023-04-101-2/+1
| | | | | | | This change narrows import so that things are imported from the Python module where they are defined instead of importing them from a module that re-exports them, e.g. change import of `Graph` to import from the `rdflib.graph` module instead of from the `rdflib` module. This helps avoid problems with circular imports.
* refactor: eliminate unneeded `rdflib.compat` imports (#2336)Iwan Aucamp2023-04-091-1/+1
| | | | | | Compatibility handling for `collections.abc.Mapping` and `collections.abc.MutableMapping` is not needed as RDFLib currently only supports Python 3.7 and newer, and those classes are available from `collections.abc` in Python 3.7.
* fix: eliminate some mutable default arguments in SPARQL code (#2301)Charles Tapley Hoyt2023-04-073-18/+26
| | | | | This change eliminates some situations where a mutable object (i.e., a dictionary) was used as the default value for functions in the `rdflib.plugins.sparql.processor` module and related code. It replaces these defaults with `typing.Optional` arguments that default to `None`, which is then handled within the function. Luckily, some of the code that the SPARQL Processor relied on already had this style, meaning not a lot of changes had to be made. This change also makes a small update to the logic in the SPARQL Processor's query function to simplify the if/else statement. This better mirrors the implementation in the `UpdateProcessor`.
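The general pitfall can be sketched as follows. This is an illustration only, not the RDFLib code; `bind_shared` and `bind_fresh` are hypothetical names.

```python
from typing import Any, Dict, Optional


def bind_shared(name: str, value: Any, bindings: Dict[str, Any] = {}) -> Dict[str, Any]:
    """Problematic: the default dict is created once and shared by every call."""
    bindings[name] = value
    return bindings


def bind_fresh(
    name: str, value: Any, bindings: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Safer: default to None and create a new dict inside the function."""
    if bindings is None:
        bindings = {}
    bindings[name] = value
    return bindings
```

Calling `bind_shared` twice without a `bindings` argument returns the same dictionary both times, so the second result still contains the first binding; `bind_fresh` starts from an empty dictionary on each call.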
* fix: eliminate file intermediary in translate algebra (#2267)Jeffrey C. Lerman2023-03-271-189/+265
| | | | | Previously, `rdflib.plugins.sparql.algebra.translateAlgebra()` maintained state via a file, with a fixed filename `query.txt`. With this change, use of that file is eliminated; state is now maintained in memory so that multiple concurrent `translateAlgebra()` calls, for example, should no longer interfere with each other. The change is accomplished with no change to the client interface. Basically, the actual functionality has been moved into a class, which is instantiated and used as needed (once per call to `algebra.translateAlgebra()`).
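The pattern described above, per-call state held by a freshly created object instead of a file with a fixed name, might look roughly like this. The class and function names are hypothetical and the logic is deliberately trivial; it is not the actual `translateAlgebra()` implementation.

```python
from typing import Iterable, List


class _QueryTextBuilder:
    """Hypothetical helper that keeps its working state in memory."""

    def __init__(self) -> None:
        self._parts: List[str] = []

    def add(self, fragment: str) -> None:
        self._parts.append(fragment)

    def text(self) -> str:
        return " ".join(self._parts)


def translate(fragments: Iterable[str]) -> str:
    """Public entry point: the interface stays a plain function."""
    builder = _QueryTextBuilder()  # one instance per call, nothing is shared
    for fragment in fragments:
        builder.add(fragment)
    return builder.text()
```

Because every call builds its own `_QueryTextBuilder`, two calls running at the same time have no common file or object through which they could interfere.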
* fix: `ROUND`, `ENCODE_FOR_URI` and `SECONDS` SPARQL functions (#2314)Iwan Aucamp2023-03-261-4/+7
| | | | | | | | | `ROUND` was not correctly rounding negative numbers towards positive infinity, `ENCODE_FOR_URI` incorrectly treated `/` as safe, and `SECONDS` did not include fractional seconds. This change corrects these issues. - Closes <https://github.com/RDFLib/rdflib/issues/2151>.
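The three behaviours can be approximated with the standard library as below. This is a sketch under the assumption that plain `Decimal`, `str` and `datetime` values stand in for RDF literals; the function names are hypothetical and this is not the RDFLib implementation.

```python
from datetime import datetime
from decimal import ROUND_FLOOR, Decimal
from urllib.parse import quote


def sparql_round(value: Decimal) -> Decimal:
    """Round to the nearest integer; ties go towards positive infinity."""
    return (value + Decimal("0.5")).to_integral_value(rounding=ROUND_FLOOR)


def encode_for_uri(text: str) -> str:
    """Percent-encode everything except unreserved characters, so '/' is escaped."""
    return quote(text, safe="")


def seconds(moment: datetime) -> Decimal:
    """Seconds component including the fractional part."""
    return Decimal(moment.second) + Decimal(moment.microsecond) / Decimal(1_000_000)
```

For example, `sparql_round(Decimal("-2.5"))` gives `-2` rather than `-3`, `encode_for_uri("a/b c")` gives `a%2Fb%20c`, and a timestamp ending in `01.5` seconds yields `Decimal("1.5")`.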
* fix: Add `to_dict` method to the JSON-LD `Context` class. (#2310)Iwan Aucamp2023-03-251-2/+35
| | | | | | | | | | `Context.to_dict` is used in JSON-LD serialization, but it was not implemented. This change adds the method. - Closes <https://github.com/RDFLib/rdflib/issues/2138>. --------- Co-authored-by: Marc-Antoine Parent <maparent@acm.org>
* fix: JSON-LD context construction from a `dict` (#2306)Iwan Aucamp2023-03-241-1/+1
| | | | | | | | | | A variable was only being initialized for string-valued inputs, but if a `dict` input was passed the variable would still be accessed, resulting in an `UnboundLocalError`. This change initializes the variable always, instead of only when string-valued input is used to construct a JSON-LD context. - Closes <https://github.com/RDFLib/rdflib/issues/2303>.
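The class of bug can be reproduced in a few lines. The functions below are hypothetical and only mimic the shape of the problem; they are not the JSON-LD context code.

```python
import json
from typing import Any, Dict, Optional, Tuple, Union


def load_context_buggy(source: Union[str, Dict[str, Any]]) -> Tuple[Dict[str, Any], Optional[str]]:
    """'origin' is only assigned on the string branch."""
    if isinstance(source, str):
        origin = "string"
        data = json.loads(source)
    else:
        data = source
    return data, origin  # raises UnboundLocalError when source is a dict


def load_context_fixed(source: Union[str, Dict[str, Any]]) -> Tuple[Dict[str, Any], Optional[str]]:
    """'origin' is always initialised before any branch runs."""
    origin: Optional[str] = None
    if isinstance(source, str):
        origin = "string"
        data = json.loads(source)
    else:
        data = source
    return data, origin
```

Passing a `dict` to `load_context_buggy` fails at the `return` statement, while `load_context_fixed` returns the dictionary together with `None`.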
* build(deps-dev): bump mypy from 1.0.1 to 1.1.1 (#2274)dependabot[bot]2023-03-191-2/+2
| | | | | | | | | | | | | | | | | | build(deps-dev): bump mypy from 1.0.1 to 1.1.1 Bumps [mypy](https://github.com/python/mypy) from 1.0.1 to 1.1.1. - [Release notes](https://github.com/python/mypy/releases) - [Commits](https://github.com/python/mypy/compare/v1.0.1...v1.1.1) updated-dependencies: - dependency-name: mypy dependency-type: direct:development update-type: version-update:semver-minor Also added type ignores for newly detected type errors. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
* docs: document available security measures (#2270)Iwan Aucamp2023-03-163-0/+57
| | | | | | | | | docs: document available security measures Several security measures can be used to mitigate risk when processing potentially malicious input. This change adds documentation about available security measures and examples and tests that illustrate their usage.
* feat: more type hints for `rdflib.plugins.sparql` (#2268)Iwan Aucamp2023-03-1311-174/+446
| | | | | | | | | | | | | | | | | | | A bit of a roundabout reason why this matters now, but basically: I want to add examples for securing RDFLib with `sys.addaudithook` and `urllib.request.install_opener`. I also want to be sure examples are actually valid, and runnable, so I was adding static analysis and simple execution of examples to our CI. During this, I noticed that examples use `initBindings` with `Dict[str,...]`, which was not valid according to mypy, but then after some investigation I realized the type hints in some places were too strict. So the main impetus for this is actually to relax the type hints in `rdflib.graph`, but to ensure this is valid I'm adding a bunch of type hints I had saved up to `rdflib.plugins.sparql`. Even though this PR looks big, it has no runtime changes.
* fix: add more type-hinting for SPARQL plugin (#2265)Jeffrey C. Lerman2023-03-122-23/+37
| | | | | | | | | | | | | | Here, adding type-hints to some of the SPARQL parser plugin code. Includes a couple of small consequent changes: 1. Minor refactor of `prettify_parsetree()`, separating the public-facing callable from the internal code that does not need to be public-facing. That allows the public-facing callable to have more informative and restrictive type-hints for its arguments. 2. Added some test-coverage for `expandUnicodeEscapes()` - initially for my own understanding, but seems useful to leave it in place since I didn't see test-coverage for that function. There should be no backwards-incompatible changes in this PR - at least, not intentionally. --------- Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
* feat: add typing to `rdflib.util` (#2262)Iwan Aucamp2023-03-121-1/+2
| | | | | | | | Mainly so that users can use RDFLib in a safer way, and that we can make safer changes to RDFLib in future. There are also some accommodating type-hint related changes outside of `rdflib.util`. This change does not have a runtime impact.
* feat: diverse type hints (#2264)Iwan Aucamp2023-03-122-7/+14
| | | | | | | Add some small diverse type hints. Type hints make RDFLib safer to use and change, as changes and usage can be validated using static analysers like mypy. This change does not have a runtime impact.
* build(deps-dev): bump black from 22.12.0 to 23.1.0 (#2248)dependabot[bot]2023-03-1125-68/+3
|
* fix: small InputSource related issues (#2255)Iwan Aucamp2023-03-111-18/+19
| | | | | | | | | | | | | | | | | | | | | | | | I have added a bunch of tests for `InputSource` handling, checking most kinds of input source with most parsers. During this, I detected the following issues that I fixed: - `rdflib.util._iri2uri()` was URL quoting the `netloc` parameter, but this is wrong and the `idna` encoding already takes care of special characters. I removed the URL quoting of `netloc`. - HexTuple parsing was handling the input source in a way that would only work for some input sources, and not raising errors for other input sources. I changed the input source handling to be more generic. - `rdflib.parser.create_input_source()` incorrectly used `file.buffer` instead of `source.buffer` when dealing with IO stream sources. Other changes with no runtime impact include: - Changed the HTTP mocking stuff in test slightly to accommodate serving arbitrary files, as I used this in the `InputSource` tests. - Don't use Google in tests, as we keep getting `urllib.error.HTTPError: HTTP Error 429: Too Many Requests` from it.
* feat: add parser type hints (#2232)Iwan Aucamp2023-03-0510-332/+659
| | | | | | | | | | | | | Add type hints to: - `rdflib/parser.py` - `rdflib/plugins/parser/*.py` - some JSON-LD utils - `rdflib/exceptions.py`. This is mainly because the work I'm doing to fix <https://github.com/RDFLib/rdflib/issues/1844> is touching some of this parser stuff and the type hints are useful to avoid mistakes. No runtime changes are included in this PR.
* feat: Add SPARQL DESCRIBE query implementation (#2221)Matt Goldberg2023-02-073-6/+63
| | | | | | | | | This adds an implementation for SPARQL DESCRIBE queries, using the built-in `cbd` method. I see there are several issues and PRs for DESCRIBE implementation. I believe this should close #479 and should resolve #1913, or at least pick up where it left off. It should also resolve #1096. This implementation should support the full SPARQL specification for DESCRIBE; either explicit IRIs can be provided (with no WHERE clause), or variables projected from a WHERE clause can be provided, or variables projected from a WHERE clause AND explicit IRIs can be provided. If a WHERE clause is provided, it should be evaluated the same way as it would for a SELECT DISTINCT query (including dataset specifications). The expected results for the test cases provided match the behaviour seen when running the same queries against the same data using ARQ. A possible future extension would be to add a global option (similar to `rdflib.plugins.sparql.SPARQL_LOAD_GRAPHS`) to change the method used to describe resources instead of always using CBD.
* fix: bug applying VALUES outside of a GROUP BY (#2188)Rob B2023-01-291-4/+10
| | | Altering order in which aggregate variable aliases are renamed to user-defined variable names to ensure that when defining a `VALUES` pattern outside of a `GROUP BY`, the variables in the query are correctly joined to those defined in the `VALUES` pattern.
* fix: bug with `SELECT *` inside `SELECT *` (#2190)Rob B2023-01-292-1/+23
| | | | | | | Fixing use of `SELECT *` in sub-select within `SELECT *` parent query as discovered in https://github.com/RDFLib/rdflib/issues/1722. Now when an instance of `SELECT *` is encountered, the query tree/plan builder now correctly considers the projected variables of any sub-select statements when deciding which variables should be projected out. Fixes <https://github.com/RDFLib/rdflib/issues/1722>.
* fix: missing query string params in sparqlconnector when using POST method (#2180)David Andreoletti2023-01-021-1/+2
| | | | | | | When a SPARQLStore instance was given an endpoint URL with query string params and used the POST method, the SPARQLConnector instance it created did not append the query string to the endpoint URL. Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
* feat: do not write prefix for empty graph id (#2160)Elie Roux2022-12-301-0/+3
| | | Since the names of empty graphs do not appear in the serialization, do not consider them for namespace issues.
* fix: don't modify base when processing context inputsHarold Solbrig2022-12-241-2/+3
| | | The base URI passed to `_prep_sources` was being overwritten in anticipation of processing inner nestings, but this caused problems when processing multiple inputs. Changed the assignment to `base` to `new_base`.
* fix: don't reuse same dict for headers in SPARQL HTTP requestsgitmpje2022-12-241-2/+3
| | | | | | Copy kwargs dict to prevent POST headers to be used in GET request and vice versa. Co-authored-by: Mark van der Pas <mark.van.der.pas@semaku.com> Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
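A rough sketch of the idea, assuming request arguments are passed around as a plain `dict` with a nested `headers` dict (the function name and header values are illustrative, not the connector's actual code):

```python
import copy
from typing import Any, Dict


def prepare_post(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return arguments for a POST request without mutating the caller's dict."""
    args = copy.deepcopy(kwargs)  # the nested 'headers' dict is copied as well
    args.setdefault("headers", {})["Content-Type"] = "application/sparql-query"
    return args
```

Because the copy is made before the POST-specific header is added, a later GET request built from the same `kwargs` does not inherit the `Content-Type` header.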
* refactor: remove redundant class (#2143)Veyndan Stuart2022-11-191-7/+1
| | | `rdflib.plugins.sparql.parserutils.plist` has no good purpose.
* Fix type errors resulting from new mypy (#2161)Iwan Aucamp2022-11-193-7/+4
| | | | | | New mypy version is reporting new errors. In the long run we need to switch to poetry so we can better control this.
* refactor: Pass service_query to _buildQueryStringForServiceCall instead of a Match (#2134)Veyndan Stuart2022-10-141-4/+2
| | | | | | In order to use the function, we must know that we need to pass a Match with a capture group at position 2. This is non-obvious.
* feat: add type hint to part in evalServiceQuery (#2133)Veyndan Stuart2022-10-131-1/+1
|
* chore: remove outdated comment (#2129)Veyndan Stuart2022-10-101-1/+0
| | | ServiceGraphPattern has clearly been implemented per the preceding line.
* fix: type ignore compatibility with latest mypy (#2127)Iwan Aucamp2022-10-091-4/+4
| | | | New mypy has more specific type errors so the type: ignore[misc] is no longer effective.
* fix: add charset encoding to SPARQLConnector.update() request. (#2112)Robert Casties2022-09-151-1/+1
| | | | | Add encoding "charset=UTF-8" to Content-Type header in `SPARQLConnector.update()` request. Fixes #2095
* feat: add type hints to `rdflib.query` and related (#2097)Iwan Aucamp2022-08-248-56/+108
| | | | | | | Add type hints to `rdflib.query` and result format implementations, also add/adjust ignores and type hints in other modules to accommodate the changes. This does not include any runtime changes.
* fix: issue with trig reference counting across graphs (#2085)Iwan Aucamp2022-08-241-9/+5
| | | | | | | | | | | | | | | | | | | | | The TriG serializer was only considering BNode references inside a single graph and not counting BNode subjects as references when deciding whether a BNode should be serialized as an unlabeled blank node (i.e. `[ ]`), and as a result it was serializing BNodes as unlabeled even if they were in fact referencing BNodes in other graphs. One caveat of this change is that some RDF Datasets may be serialized less succinctly, in that unlabeled blank nodes would not be used for nodes where it is technically possible to use them. This can be trivially fixed, but a trivial fix increases the computational complexity of serialization significantly. Other changes: - Removed the roundtrip xfail that this change fixed. - Added another roundtrip test which has various combinations of BNode references across graphs in a dataset; this test fails for JSON-LD however, so while this change removes one xfail it also now adds another. - Set the default indent_size and style in `.editorconfig` so as to avoid relying on undefined system defaults.
* feat: add type hints to `rdflib.plugins.sparql.{algebra,operators}` (#2094)Iwan Aucamp2022-08-232-137/+238
| | | | | | | | | | | | | More or less complete type hints for `rdflib.plugins.sparql.algebra` and `rdflib.plugins.sparql.operators`. This does not change runtime behaviour. Other changes: - Fixed line endings of `test/test_issues/test_issue1043.py` and `test/test_issues/test_issue910.py`. - Removed a type hint comment that was present in rdflib/plugins/sparql/algebra.py This is split-off from https://github.com/RDFLib/rdflib/pull/1850.
* feat: Add type hints to rdflib.graph (#2080)Iwan Aucamp2022-08-233-21/+24
| | | | | | | | | | | | More or less complete type hints for the rdflib.graph module. Other changes: - Improved/simplified type hints in `rdflib.store` and store plugins. - Add type ignores for various type errors that occur with the type hints. This is split-off from <https://github.com/RDFLib/rdflib/pull/1850>. This PR does not change runtime behaviour.
* fix: generate VALUES block for federated queries with variables only (#2084)Veyndan Stuart2022-08-211-1/+1
| | | | | | | | Fixed the generation of VALUES block for federated queries. The values block was including non-variable values like BNodes which resulted in invalid queries. Closes #2079.
* fix: handling of Literal datatype (#2076)Iwan Aucamp2022-08-121-11/+8
| | | | | | | | | | | | | | | | | | Check datatype against `None` instead of checking its truthiness (i.e. `if datatype is not None:` instead of `if datatype:`). Checking truthiness instead of `is not None` causes a blank string to be treated the same as None. The consequence of this was that `Literal.datatype` could be a `str`, a `URIRef` or `None`, instead of just a `URIRef` or `None` as was seemingly intended. Other changes: - Changed the type of `Literal.datatype` to be `Optional[URIRef]` instead of `Optional[str]` now that `str` will always be converted to `URIRef` even if it is a blank string. - Changed `rdflib.util._coalesce` to make it easier and safer to use with a non-`None` default value. - Changed `rdflib.util` to avoid issues with circular imports.
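The difference between the two checks can be shown with a small stand-in. `Ref` is a hypothetical placeholder for a URI reference type and the functions are illustrative, not the `Literal` code.

```python
from typing import Optional


class Ref(str):
    """Hypothetical stand-in for a URI reference type."""


def coerce_truthy(datatype: Optional[str]) -> Optional[str]:
    """Truthiness check: a blank string is skipped and leaks through as plain str."""
    if datatype:
        return Ref(datatype)
    return datatype


def coerce_not_none(datatype: Optional[str]) -> Optional[Ref]:
    """Explicit None check: every string, including a blank one, is converted."""
    if datatype is not None:
        return Ref(datatype)
    return None
```

With the truthiness check, `coerce_truthy("")` returns a plain `str`, so the result can be a `str`, a `Ref` or `None`; with the explicit check the result is only ever a `Ref` or `None`.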
* add chunk serializer & tests (#1968)Nicholas Car2022-08-121-10/+23
| | | | | | | | | This file provides a single function `serialize_in_chunks()` which can serialize a Graph into a number of NT files with a maximum number of triples or maximum file size. There is an option to preserve any prefixes declared for the original graph in the first file, which will be a Turtle file. Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
* fix: always parse HexTuple files as utf-8 (#2070)Iwan Aucamp2022-08-071-1/+1
| | | | | | | | | | Always parse HexTuple files as utf-8 as was the intent anyway as evidenced by the code that will raise a warning if the encoding provided for a HexTuple file is something other than utf-8 or None. https://github.com/RDFLib/rdflib/blob/cfa418074b27b12aac905ba266b002a237c5ff4c/rdflib/plugins/parsers/hext.py#L73-L79 Not adding any tests as this code is already tested and an XFAIL is removed in this patch.
* fix: narrow the context identifier type from `Node` to `IdentifiedNode` (#2069)Iwan Aucamp2022-08-021-10/+5
| | | | | | | | | Narrow the context identifier type from `Node` to `IdentifiedNode` as `Node` is too broad and no supported format (N3, RDF) allows using anything other than `IdentifiedNode` as context identifiers. The only change here that has a runtime impact is the change in `Graph.__init__` to check isinstance against `IdentifiedNode` instead of `Node`.
* feat: add type hints for `rdflib.store` and `rdflib.plugins.stores` (#2057)Iwan Aucamp2022-07-305-253/+707
| | | | | | | | | | | | | | | | | | | | Add type hints and aliases for `rdflib.store` and `rdflib.plugins.stores` and also add a couple of more type hints and aliases to `rdflib.graph`. This PR contains no runtime changes. Other changes: - Changed some imports to be more specific (e.g. `import from rdflib.graph` instead of `import from rdflib`). This is to reduce the probability of circular imports. - Ignore `E231` (missing whitespace after ',') in flake8 as black is managing the whitespaces and seems to be bumping heads with flake8 with spaces after `,` sometimes. - Install `berkeleydb-stubs` when doing extensive testing with tox. - Added `devtools/diffrtpy.py` which is a script that can be used with `git difftool` to generate compact diffs for python code. This should make it a lot easier to review PRs that change type hints to verify that they don't have a runtime impact.
* fix: SPARQL XML result parsing (#2044)Iwan Aucamp2022-07-261-30/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed the following problems with the SPARQL XML result parsing: - The `parse` methods of both the `lxml` and `xml` modules do not work well with `TextIO` objects: the `xml` module works with `TextIO` objects if the XML encoding is `utf-8`, but not if it is `utf-16`, and with `lxml` parse fails for both `utf-8` and `utf-16`. To fix this I changed the XML result parser to first convert `TextIO` to `bytes` and then feed the `bytes` to `parse()` using `BytesIO`. - The parser was operating on all elements inside `results` and `result` elements, even if those elements were not `result` and `binding` elements respectively. This was causing problems with `lxml`, as `lxml` also returns comments when iterating over elements. To fix this I added a check for the element tags so that only the correct elements are considered. Other changes: - Added type hints to `rdflib.plugins.sparql.results.xmlresults`. - Run with `lxml` on some permutations in the test matrix. - Removed `rdflib.compat.etree`, as this was not very helpful for the SPARQL XML Result parser and it was not used elsewhere. - Added an `lxml` environment to tox which installs `lxml` and `lxml-stubs`. - Expanded SPARQL result testing by adding some additional parameters. Related issues: - Fixes https://github.com/RDFLib/rdflib/issues/2035 - Fixes https://github.com/RDFLib/rdflib/issues/1847
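The two fixes described above can be sketched with the standard library's `xml.etree.ElementTree`. This is a simplified illustration and not the RDFLib parser: the function name is hypothetical, and encoding the text as UTF-8 while telling the parser to use UTF-8 is an assumption of this sketch.

```python
import io
import xml.etree.ElementTree as ET
from typing import IO, Dict, List, Union

RESULTS_NS = "http://www.w3.org/2005/sparql-results#"


def parse_bindings(source: Union[IO[bytes], IO[str]]) -> List[Dict[str, str]]:
    """Return one {variable name: text} dict per <result> element."""
    if isinstance(source, io.TextIOBase):
        # Convert text input to bytes first and parse from a BytesIO.
        source = io.BytesIO(source.read().encode("utf-8"))
    tree = ET.parse(source, parser=ET.XMLParser(encoding="utf-8"))
    rows: List[Dict[str, str]] = []
    results = tree.getroot().find(f"{{{RESULTS_NS}}}results")
    if results is None:
        return rows
    for result in results:
        if result.tag != f"{{{RESULTS_NS}}}result":
            continue  # skip anything that is not a <result> element
        row: Dict[str, str] = {}
        for binding in result:
            if binding.tag != f"{{{RESULTS_NS}}}binding":
                continue  # skip anything that is not a <binding> element
            row[binding.get("name", "")] = "".join(binding.itertext()).strip()
        rows.append(row)
    return rows
```

The explicit tag checks matter most with parsers such as `lxml` that yield comment nodes during iteration; with the standard library parser they are simply a safeguard.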
* chore: remove pre python 3.5 regex related workaround (#2042)Iwan Aucamp2022-07-201-28/+1
| | | | | | | | Removing a pre-python 3.5 regex related workaround as it uses a private method from the `re` module, and mypy complains about this; it is also no longer needed as we no longer support pre 3.5 versions of python. No tests are added as all changed parts of the `Builtin_REPLACE` function are already covered by existing tests.
* fix: import xml.sax.handler from the right place (#2041)Iwan Aucamp2022-07-201-2/+1
| | | | | | | Change the import of `xml.sax.handler` in the TriX parser so that it imports from `xml.sax` and not from `xml.sax.saxutils`. Importing from `xml.sax.saxutils` causes mypy to fail but it is also wrong as there is no documented `handler` in `xml.sax.saxutils`.
* revert: fix: import xml.sax.handler from the right placeIwan Aucamp2022-07-201-1/+2
| | | | | | | | This reverts commit 1740214b591eb0f3e57fc6c6b63da2b29f7ae946. Was working on the wrong branch. Refs: 1740214b591eb0f3e57fc6c6b63da2b29f7ae946
* fix: import xml.sax.handler from the right placeIwan Aucamp2022-07-201-2/+1
| | | | | | | Change the import of `xml.sax.handler` in the TriX parser so that it imports from `xml.sax` and not from `xml.sax.saxutils`. Importing from `xml.sax.saxutils` causes mypy to fail but it is also wrong as there is no documented `handler` in `xml.sax.saxutils`.
* docs: fix sphinx nitpicky issues (#2036)Iwan Aucamp2022-07-186-12/+32
| | | | | | | | | | | | | Enable nitpicky mode for Sphinx and fix all warnings and errors that occur when running with nitpicky enabled. Other changes: - Add a tox environment for building docs (-docs). This is so we can test building docs on various versions of python as there seems to be some differences in warnings between different versions. This tox environment is enabled for linux CI builds. - Change readthedocs to use python 3.9 as earlier versions do not handle `@typing.overload` with type aliases. - Fixes https://github.com/RDFLib/rdflib/issues/1878