| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Replace bare `except:` with `except Exception`. There are some cases where it
can be narrowed further, but this is already an improvement over the current
situation.
This is somewhat pursuant to eliminating
[flakeheaven](https://github.com/flakeheaven/flakeheaven), as it no longer
supports the latest version of flake8
[[ref](https://github.com/flakeheaven/flakeheaven/issues/132)]. But it also is
just the right thing to do as bare exceptions can cause problems.
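
As a minimal sketch of why the narrower clause matters (the helper names below are hypothetical and not part of RDFLib): a bare `except:` also catches `BaseException` subclasses such as `KeyboardInterrupt`, whereas `except Exception` lets them propagate.

```python
def swallow_bare(func):
    """Call func and suppress everything, including KeyboardInterrupt."""
    try:
        func()
    except:  # noqa: E722 - deliberately bare, to show the problem
        return "suppressed"
    return "ok"


def swallow_exception(func):
    """Call func and suppress only ordinary errors."""
    try:
        func()
    except Exception:
        return "suppressed"
    return "ok"


def interrupt():
    # Simulates the user pressing Ctrl-C while func is running.
    raise KeyboardInterrupt


def fail():
    raise ValueError("bad input")
```

With these definitions, `swallow_bare(interrupt)` hides the interrupt, while `swallow_exception(interrupt)` lets it reach the caller.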
|
|
|
|
|
|
|
|
| |
Disable
[`implicit_reexport`](https://mypy.readthedocs.io/en/stable/config_file.html#confval-implicit_reexport)
and eliminate all errors reported by mypy after this.
This helps ensure that import statements import from the right module and that
the `__all__` variable is correct.
|
|
|
|
|
|
|
|
|
| |
DESCRIBE queries (#2322)
Add an optional keyword-only `target_graph` argument to `rdflib.graph.Graph.cbd` and use this new argument in `evalDescribeQuery`.
This makes it possible to compute a concise bounded description without creating a new graph to hold the result, and also without potentially having to copy it to another final graph.
Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
|
|
|
|
|
| |
This change removes the redundant inheritance from `object` (i.e. `class
Foo(object): pass`) that is no longer needed in Python 3 and is a relic from
Python 2.
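
A small illustration, using hypothetical class names: in Python 3 both spellings produce the same kind of class, so the explicit base adds nothing.

```python
class Legacy(object):  # Python 2 style, redundant in Python 3
    pass


class Modern:  # equivalent in Python 3
    pass


# Both classes implicitly or explicitly derive from object.
legacy_bases = Legacy.__bases__
modern_bases = Modern.__bases__
```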
|
|
|
|
|
|
|
| |
This change narrows import so that things are imported from the Python module
where they are defined instead of importing them from a module that re-exports
them, e.g. change import of `Graph` to import from the `rdflib.graph` module
instead of from the `rdflib` module. This helps avoid problems with circular
imports.
|
|
|
|
|
|
| |
Compatibility handling for `collections.abc.Mapping` and
`collections.abc.MutableMapping` is not needed as RDFLib currently only supports
Python 3.7 and newer, and those classes are available from `collections.abc` in
Python 3.7.
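
For illustration, the direct import can be used as follows on any supported Python version. The `describe` helper is hypothetical and only shows the two classes in use.

```python
from collections.abc import Mapping, MutableMapping


def describe(obj):
    """Classify an object by the mapping ABC it satisfies."""
    if isinstance(obj, MutableMapping):
        return "mutable mapping"
    if isinstance(obj, Mapping):
        return "read-only mapping"
    return "not a mapping"
```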
|
|
|
|
|
| |
This change eliminates some situations where a mutable object (i.e., a dictionary) was used as the default value for functions in the `rdflib.plugins.sparql.processor` module and related code. It replaces these situations with `typing.Optional` that defaults to `None`, and is then handled within the function. Luckily, some of the code that the SPARQL Processor relied on already had this style, meaning not a lot of changes had to be made.
This change also makes a small update to the logic in the SPARQL Processor's query function to simplify the if/else statement. This better mirrors the implementation in the `UpdateProcessor`.
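
The pitfall and the replacement pattern can be sketched like this. The function names and the `calls` key are hypothetical and only serve to make the shared state visible; this is not the processor code itself.

```python
from typing import Any, Dict, Optional


def query_shared(bindings: Dict[str, Any] = {}) -> Dict[str, Any]:
    # Pitfall: the default dict is created once and shared by every call.
    bindings.setdefault("calls", 0)
    bindings["calls"] += 1
    return bindings


def query_safe(bindings: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    # Pattern described above: default to None and build a fresh dict inside.
    if bindings is None:
        bindings = {}
    bindings.setdefault("calls", 0)
    bindings["calls"] += 1
    return bindings
```

Calling `query_shared()` repeatedly keeps incrementing the same dictionary, while each `query_safe()` call starts from an empty one.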
|
|
|
|
|
| |
Previously, `rdflib.plugins.sparql.algebra.translateAlgebra()` maintained state via a file, with a fixed filename `query.txt`. With this change, use of that file is eliminated; state is now maintained in memory so that multiple concurrent `translateAlgebra()` calls, for example, should no longer interfere with each other.
The change is accomplished with no change to the client interface. Basically, the actual functionality has been moved into a class, which is instantiated and used as needed (once per call to `algebra.translateAlgebra()`).
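
The general shape of such a refactor might look like the sketch below. The class `_Translator` and its upper-casing "translation" are hypothetical placeholders; only the idea of keeping per-call state on a fresh instance, behind an unchanged public function, is taken from the description above.

```python
from typing import Iterable, List


class _Translator:
    """Holds the state for one translation in memory (hypothetical)."""

    def __init__(self, parts: Iterable[str]) -> None:
        self._parts = list(parts)
        self._buffer: List[str] = []  # replaces a shared scratch file

    def translate(self) -> str:
        for part in self._parts:
            self._buffer.append(part.upper())
        return " ".join(self._buffer)


def translate(parts: Iterable[str]) -> str:
    # Public entry point: one fresh instance per call, so concurrent
    # calls do not share any state.
    return _Translator(parts).translate()
```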
|
|
|
|
|
|
|
|
|
| |
`ROUND` was not correctly rounding negative numbers towards positive infinity,
`ENCODE_FOR_URI` incorrectly treated `/` as safe, and `SECONDS` did not include
fractional seconds.
This change corrects these issues.
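
A minimal sketch of the intended `ROUND` and `ENCODE_FOR_URI` behaviour, written with the standard library only. The function names are hypothetical and this is not the RDFLib implementation.

```python
from decimal import ROUND_FLOOR, Decimal
from urllib.parse import quote


def sparql_round(value: Decimal) -> Decimal:
    """Round to the nearest integer, ties going towards positive infinity."""
    return (value + Decimal("0.5")).to_integral_value(rounding=ROUND_FLOOR)


def encode_for_uri(text: str) -> str:
    """Percent-encode everything except unreserved characters."""
    # safe="" makes "/" get escaped too; quote() treats it as safe by default.
    return quote(text, safe="")
```

Note that Python's built-in `round()` rounds ties to the nearest even number, so it cannot be used directly for this.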
- Closes <https://github.com/RDFLib/rdflib/issues/2151>.
|
|
|
|
|
|
|
|
|
|
| |
`Context.to_dict` is used in JSON-LD serialization, but it was not implemented.
This change adds the method.
- Closes <https://github.com/RDFLib/rdflib/issues/2138>.
---------
Co-authored-by: Marc-Antoine Parent <maparent@acm.org>
|
|
|
|
|
|
|
|
|
|
| |
A variable was only being initialized for string-valued inputs, but if a `dict`
input was passed the variable would still be accessed, resulting in a
`UnboundLocalError`.
This change initializes the variable always, instead of only when string-valued
input is used to construct a JSON-LD context.
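
The failure mode can be reproduced with a reduced example. The function and variable names here are hypothetical and do not mirror the JSON-LD context code.

```python
def load_context_buggy(source):
    if isinstance(source, str):
        source_url = source
    # For a dict input, source_url was never assigned.
    return source_url


def load_context_fixed(source):
    source_url = None  # always initialized
    if isinstance(source, str):
        source_url = source
    return source_url
```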
- Closes <https://github.com/RDFLib/rdflib/issues/2303>.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
build(deps-dev): bump mypy from 1.0.1 to 1.1.1
Bumps [mypy](https://github.com/python/mypy) from 1.0.1 to 1.1.1.
- [Release notes](https://github.com/python/mypy/releases)
- [Commits](https://github.com/python/mypy/compare/v1.0.1...v1.1.1)
updated-dependencies:
- dependency-name: mypy
dependency-type: direct:development
update-type: version-update:semver-minor
Also added type ignores for newly detected type errors.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
|
|
|
|
|
|
|
|
|
| |
docs: document available security measures
Several security measures can be used to mitigate risk when processing
potentially malicious input.
This change adds documentation about available security measures and
examples and tests that illustrate their usage.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A bit of a roundabout reason why this matters now, but basically:
I want to add examples for securing RDFLib with `sys.addaudithook`
and `urllib.request.install_opener`. I also want to be sure examples
are actually valid, and runnable, so I was adding static analysis
and simple execution of examples to our CI.
During this, I noticed that examples use `initBindings` with
`Dict[str,...]`, which was not valid according to mypy, but then after
some investigation I realized the type hints in some places were too
strict.
So the main impetus for this is actually to relax the type hints in
`rdflib.graph`, but to ensure this is valid I'm adding a bunch of type
hints I had saved up to `rdflib.plugins.sparql`.
Even though this PR looks big, it has no runtime changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds type hints to some of the SPARQL parser plugin code.
Includes a couple of small consequent changes:
1. Minor refactor of `prettify_parsetree()`, separating the public-facing callable from the internal code that does not need to be public-facing. That allows the public-facing callable to have more informative and restrictive type-hints for its arguments.
2. Added some test-coverage for `expandUnicodeEscapes()` - initially for my own understanding, but seems useful to leave it in place since I didn't see test-coverage for that function.
There should be no backwards-incompatible changes in this PR - at least, not intentionally.
---------
Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
|
|
|
|
|
|
|
|
| |
Mainly so that users can use RDFLib in a safer way, and that we can make
safer changes to RDFLib in future.
There are also some accommodating type-hint related changes outside of `rdflib.util`.
This change does not have a runtime impact.
|
|
|
|
|
|
|
| |
Add some small diverse type hints. Type hints make RDFLib safer to use
and change, as changes and usage can be validated using static
analysers like mypy.
This change does not have a runtime impact.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I have added a bunch of tests for `InputSource` handling, checking
most kinds of input source with most parsers. During this, I detected
the following issues that I fixed:
- `rdflib.util._iri2uri()` was URL quoting the `netloc` parameter, but
this is wrong and the `idna` encoding already takes care of special
characters. I removed the URL quoting of `netloc`.
- HexTuple parsing was handling the input source in a way that would
only work for some input sources, and not raising errors for other
input sources. I changed the input source handling to be more generic.
- `rdflib.parser.create_input_source()` incorrectly used `file.buffer`
instead of `source.buffer` when dealing with IO stream sources.
Other changes with no runtime impact include:
- Changed the HTTP mocking stuff in test slightly to accommodate
serving arbitrary files, as I used this in the `InputSource` tests.
- Don't use Google in tests, as we keep getting
`urllib.error.HTTPError: HTTP Error 429: Too Many Requests`
from it.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add type hints to:
- `rdflib/parser.py`
- `rdflib/plugins/parser/*.py`
- some JSON-LD utils
- `rdflib/exceptions.py`.
This is mainly because the work I'm doing to fix
<https://github.com/RDFLib/rdflib/issues/1844> is touching some of
this parser stuff and the type hints are useful to avoid mistakes.
No runtime changes are included in this PR.
|
|
|
|
|
|
|
|
|
| |
This adds an implementation for SPARQL DESCRIBE queries, using the built-in `cbd` method. I see there are several issues and PRs for DESCRIBE implementation. I believe this should close #479 and should resolve #1913, or at least pick up where it left off. It should also resolve #1096.
This implementation should support the full SPARQL specification for DESCRIBE; either explicit IRIs can be provided (with no WHERE clause), or variables projected from a WHERE clause can be provided, or variables projected from a WHERE clause AND explicit IRIs can be provided. If a WHERE clause is provided, it should be evaluated the same way as it would for a SELECT DISTINCT query (including dataset specifications).
The expected results for the test cases provided match the behaviour seen when running the same queries against the same data using ARQ.
A possible future extension would be to add a global option (similar to `rdflib.plugins.sparql.SPARQL_LOAD_GRAPHS`) to change the method used to describe resources instead of always using CBD.
|
|
|
| |
Altering order in which aggregate variable aliases are renamed to user-defined variable names to ensure that when defining a `VALUES` pattern outside of a `GROUP BY`, the variables in the query are correctly joined to those defined in the `VALUES` pattern.
|
|
|
|
|
|
|
| |
Fixing use of `SELECT *` in sub-select within `SELECT *` parent query as discovered in https://github.com/RDFLib/rdflib/issues/1722.
When an instance of `SELECT *` is encountered, the query tree/plan builder now correctly considers the projected variables of any sub-select statements when deciding which variables should be projected out.
Fixes <https://github.com/RDFLib/rdflib/issues/1722>.
|
|
|
|
|
|
|
| |
(#2180)
The SPARQLConnector instance created by a SPARQLStore instance providing endpoint url + query string params using POST method did not append the query string to endpoint url.
Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
|
|
|
| |
Since the names of empty graphs do not appear in the serialization, do not consider them for namespace issues.
|
|
|
| |
The base URI passed to `_prep_sources` was being overwritten in anticipation of processing inner nestings, but this caused problems when processing multiple inputs. Changed the assignment to `base` to `new_base`.
|
|
|
|
|
|
| |
Copy kwargs dict to prevent POST headers from being used in GET requests and vice versa.
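
A reduced sketch of the idea, assuming a shared dictionary of request arguments. The function `build_request_args` and the header values are hypothetical and not the connector's actual code.

```python
import copy
from typing import Any, Dict


def build_request_args(base_kwargs: Dict[str, Any], method: str) -> Dict[str, Any]:
    # Work on a deep copy so per-request headers never leak into base_kwargs.
    args = copy.deepcopy(base_kwargs)
    headers = args.setdefault("headers", {})
    if method == "POST":
        headers["Content-Type"] = "application/x-www-form-urlencoded"
    return args
```

A shallow `dict(base_kwargs)` would not be enough here, because the nested `headers` dictionary would still be shared.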
Co-authored-by: Mark van der Pas <mark.van.der.pas@semaku.com>
Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
|
|
|
| |
`rdflib.plugins.sparql.parserutils.plist` has no good purpose.
|
|
|
|
|
|
| |
New mypy version is reporting new errors.
In the long run we need to switch to poetry so we can better control
this.
|
|
|
|
|
|
| |
Match (#2134)
In order to use the function, we must know that we need to pass a Match with a capture group at position 2.
This is non-obvious.
|
| |
|
|
|
| |
ServiceGraphPattern has clearly been implemented per the preceding line.
|
|
|
|
| |
New mypy has more specific type errors so the type: ignore[misc] is no
longer effective.
|
|
|
|
|
| |
Add encoding "charset=UTF-8" to Content-Type header in `SPARQLConnector.update()` request.
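
As an illustration of what such a header looks like on a request, here is a sketch using `urllib.request.Request`. The function name, the endpoint URL in the usage note and the `application/sparql-update` media type are assumptions made for the example.

```python
from urllib.request import Request


def build_update_request(endpoint: str, update: str) -> Request:
    """Build (but do not send) a POST request carrying a SPARQL update."""
    return Request(
        endpoint,
        data=update.encode("utf-8"),  # body encoded to match the charset
        headers={"Content-Type": "application/sparql-update; charset=UTF-8"},
        method="POST",
    )
```

For example, `build_update_request("http://example.org/sparql", "CLEAR ALL")` yields a request whose body bytes and declared charset agree.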
Fixes #2095
|
|
|
|
|
|
|
| |
Add type hints to `rdflib.query` and result format implementations, also
add/adjust ignores and type hints in other modules to accommodate the
changes.
This does not include any runtime changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TriG serializer was only considering BNode references inside a single
graph and not counting the BNodes subjects as references when considering if a
BNode should be serialized as unlabeled blank nodes (i.e. `[ ]`), and as a
result it was serializing BNodes as unlabeled if they were in fact referencing
BNodes in other graphs.
One caveat of this change is that some RDF Datasets may be serialized
less succinctly, in that unlabeled blank nodes would not be used in places where it is
technically possible to use them. This can be trivially fixed, but a trivial fix
increases the computational complexity of serialization significantly.
Other changes:
- Removed the roundtrip xfail that this change fixed.
- Added another roundtrip test which has various combinations of BNode
references across graphs in a dataset, this test fails for JSON-LD
however, so while this change removes one xfail it also now adds
another.
- Set the default indent_size and style in `.editorconfig` so as to avoid
relying on undefined system defaults.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
More or less complete type hints for `rdflib.plugins.sparql.algebra` and
`rdflib.plugins.sparql.operators`.
This does not change runtime behaviour.
Other changes:
- Fixed line endings of `test/test_issues/test_issue1043.py`
and `test/test_issues/test_issue910.py`.
- Removed a type hint comment that was present in rdflib/plugins/sparql/algebra.py
This is split-off from https://github.com/RDFLib/rdflib/pull/1850.
|
|
|
|
|
|
|
|
|
|
|
|
| |
More or less complete type hints for the rdflib.graph module.
Other changes:
- Improved/simplified type hints in `rdflib.store` and store plugins.
- Add type ignores for various type errors that occur with the type
hints.
This is split-off from <https://github.com/RDFLib/rdflib/pull/1850>.
This PR does not change runtime behaviour.
|
|
|
|
|
|
|
|
| |
Fixed the generation of VALUES block for federated queries.
The values block was including non-variable values like BNodes which resulted
in invalid queries.
Closes #2079.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check datatype against `None` instead of checking its truthiness (i.e.
`if datatype is not None:` instead of `if datatype:`).
Checking truthiness instead of `is not None` causes a blank string to
be treated the same as None. The consequence of this was that
`Literal.datatype` could be a `str`, a `URIRef` or `None`, instead of
just a `URIRef` or `None` as was seemingly intended.
Other changes:
- Changed the type of `Literal.datatype` to be `Optional[URIRef]`
instead of `Optional[str]` now that `str` will always be converted to
`URIRef` even if it is a blank string.
- Changed `rdflib.util._coalesce` to make it easier and safer to use
with a non-`None` default value.
- Changed `rdflib.util` to avoid issues with circular imports.
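
The difference between the two `datatype` checks described above can be seen in a reduced sketch. `FakeURIRef` is a hypothetical stand-in for `URIRef`, and the two functions are illustrative only.

```python
from typing import Optional


class FakeURIRef(str):
    """Hypothetical stand-in for a URI reference type."""


def normalize_truthy(datatype: Optional[str]):
    # Old style: "" is falsy, so it skips conversion and stays a plain str.
    if datatype:
        return FakeURIRef(datatype)
    return datatype


def normalize_is_not_none(datatype: Optional[str]) -> Optional[FakeURIRef]:
    # New style: only None skips conversion, so "" is converted as well.
    if datatype is not None:
        return FakeURIRef(datatype)
    return None
```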
|
|
|
|
|
|
|
|
|
| |
This file provides a single function `serialize_in_chunks()` which can serialize a
Graph into a number of NT files with a maximum number of triples or maximum file size.
There is an option to preserve any prefixes declared for the original graph in the first
file, which will be a Turtle file.
Co-authored-by: Iwan Aucamp <aucampia@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Always parse HexTuple files as utf-8. This was the intent anyway, as
evidenced by the code that will raise a warning if the encoding provided
for a HexTuple file is something other than utf-8 or None.
https://github.com/RDFLib/rdflib/blob/cfa418074b27b12aac905ba266b002a237c5ff4c/rdflib/plugins/parsers/hext.py#L73-L79
Not adding any tests as this code is already tested and an XFAIL is
removed in this patch.
|
|
|
|
|
|
|
|
|
| |
Narrow the context identifier type from `Node` to `IdentifiedNode` as
`Node` is too broad and no supported format (N3, RDF) allows using
anything other than `IdentifiedNode` as context identifiers.
The only change here that has a runtime impact is the change in
`Graph.__init__` to check isinstance against `IdentifiedNode` instead
of `Node`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add type hints and aliases for `rdflib.store` and
`rdflib.plugins.stores` and also add a couple of more type hints and
aliases to `rdflib.graph`.
This PR contains no runtime changes.
Other changes:
- Changed some imports to be more specific (e.g. `import from
rdflib.graph` instead of `import from rdflib`). This is to reduce
the probability of circular imports.
- Ignore `E231` (missing whitespace after ',') in flake8 as black is
managing the whitespaces and seems to be bumping heads with flake8
with spaces after `,` sometimes.
- Install `berkeleydb-stubs` when doing extensive testing with tox.
- Added `devtools/diffrtpy.py` which is a script that can be used with
`git difftool` to generate compact diffs for python code. This should
make it a lot easier to review PRs that change type hints to verify
that they don't have a runtime impact.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed the following problems with the SPARQL XML result parsing:
- The `parse` methods of both the `lxml` and `xml` modules do not
work well with `TextIO` objects: the `xml` module works with `TextIO`
objects if the XML encoding is `utf-8`, but not if it is `utf-16`, and
with `lxml`, parsing fails for both `utf-8` and `utf-16`. To fix this I
changed the XML result parser to first convert `TextIO` to `bytes` and
then feed the `bytes` to `parse()` using `BytesIO`.
- The parser was operating on all elements inside `results` and
`result` elements, even if those elements were not `result` and `binding`
elements respectively. This was causing problems with `lxml`, as `lxml`
also returns comments when iterating over elements. To fix this I added
a check for the element tags so that only the correct elements are
considered.
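
Both fixes can be sketched with the standard library `xml.etree.ElementTree` module. The function `parse_bindings` is hypothetical, and this is a simplified illustration under those assumptions rather than the parser's actual code.

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, List, TextIO

# Namespace of the SPARQL Query Results XML Format.
SPARQL_NS = "{http://www.w3.org/2005/sparql-results#}"


def parse_bindings(source: TextIO) -> List[Dict[str, str]]:
    """Return one dict per <result>, mapping variable names to values."""
    # Convert the text stream to bytes and parse those, telling the parser
    # which encoding was used to produce them.
    data = source.read().encode("utf-8")
    parser = ET.XMLParser(encoding="utf-8")
    tree = ET.parse(io.BytesIO(data), parser=parser)

    rows: List[Dict[str, str]] = []
    results = tree.getroot().find(SPARQL_NS + "results")
    if results is None:
        return rows
    for result in results:
        # Only consider <result> elements; skip anything else.
        if result.tag != SPARQL_NS + "result":
            continue
        row: Dict[str, str] = {}
        for binding in result:
            # Only consider <binding> elements inside a <result>.
            if binding.tag != SPARQL_NS + "binding":
                continue
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name", "")] = value.text or ""
        rows.append(row)
    return rows
```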
Other changes:
- Added type hints to `rdflib.plugins.sparql.results.xmlresults`.
- Run with `lxml` on some permutations in the test matrix.
- Removed `rdflib.compat.etree`, as this was not very helpful for the
SPARQL XML Result parser and it was not used elsewhere.
- Added an `lxml` environment to tox which installs `lxml` and
`lxml-stubs`.
- Expanded SPARQL result testing by adding some additional
parameters.
Related issues:
- Fixes https://github.com/RDFLib/rdflib/issues/2035
- Fixes https://github.com/RDFLib/rdflib/issues/1847
|
|
|
|
|
|
|
|
| |
Remove a pre-Python 3.5 regex-related workaround, as it uses a private
method from the `re` module and mypy complains about this. It is also
no longer needed, as we no longer support pre-3.5 versions of Python.
No tests are added as all changed parts of the `Builtin_REPLACE`
function are already covered by existing tests.
|
|
|
|
|
|
|
| |
Change the import of `xml.sax.handler` in the TriX parser so that it
imports from `xml.sax` and not from `xml.sax.saxutils`. Importing from
`xml.sax.saxutils` causes mypy to fail but it is also wrong as there is
no documented `handler` in `xml.sax.saxutils`.
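
For reference, the documented import path looks like this in use. The `ElementCounter` class is a hypothetical example handler, not the TriX parser.

```python
from xml.sax import handler


class ElementCounter(handler.ContentHandler):
    """Counts the start tags seen while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1
```

An instance can be passed to `xml.sax.parseString` to count the elements in a document.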
|
|
|
|
|
|
|
|
| |
This reverts commit 1740214b591eb0f3e57fc6c6b63da2b29f7ae946.
Was working on the wrong branch.
Refs: 1740214b591eb0f3e57fc6c6b63da2b29f7ae946
|
|
|
|
|
|
|
| |
Change the import of `xml.sax.handler` in the TriX parser so that it
imports from `xml.sax` and not from `xml.sax.saxutils`. Importing from
`xml.sax.saxutils` causes mypy to fail but it is also wrong as there is
no documented `handler` in `xml.sax.saxutils`.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable nitpicky mode for Sphinx and fix all warnings and errors
that occur when running with nitpicky enabled.
Other changes:
- Add a tox environment for building docs (-docs). This is so we can
test building docs on various versions of python as there seems to be
some differences in warnings between different versions. This tox
environment is enabled for linux CI builds.
- Change readthedocs to use python 3.9 as earlier versions do not handle
`@typing.overload` with type aliases.
- Fixes https://github.com/RDFLib/rdflib/issues/1878
|