|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the handling of HTTP status code 308 to behave more like
`urllib.request.HTTPRedirectHandler`, most critically, the new 308 handling will
create a new `urllib.request.Request` object with the new URL, which will
prevent state from being carried over from the original request.
One case where this is important is when the domain name changes, for example,
when the original URL is `http://www.w3.org/ns/adms.ttl` and the redirect URL is
`https://uri.semic.eu/w3c/ns/adms.ttl`. With the previous behaviour, the redirect
would contain a `Host` header with the value `www.w3.org` instead of
`uri.semic.eu` because the `Host` header is placed in
`Request.unredirected_hdrs` and takes precedence over the `Host` header in
`Request.headers`.
Other changes:
- Only handle HTTP status code 308 on Python versions before 3.11 as Python 3.11
will handle 308 by default [[ref](https://docs.python.org/3.11/whatsnew/changelog.html#id128)].
- Move code which uses `http://www.w3.org/ns/adms.ttl` and
`http://www.w3.org/ns/adms.rdf` out of `test_guess_format_for_parse` into a
separate parameterized test, which instead uses the embedded http server.
This allows the test to fully control the `Content-Type` header in the
response instead of relying on the value that the server is sending.
This is needed because the server is sending `Content-Type: text/plain` for
the `adms.ttl` file, which is not a valid RDF format, and the test is
expecting `Content-Type: text/turtle`.
Fixes:
- <https://github.com/RDFLib/rdflib/issues/2382>.
|
|
I have added a bunch of tests for `InputSource` handling, checking
most kinds of input source with most parsers. During this, I detected
the following issues that I fixed:
- `rdflib.util._iri2uri()` was URL quoting the `netloc` parameter, but
this is wrong and the `idna` encoding already takes care of special
characters. I removed the URL quoting of `netloc`.
- HexTuple parsing was handling the input source in a way that would
only work for some input sources, and not raising errors for other
input sources. I changed the input source handling to be more generic.
- `rdflib.parser.create_input_source()` incorrectly used `file.buffer`
instead of `source.buffer` when dealing with IO stream sources.
Other changes with no runtime impact include:
- Changed the HTTP mocking stuff in test slightly to accommodate
serving arbitrary files, as I used this in the `InputSource` tests.
- Don't use Google in tests, as we keep getting
`urllib.error.HTTPError: HTTP Error 429: Too Many Requests`
from it.
|