path: root/test/test_misc/test_input_source.py
Commit message | Author | Age | Files | Lines
* fix: HTTP 308 Permanent Redirect status code handling (#2389) | Iwan Aucamp | 2023-05-17 | 1 | -16/+1

  Change the handling of HTTP status code 308 to behave more like `urllib.request.HTTPRedirectHandler`, most critically, the new 308 handling will create a new `urllib.request.Request` object with the new URL, which will prevent state from being carried over from the original request.

  One case where this is important is when the domain name changes, for example, when the original URL is `http://www.w3.org/ns/adms.ttl` and the redirect URL is `https://uri.semic.eu/w3c/ns/adms.ttl`. With the previous behaviour, the redirect would contain a `Host` header with the value `www.w3.org` instead of `uri.semic.eu` because the `Host` header is placed in `Request.unredirected_hdrs` and takes precedence over the `Host` header in `Request.headers`.

  Other changes:

  - Only handle HTTP status code 308 on Python versions before 3.11 as Python 3.11 will handle 308 by default [[ref](https://docs.python.org/3.11/whatsnew/changelog.html#id128)].
  - Move code which uses `http://www.w3.org/ns/adms.ttl` and `http://www.w3.org/ns/adms.rdf` out of `test_guess_format_for_parse` into a separate parameterized test, which instead uses the embedded http server. This allows the test to fully control the `Content-Type` header in the response instead of relying on the value that the server is sending. This is needed because the server is sending `Content-Type: text/plain` for the `adms.ttl` file, which is not a valid RDF format, and the test is expecting `Content-Type: text/turtle`.

  Fixes:

  - <https://github.com/RDFLib/rdflib/issues/2382>.
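The stale `Host` header described in the commit above can be reproduced without any network access. The following is a minimal sketch, not the rdflib code itself: it assumes the earlier behaviour amounted to reusing one `Request` object and pointing it at the redirect target, and it simulates the header that urllib's HTTP handler would record before sending. The variable names are illustrative only.

```python
from urllib.request import Request

# Original request; the explicit add_unredirected_header() call stands in for
# what urllib's HTTP handler does before sending (simulated, no network I/O).
original = Request("http://www.w3.org/ns/adms.ttl")
original.add_unredirected_header("Host", "www.w3.org")

# Assumed old approach: reuse the same object and only swap the URL.
original.full_url = "https://uri.semic.eu/w3c/ns/adms.ttl"
stale_host = original.unredirected_hdrs.get("Host")

# Approach described in the commit: build a fresh Request for the new URL,
# so no header state is carried over from the original request.
fresh = Request("https://uri.semic.eu/w3c/ns/adms.ttl")

print(original.host)             # host parsed from the new URL
print(stale_host)                # Host header left over from the old URL
print(fresh.has_header("Host"))  # the fresh request has no Host header yet
```

In the reused object the parsed host and the recorded `Host` header disagree, which is the mismatch the commit message describes. The fresh object has no recorded `Host` header, so one matching the new URL can be added when the request is sent.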
* build(deps-dev): bump black from 22.12.0 to 23.1.0 (#2248) | dependabot[bot] | 2023-03-11 | 1 | -2/+1
* fix: small InputSource related issues (#2255) | Iwan Aucamp | 2023-03-11 | 1 | -0/+693

  I have added a bunch of tests for `InputSource` handling, checking most kinds of input source with most parsers.

  During this, I detected the following issues that I fixed:

  - `rdflib.util._iri2uri()` was URL quoting the `netloc` parameter, but this is wrong and the `idna` encoding already takes care of special characters. I removed the URL quoting of `netloc`.
  - HexTuple parsing was handling the input source in a way that would only work for some input sources, and not raising errors for other input sources. I changed the input source handling to be more generic.
  - `rdflib.parser.create_input_source()` incorrectly used `file.buffer` instead of `source.buffer` when dealing with IO stream sources.

  Other changes with no runtime impact include:

  - Changed the HTTP mocking stuff in test slightly to accommodate serving arbitrary files, as I used this in the `InputSource` tests.
  - Don't use Google in tests, as we keep getting `urllib.error.HTTPError: HTTP Error 429: Too Many Requests` from it.
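The `netloc` point in the commit above can be illustrated with the standard library alone. This is a hedged sketch and does not reproduce `rdflib.util._iri2uri()`: the host name `bücher.example` and the port are made-up inputs, and the port case is just one way that quoting a network location can go wrong, not necessarily the case that motivated the fix.

```python
from urllib.parse import quote

# IDNA encoding already turns a non-ASCII host name into plain ASCII.
hostname_ascii = "bücher.example".encode("idna").decode("ascii")

# A network location may also carry a port, separated by ":".
netloc = f"{hostname_ascii}:8080"

# URL quoting on top of that adds nothing for the host name, but it
# percent-encodes the ":" delimiter, which changes the meaning of the netloc.
quoted_netloc = quote(netloc)

print(hostname_ascii)
print(netloc)
print(quoted_netloc)
```

Once the host name is IDNA encoded, the remaining characters in the network location are structural delimiters, so leaving `netloc` unquoted keeps them intact.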