path: root/tests/test_data.py
Commit message | Author | Age | Files | Lines
* Speed up JSON and reduce HTML formatter consumption (#1569) | Kurt McKee | 2020-10-26 | 1 | -56/+180
|
| * Update the JSON-LD keyword list to match JSON-LD 1.1
|
|   Changes in this patch:
|
|   * Update the JSON-LD URL to HTTPS
|   * Update the list of JSON-LD keywords
|   * Make the JSON-LD parser less dependent on the JSON lexer implementation
|   * Add unit tests for the JSON-LD lexer
|
| * Add unit tests for the JSON parser
|
|   This includes:
|
|   * Testing valid literals
|   * Testing valid string escapes
|   * Testing that object keys are tokenized differently from string values
|
| * Rewrite the JSON lexer
|
|   Related to #1425
|
|   Included in this change:
|
|   * The JSON parser is rewritten
|   * The JSON bare object parser no longer requires additional code
|   * `get_tokens_unprocessed()` returns as much as it can to reduce yields
|     (for example, side-by-side punctuation is not returned separately)
|   * The unit tests were updated
|
| * Add unit tests based on Hypothesis test results
|
| * Reduce HTML formatter memory consumption by ~33% and speed it up
|
|   Related to #1425
|
|   Tested on a 118MB JSON file. Memory consumption tops out at ~3GB before
|   this patch and drops to only ~2GB with this patch. These were the command
|   lines used:
|
|       python -m pygments -l json -f html -o .\new-code-classes.html .\jc-output.txt
|       python -m pygments -l json -f html -O "noclasses" -o .\new-code-styles.html .\jc-output.txt
|
| * Add an LRU cache to the HTML formatter's HTML-escaping and line-splitting
|
|   For a 118MB JSON input file, this reduces memory consumption by ~500MB
|   and reduces formatting time by ~15 seconds.
|
| * JSON: Add a catastrophic backtracking test back to the test suite
|
| * JSON: Update the comment that documents the internal queue
|
| * JSON: Document in comments that ints/floats/constants are not validated
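The LRU-cache item in the commit above relies on highlighted source repeating the same short token strings many times. The sketch below illustrates that general idea using only the standard library. It is not the formatter's actual code: `escape_token`, `escape_tokens`, and the cache size of 1024 are hypothetical names and values chosen for the example.

```python
import html
from functools import lru_cache


# Hypothetical helper: memoize HTML escaping of individual token strings.
# The cache size is an arbitrary value picked for this illustration.
@lru_cache(maxsize=1024)
def escape_token(text: str) -> str:
    """Escape one token's text for HTML; repeated tokens hit the cache."""
    return html.escape(text, quote=True)


def escape_tokens(tokens):
    """Escape an iterable of token strings, reusing cached results."""
    return [escape_token(t) for t in tokens]


# JSON input tends to repeat keys and punctuation, so hits accumulate quickly.
escaped = escape_tokens(['"name"', ':', '"<b>"', ',', '"name"', ':', '"<b>"'])
info = escape_token.cache_info()
```

Whether a cache like this pays off depends on how repetitive the input is. `cache_info()` reports hits and misses, which gives a quick way to check that on real data.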
* fix regression in JSON lexer, bump to 2.7.1 (tag: 2.7.1) | Georg Brandl | 2020-09-17 | 1 | -7/+12
|
| Fixes #1544
* all: remove "u" string prefix (#1536) | Georg Brandl | 2020-09-08 | 1 | -51/+51
|
| * all: remove "u" string prefix
|
| * util: remove unirange
|
|   Since Python 3.3, all builds are wide unicode compatible.
|
| * unistring: remove support for narrow-unicode builds
|
|   which stopped being relevant with Python 3.3
* more explicitly define escape sequencies in JsonLexer (fix #1065) (#1528) | Nick Gerner | 2020-08-31 | 1 | -0/+28
|
| * more explicitly define escape sequencies in JsonLexer (fix #1065)
|
| * adding test coverage for #1065
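For context on what "explicitly defined" escape sequences can look like, here is a minimal standard-library sketch that separates valid JSON string escapes (as listed in RFC 8259) from invalid ones. It does not reproduce the JsonLexer rules from this commit; `JSON_ESCAPE` and `split_escapes` are hypothetical names used only for illustration.

```python
import re

# Valid JSON escapes per RFC 8259: \" \\ \/ \b \f \n \r \t and \uXXXX.
# Hypothetical pattern for illustration, not the lexer's actual rule.
JSON_ESCAPE = re.compile(r'\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4})')


def split_escapes(body):
    """Return (valid, invalid) backslash sequences found in a JSON string body.

    `body` is the raw text between the quotes, with escapes still unprocessed.
    """
    valid, invalid = [], []
    pos = 0
    while True:
        idx = body.find('\\', pos)
        if idx == -1:
            break
        match = JSON_ESCAPE.match(body, idx)
        if match:
            valid.append(match.group())
            pos = match.end()
        else:
            # Record the backslash plus the following character (if any).
            invalid.append(body[idx:idx + 2])
            pos = idx + 2
    return valid, invalid


valid, invalid = split_escapes(r'a\nb\u00e9\q\\')
```

Scanning from each backslash, rather than searching for escapes anywhere in the string, prevents the second character of one escape from being read as the start of another.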
* Remove unittest classes from the test suite. | Georg Brandl | 2019-11-10 | 1 | -103/+110
|
* Fix #1528 -- Yaml gets confused when a comment contains a key:value pair. | Matthäus G. Chajdas | 2019-07-20 | 1 | -1/+18
|
* Robustify json-object against unexpected '}' | Tim Hatch | 2016-05-31 | 1 | -0/+13
|
* Add a new lexer that assumes json object is already open. | Tim Hatch | 2016-05-31 | 1 | -0/+87
|
| Fixes #884