path: root/tests/test_syntax
Commit message (Collapse)AuthorAgeFilesLines
* Port all smarty tests to the new frameworkHEADmasterDmitry Shachnev2023-05-161-1/+161
|
* Use pyspelling to check spelling.Waylan Limberg2023-04-067-19/+19
| | | In addition to checking the spelling in our documentation, we are now also checking the spelling of the README.md and similar files as well as comments in our Python code.
* Improve standalone * and _ parsing.Waylan Limberg2022-11-151-3/+45
| The `NOT_STRONG_RE` regex matches 1, 2, or 3 * or _ which are surrounded by white space to prevent them from being parsed as tokens. However, the surrounding white space should not be consumed by the regex, which is why lookahead and lookbehind assertions are used. As `^` cannot be matched in a lookbehind assertion, it is left outside the assertion, but as it is zero length, that should not matter. Tests added and/or updated to cover various edge cases. Fixes #1300.
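A minimal sketch of the assertion-based matching described above, using only the standard `re` module. The pattern is reconstructed from the commit description and is an assumption; the regex in the codebase may differ in detail, and `standalone_tokens` is a hypothetical helper.

```python
import re

# Assumed pattern: 1-3 asterisks or underscores, preceded by the start of the
# string or white space, and followed by white space or the end of the string.
# `^` sits outside the lookbehind because it cannot be matched inside one.
NOT_STRONG_RE = re.compile(r'(^|(?<=\s))(\*{1,3}|_{1,3})(?=\s|$)')


def standalone_tokens(text):
    """Return (position, token) pairs for runs that should stay literal."""
    return [(m.start(2), m.group(2)) for m in NOT_STRONG_RE.finditer(text)]
```

Because the white space is only asserted and never consumed, the match span covers the token alone, so the surrounding text is left untouched for later processing.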
* Move backslash unescaping to treeprocessorWaylan Limberg2022-07-151-0/+36
| | | | | | | | | | By unescaping backslash escapes in a treeprocessor, the text is properly escaped during serialization. Fixes #1131. As it is recognized that various third-party extensions may be calling the old class at `postprocessors.UnescapePostprocessor` the old class remains in the codebase, but has been deprecated and will be removed in a future release. The new class `treeprocessors.UnescapeTreeprocessor` should be used instead.
* Pass language to Pygments formatter in CodeHiliteLiang-Bo Wang2022-05-182-0/+132
| * Add an extra option `lang_str` to pass the language of the code block to the specified Pygments formatter.
| * Include an example custom Pygments formatter in the documentation that includes the language of the code in the output using the new option.
|
| Resolves #1255.
* Support for custom Pygments formatterShrikant Sharat Kandula2022-05-091-1/+129
| | | | | This adds configuration support for using a custom Pygments formatter, either by giving the string name, or a custom formatter class (or callable).
* Support custom CSS class on TOC elementJannis Vajen2022-05-051-0/+43
| | | Closes #1224
* Footnotes improvementsysard2022-05-051-0/+36
| * footnotes: Allow using a backlink title without the footnote number
|   - The placeholder '{}' is optional, so a user can choose whether to include the footnote number in the backlink text.
|   - The modification is backward compatible with configurations using the old '%d' placeholder.
| * footnotes: Allow custom superscript text
|   - The new SUPERSCRIPT_TEXT option specifies a placeholder that receives the footnote number for the superscript text.
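The placeholder handling described in this commit can be sketched as follows. This is an illustration only: `render_backlink_title` and `render_superscript` are hypothetical helpers, and the extension's real code path is not shown in this log.

```python
def render_backlink_title(template, number):
    """Fill the footnote number into a backlink title template.

    The legacy '%d' placeholder is rewritten to '{}' first, so old
    configurations keep working. A template without any placeholder is
    returned unchanged, because str.format ignores unused arguments.
    """
    return template.replace('%d', '{}').format(number)


def render_superscript(template, number):
    """Fill the footnote number into a superscript text template."""
    return template.format(number)
```

Note that a template containing literal braces would need them doubled (`{{` and `}}`) for `str.format` to accept it.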
* Update th/td to use style attributeGaige B Paulsen2022-05-051-0/+860
| This allows better interoperation with CSS style sheets, as the align object on the TH is skipped if the CSS uses 'text-align: inherit' and the previous 'text-align' is used instead (or the default: left).
|
| Added an override to restore the original `align` behavior.
| Moved existing tests to the new test infrastructure.
| Added new tests to test the configuration parameter.
| Updated documentation to document the configuration parameter.
* Ensure fenced code attributes are properly escaped.Waylan Limberg2022-05-041-0/+18
| | | Fixes #1247.
* extensions: copy config dict on each highlighted blockGert van Dijk2022-04-182-0/+77
| | | | | | | | | This fixes a bug where any subsequent highlighted block with codehilite would result in the omission of the style setting, because it was popped off the dict. It would then fall back to pygments_style 'default' after the first block. Fixes #1240
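The bug pattern described here can be reproduced with a plain dict. The function names below are hypothetical stand-ins, not the extension's actual code; only the `pygments_style` key and its `'default'` fallback come from the commit message.

```python
def highlight_buggy(config):
    # Popping from the shared dict mutates it, so the style is gone
    # by the time the second block is processed.
    return config.pop('pygments_style', 'default')


def highlight_fixed(config):
    # Work on a per-block copy so the shared configuration stays intact.
    local_config = config.copy()
    return local_config.pop('pygments_style', 'default')
```

Calling `highlight_buggy` twice with the same dict yields the configured style once and `'default'` afterwards, while `highlight_fixed` returns the configured style every time.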
* [style]: fix various typos in docstrings and commentsFlorian Best2022-03-187-13/+13
|
* Disallow square brackets in reference link ids.Waylan Limberg2022-01-101-0/+34
| | | | | | We already disallow right square brackets. This also disallows left square brackets, which ensures link references will be less likely to collide with standard links in some weird edge cases. Fixes #1209.
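A simplified sketch of what excluding both kinds of square bracket from a reference id looks like. The pattern below is an assumption written for illustration, and is much shorter than a full link reference definition pattern (titles are not handled).

```python
import re

# Assumed, simplified pattern: the id may contain neither '[' nor ']'.
REFERENCE_ID_RE = re.compile(r'^[ ]{0,3}\[([^\[\]]*)\]:[ ]*(\S+)')


def parse_reference(line):
    """Return (id, url) for a link reference definition, or None."""
    m = REFERENCE_ID_RE.match(line)
    return (m.group(1), m.group(2)) if m else None
```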
* Ensure <summary> tags are parsed correctly.Waylan Limberg2021-11-031-0/+20
| | | | Fixes #1079.
* Improve email address validation for Automatic LinksCarlos2021-08-111-0/+63
|
* Don't process shebangs in codehilite when processing fenced codeIsaac Muse2021-08-041-0/+30
| | | Fixes #1156.
* Better toc detectionCharles de Beauchesne2021-07-271-0/+8
| | | Fixes #1160.
* toc: Do not remove diacritical marks when slugify_unicode is usedDmitry Shachnev2021-03-241-4/+14
| | | | | | | Update the existing test and add a new one to make sure that the behavior of default slugify function has not changed. Fixes #1118.
* Ensure permalinks and anchorlinks are not restricted by toc_depthWaylan Limberg2021-02-241-5/+393
| | | | | | | | | | | | This fixes a regression which was introduced with support for toc_depth. Relevant tests have been moved and updated to the new framework. Fixes #1107. The test framework also received an addition. The assertMarkdownRenders method now accepts a new keyword expected_attrs which consists of a dict of attrs and expected values. Each is checked against the attr of the Markdown instance. This was needed to check the value of md.toc and md.toc_tokens in some of the included tests.
* Ensure admonition content is detabbed properlyIsaac Muse2021-02-051-0/+30
|
* Preserve text immediately before an admonitionOleh Prypin2020-12-301-0/+21
|
* Use simplified regex for html placeholders (#1086)Waylan Limberg2020-12-081-0/+11
| | | Co-authored-by: Reilly Raab <raabrp@gmail.com>
* Properly parse unclosed tags in code spansWaylan Limberg2020-11-231-0/+105
| * fix unclosed pi in code span
| * fix unclosed dec in code span
| * fix unclosed tag in code span
|
| Closes #1066.
* Properly parse processing instructions in md_in_htmlWaylan Limberg2020-11-191-0/+50
| Empty tags do not have a `markdown` attribute set on them. Therefore, there is no need to check the mdstack to determine behavior. If we are in any md_in_html state (regardless of block, span, etc) the behavior is the same. Fixes #1070.
* Properly parse code spans in md_in_html (#1069)Waylan Limberg2020-11-181-0/+66
| | | | | | | | | | This reverts part of 2766698 and re-implements handling of tails in the same manner as the core. Also, ensure line_offset doesn't raise an error on bad input (see #1066) and properly handle script tags in code spans (same as in the core). Fixes #1068.
* Fix issues related to hr tagsIsaac Muse2020-10-242-0/+222
| | | | | | | | | | | Ensure that start/end tag handler does not include tags in the previous paragraph. Provide special handling for tags like hr that never have content. Use sets for block tag lists as they are much faster when comparing if an item is in the list. Fixes #1053.
* Avoid catastrophic backtracking in `hr` regexWaylan Limberg2020-10-241-0/+23
| | | | Fixes #1055.
* Ensure when tag text is None that it is converted to empty stringIsaac Muse2020-10-211-0/+18
| | | Fixes #1049
* Properly parse inline HTML in md_in_htmlIsaac Muse2020-10-191-0/+160
| | | Fixes #1040 and fixes #1045.
* Account for Etree Elements in HTML StashWaylan Limberg2020-10-141-1/+16
| By calling str on all stash elements we ensure they don't raise an error. Worst case, something like `<Element 'div' at 0x000001B2DAE94900>` gets inserted into the output. However, with the override in the md_in_html extension, we actually serialize and reinsert the original HTML. Worst case, an HTML block which should be parsed as Markdown gets skipped by the extension (`<div markdown="block"></div>` gets inserted into the output). The tricky part is testing, as there should be no known cases where this ever occurs. Therefore, we forcefully pass an etree Element directly to the method in the test. That said, as #1040 is unresolved at this point, I have tested locally with a real existing case and it works well. Related to #1040.
* Correctly parse raw `script` and `style` tags. (#1038)Waylan Limberg2020-10-121-0/+85
| * Ensure unclosed script tags are parsed correctly by providing a workaround for https://bugs.python.org/issue41989.
| * Avoid cdata_mode outside of HTML blocks, such as in inline code spans.
|
| Fixes #1036.
* Skip tests with pygments version mismatch.Waylan Limberg2020-10-082-261/+282
| If pygments is installed and the version doesn't match the expected version, then any relevant tests will fail. To avoid failing tests due to different output by pygments, those tests will be skipped. The pygments tox env sets the `PYGMENTS_VERSION` environment variable, so that env will always run those tests against the expected version.
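One way to express this skip rule with the standard `unittest` module is sketched below. `should_skip` and the test class are hypothetical; only the `PYGMENTS_VERSION` variable name comes from the commit message, and the test suite's real mechanism may differ.

```python
import os
import unittest

try:
    import pygments
    INSTALLED_VERSION = pygments.__version__
except ImportError:
    INSTALLED_VERSION = None  # Pygments is not installed.

REQUIRED_VERSION = os.environ.get('PYGMENTS_VERSION', '')


def should_skip(installed, required):
    """Skip only when Pygments is installed in a version other than the expected one."""
    return installed is not None and installed != required


class HighlightingTests(unittest.TestCase):

    @unittest.skipIf(
        should_skip(INSTALLED_VERSION, REQUIRED_VERSION),
        'Pygments=={} is required'.format(REQUIRED_VERSION),
    )
    def test_highlighted_output(self):
        # Placeholder body: a real test would compare rendered output here.
        pass
```

Under this rule the tests still run when Pygments is absent, since the output then does not depend on any Pygments version.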
* Some test tweaks.Waylan Limberg2020-10-081-1/+6
| * Pygments specific tests now only run when the pygments version installed matches the expected version. That version is defined in an environment variable (PYGMENTS_VERSION) in the 'pygments' tox env (see #1030).
| * When the Python lib tidylib is installed but the underlying C lib is not, the relevant tests are now skipped rather than fail. This matches the behavior when the Python lib is not installed. The tox envs are now useful on systems which don't have the C lib installed.
* Ensure consistent handling of classes by fenced_code and codehilite (#1033)Waylan Limberg2020-10-081-17/+17
| * All non-language classes should always be assigned to the pre tag.
| * The language identifying class should never be included with the general list of classes.
|
| Fixes #1032
* Update tests for pygments-2.7.1Michał Górny2020-10-072-12/+12
| | | | Closes #1030
* Support unicode ids in toc (#970)Antoine2020-10-011-0/+22
| | | A second function, `slugify_unicode` was added rather than changing the existing function so as to maintain backward compatibility. While an `encoding` parameter was added to the `slugify` function, we can't expect existing third party functions to accept a third parameter. Therefore, the two parameter API was preserved with this change.
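The compatibility concern can be illustrated with a sketch in which a third, optional parameter selects Unicode handling while a two-parameter wrapper keeps the signature third-party code expects. The function bodies and the `keep_unicode` parameter name are assumptions for illustration, not the library's code.

```python
import re
import unicodedata


def slugify_sketch(value, separator, keep_unicode=False):
    """Turn heading text into an id; ASCII-only unless keep_unicode is set."""
    if not keep_unicode:
        # Decompose accented characters, then drop anything non-ASCII.
        value = unicodedata.normalize('NFKD', value)
        value = value.encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[{}\s]+'.format(re.escape(separator)), separator, value)


def slugify_unicode_sketch(value, separator):
    """Two-parameter wrapper, so callers that pass (value, separator) still work."""
    return slugify_sketch(value, separator, keep_unicode=True)
```

Any existing callable that accepts only `(value, separator)` remains a valid drop-in, because nothing in this arrangement requires passing the third argument.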
* Refactor HTML Parser (#803)Waylan Limberg2020-09-225-3/+2747
| The HTML parser has been completely replaced. The new HTML parser is built on Python's html.parser.HTMLParser, which alleviates various bugs and simplifies maintenance of the code. The md_in_html extension has been rebuilt on the new HTML Parser, which drastically simplifies it. Note that raw HTML elements with a markdown attribute defined are now converted to ElementTree Elements and are rendered by the serializer. Various bugs have been fixed. Link reference parsing, abbreviation reference parsing and footnote reference parsing have all been moved from preprocessors to blockprocessors, which allows them to be nested within other block level elements. Specifically, this change was necessary to maintain the current behavior in the rebuilt md_in_html extension. A few random edge-case bugs (see the included tests) were resolved in the process. Closes #595, closes #780, closes #830 and closes #1012.
* Fix complex scenarios with definition, ordered, and unordered lists (#1007)Isaac Muse2020-07-271-0/+323
| | | Fixes #918.
* Fix complex scenarios with lists and admonitions (#1006)Isaac Muse2020-07-261-0/+194
| Add better logic to admonitions to account for more complex list cases.
|
| Fixes #1004
* Fix HR which follows strong em.Waylan Limberg2020-07-011-0/+16
| | | | Fixes #897.
* Support short reference image links.Waylan Limberg2020-07-011-0/+24
| | | | Fixes #894.
* Add support for attr_lists in table headers.Waylan Limberg2020-06-301-6/+14
|
* Tune attr list regexWaylan Limberg2020-06-301-0/+72
| Ignore empty braces. Braces must contain at least one non-whitespace character to be recognized as an attr list. Attr lists for table cells must be at the end of the cell content and must be separated from the content by at least one space. This appears to be a breaking change. However, it is consistent with the behavior elsewhere. Fixes #898.
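The two rules above (no empty braces, and a separating space at the end of a table cell) can be sketched with the standard `re` module. The patterns and the `split_cell` helper are assumptions made for illustration and may not match the extension's own expressions.

```python
import re

# Assumed pattern: the braces must hold at least one non-space character.
ATTR_LIST_PATTERN = r'\{:?[ ]*([^}\n ][^}\n]*)[ ]*\}'
ATTR_LIST_RE = re.compile(ATTR_LIST_PATTERN)

# For table cells: at least one space before the braces, anchored at the end.
CELL_ATTR_LIST_RE = re.compile(r'[ ]+' + ATTR_LIST_PATTERN + r'[ ]*$')


def split_cell(cell):
    """Return (content, attrs) for a table cell; attrs is None if absent."""
    m = CELL_ATTR_LIST_RE.search(cell)
    if m is None:
        return cell, None
    return cell[:m.start()], m.group(1).strip()
```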
* Fix unescaping of HTML characters <> in CodeHilite. (#990)Rohitt Vashishtha2020-06-291-0/+23
| | | | | | | | | | | Previously, we'd unescape both `&amp;gt;` and `&gt;` to the same string because we were running the &amp; => & replacement first. By changing the order of this replacement, we now convert: `&amp;gt; &gt;` => `&gt; >` as expected. Fixes #988.
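The ordering problem is easy to reproduce with plain string replacement. Both functions below are hypothetical illustrations of the before and after behavior, not the extension's code.

```python
def unescape_buggy(text):
    # Replacing '&amp;' first turns '&amp;gt;' into '&gt;',
    # which the later replacement then unescapes a second time.
    text = text.replace('&amp;', '&')
    text = text.replace('&lt;', '<')
    text = text.replace('&gt;', '>')
    return text


def unescape_fixed(text):
    # Replacing '&amp;' last means each entity is unescaped exactly once.
    text = text.replace('&lt;', '<')
    text = text.replace('&gt;', '>')
    text = text.replace('&amp;', '&')
    return text
```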
* Limit depth of blockquotes using Python's recursion limit. (#991)Waylan Limberg2020-06-291-0/+51
| If the Python stack comes within 100 frames of the recursion limit, then the nesting limit of blockquotes is met. Any remaining text, including angle brackets, is simply wrapped in a paragraph. To increase the nesting depth, increase Python's recursion limit. However, be aware that each level of recursion will likely result in multiple frames being added to the Python stack. Therefore, the recursion depth and nesting depth are not one-to-one. Performance is a concern here. However, the current solution seems like a reasonable compromise. It doesn't slow things down too much, but also avoids Markdown input resulting in an error. This is mostly only a concern with contrived input anyway. For the average Markdown document, this will likely never be an issue. Fixes #799.
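A rough sketch of the "within 100 frames of the limit" check, using only the `sys` module. The function names are hypothetical and the frame walk relies on `sys._getframe`, which is a CPython implementation detail; the library's own helper may be written differently.

```python
import sys


def stack_depth():
    """Count the frames currently on the Python stack (CPython-specific)."""
    depth = 0
    frame = sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def nearing_recursion_limit(margin=100):
    """True once fewer than `margin` frames remain before the recursion limit."""
    return sys.getrecursionlimit() - stack_depth() < margin


def count_safe_levels(level=0):
    """Recurse until the guard trips, then report how deep we got."""
    if nearing_recursion_limit():
        return level
    return count_safe_levels(level + 1)
```

A recursive parser would run the same guard before descending into another nested block and fall back to a non-recursive path, such as emitting the remaining text as a paragraph, when it returns true.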
* Refactor fenced_code & codehilite options (#816)Waylan Limberg2020-06-232-2/+1301
| * Add `language-` prefix to output when syntax highlighting is disabled for both codehilite and fenced_code extensions.
| * Add `lang_prefix` config option to customize the prefix.
| * Add a 'pygments' env to tox which runs the tests with Pygments installed. Pygments is locked to a specific version in the env.
| * Updated codehilite to accept any Pygments options.
| * Refactor fenced code attributes.
|   - ID attr is defined on `pre` tag.
|   - Add support for attr_list extension, which allows setting arbitrary attributes.
|   - When syntax highlighting is enabled, any pygments options can be defined per block in the attr list.
|   - For backward compatibility, continue to support `hl_lines` outside of an attr_list. That is the only attr other than lang which is allowed without the brackets (`{}`) of an attr list. Note that if the brackets exist, then everything, including lang and hl_lines, must be within them.
| * Resolves #775. Resolves #334. Addresses #652.
* Fix issues with complex emphasisfacelessuser2020-06-222-0/+28
| | | | | Resolves issue that can occur with complex emphasis combinations. Fixes #979
* TOC fix for AtomicString handling (#934)Isaac Muse2020-04-061-0/+22
| | | Fixes #931.
* Add permalink_title option (#886)Waylan Limberg2019-11-261-0/+20
| Adds a new `permalink_title` option to the TOC extension, which allows the title attribute of a permalink to be set to something other than the default English string "Permanent link". Fixes #781.
* Add anchorlink_class and permalink_class options to TOCWaylan Limberg2019-11-261-0/+67
| Two new configuration options have been added to the toc extension: `anchorlink_class` and `permalink_class`, which allow class(es) to be assigned to the `anchorlink` and `permalink` HTML respectively. This allows using icon fonts from CSS for the links. Therefore, an empty string passed to `permalink` now generates an empty `permalink`. Previously no `permalink` would have been generated. Based on #776.