| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
In addition to checking the spelling in our documentation, we are now also checking the spelling of the README.md and similar files as well as comments in our Python code.
|
|
|
|
|
|
|
|
|
|
|
| |
The `NOT_STRONG_RE` regex matchs 1, 2, or 3 * or _ which are surrounded by
white space to prevent them from being parsed as tokens. However, the
surrounding white space should not be consumed by the regex, which is why
lookhead and lookbehind assertions are used. As `^` cannot be matched in a
lookbehind assertion, it is left outside the assertion, but as it is zero
length, that should not matter.
Tests added and/or updated to cover various edge cases. Fixes #1300.
|
|
|
| |
This completely removes all objects which were deprecated in version 3.0 (this change will be included in version 3.4). Given the time that has passed, and the fact that older unmaintained extensions are not likely to support the new minimum Python version, this is little concern about breaking older extensions.
|
| |
|
| |
|
|
|
|
| |
Corrected "shorte" to "short"
|
|
|
|
| |
Fixes #894.
|
|
|
|
|
| |
Resolves issue that can occur with complex emphasis combinations.
Fixes #979
|
|
|
|
|
|
|
|
| |
cElementTree is a deprecated alias for ElementTree since Python 3.3.
Also drop the recommendation to import etree from markdown.util,
and deprecate markdown.util.etree.
|
|
|
|
|
|
|
| |
* Python syntax upgraded using `pyupgrade --py3-plus`
* Travis no longer uses `sudo`. See https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration
See #760 for Python Version Support Timeline and related dicussion.
|
| |
|
|
|
|
|
|
| |
Remove misleading escaped_chars_in_js test
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
|
|
|
|
|
|
| |
Part of the discussion in #798.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
|
|
|
|
|
|
| |
All whitespace characters should be treated the same by inline patterns.
Previoulsy, emphasis patterns were only accounting for spaces, but not
other whitepsace characters such as newlines. Fixes #783.
|
|
|
| |
Previously only newlines preceded by whitespace were collapsed. Fixes #742.
|
|
|
|
| |
Fixes #712.
|
|
|
|
|
|
|
|
|
| |
The smart_strong extension has been removed and its behavior is now the
default (smart em and smart strong are the default). The legacy_em
extension restores legacy behavior (no smart em or smart strong).
This completes the removal of keywords. All parser behavior is now
modified by extensions, not by keywords on the Markdown class.
|
|
|
|
|
|
|
|
|
|
| |
Serializer should only escape & in attributes if not part of &
Better regex avoid Unicode and `_` in amp detection.
In general, we don't want to escape already escaped content, but with code content, we want literal representations of escaped content, so have code content explicitly escape its content before placing in AtomicStrings.
Closes #669.
|
|
|
|
| |
Fixes #435.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, instances of the Markdown class were represented as any one
of 'md', 'md_instance', or 'markdown'. This inconsistency made it
difficult when developing extensions, or just maintaining the existing
code. Now, all instances are consistently represented as 'md'.
The old attributes on class instances still exist, but raise a
DeprecationWarning when accessed. Also on classes where the instance was
optional, the attribute always exists now and is simply None if no
instance was provided (previously the attribute wouldn't exist).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All processors and patterns now get "registered" to a Registry.
Each item is given a name (string) and a priority. The name is for
later reference and the priority can be either an integer or float
and is used to sort. Priority is sorted from highest to lowest. A
Registry instance is a list-like iterable with the items auto-sorted
by priority. If two items have the same priority, then they are
listed in the order there were "registered". Registering a new
item with the same name as an already registered item replaces
the old item with the new item (however, the new item is sorted by
its newly assigned priority). To remove an item, "deregister" it by
name or index.
A backwards compatible shim is included so that existing simple
extensions should continue to work. DeprecationWarnings will
be raised for any code which calls the old API.
Fixes #418.
|
|
|
|
|
|
|
| |
If you have existing documents that use the legacy attributes format,
then you should enable the legacy_attrs extension for those documents.
Everyone is encouraged to use the attr_list extension going forward.
Closes #643. Work adapted from 0005d7a of the md3 branch.
|
|
|
|
| |
Add new InlineProcessor class that handles inline processing much better and allows for more flexibility. This adds new InlineProcessors that no longer utilize unnecessary pretext and posttext captures. New class can accept the buffer that is being worked on and manually process the text without regex and return new replacement bounds. This helps us to handle links in a better way and handle nested brackets and logic that is too much for regular expression. The refactor also allows image links to have links/paths with spaces like links. Ref #551, #613, #590, #161.
|
| |
|
| |
|
|
|
| |
Python 3.6 is starting to reject invalid escapes. Regular expression patterns should be raw strings to avoid having regex escapes being mistaken for invalid string escapes. Fixes #611.
|
|
|
|
|
| |
Ancestry exclusion for inline patterns.
Adds the ability for an inline pattern to define a list of ancestor tag names that should be avoided. If a pattern would create a descendant of one of the listed tag names, the pattern will not match. Fixes #596.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
This aims to escape code in a more expected fashion. This handles
when backticks are escaped and when the escapes before backticks are
escaped.
|
|
|
|
| |
This fixes warnings with pycodestyle ≥ 2.1, see PyCQA/pycodestyle#400.
|
|
|
|
|
| |
Don’t allow spaces in image links. This was also causing an issue
where any text following a space was treated as a title. Ref #484.
|
|
|
|
|
|
| |
Drppoed the non-greedy quantifier from the end of the inlinePatterns as it
served no useful purpose and was actually (in very rare edge cases) causing
newlines to be dropped. FIxes #439. Thanks to @munificent for the report.
|
|
|
|
|
|
|
| |
Apparently this is a new requirement of flake8. That's the thing about using
tox. Every test run reinstalls all dependencies so an updated dependency might
instroduce new errors. I could specify a specific version, but I like staying
current.
|
|
|
|
|
|
| |
Got all but a couple files in the tests (ran out of time today).
Apparently I have been using some bad form for years (although a few
things seemed to look better before the update). Anyway, conformant now.
|
|
|
|
|
|
|
|
|
|
|
| |
The logic for the current regex for strong/em and em/strong was sound,
but the way it was implemented caused some unintended side effects.
Whether it is a quirk with regex in general or just with Python’s re
engine, I am not sure. Put basically `(\*|_){3}` causes issues with
nested bold/italic. So, allowing the group to be defined, and then
using the group number to specify the remaining sequential chars is a
better way that works more reliably `(\*|_)\2{2}. Test from issue #365
was also added to check for this case in the future.
|
|
|
|
|
|
|
|
|
| |
Fixes #253. Thanks to @facelessuser for the tests. Although I removed
a bunch of weird ones (even some that passed) from his PR (#342). For
the most part, there is no definitive way for those to be parsed. So
there is no point of testing for them. In most of those situations,
authors should be mixing underscores and astericks so it is clear
what is intended.
|
|
|
|
|
|
| |
See #253. Prior to this patch, if any inline processors returned an element
with a tail, the tail would end up empty. This resolves that issue and will
allow for #253 to be fixed. Thanks to @facelessuser for the work on this.
|
|
|
|
|
|
| |
These couple lines were from an old - no longer used - method of
stashing inlines. There is no need for it today. The if statement would
never evaluate True.
|
|
|
|
| |
The rest should have test cases added.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation was wrong as it also percent encoded query strings
(which should be plus encoded) and calling urllib.quote on the path (and
urllib.quote_plus on the query string) assumes the url is not already encoded.
What if the document author pasted a url that was already encoded? She probably
did not intend for `%20` to become `%2520`. Or did she? It is now clear to me
why many implementation do nothing to urls. Just pass them though as-is. To bad
if they are not valid HTML. HTML authors have to encodee their own urls, so I
guess markdown authors have to as well.
|
|
|
|
|
|
| |
Leave all other chars prefaced by a backslash alone. Fixes #242.
Not sure why I thought that I needed to add another backslash.
Thanks for the report and the test case @mhubig.
|
| |
|
|
|
|
|
|
|
|
|
| |
A `from __future__ import ...` statement must go after any docstrings;
since putting them before the docstring means the docstring loses its
magic and just becomes a string literal. That then causes a syntax
error if there are further future statements after the false docstring.
This fixes issue #203, using the patch provided by @Arfrever.
|
|
|
|
|
|
|
|
|
|
| |
The most notable changes are the use of unicode_literals
and absolute_imports. Actually, absolute_imports was the
biggest deal as it gives us relative imports. For the first
time extensions import markdown relative to themselves.
This allows other packages to embed the markdown lib in a
subdir of their project and still be able to use our
extensions.
|
| |
|
| |
|
| |
|