delta/python-lxml.git - github.com: lxml/lxml.git

	Commit message	Author	Age	Files	Lines
*	Avoid using the deprecated "imp" module. (HEAD, master)	Stefan Behnel	2023-05-11	1	-1/+2
\|	Closes https://bugs.launchpad.net/lxml/+bug/2018137
*	Avoid using the deprecated "imp" module.	Stefan Behnel	2023-05-11	1	-2/+4
\|	Closes https://bugs.launchpad.net/lxml/+bug/2018137
*	Fix inheritance order of mixin classes in lxml.html (GH-340)	xmo-odoo	2022-05-17	1	-2/+42
\|	As the old FIXME comment from https://github.com/lxml/lxml/commit/8132c755adad4a75ba855d985dd257493bccc7fd notes, the mixin should come first for the inheritance to be correct (the left-most class is the first in the MRO, at least if no diamond inheritance is involved). Also fix the odd `super` call in `HtmlMixin`, likely stemming from the incorrect MRO. Fixes the inheritance order of all `HTML*` base classes though it probably doesn't matter for other than `HtmlElement`.
*	Fix a test in Py2. (lxml-4.6.5, lxml-4.6)	Stefan Behnel	2021-12-12	1	-1/+6
\|
*	Cleaner: cover some more cases where scripts could sneak through in specially crafted style content.	Stefan Behnel	2021-12-11	1	-1/+64
\|
*	Cleaner: Remove SVG image data URLs since they can embed script content.	Stefan Behnel	2021-11-11	1	-0/+45
\|	Reported as GHSL-2021-1038
*	Cleaner: Prevent "@import" from re-occurring in the CSS after replacements, e.g. "@@importimport".	Stefan Behnel	2021-11-11	1	-0/+20
\|	Reported as GHSL-2021-1037
*	Add HTML-5 "formaction" attribute to "defs.link_attrs" (GH-316)	Kevin Chung	2021-03-21	1	-0/+15
\|	Resolves https://bugs.launchpad.net/lxml/+bug/1888153 See https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-28957
*	Prevent combinations of <math/svg> and <style> to sneak JavaScript through the HTML cleaner.	Stefan Behnel	2020-11-26	2	-3/+25
\|
*	Prevent combinations of <noscript> and <style> to sneak JavaScript through the HTML cleaner.	Stefan Behnel	2020-10-18	1	-0/+10
\|
*	html: Add InputGetter.items() method and make .keys() return the field names in document order.	Stefan Behnel	2020-08-12	1	-0/+16
\|
*	Cleaner: Catch bad arg combo in constructor (GH-301)	Mike Lissner	2020-06-20	1	-0/+15
\|	Fixes https://bugs.launchpad.net/lxml/+bug/1882606
*	LP#1882606: ``Cleaner.clean_html()`` discarded comments and PIs regardless of the corresponding configuration option, if "remove_unknown_tags=True" was set.	Stefan Behnel	2020-06-13	2	-0/+42
\|
*	Merge branch lxml-4.2 into master.	Stefan Behnel	2018-09-09	1	-3/+3
\|\
\| *	Fix typo in test file.	Stefan Behnel	2018-08-26	1	-1/+1
\| \|
\| *	Fix: make the cleaner also remove javascript URLs that use escaping.	Stefan Behnel	2018-09-09	1	-3/+3
\| \|
* \|	Merge pull request #270 from hugovk/rm-2.6	scoder	2018-08-26	12	-62/+32
\|\ \	Remove redundant Python <= 2.6 code
\| * \|	Remove ununsed imports	Hugo	2018-08-26	10	-12/+8
\| \| \|
\| * \|	Use tempfile.NamedTemporaryFile directly	Hugo	2018-08-26	1	-3/+1
\| \| \|
\| * \|	Min version of LIBXML_VERSION is now 2.7	Hugo	2018-08-26	1	-2/+1
\| \| \|
\| * \|	Replace function call with set literal	Hugo	2018-08-25	1	-1/+1
\| \| \|
\| * \|	Remove redundant code for Python <= 2.6	Hugo	2018-08-25	9	-48/+25
\| \|/
* \|	Fix typo in test file.	Stefan Behnel	2018-08-26	1	-1/+1
\|/
*	Clean up test code for better readability.	Stefan Behnel	2017-11-12	1	-6/+16
\|
*	Add better fallbacks to SelectElement.value	Christopher Schramm	2017-10-05	2	-1/+38
\|	If a browser encounters a select element without any selected option element, it automatically pre-selects the first one. If multiple options are selected, all but the last one get deselected.
*	LP#1567526: Make soupparser sort-of handle empty and plain text input instead of raising a TypeError.	Stefan Behnel	2017-08-13	1	-0/+10
\|
*	Fix tests after making "useChardet" handling smarter.	Stefan Behnel	2017-08-12	1	-5/+16
\|
*	soupparse: add test case for double-hyphen	ha shao	2017-07-29	1	-0/+11
\|
*	Fix a typo: referrs -> refers	Felix Yan	2017-06-12	1	-1/+1
\|
*	Perform full-document detection on decoded bytes.	Koert van der Veer	2017-03-16	1	-0/+6
\|	Closes #1673355
*	add tests for bug #1665241	Ashish Kulkarni	2017-02-16	1	-1/+25
\|
*	ignore disabled form inputs	Kristian Klemon	2016-07-26	1	-1/+3
\|
*	Merge pull request #180 from chripede/patch-2	scoder	2016-07-24	1	-2/+21
\|\	Add inline_style option
\| *	Fix tests for inline_style	Christian Pedersen	2015-11-20	1	-2/+21
\| \|
* \|	Exclude `file` field `value` from `FormElement.form_values`.	Tomas Divis	2016-07-20	1	-0/+2
\|/	Similar to `submit`, `image` and `reset`, browsers don't send `file` field values in the POST when form is submitted. `FormElement.form_values` method already correctly excluded `submit`, `image` and `reset` fields, now it also excludes the `file` fields.
*	simplify import check in test and keep original import exception on failures	Stefan Behnel	2015-06-05	1	-13/+6
\|
*	unittest check beautifulsoup/bs4 import properly	mozbugbox	2015-06-06	1	-5/+14
\|
*	BeautifulSoup 4: handle Doctype and Declaration	mozbugbox	2015-06-05	1	-9/+12
\|	bs4 can use lxml or html5lib to parse html content. Force bs4 builtin html parser when parse html with soupparser.
*	fix doctest in Py3	Stefan Behnel	2015-02-18	1	-2/+2
\|
*	implement a set-like interface for the HTML 'class' attribute	Stefan Behnel	2015-02-18	1	-0/+57
\|
*	refactor new code in soupparser, extend tests	Stefan Behnel	2015-02-16	1	-8/+22
\|
*	Make soupparser properly handle everything outside the root tag (doctype declaration, comments, processing instructions.)	Olli Pottonen	2015-02-16	1	-0/+55
\|	See https://bugs.launchpad.net/lxml/+bug/1341964.
*	LP#1419354: fix meta-redirect URL parsing when preceded by whitespace	Stefan Behnel	2015-02-08	1	-0/+12
\|
*	lxml.html.document_fromstring ensure_head_body	jab	2014-09-04	1	-0/+17
\|	When using lxml.html.document_fromstring to process html outside your control, you can't be sure it will have a head element or body element. Allowing document_fromstring to accept an ensure_head_body option saves you from having to write code like:
\|	    doc = document_fromstring(html)
\|	    try:
\|	        doc.head
\|	    except IndexError:
\|	        doc.insert(0, Element('head'))
\|	    # now we can safely reference doc.head
\|	You can instead just write:
\|	    doc = document_fromstring(html, ensure_head_body=True)
*	include links in meta refresh tags in iterlinks	jab	2014-08-22	1	-0/+7
\|
*	strip control characters before looking for evil text content in Cleaner	Stefan Behnel	2014-04-17	1	-1/+8
\|
*	clean up test module (mostly formatting)	Stefan Behnel	2014-02-21	1	-2/+16
\|
*	fix typo in comment	Stefan Behnel	2014-02-21	1	-1/+1
\|
*	add test	Stefan Behnel	2014-02-20	1	-1/+11
\|
*	more faking of NamedTemporaryFile(delete=False) in Py2.[45]	Stefan Behnel	2014-02-19	1	-1/+12
\|
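
The 2022-05-17 entry in the log above (GH-340) argues that a mixin has to be listed first among the base classes, because the left-most base comes first in the MRO. A minimal sketch of that point in plain Python follows. The class names `Base`, `Mixin`, `MixinFirst` and `MixinLast` are hypothetical and are not the lxml.html classes; only the ordering effect is illustrated.

```python
# Hypothetical classes, not lxml code: they only show how base-class order
# decides whether a mixin's method is reached at all.

class Base:
    def describe(self):
        # Does not call super(), like a typical concrete base class.
        return "Base"


class Mixin:
    def describe(self):
        # Cooperative call: delegates to the next class in the MRO.
        return "Mixin+" + super().describe()


class MixinFirst(Mixin, Base):
    """Mixin listed first: its method runs, then delegates to Base."""


class MixinLast(Base, Mixin):
    """Mixin listed last: Base.describe wins and the mixin is never reached."""


print([cls.__name__ for cls in MixinFirst.__mro__])
print(MixinFirst().describe())
print([cls.__name__ for cls in MixinLast.__mro__])
print(MixinLast().describe())
```

With the mixin first, its override is found before the base implementation; with the mixin last, the base method shadows it, which is the situation the commit message describes as incorrect.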