author: jab <jab@math.brown.edu> 2014-09-04 11:20:04 -0400
committer: jab <jab@math.brown.edu> 2014-09-04 13:39:08 -0400
commit: fef78cdafefb61ed9114a4697b8a0630119ee6d7
tree: 8cd5f5f65abf2d25b953be278fb9938c854abf97 /src/lxml/html/tests
parent: 10572e15b3d7b85fe807e249ec4e5808ef5733e6
lxml.html.document_fromstring ensure_head_body

When using lxml.html.document_fromstring to process html outside your
control, you can't be sure it will have a head element or body element.
Allowing document_fromstring to accept an ensure_head_body option saves
you from having to write code like:

    doc = document_fromstring(html)
    try:
        doc.head
    except IndexError:
        doc.insert(0, Element('head'))
    # now we can safely reference doc.head

You can instead just write:

    doc = document_fromstring(html, ensure_head_body=True)

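As a rough illustration of what the option is described as doing, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for lxml. The helper name `ensure_head_body` is hypothetical, and this is not the code added by the commit. It only mirrors the behaviour described in the message: add an empty head as the first child and an empty body as the last child when either is missing.

```python
import xml.etree.ElementTree as ET


def ensure_head_body(root):
    """Hypothetical helper: add empty <head>/<body> children to an <html> root if absent."""
    if root.find("head") is None:
        # <head> belongs before any other child, so insert it at index 0.
        root.insert(0, ET.Element("head"))
    if root.find("body") is None:
        # <body> belongs after <head>, so append it at the end.
        root.append(ET.Element("body"))
    return root


# Stand-in input: a document that has a body but no head.
doc = ensure_head_body(ET.fromstring("<html><body><p>test</p></body></html>"))
print(ET.tostring(doc, encoding="unicode", method="html"))
```

Checking with `find` instead of catching an exception handles both missing elements in the same way, and the helper leaves a document that already has both untouched.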
Diffstat (limited to 'src/lxml/html/tests')
-rw-r--r-- src/lxml/html/tests/test_basic.txt | 17
1 files changed, 17 insertions, 0 deletions
diff --git a/src/lxml/html/tests/test_basic.txt b/src/lxml/html/tests/test_basic.txt
index d7066402..ddbb0bf5 100644
--- a/src/lxml/html/tests/test_basic.txt
+++ b/src/lxml/html/tests/test_basic.txt
@@ -160,3 +160,20 @@ Bug 690319: Leading whitespace before doctype declaration should not raise an er
     >>> print(tostring(etree_document, encoding=unicode))
     <html></html>
+Feature https://github.com/lxml/lxml/pull/140: ensure_head_body option:
+
+    >>> from lxml.html import document_fromstring, tostring
+    >>> from functools import partial
+    >>> tos = partial(tostring, encoding=unicode)
+    >>> print(tos(document_fromstring('<p>test</p>')))
+    <html><body><p>test</p></body></html>
+    >>> print(tos(document_fromstring('<p>test</p>', ensure_head_body=True)))
+    <html><head></head><body><p>test</p></body></html>
+    >>> print(tos(document_fromstring('<meta>')))
+    <html><head><meta></head></html>
+    >>> print(tos(document_fromstring('<meta>', ensure_head_body=True)))
+    <html><head><meta></head><body></body></html>
+    >>> print(tos(document_fromstring('<html></html>')))
+    <html></html>
+    >>> print(tos(document_fromstring('<html></html>', ensure_head_body=True)))
+    <html><head></head><body></body></html>