    >>> from lxml.html import document_fromstring, fragment_fromstring, tostring

lxml.html has two parsers, one for HTML and one for XHTML:

    >>> from lxml.html import HTMLParser, XHTMLParser
    >>> html = "<html><body><p>Hi!</p></body></html>"

    >>> root = document_fromstring(html, parser=HTMLParser())
    >>> print(root.tag)
    html

    >>> root = document_fromstring(html, parser=XHTMLParser())
    >>> print(root.tag)
    html

There are two functions for converting between HTML and XHTML:

    >>> from lxml.html import xhtml_to_html, html_to_xhtml

    >>> doc = document_fromstring(html, parser=HTMLParser())
    >>> tostring(doc)
    b'<html><body><p>Hi!</p></body></html>'

    >>> html_to_xhtml(doc)
    >>> tostring(doc)
    b'<html:html xmlns:html="http://www.w3.org/1999/xhtml"><html:body><html:p>Hi!</html:p></html:body></html:html>'
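
Both conversion functions modify the tree in place and return nothing, so the effect is easiest to see by comparing tag names before and after each call. The following is a minimal sketch of that round trip, not part of the original test. The sample markup and the names XHTML_NS, plain_tags, xhtml_tags and restored_tags are assumptions made for illustration.

```python
from lxml.html import document_fromstring, html_to_xhtml, xhtml_to_html

# The XHTML namespace URI (assumed constant name, used only in this sketch).
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Sample markup chosen for illustration.
doc = document_fromstring("<html><body><p>Hi!</p></body></html>")

# Tag names as parsed by the default HTML parser: no namespace.
plain_tags = [el.tag for el in doc.iter()]

# html_to_xhtml() moves every element into the XHTML namespace, in place.
html_to_xhtml(doc)
xhtml_tags = [el.tag for el in doc.iter()]

# xhtml_to_html() strips the XHTML namespace again, in place.
xhtml_to_html(doc)
restored_tags = [el.tag for el in doc.iter()]

print(plain_tags)
print(xhtml_tags)
print(restored_tags)
```

Comparing el.tag values rather than serialized bytes keeps the check independent of how the serializer chooses namespace prefixes.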