The C library libxml has huge benefits:

  * standards-compliant XML support

  * full-featured

  * actively maintained by XML experts

  * fast. fast! FAST!

Features of libxml:

  * Parsing

  * Tree-based (DOM-ish) XML structure

  * XPath support

  * XSLT support

  * RELAX NG (schema) support

libxml ships with Python bindings. These are autogenerated from C. This is
cool. Life is perfect. Or is it?

The libxml Python bindings have a bunch of disadvantages:

  * very low level and C-ish (not Pythonic)

  * underdocumented, and so huge that you get lost in them

  * work with UTF-8 encoded strings, not native Python unicode

  * can cause segfaults from Python

  * require manual memory management!

lxml is a Python binding based on Pyrex that aims to fix these
drawbacks. Aims (read: these are TODOs):

  * Pythonic API

  * documented

  * use Python unicode strings in API

  * safe (no segfaults)

  * No manual memory management!

Tradeoffs:

  * slower, because of the extra wrapping layer. Then again, the
    underlying libxml is plenty fast, and many operations (parsing,
    XPath evaluation, etc.) will run at full speed, as they take place
    entirely in C.

  * not all features of libxml are exposed (at least not right
    away, and not without help)

All we have now are proof-of-concept bindings with:

  * automatic destruction of documents (refcounted)

  * simple tree walking API

  * embryonic ElementTree style API
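
For readers who have not seen the ElementTree style, here is roughly
what it looks like. The sketch uses the xml.etree.ElementTree module
from the Python standard library as a stand-in. How much of this the
proof-of-concept bindings already cover, and under which module name, is
not stated here, so treat it as an illustration of the target style
only.

```python
import xml.etree.ElementTree as ET

# Parse a small document from a string; the result is the root element.
root = ET.fromstring(
    "<doc><item id='1'>first</item><item id='2'>second</item></doc>"
)

# An element behaves like a list of its child elements.
for child in root:
    print(child.tag, child.get("id"), child.text)

# Simple path-based lookup of matching children.
texts = [item.text for item in root.findall("item")]
print(texts)
```

Tags, attributes and text are ordinary Python values, and tree walking
is plain iteration, with no node pointers or free calls in sight.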
