[Lxml-checkins] r45116 - in lxml/trunk: doc src/lxml
scoder at codespeak.net
Sun Jul 15 23:45:24 CEST 2007
Author: scoder
Date: Sun Jul 15 23:45:24 2007
New Revision: 45116
Modified:
   lxml/trunk/doc/compatibility.txt
   lxml/trunk/src/lxml/parser.pxi
Log:
new ETCompatXMLParser subclass of XMLParser with an ElementTree compatible default setup
Modified: lxml/trunk/doc/compatibility.txt
==============================================================================
--- lxml/trunk/doc/compatibility.txt (original)
+++ lxml/trunk/doc/compatibility.txt Sun Jul 15 23:45:24 2007
@@ -106,8 +106,14 @@
while etree will read them in and treat them as Comment or
ProcessingInstruction elements respectively. This is especially visible
where comments are found inside text content, which is then split by the
- Comment element. You can disable this behaviour by passing the boolean
- ``remove_comments`` keyword argument to the parser you use.
+ Comment element.
+
+ You can disable this behaviour by passing the boolean ``remove_comments``
+ and/or ``remove_pis`` keyword arguments to the parser you use. For
+ convenience and to support portable code, you can also use the
+ ``etree.ETCompatXMLParser`` instead of the default ``etree.XMLParser``. It
+ tries to provide a default setup that is as close to the ElementTree parser
+ as possible.
* ElementTree has a bug when serializing an empty Comment (no text argument
given) to XML, etree serializes this successfully.
Modified: lxml/trunk/src/lxml/parser.pxi
==============================================================================
--- lxml/trunk/src/lxml/parser.pxi (original)
+++ lxml/trunk/src/lxml/parser.pxi Sun Jul 15 23:45:24 2007
@@ -742,6 +742,25 @@
self._parse_options = parse_options
+cdef class ETCompatXMLParser(XMLParser):
+ """An XML parser with an ElementTree compatible default setup. See the
+ XMLParser class for details.
+
+ This parser defaults to removing processing instructions and comments from
+ the tree.
+ """
+ def __init__(self, attribute_defaults=False, dtd_validation=False,
+ load_dtd=False, no_network=True, ns_clean=False,
+ recover=False, remove_blank_text=False, compact=True,
+ resolve_entities=True, remove_comments=True,
+ remove_pis=True):
+ XMLParser.__init__(self,
+ attribute_defaults, dtd_validation,
+ load_dtd, no_network, ns_clean,
+ recover, remove_blank_text, compact,
+ resolve_entities, remove_comments,
+ remove_pis)
+
cdef xmlDoc* _internalParseDoc(char* c_text, int options,
_ResolverContext context) except NULL:
# internal parser function for XSLT
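
The "portable code" use case mentioned in the documentation change can be sketched roughly as follows. This is only an illustration, not part of the commit: the sample document and the variable names are made up, and the fallback to the standard library's ``xml.etree.ElementTree`` is an assumption about how such code might be written. The idea is that with ``ETCompatXMLParser`` both branches should drop the comment and the processing instruction, so the surrounding text is not split.

```python
# Sketch of portable parsing code: prefer lxml, fall back to the stdlib.
# The sample document and the names below are illustrative assumptions.
try:
    from lxml import etree
    # Defaults to remove_comments=True and remove_pis=True (see the diff above).
    parser = etree.ETCompatXMLParser()
except ImportError:
    import xml.etree.ElementTree as etree
    # The stdlib parser ignores comments and PIs by default.
    parser = etree.XMLParser()

SAMPLE = "<root>before<!-- a comment -->after<?target data?></root>"

root = etree.fromstring(SAMPLE, parser)

# Neither the comment nor the PI ends up as a child element,
# so the text content stays in one piece.
print(len(root))   # number of child nodes
print(root.text)   # text content of <root>
```

With the default ``etree.XMLParser``, lxml would instead keep the comment and the processing instruction as children of ``root``, and ``root.text`` would end at the comment.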