[Lxml-checkins] r45116 - in lxml/trunk: doc src/lxml

scoder at codespeak.net
Sun Jul 15 23:45:24 CEST 2007


Author: scoder
Date: Sun Jul 15 23:45:24 2007
New Revision: 45116

Modified:
   lxml/trunk/doc/compatibility.txt
   lxml/trunk/src/lxml/parser.pxi
Log:
new ETCompatXMLParser subclass of XMLParser with an ElementTree compatible default setup

Modified: lxml/trunk/doc/compatibility.txt
==============================================================================
--- lxml/trunk/doc/compatibility.txt	(original)
+++ lxml/trunk/doc/compatibility.txt	Sun Jul 15 23:45:24 2007
@@ -106,8 +106,14 @@
   while etree will read them in and treat them as Comment or
   ProcessingInstruction elements respectively.  This is especially visible
   where comments are found inside text content, which is then split by the
-  Comment element.  You can disable this behaviour by passing the boolean
-  ``remove_comments`` keyword argument to the parser you use.
+  Comment element.
+
+  You can disable this behaviour by passing the boolean ``remove_comments``
+  and/or ``remove_pis`` keyword arguments to the parser you use.  For
+  convenience and to support portable code, you can also use the
+  ``etree.ETCompatXMLParser`` instead of the default ``etree.XMLParser``.  It
+  tries to provide a default setup that is as close to the ElementTree parser
+  as possible.
 
 * ElementTree has a bug when serializing an empty Comment (no text argument
   given) to XML, etree serializes this successfully.

Modified: lxml/trunk/src/lxml/parser.pxi
==============================================================================
--- lxml/trunk/src/lxml/parser.pxi	(original)
+++ lxml/trunk/src/lxml/parser.pxi	Sun Jul 15 23:45:24 2007
@@ -742,6 +742,25 @@
 
         self._parse_options = parse_options
 
+cdef class ETCompatXMLParser(XMLParser):
+    """An XML parser with an ElementTree compatible default setup.  See the
+    XMLParser class for details.
+
+    This parser defaults to removing processing instructions and comments from
+    the tree.
+    """
+    def __init__(self, attribute_defaults=False, dtd_validation=False,
+                 load_dtd=False, no_network=True, ns_clean=False,
+                 recover=False, remove_blank_text=False, compact=True,
+                 resolve_entities=True, remove_comments=True,
+                 remove_pis=True):
+        XMLParser.__init__(self,
+                 attribute_defaults, dtd_validation,
+                 load_dtd, no_network, ns_clean,
+                 recover, remove_blank_text, compact,
+                 resolve_entities, remove_comments,
+                 remove_pis)
+
 cdef xmlDoc* _internalParseDoc(char* c_text, int options,
                                _ResolverContext context) except NULL:
     # internal parser function for XSLT
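
For readers of this check-in, here is a minimal usage sketch of the new parser class. It is not part of the commit. It assumes an lxml build that includes this revision, and the sample document and variable names are made up for illustration. It compares the default ``etree.XMLParser`` with ``etree.ETCompatXMLParser``, and with an ``XMLParser`` that is passed ``remove_comments`` and ``remove_pis`` explicitly, as described in the compatibility.txt change above.

```python
from lxml import etree

# Illustrative input: a comment and a processing instruction inside text content.
XML = b"<root>before<!-- note --><?target data?>after</root>"

# Default parser: the comment and the PI are kept as child nodes of <root>.
default_root = etree.fromstring(XML, etree.XMLParser())

# ElementTree compatible parser: comments and PIs are dropped by default.
compat_root = etree.fromstring(XML, etree.ETCompatXMLParser())

# The same setup spelled out with keyword arguments on the plain XMLParser.
explicit_root = etree.fromstring(
    XML, etree.XMLParser(remove_comments=True, remove_pis=True))

print(len(default_root))   # number of child nodes with the default parser
print(len(compat_root))    # number of child nodes with the compat parser
print(etree.tostring(compat_root) == etree.tostring(explicit_root))
```

Code that has to run against both ElementTree and lxml can select ``ETCompatXMLParser`` in one place instead of repeating the two keyword arguments at every parser construction.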

