[Lxml-checkins] r39458 - lxml/trunk/doc

scoder at codespeak.net
Mon Feb 26 17:59:50 CET 2007


Author: scoder
Date: Mon Feb 26 17:59:44 2007
New Revision: 39458

Modified:
   lxml/trunk/doc/validation.txt
Log:
section on DTD validation

Modified: lxml/trunk/doc/validation.txt
==============================================================================
--- lxml/trunk/doc/validation.txt	(original)
+++ lxml/trunk/doc/validation.txt	Mon Feb 26 17:59:44 2007
@@ -2,24 +2,67 @@
 Validation with lxml
 ====================
 
-Apart from DTD support in the parsers, lxml currently supports two schema
-languages: `Relax NG`_ and `XML Schema`_.  Both provide identical APIs,
-represented by a validator class with the obvious names.
+Apart from the built-in DTD support in parsers, lxml currently supports three
+schema languages: DTD, `Relax NG`_ and `XML Schema`_.  All three provide
+identical APIs, each through a validator class named after its language.
 
 .. _`Relax NG`: http://www.relaxng.org/
 .. _`XML Schema`: http://www.w3.org/XML/Schema
 
 .. contents::
 .. 
-   1  RelaxNG
+   1  DTD
+   2  RelaxNG
    2  XMLSchema
 
+The usual setup procedure::
+
+  >>> from lxml import etree
+  >>> from StringIO import StringIO
+
+
+DTD
+---
+
+There are two places in lxml where DTDs are supported: parsers and the DTD
+class.  If you pass a keyword option to a parser that requires DTD loading,
+lxml will automatically include the DTD in the parsing process.  If you pass
+the keyword for DTD validation, lxml (or rather libxml2) will use this DTD
+right inside the parser and report failure or success when parsing terminates.
+
+The parser support for DTDs depends on internal or external subsets of the XML
+file.  This means that the XML file itself must either contain a DTD or must
+reference a DTD to make this work.  If you want to validate an XML document
+against a DTD that is not referenced by the document itself, you can use the
+``DTD`` class.
+
+To use the ``DTD`` class, first pass a filename or file-like object to its
+constructor, which parses the DTD::
+
+  >>> f = StringIO("<!ELEMENT b EMPTY>")
+  >>> dtd = etree.DTD(f)
+
+Now you can use it to validate documents::
+
+  >>> root = etree.XML("<b/>")
+  >>> print dtd.validate(root)
+  1
+
+  >>> root = etree.XML("<b><a/></b>")
+  >>> print dtd.validate(root)
+  0
+
+The reason for the validation failure can be found in the error log::
+
+  >>> print dtd.error_log.filter_from_errors()[0]
+  <string>:1:ERROR:VALID:DTD_NOT_EMPTY: Element b was declared EMPTY this one has content
+
 
 RelaxNG
 -------
 
-lxml.etree introduces a new class, lxml.etree.RelaxNG. The class can
-be given an ElementTree object to construct a Relax NG validator::
+The ``RelaxNG`` class takes an ElementTree object to construct a Relax NG
+validator::
 
   >>> f = StringIO('''\
   ... <element name="a" xmlns="http://relaxng.org/ns/structure/1.0">

