[Lxml-checkins] r39229 - lxml/trunk

scoder at codespeak.net scoder at codespeak.net
Tue Feb 20 13:58:37 CET 2007


Author: scoder
Date: Tue Feb 20 13:58:35 2007
New Revision: 39229

Modified:
   lxml/trunk/TODO.txt
Log:
cleanup in TODOs

Modified: lxml/trunk/TODO.txt
==============================================================================
--- lxml/trunk/TODO.txt	(original)
+++ lxml/trunk/TODO.txt	Tue Feb 20 13:58:35 2007
@@ -4,10 +4,9 @@
 Exposing libxml2 functionalities
 --------------------------------
 
-* See whether XInclude support can mimic ElementTree's API.
-
 * Test XML entities, also in an ElementTree context.
 
+
 In general
 ----------
 
@@ -15,67 +14,32 @@
 
 * will namespace nodes of unknown namespaces be added (and never freed?)
 
-Top level
----------
-
-* ProcessingInstruction
+* more testing on multi-threading
 
-ElementInterface
------------------
 
 ElementTree
 -----------
 
 * _setroot(), even though this is not strictly a public method.
 
+
 QName
 -----
 
 * expose prefix support?
 
-Features
---------
 
-* Relaxed NG compact notation (rnc versus rng) support. May consider
-  integrating this:
+Objectify
+---------
+
+* set special __attributes__ on ObjectifiedElement's as Python attributes, not
+  XML children
 
-  http://www.gnosis.cx/download/relax/
 
-Notes on implementing iterparse
--------------------------------
+Features
+--------
 
-"iterparse" will be (or will return) an iterable object, let's call it
-IterParse for clarity. A class is basically the only way of implementing
-iterators in Pyrex. For the internal SAX part, IterParse will likely work a
-lot like lxml.sax.ElementTreeContentHandler.
-
-We'd need a custom wrapper to the default libxml2 SAX handler to intercept the
-parse events (this means implementing C helper functions for the SAX events)
-/after/ they were processed by libxml2. See xmlSAXVersion (SAX2.c) on how to
-retrieve the SAX2 default parser structure.
-
-IterParse should pass chunks into the parser and buffer the events it
-receives. When its __next__() method is called, it returns one event or passes
-new chunks until there is an event to return. This is needed as IterParse has
-to convert between libxml2 push (SAX) and Python pull (iter).
-
-As for the input to the libxml2 parser, there are two possible ways: one is to
-pass data chunks in through xmlParseChunk and the other is to use
-xmlCreateIOParserCtxt and implement xmlInputReadCallback (xmlio.h) to have
-libxml2 request data by itself. However, xmlParseChunk allows us to control
-how far libxml2 parses in advance, so this is preferable.
-
-Python events (start, end, start-ns, end-ns) are created as follows:
-
-* "*-ns" events must be extracted from the libxml2 xmlSAX2StartElementNs call
-(passed in arguments "prefix"/"URI" and the char* array "namespaces"). They
-must be stored on a stack to build the respective "end-ns" events.
-
-* "start" is somewhat tricky, as it would be a bad idea to allow modifications
-of the XML structure during that iterator cycle. Maybe it's enough to document
-that, but there may be ways to crash lxml with certain tree operations. Note
-also that care has to be taken to prevent Python from garbage collecting the
-element before the "end" event. The best way to do that is to store a Python
-reference to that element on a stack.
+* Relaxed NG compact notation (rnc versus rng) support.  Currently not
+  supported by libxml2 (patch exists)
 
-* "end" is simple then: pop the element from the stack and return it.
+* setting a DTD for validation (maybe a ``DTD`` class like RelaxNG?)
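
Note on the removed iterparse section: the push-to-pull conversion it describes (feed chunks to the parser, buffer the events, hand out one event per ``__next__()`` call) can be sketched as below. This is only an illustration and not lxml's implementation. It assumes the standard library's ``xml.etree.ElementTree.XMLPullParser`` as a stand-in for the libxml2 SAX layer, and the ``chunk_size`` parameter and attribute names are made up for the example.

```python
import io
from xml.etree.ElementTree import XMLPullParser


class IterParse:
    """Pull-style iterator over parse events, fed by a push-style parser.

    Assumption: XMLPullParser (stdlib) replaces the libxml2 SAX handler
    discussed in the notes; chunk_size is a hypothetical parameter.
    """

    def __init__(self, source, events=("end",), chunk_size=16):
        self._source = source          # file-like object opened in binary mode
        self._chunk_size = chunk_size  # limits how far the parser reads ahead
        self._parser = XMLPullParser(events=events)
        self._pending = iter(())       # events buffered from the last chunk
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            # Hand out one buffered event, if any is left.
            for event in self._pending:
                return event
            if self._done:
                raise StopIteration
            # No event available: push the next chunk into the parser.
            chunk = self._source.read(self._chunk_size)
            if chunk:
                self._parser.feed(chunk)
            else:
                self._parser.close()
                self._done = True
            self._pending = self._parser.read_events()


xml = b"<root><item>1</item><item>2</item></root>"
seen = [(event, element.tag)
        for event, element in IterParse(io.BytesIO(xml), events=("start", "end"))]
```

Feeding small chunks keeps the parser from running far ahead of the consumer, which is the reason the notes give for preferring xmlParseChunk over a read callback.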

