From scoder at codespeak.net Sat Mar 1 17:59:51 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sat, 1 Mar 2008 17:59:51 +0100 (CET) Subject: [Lxml-checkins] r52006 - in lxml/trunk: . doc Message-ID: <20080301165951.77B341684EC@codespeak.net> Author: scoder Date: Sat Mar 1 17:59:49 2008 New Revision: 52006 Modified: lxml/trunk/ (props changed) lxml/trunk/doc/FAQ.txt Log: r3651 at delle: sbehnel | 2008-03-01 17:00:46 +0100 clarification on MacOS-X crashes Modified: lxml/trunk/doc/FAQ.txt ============================================================================== --- lxml/trunk/doc/FAQ.txt (original) +++ lxml/trunk/doc/FAQ.txt Sat Mar 1 17:59:49 2008 @@ -408,15 +408,15 @@ the FAQ section on threading_ to check if you touch on one of the potential pitfalls. -b) If you are on Mac-OS X, make sure lxml uses the correct libraries. If you - have updated the old system libraries (e.g. through fink), this is best - achieved by building lxml statically to prevent the different library - versions from interfering. If you choose to use a dynamically linked - version, make sure the ``DYLD_LIBRARY_PATH`` environment variable contains - the directory where you installed the libraries. To make sure the correct - libraries are used, print the module level version numbers that - ``lxml.etree`` provides from *within* your application rather than relying - on what your operating system tells you. +b) If you are on Mac-OS X, make sure lxml uses the correct libraries. + Since the normal system libraries are pretty much outdated, you + likely have installed newer versions through a package management + system like fink or macports. In this case, please make sure the + ``DYLD_LIBRARY_PATH`` environment variable contains the directory + where you installed the libraries. There are other Python packages + that depend on libxml2, so it is up to you to make sure that *all* + packages that dynamically load libxml2 load the *same* library + version. 
Loading conflicting versions *will* lead to a crash. In any case, try to reproduce the problem with the latest versions of libxml2 and libxslt. From time to time, bugs and race conditions are found From scoder at codespeak.net Sat Mar 1 17:59:55 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sat, 1 Mar 2008 17:59:55 +0100 (CET) Subject: [Lxml-checkins] r52007 - in lxml/trunk: . doc Message-ID: <20080301165955.934AB1684EC@codespeak.net> Author: scoder Date: Sat Mar 1 17:59:55 2008 New Revision: 52007 Modified: lxml/trunk/ (props changed) lxml/trunk/doc/FAQ.txt Log: r3652 at delle: sbehnel | 2008-03-01 17:58:57 +0100 more clarifications on MacOS-X crashes Modified: lxml/trunk/doc/FAQ.txt ============================================================================== --- lxml/trunk/doc/FAQ.txt (original) +++ lxml/trunk/doc/FAQ.txt Sat Mar 1 17:59:55 2008 @@ -396,33 +396,53 @@ My application crashes! ----------------------- -One of the goals of lxml is "no segfaults", so if there is no clear warning in -the documentation that you were doing something potentially harmful, you have -found a bug and we would like to hear about it. Please report this bug to the -`mailing list`_. See the next section on how to do that. - -However, there are a few things to try first, to make sure the problem is -really within lxml (or libxml2 or libxslt): - -a) If your application (or e.g. your web container) uses threads, please see - the FAQ section on threading_ to check if you touch on one of the - potential pitfalls. - -b) If you are on Mac-OS X, make sure lxml uses the correct libraries. - Since the normal system libraries are pretty much outdated, you - likely have installed newer versions through a package management - system like fink or macports. In this case, please make sure the - ``DYLD_LIBRARY_PATH`` environment variable contains the directory - where you installed the libraries. 
There are other Python packages - that depend on libxml2, so it is up to you to make sure that *all* - packages that dynamically load libxml2 load the *same* library - version. Loading conflicting versions *will* lead to a crash. +One of the goals of lxml is "no segfaults", so if there is no clear +warning in the documentation that you were doing something potentially +harmful, you have found a bug and we would like to hear about it. +Please report this bug to the `mailing list`_. See the section on bug +reporting to learn how to do that. + +If your application (or e.g. your web container) uses threads, please +see the FAQ section on threading_ to check if you touch on one of the +potential pitfalls. In any case, try to reproduce the problem with the latest versions of libxml2 and libxslt. From time to time, bugs and race conditions are found in these libraries, so a more recent version might already contain a fix for your problem. +Remember: even if you see lxml appear in a crash stack trace, it is +not necessarily lxml that *caused* the crash. + + +My application crashes on MacOS-X! +---------------------------------- + +Since the normal system libraries are pretty much outdated, you likely +have installed newer versions through a package management system like +fink or macports in addition to the system libraries. Chances are +high that your system is confused by the conflicting library versions. + +To work around this, please set the ``DYLD_LIBRARY_PATH`` environment +variable *at runtime* to the directory where you installed the newer +libraries. There are other Python packages that depend on libxml2, so +it is up to you to make sure that *all* packages that dynamically load +libxml2 load the *same* library version. Loading conflicting versions +*will* lead to a crash and has confused a lot of MacOS users already. + +Please understand that if your system uses conflicting library +versions, there is nothing lxml can do about it. 
It is up to you as a +user to make sure you have a sane execution environment. + +See `bug 197243`_ for more information. + +.. _`bug 197243`: https://bugs.launchpad.net/lxml/+bug/197243 + +If you want a sane, reliable execution environment, especially for +production systems, `using a buildout`_ might be a good idea. + +.. _`using a buildout`: http://comments.gmane.org/gmane.comp.python.lxml.devel/3297?set_lines=100000 + I think I have found a bug in lxml. What should I do? ----------------------------------------------------- @@ -604,11 +624,13 @@ more robust against possible pitfalls. So newer versions might already fix your problem in a reliable way. -* make sure the library versions you installed are really used. Do not rely - on what your operating system tells you! Print the version constants in - ``lxml.etree`` from within your runtime environment to make sure it is the - case. This is especially a problem under MacOS-X when newer library - versions were installed in addition to the outdated system libraries. +* make sure the library versions you installed are really used. Do + not rely on what your operating system tells you! Print the version + constants in ``lxml.etree`` from within your runtime environment to + make sure it is the case. This is especially a problem under + MacOS-X when newer library versions were installed in addition to + the outdated system libraries. Please read the bugs section + regarding MacOS-X in this FAQ. * if you use ``mod_python``, try setting this option: From scoder at codespeak.net Sun Mar 2 09:31:22 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:22 +0100 (CET) Subject: [Lxml-checkins] r52025 - in lxml/trunk: . 
src/lxml src/lxml/tests Message-ID: <20080302083122.37468168538@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:20 2008 New Revision: 52025 Modified: lxml/trunk/ (props changed) lxml/trunk/TODO.txt lxml/trunk/src/lxml/readonlytree.pxi lxml/trunk/src/lxml/tests/test_xslt.py lxml/trunk/src/lxml/xslt.pxd Log: r3664 at delle: sbehnel | 2008-03-02 08:56:19 +0100 r3650 at delle: sbehnel | 2008-02-29 21:39:21 +0100 initial import: will use read-only elements to access the XSLT tree, the input tree and the output tree Modified: lxml/trunk/TODO.txt ============================================================================== --- lxml/trunk/TODO.txt (original) +++ lxml/trunk/TODO.txt Sun Mar 2 09:31:20 2008 @@ -45,6 +45,22 @@ by libxml2 (patch exists) +XSLT extension elements +----------------------- + +* implementation: one base class that represents the result parent + + - .append(), .extend() and .text will add to the result tree (no .tail) + + - difference: Elements should be copied, not moved? (will break + later changes, but this just means that Elements in the result + tree are immutable, including those that were added) + + - how to make input tree read-only? maybe just document? + + - docs: "once in the result tree, Elements must no longer be changed"? 
+ + lxml 2.0 ======== Modified: lxml/trunk/src/lxml/readonlytree.pxi ============================================================================== --- lxml/trunk/src/lxml/readonlytree.pxi (original) +++ lxml/trunk/src/lxml/readonlytree.pxi Sun Mar 2 09:31:20 2008 @@ -207,17 +207,21 @@ cdef _ReadOnlyElementProxy NEW_RO_PROXY "PY_NEW" (object t) cdef _ReadOnlyElementProxy _newReadOnlyProxy( - _ReadOnlyElementProxy sourceProxy, xmlNode* c_node): + _ReadOnlyElementProxy source_proxy, xmlNode* c_node): cdef _ReadOnlyElementProxy el el = NEW_RO_PROXY(_ReadOnlyElementProxy) el._c_node = c_node - if sourceProxy is None: + _initReadOnlyProxy(el, source_proxy) + return el + +cdef inline _initReadOnlyProxy(_ReadOnlyElementProxy el, + _ReadOnlyElementProxy source_proxy): + if source_proxy is None: el._source_proxy = el el._dependent_proxies = [el] else: - el._source_proxy = sourceProxy - python.PyList_Append(sourceProxy._dependent_proxies, el) - return el + el._source_proxy = source_proxy + python.PyList_Append(source_proxy._dependent_proxies, el) cdef _freeReadOnlyProxies(_ReadOnlyElementProxy sourceProxy): cdef _ReadOnlyElementProxy el @@ -228,3 +232,71 @@ for el in sourceProxy._dependent_proxies: el._c_node = NULL del sourceProxy._dependent_proxies[:] + + +cdef class _ReadOnlyRootElementProxy(_ReadOnlyElementProxy): + """A read-only element that frees the subtree on deallocation. + """ + def __dealloc__(self): + if self._c_node is not NULL: + tree.xmlFreeNode(self._c_node) + +cdef class _AppendOnlyElementProxy(_ReadOnlyElementProxy): + """A read-only element that allows adding children and changing the + text content (i.e. everything that adds to the subtree). + """ + cpdef append(self, other_element): + """Append a copy of an Element to the list of children. 
+ """ + cdef xmlNode* c_next + cdef xmlNode* c_node + self._assertNode() + c_node = _roNodeOf(other_element) + c_node = _copyNodeToDoc(c_node, self._c_node.doc) + c_next = c_node.next + tree.xmlAddChild(self._c_node, c_node) + _moveTail(c_next, c_node) + + def extend(self, elements): + """Append a copy of all Elements from a sequence to the list of + children. + """ + self._assertNode() + for element in elements: + self.append(element) + + property text: + """Text before the first subelement. This is either a string or the + value None, if there was no text. + """ + def __get__(self): + self._assertNode() + return _collectText(self._c_node.children) + + def __set__(self, value): + self._assertNode() + if isinstance(value, QName): + value = python.PyUnicode_FromEncodedObject( + _resolveQNameText(self, value), 'UTF-8', 'strict') + _setNodeText(self._c_node, value) + +cdef _AppendOnlyElementProxy _newAppendOnlyProxy( + _ReadOnlyElementProxy source_proxy, xmlNode* c_node): + cdef _AppendOnlyElementProxy el + el = <_AppendOnlyElementProxy>NEW_RO_PROXY(_AppendOnlyElementProxy) + el._c_node = c_node + _initReadOnlyProxy(el, source_proxy) + return el + +cdef xmlNode* _roNodeOf(element) except NULL: + cdef xmlNode* c_node + if isinstance(element, _Element): + c_node = (<_Element>element)._c_node + elif isinstance(element, _ReadOnlyElementProxy): + c_node = (<_ReadOnlyElementProxy>element)._c_node + else: + raise TypeError("invalid value to append()") + + if c_node is NULL: + raise TypeError("invalid element") + return c_node Modified: lxml/trunk/src/lxml/tests/test_xslt.py ============================================================================== --- lxml/trunk/src/lxml/tests/test_xslt.py (original) +++ lxml/trunk/src/lxml/tests/test_xslt.py Sun Mar 2 09:31:20 2008 @@ -604,6 +604,26 @@ self.assertEquals(self._rootstring(result), 'X') + def test_extension_element(self): + tree = self.parse('B') + style = self.parse('''\ + + + b + +''') + + class 
mytext(etree.XSLTExtension): + pass + + result = tree.xslt(style, extensions={}) + self.assertEquals(self._rootstring(result), + 'X') + def test_xslt_document_XML(self): # make sure document('') works from parsed strings xslt = etree.XSLT(etree.XML("""\ Modified: lxml/trunk/src/lxml/xslt.pxd ============================================================================== --- lxml/trunk/src/lxml/xslt.pxd (original) +++ lxml/trunk/src/lxml/xslt.pxd Sun Mar 2 09:31:20 2008 @@ -1,4 +1,4 @@ -from tree cimport xmlDoc, xmlDict +from tree cimport xmlDoc, xmlNode, xmlDict from xpath cimport xmlXPathContext, xmlXPathFunction cdef extern from "libxslt/xslt.h": @@ -22,6 +22,11 @@ void* _private xmlDict* dict int profile + xmlNode* node + xmlDoc* output + xmlNode* insert + + ctypedef struct xsltStackElem cdef xsltStylesheet* xsltParseStylesheetDoc(xmlDoc* doc) nogil cdef void xsltFreeStylesheet(xsltStylesheet* sheet) nogil @@ -59,6 +64,9 @@ char** params, char* output, void* profile, xsltTransformContext* context) nogil + cdef void xsltProcessOneNode(xsltTransformContext* ctxt, + xmlNode* contextNode, + xsltStackElem* params) cdef xsltTransformContext* xsltNewTransformContext(xsltStylesheet* style, xmlDoc* doc) nogil cdef void xsltFreeTransformContext(xsltTransformContext* context) nogil From scoder at codespeak.net Sun Mar 2 09:31:28 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:28 +0100 (CET) Subject: [Lxml-checkins] r52026 - in lxml/trunk: . 
doc src/lxml src/lxml/tests Message-ID: <20080302083128.96BF3168539@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:28 2008 New Revision: 52026 Added: lxml/trunk/src/lxml/xsltext.pxi Modified: lxml/trunk/ (props changed) lxml/trunk/CHANGES.txt lxml/trunk/doc/extensions.txt lxml/trunk/doc/xpathxslt.txt lxml/trunk/src/lxml/lxml.etree.pyx lxml/trunk/src/lxml/readonlytree.pxi lxml/trunk/src/lxml/tests/test_xslt.py lxml/trunk/src/lxml/xslt.pxd lxml/trunk/src/lxml/xslt.pxi Log: r3665 at delle: sbehnel | 2008-03-02 08:56:20 +0100 r3655 at delle: sbehnel | 2008-03-01 22:27:17 +0100 partial reimplementation of the extension element mechanism, works now Modified: lxml/trunk/CHANGES.txt ============================================================================== --- lxml/trunk/CHANGES.txt (original) +++ lxml/trunk/CHANGES.txt Sun Mar 2 09:31:28 2008 @@ -8,6 +8,8 @@ Features added -------------- +* Extension elements for XSLT + * ``Element.base`` property returns the xml:base or HTML base URL of an Element. Modified: lxml/trunk/doc/extensions.txt ============================================================================== --- lxml/trunk/doc/extensions.txt (original) +++ lxml/trunk/doc/extensions.txt Sun Mar 2 09:31:28 2008 @@ -1,15 +1,24 @@ -Extension functions for XPath and XSLT -====================================== +Python extensions for XPath and XSLT +==================================== -This document describes how to use Python extension functions in XPath and -XSLT. They allow you to do things like this:: +This document describes how to use Python extension functions in XPath +and XSLT like this:: -Here is how such a function looks like. As the first argument, it always -receives a context object (see below). The other arguments are provided by -the respective call in the XPath expression, one in the following examples. 
-Any number of arguments is allowed:: +It also describes how to use Python extension elements in XSLT like +this:: + + + + + + + +Here is how an extension function looks like. As the first argument, +it always receives a context object (see below). The other arguments +are provided by the respective call in the XPath expression, one in +the following examples. Any number of arguments is allowed:: >>> def hello(dummy, a): ... return "Hello %s" % a @@ -18,14 +27,23 @@ >>> def loadsofargs(dummy, *args): ... return "Got %d arguments." % len(args) +And here is how an extension element looks like:: + + >>> from lxml import etree + >>> class MyExtElement(etree.XSLTExtension): + ... def execute(self, context, self_node, input_node, output_parent): + ... # just copy own content input to output + ... output_parent.extend( list(self_node) ) + .. contents:: .. 1 The FunctionNamespace 2 Global prefix assignment - 3 Evaluators and XSLT - 4 Evaluator-local extensions - 5 What to return from a function + 3 The XPath context + 4 Evaluators and XSLT + 5 Evaluator-local extensions + 6 What to return from a function The FunctionNamespace @@ -36,7 +54,6 @@ FunctionNamespace class. For simplicity, we choose the empty namespace (None):: - >>> from lxml import etree >>> ns = etree.FunctionNamespace(None) >>> ns['hello'] = hello >>> ns['countargs'] = loadsofargs Modified: lxml/trunk/doc/xpathxslt.txt ============================================================================== --- lxml/trunk/doc/xpathxslt.txt (original) +++ lxml/trunk/doc/xpathxslt.txt Sun Mar 2 09:31:28 2008 @@ -454,6 +454,14 @@ '\nText\n' +Extension elements +------------------ + +Just like `custom extension functions`_, lxml supports custom +extension *elements*. 
+ + + The ``xslt()`` tree method -------------------------- Modified: lxml/trunk/src/lxml/lxml.etree.pyx ============================================================================== --- lxml/trunk/src/lxml/lxml.etree.pyx (original) +++ lxml/trunk/src/lxml/lxml.etree.pyx Sun Mar 2 09:31:28 2008 @@ -2578,9 +2578,15 @@ include "iterparse.pxi" # incremental XML parsing include "xmlid.pxi" # XMLID and IDDict include "xinclude.pxi" # XInclude + + +################################################################################ +# Include submodules for XPath and XSLT + include "extensions.pxi" # XPath/XSLT extension functions include "xpath.pxi" # XPath evaluation include "xslt.pxi" # XSL transformations +include "xsltext.pxi" # XSL extension elements ################################################################################ Modified: lxml/trunk/src/lxml/readonlytree.pxi ============================================================================== --- lxml/trunk/src/lxml/readonlytree.pxi (original) +++ lxml/trunk/src/lxml/readonlytree.pxi Sun Mar 2 09:31:28 2008 @@ -2,6 +2,7 @@ cdef class _ReadOnlyElementProxy: "The main read-only Element proxy class (for internal use only!)." + cdef bint _free_after_use cdef xmlNode* _c_node cdef object _source_proxy cdef object _dependent_proxies @@ -12,6 +13,11 @@ assert self._c_node is not NULL, "Proxy invalidated!" return 0 + cdef void free_after_use(self): + """Should the xmlNode* be freed when releasing the proxy? 
+ """ + self._free_after_use = 1 + property tag: """Element tag """ @@ -216,6 +222,7 @@ cdef inline _initReadOnlyProxy(_ReadOnlyElementProxy el, _ReadOnlyElementProxy source_proxy): + el._free_after_use = 0 if source_proxy is None: el._source_proxy = el el._dependent_proxies = [el] @@ -224,23 +231,19 @@ python.PyList_Append(source_proxy._dependent_proxies, el) cdef _freeReadOnlyProxies(_ReadOnlyElementProxy sourceProxy): + cdef xmlNode* c_node cdef _ReadOnlyElementProxy el if sourceProxy is None: return if sourceProxy._dependent_proxies is None: return for el in sourceProxy._dependent_proxies: + c_node = el._c_node el._c_node = NULL + if el._free_after_use: + tree.xmlFreeNode(c_node) del sourceProxy._dependent_proxies[:] - -cdef class _ReadOnlyRootElementProxy(_ReadOnlyElementProxy): - """A read-only element that frees the subtree on deallocation. - """ - def __dealloc__(self): - if self._c_node is not NULL: - tree.xmlFreeNode(self._c_node) - cdef class _AppendOnlyElementProxy(_ReadOnlyElementProxy): """A read-only element that allows adding children and changing the text content (i.e. everything that adds to the subtree). 
Modified: lxml/trunk/src/lxml/tests/test_xslt.py ============================================================================== --- lxml/trunk/src/lxml/tests/test_xslt.py (original) +++ lxml/trunk/src/lxml/tests/test_xslt.py Sun Mar 2 09:31:28 2008 @@ -613,17 +613,68 @@ extension-element-prefixes="myns" exclude-result-prefixes="myns"> - b + b ''') - class mytext(etree.XSLTExtension): - pass + class MyExt(etree.XSLTExtension): + def execute(self, context, self_node, input_node, output_parent): + child = etree.Element(self_node.text) + child.text = 'X' + output_parent.append(child) + + extensions = { ('testns', 'myext') : MyExt() } - result = tree.xslt(style, extensions={}) + result = tree.xslt(style, extensions=extensions) self.assertEquals(self._rootstring(result), 'X') + def test_extension_element_content(self): + tree = self.parse('B') + style = self.parse('''\ + + + XY + +''') + + class MyExt(etree.XSLTExtension): + def execute(self, context, self_node, input_node, output_parent): + output_parent.extend(list(self_node)[1:]) + + extensions = { ('testns', 'myext') : MyExt() } + + result = tree.xslt(style, extensions=extensions) + self.assertEquals(self._rootstring(result), + 'Y') + + def test_extension_element_raise(self): + tree = self.parse('B') + style = self.parse('''\ + + + b + +''') + + class MyError(Exception): + pass + + class MyExt(etree.XSLTExtension): + def execute(self, context, self_node, input_node, output_parent): + raise MyError("expected!") + + extensions = { ('testns', 'myext') : MyExt() } + self.assertRaises(MyError, tree.xslt, style, extensions=extensions) + def test_xslt_document_XML(self): # make sure document('') works from parsed strings xslt = etree.XSLT(etree.XML("""\ Modified: lxml/trunk/src/lxml/xslt.pxd ============================================================================== --- lxml/trunk/src/lxml/xslt.pxd (original) +++ lxml/trunk/src/lxml/xslt.pxd Sun Mar 2 09:31:28 2008 @@ -8,6 +8,11 @@ cdef int LIBXSLT_VERSION cdef extern 
from "libxslt/xsltInternals.h": + ctypedef enum xsltTransformState: + XSLT_STATE_OK # 0 + XSLT_STATE_ERROR # 1 + XSLT_STATE_STOPPED # 2 + ctypedef struct xsltDocument: xmlDoc* doc @@ -25,6 +30,7 @@ xmlNode* node xmlDoc* output xmlNode* insert + xsltTransformState state ctypedef struct xsltStackElem @@ -32,6 +38,11 @@ cdef void xsltFreeStylesheet(xsltStylesheet* sheet) nogil cdef extern from "libxslt/extensions.h": + ctypedef void (*xsltTransformFunction)(xsltTransformContext* ctxt, + xmlNode* context_node, + xmlNode* inst, + void* precomp_unused) + cdef int xsltRegisterExtFunction(xsltTransformContext* ctxt, char* name, char* URI, @@ -43,6 +54,9 @@ char* name, char* URI) nogil cdef int xsltRegisterExtPrefix(xsltStylesheet* style, char* prefix, char* URI) nogil + cdef int xsltRegisterExtElement(xsltTransformContext* ctxt, + char* name, char* URI, + xsltTransformFunction function) nogil cdef extern from "libxslt/documents.h": ctypedef enum xsltLoadType: @@ -82,7 +96,9 @@ cdef void xsltSetTransformErrorFunc( xsltTransformContext*, void* ctxt, void (*handler)(void* ctxt, char* msg, ...)) nogil - + cdef void xsltTransformError(xsltTransformContext* ctxt, + xsltStylesheet* style, + xmlNode* node, char* msg, ...) 
cdef extern from "libxslt/security.h": ctypedef struct xsltSecurityPrefs ctypedef enum xsltSecurityOption: Modified: lxml/trunk/src/lxml/xslt.pxi ============================================================================== --- lxml/trunk/src/lxml/xslt.pxi (original) +++ lxml/trunk/src/lxml/xslt.pxi Sun Mar 2 09:31:28 2008 @@ -229,15 +229,34 @@ cdef class _XSLTContext(_BaseContext): cdef xslt.xsltTransformContext* _xsltCtxt + cdef object _extension_elements + cdef _ReadOnlyElementProxy _extension_element_proxy def __init__(self, namespaces, extensions, enable_regexp): self._xsltCtxt = NULL - if extensions is not None: - for ns, prefix in extensions: - if ns is None: + self._extension_elements = EMPTY_READ_ONLY_DICT + if extensions is not None and extensions: + for ns_name_tuple, extension in extensions.items(): + if ns_name_tuple[0] is None: raise XSLTExtensionError( "extensions must not have empty namespaces") + if isinstance(extension, XSLTExtension): + if self._extension_elements is EMPTY_READ_ONLY_DICT: + self._extension_elements = {} + extensions = python.PyDict_Copy(extensions) + ns_utf = _utf8(ns_name_tuple[0]) + name_utf = _utf8(ns_name_tuple[1]) + python.PyDict_SetItem( + self._extension_elements, (ns_utf, name_utf), + extension) + python.PyDict_DelItem(extensions, ns_name_tuple) _BaseContext.__init__(self, namespaces, extensions, enable_regexp) + cdef _BaseContext _copy(self): + cdef _XSLTContext context + context = <_XSLTContext>_BaseContext._copy(self) + context._extension_elements = self._extension_elements + return context + cdef register_context(self, xslt.xsltTransformContext* xsltCtxt, _Document doc): self._xsltCtxt = xsltCtxt @@ -245,6 +264,7 @@ self._register_context(doc) self.registerLocalFunctions(xsltCtxt, _register_xslt_function) self.registerGlobalFunctions(xsltCtxt, _register_xslt_function) + _registerXSLTExtensions(xsltCtxt, self._extension_elements) cdef free_context(self): self._cleanup_context() @@ -437,6 +457,11 @@ 
tree.xmlFreeDoc(c_result) resolver_context._raise_if_stored() + if context._exc._has_raised(): + if c_result is not NULL: + tree.xmlFreeDoc(c_result) + context._exc._raise_if_stored() + if c_result is NULL: # last error seems to be the most accurate here error = self._error_log.last_error Added: lxml/trunk/src/lxml/xsltext.pxi ============================================================================== --- (empty file) +++ lxml/trunk/src/lxml/xsltext.pxi Sun Mar 2 09:31:28 2008 @@ -0,0 +1,111 @@ +# XSLT extension elements + +cdef class XSLTExtension: + """Base class of an XSLT extension element. + """ + def execute(self, context, self_node, input_node, output_parent): + """execute(self, context, self_node, input_node, output_parent) + Execute this extension element. + + Subclasses may append elements to the `output_parent` element + here, or set its text content. To this end, the `input_node` + provides read-only access to the current node in the input + document, and the `self_node` points to the extension element + in the stylesheet. + """ + pass + + def apply_templates(self, _XSLTContext context not None, node): + """apply_templates(self, context, node) + + Call this method to continue applying templates to the input + document. Starts at the + + The return value is a list of elements that were generated. 
+ """ + cdef xmlNode* c_parent + cdef xmlNode* c_node + cdef xmlNode* c_next + cdef xmlNode* c_context_node + cdef _ReadOnlyElementProxy proxy + c_context_node = _roNodeOf(node) + #assert c_context_node.doc is context._xsltContext.node.doc, \ + # "switching input documents during transformation is not currently supported" + + c_parent = tree.xmlNewDocNode( + context._xsltCtxt.output, NULL, "fake-parent", NULL) + + c_node = context._xsltCtxt.insert + context._xsltCtxt.insert = c_parent + xslt.xsltProcessOneNode( + context._xsltCtxt, c_context_node, NULL) + context._xsltCtxt.insert = c_node + + results = [] + c_node = c_parent.children + try: + while c_node is not NULL: + c_next = c_node.next + tree.xmlUnlinkNode(c_node) + proxy = _newReadOnlyProxy( + context._extension_element_proxy, c_node) + proxy.free_after_use() + python.PyList_Append(results, proxy) + c_node = c_next + finally: + tree.xmlFreeNode(c_parent) + return results + + +cdef _registerXSLTExtensions(xslt.xsltTransformContext* c_ctxt, + extension_dict): + for ns, name in extension_dict: + xslt.xsltRegisterExtElement( + c_ctxt, _cstr(name), _cstr(ns), _callExtensionElement) + +cdef void _callExtensionElement(xslt.xsltTransformContext* c_ctxt, + xmlNode* c_context_node, + xmlNode* c_inst_node, + void* dummy) with gil: + cdef _XSLTContext context + cdef XSLTExtension extension + cdef python.PyObject* dict_result + cdef char* c_uri + cdef _ReadOnlyElementProxy context_node, self_node, output_parent + c_uri = _getNs(c_inst_node) + if c_uri is NULL: + # not allowed, and should never happen + return + if c_ctxt.xpathCtxt.userData is NULL: + # just for safety, should never happen + return + context = <_XSLTContext>c_ctxt.xpathCtxt.userData + try: + dict_result = python.PyDict_GetItem( + context._extension_elements, (c_uri, c_inst_node.name)) + if dict_result is NULL: + raise KeyError("extension element %s not found", + c_inst_node.name) + extension = dict_result + + try: + self_node = _newReadOnlyProxy(None, 
c_inst_node) + context_node = _newReadOnlyProxy(self_node, c_context_node) + output_parent = _newAppendOnlyProxy(self_node, c_ctxt.insert) + + context._extension_element_proxy = self_node + extension.execute(context, self_node, context_node, output_parent) + finally: + context._extension_element_proxy = None + if self_node is not None: + _freeReadOnlyProxies(self_node) + except Exception, e: + message = "Error executing extension element '%s': %s" % ( + c_inst_node.name, e) + xslt.xsltTransformError(c_ctxt, NULL, c_inst_node, message) + context._exc._store_raised() + except: + # just in case + message = "Error executing extension element '%s'" % c_inst_node.name + xslt.xsltTransformError(c_ctxt, NULL, c_inst_node, message) + context._exc._store_raised() From scoder at codespeak.net Sun Mar 2 09:31:32 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:32 +0100 (CET) Subject: [Lxml-checkins] r52027 - in lxml/trunk: . doc Message-ID: <20080302083132.EE45016853E@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:32 2008 New Revision: 52027 Modified: lxml/trunk/ (props changed) lxml/trunk/doc/extensions.txt Log: r3666 at delle: sbehnel | 2008-03-02 08:56:20 +0100 r3656 at delle: sbehnel | 2008-03-02 07:52:50 +0100 reverted doc changes Modified: lxml/trunk/doc/extensions.txt ============================================================================== --- lxml/trunk/doc/extensions.txt (original) +++ lxml/trunk/doc/extensions.txt Sun Mar 2 09:31:32 2008 @@ -1,20 +1,11 @@ -Python extensions for XPath and XSLT -==================================== +Extension functions for XPath and XSLT +====================================== This document describes how to use Python extension functions in XPath and XSLT like this:: -It also describes how to use Python extension elements in XSLT like -this:: - - - - - - - Here is how an extension function looks like. As the first argument, it always receives a context object (see below). 
The other arguments are provided by the respective call in the XPath expression, one in @@ -27,14 +18,6 @@ >>> def loadsofargs(dummy, *args): ... return "Got %d arguments." % len(args) -And here is how an extension element looks like:: - - >>> from lxml import etree - >>> class MyExtElement(etree.XSLTExtension): - ... def execute(self, context, self_node, input_node, output_parent): - ... # just copy own content input to output - ... output_parent.extend( list(self_node) ) - .. contents:: .. @@ -54,6 +37,7 @@ FunctionNamespace class. For simplicity, we choose the empty namespace (None):: + >>> from lxml import etree >>> ns = etree.FunctionNamespace(None) >>> ns['hello'] = hello >>> ns['countargs'] = loadsofargs From scoder at codespeak.net Sun Mar 2 09:31:36 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:36 +0100 (CET) Subject: [Lxml-checkins] r52028 - in lxml/trunk: . src/lxml Message-ID: <20080302083136.6577216853F@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:36 2008 New Revision: 52028 Modified: lxml/trunk/ (props changed) lxml/trunk/src/lxml/xsltext.pxi Log: r3667 at delle: sbehnel | 2008-03-02 08:56:21 +0100 r3657 at delle: sbehnel | 2008-03-02 07:53:17 +0100 support text nodes as XSLT result of apply_templates() Modified: lxml/trunk/src/lxml/xsltext.pxi ============================================================================== --- lxml/trunk/src/lxml/xsltext.pxi (original) +++ lxml/trunk/src/lxml/xsltext.pxi Sun Mar 2 09:31:36 2008 @@ -47,10 +47,16 @@ while c_node is not NULL: c_next = c_node.next tree.xmlUnlinkNode(c_node) - proxy = _newReadOnlyProxy( - context._extension_element_proxy, c_node) - proxy.free_after_use() - python.PyList_Append(results, proxy) + if c_node.type == tree.XML_TEXT_NODE: + python.PyList_Append(results, _collectText(c_node)) + elif c_node.type == tree.XML_ELEMENT_NODE: + proxy = _newReadOnlyProxy( + context._extension_element_proxy, c_node) + proxy.free_after_use() + 
python.PyList_Append(results, proxy) + else: + raise TypeError("unsupported XSLT result type: %d" % + c_node.type) c_node = c_next finally: tree.xmlFreeNode(c_parent) From scoder at codespeak.net Sun Mar 2 09:31:40 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:40 +0100 (CET) Subject: [Lxml-checkins] r52029 - in lxml/trunk: . src/lxml/tests Message-ID: <20080302083140.AE6FC168538@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:40 2008 New Revision: 52029 Modified: lxml/trunk/ (props changed) lxml/trunk/src/lxml/tests/test_xslt.py Log: r3668 at delle: sbehnel | 2008-03-02 08:56:21 +0100 r3658 at delle: sbehnel | 2008-03-02 07:55:07 +0100 test case for apply_templates() Modified: lxml/trunk/src/lxml/tests/test_xslt.py ============================================================================== --- lxml/trunk/src/lxml/tests/test_xslt.py (original) +++ lxml/trunk/src/lxml/tests/test_xslt.py Sun Mar 2 09:31:40 2008 @@ -652,6 +652,38 @@ self.assertEquals(self._rootstring(result), 'Y') + def test_extension_element_apply_templates(self): + tree = self.parse('B') + style = self.parse('''\ + + + XY + + + XYZ +''') + + class MyExt(etree.XSLTExtension): + def execute(self, context, self_node, input_node, output_parent): + for child in self_node: + for result in self.apply_templates(context, child): + if isinstance(result, basestring): + el = etree.Element("T") + el.text = result + else: + el = result + output_parent.append(el) + + extensions = { ('testns', 'myext') : MyExt() } + + result = tree.xslt(style, extensions=extensions) + self.assertEquals(self._rootstring(result), + 'YXYZ') + def test_extension_element_raise(self): tree = self.parse('B') style = self.parse('''\ From scoder at codespeak.net Sun Mar 2 09:31:44 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:44 +0100 (CET) Subject: [Lxml-checkins] r52030 - in lxml/trunk: . 
src/lxml/tests Message-ID: <20080302083144.B1A4B168540@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:44 2008 New Revision: 52030 Modified: lxml/trunk/ (props changed) lxml/trunk/src/lxml/tests/test_xslt.py Log: r3669 at delle: sbehnel | 2008-03-02 08:56:21 +0100 r3659 at delle: sbehnel | 2008-03-02 08:40:19 +0100 cleanup Modified: lxml/trunk/src/lxml/tests/test_xslt.py ============================================================================== --- lxml/trunk/src/lxml/tests/test_xslt.py (original) +++ lxml/trunk/src/lxml/tests/test_xslt.py Sun Mar 2 09:31:44 2008 @@ -635,8 +635,7 @@ + extension-element-prefixes="myns"> XY @@ -658,8 +657,7 @@ + extension-element-prefixes="myns"> XY From scoder at codespeak.net Sun Mar 2 09:31:48 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:48 +0100 (CET) Subject: [Lxml-checkins] r52031 - in lxml/trunk: . src/lxml Message-ID: <20080302083148.5B325168540@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:48 2008 New Revision: 52031 Modified: lxml/trunk/ (props changed) lxml/trunk/src/lxml/xsltext.pxi Log: r3670 at delle: sbehnel | 2008-03-02 08:56:21 +0100 r3660 at delle: sbehnel | 2008-03-02 08:40:41 +0100 docstring fix Modified: lxml/trunk/src/lxml/xsltext.pxi ============================================================================== --- lxml/trunk/src/lxml/xsltext.pxi (original) +++ lxml/trunk/src/lxml/xsltext.pxi Sun Mar 2 09:31:48 2008 @@ -18,10 +18,11 @@ def apply_templates(self, _XSLTContext context not None, node): """apply_templates(self, context, node) - Call this method to continue applying templates to the input - document. Starts at the + Call this method to retrieve the result of applying templates + to an element. - The return value is a list of elements that were generated. + The return value is a list of elements or text strings that + were generated by the XSLT processor. 
""" cdef xmlNode* c_parent cdef xmlNode* c_node From scoder at codespeak.net Sun Mar 2 09:31:52 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:52 +0100 (CET) Subject: [Lxml-checkins] r52032 - in lxml/trunk: . src/lxml Message-ID: <20080302083152.3D3A4168540@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:51 2008 New Revision: 52032 Modified: lxml/trunk/ (props changed) lxml/trunk/src/lxml/readonlytree.pxi Log: r3671 at delle: sbehnel | 2008-03-02 08:56:22 +0100 r3661 at delle: sbehnel | 2008-03-02 08:41:06 +0100 support for deep copying read-only Elements Modified: lxml/trunk/src/lxml/readonlytree.pxi ============================================================================== --- lxml/trunk/src/lxml/readonlytree.pxi (original) +++ lxml/trunk/src/lxml/readonlytree.pxi Sun Mar 2 09:31:51 2008 @@ -119,6 +119,28 @@ c_node = _findChildBackwards(self._c_node, 0) return c_node != NULL + def __deepcopy__(self, memo): + "__deepcopy__(self, memo)" + return self.__copy__() + + def __copy__(self): + "__copy__(self)" + cdef xmlDoc* c_doc + cdef xmlNode* c_node + cdef _Document new_doc + c_doc = _copyDocRoot(self._c_node.doc, self._c_node) # recursive + new_doc = _documentFactory(c_doc, None) + root = new_doc.getroot() + if root is not None: + return root + # Comment/PI + c_node = c_doc.children + while c_node is not NULL and c_node.type != self._c_node.type: + c_node = c_node.next + if c_node is NULL: + return None + return _elementFactory(new_doc, c_node) + def __iter__(self): return iter(self.getchildren()) From scoder at codespeak.net Sun Mar 2 09:31:56 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:31:56 +0100 (CET) Subject: [Lxml-checkins] r52033 - in lxml/trunk: . 
doc/html Message-ID: <20080302083156.254A3168541@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:55 2008 New Revision: 52033 Modified: lxml/trunk/ (props changed) lxml/trunk/doc/html/style.css Log: r3672 at delle: sbehnel | 2008-03-02 08:56:22 +0100 r3662 at delle: sbehnel | 2008-03-02 08:55:38 +0100 web site styling Modified: lxml/trunk/doc/html/style.css ============================================================================== --- lxml/trunk/doc/html/style.css (original) +++ lxml/trunk/doc/html/style.css Sun Mar 2 09:31:55 2008 @@ -190,6 +190,16 @@ background-color: transparent; } +dt { + line-height: 1.5em; + margin-left: 1em; + content: "\00BB" " "; +} + +dt:before { + content: "\00BB" " "; +} + ul { line-height: 1.5em; margin-left: 1em; From scoder at codespeak.net Sun Mar 2 09:32:00 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:32:00 +0100 (CET) Subject: [Lxml-checkins] r52034 - in lxml/trunk: . doc Message-ID: <20080302083200.2C89E168541@codespeak.net> Author: scoder Date: Sun Mar 2 09:31:59 2008 New Revision: 52034 Modified: lxml/trunk/ (props changed) lxml/trunk/doc/xpathxslt.txt Log: r3673 at delle: sbehnel | 2008-03-02 08:56:22 +0100 r3663 at delle: sbehnel | 2008-03-02 08:56:13 +0100 doc section on XSLT extension elements Modified: lxml/trunk/doc/xpathxslt.txt ============================================================================== --- lxml/trunk/doc/xpathxslt.txt (original) +++ lxml/trunk/doc/xpathxslt.txt Sun Mar 2 09:31:59 2008 @@ -458,8 +458,127 @@ ------------------ Just like `custom extension functions`_, lxml supports custom -extension *elements*. +extension *elements* in XSLT. This means, you can write XSLT code +like this:: + + + + + + +And then you can implement the element in Python like this:: + + >>> class MyExtElement(etree.XSLTExtension): + ... def execute(self, context, self_node, input_node, output_parent): + ... print "Hello from XSLT!" + ... output_parent.text = "I did it!" + ... 
# just copy own content input to output + ... output_parent.extend( list(self_node) ) + +The arguments passed to this function are + +context + The opaque evaluation context. You need this when calling back + into the XSLT processor. + +self_node + A read-only Element object that represents the extension element + in the stylesheet. + +input_node + The current context Element in the input document (also read-only). + +output_parent + The current insertion point in the output document. You can + append elements or set the text value (not the tail). Apart from + that, the Element is read-only. + +In XSLT, extension elements can be used like any other XSLT element, +except that they must be declared as extensions using the standard +XSLT ``extension-element-prefixes`` option:: + + >>> xslt_ext_tree = etree.XML(''' + ... + ... + ... XYZ + ... + ... + ... --xyz-- + ... + ... ''') + +To register the extension, add its name and namespace to the extension +mapping of the XSLT object:: + + >>> my_extension = MyExtElement() + >>> extensions = { ('testns', 'ext') : my_extension } + >>> transform = etree.XSLT(xslt_ext_tree, extensions = extensions) + +Note how we pass an instance here, not the class of the extension. +Now we can run the transformation and see how our extension is +called:: + + >>> root = etree.XML('') + >>> result = transform(root) + Hello from XSLT! + >>> str(result) + '\nI did it!XYZ\n' + +XSLT extensions are a very powerful feature that allows you to +interact directly with the XSLT processor. You have full access to +the input document and the stylesheet, and you can even call back into +the XSLT processor to process templates. Here is an example that +passes an Element into the ``.apply_templates()`` method of the +``XSLTExtension`` instance:: + + >>> class MyExtElement(etree.XSLTExtension): + ... def execute(self, context, self_node, input_node, output_parent): + ... child = self_node[0] + ... results = self.apply_templates(context, child) + ... 
output_parent.append(results[0]) + + >>> my_extension = MyExtElement() + >>> extensions = { ('testns', 'ext') : my_extension } + >>> transform = etree.XSLT(xslt_ext_tree, extensions = extensions) + + >>> root = etree.XML('') + >>> result = transform(root) + >>> str(result) + '\n--xyz--\n' + +Note how we applied the templates to a child of the extension element +itself, i.e. to an element inside the stylesheet instead of an element +of the input document. + +There is one important thing to keep in mind: all Elements that the +``execute()`` method gets to deal with are read-only Elements, so you +cannot modify them. They also will not easily work in the API. For +example, you cannot pass them to the ``tostring()`` function or wrap +them in an ``ElementTree``. + +What you can do, however, is to deepcopy them to make them normal +Elements, and then modify them using the normal etree API. So this +will work:: + + >>> from copy import deepcopy + >>> class MyExtElement(etree.XSLTExtension): + ... def execute(self, context, self_node, input_node, output_parent): + ... child = deepcopy(self_node[0]) + ... child.text = "NEW TEXT" + ... 
output_parent.append(child) + + >>> my_extension = MyExtElement() + >>> extensions = { ('testns', 'ext') : my_extension } + >>> transform = etree.XSLT(xslt_ext_tree, extensions = extensions) + + >>> root = etree.XML('') + >>> result = transform(root) + >>> str(result) + '\nNEW TEXT\n' The ``xslt()`` tree method From scoder at codespeak.net Sun Mar 2 09:32:04 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:32:04 +0100 (CET) Subject: [Lxml-checkins] r52035 - lxml/trunk Message-ID: <20080302083204.3479C168541@codespeak.net> Author: scoder Date: Sun Mar 2 09:32:03 2008 New Revision: 52035 Modified: lxml/trunk/ (props changed) lxml/trunk/TODO.txt Log: r3675 at delle: sbehnel | 2008-03-02 09:22:29 +0100 cleanup Modified: lxml/trunk/TODO.txt ============================================================================== --- lxml/trunk/TODO.txt (original) +++ lxml/trunk/TODO.txt Sun Mar 2 09:32:03 2008 @@ -45,22 +45,6 @@ by libxml2 (patch exists) -XSLT extension elements ------------------------ - -* implementation: one base class that represents the result parent - - - .append(), .extend() and .text will add to the result tree (no .tail) - - - difference: Elements should be copied, not moved? (will break - later changes, but this just means that Elements in the result - tree are immutable, including those that were added) - - - how to make input tree read-only? maybe just document? - - - docs: "once in the result tree, Elements must no longer be changed"? 
- - lxml 2.0 ======== From scoder at codespeak.net Sun Mar 2 09:32:08 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:32:08 +0100 (CET) Subject: [Lxml-checkins] r52036 - lxml/trunk Message-ID: <20080302083208.541C7168539@codespeak.net> Author: scoder Date: Sun Mar 2 09:32:07 2008 New Revision: 52036 Modified: lxml/trunk/ (props changed) lxml/trunk/CHANGES.txt Log: r3676 at delle: sbehnel | 2008-03-02 09:23:16 +0100 mark extension elements experimental Modified: lxml/trunk/CHANGES.txt ============================================================================== --- lxml/trunk/CHANGES.txt (original) +++ lxml/trunk/CHANGES.txt Sun Mar 2 09:32:07 2008 @@ -8,7 +8,7 @@ Features added -------------- -* Extension elements for XSLT +* Extension elements for XSLT (experimental!) * ``Element.base`` property returns the xml:base or HTML base URL of an Element. From scoder at codespeak.net Sun Mar 2 09:32:59 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:32:59 +0100 (CET) Subject: [Lxml-checkins] r52037 - lxml/trunk Message-ID: <20080302083259.65249168538@codespeak.net> Author: scoder Date: Sun Mar 2 09:32:59 2008 New Revision: 52037 Modified: lxml/trunk/ (props changed) lxml/trunk/CHANGES.txt lxml/trunk/version.txt Log: r3690 at delle: sbehnel | 2008-03-02 09:32:14 +0100 make next trunk version 2.1 Modified: lxml/trunk/CHANGES.txt ============================================================================== --- lxml/trunk/CHANGES.txt (original) +++ lxml/trunk/CHANGES.txt Sun Mar 2 09:32:59 2008 @@ -2,8 +2,8 @@ lxml changelog ============== -2.0.3 (Under development) -========================= +2.1alpha1 (Under development) +============================= Features added -------------- Modified: lxml/trunk/version.txt ============================================================================== --- lxml/trunk/version.txt (original) +++ lxml/trunk/version.txt Sun Mar 2 09:32:59 2008 @@ -1 +1 @@ -2.0.2 
+2.1alpha1 From scoder at codespeak.net Sun Mar 2 09:40:30 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Sun, 2 Mar 2008 09:40:30 +0100 (CET) Subject: [Lxml-checkins] r52038 - in lxml/branch/lxml-2.0: doc src/lxml src/lxml/html Message-ID: <20080302084030.3F5F6168538@codespeak.net> Author: scoder Date: Sun Mar 2 09:40:29 2008 New Revision: 52038 Added: lxml/branch/lxml-2.0/src/lxml/saxparser.pxi - copied unchanged from r51849, lxml/trunk/src/lxml/saxparser.pxi Modified: lxml/branch/lxml-2.0/doc/lxml-source-howto.txt lxml/branch/lxml-2.0/src/lxml/html/__init__.py lxml/branch/lxml-2.0/src/lxml/lxml.etree.pyx lxml/branch/lxml-2.0/src/lxml/parser.pxi Log: trunk merge Modified: lxml/branch/lxml-2.0/doc/lxml-source-howto.txt ============================================================================== --- lxml/branch/lxml-2.0/doc/lxml-source-howto.txt (original) +++ lxml/branch/lxml-2.0/doc/lxml-source-howto.txt Sun Mar 2 09:40:29 2008 @@ -114,6 +114,9 @@ ... element = _elementFactory(doc, c_node) +A good place to see how this factory is used are the Element methods +``getparent()``, ``getnext()`` and ``getprevious()``. + The documentation ----------------- @@ -216,12 +219,15 @@ modules at the C level. For example, ``lxml.objectify`` makes use of these. See the `C-level API` documentation. +saxparser.pxi + SAX-like parser interfaces as known from ElementTree's TreeBuilder. + serializer.pxi XML output functions. Basically everything that creates byte sequences from XML trees. xinclude.pxi - XInclude implementation. + XInclude support. xmlerror.pxi Error log handling. 
All error messages that libxml2 generates Modified: lxml/branch/lxml-2.0/src/lxml/html/__init__.py ============================================================================== --- lxml/branch/lxml-2.0/src/lxml/html/__init__.py (original) +++ lxml/branch/lxml-2.0/src/lxml/html/__init__.py Sun Mar 2 09:40:29 2008 @@ -713,11 +713,11 @@ You can use this like:: - >>> form = doc.forms[0] # doctest: +SKIP - >>> form.inputs['foo'].value = 'bar' # etc # doctest: +SKIP - >>> response = form.submit() # doctest: +SKIP - >>> doc = parse(response) # doctest: +SKIP - >>> doc.make_links_absolute(response.geturl()) # doctest: +SKIP + form = doc.forms[0] + form.inputs['foo'].value = 'bar' # etc + response = form.submit() + doc = parse(response) + doc.make_links_absolute(response.geturl()) To change the HTTP requester, pass a function as ``open_http`` keyword argument that opens the URL for you. The function must have the following Modified: lxml/branch/lxml-2.0/src/lxml/lxml.etree.pyx ============================================================================== --- lxml/branch/lxml-2.0/src/lxml/lxml.etree.pyx (original) +++ lxml/branch/lxml-2.0/src/lxml/lxml.etree.pyx Sun Mar 2 09:40:29 2008 @@ -1110,8 +1110,7 @@ c_node = _parentElement(self._c_node) if c_node is NULL: return None - else: - return _elementFactory(self._doc, c_node) + return _elementFactory(self._doc, c_node) def getnext(self): """getnext(self) @@ -1120,9 +1119,9 @@ """ cdef xmlNode* c_node c_node = _nextElement(self._c_node) - if c_node is not NULL: - return _elementFactory(self._doc, c_node) - return None + if c_node is NULL: + return None + return _elementFactory(self._doc, c_node) def getprevious(self): """getprevious(self) @@ -1131,9 +1130,9 @@ """ cdef xmlNode* c_node c_node = _previousElement(self._c_node) - if c_node is not NULL: - return _elementFactory(self._doc, c_node) - return None + if c_node is NULL: + return None + return _elementFactory(self._doc, c_node) def itersiblings(self, tag=None, *, 
preceding=False): """itersiblings(self, tag=None, preceding=False) @@ -2534,6 +2533,7 @@ include "nsclasses.pxi" # Namespace implementation and registry include "docloader.pxi" # Support for custom document loaders include "parser.pxi" # XML Parser +include "saxparser.pxi" # SAX-like Parser interface and tree builder include "parsertarget.pxi" # ET Parser target include "serializer.pxi" # XML output functions include "iterparse.pxi" # incremental XML parsing Modified: lxml/branch/lxml-2.0/src/lxml/parser.pxi ============================================================================== --- lxml/branch/lxml-2.0/src/lxml/parser.pxi (original) +++ lxml/branch/lxml-2.0/src/lxml/parser.pxi Sun Mar 2 09:40:29 2008 @@ -1011,441 +1011,6 @@ return htmlparser.htmlParseChunk(c_ctxt, c_data, buffer_len, 0) return 0 - -############################################################ -## SAX event handler -############################################################ - -ctypedef enum _SaxParserEvents: - SAX_EVENT_START = 1 - SAX_EVENT_END = 2 - SAX_EVENT_DATA = 4 - SAX_EVENT_DOCTYPE = 8 - SAX_EVENT_PI = 16 - SAX_EVENT_COMMENT = 32 - -cdef class _SaxParserTarget: - cdef int _sax_event_filter - cdef int _sax_event_propagate - cdef _handleSaxStart(self, tag, attrib, nsmap): - return None - cdef _handleSaxEnd(self, tag): - return None - cdef int _handleSaxData(self, data) except -1: - return 0 - cdef int _handleSaxDoctype(self, root_tag, public_id, system_id) except -1: - return 0 - cdef _handleSaxPi(self, target, data): - return None - cdef _handleSaxComment(self, comment): - return None - -cdef class _SaxParserContext(_ParserContext): - """This class maps SAX2 events to method calls. 
- """ - cdef _SaxParserTarget _target - cdef xmlparser.startElementNsSAX2Func _origSaxStart - cdef xmlparser.endElementNsSAX2Func _origSaxEnd - cdef xmlparser.startElementSAXFunc _origSaxStartNoNs - cdef xmlparser.endElementSAXFunc _origSaxEndNoNs - cdef xmlparser.charactersSAXFunc _origSaxData - cdef xmlparser.internalSubsetSAXFunc _origSaxDoctype - cdef xmlparser.commentSAXFunc _origSaxComment - cdef xmlparser.processingInstructionSAXFunc _origSaxPi - - cdef void _setSaxParserTarget(self, _SaxParserTarget target): - self._target = target - - cdef void _initParserContext(self, xmlparser.xmlParserCtxt* c_ctxt): - "wrap original SAX2 callbacks" - cdef xmlparser.xmlSAXHandler* sax - _ParserContext._initParserContext(self, c_ctxt) - sax = c_ctxt.sax - if self._target._sax_event_propagate & SAX_EVENT_START: - # propagate => keep orig callback - self._origSaxStart = sax.startElementNs - self._origSaxStartNoNs = sax.startElement - else: - # otherwise: never call orig callback - self._origSaxStart = sax.startElementNs = NULL - self._origSaxStartNoNs = sax.startElement = NULL - if self._target._sax_event_filter & SAX_EVENT_START: - # intercept => overwrite orig callback - if sax.initialized == xmlparser.XML_SAX2_MAGIC: - sax.startElementNs = _handleSaxStart - sax.startElement = _handleSaxStartNoNs - - if self._target._sax_event_propagate & SAX_EVENT_END: - self._origSaxEnd = sax.endElementNs - self._origSaxEndNoNs = sax.endElement - else: - self._origSaxEnd = sax.endElementNs = NULL - self._origSaxEndNoNs = sax.endElement = NULL - if self._target._sax_event_filter & SAX_EVENT_END: - if sax.initialized == xmlparser.XML_SAX2_MAGIC: - sax.endElementNs = _handleSaxEnd - sax.endElement = _handleSaxEndNoNs - - if self._target._sax_event_propagate & SAX_EVENT_DATA: - self._origSaxData = sax.characters - else: - self._origSaxData = sax.characters = NULL - if self._target._sax_event_filter & SAX_EVENT_DATA: - sax.characters = _handleSaxData - - if self._target._sax_event_propagate 
& SAX_EVENT_DOCTYPE: - self._origSaxDoctype = sax.internalSubset - else: - self._origSaxDoctype = sax.internalSubset = NULL - if self._target._sax_event_filter & SAX_EVENT_DOCTYPE: - sax.internalSubset = _handleSaxDoctype - - if self._target._sax_event_propagate & SAX_EVENT_PI: - self._origSaxPi = sax.processingInstruction - else: - self._origSaxPi = sax.processingInstruction = NULL - if self._target._sax_event_filter & SAX_EVENT_PI: - sax.processingInstruction = _handleSaxPI - - if self._target._sax_event_propagate & SAX_EVENT_COMMENT: - self._origSaxComment = sax.comment - else: - self._origSaxComment = sax.comment = NULL - if self._target._sax_event_filter & SAX_EVENT_COMMENT: - sax.comment = _handleSaxComment - - cdef void _handleSaxException(self, xmlparser.xmlParserCtxt* c_ctxt): - self._store_raised() - if c_ctxt.errNo == xmlerror.XML_ERR_OK: - c_ctxt.errNo = xmlerror.XML_ERR_INTERNAL_ERROR - c_ctxt.disableSAX = 1 - -cdef void _handleSaxStart(void* ctxt, char* c_localname, char* c_prefix, - char* c_namespace, int c_nb_namespaces, - char** c_namespaces, - int c_nb_attributes, int c_nb_defaulted, - char** c_attributes) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - cdef _Element element - cdef int i - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxStart is not NULL: - context._origSaxStart(c_ctxt, c_localname, c_prefix, c_namespace, - c_nb_namespaces, c_namespaces, c_nb_attributes, - c_nb_defaulted, c_attributes) - try: - tag = _namespacedNameFromNsName(c_namespace, c_localname) - if c_nb_defaulted > 0: - # only add default attributes if we asked for them - if c_ctxt.loadsubset & xmlparser.XML_COMPLETE_ATTRS == 0: - c_nb_attributes = c_nb_attributes - c_nb_defaulted - if c_nb_attributes == 0: - attrib = EMPTY_READ_ONLY_DICT - else: - attrib = {} - for i from 0 <= i < c_nb_attributes: - name = _namespacedNameFromNsName( - c_attributes[2], 
c_attributes[0]) - if c_attributes[3] is NULL: - value = "" - else: - value = python.PyUnicode_DecodeUTF8( - c_attributes[3], c_attributes[4] - c_attributes[3], - "strict") - python.PyDict_SetItem(attrib, name, value) - c_attributes = c_attributes + 5 - if c_nb_namespaces == 0: - nsmap = EMPTY_READ_ONLY_DICT - else: - nsmap = {} - for i from 0 <= i < c_nb_namespaces: - if c_namespaces[0] is NULL: - prefix = None - else: - prefix = funicode(c_namespaces[0]) - python.PyDict_SetItem( - nsmap, prefix, funicode(c_namespaces[1])) - c_namespaces = c_namespaces + 2 - element = context._target._handleSaxStart(tag, attrib, nsmap) - if element is not None and c_ctxt.input is not NULL: - if c_ctxt.input.line < 65535: - element._c_node.line = c_ctxt.input.line - else: - element._c_node.line = 65535 - except: - context._handleSaxException(c_ctxt) - -cdef void _handleSaxStartNoNs(void* ctxt, char* c_name, - char** c_attributes) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - cdef _Element element - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxStartNoNs is not NULL: - context._origSaxStartNoNs(c_ctxt, c_name, c_attributes) - try: - tag = funicode(c_name) - if c_attributes is NULL: - attrib = EMPTY_READ_ONLY_DICT - else: - attrib = {} - while c_attributes[0] is not NULL: - name = funicode(c_attributes[0]) - if c_attributes[1] is NULL: - value = "" - else: - value = funicode(c_attributes[1]) - c_attributes = c_attributes + 2 - python.PyDict_SetItem(attrib, name, value) - element = context._target._handleSaxStart( - tag, attrib, EMPTY_READ_ONLY_DICT) - if element is not None and c_ctxt.input is not NULL: - if c_ctxt.input.line < 65535: - element._c_node.line = c_ctxt.input.line - else: - element._c_node.line = 65535 - except: - context._handleSaxException(c_ctxt) - -cdef void _handleSaxEnd(void* ctxt, char* c_localname, char* c_prefix, - char* c_namespace) with gil: - cdef 
_SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxEnd is not NULL: - context._origSaxEnd(c_ctxt, c_localname, c_prefix, c_namespace) - try: - tag = _namespacedNameFromNsName(c_namespace, c_localname) - context._target._handleSaxEnd(tag) - except: - context._handleSaxException(c_ctxt) - -cdef void _handleSaxEndNoNs(void* ctxt, char* c_name) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxEndNoNs is not NULL: - context._origSaxEndNoNs(c_ctxt, c_name) - try: - context._target._handleSaxEnd(funicode(c_name)) - except: - context._handleSaxException(c_ctxt) - -cdef void _handleSaxData(void* ctxt, char* c_data, int data_len) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxData is not NULL: - context._origSaxData(c_ctxt, c_data, data_len) - try: - context._target._handleSaxData( - python.PyUnicode_DecodeUTF8(c_data, data_len, NULL)) - except: - context._handleSaxException(c_ctxt) - -cdef void _handleSaxDoctype(void* ctxt, char* c_name, char* c_public, - char* c_system) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxDoctype is not NULL: - context._origSaxDoctype(c_ctxt, c_name, c_public, c_system) - try: - if c_public is not NULL: - public_id = funicode(c_public) - if c_system is not NULL: - system_id = funicode(c_system) - context._target._handleSaxDoctype( - funicode(c_name), public_id, system_id) - except: - context._handleSaxException(c_ctxt) - -cdef void 
_handleSaxPI(void* ctxt, char* c_target, char* c_data) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxPi is not NULL: - context._origSaxPi(c_ctxt, c_target, c_data) - try: - if c_data is not NULL: - data = funicode(c_data) - context._target._handleSaxPi(funicode(c_target), data) - except: - context._handleSaxException(c_ctxt) - -cdef void _handleSaxComment(void* ctxt, char* c_data) with gil: - cdef _SaxParserContext context - cdef xmlparser.xmlParserCtxt* c_ctxt - c_ctxt = ctxt - if c_ctxt._private is NULL: - return - context = <_SaxParserContext>c_ctxt._private - if context._origSaxComment is not NULL: - context._origSaxComment(c_ctxt, c_data) - try: - context._target._handleSaxComment(funicode(c_data)) - except: - context._handleSaxException(c_ctxt) - - -############################################################ -## ET compatible XML tree builder -############################################################ - -cdef class TreeBuilder(_SaxParserTarget): - """TreeBuilder(self, element_factory=None, parser=None) - Parser target that builds a tree. - - The final tree is returned by the ``close()`` method. 
- """ - cdef _BaseParser _parser - cdef object _factory - cdef object _data - cdef object _element_stack - cdef object _element_stack_pop - cdef _Element _last - cdef bint _in_tail - - def __init__(self, *, element_factory=None, parser=None): - self._sax_event_filter = \ - SAX_EVENT_START | SAX_EVENT_END | SAX_EVENT_DATA | \ - SAX_EVENT_PI | SAX_EVENT_COMMENT - self._data = [] # data collector - self._element_stack = [] # element stack - self._element_stack_pop = self._element_stack.pop - self._last = None # last element - self._in_tail = 0 # true if we're after an end tag - self._factory = element_factory - self._parser = parser - - cdef int _flush(self) except -1: - if python.PyList_GET_SIZE(self._data) > 0: - if self._last is not None: - text = "".join(self._data) - if self._in_tail: - assert self._last.tail is None, "internal error (tail)" - self._last.tail = text - else: - assert self._last.text is None, "internal error (text)" - self._last.text = text - del self._data[:] - return 0 - - # Python level event handlers - - def close(self): - """close(self) - - Flushes the builder buffers, and returns the toplevel document - element. - """ - assert python.PyList_GET_SIZE(self._element_stack) == 0, "missing end tags" - assert self._last is not None, "missing toplevel element" - return self._last - - def data(self, data): - """data(self, data) - - Adds text to the current element. The value should be either an - 8-bit string containing ASCII text, or a Unicode string. - """ - self._handleSaxData(data) - - def start(self, tag, attrs, nsmap=None): - """start(self, tag, attrs, nsmap=None) - - Opens a new element. - """ - if nsmap is None: - nsmap = EMPTY_READ_ONLY_DICT - return self._handleSaxStart(tag, attrs, nsmap) - - def end(self, tag): - """end(self, tag) - - Closes the current element. 
- """ - element = self._handleSaxEnd(tag) - assert self._last.tag == tag,\ - "end tag mismatch (expected %s, got %s)" % ( - self._last.tag, tag) - return element - - def pi(self, target, data): - """pi(self, target, data) - """ - return self._handleSaxPi(target, data) - - def comment(self, comment): - """comment(self, comment) - """ - return self._handleSaxComment(comment) - - # internal SAX event handlers - - cdef _handleSaxStart(self, tag, attrib, nsmap): - self._flush() - if self._factory is not None: - self._last = self._factory(tag, attrib) - if python.PyList_GET_SIZE(self._element_stack) > 0: - _appendChild(self._element_stack[-1], self._last) - elif python.PyList_GET_SIZE(self._element_stack) > 0: - self._last = _makeSubElement( - self._element_stack[-1], tag, None, None, attrib, nsmap, None) - else: - self._last = _makeElement( - tag, NULL, None, self._parser, None, None, attrib, nsmap, None) - python.PyList_Append(self._element_stack, self._last) - self._in_tail = 0 - return self._last - - cdef _handleSaxEnd(self, tag): - self._flush() - self._last = self._element_stack_pop() - self._in_tail = 1 - return self._last - - cdef int _handleSaxData(self, data) except -1: - python.PyList_Append(self._data, data) - - cdef _handleSaxPi(self, target, data): - self._flush() - self._last = ProcessingInstruction(target, data) - if python.PyList_GET_SIZE(self._element_stack) > 0: - _appendChild(self._element_stack[-1], self._last) - self._in_tail = 1 - return self._last - - cdef _handleSaxComment(self, comment): - self._flush() - self._last = Comment(comment) - if python.PyList_GET_SIZE(self._element_stack) > 0: - _appendChild(self._element_stack[-1], self._last) - self._in_tail = 1 - return self._last - ############################################################ ## XML parser ############################################################ From scoder at codespeak.net Mon Mar 3 19:41:03 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Mon, 3 Mar 2008 
19:41:03 +0100 (CET) Subject: [Lxml-checkins] r52103 - in lxml/trunk: . src/lxml Message-ID: <20080303184103.BC9C8169ECF@codespeak.net> Author: scoder Date: Mon Mar 3 19:41:03 2008 New Revision: 52103 Modified: lxml/trunk/ (props changed) lxml/trunk/src/lxml/readonlytree.pxi Log: r3694 at delle: sbehnel | 2008-03-03 08:51:05 +0100 tag fix in read-only tree Modified: lxml/trunk/src/lxml/readonlytree.pxi ============================================================================== --- lxml/trunk/src/lxml/readonlytree.pxi (original) +++ lxml/trunk/src/lxml/readonlytree.pxi Mon Mar 3 19:41:03 2008 @@ -150,7 +150,7 @@ Iterate over the children of this element. """ children = self.getchildren() - if tag is not None: + if tag is not None and tag != '*': children = [ el for el in children if el.tag == tag ] if reversed: children = children[::-1] From scoder at codespeak.net Mon Mar 3 19:41:33 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Mon, 3 Mar 2008 19:41:33 +0100 (CET) Subject: [Lxml-checkins] r52104 - in lxml/trunk: . doc Message-ID: <20080303184133.7EEF3169EB3@codespeak.net> Author: scoder Date: Mon Mar 3 19:41:32 2008 New Revision: 52104 Modified: lxml/trunk/ (props changed) lxml/trunk/doc/build.txt lxml/trunk/doc/lxml-source-howto.txt lxml/trunk/doc/main.txt Log: r3695 at delle: sbehnel | 2008-03-03 08:51:17 +0100 doc updates Modified: lxml/trunk/doc/build.txt ============================================================================== --- lxml/trunk/doc/build.txt (original) +++ lxml/trunk/doc/build.txt Mon Mar 3 19:41:32 2008 @@ -58,11 +58,13 @@ svn co http://codespeak.net/svn/lxml/trunk lxml -This will create a directory ``lxml`` and download the source into it. You -can also `browse the repository through the web`_ or use your favourite SVN -client to access it. +This will create a directory ``lxml`` and download the source into it. 
+You can also browse the `Subversion repository`_ through the web, use +your favourite SVN client to access it, or browse the `Subversion +history`_. -.. _`browse the repository through the web`: http://codespeak.net/svn/lxml +.. _`Subversion repository`: http://codespeak.net/svn/lxml/ +.. _`Subversion history`: https://codespeak.net/viewvc/lxml/ Setuptools Modified: lxml/trunk/doc/lxml-source-howto.txt ============================================================================== --- lxml/trunk/doc/lxml-source-howto.txt (original) +++ lxml/trunk/doc/lxml-source-howto.txt Mon Mar 3 19:41:32 2008 @@ -153,8 +153,9 @@ lxml.etree ========== -The main module, ``lxml.etree``, is in the file **lxml.etree.pyx**. -It implements the main functions and types of the ElementTree API, as +The main module, ``lxml.etree``, is in the file `lxml.etree.pyx +`_. It +implements the main functions and types of the ElementTree API, as well as all the factory functions for proxies. It is the best place to start if you want to find out how a specific feature is implemented. @@ -219,6 +220,12 @@ modules at the C level. For example, ``lxml.objectify`` makes use of these. See the `C-level API` documentation. +readonlytree.pxi + A separate read-only implementation of the Element API. This is + used in places where non-intrusive access to a tree is required, + such as the ``PythonElementClassLookup`` or XSLT extension + elements. + saxparser.pxi SAX-like parser interfaces as known from ElementTree's TreeBuilder. @@ -295,15 +302,8 @@ A Cython implemented extension module that uses the public C-API of lxml.etree. It provides a Python object-like interface to XML trees. - - -lxml.pyclasslookup -================== - -A Cython implemented extension module that uses the public C-API of -lxml.etree. It provides a class lookup scheme that duplicates lxml's -ElementTree API in a very simple way to provide Python access to the -tree *before* instantiating the real Python proxies in lxml.etree. 
+The implementation resides in the file `lxml.objectify.pyx +`_. lxml.html Modified: lxml/trunk/doc/main.txt ============================================================================== --- lxml/trunk/doc/main.txt (original) +++ lxml/trunk/doc/main.txt Mon Mar 3 19:41:32 2008 @@ -159,13 +159,16 @@ svn co http://codespeak.net/svn/lxml/trunk lxml -You can also `browse it through the web`_. Please read `how to build lxml -from source`_ first. The `latest CHANGES`_ of the developer version are also -accessible. You can check there if a bug you found has been fixed or a -feature you want has been implemented in the latest trunk version. +You can also browse the `Subversion repository`_ through the web, or +take a look at the `Subversion history`_. Please read `how to build lxml +from source`_ first. The `latest CHANGES`_ of the developer version +are also accessible. You can check there if a bug you found has been +fixed or a feature you want has been implemented in the latest trunk +version. .. _`how to build lxml from source`: build.html -.. _`browse it through the web`: http://codespeak.net/svn/lxml +.. _`Subversion repository`: http://codespeak.net/svn/lxml/ +.. _`Subversion history`: https://codespeak.net/viewvc/lxml/ .. _`latest CHANGES`: http://codespeak.net/svn/lxml/trunk/CHANGES.txt From scoder at codespeak.net Mon Mar 3 19:41:43 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Mon, 3 Mar 2008 19:41:43 +0100 (CET) Subject: [Lxml-checkins] r52105 - in lxml/trunk: . 
doc Message-ID: <20080303184143.EF4E2169EAE@codespeak.net> Author: scoder Date: Mon Mar 3 19:41:43 2008 New Revision: 52105 Modified: lxml/trunk/ (props changed) lxml/trunk/INSTALL.txt lxml/trunk/doc/main.txt lxml/trunk/doc/mkhtml.py Log: r3696 at delle: sbehnel | 2008-03-03 10:17:04 +0100 doc cleanup and fixes Modified: lxml/trunk/INSTALL.txt ============================================================================== --- lxml/trunk/INSTALL.txt (original) +++ lxml/trunk/INSTALL.txt Mon Mar 3 19:41:43 2008 @@ -40,12 +40,12 @@ Building lxml from sources -------------------------- -If you want to build lxml from SVN you should read `how to build lxml from -source`_ (or the file ``build.txt`` in the ``doc`` directory of the source -tree). Building from Subversion sources or from modified distribution sources -requires Cython_ to translate the lxml sources into C code. The source -distribution ships with pre-generated C source files, so you do not need -Cython installed to build from release sources. +If you want to build lxml from SVN you should read `how to build lxml +from source`_ (or the file ``doc/build.txt`` in the source tree). +Building from Subversion sources or from modified distribution sources +requires Cython_ to translate the lxml sources into C code. The +source distribution ships with pre-generated C source files, so you do +not need Cython installed to build from release sources. .. _Cython: http://www.cython.org .. _`how to build lxml from source`: build.html @@ -60,10 +60,10 @@ MS Windows ---------- -For MS Windows, the `binary egg distribution of lxml`_ is statically built -against the libraries, i.e. it already includes them. There is no need to -install the external libraries if you use an official lxml build from -cheeseshop. +For MS Windows, the `binary egg distribution of lxml`_ is statically +built against the libraries, i.e. it already includes them. 
There is +no need to install the external libraries if you use an official lxml +build from PyPI. If you want to upgrade the libraries and/or compile lxml from sources, you should install a `binary distribution`_ of libxml2 and libxslt. You need both @@ -76,13 +76,17 @@ MacOS-X ------- -On MacOS-X 10.4, you can try to use the installed system libraries when you -build lxml yourself. However, the library versions on this system are older -than the required versions, so you may encounter certain differences in -behaviour or even crashes. A number of users reported success with updated -libraries (e.g. using fink_), but needed to set the environment variable +The system libraries of libxml2 and libxslt installed under MacOS-X +tend to be rather outdated. In any case, they are older than the +required versions for lxml 2.x, so you will have a hard time getting +lxml to work without installing newer libraries. + +A number of users reported success with updated libraries (e.g. using +fink_ or macports), but needed to set the runtime environment variable ``DYLD_LIBRARY_PATH`` to the directory where fink keeps the libraries. +See the `FAQ entry on MacOS-X`_ for more information. .. _fink: http://finkproject.org/ +.. _`FAQ entry on MacOS-X`: FAQ.html#my-application-crashes-on-macos-x -A MacPort of lxml is available. Try ``port install py25-lxml``. +A macport of lxml is available. Try ``port install py25-lxml``. Modified: lxml/trunk/doc/main.txt ============================================================================== --- lxml/trunk/doc/main.txt (original) +++ lxml/trunk/doc/main.txt Mon Mar 3 19:41:43 2008 @@ -140,19 +140,17 @@ The source distribution is signed with `this key`_. Binary builds for MS Windows usually become available through PyPI a few days after a source release. If you can't wait, consider trying a less recent -version first. - -.. _`lxml at the Python Package Index`: http://pypi.python.org/pypi/lxml/ -.. _`this key`: pubkey.asc +release version first. 
The latest version is `lxml 2.0.2`_, released 2008-02-22 (`changes for 2.0.2`_). `Older versions`_ are listed below. -.. _`Older versions`: #old-versions - Please take a look at the `installation instructions`_! -.. _`installation instructions`: installation.html +This complete web site (including the generated API documentation) is +part of the source distribution, so if you want to download the +documentation for offline use, take the source archive and copy the +``doc/html`` directory out of the source tree. It's also possible to check out the latest development version of lxml from svn directly, using a command like this:: @@ -166,6 +164,10 @@ fixed or a feature you want has been implemented in the latest trunk version. +.. _`lxml at the Python Package Index`: http://pypi.python.org/pypi/lxml/ +.. _`this key`: pubkey.asc +.. _`Older versions`: #old-versions +.. _`installation instructions`: installation.html .. _`how to build lxml from source`: build.html .. _`Subversion repository`: http://codespeak.net/svn/lxml/ .. 
_`Subversion history`: https://codespeak.net/viewvc/lxml/ Modified: lxml/trunk/doc/mkhtml.py ============================================================================== --- lxml/trunk/doc/mkhtml.py (original) +++ lxml/trunk/doc/mkhtml.py Mon Mar 3 19:41:43 2008 @@ -3,8 +3,8 @@ import os, shutil, re, sys, copy, time SITE_STRUCTURE = [ - ('lxml', ('main.txt', 'intro.txt', 'lxml2.txt', 'FAQ.txt', - 'compatibility.txt', 'performance.txt')), + ('lxml', ('main.txt', 'intro.txt', '../INSTALL.txt', 'lxml2.txt', + 'FAQ.txt', 'compatibility.txt', 'performance.txt')), ('Developing with lxml', ('tutorial.txt', '@API reference', 'api.txt', 'parsing.txt', 'validation.txt', 'xpathxslt.txt', @@ -12,7 +12,8 @@ 'cssselect.txt', 'elementsoup.txt')), ('Extending lxml', ('resolvers.txt', 'extensions.txt', 'element_classes.txt', 'sax.txt', 'capi.txt')), - ('Developing lxml', ('build.txt', 'lxml-source-howto.txt')), + ('Developing lxml', ('build.txt', 'lxml-source-howto.txt', + '@Release Changelog')), ] RST2HTML_OPTIONS = " ".join([ @@ -26,6 +27,11 @@ "API reference" : "api/index.html" } +BASENAME_MAP = { + 'main' : 'index', + 'INSTALL' : 'installation', +} + htmlnsmap = {"h" : "http://www.w3.org/1999/xhtml"} find_title = XPath("/h:html/h:head/h:title/text()", namespaces=htmlnsmap) @@ -51,7 +57,7 @@ if page_title: page_title = page_title[0] else: - page_title = replace_invalid(' ', basename.capitalize()) + page_title = replace_invalid('', basename.capitalize()) build_menu_entry(page_title, basename+".html", section_head, headings=find_headings(tree)) @@ -78,7 +84,7 @@ tag = el.tag if tag[0] != '{': el.tag = "{http://www.w3.org/1999/xhtml}" + tag - current_menu = find_menu(menu_root, name=name) + current_menu = find_menu(menu_root, name=replace_invalid('', name)) if current_menu: for submenu in current_menu: submenu.set("class", submenu.get("class", ""). 
@@ -102,6 +108,10 @@ shutil.copy(pubkey, dirname) + href_map = HREF_MAP.copy() + changelog_basename = 'changes-%s' % release + href_map['Release Changelog'] = changelog_basename + '.html' + trees = {} menu = Element("div", {"class":"sidemenu"}) # build HTML pages and parse them back @@ -111,13 +121,12 @@ if filename.startswith('@'): # special menu entry page_title = filename[1:] - url = HREF_MAP[page_title] + url = href_map[page_title] build_menu_entry(page_title, url, section_head) else: path = os.path.join(doc_dir, filename) - basename = os.path.splitext(filename)[0] - if basename == 'main': - basename = 'index' + basename = os.path.splitext(os.path.basename(filename))[0] + basename = BASENAME_MAP.get(basename, basename) outname = basename + '.html' outpath = os.path.join(dirname, outname) @@ -128,20 +137,16 @@ build_menu(tree, basename, section_head) - # integrate menu - for tree, basename, outpath in trees.itervalues(): - new_tree = merge_menu(tree, menu, basename) - new_tree.write(outpath) - # also convert INSTALL.txt and CHANGES.txt rest2html(script, - os.path.join(lxml_path, 'INSTALL.txt'), - os.path.join(dirname, 'installation.html'), - stylesheet_url) - rest2html(script, os.path.join(lxml_path, 'CHANGES.txt'), os.path.join(dirname, 'changes-%s.html' % release), stylesheet_url) + # integrate menu + for tree, basename, outpath in trees.itervalues(): + new_tree = merge_menu(tree, menu, basename) + new_tree.write(outpath) + if __name__ == '__main__': publish(sys.argv[1], sys.argv[2], sys.argv[3]) From scoder at codespeak.net Mon Mar 3 19:41:50 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Mon, 3 Mar 2008 19:41:50 +0100 (CET) Subject: [Lxml-checkins] r52106 - in lxml/trunk: . 
doc
Message-ID: <20080303184150.F15D7169EB3@codespeak.net>

Author: scoder
Date: Mon Mar 3 19:41:50 2008
New Revision: 52106

Modified:
   lxml/trunk/ (props changed)
   lxml/trunk/doc/FAQ.txt
   lxml/trunk/doc/mkhtml.py
Log:
 r3697 at delle: sbehnel | 2008-03-03 10:52:47 +0100
 FAQ fixes

Modified: lxml/trunk/doc/FAQ.txt
==============================================================================
--- lxml/trunk/doc/FAQ.txt (original)
+++ lxml/trunk/doc/FAQ.txt Mon Mar 3 19:41:50 2008
@@ -63,16 +63,17 @@
 important concepts in ``lxml.etree``. If you want to help out, improving the
 tutorial is a very good place to start.

-There is also a `tutorial for ElementTree`_ which works for ``lxml.etree``.
-The `API documentation`_ also contains many examples for ``lxml.etree``. To
-learn using ``lxml.objectify``, read the `objectify documentation`_.
+There is also a `tutorial for ElementTree`_ which works for
+``lxml.etree``. The documentation of the `extended etree API`_ also
+contains many examples for ``lxml.etree``. To learn using
+``lxml.objectify``, read the `objectify documentation`_.

 John Shipman has written another tutorial called `Python XML processing with
 lxml`_ that contains lots of examples.

 .. _`lxml.etree Tutorial`: tutorial.html
 .. _`tutorial for ElementTree`: http://effbot.org/zone/element.htm
-.. _`API documentation`: api.html
+.. _`extended etree API`: api.html
 .. _`objectify documentation`: objectify.html
 .. _`Python XML processing with lxml`: http://www.nmt.edu/tcc/help/pubs/pylxml/

@@ -80,33 +81,56 @@
 Where can I find more documentation about lxml?
 -----------------------------------------------

-There is a lot of documentation as lxml implements the well-known `ElementTree
-API`_ and tries to follow its documentation as closely as possible. There are
-a couple of issues where lxml cannot keep up compatibility. They are
-described in the compatibility_ documentation.
The lxml specific extensions
-to the API are described by individual files in the ``doc`` directory of the
-distribution and on `the web page`_.
+There is a lot of documentation on the web and also in the Python
+standard library documentation, as lxml implements the well-known
+`ElementTree API`_ and tries to follow its documentation as closely as
+possible. There are a couple of issues where lxml cannot keep up
+compatibility. They are described in the compatibility_
+documentation.
+
+The lxml specific extensions to the API are described by individual
+files in the ``doc`` directory of the source distribution and on `the
+web page`_.
+
+The `generated API documentation`_ is a comprehensive API reference
+for the lxml package.

 .. _`ElementTree API`: http://effbot.org/zone/element-index.htm
 .. _`the web page`: http://codespeak.net/lxml/#documentation
+.. _`generated API documentation`: api/index.html


 What standards does lxml implement?
 -----------------------------------

 The compliance to XML Standards depends on the support in libxml2 and libxslt.
-Here is a quote from `http://xmlsoft.org/`:
+Here is a quote from `http://xmlsoft.org/ `_:

   In most cases libxml2 tries to implement the specifications in a relatively
   strictly compliant way. As of release 2.4.16, libxml2 passed all 1800+ tests
   from the OASIS XML Tests Suite.

-lxml currently supports libxml2 2.6.20 or later, which has even better support
-for various XML standards. Some of the more important ones are: HTML, XML
-namespaces, XPath, XInclude, XSLT, XML catalogs, canonical XML, RelaxNG,
-XML:ID. Support for XML Schema and especially Schematron is currently
-incomplete in libxml2, but is definitely usable and actively being worked on.
-libxml2 also supports loading documents through HTTP and FTP.
+lxml currently supports libxml2 2.6.20 or later, which has even better
+support for various XML standards.
The important ones are: + +* XML 1.0 +* HTML 4 +* XML namespaces +* XML Schema 1.0 +* XPath 1.0 +* XInclude 1.0 +* XSLT 1.0 +* EXSLT +* XML catalogs +* canonical XML +* RelaxNG +* xml:id +* xml:base + +Support for XML Schema is currently not 100% complete in libxml2, but +is definitely very close to compliance. Schematron is supported, +although not necessarily complete. libxml2 also supports loading +documents through HTTP and FTP. Who uses lxml? Modified: lxml/trunk/doc/mkhtml.py ============================================================================== --- lxml/trunk/doc/mkhtml.py (original) +++ lxml/trunk/doc/mkhtml.py Mon Mar 3 19:41:50 2008 @@ -4,7 +4,7 @@ SITE_STRUCTURE = [ ('lxml', ('main.txt', 'intro.txt', '../INSTALL.txt', 'lxml2.txt', - 'FAQ.txt', 'compatibility.txt', 'performance.txt')), + 'performance.txt', 'compatibility.txt', 'FAQ.txt')), ('Developing with lxml', ('tutorial.txt', '@API reference', 'api.txt', 'parsing.txt', 'validation.txt', 'xpathxslt.txt', From scoder at codespeak.net Mon Mar 3 19:41:59 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Mon, 3 Mar 2008 19:41:59 +0100 (CET) Subject: [Lxml-checkins] r52107 - in lxml/trunk: . src/lxml src/lxml/tests Message-ID: <20080303184159.B2D87169EAE@codespeak.net> Author: scoder Date: Mon Mar 3 19:41:59 2008 New Revision: 52107 Modified: lxml/trunk/ (props changed) lxml/trunk/CHANGES.txt lxml/trunk/TODO.txt lxml/trunk/src/lxml/tests/test_xslt.py lxml/trunk/src/lxml/xslt.pxi Log: r3698 at delle: sbehnel | 2008-03-03 11:49:57 +0100 constant instances DENY_ALL/DENY_WRITE on XSLTAccessControl class Modified: lxml/trunk/CHANGES.txt ============================================================================== --- lxml/trunk/CHANGES.txt (original) +++ lxml/trunk/CHANGES.txt Mon Mar 3 19:41:59 2008 @@ -8,6 +8,9 @@ Features added -------------- +* Constant instances ``DENY_ALL`` and ``DENY_WRITE`` on + ``XSLTAccessControl`` class. + * Extension elements for XSLT (experimental!) 
* ``Element.base`` property returns the xml:base or HTML base URL of Modified: lxml/trunk/TODO.txt ============================================================================== --- lxml/trunk/TODO.txt (original) +++ lxml/trunk/TODO.txt Mon Mar 3 19:41:59 2008 @@ -45,6 +45,13 @@ by libxml2 (patch exists) +XSLT +---- + +* Support subclassing XSLTAccessControl to provide custom per-URL + access check methods + + lxml 2.0 ======== Modified: lxml/trunk/src/lxml/tests/test_xslt.py ============================================================================== --- lxml/trunk/src/lxml/tests/test_xslt.py (original) +++ lxml/trunk/src/lxml/tests/test_xslt.py Mon Mar 3 19:41:59 2008 @@ -819,6 +819,29 @@ self.assertEquals(root[3].get("value"), 'B') + def test_xslt_document_parse_allow(self): + access_control = etree.XSLTAccessControl(read_file=True) + xslt = etree.XSLT(etree.parse(fileInTestDir("test-document.xslt")), + access_control = access_control) + result = xslt(etree.XML('')) + root = result.getroot() + self.assertEquals(root.tag, + 'test') + self.assertEquals(root[0].tag, + '{http://www.w3.org/1999/XSL/Transform}stylesheet') + + def test_xslt_document_parse_deny(self): + access_control = etree.XSLTAccessControl(read_file=False) + xslt = etree.XSLT(etree.parse(fileInTestDir("test-document.xslt")), + access_control = access_control) + self.assertRaises(etree.XSLTApplyError, xslt, etree.XML('')) + + def test_xslt_document_parse_deny_all(self): + access_control = etree.XSLTAccessControl.DENY_ALL + xslt = etree.XSLT(etree.parse(fileInTestDir("test-document.xslt")), + access_control = access_control) + self.assertRaises(etree.XSLTApplyError, xslt, etree.XML('')) + def test_xslt_move_result(self): root = etree.XML('''\ Modified: lxml/trunk/src/lxml/xslt.pxi ============================================================================== --- lxml/trunk/src/lxml/xslt.pxi (original) +++ lxml/trunk/src/lxml/xslt.pxi Mon Mar 3 19:41:59 2008 @@ -180,6 +180,11 @@ - read_network - 
write_network + For convenience, there is also a class member `DENY_ALL` that + provides an XSLTAccessControl instance that is readily configured + to deny everything, and a `DENY_WRITE` member that denies all + write access but allows read access. + See `XSLT`. """ cdef xslt.xsltSecurityPrefs* _prefs @@ -194,6 +199,14 @@ self._setAccess(xslt.XSLT_SECPREF_READ_NETWORK, read_network) self._setAccess(xslt.XSLT_SECPREF_WRITE_NETWORK, write_network) + DENY_ALL = XSLTAccessControl( + read_file=False, write_file=False, create_dir=False, + read_network=False, write_network=False) + + DENY_WRITE = XSLTAccessControl( + read_file=True, write_file=False, create_dir=False, + read_network=True, write_network=False) + def __dealloc__(self): if self._prefs is not NULL: xslt.xsltFreeSecurityPrefs(self._prefs) From scoder at codespeak.net Mon Mar 3 19:42:05 2008 From: scoder at codespeak.net (scoder at codespeak.net) Date: Mon, 3 Mar 2008 19:42:05 +0100 (CET) Subject: [Lxml-checkins] r52108 - lxml/trunk Message-ID: <20080303184205.9FB3A169ECE@codespeak.net> Author: scoder Date: Mon Mar 3 19:42:05 2008 New Revision: 52108 Modified: lxml/trunk/ (props changed) lxml/trunk/Makefile Log: r3699 at delle: sbehnel | 2008-03-03 12:01:18 +0100 docclean target in Makefile Modified: lxml/trunk/Makefile ============================================================================== --- lxml/trunk/Makefile (original) +++ lxml/trunk/Makefile Mon Mar 3 19:42:05 2008 @@ -41,7 +41,6 @@ $(PYTHON) test.py -f $(TESTFLAGS) $(TESTOPTS) html: inplace - mkdir -p doc/html PYTHONPATH=src $(PYTHON) doc/mkhtml.py doc/html . `cat version.txt` rm -fr doc/html/api @[ -x "`which epydoc`" ] \ @@ -65,7 +64,11 @@ find . \( -name '*.o' -o -name '*.so' -o -name '*.py[cod]' -o -name '*.dll' \) -exec rm -f {} \; rm -rf build -realclean: clean +docclean: + rm -f doc/html/*.html + rm -fr doc/html/api + +realclean: clean docclean find . 
-name '*.c' -exec rm -f {} \;
 	rm -f TAGS
 	$(PYTHON) setup.py clean -a

From scoder at codespeak.net Mon Mar 3 19:42:25 2008
From: scoder at codespeak.net (scoder at codespeak.net)
Date: Mon, 3 Mar 2008 19:42:25 +0100 (CET)
Subject: [Lxml-checkins] r52109 - in lxml/trunk: . doc src/lxml src/lxml/tests
Message-ID: <20080303184225.B1D03169EAE@codespeak.net>

Author: scoder
Date: Mon Mar 3 19:42:19 2008
New Revision: 52109

Modified:
   lxml/trunk/ (props changed)
   lxml/trunk/CHANGES.txt
   lxml/trunk/doc/api.txt
   lxml/trunk/doc/extensions.txt
   lxml/trunk/src/lxml/classlookup.pxi
   lxml/trunk/src/lxml/lxml.objectify.pyx
   lxml/trunk/src/lxml/parser.pxi
   lxml/trunk/src/lxml/tests/test_classlookup.py
   lxml/trunk/src/lxml/tests/test_etree.py
   lxml/trunk/src/lxml/tests/test_htmlparser.py
   lxml/trunk/src/lxml/tests/test_nsclasses.py
   lxml/trunk/src/lxml/tests/test_xslt.py
   lxml/trunk/src/lxml/xmlerror.pxi
   lxml/trunk/src/lxml/xpath.pxi
   lxml/trunk/src/lxml/xslt.pxi
Log:
 r3700 at delle: sbehnel | 2008-03-03 12:30:47 +0100
 removed most deprecated functions and methods

Modified: lxml/trunk/CHANGES.txt
==============================================================================
--- lxml/trunk/CHANGES.txt (original)
+++ lxml/trunk/CHANGES.txt Mon Mar 3 19:42:19 2008
@@ -24,6 +24,8 @@
 Other changes
 -------------

+* Most deprecated functions and methods were removed.
+

 2.0.2 (2008-02-22)
 ==================

Modified: lxml/trunk/doc/api.txt
==============================================================================
--- lxml/trunk/doc/api.txt (original)
+++ lxml/trunk/doc/api.txt Mon Mar 3 19:42:19 2008
@@ -208,7 +208,7 @@
 errors that occured and "might have" lead to the problem from the error log
 copy attached to the exception::

-    >>> etree.clearErrorLog()
+    >>> etree.clear_error_log()
     >>> broken_xml = '''
     ...
     ...

Modified: lxml/trunk/doc/extensions.txt ============================================================================== --- lxml/trunk/doc/extensions.txt (original) +++ lxml/trunk/doc/extensions.txt Mon Mar 3 19:42:19 2008 @@ -176,7 +176,7 @@ register the namespace with the evaluator, however, we can access it via a prefix:: - >>> e.registerNamespace('foo', 'http://mydomain.org/myfunctions') + >>> e.register_namespace('foo', 'http://mydomain.org/myfunctions') >>> e.evaluate('/foo:a')[0].tag '{http://mydomain.org/myfunctions}a' Modified: lxml/trunk/src/lxml/classlookup.pxi ============================================================================== --- lxml/trunk/src/lxml/classlookup.pxi (original) +++ lxml/trunk/src/lxml/classlookup.pxi Mon Mar 3 19:42:19 2008 @@ -107,13 +107,6 @@ """ self._setFallback(lookup) - def setFallback(self, ElementClassLookup lookup not None): - """Sets the fallback scheme for this lookup method. - - :deprecated: use ``set_fallback()`` instead. - """ - self._setFallback(lookup) - cdef object _callFallback(self, _Document doc, xmlNode* c_node): return self._fallback_function(self.fallback, doc, c_node) @@ -408,10 +401,6 @@ ELEMENT_CLASS_LOOKUP_STATE = state LOOKUP_ELEMENT_CLASS = function -def setElementClassLookup(ElementClassLookup lookup = None): - ":deprecated: use ``set_element_class_lookup(lookup)`` instead" - set_element_class_lookup(lookup) - def set_element_class_lookup(ElementClassLookup lookup = None): """set_element_class_lookup(lookup = None) Modified: lxml/trunk/src/lxml/lxml.objectify.pyx ============================================================================== --- lxml/trunk/src/lxml/lxml.objectify.pyx (original) +++ lxml/trunk/src/lxml/lxml.objectify.pyx Mon Mar 3 19:42:19 2008 @@ -81,11 +81,6 @@ PYTYPE_ATTRIBUTE = cetree.namespacedNameFromNsName( _PYTYPE_NAMESPACE, _PYTYPE_ATTRIBUTE_NAME) -def setPytypeAttributeTag(attribute_tag=None): - """:deprecated: use ``set_pytype_attribute_tag()`` instead. 
- """ - set_pytype_attribute_tag(attribute_tag) - set_pytype_attribute_tag() @@ -1685,10 +1680,6 @@ cdef object objectify_parser objectify_parser = __DEFAULT_PARSER -def setDefaultParser(new_parser = None): - ":deprecated: use ``set_default_parser()`` instead." - set_default_parser(new_parser) - def set_default_parser(new_parser = None): """set_default_parser(new_parser = None) Modified: lxml/trunk/src/lxml/parser.pxi ============================================================================== --- lxml/trunk/src/lxml/parser.pxi (original) +++ lxml/trunk/src/lxml/parser.pxi Mon Mar 3 19:42:19 2008 @@ -678,7 +678,7 @@ return "libxml2 %d.%d.%d" % LIBXML_VERSION def setElementClassLookup(self, ElementClassLookup lookup = None): - "@deprecated: use ``parser.set_element_class_lookup(lookup)`` instead." + ":deprecated: use ``parser.set_element_class_lookup(lookup)`` instead." self.set_element_class_lookup(lookup) def set_element_class_lookup(self, ElementClassLookup lookup = None): @@ -1130,14 +1130,6 @@ __GLOBAL_PARSER_CONTEXT.setDefaultParser(__DEFAULT_XML_PARSER) -def setDefaultParser(parser=None): - ":deprecated: please use set_default_parser instead." - set_default_parser(parser) - -def getDefaultParser(): - ":deprecated: please use get_default_parser instead." 
-    return get_default_parser()
-
 def set_default_parser(_BaseParser parser=None):
     """set_default_parser(parser=None)

Modified: lxml/trunk/src/lxml/tests/test_classlookup.py
==============================================================================
--- lxml/trunk/src/lxml/tests/test_classlookup.py (original)
+++ lxml/trunk/src/lxml/tests/test_classlookup.py Mon Mar 3 19:42:19 2008
@@ -25,7 +25,7 @@
     etree = etree

     def tearDown(self):
-        etree.setElementClassLookup()
+        etree.set_element_class_lookup()
         super(ClassLookupTestCase, self).tearDown()

     def test_namespace_lookup(self):
@@ -33,7 +33,7 @@
             FIND_ME = "namespace class"

         lookup = etree.ElementNamespaceClassLookup()
-        etree.setElementClassLookup(lookup)
+        etree.set_element_class_lookup(lookup)

         ns = lookup.get_namespace("myNS")
         ns[None] = TestElement
@@ -57,7 +57,7 @@
         lookup = etree.ElementDefaultClassLookup(
             element=TestElement, comment=TestComment, pi=TestPI)
-        parser.setElementClassLookup(lookup)
+        parser.set_element_class_lookup(lookup)
         root = etree.XML("""
@@ -78,7 +78,7 @@

         lookup = etree.AttributeBasedElementClassLookup(
             "a1", class_dict)
-        etree.setElementClassLookup(lookup)
+        etree.set_element_class_lookup(lookup)

         root = etree.XML(xml_str)
         self.assertFalse(hasattr(root, 'FIND_ME'))
@@ -95,7 +95,7 @@
                 if name == 'c1':
                     return TestElement

-        etree.setElementClassLookup( MyLookup() )
+        etree.set_element_class_lookup( MyLookup() )

         root = etree.XML(xml_str)
         self.assertFalse(hasattr(root, 'FIND_ME'))
@@ -116,7 +116,7 @@
                 return TestElement1

         lookup = etree.ElementNamespaceClassLookup( MyLookup() )
-        etree.setElementClassLookup(lookup)
+        etree.set_element_class_lookup(lookup)

         ns = lookup.get_namespace("otherNS")
         ns[None] = TestElement2
@@ -134,14 +134,14 @@
             FIND_ME = "parser_based"

         lookup = etree.ParserBasedElementClassLookup()
-        etree.setElementClassLookup(lookup)
+        etree.set_element_class_lookup(lookup)

         class MyLookup(etree.CustomElementClassLookup):
             def lookup(self, t, d, ns, name):
                 return TestElement

         parser = etree.XMLParser()
-        parser.setElementClassLookup( MyLookup() )
+        parser.set_element_class_lookup( MyLookup() )

         root = etree.parse(StringIO(xml_str), parser).getroot()
         self.assertEquals(root.FIND_ME,

Modified: lxml/trunk/src/lxml/tests/test_etree.py
==============================================================================
--- lxml/trunk/src/lxml/tests/test_etree.py (original)
+++ lxml/trunk/src/lxml/tests/test_etree.py Mon Mar 3 19:42:19 2008
@@ -254,7 +254,7 @@
         parse = self.etree.parse
         # from StringIO
         f = StringIO('')
-        self.etree.clearErrorLog()
+        self.etree.clear_error_log()
         try:
             parse(f)
             logs = None

Modified: lxml/trunk/src/lxml/tests/test_htmlparser.py
==============================================================================
--- lxml/trunk/src/lxml/tests/test_htmlparser.py (original)
+++ lxml/trunk/src/lxml/tests/test_htmlparser.py Mon Mar 3 19:42:19 2008
@@ -27,7 +27,7 @@

     def tearDown(self):
         super(HtmlParserTestCase, self).tearDown()
-        self.etree.setDefaultParser()
+        self.etree.set_default_parser()

     def test_module_HTML(self):
         element = self.etree.HTML(self.html_str)
@@ -235,13 +235,13 @@
         self.assertRaises(self.etree.XMLSyntaxError,
                           self.etree.parse, StringIO(self.broken_html_str))

-        self.etree.setDefaultParser( self.etree.HTMLParser() )
+        self.etree.set_default_parser( self.etree.HTMLParser() )

         tree = self.etree.parse(StringIO(self.broken_html_str))
         self.assertEqual(self.etree.tostring(tree.getroot()),
                          self.html_str)

-        self.etree.setDefaultParser()
+        self.etree.set_default_parser()

         self.assertRaises(self.etree.XMLSyntaxError,
                           self.etree.parse, StringIO(self.broken_html_str))

Modified: lxml/trunk/src/lxml/tests/test_nsclasses.py
==============================================================================
--- lxml/trunk/src/lxml/tests/test_nsclasses.py (original)
+++ lxml/trunk/src/lxml/tests/test_nsclasses.py Mon Mar 3 19:42:19 2008
@@ -25,11 +25,11 @@
         lookup = etree.ElementNamespaceClassLookup()
         self.Namespace = lookup.get_namespace
         parser = etree.XMLParser()
-        parser.setElementClassLookup(lookup)
-        etree.setDefaultParser(parser)
+        parser.set_element_class_lookup(lookup)
+        etree.set_default_parser(parser)

     def tearDown(self):
-        etree.setDefaultParser()
+        etree.set_default_parser()
         del self.Namespace
         super(ETreeNamespaceClassesTestCase, self).tearDown()

Modified: lxml/trunk/src/lxml/tests/test_xslt.py
==============================================================================
--- lxml/trunk/src/lxml/tests/test_xslt.py (original)
+++ lxml/trunk/src/lxml/tests/test_xslt.py Mon Mar 3 19:42:19 2008
@@ -29,7 +29,7 @@
 B
 ''',
-                          st.tostring(res))
+                          str(res))

     def test_xslt_elementtree_error(self):
         self.assertRaises(ValueError, etree.XSLT, etree.ElementTree())
@@ -298,7 +298,7 @@
 Bar
 ''',
-                          st.tostring(res))
+                          str(res))

         if etree.LIBXSLT_VERSION < (1,1,18):
             # later versions produce no error
@@ -335,7 +335,7 @@
 BarBaz
 ''',
-                          st.tostring(res))
+                          str(res))

     def test_xslt_parameter_xpath(self):
         tree = self.parse('BC')
@@ -354,7 +354,7 @@
 B
 ''',
-                          st.tostring(res))
+                          str(res))


     def test_xslt_default_parameters(self):
@@ -375,13 +375,13 @@
 Bar
 ''',
-                          st.tostring(res))
+                          str(res))
         res = st.apply(tree)
         self.assertEquals('''\
 Default
 ''',
-                          st.tostring(res))
+                          str(res))

     def test_xslt_html_output(self):
         tree = self.parse('BC')
@@ -471,7 +471,6 @@
         styledoc = self.parse(xslt)
         style = etree.XSLT(styledoc)
         result = style.apply(source)
-        self.assertEqual('', style.tostring(result))
         self.assertEqual('', str(result))

     def test_xslt_message(self):
@@ -488,7 +487,6 @@
         styledoc = self.parse(xslt)
         style = etree.XSLT(styledoc)
         result = style.apply(source)
-        self.assertEqual('', style.tostring(result))
         self.assertEqual('', str(result))
         self.assert_("TEST TEST TEST" in [entry.message
                                           for entry in style.error_log])
@@ -507,7 +505,6 @@
         styledoc = self.parse(xslt)
         style = etree.XSLT(styledoc)
         result = style.apply(source)
-        self.assertEqual('', style.tostring(result))
         self.assertEqual('', str(result))
         self.assert_("TEST TEST TEST" in [entry.message
                                           for entry in style.error_log])
@@ -907,7 +904,7 @@
 B
 ''',
-                          st.tostring(res))
+                          str(res))

     def test_xslt_pi_embedded_id(self):
         # test XPath lookup mechanism
@@ -941,7 +938,7 @@
 B
 ''',
-                          st.tostring(res))
+                          str(res))

     def test_xslt_pi_get(self):
         tree = self.parse('''\

Modified: lxml/trunk/src/lxml/xmlerror.pxi
==============================================================================
--- lxml/trunk/src/lxml/xmlerror.pxi (original)
+++ lxml/trunk/src/lxml/xmlerror.pxi Mon Mar 3 19:42:19 2008
@@ -12,14 +12,6 @@
     """
     __GLOBAL_ERROR_LOG.clear()

-def clearErrorLog():
-    """Clear the global error log. Note that this log is already bound to a
-    fixed size.
-
-    :deprecated: use ``clear_error_log()`` instead.
-    """
-    __GLOBAL_ERROR_LOG.clear()
-
 # dummy function: no debug output at all
 cdef void _nullGenericErrorFunc(void* ctxt, char* msg, ...):
     pass
@@ -411,17 +403,6 @@
     "Helper function for properties in exceptions."
     return __GLOBAL_ERROR_LOG.copy()

-def useGlobalPythonLog(PyErrorLog log not None):
-    """Replace the global error log by an etree.PyErrorLog that uses the
-    standard Python logging package.
-
-    Note that this disables access to the global error log from exceptions.
-    Parsers, XSLT etc. will continue to provide their normal local error log.
-
-    :deprecated: use ``use_global_python_log()`` instead.
-    """
-    use_global_python_log(log)
-
 def use_global_python_log(PyErrorLog log not None):
     """use_global_python_log(log)

Modified: lxml/trunk/src/lxml/xpath.pxi
==============================================================================
--- lxml/trunk/src/lxml/xpath.pxi (original)
+++ lxml/trunk/src/lxml/xpath.pxi Mon Mar 3 19:42:19 2008
@@ -235,26 +235,11 @@
             python.PyErr_NoMemory()
         self.set_context(xpathCtxt)

-    def registerNamespace(self, prefix, uri):
-        """Register a namespace with the XPath context.
-
-        :deprecated: use ``register_namespace()`` instead
-        """
-        self._context.addNamespace(prefix, uri)
-
     def register_namespace(self, prefix, uri):
         """Register a namespace with the XPath context.
         """
         self._context.addNamespace(prefix, uri)

-    def registerNamespaces(self, namespaces):
-        """Register a prefix -> uri dict.
-
-        :deprecated: use ``register_namespaces()`` instead
-        """
-        for prefix, uri in namespaces.items():
-            self._context.addNamespace(prefix, uri)
-
     def register_namespaces(self, namespaces):
         """Register a prefix -> uri dict.
         """

Modified: lxml/trunk/src/lxml/xslt.pxi
==============================================================================
--- lxml/trunk/src/lxml/xslt.pxi (original)
+++ lxml/trunk/src/lxml/xslt.pxi Mon Mar 3 19:42:19 2008
@@ -387,15 +387,6 @@
         :deprecated: call the object, not this method."""
         return self(_input, profile_run=profile_run, **_kw)

-    def tostring(self, _ElementTree result_tree):
-        """tostring(self, result_tree)
-
-        Save result doc to string based on stylesheet output method.
-
-        :deprecated: use str(result_tree) instead.
-        """
-        return str(result_tree)
-
     def __deepcopy__(self, memo):
         return self.__copy__()

From scoder at codespeak.net Mon Mar 3 19:42:27 2008
From: scoder at codespeak.net (scoder at codespeak.net)
Date: Mon, 3 Mar 2008 19:42:27 +0100 (CET)
Subject: [Lxml-checkins] r52110 - in lxml/trunk: .
	src/lxml
Message-ID: <20080303184227.226BB169EAE@codespeak.net>

Author: scoder
Date: Mon Mar 3 19:42:26 2008
New Revision: 52110

Modified:
   lxml/trunk/ (props changed)
   lxml/trunk/src/lxml/parser.pxi
Log:
 r3701 at delle: sbehnel | 2008-03-03 13:13:15 +0100
 dropped one more

Modified: lxml/trunk/src/lxml/parser.pxi
==============================================================================
--- lxml/trunk/src/lxml/parser.pxi (original)
+++ lxml/trunk/src/lxml/parser.pxi Mon Mar 3 19:42:26 2008
@@ -677,10 +677,6 @@
         def __get__(self):
             return "libxml2 %d.%d.%d" % LIBXML_VERSION

-    def setElementClassLookup(self, ElementClassLookup lookup = None):
-        ":deprecated: use ``parser.set_element_class_lookup(lookup)`` instead."
-        self.set_element_class_lookup(lookup)
-
     def set_element_class_lookup(self, ElementClassLookup lookup = None):
         """set_element_class_lookup(self, lookup = None)

From scoder at codespeak.net Mon Mar 3 19:42:39 2008
From: scoder at codespeak.net (scoder at codespeak.net)
Date: Mon, 3 Mar 2008 19:42:39 +0100 (CET)
Subject: [Lxml-checkins] r52111 - in lxml/trunk: .
	doc src/lxml src/lxml/html src/lxml/tests
Message-ID: <20080303184239.DDCB7169ECE@codespeak.net>

Author: scoder
Date: Mon Mar 3 19:42:39 2008
New Revision: 52111

Modified:
   lxml/trunk/ (props changed)
   lxml/trunk/doc/element_classes.txt
   lxml/trunk/doc/extensions.txt
   lxml/trunk/src/lxml/html/__init__.py
   lxml/trunk/src/lxml/lxml.etree.pyx
   lxml/trunk/src/lxml/parser.pxi
   lxml/trunk/src/lxml/tests/test_objectify.py
   lxml/trunk/src/lxml/tests/test_pyclasslookup.py
   lxml/trunk/src/lxml/tests/test_xpathevaluator.py
   lxml/trunk/src/lxml/tests/test_xslt.py
Log:
 r3702 at delle: sbehnel | 2008-03-03 13:32:50 +0100
 tons of API usage fixes in the docs

Modified: lxml/trunk/doc/element_classes.txt
==============================================================================
--- lxml/trunk/doc/element_classes.txt (original)
+++ lxml/trunk/doc/element_classes.txt Mon Mar 3 19:42:39 2008
@@ -89,7 +89,7 @@

   >>> parser_lookup = etree.ElementDefaultClassLookup(element=HonkElement)
   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(parser_lookup)
+  >>> parser.set_element_class_lookup(parser_lookup)

 There is one drawback of the parser based scheme: the ``Element()`` factory
 does not know about your specialised parser and creates a new document that
@@ -153,7 +153,7 @@

   >>> lookup = etree.ElementDefaultClassLookup()
   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)

 Note that the default for new parsers is to use the global fallback, which is
 also the default lookup (if not configured otherwise).
@@ -167,7 +167,7 @@
   False

   >>> lookup = etree.ElementDefaultClassLookup(element=HonkElement)
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)

   >>> el = parser.makeelement("myelement")
   >>> print isinstance(el, HonkElement)
@@ -189,7 +189,7 @@

   >>> lookup = etree.ElementNamespaceClassLookup()
   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)

 See the separate section on `implementing namespaces`_ below to learn how to
 make use of it.
@@ -203,7 +203,7 @@

   >>> fallback = etree.ElementDefaultClassLookup(element=HonkElement)
   >>> lookup = etree.ElementNamespaceClassLookup(fallback)
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)


 Attribute based lookup
@@ -217,7 +217,7 @@
   >>> lookup = etree.AttributeBasedElementClassLookup(
   ...     'id', id_class_mapping)
   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)

 Instead of a global setup of this scheme, you should consider using a
 per-parser setup.
@@ -230,7 +230,7 @@
   >>> lookup = etree.AttributeBasedElementClassLookup(
   ...     'id', id_class_mapping, fallback)
   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)


 Custom element class lookup
@@ -244,7 +244,7 @@
   ...         return MyElementClass # defined elsewhere

   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(MyLookup())
+  >>> parser.set_element_class_lookup(MyLookup())

 The ``lookup()`` method must either return None (which triggers the fallback
 mechanism) or a subclass of ``lxml.etree.ElementBase``. It can take any
@@ -270,7 +270,7 @@
   ...         return MyElementClass # defined elsewhere

   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(MyLookup())
+  >>> parser.set_element_class_lookup(MyLookup())

 As before, the first argument to the ``lookup()`` method is the opaque
 document instance that contains the Element. The second arguments is a
@@ -305,7 +305,7 @@

   >>> lookup = etree.ElementNamespaceClassLookup()
   >>> parser = etree.XMLParser()
-  >>> parser.setElementClassLookup(lookup)
+  >>> parser.set_element_class_lookup(lookup)

   >>> namespace = lookup.get_namespace('http://hui.de/honk')


Modified: lxml/trunk/doc/extensions.txt
==============================================================================
--- lxml/trunk/doc/extensions.txt (original)
+++ lxml/trunk/doc/extensions.txt Mon Mar 3 19:42:39 2008
@@ -141,12 +141,12 @@
 XSL transformations::

   >>> e = etree.XPathEvaluator(doc)
-  >>> print e.evaluate('es:hello(local-name(/a))')
+  >>> print e('es:hello(local-name(/a))')
   Ola a

   >>> namespaces = {'f' : 'http://mydomain.org/myfunctions'}
   >>> e = etree.XPathEvaluator(doc, namespaces=namespaces)
-  >>> print e.evaluate('f:hello(local-name(/a))')
+  >>> print e('f:hello(local-name(/a))')
   Hello a

   >>> xslt = etree.XSLT(etree.ElementTree(etree.XML('''
@@ -169,7 +169,7 @@
   >>> f = StringIO('')
   >>> ns_doc = etree.parse(f)
   >>> e = etree.XPathEvaluator(ns_doc)
-  >>> e.evaluate('/a')
+  >>> e('/a')
   []

 This returns nothing, as we did not ask for the right namespace. When we
@@ -177,14 +177,14 @@
 prefix::

   >>> e.register_namespace('foo', 'http://mydomain.org/myfunctions')
-  >>> e.evaluate('/foo:a')[0].tag
+  >>> e('/foo:a')[0].tag
   '{http://mydomain.org/myfunctions}a'

 Note that this prefix mapping is only known to this evaluator, as opposed to
 the global mapping of the FunctionNamespace objects::

   >>> e2 = etree.XPathEvaluator(ns_doc)
-  >>> e2.evaluate('/foo:a')
+  >>> e2('/foo:a')
   Traceback (most recent call last):
   ...
   XPathEvalError: Undefined namespace prefix
@@ -202,7 +202,7 @@

   >>> namespaces = {'l' : 'local-ns'}
   >>> e = etree.XPathEvaluator(doc, namespaces=namespaces, extensions=extensions)
-  >>> print e.evaluate('l:local-hello(string(b))')
+  >>> print e('l:local-hello(string(b))')
   Hello Haegar

 For larger numbers of extension functions, you can define classes or modules
@@ -221,7 +221,7 @@

   >>> extensions = etree.Extension( ext_module, functions, ns='local-ns' )
   >>> e = etree.XPathEvaluator(doc, namespaces=namespaces, extensions=extensions)
-  >>> print e.evaluate('l:function1(string(b))')
+  >>> print e('l:function1(string(b))')
   1Haegar

 The optional second argument to ``Extension`` can either be be a
@@ -237,17 +237,17 @@
   >>> functions = ('function1', 'function2', 'function3')
   >>> extensions = etree.Extension( ext_module, functions )
   >>> e = etree.XPathEvaluator(doc, extensions=extensions)
-  >>> print e.evaluate('function1(function2(function3(string(b))))')
+  >>> print e('function1(function2(function3(string(b))))')
   123Haegar

   >>> extensions = etree.Extension( ext_module, functions, ns=None )
   >>> e = etree.XPathEvaluator(doc, extensions=extensions)
-  >>> print e.evaluate('function1(function2(function3(string(b))))')
+  >>> print e('function1(function2(function3(string(b))))')
   123Haegar

   >>> extensions = etree.Extension(ext_module)
   >>> e = etree.XPathEvaluator(doc, extensions=extensions)
-  >>> print e.evaluate('function1(function2(function3(string(b))))')
+  >>> print e('function1(function2(function3(string(b))))')
   123Haegar

   >>> functions = {
@@ -257,7 +257,7 @@
   ... }
   >>> extensions = etree.Extension(ext_module, functions)
   >>> e = etree.XPathEvaluator(doc, extensions=extensions)
-  >>> print e.evaluate('function1(function2(function3(string(b))))')
+  >>> print e('function1(function2(function3(string(b))))')
   123Haegar

 For convenience, you can also pass a sequence of extensions::
@@ -266,7 +266,7 @@
   >>> extensions2 = etree.Extension(ext_module, ns='local-ns')
   >>> e = etree.XPathEvaluator(doc, extensions=[extensions1, extensions2],
   ...                          namespaces=namespaces)
-  >>> print e.evaluate('function1(l:function2(function3(string(b))))')
+  >>> print e('function1(l:function2(function3(string(b))))')
   123Haegar


@@ -296,15 +296,15 @@
   >>> ns['first'] = returnFirstNode

   >>> e = etree.XPathEvaluator(doc)
-  >>> e.evaluate("float()")
+  >>> e("float()")
   1.7
-  >>> e.evaluate("int()")
+  >>> e("int()")
   1.0
-  >>> int( e.evaluate("int()") )
+  >>> int( e("int()") )
   1
-  >>> e.evaluate("bool()")
+  >>> e("bool()")
   True
-  >>> e.evaluate("count(first(//b))")
+  >>> e("count(first(//b))")
   1.0

 As the last example shows, you can pass the results of functions back into
@@ -327,11 +327,11 @@

   >>> e = etree.XPathEvaluator(doc)

-  >>> r = e.evaluate("new-node-set()/result")
+  >>> r = e("new-node-set()/result")
   >>> print [ t.text for t in r ]
   ['Alpha', 'Beta', 'Gamma', 'Delta']

-  >>> r = e.evaluate("new-node-set()")
+  >>> r = e("new-node-set()")
   >>> print [ t.tag for t in r ]
   ['results1', 'results2', 'subresult']
   >>> print [ len(t) for t in r ]

Modified: lxml/trunk/src/lxml/html/__init__.py
==============================================================================
--- lxml/trunk/src/lxml/html/__init__.py (original)