[lxml-dev] Bug in XPath evaluation - not a bug :)
Torsten Rehn
scel at users.sourceforge.net
Tue Apr 24 17:19:07 CEST 2007
On Mon, 2007-04-23 at 22:13 +0200, Martijn Faassen wrote:
> > So you're really ignoring the namespace and just looking at the prefix? That's
> > definitely an unusual use case.
>
> Agreed, that is indeed odd. Makes me want to find out more. :) You have
> documents that use namespaces extensively, but they vary widely in the
> kinds of namespace URIs they use for the same prefixes? How did you
> arrive in such a situation?
I think we have a slight misunderstanding here. In my situation, each
prefix belongs to exactly one namespace. Here's an example of what I'd
like to do:
Let's say there is a store that has both a print catalogue and an online
shop.
For whatever reason (this is a very stupid example) we want some of the
items being sold to appear in the print catalogue and some others in the
eshop.
Here is the XML data that describes the items we sell:
<itemlist>
  <item><!-- this is a print item -->
    <name>TurboItem</name>
    <price>23</price>
  </item>
  <item><!-- this is an eshop item -->
    <name>SuperItem</name>
    <price>42</price>
  </item>
</itemlist>
Now I want some way to "tag" each item either for print or eshop. But
(and here's the twist) without altering the structure of the XML data.
That means that I can't add an attribute to each <item> element or
"encapsulate" the items like this:
<thisgoestoprint>
  <item>...</item>
</thisgoestoprint>
<thisgoestoeshop>
  <item>...</item>
</thisgoestoeshop>
However, adding namespace prefixes (and their xmlns definitions) is
acceptable.
If it had worked the way I intended it to in the beginning, the XPath
expression "//print:item" would have returned all items that go into the
print catalogue.
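To illustrate why this does not work with the prefix alone, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra dependencies; the namespace URIs and the variable names are made up for this example. It suggests that the prefix in the expression is resolved through the mapping handed to the query, so any prefix bound to the same URI selects the same elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this illustration.
PRINT_NS = "http://example.com/ns/print"
ESHOP_NS = "http://example.com/ns/eshop"

# The item list from above, tagged with prefixes instead of new structure.
doc = ET.fromstring(
    f'<itemlist xmlns:print="{PRINT_NS}" xmlns:eshop="{ESHOP_NS}">'
    "<print:item><name>TurboItem</name><price>23</price></print:item>"
    "<eshop:item><name>SuperItem</name><price>42</price></eshop:item>"
    "</itemlist>"
)

# The prefix used in the expression must be mapped to a URI explicitly.
by_doc_prefix = doc.findall(".//print:item", {"print": PRINT_NS})

# A different prefix bound to the same URI matches the same element,
# because matching happens on the namespace URI, not on the prefix.
by_other_prefix = doc.findall(".//p:item", {"p": PRINT_NS})

tagged_names = [el.findtext("name") for el in by_doc_prefix]
```

As far as I understand it, lxml's .xpath() takes a comparable prefix-to-URI mapping through its namespaces argument.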
Now why do I want to avoid using the namespace URIs in the expression?
In what I'm actually up to, there are a lot more options than just print
and eshop. It should be easy for users to handle a larger number of these
"options", and requiring them to write out namespace URIs just isn't
convenient. Prefixes, however, are.
The only solution I see right now is to scan the XML data prior to the
XPath query in order to map each prefix to its namespace-uri.
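A rough sketch of that scanning step, under some assumptions: it again uses the standard library's xml.etree.ElementTree rather than lxml, the helper name parse_with_prefix_map and the namespace URIs are invented for this example, and it assumes that no prefix is bound to two different URIs within one document (which holds in my situation).

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the namespace URIs are invented for illustration.
XML = b"""<itemlist xmlns:print="http://example.com/ns/print"
          xmlns:eshop="http://example.com/ns/eshop">
  <print:item><name>TurboItem</name><price>23</price></print:item>
  <eshop:item><name>SuperItem</name><price>42</price></eshop:item>
</itemlist>"""


def parse_with_prefix_map(data):
    """Parse XML bytes and collect every prefix -> namespace URI declaration."""
    prefixes = {}
    root = None
    events = ("start", "start-ns")
    for event, payload in ET.iterparse(io.BytesIO(data), events=events):
        if event == "start-ns":
            prefix, uri = payload
            if prefix:  # skip the default namespace (empty prefix)
                # Keep the first binding seen for each prefix.
                prefixes.setdefault(prefix, uri)
        elif root is None:
            root = payload  # the first "start" event is the root element
    return root, prefixes


root, prefixes = parse_with_prefix_map(XML)

# Users can now write prefixes only; the URIs come from the scanned map.
print_items = root.findall(".//print:item", prefixes)
names = [item.findtext("name") for item in print_items]
```

The collected dictionary could presumably be passed to lxml's .xpath() in the same way, so the user-facing expressions stay as short as "//print:item".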
I do understand now that this is such an exotic use case that it
wouldn't make much sense to have lxml do these mappings automatically if
the second argument of .xpath() is omitted.
The reason I gave this rather lengthy example was to find out if anyone
reading this has an idea of an alternative solution for my problem
(applying metadata to specific parts of an XML document without making
the XPath expressions to address these parts too complex).
Regards,
Torsten