[lxml-dev] c14n, pretty printing and diffing
Olivier Collioud
Olivier.Collioud at wipo.int
Tue Feb 12 07:26:20 CET 2008
Thanks Stefan.
I prefer visual diffing: the tools provided by Eclipse, TkDiff or
WinMerge.
I did not find any documentation or usage example for lxml.usedoctest.
Could you please give me some pointers?
Let me share my solution based on iterparse. It is simple because I do
not use any namespaces, PIs, comments...:
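# Assumes an ElementTree-compatible module, e.g.
# "from xml.etree import ElementTree"; inputFile is a path and
# outputFile an already open file (Python 2 style byte strings).
depth = 0
sourceTree = ElementTree.iterparse(open(inputFile, 'r'),
                                   events=("start", "end"))
for event, elem in sourceTree:
    if event == "start":
        i = "\n" + depth*" "
        depth += 1
        outputFile.write('%s<%s' % (i, elem.tag))
        # sorted attributes give a stable order; 'size' is left out
        attrs = sorted(elem.items())
        if attrs:
            outputFile.write(' ')
            outputFile.write(' '.join(['%s="%s"' % (a[0], a[1]) for
                                       a in attrs if a[0] != 'size']))
        # caveat: on "start", text and children are only visible if
        # the parser has already read that far into the input
        if elem.text and elem.text.strip():
            outputFile.write('>%s' %
                             elem.text.strip('\n').encode('utf-8'))
        elif len(elem):
            outputFile.write('>')
    if event == "end":
        depth -= 1
        # recompute the indent for this element; the value left over
        # from the last "start" belongs to the deepest child
        i = "\n" + depth*" "
        if (elem.text and elem.text.strip()) or len(elem):
            outputFile.write('%s</%s>' % (i, elem.tag))
        else:
            outputFile.write('/>')
        if elem.tail and elem.tail.strip():
            outputFile.write(elem.tail.strip('\n').encode('utf-8'))
        elem.clear()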
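For comparison, here is a minimal sketch of the same idea for Python 3. It walks a fully parsed tree instead of using iterparse, so it trades the low memory use of the streaming version for not depending on what is visible at "start" time. The function name write_sorted and the skip_attrs parameter are made up for this illustration; like the loop above, it assumes no namespaces, PIs or comments, and it strips all surrounding whitespace from text.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def write_sorted(elem, out, depth=0, skip_attrs=("size",)):
    """Write elem one tag per line, attributes sorted by name.

    Hypothetical helper: skip_attrs mirrors the 'size' filter of the
    loop above and is an assumption of this sketch.
    """
    indent = "\n" + depth * " "
    out.write("%s<%s" % (indent, elem.tag))
    for name, value in sorted(elem.items()):
        if name not in skip_attrs:
            out.write(" %s=%s" % (name, quoteattr(value)))
    text = (elem.text or "").strip()
    if not text and len(elem) == 0:
        # no text and no children: write an empty-element tag
        out.write("/>")
        return
    out.write(">" + escape(text))
    for child in elem:
        write_sorted(child, out, depth + 1, skip_attrs)
        tail = (child.tail or "").strip()
        if tail:
            out.write(escape(tail))
    out.write("%s</%s>" % (indent, elem.tag))


# Usage: two spellings of the same document produce identical text.
first = io.StringIO()
second = io.StringIO()
write_sorted(ET.fromstring('<doc b="2" a="1"><item>x</item></doc>'), first)
write_sorted(ET.fromstring('<doc a="1"  b="2">\n<item>x</item>\n</doc>'), second)
print(first.getvalue() == second.getvalue())
```

The two outputs can then be handed to any text or visual diff tool.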
Olivier.
>>> Stefan Behnel <stefan_ml at behnel.de> 11/02/08 7:56 pm >>>
Hi,
Olivier Collioud wrote:
> I would like to use my favourite text diffing tool to compare XML
> files.
Which is not lxml.html.diff, I assume? (I'm not sure how HTML specific
that is, BTW). Also, for doctests, there is lxml.usedoctest that you can
import (the lxml web pages use it for doctests).
> Is there a way to produce a pretty printed canonical version of my XML
> files using lxml ?
Not using the c14n interface (libxml2's C14N serialiser doesn't support
pretty printing). Serialising by hand is not too hard, though. You can
look at ElementTree._write() for an example:
http://svn.effbot.org/public/elementtree/elementtree/ElementTree.py
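
As a rough illustration of combining the two steps without writing a serialiser, here is a sketch that uses the Python standard library's xml.etree.ElementTree rather than lxml's c14n interface. It assumes Python 3.9 or later (ElementTree.indent() and ElementTree.canonicalize() are not available before that), and the helper name pretty_canonical is made up for this example. Note that re-indenting changes whitespace, so the result is a normalised form for diffing, not the canonical form of the original document.

```python
import xml.etree.ElementTree as ET


def pretty_canonical(xml_text, space="  "):
    """Return an indented, C14N-serialised version of xml_text.

    Hypothetical helper for diffing; requires Python 3.9+.
    """
    root = ET.fromstring(xml_text)
    # Replace whitespace-only text/tail with uniform indentation.
    ET.indent(root, space=space)
    # C14N serialisation sorts attributes and expands empty elements.
    return ET.canonicalize(ET.tostring(root, encoding="unicode"))


# Usage: attribute order and layout differences disappear.
left = pretty_canonical('<r b="2" a="1"><c/></r>')
right = pretty_canonical('<r a="1"   b="2">\n      <c></c>\n</r>')
print(left == right)
```

Elements with mixed content keep their text, so only whitespace-only gaps between tags are rewritten.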
Stefan