[lxml-dev] I get CDATA inside parsed html <script> element, and cannot retrieve its text
Alexander Kozlovsky
alexander.kozlovsky at gmail.com
Fri Nov 3 22:35:47 CET 2006
Hello all!
I'm very new to lxml, and I may have found a bug.
AFAIK, lxml does not expose a direct interface to CDATA sections.
But when I use the etree.HTML function, I get the content of <script>
as a CDATA section!
>>> html = etree.HTML('<script> alert("Hello!"); </script>')
>>> etree.tostring(html)
'<html><head><script><![CDATA[ alert("Hello!"); ]]></script></head></html>'
The problem is that I cannot retrieve the content of the <script> element,
because lxml does not expose it:
>>> script = html.find('.//script')
>>> len(script)
0
>>> print script.text
None
EXPECTED:
>>> print script.text
alert("Hello!");
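For comparison, here is a minimal sketch of the behaviour I expect, using the standard library's xml.etree.ElementTree rather than lxml. It only shows that the ElementTree API hands CDATA content back through .text; it says nothing about what lxml does internally. The names SCRIPT_XML and cdata_text are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The serialized <script> element exactly as etree.tostring() printed it above.
SCRIPT_XML = '<script><![CDATA[ alert("Hello!"); ]]></script>'


def cdata_text(xml_source):
    """Parse xml_source and return the .text of its root element."""
    return ET.fromstring(xml_source).text


# The CDATA section is not a child element, so len() is 0 here as well,
# but its content is still reachable through .text.
print(len(ET.fromstring(SCRIPT_XML)))
print(repr(cdata_text(SCRIPT_XML)))
```

Here the parser reports the CDATA content as ordinary character data, so .text holds the script source including the surrounding spaces.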
Is this really a bug, or am I misunderstanding something?
--
Best regards,
Alexander mailto:alexander.kozlovsky at gmail.com