[lxml-dev] HTML Meta Content-Type Tag not created as documentation states?

Stefan Behnel stefan_ml at behnel.de
Wed Oct 15 20:09:55 CEST 2008


Hi,

John Krukoff wrote:
> So, I was trying to figure out what happened to my meta tags when using
> the lxml.html module, and saw the note in the documentation that
> html.tostring will handle them like so:
> 
>> Note: if include_meta_content_type is true this will create a
>>     ``<meta http-equiv="Content-Type" ...>`` tag in the head;
>>     regardless of the value of include_meta_content_type any existing
>>     ``<meta http-equiv="Content-Type" ...>`` tag will be removed
>>     
> 
> However, that doesn't seem to actually be the case. It looks like
> etree.tostring never creates the meta tag, although html.tostring
> appears to expect it to
> [...]
> The really weird part of this for me though, is that I've set
> include_meta_content_type on my much more complicated application
> server, and it does in fact appear to be generating meta tags
> automatically (or at least something in my XSLT heavy processing chain
> is).

Your hint makes me wonder whether this functionality was lost when I
switched from the original XSLT-based generation to the one based on
tostring(method="html"). AFAIR, that was long before 2.0 was released...

I assume that HTML generation using xsl:output generates the <meta> tag and
the normal HTML serialisation does not do it. There are some new features in
libxml2 2.7.2 that would allow moving the serialisation to the xmlSave*() API,
but that's not backportable to older versions (lxml currently runs with
libxml2 2.6.21).
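
To see which libxml2 version a given lxml installation is actually running
against, something like the following should work. It is only a sketch: the
name has_272_features is made up for illustration, and the (2, 7, 2)
threshold is simply the release mentioned above.

```python
from lxml import etree

# libxml2 version lxml is running against, as a tuple such as (2, 9, 14).
runtime_version = etree.LIBXML_VERSION
# libxml2 version lxml was compiled against (may differ from the runtime one).
compiled_version = etree.LIBXML_COMPILED_VERSION

# Hypothetical flag: True if the runtime library is at least 2.7.2,
# the release named above as providing the newer serialisation features.
has_272_features = runtime_version >= (2, 7, 2)

print("libxml2 runtime version: ", runtime_version)
print("libxml2 compiled version:", compiled_version)
print("at least 2.7.2:          ", has_272_features)
```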

IMHO, your current best bet is to always serialise using XSLT if you want to
have a <meta> tag. If the obvious stylesheet that does this is parsed once up
front and reused, it shouldn't really be slower than a call to tostring().
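
A minimal sketch of that approach, assuming a plain identity stylesheet with
xsl:output method="html" counts as "the obvious stylesheet". The names
IDENTITY_HTML_XSLT and serialise_with_xslt are made up for this example.
Whether a <meta> tag actually shows up in the output depends on the installed
libxslt/libxml2, so check the result on your own setup.

```python
from lxml import etree, html

# Illustrative identity stylesheet: copies the input unchanged and only
# asks for HTML output. This is not a stylesheet shipped with lxml.
IDENTITY_HTML_XSLT = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""

# Parse and compile the stylesheet once; reuse it for every document.
_to_html = etree.XSLT(etree.XML(IDENTITY_HTML_XSLT))


def serialise_with_xslt(root):
    """Serialise an lxml HTML tree via xsl:output method="html"."""
    result = _to_html(root)
    # str() on the result tree applies the stylesheet's xsl:output settings.
    return str(result)


doc = html.fromstring(
    "<html><head><title>Test</title></head><body><p>Hi</p></body></html>"
)
output = serialise_with_xslt(doc)
print(output)
```

Compiling the stylesheet at import time keeps the per-document cost down to
the transform itself, which is the point of parsing it ahead of time.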

Stefan

