[lxml-dev] .base and docinfo.URL
Stefan Behnel
stefan_ml at behnel.de
Tue Mar 11 10:35:01 CET 2008
Hi,
fixed on the trunk.
Stefan
Stefan Behnel wrote:
> Ian Bicking wrote:
>> Does .base inherit from docinfo.URL? It doesn't seem like it does. I
>> tried changing .base_url to just return self.base, but if I do:
>>
>> >>> from lxml.html import parse
>> >>> doc = parse('http://python.org').getroot()
>> >>> print(doc.base)
>> None
>> >>> doc.getroottree().docinfo.URL
>> 'http://python.org'
>
> I just checked the libxml2 source; it actually behaves completely differently
> for HTML documents. Here, it looks for
>
> <html><head><base href="...">
>
> and takes that. It completely ignores the document URL for HTML.
>
> I think it would be good to override that (directly in etree), so that it
> returns the document URL if nothing is returned from the base search. That
> way, it's consistent with the fallback in XML.
>
> Stefan
>
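For readers following the thread, the fallback proposed above can be sketched
roughly as follows. This is only an illustration of the lookup order (first
<base href="...">, then the document URL); it is not the change that went into
lxml. It uses the standard library's html.parser instead of libxml2, the names
effective_base and _BaseHrefFinder are made up for this example, and unlike
libxml2 it does not check that <base> sits inside <html><head>.

```python
from html.parser import HTMLParser


class _BaseHrefFinder(HTMLParser):
    """Hypothetical helper: record the href of the first <base> element."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling us.
        if tag == "base" and self.href is None:
            self.href = dict(attrs).get("href")


def effective_base(html_text, document_url):
    """Return the <base href> value if there is one, else the document URL.

    Assumption: document_url plays the role of docinfo.URL from the thread.
    """
    finder = _BaseHrefFinder()
    finder.feed(html_text)
    finder.close()
    # Fall back to the document URL when the base search found nothing.
    return finder.href if finder.href else document_url
```

Under this sketch, a page without a <base> element that was fetched from
'http://python.org' would report 'http://python.org' as its base instead of
None, which matches the fallback already used for XML.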