[lxml-dev] Setting URL from lxml.html.fromstring, etc

Ian Bicking ianb at colorstudy.com
Thu Feb 28 18:06:03 CET 2008


Stefan Behnel wrote:
> Hi,
> 
> Ian Bicking wrote:
>> Stefan Behnel wrote:
>>> I also added a "base" property to Elements that is based on the xml:base
>>> attribute (or the appropriate fallback to the document URL).
>> Hmm... there's a property in lxml.html called .base_url, which
>> previously just read docinfo.URL.  Now it could read .base... but
>> obviously that's silly, as it's just an alias.
>>
>> We could deprecate .base_url in lxml.html, or rename .base as .base_url,
>> but having both ain't good.
> 
> I agree, wasn't aware of it. (Here, we are actually lucky that it wasn't
> writable already!)
> 
> But 'base' is a better name for the XML environment given 'xml:base'. It feels
> weird to set '.base_url' and have it set an xml:base attribute on the Element.
> Also, it might just be a URI, although that's unlikely.
> 
> Don't you think it should behave differently for XML and HTML? For XML, I'd
> expect it to depend on xml:base, while for HTML, it'd rather always depend on
> the document URL (and not set an xml:base attribute on assignment).

Sure, they act somewhat differently, but does it make sense to use two
different names?  I think they mean similar things in both cases, though
perhaps the per-element base attribute in HTML shouldn't be writable.
(Then again, the tree is a rather invisible object that you wouldn't
know exists except through things like docinfo.URL, though a little
documentation can fix that, of course.)
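
For concreteness, here is a rough sketch of the two properties being
compared. It assumes an lxml version in which both the element-level
.base property and lxml.html's .base_url are available; the URLs and
variable names are made up for illustration, and this is not meant as
a statement of how either property should finally behave.

```python
from lxml import etree
import lxml.html

XML_NS = "http://www.w3.org/XML/1998/namespace"

# XML: .base follows xml:base and falls back to the document URL.
root = etree.fromstring(
    b'<root><child xml:base="http://other.example/x/"/></root>',
    base_url="http://example.com/doc.xml",  # illustrative URL
)
child = root[0]
xml_root_base = root.base      # no xml:base here, so the document URL
xml_child_base = child.base    # taken from the child's xml:base attribute

# Assigning .base on an XML element writes an xml:base attribute.
root.base = "http://example.com/moved/"
root_xml_base_attr = root.get("{%s}base" % XML_NS)

# HTML: .base_url reports the URL the document was parsed with.
page = lxml.html.fromstring(
    "<html><body><p>hi</p></body></html>",
    base_url="http://example.com/page.html",  # illustrative URL
)
html_base_url = page.base_url
html_docinfo_url = page.getroottree().docinfo.URL
```

In the XML case the value can differ per element, while in the HTML
case it comes from the (otherwise invisible) tree, which is the
difference the naming question turns on.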

   Ian

