[lxml-dev] findall() returns an iterable instead of a sequence in ET 1.3
Stefan Behnel
stefan_ml at behnel.de
Thu Sep 13 11:48:12 CEST 2007
Fredrik Lundh wrote:
>> I just noticed the above when I tried to copy over the new ElementPath
>> implementation from the current ET 1.3 SVN. The current ET docs of 1.2 clearly
>> state that findall() returns a sequence. I'm not questioning the new
>> behaviour, but it's not even mentioned in your "ET 1.3 intro" text.
>
> That's probably because I dropped in the new (still pretty rough) path
> implementation after I wrote the first episode, but before I uploaded
> the code...
>
> The ET 1.2 documentation does indeed say that findall may return a
> sequence *or* an iterator:
>
> http://effbot.org/zone/pythondoc-elementtree-ElementTree.htm#elementtree.ElementTree._ElementInterface.findall-method
Ok, but this page doesn't (and I find it pretty visible):
http://effbot.org/elementtree/elementtree-element.htm#tag-ET.Element.findall
> but as you say, chances are that people are relying on behaviour
> rather than implementation. Yet, it's pretty nice to have an iterator
> for things like:
>
>     for elem in tree.findall(simple pattern):
>         check elem properties
>         if right elem:
>             break
I think the "findALL()" makes it sound like something that returns a sequence
rather than an iterator. The API shouldn't work against people's intuition.
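To illustrate the kind of code that could break, here is a minimal sketch using the standard library's xml.etree.ElementTree. Wrapping the result in iter() only simulates an iterator-returning findall(); that stand-in is an assumption for illustration, not the ET 1.3 code.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><item/><item/><item/></root>")

# What callers of ET 1.2 typically got: a real sequence.
as_sequence = root.findall("item")

# Stand-in for an iterator-returning findall() (assumption, see above).
as_iterator = iter(root.findall("item"))

# Sequence-style usage works on the list...
count = len(as_sequence)
last = as_sequence[-1]

# ...but the same call fails on a plain iterator.
try:
    len(as_iterator)
    iterator_has_len = True
except TypeError:
    iterator_has_len = False
```

Plain for-loops work with either return type; it is len(), indexing and repeated iteration that depend on getting a sequence.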
> But maybe we could provide an "iterfind", perhaps? (that may or may
> not be the same thing as findall).
I would definitely prefer that, and I like the name already. Then, findall()
could be as simple as
    return list(self.iterfind())
And it /should/ do the same as findall(), as it carries "find" in its name. It
meets my intuition that findall() and iterfind() return exactly the same
results, just in the expected different ways.
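A minimal sketch of that relationship, assuming an iterfind() that behaves like the one in the standard library's xml.etree.ElementTree. The helper name findall_as_list is hypothetical and only stands in for what findall() would do internally.

```python
import xml.etree.ElementTree as ET

def findall_as_list(elem, path):
    # Hypothetical helper: the eager variant is just the lazy one, materialised.
    return list(elem.iterfind(path))

root = ET.fromstring("<root><a/><b/><a/></root>")

lazy = root.iterfind("a")            # yields matches one at a time
eager = findall_as_list(root, "a")   # same matches, as a list

first = next(lazy)                   # only the first match has been produced so far
```

Both calls visit the same elements in the same order; the only difference is whether the caller gets them all at once or on demand.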
> fwiw, I've had the same concerns wrt the iter/getiterator changes; in
> 1.2, getiterator returned a list, not an iterator. in 1.3a3, it's an
> alias for "elem.iter()". maybe it should be an alias for
> "list(elem.iter())" instead?
I think that's different as people do not generally expect something called
"getiterator" to return a sequence, so as long as they don't look into the
documentation, they would not easily use it in a way that breaks after the change.
BUT, since you already deprecated getiterator() anyway, why not make it a pure
legacy function that works as it did in the early days?
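A sketch of what such a legacy wrapper could look like, written against the standard library's xml.etree.ElementTree. The function name legacy_getiterator is hypothetical; it only mirrors the "list(elem.iter())" idea quoted above.

```python
import xml.etree.ElementTree as ET

def legacy_getiterator(elem, tag=None):
    # Hypothetical legacy wrapper: return a list, as getiterator() did in 1.2,
    # built on top of the lazy iter().
    return list(elem.iter(tag))

root = ET.fromstring("<root><a><b/></a><b/></root>")

all_elems = legacy_getiterator(root)    # every element, in document order
only_b = legacy_getiterator(root, "b")  # restricted to one tag
all_tags = [e.tag for e in all_elems]
```

Old code that indexes or measures the result keeps working, while new code can call iter() directly.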
Stefan