[lxml-dev] findall() returns an iterable instead of a sequence in ET 1.3

Stefan Behnel stefan_ml at behnel.de
Thu Sep 13 11:48:12 CEST 2007


Fredrik Lundh wrote:
>> I just noticed the above when I tried to copy over the new ElementPath
>> implementation from the current ET 1.3 SVN. The current ET docs of 1.2 clearly
>> state that findall() returns a sequence. I'm not questioning the new
>> behaviour, but it's not even mentioned in your "ET 1.3 intro" text.
> 
> That's probably because I dropped in the new (still pretty rough) path
> implementation after I wrote the first episode, but before I uploaded
> the code...
> 
> The ET 1.2 documentation does indeed say that findall may return a
> sequence *or* an iterator:
> 
> http://effbot.org/zone/pythondoc-elementtree-ElementTree.htm#elementtree.ElementTree._ElementInterface.findall-method

Ok, but this page doesn't (and I find it pretty visible):

http://effbot.org/elementtree/elementtree-element.htm#tag-ET.Element.findall


> but as you say, chances are that people are relying on behaviour
> rather than implementation.  Yet, it's pretty nice to have an iterator
> for things like:
> 
>    for elem in tree.findall(simple pattern):
>         check elem properties
>         if right elem:
>             break

I think the "findALL()" makes it sound like something that returns a sequence
rather than an iterator. The API shouldn't work against people's intuition.


> But maybe we could provide an "iterfind", perhaps? (that may or may
> not be the same thing as findall).

I would definitely prefer that, and I like the name already. Then, findall()
could be as simple as

   return list(self.iterfind(path))

And it /should/ do the same as findall(), as it carries "find" in its name. It
matches my intuition that findall() and iterfind() return exactly the same
results, just in the two different forms you would expect.
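
To make the proposed relationship concrete, here is a minimal sketch. It
assumes an iterfind() method that takes the same path argument as findall()
and yields matches lazily; the standard-library xml.etree.ElementTree of later
Python versions happens to provide one, so the sketch uses it. The free
function findall() below is a hypothetical helper for illustration only, not
the actual ET or lxml implementation.

```python
import xml.etree.ElementTree as ET


def findall(elem, path):
    """Hypothetical helper: findall() expressed in terms of iterfind().

    Assumes elem.iterfind(path) yields matching elements lazily.
    """
    return list(elem.iterfind(path))


root = ET.fromstring("<root><a/><b/><a/></root>")

# Lazy form: useful when the caller wants to stop at the first good match.
lazy = root.iterfind("a")
first = next(lazy)

# Eager form: a real sequence that supports len() and indexing.
eager = findall(root, "a")
```

Both forms visit the same elements in the same order; only the container
differs, which is the property argued for above.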


> fwiw, I've had the same concerns wrt the iter/getiterator changes; in
> 1.2, getiterator returned a list, not an iterator.  in 1.3a3, it's an
> alias for "elem.iter()".  maybe it should be an alias for
> "list(elem.iter())" instead?

I think that's different as people do not generally expect something called
"getiterator" to return a sequence, so as long as they don't look into the
documentation, they would not easily use it in a way that breaks after the change.

BUT, since you already deprecated getiterator() anyway, why not make it a pure
legacy function that works as it did in the early days?
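
A minimal sketch of what such a legacy shim might look like, assuming
elem.iter() is the new iterator-returning method. The free function
getiterator() here is a hypothetical stand-in written for illustration; it is
not taken from the ET or lxml sources.

```python
import xml.etree.ElementTree as ET


def getiterator(elem, tag=None):
    """Hypothetical legacy shim: return a list, as getiterator() did in ET 1.2.

    Assumes elem.iter(tag) walks the subtree in document order.
    """
    return list(elem.iter(tag))


root = ET.fromstring("<root><a><b/></a><b/></root>")

all_elems = getiterator(root)   # every element, including the root itself
only_b = getiterator(root, "b") # restricted to one tag
```

Code that indexes the result or calls len() on it would keep working with a
shim like this, while new code can call iter() directly.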

Stefan


