[ftputil] Making path.walk go faster.
Stefan Schwarzer
sschwarzer at sschwarzer.net
Wed Nov 30 23:08:12 CET 2005
Hi Ido,
On 2005-11-30 13:26, Ido Abramovich wrote:
> Hi, First, thanks for ftputil, it saved me a bunch of
> time in my current project :)
that's intentional ;-)
> In my project, I'm using path.walk a lot (I'm
> traversing over an ftp directory and deciding on each
> file if it needs to be downloaded or not), but I found
> that path.walk is a bit too slow for my needs.
>
> So I started to poke around a bit with the source and
> found that when you perform a walk you get the list of
> the directory and then perform a stat on each file, so
> you perform N=D+F network connections (D=number of
> dirs, F=number of files). You could lower this number
> to only D if you remember the result of the stat
> operation in listdir and use it in walk.
Yes, the current implementation of path.walk is more of a
proof of principle. I'm aware of the repeated "lookups" but,
I must admit, haven't added caching so far.
> I've done a small patch that adds this functionality
> without changing anything else:
> [patch snipped]
Thanks a lot for your work. I'll look into it. If I use it
or a variation of it, you wouldn't mind having the code
included in an upcoming ftputil release, would you? :-)
> 2) on a small check I did, this little hack is about 8
> times faster than the current implementation.
I assumed that it would be a lot faster with caching, but
it's interesting to know a number.
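
For illustration, here is a minimal sketch of the caching idea
discussed above. It is neither the snipped patch nor ftputil's
actual code. FakeHost, list_with_types, walk_uncached and
walk_cached are hypothetical names, and the "server" is just a
dictionary that counts simulated requests. It assumes that a
single directory listing already reveals whether each entry is
a directory, so no per-entry stat is needed afterwards.

```python
import posixpath


class FakeHost:
    """Hypothetical stand-in for an FTP host; counts simulated requests."""

    def __init__(self, tree):
        # tree maps a directory path to {entry name: is_directory}
        self.tree = tree
        self.requests = 0

    def list_with_types(self, path):
        # One simulated listing request: names plus their types.
        self.requests += 1
        return dict(self.tree[path])

    def isdir(self, path):
        # One simulated request per call, as in the uncached variant.
        self.requests += 1
        parent, name = posixpath.split(path)
        return self.tree[parent][name]


def walk_uncached(host, top):
    # List the directory, then ask the host about every single entry.
    entries = host.list_with_types(top)
    dirs, files = [], []
    for name in sorted(entries):
        if host.isdir(posixpath.join(top, name)):
            dirs.append(name)
        else:
            files.append(name)
    yield top, dirs, files
    for name in dirs:
        yield from walk_uncached(host, posixpath.join(top, name))


def walk_cached(host, top):
    # Reuse the type information that came with the listing itself.
    entries = host.list_with_types(top)
    dirs = sorted(n for n, is_dir in entries.items() if is_dir)
    files = sorted(n for n, is_dir in entries.items() if not is_dir)
    yield top, dirs, files
    for name in dirs:
        yield from walk_cached(host, posixpath.join(top, name))


tree = {
    "/": {"a": True, "f1": False},
    "/a": {"f2": False, "f3": False},
}
slow, fast = FakeHost(tree), FakeHost(tree)
slow_result = list(walk_uncached(slow, "/"))
fast_result = list(walk_cached(fast, "/"))
print(slow.requests, fast.requests)
```

Both walks yield the same (top, dirs, files) tuples. The cached
variant issues one request per directory, while the uncached one
adds a request for every entry it finds.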
Stefan