[ftputil] host.download_if_newer problems
Yvan Strahm
yvan.strahm at bccs.uib.no
Thu May 22 08:54:17 CEST 2008
Stefan Schwarzer wrote:
> Hi Yvan,
>
> On 2008-05-20 09:35, Yvan Strahm wrote:
>> Thanks for the reply.
>
> You're welcome :-)
>
>> Yes, the compressed files are on the server (host) and the client would
>> have the uncompressed ones.
>> At the moment I am downloading the compressed files and uncompressing
>> them on the client. My main problem is hard drive space, because to avoid
>> unnecessary downloads I am keeping both the compressed and the
>> uncompressed files on the client. I tried to use host.path.getmtime() to
>> get the time from the host and compare it to the time given by stat on
>> the client for the same file, but as soon as the files are uncompressed
>> the timestamp changes.
>
> If you show me your code, I may be able to suggest
> improvements.
>
>> Or am I completely wrong or on the wrong path here?
>
> If you uncompress from a program, you could read and
> remember the timestamp of last modification, uncompress
> the file and set the modification timestamp to the
> remembered value. So the uncompressed file would have the
> same timestamp as the source file. (Or _almost_ the same
> timestamp because the timestamps fetched from the server
> can only be exact to a minute or even only a day,
> depending on the remote directory listing.)
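A minimal sketch of that idea, using only the standard library. The function name gunzip_keep_mtime and its parameters are hypothetical; mtime is assumed to be a timestamp in seconds since the epoch that was read before uncompressing, for example from host.path.getmtime().

```python
import gzip
import os
import shutil


def gunzip_keep_mtime(gz_path, out_path, mtime):
    """Uncompress gz_path to out_path and stamp out_path with mtime."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Set both access and modification time to the remembered value
    # (seconds since the epoch).
    os.utime(out_path, (mtime, mtime))
```

After the call, os.path.getmtime(out_path) reports the remembered value, so the compressed copy could be deleted without losing the timestamp.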
>
> If the server and client are in different timezones, you
> will have to account for the time shift. See the time
> shift-related methods of FTPHost and, for examples of how
> they are used, download_if_newer and upload_if_newer.
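The arithmetic behind such a comparison might look like the sketch below. It does not call ftputil; remote_is_newer and its parameters are hypothetical names, and time_shift is assumed to mean server time minus client time in seconds. Whether your ftputil version already applies the shift to the values returned by host.path.getmtime() is worth checking before using something like this.

```python
def remote_is_newer(remote_mtime, local_mtime, time_shift=0.0, tolerance=0.0):
    """Return True if the remote file is newer than the local one.

    time_shift: assumed to be server time minus client time, in seconds.
    tolerance: differences up to this many seconds are ignored.
    """
    # Express the remote timestamp in terms of the client's clock.
    remote_in_client_time = remote_mtime - time_shift
    return remote_in_client_time - local_mtime > tolerance
```

With a server that is one hour ahead (time_shift=3600), a remote timestamp of 4600 and a local timestamp of 1000 describe the same moment, so nothing would be downloaded.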
>
> In case you have trouble modifying the timestamps, you
> could also store them as text in a file, e. g. in the
> format
>
> 2008-05-20 17:00 my_file.gz
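Storing the timestamps in that format could be sketched as follows. The helper names write_timestamps and read_timestamps are made up for this example, and the timestamps are assumed to be seconds since the epoch, interpreted in the client's local time.

```python
import time

TIME_FORMAT = "%Y-%m-%d %H:%M"


def write_timestamps(path, mtimes):
    """Write {filename: epoch seconds} as lines like '2008-05-20 17:00 my_file.gz'."""
    with open(path, "w") as f:
        for name in sorted(mtimes):
            stamp = time.strftime(TIME_FORMAT, time.localtime(mtimes[name]))
            f.write("%s %s\n" % (stamp, name))


def read_timestamps(path):
    """Read the file back into a {filename: epoch seconds} dictionary."""
    mtimes = {}
    with open(path) as f:
        for line in f:
            # The filename comes last, so it may contain spaces itself.
            date, clock, name = line.rstrip("\n").split(" ", 2)
            parsed = time.strptime(date + " " + clock, TIME_FORMAT)
            mtimes[name] = time.mktime(parsed)
    return mtimes
```

The format only keeps minutes, so seconds are lost on the round trip. That matches the precision of most remote directory listings anyway.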
>
>> I don't really understand the file-like object term,
>
> You can use FTPHost.file to construct a file(-like) object,
> just as Python's "open" function does. However, while Python's
> "open" call refers to local files, ftputil's files are opened
> on the remote host.
>
> For example, you could do with ftputil:
>
> import ftputil
>
> host = ftputil.FTPHost(server, userid, password)
> # write access and binary transfer are supported, too
> f = host.file("remote_file.txt")
> # gets the lines of the remote file one by one
> for line in f:
>     print line,
> f.close()
> host.close()
>
> As you see, just like in Python, you can iterate over the
> file's lines. You can also read a number of bytes with
> f.read(number). The remote file is _not_ copied to the
> file system of the client unless you put it there yourself.
>
>> does it imply to
>> uncompressed to a tmp folder, compare the date, delete if not newer or
>> download if newer?
>
> The idea was to open a file-like object similar to the
> code above and "pipe" it into a generator which uncompresses
> the stream's data and writes the uncompressed data to the
> disk.
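One way to sketch such a generator with the standard library is shown below. It works on any binary file-like object with a read() method, so a remote file opened in binary mode should fit, but that part is untested here. The names gunzip_chunks and save_uncompressed are hypothetical, and the sketch assumes a single-member gzip file.

```python
import zlib


def gunzip_chunks(fobj, chunk_size=64 * 1024):
    """Yield uncompressed chunks from a gzip-compressed binary file-like object."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        chunk = fobj.read(chunk_size)
        if not chunk:
            break
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit whatever is still buffered inside the decompressor.
    tail = decompressor.flush()
    if tail:
        yield tail


def save_uncompressed(fobj, target_path):
    """Write the uncompressed stream to target_path, chunk by chunk."""
    with open(target_path, "wb") as target:
        for data in gunzip_chunks(fobj):
            target.write(data)
```

Because the data is uncompressed while it is being read, the compressed file never has to be stored on the client.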
>
> Would you mind me sending your mail and my reply to the
> ftputil list? The information may be useful to others.
>
> Best regards,
> Stefan
Yes, no problem with sending the mails to the list. I thought the reply
went to the mailing list, sorry.
I guess I will just compare the date of the compressed file on the
server with the date of the uncompressed file on the client; if the
server file is newer, it will be downloaded.
This is how I thought it could be done:
1. getting a dictionary (key: filename, value: date) from the server:
   import ftputil

   host = ftputil.FTPHost(ftp, user, password)
   host.chdir(ftp_dir)
   files = host.listdir(host.curdir)
   # map each filename to its modification time on the server
   server_mtimes = {}
   for f in files:
       server_mtimes[f] = host.path.getmtime(f)
   host.close()
2. getting an equivalent dictionary from the client
3. comparing both dictionaries:
   if not present on the client, add the file to a list
   if server date > client date, add the file to a list
4. reopening the ftp connection and downloading the files in the list to_be_download
but it doesn't work at all!
The problems are the size of the dictionaries (more than 50'000 files)
and how to compare the files efficiently. And the main problem is the ftp
connection dying.
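For the comparison in step 3, a sketch along these lines may be enough. Dictionary lookups take roughly constant time, so 50'000 entries should not be a problem in themselves. The function name files_to_download is hypothetical, and both dictionaries are assumed to use the same keys, e.g. the server's file names, so the client side would have to map its uncompressed names back to the ".gz" names first.

```python
def files_to_download(server_mtimes, client_mtimes):
    """Return names missing on the client or newer on the server, sorted."""
    to_download = []
    for name, server_mtime in server_mtimes.items():
        client_mtime = client_mtimes.get(name)
        # Download if the client has no copy or only an older one.
        if client_mtime is None or server_mtime > client_mtime:
            to_download.append(name)
    return sorted(to_download)
```

For example, files_to_download({"a.gz": 10, "b.gz": 5, "c.gz": 7}, {"a.gz": 3, "b.gz": 5}) gives ["a.gz", "c.gz"]: a.gz is newer on the server and c.gz is missing on the client.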
Best regards
yvan