[kupu-dev] Kupu says links are bad when they aren't
sisi
sisi at foei.org
Thu Apr 12 14:33:16 CEST 2007
Hi all,
We're trying to migrate our content from our flat HTML site into a
Plone site, and while using the Kupu relative-path-to-UID tool we've
come up against some strange problems.
Our Plone site setup is:
Zope 2.9.6-final, Plone 2.5.2, Python 2.4.4, Linux, Kupu 1.4 (svn, trunk
- Revision: 41990)
All files have been dropped in using both FTP and WebDAV (separate
experiments).
The link checker in Kupu erroneously flags some links (relative, site
internal) as bad. Some common elements are:
. they involve GIFs (and sometimes JPGs, we think)
. the link is often a combination of anchor and img, like:
<a href="..."><img src="../../images/something.gif"></a>
. after editing with Kupu (without making any changes) and saving, these
links become:
<a><img src="../../images/something.gif"></a> (the href has
silently been removed)
So although Kupu flags these links as bad (they are not, and they all
sit in either href or src), as soon as we have saved a page and rerun
the Kupu links script, the links get UID'ed and are no longer marked as
bad. It seems that one small element that Kupu does not like on a page
is causing it to ignore all the relative paths on that page.
Since we have close to 4500 pages we cannot save every page in our
migrated content and then rerun the links scripts. At least, we'd rather
not!
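
To narrow down which of the flat pages contain the pattern described
above, something like the following could be run over the original HTML
files before import. It is only a rough sketch: it assumes a modern
Python 3 run outside Zope (not the Python 2.4.4 of the site itself), and
the names AnchorImageFinder and anchor_images are made up for
illustration. It says nothing about how Kupu itself checks links.

```python
from html.parser import HTMLParser


class AnchorImageFinder(HTMLParser):
    """Collect (href, img src) pairs for every <img> nested inside an <a>."""

    def __init__(self):
        super().__init__()
        self.open_hrefs = []  # href values of the <a> tags currently open
        self.found = []       # (href or None, img src or None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            # href is None when the anchor has no href attribute at all
            self.open_hrefs.append(attrs.get("href"))
        elif tag == "img" and self.open_hrefs:
            self.found.append((self.open_hrefs[-1], attrs.get("src")))

    def handle_endtag(self, tag):
        if tag == "a" and self.open_hrefs:
            self.open_hrefs.pop()


def anchor_images(html):
    """Return the (href, src) pairs found in one page of HTML."""
    finder = AnchorImageFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

A pair whose first item is None is an anchor that wraps an image but has
no href, i.e. the state our pages end up in after a save in Kupu.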
Other file types (nothing specific) are also involved, but not as
frequently. Furthermore, because Kupu believes these links are bad, it
will not convert them to UIDs.
One point of note is that, for some unknown reason, the content type
registry was empty on first loading the files. This peculiarity has been
noted by several people on the net, and is easily fixed by uninstalling
and reinstalling the ATContentTypes product. After fixing this, the
problem went away for many of the PDFs, for instance, but it still
persists (even after repopulating the folders via FTP or WebDAV) for
many GIFs (and still some JPEGs, PDFs and normal href links).
If anyone has any ideas about what might be causing this problem please
send them on. We think there are 3900 pages with bad links on them, and
each page has several bad links. One thing that is throwing us off the
scent is that some pages have a mixture of resolved uids and relative
paths after running the scripts.
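
To see how mixed a given page is, the links in its saved HTML could be
counted by kind. Again this is just a sketch under assumptions: it
treats any link containing "resolveuid/" as one that has been converted
to a UID (that is our understanding of what the converted links look
like), it is written for a modern Python 3 run outside Zope, and the
names LinkCollector, classify and summarize are made up for
illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect every non-empty href and src value in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def classify(link):
    """Sort one link into 'uid', 'external', 'fragment' or 'relative'."""
    if "resolveuid/" in link:  # assumption: marks a UID-converted link
        return "uid"
    parsed = urlparse(link)
    if parsed.scheme or parsed.netloc:
        return "external"
    if link.startswith("#"):
        return "fragment"
    return "relative"


def summarize(html):
    """Count the links of each kind in one page of HTML."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    counts = {"uid": 0, "relative": 0, "external": 0, "fragment": 0}
    for link in collector.links:
        counts[classify(link)] += 1
    return counts
```

A page where both the "uid" and "relative" counts are non-zero after the
scripts have run would be one of the mixed pages mentioned above.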
One question we have is:
Where is the code that identifies links for checking? Is it in kupu or
is kupu calling it from a function or something in Plone? Because we'd
like to look at the code and make some progress that way but we can't
find it, using all our ninja powers of grep and find etc :-)
Cheers,
sisi
--
# sisi nutt # extranet coordinator
# Friends of the Earth International
# PO Box 19199 # 1000 GD Amsterdam # The Netherlands
# Tel 31 20 6221369 # Fax 31 20 6392181 # http://www.foei.org
# email sisi at foei.org # skype foei_sisi