[icalendar-dev] Possible bug: Linebreaks mess up unicode?

Lennart Regebro regebro at gmail.com
Wed Nov 22 19:44:55 CET 2006


On 10/19/06, Brad <ykardia at gmail.com> wrote:
> Hi,
>
> just tried loading a python-icalendar-generated ics file in the new
> Sunbird, and it complained that it wasn't utf-8. I investigated, and
> found that it looks as if the unicode is possibly messed up by the
> linebreaking. The same file shows up fine in Apple iCal and in Novell
> Evolution, so it is possible that Sunbird is just being too strict.

Well, it probably tries to UTF-8 decode it before it unfolds. I seem
to remember some versions of Sunbird not unfolding lines at all, so
that may be the issue.
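
To illustrate why the order matters, here is a minimal sketch (not
Sunbird's or icalendar's actual code; the helper name unfold and the
sample line are made up for the example). It shows a line that was
folded between the two octets of "ü": decoding before unfolding fails,
while unfolding first works.

```python
def unfold(raw: bytes) -> bytes:
    """Remove iCalendar line folds: a CRLF followed by one space or tab."""
    return raw.replace(b"\r\n ", b"").replace(b"\r\n\t", b"")


# "ü" is two octets in UTF-8 (0xC3 0xBC); this fold falls between them.
folded = b"SUMMARY:M\xc3\r\n \xbcnchen"

# Decoding the folded octets directly fails, because 0xC3 is not
# followed by a continuation octet.
try:
    folded.decode("utf-8")
    decoded_before_unfolding = True
except UnicodeDecodeError:
    decoded_before_unfolding = False

# Unfolding first rejoins the two octets, so decoding succeeds.
summary = unfold(folded).decode("utf-8")
```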

However, folding in the middle of a UTF-8 character is bad, and
discussion on the ietf-calsify list has clarified the spec to say that
you are not allowed to do this (although you are allowed to split in
the middle of a combining character sequence).
The reason for the restriction is that the file should still be
viewable in UTF-8 aware editors and viewers.

This means that if you write ü as a Unicode combining sequence (u
followed by a combining diaeresis, U+0308), the line may be folded
between those two characters. The reason this is allowed is that
combining sequences (at least in theory) can be longer than 75 octets.
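
As a rough sketch of the folding rule described above (this is not the
code that went into trunk; the function name fold_line and the default
limit parameter are assumptions for illustration), one way to fold at
75 octets without cutting a UTF-8 character in half is to back up over
continuation octets before each fold:

```python
def fold_line(line: str, limit: int = 75) -> bytes:
    """Fold one content line so that no physical line exceeds `limit`
    octets and no fold lands inside a multi-octet UTF-8 character."""
    data = line.encode("utf-8")
    chunks = []
    start = 0
    room = limit  # the first physical line has no leading space
    while len(data) - start > room:
        end = start + room
        # Continuation octets look like 0b10xxxxxx. If the octet at the
        # cut point is one, back up until the cut sits before a lead octet.
        while (data[end] & 0xC0) == 0x80:
            end -= 1
        chunks.append(data[start:end])
        start = end
        room = limit - 1  # later lines begin with a one-octet space
    chunks.append(data[start:])
    return b"\r\n ".join(chunks)
```

Note that this only protects single code points. A line such as
"X" * 74 + "u\u0308" is still folded between the u and the combining
diaeresis, which is the case the clarified spec permits.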

So, I have now implemented this in trunk (and it will be included in
the 1.1 release I plan to do Really Soon Now).

I think that makes trunk Bug Free (tm). :) Or does anybody know of any
other issues?

-- 
Lennart Regebro, Nuxeo     http://www.nuxeo.com/
CPS Content Management     http://www.nuxeo.org/

