[lxml-dev] clean_html
Stefan Behnel
stefan_ml at behnel.de
Wed Jun 24 15:18:56 CEST 2009
Piet van Oostrum wrote:
>>>>>> Francesco <cattafra at hotmail.com> (F) wrote:
>
>>F> Thank you very much for your answers!
>>F> The html string is read from a file with:
>>F> inputfile = "test.txt"
>>F> # where test.txt contains "<title>My site » Homepage</title>"
>>F> input = open(inputfile, "rb")
>>F> html = input.read()
>
> Why do you use "rb"?
Because the file contains byte-encoded data.
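
To illustrate the point, here is a minimal sketch in Python 3 syntax using only the standard library. It assumes the file is UTF-8 encoded, which the thread does not state, and it uses a temporary file as a stand-in for test.txt. Reading with "rb" returns the raw bytes unchanged, so decoding stays an explicit step (or can be left to the parser).

```python
import os
import tempfile

# Sample content from the quoted message; \u00bb is the "»" character.
text = "<title>My site \u00bb Homepage</title>"

# Assumption: the file on disk is UTF-8 encoded (not stated in the thread).
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(text.encode("utf-8"))
    path = tmp.name  # hypothetical stand-in for "test.txt"

try:
    # "rb" opens the file in binary mode: no decoding, no newline translation.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

# raw is a byte string; "»" is stored as the two bytes 0xC2 0xBB in UTF-8.
# Turning it into text requires naming the encoding explicitly.
decoded = raw.decode("utf-8")
```

Opening the same file in text mode would instead decode it with whatever default encoding the platform uses, which may not match the file.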
Stefan