[lxml-dev] Bug in lxml.html with nameless form submit?

Stefan Behnel stefan_ml at behnel.de
Mon Feb 11 12:12:49 CET 2008


Hi,

Paul Winkler wrote:
> This is a pretty common idiom in html - to have a submit button that
> has no name:
> 
>   <input type="submit" />

Definitely valid HTML:

http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.4


> But lxml.html barfs if you use the form's fields.items() or
> fields.values() method:
> 
>>>> import lxml.html
>>>> tree = lxml.html.fromstring('''
> ... <html><body>
> ...  <form>
> ...   <input name="foo" value="bar"/>
> ...   <input type="submit" />
> ...  </form>
> ... </body></html>
> ... ''')
>>>> tree
> <Element html at 2b58b8edea78>
>>>> tree.forms
> tree.forms
>>>> tree.forms[0]
> <Element form at 2b58b8edec80>
>>>> tree.forms[0].fields
> <FieldsDict for form 0>
>>>> tree.forms[0].fields.keys()
> [None, 'foo']
>>>> tree.forms[0].fields.items()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.4/UserDict.py", line 112, in items
>     return list(self.iteritems())
>   File "/usr/lib/python2.4/UserDict.py", line 101, in iteritems
>     yield (k, self[k])
>   File "/usr/lib64/python2.4/site-packages/lxml-2.0-py2.4-linux-x86_64.egg/lxml/html/__init__.py", line 749, in __getitem__
>     return self.inputs[item].value
>   File "/usr/lib64/python2.4/site-packages/lxml-2.0-py2.4-linux-x86_64.egg/lxml/html/__init__.py", line 811, in __getitem__
>     raise KeyError(
> KeyError: 'No input element with the name None'
>>>> tree.forms[0].fields.values()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.4/UserDict.py", line 110, in values
>     return [v for _, v in self.iteritems()]
>   File "/usr/lib/python2.4/UserDict.py", line 101, in iteritems
>     yield (k, self[k])
>   File "/usr/lib64/python2.4/site-packages/lxml-2.0-py2.4-linux-x86_64.egg/lxml/html/__init__.py", line 749, in __getitem__
>     return self.inputs[item].value
>   File "/usr/lib64/python2.4/site-packages/lxml-2.0-py2.4-linux-x86_64.egg/lxml/html/__init__.py", line 811, in __getitem__
>     raise KeyError(
> KeyError: 'No input element with the name None'

Looks like a bug to me. Ian?
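
Judging from the traceback, the failure seems to come from keys() reporting
a name (None) that __getitem__ then refuses to look up, so the generic
items()/values() helpers blow up on the first nameless input. Below is a
minimal stdlib-only sketch of that mechanism. It is not lxml's actual code:
the class FieldsSketch, its skip_nameless flag and the (name, value) pair
representation are all hypothetical, and skipping nameless inputs is only
one possible fix.

```python
from collections.abc import Mapping


class FieldsSketch(Mapping):
    """Hypothetical stand-in for a form-fields mapping (NOT lxml's code)."""

    def __init__(self, inputs, skip_nameless=False):
        # inputs: list of (name, value) pairs; name is None for an
        # input without a name attribute, e.g. <input type="submit" />
        self._inputs = inputs
        self._skip_nameless = skip_nameless

    def __iter__(self):
        seen = set()
        for name, _ in self._inputs:
            if name is None and self._skip_nameless:
                continue  # assumed fix: never report nameless inputs as keys
            if name not in seen:
                seen.add(name)
                yield name

    def __len__(self):
        return sum(1 for _ in self)

    def __getitem__(self, name):
        # Mirrors the behaviour in the traceback: None is not a valid name.
        if name is None:
            raise KeyError("No input element with the name None")
        for key, value in self._inputs:
            if key == name:
                return value
        raise KeyError(name)


inputs = [("foo", "bar"), (None, None)]

# Without the filter, items() looks up every key, including None.
buggy = FieldsSketch(inputs)
try:
    list(buggy.items())
    buggy_failed = False
except KeyError:
    buggy_failed = True

# With the filter, keys and lookups agree again.
fixed = FieldsSketch(inputs, skip_nameless=True)
fixed_items = list(fixed.items())
```

The point of the sketch is only that keys() and __getitem__ have to agree on
which names exist; whether nameless inputs should be dropped from the
mapping or made retrievable in some other way is a separate design question.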

Stefan


