[lxml-dev] lxml forms and cookies...
John J Lee
jjl at pobox.com
Tue Dec 23 13:50:47 CET 2008
On Mon, 22 Dec 2008, Douglas Mayle wrote:
> Hey everyone, I've been trying to use the lxml forms with client
> cookies to handle html logins. Using cookielib, I'm able to manually
> login to a page using either urllib or urllib2 and have it work. The
> moment I try to use lxml.html.submit_form(), however, it fails. I've
> tried sniffing the packets, and it turns out that lxml is never
> sending cookies, which makes me guess that lxml is using neither
> urllib nor urllib2. How do I use cookes with lxml?
lxml.html.submit_form() has an open_http parameter:
import urllib
import urllib2
import urlparse
import lxml.html
def url_with_query(url, values):
parts = urlparse.urlparse(url)
rest, (query, frag) = parts[:-2], parts[-2:]
return urlparse.urlunparse(rest + (urllib.urlencode(values), None))
def make_open_http():
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
opener.addheaders = [] # pretend we're a human -- don't do this
def open_http(method, url, values={}):
if method == "POST":
return opener.open(url, urllib.urlencode(values))
else:
return opener.open(url_with_query(url, values))
return open_http
open_http = make_open_http()
tree = lxml.html.fromstring(open_http("GET", "http://python.org").read())
form = tree.forms[0]
form.fields["q"] = "lxml"
submit_values = {"submit": form.fields["submit"]}
response = lxml.html.submit_form(form,
extra_values=submit_values,
open_http=open_http)
html = response.read()
doc = lxml.html.fromstring(html)
lxml.html.open_in_browser(doc)
John
More information about the lxml-dev
mailing list