[wwwsearch-commits] r33288 - in wwwsearch/mechanize/trunk: mechanize test

jjlee at codespeak.net jjlee at codespeak.net
Sat Oct 14 21:29:08 CEST 2006


Author: jjlee
Date: Sat Oct 14 21:29:07 2006
New Revision: 33288

Modified:
   wwwsearch/mechanize/trunk/mechanize/_http.py
   wwwsearch/mechanize/trunk/test/test_request.doctest
Log:
Backport stdlib urllib2 fix: Patch #1542948: fix urllib2 header casing issue.

Modified: wwwsearch/mechanize/trunk/mechanize/_http.py
==============================================================================
--- wwwsearch/mechanize/trunk/mechanize/_http.py	(original)
+++ wwwsearch/mechanize/trunk/mechanize/_http.py	Sat Oct 14 21:29:07 2006
@@ -634,6 +634,8 @@
         # So make sure the connection gets closed after the (only)
         # request.
         headers["Connection"] = "close"
+        headers = dict(
+            (name.title(), val) for name, val in headers.items())
         try:
             h.request(req.get_method(), req.get_selector(), req.data, headers)
             r = h.getresponse()

Modified: wwwsearch/mechanize/trunk/test/test_request.doctest
==============================================================================
--- wwwsearch/mechanize/trunk/test/test_request.doctest	(original)
+++ wwwsearch/mechanize/trunk/test/test_request.doctest	Sat Oct 14 21:29:07 2006
@@ -2,3 +2,65 @@
 >>> r = Request("http://example.com/foo#frag")
 >>> r.get_selector()
 '/foo'
+
+
+Request Headers Dictionary
+--------------------------
+
+The Request.headers dictionary is not a documented interface.  It should
+stay that way, because the complete set of headers are only accessible
+through the .get_header(), .has_header(), .header_items() interface.
+However, .headers pre-dates those methods, and so real code will be using
+the dictionary.
+
+The introduction in 2.4 of those methods was a mistake for the same reason:
+code that previously saw all (urllib2 user)-provided headers in .headers
+now sees only a subset (and the function interface is ugly and incomplete).
+A better change would have been to replace .headers dict with a dict
+subclass (or UserDict.DictMixin instance?)  that preserved the .headers
+interface and also provided access to the "unredirected" headers.  It's
+probably too late to fix that, though.
+
+
+Check .capitalize() case normalization:
+
+>>> url = "http://example.com"
+>>> Request(url, headers={"Spam-eggs": "blah"}).headers["Spam-eggs"]
+'blah'
+>>> Request(url, headers={"spam-EggS": "blah"}).headers["Spam-eggs"]
+'blah'
+
+Currently, Request(url, "Spam-eggs").headers["Spam-Eggs"] raises KeyError,
+but that could be changed in future.
+
+
+Request Headers Methods
+-----------------------
+
+Note the case normalization of header names here, to .capitalize()-case.
+This should be preserved for backwards-compatibility.  (In the HTTP case,
+normalization to .title()-case is done by urllib2 before sending headers to
+httplib).
+
+>>> url = "http://example.com"
+>>> r = Request(url, headers={"Spam-eggs": "blah"})
+>>> r.has_header("Spam-eggs")
+True
+>>> r.header_items()
+[('Spam-eggs', 'blah')]
+>>> r.add_header("Foo-Bar", "baz")
+>>> items = r.header_items()
+>>> items.sort()
+>>> items
+[('Foo-bar', 'baz'), ('Spam-eggs', 'blah')]
+
+Note that e.g. r.has_header("spam-EggS") is currently False, and
+r.get_header("spam-EggS") returns None, but that could be changed in
+future.
+
+>>> r.has_header("Not-there")
+False
+>>> print r.get_header("Not-there")
+None
+>>> r.get_header("Not-there", "default")
+'default'


More information about the wwwsearch-commits mailing list