[wwwsearch-commits] r33284 - in wwwsearch/mechanize/trunk: . mechanize
jjlee at codespeak.net
Sat Oct 14 19:16:27 CEST 2006
Author: jjlee
Date: Sat Oct 14 19:16:24 2006
New Revision: 33284
Modified:
wwwsearch/mechanize/trunk/0.1-changes.txt
wwwsearch/mechanize/trunk/README.html.in
wwwsearch/mechanize/trunk/functional_tests.py
wwwsearch/mechanize/trunk/mechanize/__init__.py
wwwsearch/mechanize/trunk/mechanize/_mechanize.py
wwwsearch/mechanize/trunk/mechanize/_response.py
wwwsearch/mechanize/trunk/mechanize/_useragent.py
Log:
Reinstate set_seekable_responses() (had to create an additional class and change the base class of Browser to do so). Functional tests not run since SF is down :-(
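The class split described in the log can be sketched with stand-in classes. This is only an illustration of the inheritance layout, not the mechanize code: every name ending in "Sketch" and the responses_are_seekable() helper are hypothetical. It assumes, as the README text in this commit implies, that seekable responses are on by default.

```python
# Stand-in sketch of the class split described in the log message.
# These are NOT the mechanize classes; all names here are hypothetical.


class UserAgentBaseSketch(object):
    """Holds the configurable handler table shared by both subclasses."""

    def __init__(self):
        # Assumption: seekable responses are switched on by default.
        self._handlers = {"_seek": True}

    def _set_handler(self, name, handle):
        # Record whether the named feature is enabled.
        self._handlers[name] = bool(handle)

    def responses_are_seekable(self):
        # Hypothetical helper, used only to inspect the sketch's state.
        return self._handlers["_seek"]


class UserAgentSketch(UserAgentBaseSketch):
    """Adds the one public switch; nothing else differs from the base."""

    def set_seekable_responses(self, handle):
        self._set_handler("_seek", handle)


class BrowserSketch(UserAgentBaseSketch):
    """Relies on .seek()able responses, so it does not inherit the switch."""


ua = UserAgentSketch()
ua.set_seekable_responses(False)
browser = BrowserSketch()
```

Because the browser stand-in derives from the base rather than from the user-agent stand-in, it has no public method for turning seekable responses off, which is the point of changing Browser's base class.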
Modified: wwwsearch/mechanize/trunk/0.1-changes.txt
==============================================================================
--- wwwsearch/mechanize/trunk/0.1-changes.txt (original)
+++ wwwsearch/mechanize/trunk/0.1-changes.txt Sat Oct 14 19:16:24 2006
@@ -52,7 +52,9 @@
- mechanize.Browser.default_encoding is gone.
- mechanize.Browser.set_seekable_responses() is gone (they're always
- .seek()able).
+ .seek()able). Browser and UserAgent now both inherit from
+ mechanize.UserAgentBase, and UserAgent is now there only to add the
+ single method .set_seekable_responses().
- Added Browser.encoding().
Modified: wwwsearch/mechanize/trunk/README.html.in
==============================================================================
--- wwwsearch/mechanize/trunk/README.html.in (original)
+++ wwwsearch/mechanize/trunk/README.html.in Sat Oct 14 19:16:24 2006
@@ -42,16 +42,18 @@
<ul>
<li><code>mechanize.Browser</code> is a subclass of
- <code>mechanize.UserAgent</code>, which is, in turn, a subclass of
+ <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
<code>urllib2.OpenerDirector</code> (in fact, of
<code>mechanize.OpenerDirector</code>), so:
<ul>
<li>any URL can be opened, not just <code>http:</code>
- <li><code>mechanize.UserAgent</code> offers easy dynamic configuration of
- user-agent features like protocol, cookie, redirection and
- <code>robots.txt</code> handling, without having to make a new
- <code>OpenerDirector</code> each time, e.g. by calling
- <code>build_opener()</code>.
+
+ <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+ configuration of user-agent features like protocol, cookie,
+ redirection and <code>robots.txt</code> handling, without having
+ to make a new <code>OpenerDirector</code> each time, e.g. by
+ calling <code>build_opener()</code>.
+
</ul>
<li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
interface.
@@ -181,6 +183,24 @@
way.
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code>, which allows switching off the
+addition of the <code>.seek()</code> method to response objects:
+
+@{colorize("""
+import mechanize
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)
+response = ua.open("http://www.example.com/")
+print response.read()
+""")}
+
+
<a name="compatnotes"></a>
<h2>Compatibility</h2>
Modified: wwwsearch/mechanize/trunk/functional_tests.py
==============================================================================
--- wwwsearch/mechanize/trunk/functional_tests.py (original)
+++ wwwsearch/mechanize/trunk/functional_tests.py Sat Oct 14 19:16:24 2006
@@ -82,6 +82,18 @@
test_state(self.browser)
self.assert_("GeneralFAQ.html" in r.read(2048))
+ def test_non_seekable(self):
+ # check everything still works without response_seek_wrapper and
+ # the .seek() method on response objects
+ ua = mechanize.UserAgent()
+ ua.set_seekable_responses(False)
+ ua.set_handle_equiv(False)
+ ua._maybe_reindex_handlers()
+ response = ua.open('http://wwwsearch.sourceforge.net/')
+ self.failIf(hasattr(response, "seek"))
+ data = response.read()
+ self.assert_("Python bits" in data)
+
class ResponseTests(TestCase):
Modified: wwwsearch/mechanize/trunk/mechanize/__init__.py
==============================================================================
--- wwwsearch/mechanize/trunk/mechanize/__init__.py (original)
+++ wwwsearch/mechanize/trunk/mechanize/__init__.py Sat Oct 14 19:16:24 2006
@@ -66,6 +66,7 @@
'USE_BARE_EXCEPT',
'UnknownHandler',
'UserAgent',
+ 'UserAgentBase',
'XHTMLCompatibleHeadParser',
'__version__',
'build_opener',
@@ -86,7 +87,7 @@
BrowserStateError, LinkNotFoundError, FormNotFoundError
# configurable URL-opener interface
-from _useragent import UserAgent
+from _useragent import UserAgentBase, UserAgent
from _html import \
Link, \
Factory, DefaultFactory, RobustFactory, \
Modified: wwwsearch/mechanize/trunk/mechanize/_mechanize.py
==============================================================================
--- wwwsearch/mechanize/trunk/mechanize/_mechanize.py (original)
+++ wwwsearch/mechanize/trunk/mechanize/_mechanize.py Sat Oct 14 19:16:24 2006
@@ -11,7 +11,7 @@
import urllib2, sys, copy, re
-from _useragent import UserAgent
+from _useragent import UserAgentBase
from _html import DefaultFactory
from _response import response_seek_wrapper, closeable_response
import _upgrade
@@ -53,7 +53,7 @@
del self._history[:]
-class Browser(UserAgent):
+class Browser(UserAgentBase):
"""Browser-like class with support for history, forms and links.
BrowserStateError is raised whenever the browser is in the wrong state to
@@ -68,9 +68,9 @@
"""
- handler_classes = UserAgent.handler_classes.copy()
+ handler_classes = UserAgentBase.handler_classes.copy()
handler_classes["_response_upgrade"] = _upgrade.ResponseUpgradeProcessor
- default_others = copy.copy(UserAgent.default_others)
+ default_others = copy.copy(UserAgentBase.default_others)
default_others.append("_response_upgrade")
def __init__(self,
@@ -83,8 +83,8 @@
Only named arguments should be passed to this constructor.
factory: object implementing the mechanize.Factory interface.
- history: object implementing the mechanize.History interface. Note this
- interface is still experimental and may change in future.
+ history: object implementing the mechanize.History interface. Note
+ this interface is still experimental and may change in future.
request_class: Request class to use. Defaults to mechanize.Request
by default for Pythons older than 2.4, urllib2.Request otherwise.
@@ -116,10 +116,11 @@
self.request = None
self.set_response(None)
- UserAgent.__init__(self) # do this last to avoid __getattr__ problems
+ # do this last to avoid __getattr__ problems
+ UserAgentBase.__init__(self)
def close(self):
- UserAgent.close(self)
+ UserAgentBase.close(self)
if self._response is not None:
self._response.close()
if self._history is not None:
@@ -164,7 +165,8 @@
# relative URL
if self._response is None:
raise BrowserStateError(
- "can't fetch relative reference: not viewing any document")
+ "can't fetch relative reference: "
+ "not viewing any document")
url = _rfc3986.urljoin(self._response.geturl(), url)
request = self._request(url, data, visit)
@@ -178,12 +180,13 @@
if self.request is not None and update_history:
self._history.add(self.request, self._response)
self._response = None
- # we want self.request to be assigned even if UserAgent.open fails
+ # we want self.request to be assigned even if UserAgentBase.open
+ # fails
self.request = request
success = True
try:
- response = UserAgent.open(self, request, data)
+ response = UserAgentBase.open(self, request, data)
except urllib2.HTTPError, error:
success = False
if error.fp is None: # not a response
Modified: wwwsearch/mechanize/trunk/mechanize/_response.py
==============================================================================
--- wwwsearch/mechanize/trunk/mechanize/_response.py (original)
+++ wwwsearch/mechanize/trunk/mechanize/_response.py Sat Oct 14 19:16:24 2006
@@ -1,4 +1,13 @@
-"""(Mostly HTTP) response classes.
+"""Response classes.
+
+The seek_wrapper code is not used if you're using UserAgent with
+.set_seekable_responses(False), or if you're using the urllib2-level interface
+without SeekableProcessor or HTTPEquivProcessor. Class closeable_response is
+instantiated by some handlers (AbstractHTTPHandler), but the closeable_response
+interface is only depended upon by Browser-level code. Function
+upgrade_response is only used if you're using Browser or
+ResponseUpgradeProcessor.
+
Copyright 2006 John J. Lee <jjl at pobox.com>
Modified: wwwsearch/mechanize/trunk/mechanize/_useragent.py
==============================================================================
--- wwwsearch/mechanize/trunk/mechanize/_useragent.py (original)
+++ wwwsearch/mechanize/trunk/mechanize/_useragent.py Sat Oct 14 19:16:24 2006
@@ -35,18 +35,24 @@
https_request = http_request
-class UserAgent(OpenerDirector):
+class UserAgentBase(OpenerDirector):
"""Convenient user-agent class.
Do not use .add_handler() to add a handler for something already dealt with
by this code.
+ The only reason at present for the distinction between UserAgent and
+ UserAgentBase is so that classes that depend on .seek()able responses
+ (e.g. mechanize.Browser) can inherit from UserAgentBase. The subclass
+ UserAgent exposes a .set_seekable_responses() method that allows switching
+ off the adding of a .seek() method to responses.
+
Public attributes:
addheaders: list of (name, value) pairs specifying headers to send with
every request, unless they are overridden in the Request instance.
- >>> ua = UserAgent()
+ >>> ua = UserAgentBase()
>>> ua.addheaders = [
... ("User-agent", "Mozilla/5.0 (compatible)"),
... ("From", "responsible.person at example.com")]
@@ -349,3 +355,10 @@
if newhandler is not None:
self.add_handler(newhandler)
self._ua_handlers[name] = newhandler
+
+
+class UserAgent(UserAgentBase):
+
+ def set_seekable_responses(self, handle):
+ """Make response objects .seek()able."""
+ self._set_handler("_seek", handle)
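The _response.py docstring in this commit refers to seek_wrapper code that is skipped when seekable responses are switched off. As a rough idea of what such a wrapper does, here is a minimal sketch that caches whatever has been read so the data can be read again after seek(). It is not mechanize's response_seek_wrapper; the names NonSeekable and SeekWrapperSketch are hypothetical, and only absolute offsets are handled.

```python
import io


class NonSeekable(object):
    """Stand-in for a socket-backed response: read() only, no seek()."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def read(self, size=-1):
        return self._stream.read(size)


class SeekWrapperSketch(object):
    """Hypothetical wrapper that caches data read from the wrapped object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self._buf = b""   # everything pulled from the wrapped object so far
        self._pos = 0     # current read position within the cache

    def read(self, size=-1):
        if size < 0:
            # Pull all remaining data into the cache.
            self._buf += self._wrapped.read()
            end = len(self._buf)
        else:
            end = self._pos + size
            missing = end - len(self._buf)
            if missing > 0:
                self._buf += self._wrapped.read(missing)
            end = min(end, len(self._buf))
        data = self._buf[self._pos:end]
        self._pos = end
        return data

    def seek(self, offset):
        # Absolute offsets only in this sketch.
        if offset < 0:
            raise ValueError("negative seek offset")
        self._pos = offset


raw = NonSeekable(b"Python bits")
wrapped = SeekWrapperSketch(raw)
first = wrapped.read(6)
wrapped.seek(0)
again = wrapped.read()
```

The cost of this convenience is that the wrapper keeps a copy of the data it has read, which is one plausible reason to want a switch for turning it off.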