[wwwsearch-commits] r18292 - in wwwsearch/ClientForm/trunk: . examples
jjlee at codespeak.net
Sat Oct 8 19:01:43 CEST 2005
Author: jjlee
Date: Sat Oct 8 19:01:42 2005
New Revision: 18292
Modified:
wwwsearch/ClientForm/trunk/README.html.in
wwwsearch/ClientForm/trunk/examples/example.py
Log:
Update / rework README and example.py
Modified: wwwsearch/ClientForm/trunk/README.html.in
==============================================================================
--- wwwsearch/ClientForm/trunk/README.html.in (original)
+++ wwwsearch/ClientForm/trunk/README.html.in Sat Oct 8 19:01:42 2005
@@ -49,12 +49,16 @@
response = urlopen(form.click("Thanks"))
""")}
-<p>A more complicated example (<em><strong>Note</strong>: this example makes
-use of the ClientForm 0.2 API; refer to the README.html file in the latest 0.1
-release for the corresponding code for that version.</em>):
+<p>A more complicated example, which you can actually run
+(<em><strong>Note</strong>: this example makes use of the ClientForm 0.2 API;
+refer to the README.html file in the latest 0.1 release for the corresponding
+code for that version.</em>):
+<a name="example"></a>
@{colorize("".join(open("examples/example.py").readlines()[2:]))}
+<a name="notes"></a>
+
<p>All of the standard control types are supported: <code>TEXT</code>,
<code>PASSWORD</code>, <code>HIDDEN</code>, <code>TEXTAREA</code>,
<code>ISINDEX</code>, <code>RESET</code>, <code>BUTTON</code> (<code>INPUT
@@ -87,14 +91,33 @@
0.1 interface.</strong> </em>
+<a name="parsers"></a>
+<h2>Parsers</h2>
+
+<p>ClientForm contains two parsers. See <a href="./#faq">the FAQ entry on
+XHTML</a> for details.
+
+<p><a href="http://www.egenix.com/files/python/mxTidy.html">mxTidy</a> or <a
+href="http://utidylib.berlios.de/">µTidylib</a> can be useful for dealing with
+bad HTML.
+
+<p>I think it would be nice to have an implementation of ClientForm based on <a
+href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a>
+(i.e. all methods and attributes implemented using the BeautifulSoup API),
+since that module does tolerant HTML parsing with a nice API for doing
+non-forms stuff. (I'm not about to do this, though. For anybody interested in
+doing this, note that the ClientForm tests would need making
+constructor-independent first.)
+
+
<a name="compat"></a>
<h2>Backwards-compatibility mode</h2>
<p>ClientForm 0.2 includes three minor backwards-incompatible interface
-changes.
+changes from version 0.1.
<p>To make upgrading from 0.1 easier, and to allow me to stop supporting
-version 0.1 sooner, version 0.1 contains support for operating in a
+version 0.1 sooner, version 0.2 contains support for operating in a
backwards-compatible mode, under which code written for 0.1 should work without
modification. This is done on a per-<code>HTMLForm</code> basis via the
<code>.backwards_compat</code> attribute, but for convenience the
@@ -102,7 +125,7 @@
<code>backwards_compat</code> arguments. These backwards-compatibility
features will be removed in version 0.3. The default is to operate in
backwards-compatible mode. To run with backwards compatible mode turned
-<em><strong>OFF</strong></em>:
+<em><strong>OFF</strong></em> (<strong>strongly recommended</strong>):
@{colorize(r"""
from urllib2 import urlopen
@@ -119,7 +142,8 @@
<code>nr=0</code> to indicate you want the first matching control or item.
<li><p>Item label matching is now done by substring, not by strict
-string-equality. (Control label matching is always done by substring.)
+string-equality (but note leading and trailing space is always stripped).
+(Control label matching is always done by substring.)
<li><p>Handling of disabled list items has changed. First, note that handling
of disabled list items in 0.1 (and in 0.2's backwards-compatibility mode!) is
@@ -225,10 +249,13 @@
<a href="http://www.opensource.org/licenses/bsd-license.php">BSD license</a>,
or the <a href="http://www.zope.org/Resources/ZPL">ZPL 2.1</a> (both are
included in the distribution).
+ <a name="xhtml"></a>
<li>Is XHTML supported?
<p>Yes. You must pass
<code>form_parser_class=ClientForm.XHTMLCompatibleFormParser</code> to
- <code>ParseResponse()</code> / <code>ParseFile()</code>.
+ <code>ParseResponse()</code> / <code>ParseFile()</code>. Note this parser
+ is less tolerant of bad HTML than the default,
+ <code>ClientForm.FormParser</code>.
<li>How do I figure out what control names and values to use?
<p><code>print form</code> is usually all you need.
In your code, things like the <code>HTMLForm.items</code> attribute of
@@ -329,7 +356,11 @@
<br>
-<a href="./#download">Credits</a><br>
+<a href="./#example">Example</a><br>
+<a href="./#notes">Notes</a><br>
+<a href="./#parsers">Parsers</a><br>
+<a href="./#compat">Compatibility</a><br>
+<a href="./#credits">Credits</a><br>
<a href="./#download">Download</a><br>
<a href="./#faq">FAQs</a><br>
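
The label-matching rule described in the README change above (substring
matching, with leading and trailing space stripped) can be illustrated
outside ClientForm. The following is a minimal sketch, not code from this
revision: matches_strict() and matches_substring() are hypothetical helper
names, and applying the stripping to the label text in both modes is one
reading of "always stripped".

```python
def matches_strict(label_text, wanted):
    # 0.1-style item label matching (as read from the README):
    # the whole stripped label must equal the requested text.
    return label_text.strip() == wanted


def matches_substring(label_text, wanted):
    # 0.2-style matching: the requested text only has to occur
    # somewhere inside the stripped label.
    return wanted in label_text.strip()


label = "  Please select a cheese "
print(matches_strict(label, "select a cheese"))
print(matches_substring(label, "select a cheese"))
```

Under these assumptions a label such as "Please select a cheese" is found by
the substring "select a cheese" only in the 0.2-style check, which is why
code relying on exact label equality may behave differently once
backwards-compatibility mode is turned off.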
Modified: wwwsearch/ClientForm/trunk/examples/example.py
==============================================================================
--- wwwsearch/ClientForm/trunk/examples/example.py (original)
+++ wwwsearch/ClientForm/trunk/examples/example.py Sat Oct 8 19:01:42 2005
@@ -5,11 +5,11 @@
request = urllib2.Request(
"http://wwwsearch.sourceforge.net/ClientForm/example.html")
response = urllib2.urlopen(request)
-forms = ClientForm.ParseResponse(response, ignore_ambiguity=False)
+forms = ClientForm.ParseResponse(response, backwards_compat=False)
response.close()
## f = open("example.html")
## forms = ClientForm.ParseFile(f, "http://example.com/example.html",
-## ignore_ambiguity=False)
+## backwards_compat=False)
## f.close()
form = forms[0]
print form # very useful!
@@ -29,6 +29,25 @@
# equivalent, but more flexible:
form.set_value(["parmesan", "leicester", "cheddar"], name="cheeses")
+# Add files to FILE controls with .add_file(). Only call this multiple
+# times if the server is expecting multiple files.
+# add a file, default value for MIME type, no filename sent to server
+form.add_file(open("data.dat"))
+# add a second file, explicitly giving MIME type, and telling the server
+# what the filename is
+form.add_file(open("data.txt"), "text/plain", "data.txt")
+
+# All Controls may be disabled (equivalent of greyed-out in browser)...
+control = form.find_control("comments")
+print control.disabled
+# ...or readonly
+print control.readonly
+# readonly and disabled attributes can be assigned to
+control.disabled = False
+# convenience method, used here to make all controls writable (unless
+# they're disabled):
+form.set_all_readonly(False)
+
# A couple of notes about list controls and HTML:
# 1. List controls correspond to either a single SELECT element, or
@@ -54,7 +73,7 @@
# docstrings explain in detail, but playing around with an HTML file,
# ParseFile() and 'print form' is very useful to understand this!
-# You can also get the Control instances from inside the form...
+# You can get the Control instances from inside the form...
control = form.find_control("cheeses", type="select")
print control.name, control.value, control.type
control.value = ["mascarpone", "curd"]
@@ -63,6 +82,34 @@
print item.name, item.selected, item.id, item.attrs
item.selected = False
+# Controls may be referred to by label:
+# find control with label that has the *substring* "select a cheese"
+# (e.g., a label "Please select a cheese" would match).
+control = form.find_control(label="select a cheese")
+
+# You can explicitly say that you're referring to a ListControl:
+# set value of "cheeses" ListControl
+form.set_value(["gouda"], name="cheeses", kind="list")
+# equivalent:
+form.find_control(name="cheeses", kind="list").value = ["gouda"]
+# the first example is also almost equivalent to the following (but
+# insists that the control be a ListControl -- so it will skip any
+# non-list controls that come before the control we want)
+form["cheeses"] = ["gouda"]
+# The kind argument can also take values "multilist", "singlelist", "text",
+# "clickable" and "file":
+# find first control that will accept text, and scribble in it
+form.set_value("rhubarb rhubarb", kind="text", nr=0)
+# find, and set the value of, the first single-selection list control
+form.set_value(["spam"], kind="singlelist", nr=0)
+
+# You can find controls with a general predicate function:
+# find first list control that has an item named "caerphilly"
+def control_has_caerphilly(control):
+ for item in control.items:
+ if item.name == "caerphilly": return True
+form.find_control(kind="list", predicate=control_has_caerphilly)
+
# HTMLForm.controls is a list of all controls in the form
for control in form.controls:
if control.value == "inquisition": sys.exit()
@@ -71,48 +118,45 @@
for item in form.find_control("cheeses").items:
print item.name
-# Some list control examples:
-# Many methods have a by_label argument, allowing specification of list
-# items by label instead of by name. Sometimes labels are
-# easier to maintain than names, sometimes the other way around.
-form.set_value(["Mozzarella", "Caerphilly"], "cheeses", by_label=True)
-# is the "parmesan" item of the "cheeses" control selected?
+# To remove an item from a list control, delete it from .items:
+cheeses = form.find_control("cheeses")
+curd = cheeses.get("curd")
+del cheeses.items[cheeses.items.index(curd)]
+# To add an item to a list control, instantiate an Item with its control
+# and attributes:
+# Note that you are responsible for getting the attributes correct here,
+# and these are not quite identical to the original HTML, due to
+# defaulting rules and a few special attributes (e.g. Items that represent
+# OPTIONs have a special "contents" key in their .attrs dict). In future
+# there will be an explicitly supported way of using the parsing logic to
+# add items and controls from HTML strings without knowing these details.
+ClientForm.Item(cheeses, {"contents": "mascarpone",
+ "value": "mascarpone"})
+
+# You can specify list items by label using set/get_value_by_label() and
+# the label argument of the .get() method. Sometimes labels are easier to
+# maintain than names, sometimes the other way around.
+form.set_value_by_label(["Mozzarella", "Caerphilly"], "cheeses")
+
+# Which items are present, selected, and successful?
+# is the "parmesan" item of the "cheeses" control successful (selected
+# and not disabled)?
print "parmesan" in form["cheeses"]
+# is the "parmesan" item of the "cheeses" control selected?
+print "parmesan" in [
+ item.name for item in form.find_control("cheeses").items if item.selected]
# does cheeses control have a "caerphilly" item?
print "caerphilly" in [item.name for item in form.find_control("cheeses").items]
+
# Sometimes one wants to set or clear individual items in a list, rather
# than setting the whole .value:
# select the item named "gorgonzola" in the first control named "cheeses"
form.find_control("cheeses").get("gorgonzola").selected = True
-# You can be more specific (and many other methods take similar arguments):
+# You can be more specific:
# deselect "edam" in third CHECKBOX control
form.find_control(type="checkbox", nr=2).get("edam").selected = False
# deselect item labelled "Mozzarella" in control with id "chz"
-form.find_control(id="chz").get("Mozzarella", by_label=True).selected = False
-# As for list items, controls may also be referred to by label:
-# select "edam" in control with label that has a *substring* "Cheeses"
-# (eg., a label "Please select a cheese" would match).
-form.find_control(label="select a cheese").get("emmenthal").selected = True
-
-# You can explicitly say that you're referring to a ListControl:
-# set whole value (rather than just one item of) "cheeses" ListControl
-form.set_value(["gouda"], name="cheeses", kind="list")
-# last example is almost equivalent to following (but insists that the
-# control be a ListControl -- so it will skip any non-list controls that
-# come before the control we want)
-form["cheeses"] = ["gouda"]
-# The kind argument can also take values "multilist", "singlelist", "text",
-# "clickable" and "file":
-# find first control that will accept text, and scribble in it
-form.set_value("rhubarb rhubarb", kind="text")
-form.set_value(["spam"], kind="singlelist")
-
-# You can find controls with a general predicate function:
-# find first control with attribute named "whatever"
-def control_has_caerphilly(control):
- for item in control.items:
- if item.name == "caerphilly": return True
-form.find_control(kind="list", predicate=control_has_caerphilly)
+form.find_control(id="chz").get(label="Mozzarella").selected = False
# Often, a single checkbox (a CHECKBOX control with a single item) is
# present. In that case, the name of the single item isn't of much
@@ -121,25 +165,7 @@
form.find_control("smelly").items[0].selected = True # check
form.find_control("smelly").items[0].selected = False # uncheck
-# Add files to FILE controls with .add_file(). Only call this multiple
-# times if the server is expecting multiple files.
-# add a file, default value for MIME type, no filename sent to server
-form.add_file(open("data.dat"))
-# add a second file, explicitly giving MIME type, and telling the server
-# what the filename is
-form.add_file(open("data.txt"), "text/plain", "data.txt")
-
-# All Controls may be disabled (equivalent of greyed-out in browser)
-control = form.find_control("comments")
-print control.disabled
-# ...or readonly
-print control.readonly
-# readonly and disabled attributes can be assigned to
-control.disabled = False
-# convenience method, used here to make all controls writable (unless
-# they're disabled):
-form.set_all_readonly(False)
-# Items may also be disabled (selecting or de-selecting a disabled item is
+# Items may be disabled (selecting or de-selecting a disabled item is
# not allowed):
control = form.find_control("cheeses")
print control.get("emmenthal").disabled
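
The distinction the reworked example.py draws between "selected" and
"successful" items (successful meaning selected and not disabled) can be
shown without ClientForm. This is a minimal sketch under that definition:
FakeItem is a hypothetical stand-in, not ClientForm's Item class, and
selected_names() / successful_names() are illustrative helpers only.

```python
class FakeItem(object):
    """Hypothetical stand-in for a list item; not ClientForm.Item."""

    def __init__(self, name, selected=False, disabled=False):
        self.name = name
        self.selected = selected
        self.disabled = disabled


def selected_names(items):
    # Every item whose selected flag is set, disabled or not.
    return [item.name for item in items if item.selected]


def successful_names(items):
    # Only items that would be submitted: selected and not disabled.
    return [item.name for item in items
            if item.selected and not item.disabled]


items = [
    FakeItem("parmesan", selected=True, disabled=True),
    FakeItem("cheddar", selected=True),
    FakeItem("edam"),
]
print(selected_names(items))
print(successful_names(items))
```

Here "parmesan" counts as selected but not successful, which mirrors the two
separate checks in the example: the form["cheeses"] membership test for
successful items and the item.selected list comprehension for selected ones.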