[wwwsearch-commits] r18292 - in wwwsearch/ClientForm/trunk: . examples

jjlee at codespeak.net jjlee at codespeak.net
Sat Oct 8 19:01:43 CEST 2005


Author: jjlee
Date: Sat Oct  8 19:01:42 2005
New Revision: 18292

Modified:
   wwwsearch/ClientForm/trunk/README.html.in
   wwwsearch/ClientForm/trunk/examples/example.py
Log:
Update / rework README and example.py

Modified: wwwsearch/ClientForm/trunk/README.html.in
==============================================================================
--- wwwsearch/ClientForm/trunk/README.html.in	(original)
+++ wwwsearch/ClientForm/trunk/README.html.in	Sat Oct  8 19:01:42 2005
@@ -49,12 +49,16 @@
 response = urlopen(form.click("Thanks"))
 """)}
 
-<p>A more complicated example (<em><strong>Note</strong>: this example makes
-use of the ClientForm 0.2 API; refer to the README.html file in the latest 0.1
-release for the corresponding code for that version.</em>):
+<p>A more complicated example, which you can actually run
+(<em><strong>Note</strong>: this example makes use of the ClientForm 0.2 API;
+refer to the README.html file in the latest 0.1 release for the corresponding
+code for that version.</em>):
 
+<a name="example"></a>
 @{colorize("".join(open("examples/example.py").readlines()[2:]))}
 
+<a name="notes"></a>
+
 <p>All of the standard control types are supported: <code>TEXT</code>,
 <code>PASSWORD</code>, <code>HIDDEN</code>, <code>TEXTAREA</code>,
 <code>ISINDEX</code>, <code>RESET</code>, <code>BUTTON</code> (<code>INPUT
@@ -87,14 +91,33 @@
 0.1 interface.</strong> </em>
 
 
+<a name="parsers"></a>
+<h2>Parsers</h2>
+
+<p>ClientForm contains two parsers.  See <a href="./#faq">the FAQ entry on
+XHTML</a> for details.
+
+<p><a href="http://www.egenix.com/files/python/mxTidy.html">mxTidy</a> or <a
+href="http://utidylib.berlios.de/">µTidylib</a> can be useful for dealing with
+bad HTML.
+
+<p>I think it would be nice to have an implementation of ClientForm based on <a
+href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a>
+(i.e. all methods and attributes implemented using the BeautifulSoup API),
+since that module does tolerant HTML parsing with a nice API for doing
+non-forms stuff.  (I'm not about to do this, though.  For anybody interested in
+doing this, note that the ClientForm tests would need making
+constructor-independent first.)
+
+
 <a name="compat"></a>
 <h2>Backwards-compatibility mode</h2>
 
 <p>ClientForm 0.2 includes three minor backwards-incompatible interface
-changes.
+changes from version 0.1.
 
 <p>To make upgrading from 0.1 easier, and to allow me to stop supporting
-version 0.1 sooner, version 0.1 contains support for operating in a
+version 0.1 sooner, version 0.2 contains support for operating in a
 backwards-compatible mode, under which code written for 0.1 should work without
 modification.  This is done on a per-<code>HTMLForm</code> basis via the
 <code>.backwards_compat</code> attribute, but for convenience the
@@ -102,7 +125,7 @@
 <code>backwards_compat</code> arguments.  These backwards-compatibility
 features will be removed in version 0.3.  The default is to operate in
 backwards-compatible mode.  To run with backwards compatible mode turned
-<em><strong>OFF</strong></em>:
+<em><strong>OFF</strong></em> (<strong>strongly recommended</strong>):
 
 @{colorize(r"""
 from urllib2 import urlopen
@@ -119,7 +142,8 @@
 <code>nr=0</code> to indicate you want the first matching control or item.
 
 <li><p>Item label matching is now done by substring, not by strict
-string-equality.  (Control label matching is always done by substring.)
+string-equality (but note leading and trailing space is always stripped).
+(Control label matching is always done by substring.)
 
 <li><p>Handling of disabled list items has changed.  First, note that handling
 of disabled list items in 0.1 (and in 0.2's backwards-compatibility mode!) is
@@ -225,10 +249,13 @@
      <a href="http://www.opensource.org/licenses/bsd-license.php">BSD license</a>,
      or the <a href="http://www.zope.org/Resources/ZPL">ZPL 2.1</a> (both are
      included in the distribution).
+  <a name="xhtml"></a>
   <li>Is XHTML supported?
   <p>Yes.  You must pass
      <code>form_parser_class=ClientForm.XHTMLCompatibleFormParser</code> to
-     <code>ParseResponse()</code> / <code>ParseFile()</code>.
+     <code>ParseResponse()</code> / <code>ParseFile()</code>.  Note this parser
+     is less tolerant of bad HTML than the default,
+     <code>ClientForm.FormParser</code>.
   <li>How do I figure out what control names and values to use?
   <p><code>print form</code> is usually all you need.
      In your code, things like the <code>HTMLForm.items</code> attribute of
@@ -329,7 +356,11 @@
 
 <br>
 
-<a href="./#download">Credits</a><br>
+<a href="./#example">Example</a><br>
+<a href="./#notes">Notes</a><br>
+<a href="./#parsers">Parsers</a><br>
+<a href="./#compat">Compatibility</a><br>
+<a href="./#credits">Credits</a><br>
 <a href="./#download">Download</a><br>
 <a href="./#faq">FAQs</a><br>
 

Modified: wwwsearch/ClientForm/trunk/examples/example.py
==============================================================================
--- wwwsearch/ClientForm/trunk/examples/example.py	(original)
+++ wwwsearch/ClientForm/trunk/examples/example.py	Sat Oct  8 19:01:42 2005
@@ -5,11 +5,11 @@
 request = urllib2.Request(
     "http://wwwsearch.sourceforge.net/ClientForm/example.html")
 response = urllib2.urlopen(request)
-forms = ClientForm.ParseResponse(response, ignore_ambiguity=False)
+forms = ClientForm.ParseResponse(response, backwards_compat=False)
 response.close()
 ## f = open("example.html")
 ## forms = ClientForm.ParseFile(f, "http://example.com/example.html",
-##                              ignore_ambiguity=False)
+##                              backwards_compat=False)
 ## f.close()
 form = forms[0]
 print form  # very useful!
@@ -29,6 +29,25 @@
 #  equivalent, but more flexible:
 form.set_value(["parmesan", "leicester", "cheddar"], name="cheeses")
 
+# Add files to FILE controls with .add_file().  Only call this multiple
+# times if the server is expecting multiple files.
+#  add a file, default value for MIME type, no filename sent to server
+form.add_file(open("data.dat"))
+#  add a second file, explicitly giving MIME type, and telling the server
+#   what the filename is
+form.add_file(open("data.txt"), "text/plain", "data.txt")
+
+# All Controls may be disabled (equivalent of greyed-out in browser)...
+control = form.find_control("comments")
+print control.disabled
+#  ...or readonly
+print control.readonly
+#  readonly and disabled attributes can be assigned to
+control.disabled = False
+#  convenience method, used here to make all controls writable (unless
+#   they're disabled):
+form.set_all_readonly(False)
+
 # A couple of notes about list controls and HTML:
 
 # 1. List controls correspond to either a single SELECT element, or
@@ -54,7 +73,7 @@
 # docstrings explain in detail, but playing around with an HTML file,
 # ParseFile() and 'print form' is very useful to understand this!
 
-# You can also get the Control instances from inside the form...
+# You can get the Control instances from inside the form...
 control = form.find_control("cheeses", type="select")
 print control.name, control.value, control.type
 control.value = ["mascarpone", "curd"]
@@ -63,6 +82,34 @@
 print item.name, item.selected, item.id, item.attrs
 item.selected = False
 
+# Controls may be referred to by label:
+#  find control with label that has a *substring* "select a cheese"
+#  (e.g., a label "Please select a cheese" would match).
+control = form.find_control(label="select a cheese")
+
+# You can explicitly say that you're referring to a ListControl:
+#  set value of "cheeses" ListControl
+form.set_value(["gouda"], name="cheeses", kind="list")
+#  equivalent:
+form.find_control(name="cheeses", kind="list").value = ["gouda"]
+#  the first example is also almost equivalent to the following (but
+#  insists that the control be a ListControl -- so it will skip any
+#  non-list controls that come before the control we want)
+form["cheeses"] = ["gouda"]
+# The kind argument can also take values "multilist", "singlelist", "text",
+# "clickable" and "file":
+#  find first control that will accept text, and scribble in it
+form.set_value("rhubarb rhubarb", kind="text", nr=0)
+#  find, and set the value of, the first single-selection list control
+form.set_value(["spam"], kind="singlelist", nr=0)
+
+# You can find controls with a general predicate function:
+#  find first list control that has an item named "caerphilly"
+def control_has_caerphilly(control):
+    for item in control.items:
+        if item.name == "caerphilly": return True
+form.find_control(kind="list", predicate=control_has_caerphilly)
+
 # HTMLForm.controls is a list of all controls in the form
 for control in form.controls:
     if control.value == "inquisition": sys.exit()
@@ -71,48 +118,45 @@
 for item in form.find_control("cheeses").items:
     print item.name
 
-# Some list control examples:
-# Many methods have a by_label argument, allowing specification of list
-# items by label instead of by name.  Sometimes labels are
-# easier to maintain than names, sometimes the other way around.
-form.set_value(["Mozzarella", "Caerphilly"], "cheeses", by_label=True)
-#  is the "parmesan" item of the "cheeses" control selected?
+# To remove items from a list control, remove them from .items:
+cheeses = form.find_control("cheeses")
+curd = cheeses.get("curd")
+del cheeses.items[cheeses.items.index(curd)]
+# To add items to a list control, instantiate an Item with its control
+# and attributes:
+# Note that you are responsible for getting the attributes correct here,
+# and these are not quite identical to the original HTML, due to
+# defaulting rules and a few special attributes (e.g. Items that represent
+# OPTIONs have a special "contents" key in their .attrs dict).  In future
+# there will be an explicitly supported way of using the parsing logic to
+# add items and controls from HTML strings without knowing these details.
+ClientForm.Item(cheeses, {"contents": "mascarpone",
+                          "value": "mascarpone"})
+
+# You can specify list items by label using set/get_value_by_label() and
+# the label argument of the .get() method.  Sometimes labels are easier to
+# maintain than names, sometimes the other way around.
+form.set_value_by_label(["Mozzarella", "Caerphilly"], "cheeses")
+
+# Which items are present, selected, and successful?
+#  is the "parmesan" item of the "cheeses" control successful (selected
+#   and not disabled)?
 print "parmesan" in form["cheeses"]
+#  is the "parmesan" item of the "cheeses" control selected?
+print "parmesan" in [
+    item.name for item in form.find_control("cheeses").items if item.selected]
 #  does cheeses control have a "caerphilly" item?
 print "caerphilly" in [item.name for item in form.find_control("cheeses").items]
+
 # Sometimes one wants to set or clear individual items in a list, rather
 # than setting the whole .value:
 #  select the item named "gorgonzola" in the first control named "cheeses"
 form.find_control("cheeses").get("gorgonzola").selected = True
-# You can be more specific (and many other methods take similar arguments):
+# You can be more specific:
 #  deselect "edam" in third CHECKBOX control
 form.find_control(type="checkbox", nr=2).get("edam").selected = False
 #  deselect item labelled "Mozzarella" in control with id "chz"
-form.find_control(id="chz").get("Mozzarella", by_label=True).selected = False
-# As for list items, controls may also be referred to by label:
-#  select "edam" in control with label that has a *substring* "Cheeses"
-#  (eg., a label "Please select a cheese" would match).
-form.find_control(label="select a cheese").get("emmenthal").selected = True
-
-# You can explicitly say that you're referring to a ListControl:
-#  set whole value (rather than just one item of) "cheeses" ListControl
-form.set_value(["gouda"], name="cheeses", kind="list")
-#  last example is almost equivalent to following (but insists that the
-#  control be a ListControl -- so it will skip any non-list controls that
-#  come before the control we want)
-form["cheeses"] = ["gouda"]
-# The kind argument can also take values "multilist", "singlelist", "text",
-# "clickable" and "file":
-#  find first control that will accept text, and scribble in it
-form.set_value("rhubarb rhubarb", kind="text")
-form.set_value(["spam"], kind="singlelist")
-
-# You can find controls with a general predicate function:
-#  find first control with attribute named "whatever"
-def control_has_caerphilly(control):
-    for item in control.items:
-        if item.name == "caerphilly": return True
-form.find_control(kind="list", predicate=control_has_caerphilly)
+form.find_control(id="chz").get(label="Mozzarella").selected = False
 
 # Often, a single checkbox (a CHECKBOX control with a single item) is
 # present.  In that case, the name of the single item isn't of much
@@ -121,25 +165,7 @@
 form.find_control("smelly").items[0].selected = True  # check
 form.find_control("smelly").items[0].selected = False  # uncheck
 
-# Add files to FILE controls with .add_file().  Only call this multiple
-# times if the server is expecting multiple files.
-#  add a file, default value for MIME type, no filename sent to server
-form.add_file(open("data.dat"))
-#  add a second file, explicitly giving MIME type, and telling the server
-#   what the filename is
-form.add_file(open("data.txt"), "text/plain", "data.txt")
-
-# All Controls may be disabled (equivalent of greyed-out in browser)
-control = form.find_control("comments")
-print control.disabled
-# ...or readonly
-print control.readonly
-# readonly and disabled attributes can be assigned to
-control.disabled = False
-# convenience method, used here to make all controls writable (unless
-# they're disabled):
-form.set_all_readonly(False)
-# Items may also be disabled (selecting or de-selecting a disabled item is
+# Items may be disabled (selecting or de-selecting a disabled item is
 # not allowed):
 control = form.find_control("cheeses")
 print control.get("emmenthal").disabled
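
The reworked example.py distinguishes items that are merely selected from
items that are successful (selected and not disabled). The sketch below
illustrates that distinction on its own, without ClientForm. The Item class
and both helper functions are hypothetical stand-ins written for this note;
they are not ClientForm's implementation.

```python
class Item(object):
    """Hypothetical stand-in for a list item; not ClientForm's Item class."""

    def __init__(self, name, selected=False, disabled=False):
        self.name = name
        self.selected = selected
        self.disabled = disabled


def selected_names(items):
    """Names of items that are selected, whether or not they are disabled."""
    return [item.name for item in items if item.selected]


def successful_names(items):
    """Names of items that are selected and not disabled."""
    return [item.name for item in items if item.selected and not item.disabled]


# Illustrative data only: parmesan is selected but disabled.
cheeses = [
    Item("parmesan", selected=True, disabled=True),
    Item("cheddar", selected=True),
    Item("edam"),
]
```

Here "parmesan" shows up in selected_names(cheeses) but not in
successful_names(cheeses), which mirrors the two membership tests in the
example script.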

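The README change says that item labels are now matched by substring rather
than by strict equality, with leading and trailing space always stripped. A
minimal sketch of that rule follows. The function name label_matches and its
substring flag are hypothetical, and stripping both strings and comparing
case-sensitively are assumptions made here, not confirmed ClientForm
behaviour.

```python
def label_matches(wanted, label_text, substring=True):
    """Hypothetical helper showing substring versus strict label matching."""
    # Assumption: surrounding whitespace is ignored on both sides.
    wanted = wanted.strip()
    label_text = label_text.strip()
    if substring:
        # Substring rule: the wanted text may appear anywhere in the label.
        return wanted in label_text
    # Strict rule: the stripped strings must be identical.
    return wanted == label_text
```

Under these assumptions, "select a cheese" matches the label
"Please select a cheese" with the substring rule but not with the strict
rule.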
