[lxml-dev] lxml.html and forms
Ian Bicking
ianb at colorstudy.com
Mon Jul 16 23:33:05 CEST 2007
Stefan Behnel wrote:
>> A single
>> checkbox to a boolean (kind of... it's a little fuzzy; it kind of maps
>> to None/the-value-of-the-checkbox, but I could allow a true/false setter
>> as well).
>
> Hmm, except for an empty string value, Python's idea of a truth value would
> match that. And as you said, changing the form structure is not really
> intended, so you'd normally not change the value string but rather the
> "checked" property. So, assigning a truth value would simply change that,
> whereas a string value could still change the value property. The return value
> would then be the string value or None.
>
> For the special case of an empty string, you could return a string subclass
> that evaluates to the bool value True. Not sure if I like this, though, sounds
> like too much magic - and you never know where values end up in application
> code... Maybe it's a rare enough corner case to accept this, though. Or isn't
> there a Unicode character like "zero width space" or something like that, that
> we could return instead?
The empty string is definitely a corner case, as many server-side
languages would treat that as false already.
Maybe it could just be returned as True in that case. This could break
code that expects a string, but it's such a strange case anyway that I
don't mind too much. Or I could return a subclass of str that is
true, which is also very weird, but again it's very much a corner case
so maybe it's not that big a deal. If you don't give a value to a
checkbox it defaults to "on" anyway, so only an explicit value="" causes
this.
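
A minimal sketch of the "str subclass that is true" option, not what lxml.html does. The names TrueStr and checkbox_value are hypothetical, and the "on" default is modeled as a plain keyword argument. In Python 3 the truth hook is __bool__ (Python 2 called it __nonzero__).

```python
class TrueStr(str):
    """Hypothetical str subclass that is truthy even when it is empty."""

    def __bool__(self):
        return True

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


def checkbox_value(checked, value="on"):
    """Return the checkbox's value if it is checked, otherwise None.

    An explicit value="" is wrapped in TrueStr so that a checked box
    never evaluates to false. (Assumed behavior, for illustration only.)
    """
    if not checked:
        return None
    return value if value else TrueStr(value)


empty_but_checked = checkbox_value(True, "")
```

The result still compares equal to "" and behaves like a string everywhere else, which is exactly why it feels like magic: code that tests `if value:` and code that tests `if value == "":` will both succeed.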
>> Multi-select to a set, etc. Radio buttons would map to a
>> single value, but I'd also want to give some access to the possible set
>> of values (since unlike a text box there is a constrained set of
>> possible values).
>
> Ok, so, how would you set them?
>
> >>> form.inputs["my_radio_name"] = "new_value"
>
> Like this? This would then deselect all other radio buttons with the name
> "my_radio_name" and only select the one with the "new_value" value. If we
> adopt this, reading the property should definitely return the selected value
> as a single string:
>
> >>> form.inputs["my_radio_name"]
> 'new_value'
Yes, right now it works like:
form.inputs['my_radio_name'].value = 'new_value'
Where form.inputs['my_radio_name'] is a subclass of list, which contains
all the radio input elements and also allows this group setting. If
it's a group of checkboxes, it's:
form.inputs['my_checkbox_name'].value.add('value1')
Which checks the checkbox with the value 'value1'. You can also assign
to value, which clears the set and assigns values from the iterator you
give. So basically I could take what I have now, and just always
get/set .value to create a flatish dictionary. And if you assign
directly to the dictionary, it would clear the current values and then
update with the values you give, just like the set works.
Whether this should replace or augment .inputs, I'm not sure. I think
augment, since .inputs gives you access to all the elements, which
sometimes you will want.
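
A rough sketch of the radio-group half of this, under stated assumptions: it uses the standard library's xml.etree.ElementTree rather than lxml, and RadioGroup and form_values are hypothetical names, not the actual implementation. Checkbox groups (a set-valued .value) are left out to keep it short.

```python
import xml.etree.ElementTree as ET


class RadioGroup(list):
    """Hypothetical list of radio <input> elements that share one name."""

    @property
    def value(self):
        # The group's value is the value of the checked button, if any.
        for el in self:
            if el.get("checked") is not None:
                return el.get("value")
        return None

    @value.setter
    def value(self, new_value):
        options = [el.get("value") for el in self]
        if new_value is not None and new_value not in options:
            raise ValueError("no radio button with value %r" % (new_value,))
        # Check the matching button and uncheck all the others.
        for el in self:
            if new_value is not None and el.get("value") == new_value:
                el.set("checked", "checked")
            else:
                el.attrib.pop("checked", None)


def form_values(groups):
    """Flatten {name: group} into a values-only {name: value} dictionary."""
    return {name: group.value for name, group in groups.items()}


form = ET.fromstring(
    "<form>"
    '<input type="radio" name="size" value="s" checked="checked"/>'
    '<input type="radio" name="size" value="m"/>'
    "</form>"
)
inputs = {"size": RadioGroup(form.findall("input"))}
inputs["size"].value = "m"
flat = form_values(inputs)
```

Here `inputs` plays the role of the element accessor and `flat` the values-only view, so both can exist side by side.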
> Maybe we could return a subclass with an "element" property that returns the
> Element that carries that value?
>
> >>> form.inputs["my_radio_name"].element
> <Element 'radio' at ...>
Then we have something stringish that isn't quite a string. And when
you do an assignment, you get back something that's different from what
you assigned. It all feels too magic to me. I think we can just have two
accessors, one that gives you elements (like the current form.inputs)
and one that gives you values only.
>> Right now you get that with
>> form.inputs['radio_name'].value_options, but that won't work with a
>> flatter dictionary.
>
> Why not? I actually like that.
You'd also have to augment the string-like object, since
form.inputs['radio_name'] would be the value of the currently checked
radio button.
>> Maybe there'd generally be a
>> form_values.options('field_name'), which would be None for
>> unconstrained, and a set for constrained fields.
>
> Sounds too generic for a simple case. You shouldn't forget that you can't
> really fill a form without knowing what is a radio button and what is a
> checkbox, so there is not much to gain by providing a generic API.
>
> hasattr(el, "value_options")
>
> is also easy to write and reads better than
>
> el.value_options is None
Yes, most of the time you'll be filling out forms that you expect to
have very particular fields. But it's useful generally. With a flat
dictionary it's hard to get access to per-field information, so there
has to be some other means of access.
Anyway, currently value_options is only set on those elements and
objects where it makes sense.
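
To illustrate the two styles being compared, here is a small sketch. TextField, ChoiceField and options_for are hypothetical stand-ins, not lxml classes. Only the constrained field carries value_options, and a generic helper maps "attribute missing" to None.

```python
class TextField:
    """Hypothetical unconstrained field: it has no value_options attribute."""

    def __init__(self, value=""):
        self.value = value


class ChoiceField:
    """Hypothetical constrained field: value_options lists the legal values."""

    def __init__(self, value_options, value=None):
        self.value_options = list(value_options)
        self.value = value


def options_for(field):
    """Return the allowed values as a set, or None if the field is unconstrained."""
    if hasattr(field, "value_options"):
        return set(field.value_options)
    return None


fields = {
    "name": TextField("Ian"),
    "size": ChoiceField(["s", "m", "l"], "m"),
}
summary = {name: options_for(field) for name, field in fields.items()}
```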
>>>> Another option question is actual form submission. Right now it uses
>>>> urllib. But I like httplib2, for instance, and I'd like it to be
>>>> possible to use that.
>>> Alternatively, you could provide a simple interface that takes a URL
>>> and a list of name-value pairs and opens it.
>> That's what I was thinking of. I don't like module global settings at
>> all. Passing it in to submit seems fine. I was thinking about using a
>> class variable too, if you wanted to subclass the elements, or just set
>> it manually on a particular instance. Maybe it would be attached to the
>> tree object? E.g.:
>>
>> foo = parse(blah)
>> foo.getroottree().urlfetch = my_url_fetch
>
> That wouldn't work, as ElementTrees (and Elements) are not kept alive by the
> tree, so you can't store state in them.
Hrm... that's too bad. I'd like to keep some kind of local information
around, ideally inherited as you go from page to page. I really hate
global settings.
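
One way to avoid the global setting is to pass the fetcher into submit on each call. The sketch below assumes a minimal fetcher interface of (method, url, name-value pairs) returning a file-like object. The names submit, default_fetch and fake_fetch are hypothetical, and the default is written against Python 3's urllib.

```python
import io
from urllib.parse import urlencode
from urllib.request import urlopen


def default_fetch(method, url, pairs):
    """Assumed default fetcher: send the pairs with urllib, return a file-like."""
    data = urlencode(pairs)
    if method.upper() == "GET":
        separator = "&" if "?" in url else "?"
        return urlopen(url + separator + data)
    return urlopen(url, data.encode("ascii"))


def submit(method, action, pairs, fetch=None):
    """Submit form values through `fetch`, falling back to default_fetch."""
    if fetch is None:
        fetch = default_fetch
    return fetch(method, action, list(pairs))


# A stand-in fetcher (where an httplib2 wrapper could go); no network access.
calls = []


def fake_fetch(method, url, pairs):
    calls.append((method, url, pairs))
    return io.BytesIO(b"<html><body>ok</body></html>")


reply = submit("POST", "http://example.com/login", [("user", "ian")], fetch=fake_fetch)
body = reply.read()
```

This keeps the setting local to the call, but it does not give the "inherited from page to page" behavior; the caller has to keep passing the same fetcher along.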
>> I was also thinking about whether I should return a new parsed page, or
>> just a file-like, or what. Or a file-like object that has a method to
>> get the page, perhaps; e.g., new_page = form.submit().document(). I
>> don't think the url fetching function would need to do any of this, it
>> would just have a very minimal interface and the submit method would
>> wrap it up in whatever seems most convenient.
>
> You can't return a parsed tree as the server reply can be anything from XML to
> weird binary. I think a file-like serves most purposes. Maybe an additional
> "parse()" method would work here, but I don't think it's necessary.
>
> >>> reply_tree = parse(form.submit())
>
> works just fine, is intuitive and avoids overhead.
Yeah, you are probably right. The etree parse method works just fine
right now, especially if it already picks up the url.
--
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
: Write code, do good : http://topp.openplans.org/careers