Parsing HTML Forms
Sometimes in functional tests, information from a generated form must
be extracted in order to re-submit it as part of a subsequent request.
The zope.testing.formparser module can be used for this purpose.
The scanner is implemented using the FormParser class. The
constructor arguments are the page data containing the form and
(optionally) the URL from which the page was retrieved:
>>> import zope.testing.formparser
>>> page_text = '''\
... <html><body>
... <form name="form1" action="/cgi-bin/foobar.py" method="POST">
... <input type="hidden" name="f1" value="today" />
... <input type="submit" name="do-it-now" value="Go for it!" />
... <input type="IMAGE" name="not-really" value="Don't."
... src="dont.png" />
... <select name="pick-two" size="3" multiple>
... <option value="one" selected>First</option>
... <option value="two" label="Second">Another</option>
... <optgroup>
... <option value="three">Third</option>
... <option selected="selected">Fourth</option>
... </optgroup>
... </select>
... </form>
...
... Just for fun, a second form, after specifying a base:
... <base href="http://www.example.com/base/" />
... <form action = 'sproing/sprung.html' enctype="multipart/form">
... <textarea name="sometext" rows="5">Some text.</textarea>
... <input type="Image" name="action" value="Do something."
... src="else.png" />
... <input type="text" value="" name="multi" size="2" />
... <input type="text" value="" name="multi" size="3" />
... </form>
... </body></html>
... '''
>>> parser = zope.testing.formparser.FormParser(page_text)
>>> forms = parser.parse()
>>> len(forms)
2
>>> forms.form1 is forms[0]
True
>>> forms.form1 is forms[1]
False
More often, the parse() convenience function is all that’s needed:
>>> forms = zope.testing.formparser.parse(
... page_text, "http://cgi.example.com/somewhere/form.html")
>>> len(forms)
2
>>> forms.form1 is forms[0]
True
>>> forms.form1 is forms[1]
False
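The access pattern above, where a form can be reached both by position and by its name attribute, can be sketched in plain Python. This is only an illustration of the pattern, not the library's implementation; the NamedItems class and the stand-in objects are hypothetical names introduced for the example.

```python
from types import SimpleNamespace


class NamedItems(list):
    """A list whose items can also be reached as attributes via their .name."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        for item in self:
            if getattr(item, "name", None) == name:
                return item
        raise AttributeError(name)


# Hypothetical stand-ins for two parsed forms; the second one has no name,
# like the second form in the sample page.
forms = NamedItems([
    SimpleNamespace(name="form1"),
    SimpleNamespace(name=None),
])

print(forms.form1 is forms[0])
print(forms.form1 is forms[1])
```

An unnamed form stays reachable by index only, which matches how the second form is accessed as forms[1] throughout this document.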
Once we have the form we’re interested in, we can check form attributes and individual field values:
>>> form = forms.form1
>>> form.enctype
'application/x-www-form-urlencoded'
>>> form.method
'post'
>>> keys = sorted(form.keys())
>>> keys
['do-it-now', 'f1', 'not-really', 'pick-two']
>>> not_really = form["not-really"]
>>> not_really.type
'image'
>>> not_really.value
"Don't."
>>> not_really.readonly
False
>>> not_really.disabled
False
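Note that relative URLs are converted to absolute URLs based on the
<base> element (if present) or on the page URL passed to the
constructor or to parse(). In this example the first form precedes
the <base> element, so its URLs are resolved against the page URL,
while the second form follows it and is resolved against the <base>
href: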
>>> form.action
'http://cgi.example.com/cgi-bin/foobar.py'
>>> not_really.src
'http://cgi.example.com/somewhere/dont.png'
>>> forms[1].action
'http://www.example.com/base/sproing/sprung.html'
>>> forms[1]["action"].src
'http://www.example.com/base/else.png'
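The same results can be reproduced with the standard library's urljoin, which may help when predicting what a test should expect. This is a sketch of the resolution rule only; whether formparser uses urljoin internally is an assumption not confirmed by this document.

```python
from urllib.parse import urljoin

# URL the page was retrieved from (taken from the parse() call above).
page_url = "http://cgi.example.com/somewhere/form.html"
# href of the <base> element that appears before the second form.
base_href = "http://www.example.com/base/"

# First form: no <base> has been seen yet, so the page URL is the reference.
form1_action = urljoin(page_url, "/cgi-bin/foobar.py")
image_src = urljoin(page_url, "dont.png")

# Second form: resolved against the <base> href instead.
form2_action = urljoin(base_href, "sproing/sprung.html")

print(form1_action)
print(image_src)
print(form2_action)
```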
Repeated fields are reported as a list of objects, one for each instance of the field:
>>> field = forms[1]["multi"]
>>> isinstance(field, list)
True
>>> [o.value for o in field]
['', '']
>>> [o.size for o in field]
[2, 3]
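Because a lookup may return either a single field or a list of fields, code that re-submits a form usually has to normalize the two cases. The following is a minimal sketch of one way to do that; as_list and to_pairs are hypothetical helpers, and the stand-in objects assume nothing beyond a value attribute. Fields whose value is itself a list, such as a multiple select, are not handled here.

```python
from types import SimpleNamespace
from urllib.parse import urlencode


def as_list(field):
    """Return a list of field objects whether or not the name was repeated."""
    return field if isinstance(field, list) else [field]


def to_pairs(fields):
    """Flatten a name -> field (or list of fields) mapping into (name, value) pairs."""
    pairs = []
    for name, field in fields.items():
        for item in as_list(field):
            pairs.append((name, item.value))
    return pairs


# Hypothetical stand-ins for parsed fields, using values from the second form.
fields = {
    "sometext": SimpleNamespace(value="Some text."),
    "multi": [SimpleNamespace(value=""), SimpleNamespace(value="")],
}

pairs = to_pairs(fields)
body = urlencode(pairs)  # URL-encoded body for a subsequent request
print(body)
```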
The <textarea> element provides some additional attributes:
>>> ta = forms[1]["sometext"]
>>> print(ta.rows)
5
>>> print(ta.cols)
None
>>> ta.value
'Some text.'
The <select> element provides access to the options as well:
>>> select = form["pick-two"]
>>> select.multiple
True
>>> select.size
3
>>> select.type
'select'
>>> select.value
['one', 'Fourth']
>>> options = select.options
>>> len(options)
4
>>> [opt.label for opt in options]
['First', 'Second', 'Third', 'Fourth']
>>> [opt.value for opt in options]
['one', 'two', 'three', 'Fourth']
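The outputs above suggest a defaulting rule: an option without a value attribute takes its text as the value, and an option without a label attribute takes its text as the label. The sketch below reproduces that rule with the standard library's html.parser. It is an illustration inferred from the outputs shown, not formparser's own code, and the OptionCollector class is a hypothetical name.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect value, label and selected state for each <option> element."""

    def __init__(self):
        super().__init__()
        self.options = []     # finished options, as dicts
        self._current = None  # option currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            attrs = dict(attrs)
            self._current = {
                "value": attrs.get("value"),
                "label": attrs.get("label"),
                # A bare "selected" attribute is reported with value None.
                "selected": "selected" in attrs,
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "option" and self._current is not None:
            opt = self._current
            text = opt.pop("text").strip()
            # Fall back to the element text when an attribute is missing.
            if opt["value"] is None:
                opt["value"] = text
            if opt["label"] is None:
                opt["label"] = text
            self.options.append(opt)
            self._current = None


markup = '''
<select name="pick-two" size="3" multiple>
<option value="one" selected>First</option>
<option value="two" label="Second">Another</option>
<optgroup>
<option value="three">Third</option>
<option selected="selected">Fourth</option>
</optgroup>
</select>
'''

collector = OptionCollector()
collector.feed(markup)
collector.close()

labels = [o["label"] for o in collector.options]
values = [o["value"] for o in collector.options]
selected = [o["value"] for o in collector.options if o["selected"]]
print(labels)
print(values)
print(selected)
```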