blob: 1e0793d413b0fdb1191683c293d61f663618a91d [file] [log] [blame]
% mechanize -- Documentation
Full API documentation is in the docstrings and the documentation of
[`urllib2`](http://docs.python.org/release/2.6/library/urllib2.html). The
documentation in these web pages is in need of reorganisation at the moment,
after the merge of ClientCookie and ClientForm into mechanize.
Tests and examples
------------------
### Examples ###
The [front page](./) has some introductory examples.
The `examples` directory in the source packages contains a couple of silly,
but working, scripts to demonstrate basic use of the module.
See also the [forms examples](./forms.html) (these examples use the forms API
independently of `mechanize.Browser`).
### Tests ###
To run the tests:
python test.py
There are some tests that try to fetch URLs from the internet. To include
those in the test run:
python test.py discover --tag internet
The `urllib2` interface
-----------------------
mechanize exports the complete interface of `urllib2`. See the [`urllib2`
documentation](http://docs.python.org/release/2.6/library/urllib2.html). For
example:
~~~~{.python}
import mechanize
response = mechanize.urlopen("http://www.example.com/")
print response.read()
~~~~
Compatibility
-------------
These notes explain the relationship between mechanize, ClientCookie,
ClientForm, `cookielib` and `urllib2`, and which to use when. If you're just
using mechanize, and not any of those other libraries, you can ignore this
section.
#. mechanize works with Python 2.4, Python 2.5, Python 2.6, and Python 2.7.
#. When using mechanize, anything you would normally import from `urllib2`
should be imported from `mechanize` instead.
#. Use of mechanize classes with `urllib2` (and vice-versa) is no longer
supported. However, existing classes implementing the `urllib2 Handler`
interface are likely to work unchanged with mechanize.
#. mechanize now only imports `urllib2.URLError` and `urllib2.HTTPError` from
`urllib2`. The rest is forked. I intend to merge fixes from Python trunk
frequently.
#. ClientForm is no longer maintained as a separate package. The code is
now part of mechanize, and its interface is now exported through module
mechanize (since mechanize 0.2.0). Old code can simply be changed to
`import mechanize as ClientForm` and should continue to work.
#. ClientCookie is no longer maintained as a separate package. The code is
now part of mechanize, and its interface is now exported through module
mechanize (since mechanize 0.1.0). Old code can simply be changed to
`import mechanize as ClientCookie` and should continue to work.
#. The cookie handling parts of mechanize are in Python 2.4 standard library
as module `cookielib` and extensions to module `urllib2`. mechanize does
not currently use `cookielib`, due to the presence of thread
synchronisation code in `cookielib` that is not present in the mechanize
fork of `cookielib`.
API differences between mechanize and `urllib2`:
#. mechanize provides additional features.
#. `mechanize.urlopen` differs in its behaviour: it handles cookies, whereas
`urllib2.urlopen` does not. To make a `urlopen` function with the
`urllib2` behaviour:
~~~~{.python}
import mechanize
handler_classes = [mechanize.ProxyHandler,
mechanize.UnknownHandler,
mechanize.HTTPHandler,
mechanize.HTTPDefaultErrorHandler,
mechanize.HTTPRedirectHandler,
mechanize.FTPHandler,
mechanize.FileHandler,
mechanize.HTTPErrorProcessor]
opener = mechanize.OpenerDirector()
for handler_class in handler_classes:
opener.add_handler(handler_class())
urlopen = opener.open
~~~~
#. Since Python 2.6, `urllib2` uses a `.timeout` attribute on `Request`
objects internally. However, `urllib2.Request` has no timeout constructor
argument, and `urllib2.urlopen()` ignores this parameter.
`mechanize.Request` has a `timeout` constructor argument which is used to
set the attribute of the same name, and `mechanize.urlopen()` does not
ignore the timeout attribute.
UserAgent vs UserAgentBase
--------------------------
`mechanize.UserAgent` is a trivial subclass of `mechanize.UserAgentBase`,
adding just one method, `.set_seekable_responses()` (see the [documentation on
seekable responses](./doc.html#seekable-responses)).
The reason for the extra class is that `mechanize.Browser` depends on seekable
response objects (because response objects are used to implement the browser
history).
<!-- Local Variables: -->
<!-- fill-column:79 -->
<!-- End: -->