| % mechanize -- Documentation |
| |
| Full API documentation is in the docstrings and in the documentation of |
| [`urllib2`](http://docs.python.org/release/2.6/library/urllib2.html). The |
| documentation on these web pages still needs reorganising following the |
| merge of ClientCookie and ClientForm into mechanize. |
| |
| |
| Tests and examples |
| ------------------ |
| |
| ### Examples ### |
| |
| The [front page](./) has some introductory examples. |
| |
| The `examples` directory in the source packages contains a couple of silly, |
| but working, scripts to demonstrate basic use of the module. |
| |
| See also the [forms examples](./forms.html) (these examples use the forms API |
| independently of `mechanize.Browser`). |
| |
| |
| ### Tests ### |
| |
| To run the tests: |
| |
|     python test.py |
| |
| There are some tests that try to fetch URLs from the internet. To include |
| those in the test run: |
| |
|     python test.py discover --tag internet |
| |
| |
| The `urllib2` interface |
| ----------------------- |
| |
| mechanize exports the complete interface of `urllib2`. See the [`urllib2` |
| documentation](http://docs.python.org/release/2.6/library/urllib2.html). For |
| example: |
| |
| ~~~~{.python} |
| import mechanize |
| response = mechanize.urlopen("http://www.example.com/") |
| print response.read() |
| ~~~~ |
| |
| |
| Compatibility |
| ------------- |
| |
| These notes explain the relationship between mechanize, ClientCookie, |
| ClientForm, `cookielib` and `urllib2`, and which to use when. If you're just |
| using mechanize, and not any of those other libraries, you can ignore this |
| section. |
| |
| #. mechanize works with Python 2.4, Python 2.5, Python 2.6, and Python 2.7. |
| |
| #. When using mechanize, anything you would normally import from `urllib2` |
| should be imported from `mechanize` instead. |
| |
| #. Use of mechanize classes with `urllib2` (and vice-versa) is no longer |
| supported. However, existing classes implementing the `urllib2 Handler` |
| interface are likely to work unchanged with mechanize. |
| |
| #. mechanize now only imports `urllib2.URLError` and `urllib2.HTTPError` from |
| `urllib2`. The rest is forked. I intend to merge fixes from Python trunk |
| frequently. |
| |
| #. ClientForm is no longer maintained as a separate package. The code is |
| now part of mechanize, and its interface is now exported through module |
| mechanize (since mechanize 0.2.0). Old code can simply be changed to |
| `import mechanize as ClientForm` and should continue to work. |
| |
| #. ClientCookie is no longer maintained as a separate package. The code is |
| now part of mechanize, and its interface is now exported through module |
| mechanize (since mechanize 0.1.0). Old code can simply be changed to |
| `import mechanize as ClientCookie` and should continue to work. |
| |
| #. The cookie handling parts of mechanize are in the Python 2.4 standard |
| library as module `cookielib` and extensions to module `urllib2`. mechanize |
| does not currently use `cookielib`, due to the presence of thread |
| synchronisation code in `cookielib` that is not present in the mechanize |
| fork of `cookielib`. |
| |
| API differences between mechanize and `urllib2`: |
| |
| #. mechanize provides additional features. |
| |
| #. `mechanize.urlopen` differs in its behaviour: it handles cookies, whereas |
| `urllib2.urlopen` does not. To make a `urlopen` function with the |
| `urllib2` behaviour: |
| |
| ~~~~{.python} |
| import mechanize |
| # Handlers without cookie support, to match the urllib2 behaviour. |
| handler_classes = [mechanize.ProxyHandler, |
|                    mechanize.UnknownHandler, |
|                    mechanize.HTTPHandler, |
|                    mechanize.HTTPDefaultErrorHandler, |
|                    mechanize.HTTPRedirectHandler, |
|                    mechanize.FTPHandler, |
|                    mechanize.FileHandler, |
|                    mechanize.HTTPErrorProcessor] |
| opener = mechanize.OpenerDirector() |
| for handler_class in handler_classes: |
|     opener.add_handler(handler_class()) |
| urlopen = opener.open |
| ~~~~ |
| |
| #. Since Python 2.6, `urllib2` uses a `.timeout` attribute on `Request` |
| objects internally. However, `urllib2.Request` has no `timeout` constructor |
| argument, and `urllib2.urlopen()` ignores this attribute. |
| `mechanize.Request` has a `timeout` constructor argument which is used to |
| set the attribute of the same name, and `mechanize.urlopen()` does not |
| ignore the timeout attribute. |
| |
| |
| UserAgent vs UserAgentBase |
| -------------------------- |
| |
| `mechanize.UserAgent` is a trivial subclass of `mechanize.UserAgentBase`, |
| adding just one method, `.set_seekable_responses()` (see the [documentation on |
| seekable responses](./doc.html#seekable-responses)). |
| |
| The reason for the extra class is that `mechanize.Browser` depends on seekable |
| response objects (because response objects are used to implement the browser |
| history). |
| |
| |