
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
         "http://www.w3.org/TR/html4/strict.dtd">
 @# This file is processed by EmPy
 @{
 from colorize import colorize
 import time
 import release
 last_modified = release.last_modified(empy.identify()[0])
 try:
     base
 except NameError:
     base = False
 try:
     version
 except NameError:
     version = "dummy version"
 }
 <html>
 <!--This file was generated by EmPy: do not edit-->
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <meta name="author" content="John J. Lee &lt;jjl@@pobox.com&gt;">
   <meta name="date" content="@(time.strftime("%Y-%m-%d", last_modified))">
   <meta name="keywords" content="Python,HTML,HTTP,browser,stateful,web,client,client-side,mechanize,cookie,form,META,HTTP-EQUIV,Refresh,ClientForm,ClientCookie,pullparser,WWW::Mechanize">
   <meta name="keywords" content="cookie,HTTP,Python,web,client,client-side,HTML,META,HTTP-EQUIV,Refresh">
   <title>mechanize</title>
   <style type="text/css" media="screen">@@import "../styles/style.css";</style>
   <!--[if IE 6]>
   <style type="text/css" media="screen">@@import "../styles/style-ie6.css";</style>
   <![endif]-->
   @[if base]<base href="http://wwwsearch.sourceforge.net/mechanize/">@[end if]
 </head>
 <body>

 <div id="sf"><a href="http://sourceforge.net">
 <img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
  width="125" height="37" alt="SourceForge.net Logo"></a></div>
 <!--<img src="../images/sflogo.png"-->

 <h1>mechanize</h1>

 <div id="Content">

 <p>Stateful programmatic web browsing in Python, after Andy Lester's Perl
 module <a
 href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
 </a>.

 <ul>

   <li><code>mechanize.Browser</code> and <code>mechanize.UserAgentBase</code>
     implement the interface of <code>urllib2.OpenerDirector</code>, so:
     <ul>
       <li>any URL can be opened, not just <code>http:</code>

       <li><code>mechanize.UserAgentBase</code> offers easy dynamic
       configuration of user-agent features like protocol, cookie,
       redirection and <code>robots.txt</code> handling, without having
       to make a new <code>OpenerDirector</code> each time, e.g.  by
       calling <code>build_opener()</code>.

     </ul>
   <li>Easy HTML form filling.
   <li>Convenient link parsing and following.
   <li>Browser history (<code>.back()</code> and <code>.reload()</code>
     methods).
   <li>The <code>Referer</code> HTTP header is added properly (optional).
   <li>Automatic observance of <a
     href="http://www.robotstxt.org/wc/norobots.html">
     <code>robots.txt</code></a>.
   <li>Automatic handling of HTTP-Equiv and Refresh.
 </ul>
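<p>As a rough illustration of what observing <code>robots.txt</code> involves, the sketch below uses the standard library's robot parser directly rather than mechanize.  The rules, the <code>ExampleBot</code> agent name, the <code>allowed()</code> helper and the URLs are all invented for the example, and this is not a description of how mechanize implements the feature.

```python
# Illustration only: stdlib robots.txt parsing, not mechanize internals.
try:
    from urllib import robotparser  # Python 3
except ImportError:
    import robotparser  # Python 2

# Hypothetical robots.txt content for a site that hides /private/.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)


def allowed(url, agent="ExampleBot"):
    """Return True if the (made-up) agent may fetch url under the rules."""
    return parser.can_fetch(agent, url)


print(allowed("http://www.example.com/index.html"))
print(allowed("http://www.example.com/private/data.html"))
```

<p>A client that observes <code>robots.txt</code> performs a check of this kind before each request and refuses to fetch disallowed URLs.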


 <a name="examples"></a>
 <h2>Examples</h2>

 <p class="docwarning">This documentation is in need of reorganisation and
 extension!</p>

 <p>The examples below are written for a website that does not exist
 (<code>example.com</code>), so cannot be run.  There are also
 some <a href="./#tests">working examples</a> that you can run.

 @{colorize(r"""
 import re
 import mechanize

 br = mechanize.Browser()
 br.open("http://www.example.com/")
 # follow second link with element text matching regular expression
 response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
 assert br.viewing_html()
 print br.title()
 print response1.geturl()
 print response1.info()  # headers
 print response1.read()  # body

 br.select_form(name="order")
 # Browser passes through unknown attributes (including methods)
 # to the selected HTMLForm.
 br["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
 # Submit current form.  Browser calls .close() on the current response on
 # navigation, so this closes response1
 response2 = br.submit()

 # print currently selected form (don't call .submit() on this, use br.submit())
 print br.form

 response3 = br.back()  # back to cheese shop (same data as response1)
 # the history mechanism returns cached response objects
 # we can still use the response, even though it was .close()d
 response3.get_data()  # like .seek(0) followed by .read()
 response4 = br.reload()  # fetches from server

 for form in br.forms():
     print form
 # .links() optionally accepts the keyword args of .follow_/.find_link()
 for link in br.links(url_regex="python.org"):
     print link
     br.follow_link(link)  # takes EITHER Link instance OR keyword args
     br.back()
 """)}

 <p>You may control the browser's policy by using the methods of
 <code>mechanize.Browser</code>'s base class, <code>mechanize.UserAgent</code>.
 For example:

 @{colorize("""
 import logging
 import sys

 import mechanize

 br = mechanize.Browser()
 # Explicitly configure proxies (Browser will attempt to set good defaults).
 # Note the userinfo ("joe:password@") and port number (":3128") are optional.
 br.set_proxies({"http": "joe:password@myproxy.example.com:3128",
                 "ftp": "proxy.example.com",
                 })
 # Add HTTP Basic/Digest auth username and password for HTTP proxy access.
 # (equivalent to using "joe:password@..." form above)
 br.add_proxy_password("joe", "password")
 # Add HTTP Basic/Digest auth username and password for website access.
 br.add_password("http://example.com/protected/", "joe", "password")
 # Don't handle HTTP-EQUIV headers (HTTP headers embedded in HTML).
 br.set_handle_equiv(False)
 # Ignore robots.txt.  Do not do this without thought and consideration.
 br.set_handle_robots(False)
 # Don't add Referer (sic) header
 br.set_handle_referer(False)
 # Don't handle Refresh redirections
 br.set_handle_refresh(False)
 # Don't handle cookies
 br.set_cookiejar(None)
 # Supply your own mechanize.CookieJar (NOTE: cookie handling is ON by
 # default: no need to do this unless you have some reason to use a
 # particular cookiejar)
 cj = mechanize.CookieJar()
 br.set_cookiejar(cj)
 # Log information about HTTP redirects and Refreshes.
 br.set_debug_redirects(True)
 # Log HTTP response bodies (i.e. the HTML, most of the time).
 br.set_debug_responses(True)
 # Print HTTP headers.
 br.set_debug_http(True)

 # To make sure you're seeing all debug output:
 logger = logging.getLogger("mechanize")
 logger.addHandler(logging.StreamHandler(sys.stdout))
 logger.setLevel(logging.INFO)

 # Sometimes it's useful to process bad headers or bad HTML:
 response = br.response()  # this is a copy of response
 headers = response.info()  # currently, this is a mimetools.Message
 headers["Content-type"] = "text/html; charset=utf-8"
 response.set_data(response.get_data().replace("<!---", "<!--"))
 br.set_response(response)
 """)}

 <p>mechanize exports the complete interface of <code>urllib2</code>:

 @{colorize("""
 import mechanize
 response = mechanize.urlopen("http://www.example.com/")
 print response.read()
 """)}


 <p>When using mechanize, anything you would normally import
 from <code>urllib2</code> should be imported from mechanize instead.  In many
 cases, objects imported from mechanize are the same objects provided by
 <code>urllib2</code>.  In many other cases, though, the implementation comes
 from mechanize, either because bug fixes have been applied or the functionality
 of <code>urllib2</code> has been extended in some way.
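
<p>The configuration example above mentions HTTP-EQUIV headers, that is, HTTP headers embedded in HTML as <code>META</code> elements.  As a hedged sketch of the idea (standard library only, not mechanize's implementation), the following collects such pseudo-headers from a page.  The class name <code>EquivCollector</code> and the sample page are invented for illustration.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class EquivCollector(HTMLParser):
    """Collect (header, value) pairs from <meta http-equiv=...> tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.headers = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("http-equiv")
        if name is not None:
            self.headers.append((name, attrs.get("content", "")))


# Hypothetical page asking for a redirect after five seconds.
page = (
    '<html><head>'
    '<meta http-equiv="Refresh" content="5; url=http://www.example.com/next">'
    '<meta name="author" content="nobody">'
    '</head><body></body></html>'
)

collector = EquivCollector()
collector.feed(page)
collector.close()
print(collector.headers)
```

<p>Only the element carrying an <code>http-equiv</code> attribute is collected; a browser that handles HTTP-EQUIV treats each collected pair as if it had arrived as a real response header.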


 <a name="useragentbase"></a>
 <h2>UserAgent vs UserAgentBase</h2>

 <p><code>mechanize.UserAgent</code> is a trivial subclass of
 <code>mechanize.UserAgentBase</code>, adding just one method,
 <code>.set_seekable_responses()</code> (see the <a
 href="./doc.html#seekable">documentation on seekable responses</a>).

 <p>The reason for the extra class is that
 <code>mechanize.Browser</code> depends on seekable response objects
 (because response objects are used to implement the browser history).
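
<p>To illustrate why history depends on seekable responses, here is a toy sketch (not mechanize's code): a response body held in an in-memory buffer can be rewound and read again when the user goes back.  <code>ToyHistory</code> and its methods are hypothetical names.

```python
import io


class ToyHistory(object):
    """A minimal back-stack of seekable response bodies (illustration only)."""

    def __init__(self):
        self._stack = []
        self.current = None

    def visit(self, body):
        # Remember the page we are leaving, then show the new one.
        if self.current is not None:
            self._stack.append(self.current)
        self.current = io.BytesIO(body)
        return self.current

    def back(self):
        # Return the cached response, rewound so it can be read again.
        self.current = self._stack.pop()
        self.current.seek(0)
        return self.current


history = ToyHistory()
first = history.visit(b"cheese shop")
first.read()                      # consume the body once
history.visit(b"order confirmed")
again = history.back()            # same object as `first`, rewound
print(again.read())
```

<p>Without the <code>seek(0)</code> call in <code>back()</code>, the second read would return nothing, because the body had already been consumed.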


 <a name="compatnotes"></a>
 <h2>Compatibility</h2>

 <p>These notes explain the relationship between mechanize, ClientCookie,
 <code>cookielib</code> and <code>urllib2</code>, and which to use when.  If
 you're just using mechanize, and not any of those other libraries, you can
 ignore this section.

 <ol>

   <li>mechanize works with Python 2.4, Python 2.5, and Python 2.6.

   <li>When using mechanize, anything you would normally import
       from <code>urllib2</code> should be imported from <code>mechanize</code>
       instead.

   <li>Use of mechanize classes with <code>urllib2</code> (and vice-versa) is no
       longer supported.  However, existing classes implementing the urllib2
       Handler interface are likely to work unchanged with mechanize.

   <li>mechanize now only imports urllib2.URLError and urllib2.HTTPError.  The
       rest is forked.  I intend to merge fixes from Python trunk frequently.

   <li>ClientCookie is no longer maintained as a separate package.  The code is
       now part of mechanize, and its interface is now exported through module
       mechanize (since mechanize 0.1.0).  Old code can simply be changed to
       <code>import mechanize as ClientCookie</code> and should continue to
       work.

   <li>The cookie handling parts of mechanize are in Python 2.4 standard library
       as module <code>cookielib</code> and extensions to module
       <code>urllib2</code>.  mechanize does not currently use cookielib, due to
       the presence of thread synchronisation code in cookielib that is not
       present in the mechanize fork of cookielib.

 </ol>
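
<p>For readers unfamiliar with the <code>urllib2</code> Handler interface mentioned above, the following is a minimal sketch of such a class, written against the standard library only.  <code>AddHeaderHandler</code> and the <code>X-Example</code> header are invented for illustration; whether a given handler works unchanged with mechanize has to be checked case by case.

```python
try:
    import urllib2 as urllib_request  # Python 2
except ImportError:
    import urllib.request as urllib_request  # Python 3


class AddHeaderHandler(urllib_request.BaseHandler):
    """Hypothetical handler that tags each outgoing HTTP request."""

    def http_request(self, request):
        request.add_header("X-Example", "1")
        return request

    https_request = http_request


handler = AddHeaderHandler()
opener = urllib_request.OpenerDirector()
opener.add_handler(handler)  # registers the *_request hooks

# Exercise the pre-processing hook directly; no network access is needed.
req = urllib_request.Request("http://www.example.com/")
req = handler.http_request(req)
# Request.add_header() stores names capitalized, hence "X-example".
print(req.get_header("X-example"))
```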


 <a name="docs"></a>
 <h2>Documentation</h2>

 <p>Full API documentation is in the docstrings.

 <p>The documentation in the web pages is in need of reorganisation at the
 moment, after the merge of ClientCookie into mechanize.


 <a name="credits"></a>
 <h2>Credits</h2>

 <p>Thanks to all the too-numerous-to-list people who reported bugs and provided
 patches.  Also thanks to Ian Bicking, for persuading me that a
 <code>UserAgent</code> class would be useful, and to Ronald Tschalar for advice
 on Netscape cookies.

 <p>A lot of credit must go to Gisle Aas, who wrote libwww-perl, from which
 large parts of mechanize originally derived, and Andy Lester for the original,
 <a href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
 </a>.  Finally, thanks to the (coincidentally-named) Johnny Lee for the MSIE
 CookieJar Perl code from which mechanize's support for that is derived.


 <a name="download"></a>
 <h2>Download</h2>

 <p>You can install from source, or
 using <a href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>:

 <pre>easy_install mechanize</pre>

 <p><a href="./#git">git access</a> is also available.

 <p>All documentation (including this web page) is included in the distribution.

 <p>This is a stable release.

 <ul>
 <li><a href="./src/mechanize-@(version).tar.gz">mechanize-@(version).tar.gz</a>
 <li><a href="./src/mechanize-@(version).zip">mechanize-@(version).zip</a>
 <li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
 <li><a href="./src/">Older versions.</a>
 </ul>

 <p>For an installation procedure that does not invoke EasyInstall's dependency
 resolution system, see the INSTALL file included with the distribution.


 <a name="git"></a>
 <h2>git repository</h2>

 <p>The <a href="http://git-scm.com/">git</a> repository is <a href="http://github.com/jjlee/mechanize">here</a>.
 To check it out:

 <pre>
 git clone git://github.com/jjlee/mechanize.git
 </pre>

 <a name="tests"></a>
 <h2>Tests and examples</h2>

 <h3>Examples</h3>

 <p>The <code>examples</code> directory in the source packages contains a couple
 of silly, but working, scripts to demonstrate basic use of the module.  Note
 that it's in the nature of web scraping for such scripts to break, so don't be
 too surprised if that happens &#8211; do let me know, though!

 <p>See also the <a href="./forms/">forms examples</a> (these examples use the
 forms code independently of Browser).

 <h3>Functional tests</h3>

 <p>To run the functional tests (which <strong>do</strong> access the network):

 <pre>python functional_tests.py</pre>

 <p>To start a local server and run the functional tests against that (depends
 on <code>twisted.web2</code>):

 <pre>python functional_tests.py -l</pre>

 <h3>Unit tests</h3>

 <p>To run the unit tests (none of which access the network), run the following
 command:

 <pre>python test.py</pre>


 <h2>See also</h2>

 <p>There are several wrappers around mechanize designed for functional testing
 of web applications:

 <ul>

   <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
     <code>zope.testbrowser</code></a> (or
     <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
     <code>ZopeTestBrowser</code></a>, the standalone version).
   <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
 </ul>

 <p>See <a href="../bits/GeneralFAQ.html">General FAQ</a> page for other links
 to related software.


 <a name="faq"></a>
 <h2>FAQs - pre install</h2>
 <ul>
   <li>Which version of Python do I need?
   <p>Python 2.4, 2.5 or 2.6.  Python 3 is not yet supported.
   <li>Does mechanize depend on BeautifulSoup?
   <p>No.  mechanize offers a few (still rather experimental) classes that make
      use of BeautifulSoup, but these classes are not required to use mechanize.
      mechanize bundles BeautifulSoup version 2, so that module is no longer
      required.  A future version of mechanize will support BeautifulSoup
      version 3, at which point mechanize will likely no longer bundle the
      module.
   <li>Does mechanize depend on ClientForm?
   <p>No, ClientForm is now part of mechanize.
   <li>Which license?
   <p>mechanize is dual-licensed: you may pick either the
      <a href="http://www.opensource.org/licenses/bsd-license.php">BSD license</a>,
      or the <a href="http://www.zope.org/Resources/ZPL">ZPL 2.1</a> (both are
      included in the distribution).
 </ul>

 <a name="usagefaq"></a>
 <h2>FAQs - usage</h2>
 <ul>
   <li>I'm not getting the HTML page I expected to see.
     <ul>
       <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
       <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
      </ul>
   <li><code>Browser</code> doesn't have all of the forms/links I see in the
     HTML.  Why not?
   <p>Perhaps the default parser can't cope with invalid HTML.  Try using the
     included BeautifulSoup 2 parser instead:
 @{colorize("""
 import mechanize

 browser = mechanize.Browser(factory=mechanize.RobustFactory())
 browser.open("http://example.com/")
 for form in browser.forms():
     print form
 """)}
   <li>Is JavaScript supported?
   <p>No, sorry.  Try <a href="http://htmlunit.sourceforge.net/">htmlunit</a>.
   <li>My HTTP response data is truncated.
   <p><code>mechanize.Browser</code>'s response objects support the
      <code>.seek()</code> method, and can still be used after
      <code>.close()</code> has been called.  Response
      data is not fetched until it is needed, so navigation away from a URL
      before fetching all of the response will truncate it.
      Call <code>response.get_data()</code> before navigation if you don't want
      that to happen.
   <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
      <code>mechanize.Browser</code> think otherwise?
 @{colorize("""
 b = mechanize.Browser(
     # mechanize's XHTML support needs work, so is currently switched off.  If
     # we want to get our work done, we have to turn it on by supplying a
     # mechanize.Factory (with XHTML support turned on):
     factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
     )
 """)}
   <li>Why don't timeouts work for me?
   <p>Timeouts are ignored with versions of Python earlier than 2.6.
     Timeouts do not apply to DNS lookups.
 </ul>
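
<p>The truncation behaviour described in the FAQ above can be modelled with a toy lazy response.  This is a simplified sketch under stated assumptions, not mechanize's implementation: <code>LazyResponse</code> is a hypothetical class, and an in-memory stream stands in for the network connection.

```python
import io


class LazyResponse(object):
    """Toy model: the body is pulled from `source` only when asked for."""

    def __init__(self, source):
        self._source = source      # stands in for the network connection
        self._cache = io.BytesIO()

    def get_data(self):
        # Fetch whatever has not been read yet, then return everything.
        if self._source is not None:
            self._cache.seek(0, io.SEEK_END)
            self._cache.write(self._source.read())
        return self._cache.getvalue()

    def close(self):
        # Dropping the connection loses any data not yet fetched.
        self._source = None


body = b"<html>whole page</html>"

hasty = LazyResponse(io.BytesIO(body))
hasty.close()                       # navigate away before reading
print(hasty.get_data())             # nothing was fetched

careful = LazyResponse(io.BytesIO(body))
careful.get_data()                  # fetch everything first
careful.close()
print(careful.get_data())           # still complete after close
```

<p>The second case mirrors the advice in the FAQ: reading the whole body before navigating away keeps it available after the response has been closed.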

 <a name="bug_tracker"></a>
 <h2>Bug tracker</h2>

 <p>The (rather new) bug tracker is <a href="http://github.com/jjlee/mechanize/issues">here on github</a>.  It's equally acceptable to file bugs on the tracker or post about them to the mailing list.

 <a name="mailing_list"></a>
 <h2>Mailing list</h2>

 <p>There is
 a <a href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
 mailing list</a>.  I prefer questions and comments to be sent there rather than
 direct to me.

 <p><a href="mailto:jjl@@pobox.com">John J. Lee</a>,
 @(time.strftime("%B %Y", last_modified)).

 <hr>

 </div>

 <div id="Menu">

 @(release.navbar('mechanize'))

 <br>

 <a href="./#examples">Examples</a><br>
 <a href="./#compatnotes">Compatibility</a><br>
 <a href="./#docs">Documentation</a><br>
 <a href="./#download">Download</a><br>
 <a href="./#git">git</a><br>
 <a href="./#faq">FAQs</a><br>
 <a href="./#bug_tracker">Bug tracker</a><br>
 <a href="./#mailing_list">Mailing list</a><br>

 </div>


 </body>
 </html>