A Python library for automating interaction with websites.

icesunx/MechanicalSoup

 
 

MechanicalSoup

A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn't do JavaScript.

I was a fond user of the Mechanize library, but unfortunately it's incompatible with Python 3 and its development is inactive. MechanicalSoup provides a similar API, built on Python giants Requests (for HTTP sessions) and BeautifulSoup (for document navigation).

Installation

From PyPI

 pip install MechanicalSoup

Python versions 2.6 through 3.5 are supported (and tested against).

Example

From example.py, code to log into the GitHub website:

"""Example app to log in to GitHub"""
import argparse
import mechanicalsoup

parser = argparse.ArgumentParser(description='Log in to GitHub.')
parser.add_argument("username")
parser.add_argument("password")
args = parser.parse_args()

browser = mechanicalsoup.Browser()

# Request the GitHub login page. The result is a requests.Response object:
# http://docs.python-requests.org/en/latest/user/quickstart/#response-content
login_page = browser.get("https://github.com/login")

# login_page.soup is a BeautifulSoup object:
# http://www.crummy.com/software/BeautifulSoup/bs4/doc/#beautifulsoup
# We grab the login form from it.
login_form = login_page.soup.select("#login")[0].select("form")[0]

# specify username and password
login_form.select("#login_field")[0]['value'] = args.username
login_form.select("#password")[0]['value'] = args.password

# (or alternatively)
# login_form.input({"login": args.username, "password": args.password})

# submit form
page2 = browser.submit(login_form, login_page.url)

# verify we are now logged in
messages = page2.soup.find('div', class_='flash-messages')
if messages:
    print(messages.text)
assert page2.soup.select(".logout-form")

print(page2.soup.title.text)

# verify we remain logged in (thanks to cookies) as we browse the rest of the site
page3 = browser.get("https://github.com/hickford/MechanicalSoup")
assert page3.soup.select(".logout-form")

For an example with a more complex form (checkboxes, radio buttons and textareas), read tests/test_browser.py and tests/test_form.py.
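
To make it clearer what submitting such a form involves, here is a minimal sketch of how a browser-like client might turn checkboxes, radio buttons and textareas into request data. It uses only the Python standard library and is not MechanicalSoup's implementation; the form markup, the `FormCollector` class and the `build_submission` helper are hypothetical names invented for this illustration. The sketch assumes the usual HTML rules: only checked checkboxes and radio buttons are sent, a textarea contributes its text content, and the form's `action` is resolved against the URL of the page that contained it (which is why the example above passes `login_page.url` to `browser.submit`).

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical form markup, made up for this illustration.
FORM_HTML = """
<form action="/orders" method="post">
  <input type="text" name="customer" value="Ada">
  <input type="checkbox" name="topping" value="cheese" checked>
  <input type="checkbox" name="topping" value="onion">
  <input type="radio" name="size" value="small">
  <input type="radio" name="size" value="large" checked>
  <textarea name="comments">Ring the bell</textarea>
  <input type="submit" value="Order">
</form>
"""


class FormCollector(HTMLParser):
    """Collect the (name, value) pairs that would be sent for one form."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = []
        self._textarea_name = None
        self._textarea_chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and attrs.get("name"):
            kind = (attrs.get("type") or "text").lower()
            if kind in ("checkbox", "radio"):
                # Unchecked boxes and radio buttons are left out entirely.
                if "checked" in attrs:
                    self.fields.append((attrs["name"], attrs.get("value") or "on"))
            elif kind not in ("submit", "button", "image", "reset", "file"):
                self.fields.append((attrs["name"], attrs.get("value") or ""))
        elif tag == "textarea" and attrs.get("name"):
            # A textarea's value is its text content, gathered in handle_data.
            self._textarea_name = attrs["name"]
            self._textarea_chunks = []

    def handle_data(self, data):
        if self._textarea_name is not None:
            self._textarea_chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "textarea" and self._textarea_name is not None:
            self.fields.append((self._textarea_name, "".join(self._textarea_chunks)))
            self._textarea_name = None


def build_submission(page_url, html):
    """Return (method, absolute action URL, list of form fields)."""
    collector = FormCollector()
    collector.feed(html)
    collector.close()
    return collector.method, urljoin(page_url, collector.action), collector.fields


method, url, fields = build_submission("https://example.com/shop/menu", FORM_HTML)
print(method, url)
print(urlencode(fields))
```

The unchecked "onion" checkbox and the "small" radio button do not appear in the encoded data, and the unnamed submit button is skipped. With MechanicalSoup you would not write this yourself; the tests mentioned above show how the library handles these controls.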

Development

Tests

py.test

Roadmap

See also
