Learning BeautifulSoup
Documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
Set up the environment:
python3 -m venv .venv
macOS / Linux:
source .venv/bin/activate
Windows:
.venv\Scripts\activate.bat
Install dependencies:
pip install -r requirements.txt
Run the tests:
pytest
- I need to have code that requests a page
- I will then pass the HTML to BeautifulSoup
- I will extract the page title from the HTML with BeautifulSoup
- I will build this as a small library/module
Webpage > requests > HTML > BeautifulSoup > ???
Webpage as HTML string
- Request the HTML data from the webpage
- Take the HTML from the response and pass it to BeautifulSoup
- Return the text of the HTML page title: only the value, not the complete title tag
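The steps above could be sketched roughly as follows. This is only one possible layout, not the finished module: the function names `fetch_html`, `extract_title`, and `get_page_title` are hypothetical, and the sketch assumes `requests` and `beautifulsoup4` are listed in requirements.txt. Fetching and parsing are kept in separate functions so the parsing step can be tested with pytest without touching the network.

```python
from typing import Optional

import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Request the webpage and return its HTML as a string."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx responses rather than parsing an error page.
    response.raise_for_status()
    return response.text


def extract_title(html: str) -> Optional[str]:
    """Return the text inside <title>, or None if there is no usable title."""
    # "html.parser" is Python's built-in parser, so no extra dependency is needed.
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is the whole <title> tag; .string is just the text it contains.
    if soup.title is None or soup.title.string is None:
        return None
    return soup.title.string.strip()


def get_page_title(url: str) -> Optional[str]:
    """Hypothetical entry point: Webpage > requests > HTML > BeautifulSoup > title text."""
    return extract_title(fetch_html(url))
```

A call such as `get_page_title("https://example.com")` would then return the title text of that page, while `extract_title` can be exercised directly in tests with a hard-coded HTML string.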