All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Improved reliability of scraped content
- Page scrolls to load viewport-based content before exiting
- Debug flags added to the CLI and `scraper.get` for testing purposes: print network activity and launch the Chromium browser
- No longer hangs when trying to get content from empty frames
- Browser is now in headless mode
- Replaced pyppeteer with playwright as a backend
- Scraper now waits for `networkidle2` before returning
- `ipython` reference in the main scraper file
- `BrowserError` and `TimeoutError` as public exceptions
- `ipython` dev dependency
- CLI no longer prints stack traces for `BrowserError` and `TimeoutError`
- Scraper no longer errors on blank HTML tags
- CHANGELOG.md