Install Dependencies

conda env create -f PyWebScraper.yml

Run

conda activate PyWebScraper
python main.py -u http://domain.com

Parameters

  • -u : Starting URL to parse (e.g. http://main.com).
  • -m : CSS selector for the main content (e.g. 'div.main', 'div[id="main"]').
  • -n : CSS selector for the site navigation links (e.g. 'div.nav a').
  • -js : Whether to run JavaScript on the page (0 = False, 1 = True; default is 1).
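
For readers who want to see how the flags above fit together, here is a minimal sketch of a command-line parser that accepts them. It is not taken from main.py: the function name build_parser, the destination names (url, main_selector, nav_selector, run_js), the choice to make -u required, and the None defaults for -m and -n are all assumptions made for illustration. Only the flag names and the -js default of 1 come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the project's code)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Assumption: the starting URL is mandatory.
    parser.add_argument("-u", dest="url", required=True,
                        help="Starting URL to parse")
    # Assumption: both selectors are optional and default to None.
    parser.add_argument("-m", dest="main_selector", default=None,
                        help="Selector for the main content")
    parser.add_argument("-n", dest="nav_selector", default=None,
                        help="Selector for the site navigation links")
    # Documented behaviour: 0 = False, 1 = True, default 1.
    parser.add_argument("-js", dest="run_js", type=int, choices=[0, 1],
                        default=1, help="Run JavaScript on the page")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run on its own.
args = build_parser().parse_args(
    ["-u", "http://domain.com", "-m", "div.main", "-js", "0"]
)
print(args.url, args.main_selector, args.nav_selector, bool(args.run_js))
```

Under the same assumptions, the equivalent invocation would look like python main.py -u http://domain.com -m 'div.main' -js 0, with the selector quoted so the shell passes it through unchanged.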