Template application

Template files and instructions for facilitating the creation of custom scrapers.


Instructions

  1. In Config, add the URL to be scraped and the selector of the page element to be scraped to the check dictionary.

    • To obtain the selector, right-click the element in your browser and select Inspect, then right-click the element in the inspection panel and copy its CSS selector or full XPath.
  2. Ensure that mail_info in the Config class contains the following fields:

    • addr: Gmail address with a working application password. See Google Help for the steps to set up an application password.
    • app_pw: the application password.
  3. Adjust the pre-supplied runtime configuration in Config to suit your needs.

  4. Define how the browser obtains the text, and any images, to be scraped in Browser._check_update_of_url(). Code section here.

  5. Define how the main scraper class passes the scraped data from the browser to the mailer class in TemplateScraper._check_update_of_url(). Code section here.

  6. Define how the mailer creates the email body from the received data in Mailer._make_msg(). Code section here.

  7. Run the /template/main.py script.
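
As a rough illustration of steps 1 to 3 and 6, the sketch below shows one possible shape for the configuration and for building the email body. The dictionary layout, the placeholder URL and selector, the interval_s setting, and the standalone make_msg function are all assumptions made for illustration; the actual Config class and Mailer._make_msg() in /template may be organised differently.

```python
from email.message import EmailMessage


class Config:
    """Hypothetical configuration layout. Names other than check,
    mail_info, addr and app_pw are assumptions."""

    # URL to be scraped, mapped to the selector of the element to watch.
    check = {
        "https://example.com/news": "#main > article h2",  # placeholder values
    }
    # Gmail address and its application password (never commit real ones).
    mail_info = {
        "addr": "someone@gmail.com",
        "app_pw": "app-password-goes-here",
    }
    # Assumed runtime setting: seconds to wait between checks.
    interval_s = 3600


def make_msg(url: str, text: str, config: type = Config) -> EmailMessage:
    """Build a plain-text email announcing a change at `url`."""
    msg = EmailMessage()
    msg["From"] = config.mail_info["addr"]
    msg["To"] = config.mail_info["addr"]  # assumption: notify yourself
    msg["Subject"] = f"Update detected: {url}"
    msg.set_content(f"The watched element now reads:\n\n{text}\n")
    return msg


if __name__ == "__main__":
    url = next(iter(Config.check))
    print(make_msg(url, "Example headline"))
```

Keeping the message construction in one small function makes it easy to check the email body without opening a browser or sending anything.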


File structure

  • browser.py contains browser classes that use either the puppeteer (Chromium) or the selenium (Firefox) library.

  • main.py contains the Config and TemplateScraper classes.

  • mailer.py contains the Mailer class.

  • make_executable.py contains the PyInstaller class. Running this script automatically creates an executable of the program that runs as a standalone package without Python.

    • If the custom program requires additional data files, their paths must be supplied in the PyInstaller._local_files attribute so that they are bundled with the standalone program.
    • --onefile script argument: Generates a single executable file. See /applications/PyInstaller/main.py for other PyInstaller options.
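
The bundling described above could be driven roughly as follows. build_args and its parameters are hypothetical helpers, not the repository's PyInstaller class. --onefile and --add-data are standard PyInstaller command-line options; the use of os.pathsep as the separator is an assumption that matches older PyInstaller releases and should be checked against the installed version.

```python
import os
from typing import Iterable, List


def build_args(script: str, local_files: Iterable[str] = (),
               onefile: bool = True) -> List[str]:
    """Assemble PyInstaller command-line arguments (hypothetical helper)."""
    args = [script]
    if onefile:
        args.append("--onefile")  # bundle everything into a single executable
    for path in local_files:
        # --add-data takes SOURCE<sep>DEST; "." places the file at the bundle root.
        args.append(f"--add-data={path}{os.pathsep}.")
    return args


if __name__ == "__main__":
    args = build_args("main.py", local_files=["data/example.json"])
    print(args)
    # To actually build (requires PyInstaller to be installed):
    # import PyInstaller.__main__
    # PyInstaller.__main__.run(args)
```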