This repository was archived by the owner on Feb 17, 2025. It is now read-only.

Implement gatherling.com scraper #15

Open
Badaro opened this issue Jul 24, 2024 · 0 comments
Comments

@Badaro
Owner

Badaro commented Jul 24, 2024

This has been suggested by @Aliquanto3 as a way to get Premodern data into our dataset, and bakert commented on Discord that we could also use this for Penny Dreadful.

There are two paths to go here.

Plan A: Database dump. According to bakert they generate a database dump (with some things redacted) every 24h here:
https://pennydreadfulmagic.com/static/dev-db.sql.gz

This is a MariaDB data dump, so the process should be fairly simple:

  • Automate the creation of a Docker container running MariaDB that imports this dump
  • Extract data from the DB dump
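A minimal sketch of the first step, in Python for illustration only. It assumes the official `mariadb` Docker image, which imports `.sql.gz` files placed in `/docker-entrypoint-initdb.d` on first startup. The container name, root password, database name, and the helper names `download_dump` and `docker_run_command` are all hypothetical, and whether the dump selects its own database has not been checked.

```python
import shutil
import urllib.request
from pathlib import Path

# Dump location mentioned by bakert; regenerated roughly every 24h.
DUMP_URL = "https://pennydreadfulmagic.com/static/dev-db.sql.gz"


def download_dump(dest_dir: Path) -> Path:
    """Download the gzipped dump into dest_dir and return the file path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / "dev-db.sql.gz"
    with urllib.request.urlopen(DUMP_URL) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


def docker_run_command(
    dump_dir: Path,
    container_name: str = "gatherling-dump",  # hypothetical name
    root_password: str = "example",           # placeholder, not a real secret
    database: str = "gatherling",             # assumption: dump has no USE statement
    image: str = "mariadb",
) -> list[str]:
    """Build a `docker run` command that imports every dump found in dump_dir.

    The directory is mounted read-only on /docker-entrypoint-initdb.d, which
    the official image scans for .sql and .sql.gz files on first startup.
    """
    return [
        "docker", "run", "--detach",
        "--name", container_name,
        "-e", f"MARIADB_ROOT_PASSWORD={root_password}",
        "-e", f"MARIADB_DATABASE={database}",
        "-v", f"{dump_dir.resolve()}:/docker-entrypoint-initdb.d:ro",
        image,
    ]
```

The command list can be passed to `subprocess.run(cmd, check=True)` after `download_dump` has finished. Extracting the data would then be ordinary SQL queries against the container, which depends on the schema and is not sketched here.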

We could also explore using embedded MariaDB which would facilitate a few things, but it doesn't look like they provide Windows builds for that.

Plan B: Scraping. There's an eventinfo route that seems to contain all the info we need for an individual tournament.
https://gatherling.com/api.php?action=eventinfo&event=Pre-Modern%20Monthly%20League%2011.05
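A rough sketch of calling this route, in Python for illustration only. It assumes the endpoint returns JSON and needs no authentication, which has not been verified here; `build_eventinfo_url` and `fetch_event_info` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

API_BASE = "https://gatherling.com/api.php"


def build_eventinfo_url(event_name: str) -> str:
    """Build the eventinfo URL, percent-encoding spaces as %20."""
    query = urlencode(
        {"action": "eventinfo", "event": event_name},
        quote_via=quote,  # use %20 for spaces instead of '+'
    )
    return f"{API_BASE}?{query}"


def fetch_event_info(event_name: str, timeout: float = 30.0):
    """Fetch and decode one event. Assumes the response body is JSON."""
    url = build_eventinfo_url(event_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `build_eventinfo_url("Pre-Modern Monthly League 11.05")` reproduces the URL above. The shape of the decoded response is not documented, so any field access would need to be checked against a real payload first.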

The only thing missing is a way to list older events. There's an event list page here, but it doesn't fit the way the scraper works very well, since there's no way to navigate by date.
https://gatherling.com/eventreport.php

There's no documentation for the API, but the code is available on GitHub, so we can also explore whether there are other routes that could help:
https://github.com/PennyDreadfulMTG/gatherling/blob/dev/gatherling/api.php
