EnriqueKhai/Historian

Kattis Web Scraper

image

Image by Ivo Rainha

Kattis is an online judge platform where users can create an account, solve algorithmic problems and earn points to climb the ranks. However, Kattis has neither a built-in feature nor an API that allows users to visualise how their score changes (i.e. their progress) over time.

It is this problem that Historian attempts to solve.

Overview

Historian is a Kattis web scraper built to reconstruct a user's score history over time (illustration below).

image

For a video demo, click here. Otherwise, here's how it works.

Historian logs into Kattis on behalf of its user and determines which problems have been solved so far. From there, it locates each problem's first accepted submission and records its date.

Historian then credits each problem's difficulty (as a score) to the date of its first accepted submission, effectively backdating the user's ACs (accepted submissions) and determining the total number of points earned on each date.
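The crediting step can be sketched roughly as follows. This is a minimal illustration, not Historian's actual code: the function name `points_per_day`, the `(problem_id, day, accepted)` tuple shape and the `difficulties` mapping are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import date


def points_per_day(submissions, difficulties):
    """Credit each problem's difficulty to the date of its first AC.

    submissions:  iterable of (problem_id, day, accepted) tuples
                  (hypothetical shape, chosen for this sketch).
    difficulties: dict mapping problem_id -> difficulty score.
    """
    # Keep only the earliest accepted date for every problem.
    first_ac = {}
    for problem_id, day, accepted in submissions:
        if not accepted:
            continue
        if problem_id not in first_ac or day < first_ac[problem_id]:
            first_ac[problem_id] = day

    # Sum the difficulties of all problems first solved on the same date.
    earned = defaultdict(float)
    for problem_id, day in first_ac.items():
        earned[day] += difficulties[problem_id]

    # Return the dates in chronological order.
    return dict(sorted(earned.items()))
```

Later accepted submissions for an already-solved problem are ignored, so each problem is counted exactly once.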

Once done, the cumulative total (or the user's timeline) can be reconstructed by computing a running sum. Suppose the user earns 1.0, 5.0 and 3.2 points on days 1, 4 and 7 respectively. The reconstructed timeline will then be:

Days           |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |
----------------------------------------------------------------
Points Earned  | 1.0 |  -  |  -  | 5.0 |  -  |  -  | 3.2 |  -  |
Cum. Score     | 1.0 |  -  |  -  | 6.0 |  -  |  -  | 9.2 |  -  |
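The running sum in the table above can be reproduced with a few lines of Python. This is only a sketch: `cumulative_timeline` is a hypothetical helper, and rounding to one decimal place is an assumption made here to keep floating-point noise out of the output.

```python
from itertools import accumulate


def cumulative_timeline(points_by_day):
    """Turn {day: points earned} into a list of (day, cumulative score)."""
    days = sorted(points_by_day)
    totals = accumulate(points_by_day[d] for d in days)
    # Round to one decimal place (an assumption of this sketch).
    return [(d, round(t, 1)) for d, t in zip(days, totals)]


timeline = cumulative_timeline({1: 1.0, 4: 5.0, 7: 3.2})
print(timeline)  # [(1, 1.0), (4, 6.0), (7, 9.2)]
```

Days on which no points were earned simply carry the previous total forward, which is why they appear as "-" in the table.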

As the final step, Historian saves the user's timeline as a .csv file. This allows users to plot graphs (using Excel, pandas, etc.) like this:

image
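As a rough illustration of the save step, a timeline can be written to and read back from CSV using only the standard library. The column names `date` and `cumulative_score` are assumptions made for this sketch; the actual layout of the file Historian writes may differ.

```python
import csv
import io


def write_timeline(rows, out):
    """Write (date, cumulative score) rows to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["date", "cumulative_score"])  # hypothetical header
    writer.writerows(rows)


def read_timeline(src):
    """Read the CSV back into a list of (date string, float score) pairs."""
    reader = csv.DictReader(src)
    return [(row["date"], float(row["cumulative_score"])) for row in reader]


# Round trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_timeline([("2021-01-01", 1.0), ("2021-01-04", 6.0)], buffer)
buffer.seek(0)
print(read_timeline(buffer))  # [('2021-01-01', 1.0), ('2021-01-04', 6.0)]
```

The resulting pairs can be handed to any plotting tool, with dates on the x-axis and the cumulative score on the y-axis.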

Installation

git clone https://github.com/EnriqueKhai/Historian.git
cd Historian
pip3 install -r requirements.txt
python3 historian.py

Upon successful scraping, a <kattis handle>.csv file will be written (giraffe6 is my moniker).

image

As you can see, it takes ~7 minutes to scrape 421 problems, a baseline rate of roughly 1 problem per second. The program's run time is expected to scale linearly with the number of problems solved, so users who have solved upwards of 800 problems can expect to wait roughly 13 minutes or more.
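For a quick estimate of your own run time, the linear scaling can be written down directly. The helper `estimated_minutes` is hypothetical, and the default of one second per problem is simply the baseline rate quoted above; actual times will depend on your network and on Kattis itself.

```python
def estimated_minutes(problems_solved, seconds_per_problem=1.0):
    """Estimate scraping time in minutes, assuming linear scaling."""
    return problems_solved * seconds_per_problem / 60


print(round(estimated_minutes(421), 1))   # 7.0
print(round(estimated_minutes(1000), 1))  # 16.7
```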

For non-NUS users

Replace 'https://nus.kattis.com' in sites[] (historian.py, line 21) with your own school's Kattis site (e.g. 'https://kth.kattis.com'), or remove it altogether. This tells Historian where to look when scraping your submissions.

16   print(flush=True)
17
18   # Locations to scrape user submissions.
19   sites = [
20       'https://open.kattis.com',
21       'https://nus.kattis.com'       # <------- replace this line!
22   ]
23 
24   # Scrape all problems solved.
25   handle, timeline = kattis.scrape_user(username, password, sites)
