(2021-09-02)
- The `change_incidence` traffic light indicator, which was introduced in 0.4.3, is no longer included in the dashboard. It is still included as `0.0` in the output data for backwards compatibility.
- Instead of `change_incidence`, a new indicator called `incidence_hospitalisation` is now included.
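
A minimal sketch of how a retired indicator could be kept in the output as a `0.0` placeholder, so that consumers expecting the key do not break. This is not the scraper's actual code: the flat hash layout, the `REMOVED_INDICATORS` constant, the `with_legacy_placeholders` helper, and the sample value are all assumptions made for illustration.

```ruby
require "json"

# Hypothetical list of indicators the dashboard no longer shows.
REMOVED_INDICATORS = %w[change_incidence].freeze

# Returns the scraped data with a 0.0 placeholder for every retired key.
# Keys that are actually present in the scraped data keep their scraped value.
def with_legacy_placeholders(scraped)
  placeholders = REMOVED_INDICATORS.to_h { |key| [key, 0.0] }
  placeholders.merge(scraped)
end

# The value 1.9 is made up for this example, it is not real dashboard data.
record = with_legacy_placeholders("incidence_hospitalisation" => 1.9)
puts JSON.generate(record)
```

Merging the scraped values over the placeholders (rather than the other way round) means a value is only replaced by `0.0` when the dashboard really no longer provides it.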
(2021-07-23)
- The R-number is no longer parsed, since it is not included in the dashboard anymore (thanks @jaimergp). R is still included in the output data (as `0.0`), since some apps might expect to find it there.
(2021-05-07)
- Adjust times for running the scraper, because the dashboard is now updated around noon, rather than in the evening. See .github/workflows/scraper.yml.
(2021-02-27)
- Add vaccination data to the traffic light file (thanks @jaimergp). Vaccination data has been included since 2021-02-15.
(2021-01-17)
- Update Nokogiri via dependabot.
(2020-12-17)
- Check out a specific version of bundler to prevent the "can't find gem bundler (>= 0.a) with executable bundle" error.
- Extend the cron by one hour to catch unusually late publications.
(2020-12-06)
- Extract change in 7-day incidence as an additional traffic light indicator. The change in the 7-day incidence ("Veränderung der 7-Tage-Inzidenz") was introduced as an additional metric to the dashboard on 2020-11-11.
(2020-11-24)
- Another bugfix to adjust to differences in the markup (space as thousands separator, changed column name for incidence column).
(2020-11-24)
- Small bugfix (add test for `nil` to `german_to_international_float()`) to prevent crashes due to missing values.
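
The two fixes from 2020-11-24 (the `nil` check and the space as thousands separator) can be illustrated with a small sketch. Only the function name comes from this changelog. The body below is an assumption about how such a conversion might look, not the scraper's real implementation.

```ruby
# Hypothetical sketch: converts a German-formatted number string such as
# "1.234,5" or "12 345" into a Float. Returns nil for missing or
# unparseable values instead of raising.
def german_to_international_float(value)
  # Missing cells arrive as nil, so bail out early.
  return nil if value.nil?

  cleaned = value.to_s.strip
  return nil if cleaned.empty?

  # Drop thousands separators: dots, whitespace, and non-breaking spaces.
  cleaned = cleaned.gsub(/[.\s\u00A0\u202F]/, "")
  # Turn the German decimal comma into a decimal point.
  cleaned = cleaned.tr(",", ".")

  # Float(..., exception: false) returns nil when the string is not numeric.
  Float(cleaned, exception: false)
end

german_to_international_float("1.234,5")
german_to_international_float("12 345")
german_to_international_float(nil)
```

Note that this sketch treats every dot as a thousands separator, so an input that already uses international notation (for example `"1.5"`) would be misread. That trade-off is only acceptable if the source consistently uses German formatting.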
(2020-10-18)
- The scraper now runs automatically every day with GitHub Actions, as recommended by @jaimergp. There is a really cool little blog post on Git scraping by @simonw at https://simonwillison.net/2020/Oct/9/git-scraping/.
- Add documentation for the GitHub Actions workflow to the README.
- Update changelog with dates for each release.
(2020-09-30)
- Add the color code for a red traffic light indicator. Had to wait for an indicator to actually turn red to see what the code is. Unfortunately, this happened today (2020-09-30).
(2020-09-24)
- Add a Nokogiri-based (we're doing Ruby now, because reasons) scraper to extract both the case numbers and the traffic light data from the new corona dashboard.
- Update Makefile.
- Remove all Scrapy-related code.
- Update README to reflect all this.
(2020-09-01)
- Remove some Scrapy-related make targets.
(2020-08-31)
- The Senatsverwaltung stopped publishing corona press releases, so the scraper doesn't work anymore. Instead, I will now try to convert the daily JSON with case numbers and extract the traffic light indicators from the new dashboard at https://www.berlin.de/corona/lagebericht/desktop/corona.html. The press releases weren't great, but at least there was an official record with a history of corona data. Now, there is only the data for the current day, which is lost once new data is published.
- Initially, the conversion from the new sources is done only half-automatically (it's late...).
(2020-07-22)
- Add quick links to data files at the top of the README.
- Add conditions to deal with bad source data for data's date (yeah).
(2020-07-13)
- Enable the Scrapy venv from within the Makefile. This requires a `SCRAPY_HOME` environment variable to be set (see README).
(2020-07-12)
- Two more case number press releases (2020-04-26 and 2020-04-11) and two traffic light press releases (2020-06-08 and 2020-05-31) had been missing because the patterns to match their titles were too restrictive. They are now included.
(2020-07-11)
- Two press releases (2020-05-24 and 2020-05-25) had fallen through the cracks because they were named differently. Those two have now been added.
(2020-06-11)
- Add a second scraper for the Corona traffic light press releases.
- Add `pr_date` (date of press release) to case number data.
- Move some scraper helper methods up to module.
- Restructure make targets: there is now one for each scraper (`case-numbers` and `traffic-light`). Both are now triggered by `data`.
(2020-06-07)
- Counts per age group have been added as a new key `counts_per_age_group` for each press release that includes them (all but the very first ones).
(2020-06-07)
- Fix bug where case numbers with thousands separators resulted in wrong data.
(2020-06-07)
- Instead of extracting the press release's date for a given day's Corona case numbers, we now extract the date when the data itself was released. The two were mostly identical, but not in all cases.
- Instead of extracting just a date, we now extract a datetime (no timezone).
- Logo added to project.
(2020-06-05)
- initial version
- contains only counts per district