Collect data from Washington Post Database #2

trystant · 2017-05-19T07:58:45Z

It looks like the Washington Post data can be retrieved from this spreadsheet and is also available from this GitHub repository. We need to determine what to fetch and how to compare it to the data we have, and come up with a long-term strategy for it.

rlgreen91 · 2017-05-24T00:43:00Z

Ok, so in taking a look at the data, it seems that all we would need to do is grab the name, date of incident, race, gender, city, and state. Then, we need to search for a matching case - if none exists, then we create one with the appropriate data.
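To make the "grab the fields, search, create if missing" flow concrete, here is a minimal sketch. It assumes the CSV has columns named `name`, `date`, `race`, `gender`, `city`, and `state`, and it stands in for our real case store with a plain list of dicts. The function names (`extract_rows`, `find_or_create`, `import_rows`) are hypothetical and not part of our codebase, and the exact-match key on name, date, and state is only a placeholder for real search.

```python
import csv
import io

# Assumed column names in the source CSV; adjust to the real header row.
FIELDS = ("name", "date", "race", "gender", "city", "state")


def extract_rows(csv_text):
    """Parse CSV text and keep only the fields we care about."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{f: (row.get(f) or "").strip() for f in FIELDS} for row in reader]


def find_or_create(cases, record):
    """Return (case, created). `cases` is a stand-in for the real case store."""
    key = (record["name"].lower(), record["date"], record["state"].upper())
    for case in cases:
        if (case["name"].lower(), case["date"], case["state"].upper()) == key:
            return case, False
    cases.append(record)
    return record, True


def import_rows(cases, csv_text):
    """Add every row that has no matching case; return how many were created."""
    created = 0
    for record in extract_rows(csv_text):
        _, was_created = find_or_create(cases, record)
        created += was_created
    return created
```

In the real app the list scan would be replaced by a call to our search, and `cases.append` by whatever creates a case.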

I'm wondering how we handle searching. Right now, there are definitely aspects of our search that can be improved. Cases can also be started with "incomplete" information - without an age, race, or state - and names can vary depending on how people enter the information. We will basically need an API that we can use both to search the cases and to add a new case. For adding a new case, we'll have to rethink our form, including the "nested" aspect of adding a subject, and its current requirements to avoid data discrepancies.
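One possible way to cope with name variations and incomplete cases when matching, sketched with only the Python standard library. Everything here is an assumption for illustration: the helper names, the 0.85 similarity threshold, and the choice to treat a missing field as "compatible with anything" are not taken from our code and would need tuning against real data.

```python
import difflib
import re


def normalize_name(name):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.casefold())
    return " ".join(cleaned.split())


def names_similar(a, b, threshold=0.85):
    """Fuzzy comparison of two names; the threshold is an arbitrary starting point."""
    ratio = difflib.SequenceMatcher(
        None, normalize_name(a), normalize_name(b)
    ).ratio()
    return ratio >= threshold


def fields_compatible(existing, incoming, keys=("state", "race", "age")):
    """A field only rules out a match when both sides have a value and they differ."""
    for key in keys:
        left, right = existing.get(key), incoming.get(key)
        if left and right and str(left).casefold() != str(right).casefold():
            return False
    return True


def is_probable_match(existing, incoming):
    """Candidate match: similar names and no conflicting known fields."""
    return names_similar(existing["name"], incoming["name"]) and fields_compatible(
        existing, incoming
    )
```

A probable match found this way is probably best flagged for human review rather than merged automatically, since two different people can share a name and a state.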

I think the best thing we can do next is to address the search capabilities of our website. After that, we can develop the API, then finally write the utility to go through the CSV files and add cases where necessary. I'd be interested to know what you think.

rlgreen91 · 2019-10-10T21:21:44Z

It's been a bit since I checked in on this repo, so... here we are :). Here's an update:

We did separate search into its own type of domain, so that should be enough to get started. We'll keep working on augmenting that so we'll get better at searching.

Next is fixing our form for cases - specifically, what is and isn't required to create a case. We really need to figure out how to transition to a single subject per case. Once we do that, we can hopefully reduce some of the fields and then have a cleaner interface for the API.

trystant · 2019-10-11T18:10:31Z

It's been a bit since anything was done on this repo, but we do want to be able to grab data from other places in an automated fashion. That's a bit separate from fixing our case form and searching for cases indexed with Elasticsearch.

trystant assigned trystant and rlgreen91 May 19, 2017
