Collect data from Washington Post Database #2

Open
trystant opened this issue May 19, 2017 · 3 comments

@trystant

It looks like the Washington Post data can be retrieved from this spreadsheet and is also available from this GitHub repository. We need to determine what to fetch and how to compare it to the data we have, and come up with a long-term strategy for it.

@rlgreen91

Ok, so in taking a look at the data, it seems that all we would need to do is grab the name, date of incident, race, gender, city, and state. Then, we need to search for a matching case - if none exists, then we create one with the appropriate data.
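To make the "search for a match, otherwise create" step concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the column names (`name`, `date`, `race`, `gender`, `city`, `state`), the `Case` record, and the choice of name + date + state as the match key are hypothetical and not taken from the actual spreadsheet or from our models.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical column names; the real CSV headers may differ.
FIELDS = ("name", "date", "race", "gender", "city", "state")


@dataclass(frozen=True)
class Case:
    """Hypothetical stand-in for a case record."""
    name: str
    date: str
    race: str
    gender: str
    city: str
    state: str


def match_key(case):
    """Assumed identity of a case: normalized name, incident date, state."""
    return (case.name.strip().casefold(), case.date.strip(), case.state.strip().upper())


def import_rows(csv_text, existing):
    """Return new Case objects for CSV rows that match no existing case."""
    seen = {match_key(c) for c in existing}
    created = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep only the fields we care about; missing columns become "".
        case = Case(**{f: (row.get(f) or "").strip() for f in FIELDS})
        key = match_key(case)
        if key not in seen:
            seen.add(key)  # also avoids duplicates within the same file
            created.append(case)
    return created
```

An exact key like this is the simplest possible comparison; it would miss cases where the name is spelled differently or the state was never entered.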

I'm wondering how we handle searching. Right now, there are definitely aspects of our search that can be improved. Cases can also be started with "incomplete" information - without an age, race, or state - and names can vary when people enter the information. We will basically need an API that we can use both to search the cases and to add a new case. With adding a new case, we'll have to rethink our form, including the "nested" aspect of adding a subject, and its current requirements, to avoid data discrepancies.
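As one possible way to cope with incomplete cases and varying names, a matcher could treat empty fields as "unknown" and compare names approximately. The sketch below uses Python's standard `difflib`; the function names, the dict-based records, and the 0.85 threshold are all hypothetical choices for illustration, not how our search currently works.

```python
import re
from difflib import SequenceMatcher

# Hypothetical field names for the non-name attributes of a case.
OTHER_FIELDS = ("date", "race", "gender", "city", "state")


def normalize_name(name):
    """Lowercase, drop punctuation and digits, collapse whitespace.

    Note: this simple pattern also drops non-ASCII letters.
    """
    cleaned = re.sub(r"[^a-z\s]", "", name.casefold())
    return " ".join(cleaned.split())


def fields_compatible(a, b):
    """An empty value means 'unknown' and is compatible with anything."""
    return not a or not b or a.strip().casefold() == b.strip().casefold()


def is_probable_match(incoming, existing, threshold=0.85):
    """Guess whether two case records (plain dicts) describe the same person."""
    name_a = normalize_name(incoming.get("name", ""))
    name_b = normalize_name(existing.get("name", ""))
    if not name_a or not name_b:
        return False  # without a name on both sides we cannot decide
    if SequenceMatcher(None, name_a, name_b).ratio() < threshold:
        return False
    return all(
        fields_compatible(incoming.get(f, ""), existing.get(f, ""))
        for f in OTHER_FIELDS
    )
```

For example, `is_probable_match({"name": "Michael Brown Jr.", "state": "MO"}, {"name": "michael brown", "state": ""})` accepts the pair because the names are close and the missing state is treated as unknown. The threshold would need tuning against real data, and borderline matches probably deserve human review rather than automatic merging.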

I think the best thing we can do next is to address the search capabilities of our website. After that, we can develop the API, then finally write the utility to go through the CSV files and add cases where necessary. I'd be interested to know what you think.

@rlgreen91

It's been a bit since I checked in on this repo, so...here we are :) . Here's an update:

We did separate search into its own type of domain, so that should be enough to get started. We'll keep working on augmenting that so we'll get better at searching.

Next is fixing our form for cases - specifically, what is and isn't required to create a case. We really need to figure out how to transition to a single subject per case. Once we do that, we can hopefully reduce some of the fields and then have a cleaner interface for the API.

@trystant
Author

It's been a bit since anything was done on this repo, but we do want to be able to grab data from other places in an automated fashion. That's a bit separate from fixing our case form and searching for cases indexed with Elasticsearch.


2 participants