This repository accompanies Grist's State Trust Lands Benefitting Prisons investigation. It allows users to build and modify the dataset underlying the project. For more details on our methodology, please view `METHODOLOGY.md` in our initial investigation. A user guide for working with similar datasets to the STLbp project is available at `USER-GUIDE.md` in the initial repo. Final built STLbp datasets are available in the `public_data` folder of this repo.
The investigation was written and reported by Alleen Brown, Clayton Aldern, and Maria Parazo Rose. This repository was authored by Parker Ziegler, Clayton Aldern, and Maria Parazo Rose.
Looking for the code behind the first story in this series? That repo is here. Looking for the code behind the interactives in the project? That repo is here.
This project currently requires Python <= 3.10.4. To set the appropriate Python version locally, consider using a Python version manager like `pyenv`.
This project uses Git LFS to store `.zip`, `.dbf`, `.shp`, and `.geojson` files remotely on GitHub rather than directly in the repository source. In order to access these files and build the datasets locally, you'll need to do the following:
- Install Git LFS. Follow the installation instructions for your operating system.
- At the terminal, run the following two commands (without typing the dollar sign):
```sh
$ git lfs install
$ git lfs pull
```
Together, these commands will install the Git LFS configuration, fetch LFS changes from the remote, and replace pointer files locally with the actual data.
After completing the above, install Python dependencies. At the terminal, run the following command (again, omit the dollar sign):
```sh
$ pip install -e .
```
This only needs to be done the first time you run the script.
All functionality is orchestrated by a single top-level command, `run.py`. To see a listing of all available command options, run the following command:
```sh
$ DATA=data python run.py --help
```
`run.py` expects a particular directory structure for the datasets, which is already enforced by this repository's directory structure. As such, you should not move any files from their current locations.
The STLbp dataset is built in stages, with the output of each stage becoming the input of the next stage.
The following assumes your data directory is itself called `data`; all paths will refer to it as such.
To execute Stage 1, run the following command at the terminal:
```sh
$ DATA=data python run.py stl-stage-1
```
This command:
- Gathers the raw data from both remote state-run servers and the input data directory.
- Collects all fields-of-interest from the raw data and normalizes the naming across datasets.
- Unifies all individual states' data into one large dataset.
Individual state datasets are written to `data/stl_dataset/step_1/output/merged/<state-abbreviation>.[csv,geojson]`. The unified dataset is written to `data/stl_dataset/step_1/output/merged/all-states.[csv,geojson]`.
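The gather/normalize/unify flow above can be sketched with a toy example. This is an illustrative sketch only: the raw field names and rename maps below are invented for demonstration, not the ones `run.py` actually uses.

```python
# Hypothetical sketch of Stage 1's normalize-and-unify pattern.
# Field names and rename maps are invented for illustration.
RENAME_MAPS = {
    "NM": {"ACRES_GIS": "acres", "LandUse": "activity"},
    "AZ": {"gis_acres": "acres", "use_code": "activity"},
}

def normalize(state, records):
    """Rename one state's raw fields to the shared schema."""
    rename = RENAME_MAPS[state]
    out = []
    for rec in records:
        row = {rename.get(k, k): v for k, v in rec.items()}
        row["state"] = state  # tag rows so the unified dataset stays traceable
        out.append(row)
    return out

def unify(per_state):
    """Concatenate all normalized state datasets into one list of rows."""
    merged = []
    for state, records in per_state.items():
        merged.extend(normalize(state, records))
    return merged

rows = unify({
    "NM": [{"ACRES_GIS": 640, "LandUse": "grazing"}],
    "AZ": [{"gis_acres": 320, "use_code": "agriculture"}],
})
```

Tagging each row with its source state is what makes a later per-state breakdown of the unified `all-states` dataset possible.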
To execute Stage 2, run the following command at the terminal:
```sh
$ DATA=data PYTHONHASHSEED=42 python run.py stl-stage-2
```
This command matches activity information to the parcels from the unified multi-state dataset output in Stage 1 (`data/stl_dataset/step_1/output/merged/all-states.[csv,geojson]`). The `data` directory for Stage 2 already includes state-specific information about the activities occurring on all parcels.
The output of this stage is written to `data/stl_dataset/step_2/output/stl_dataset_extra_activities.[csv,geojson]`.
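The activity-matching step can be sketched as a keyed left join. This is an illustrative sketch only: the join key of `(state, parcel_id)` and the field names are assumptions, not the actual schema. Sorting the matched activities, as below, is one way to keep output deterministic; pinning `PYTHONHASHSEED` in the command above addresses the same reproducibility concern at the interpreter level.

```python
# Hypothetical sketch of Stage 2's activity-matching step.
# The (state, parcel_id) join key is an assumption for illustration.
from collections import defaultdict

def attach_activities(parcels, activities):
    """Left-join activity descriptions onto parcels by (state, parcel_id)."""
    by_key = defaultdict(list)
    for act in activities:
        by_key[(act["state"], act["parcel_id"])].append(act["activity"])
    for parcel in parcels:
        key = (parcel["state"], parcel["parcel_id"])
        # Sort so output order is stable across runs.
        parcel["activities"] = sorted(by_key.get(key, []))
    return parcels

matched = attach_activities(
    parcels=[
        {"state": "MT", "parcel_id": "A1"},
        {"state": "MT", "parcel_id": "B2"},
    ],
    activities=[
        {"state": "MT", "parcel_id": "A1", "activity": "timber"},
        {"state": "MT", "parcel_id": "A1", "activity": "grazing"},
    ],
)
```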
Stage 2.5 enriches the unified dataset from Stage 2 (`data/stl_dataset/step_2/output/stl_dataset_extra_activities.[csv,geojson]`) with land-cession information for each parcel. In previous investigations, this step was a manual effort; it has since been automated. The new dataset will be named `stl_dataset_extra_activities_plus_cessions.csv` and located at `data/stl_dataset/step_2_5/output/` (though, as evidenced by the Stage 3 input details, the file name is not important to the code).
To execute Stage 2.5, run the following command at the terminal:
```sh
$ DATA=data python run.py stl-stage-2-5
```
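The enrichment pattern of Stage 2.5 can be sketched as adding one column via a lookup, then writing the result to a new CSV. The cession numbers, the `parcel_id` join, and the column names here are invented for illustration; the real stage derives cessions from its own inputs.

```python
# Hypothetical sketch of a Stage 2.5-style enrichment: add a cession
# column to each parcel row, then write the result as CSV.
import csv
import io

CESSIONS = {"A1": "344", "B2": "717"}  # parcel_id -> cession number (invented)

def add_cessions(rows):
    """Annotate each row with its cession number, if one is known."""
    for row in rows:
        row["cession"] = CESSIONS.get(row["parcel_id"], "")
    return rows

rows = add_cessions([
    {"parcel_id": "A1", "state": "MT"},
    {"parcel_id": "C3", "state": "MT"},  # no known cession -> empty field
])

# Serialize the enriched rows (an in-memory stand-in for writing
# the stage's output CSV to disk).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["parcel_id", "state", "cession"])
writer.writeheader()
writer.writerows(rows)
output = buf.getvalue()
```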
To execute Stage 3, run the following command at the terminal:
```sh
$ DATA=data python run.py stl-stage-3
```
This command will take the first CSV found in `data/stl_dataset/step_2_5/output/` and augment it with the prices paid for each parcel of land. The `data` directory for Stage 3 already contains a CSV with the listing of prices paid for land (`data/stl_dataset/step_3/input/Cession_Data.csv`).
The output of this stage is written to `data/stl_dataset/step_3/output/stl_dataset_extra_activities_plus_cessions_plus_prices.[csv,geojson]`.
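The "first CSV found" behavior described above can be sketched as follows. How `run.py` actually orders candidates is not documented here, so sorted glob order is an assumption chosen because it is deterministic; a temporary directory stands in for `data/stl_dataset/step_2_5/output/`.

```python
# Sketch of picking the first CSV in a directory, as Stage 3 does with
# its input folder. Sorted name order is an assumed, deterministic choice.
import csv
import tempfile
from pathlib import Path

def first_csv(directory):
    """Return the first CSV in a directory, in sorted name order."""
    candidates = sorted(Path(directory).glob("*.csv"))
    if not candidates:
        raise FileNotFoundError(f"no CSV files in {directory}")
    return candidates[0]

with tempfile.TemporaryDirectory() as tmp:
    # Create two candidate files; sorted order picks "a_dataset.csv".
    for name in ("b_dataset.csv", "a_dataset.csv"):
        Path(tmp, name).write_text("parcel_id\nA1\n")
    chosen = first_csv(tmp)
    with open(chosen, newline="") as f:
        rows = list(csv.DictReader(f))
```

Because the stage keys on "first CSV found" rather than a fixed file name, the Stage 2.5 output's name is free to change without breaking Stage 3.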
To execute Stage 4, run the following command at the terminal:
```sh
$ DATA=data python run.py stl-stage-4
```
This command will calculate summaries connecting cessions to tribes using the output of Stage 3 (`data/stl_dataset/step_3/output/stl_dataset_extra_activities_plus_cessions_plus_prices.[csv,geojson]`).
The outputs of this stage are two files, written to:
- `data/stl_dataset/step_4/output/tribe-summary.csv`
- `data/stl_dataset/step_4/output/tribe-summary-condensed.csv`
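A summary of this kind can be sketched as a group-by-and-aggregate over the parcel rows. The field names (`tribe`, `acres`, `price`) and the specific aggregates are assumptions for illustration; the real summary columns live in the two output files above.

```python
# Hypothetical sketch of a Stage 4-style per-tribe summary.
# Field names and aggregates are invented for illustration.
from collections import defaultdict

def tribe_summary(parcels):
    """Aggregate parcel count, acreage, and price paid per tribe."""
    summary = defaultdict(lambda: {"parcels": 0, "acres": 0.0, "price": 0.0})
    for p in parcels:
        row = summary[p["tribe"]]
        row["parcels"] += 1
        row["acres"] += p["acres"]
        row["price"] += p["price"]
    return dict(summary)

summary = tribe_summary([
    {"tribe": "Tribe A", "acres": 640.0, "price": 800.0},
    {"tribe": "Tribe A", "acres": 320.0, "price": 400.0},
    {"tribe": "Tribe B", "acres": 160.0, "price": 0.0},
])
```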