Topical-Analysis

This repository houses the code for the clustering-over-time project conducted by Samyu and Vasco in Spring 2021.

The structure of the code is as follows:

clustering.py

Run this Python file with the following command: python3 clustering.py [tsv file to import data from] [number of clusters (k)] [elbow plot figurename]
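To make the order of the three positional arguments concrete, here is a minimal sketch of how such a command line could be parsed. The names `tsv_path`, `k`, and `elbow_figure`, and the sample values, are hypothetical; the actual script may read its arguments differently (for example, straight from `sys.argv`).

```python
import argparse

def parse_args(argv):
    # Hypothetical argument names, chosen only for this illustration.
    parser = argparse.ArgumentParser(prog="clustering.py")
    parser.add_argument("tsv_path", help="TSV file to import data from")
    parser.add_argument("k", type=int, help="number of clusters")
    parser.add_argument("elbow_figure", help="file name for the elbow plot")
    return parser.parse_args(argv)

# Made-up example values standing in for a real invocation.
args = parse_args(["titles.tsv", "8", "elbow.png"])
print(args.tsv_path, args.k, args.elbow_figure)
```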

For the code to work, the input TSV file must use the following column format: id | title
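As an illustration of that layout, the sketch below reads tab-separated id and title columns using only the standard library. It assumes there is no header row and that quotes in titles should be kept literally; the sample rows and the `read_titles` helper are made up for this example and are not taken from the script.

```python
import csv
import io

# Made-up sample rows standing in for a real input file.
sample_tsv = "101\tSolar water heater\n102\tGeodesic dome handbook\n"

def read_titles(handle):
    """Return (ids, titles) from a tab-separated stream of id and title columns."""
    ids, titles = [], []
    # QUOTE_NONE keeps any quote characters in titles as plain text.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        ids.append(row[0])
        titles.append(row[1])
    return ids, titles

ids, titles = read_titles(io.StringIO(sample_tsv))
print(ids, titles)
```

For a file on disk, the same helper could be called with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object.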

The number of clusters is selected before the code is run; the elbow plot is helpful in making this selection.
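The idea behind an elbow plot can be sketched in plain Python: run k-means for several values of k, record the within-cluster sum of squares (often called inertia), and look for the k after which the value stops dropping sharply. The toy 2-D points and the helper functions below are assumptions made for illustration; this is a basic Lloyd's-algorithm sketch, not the script's own implementation, and it prints the values rather than plotting them.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))

def kmeans_inertia(points, k, iterations=20, seed=0):
    """Run a basic Lloyd's k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

Plotting these values against k gives the elbow curve; the bend marks a reasonable number of clusters to pass on the command line.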

The code is commented to mark each of its stages: reading in the data, creating the elbow plot (which can be commented out if not needed), running the k-means clustering algorithm, and printing the output. Output is printed to the console but can be redirected to a file with the following syntax: python3 clustering.py [tsv file to import data from] [number of clusters (k)] [elbow plot figurename] > output_filename

The data is printed using the pandas print functionality in the following format: row_number | id | cluster
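If the output has been saved to a file, a small parser can turn it back into an id-to-cluster mapping. The sketch below assumes whitespace-aligned rows of row number, id, and cluster, with ids that contain no spaces; the sample text and the `parse_cluster_output` helper are made up for illustration. Lines that do not fit that shape (a header, or the filler row pandas prints when it shortens a long table) are skipped.

```python
# Made-up sample resembling a printed table of row number, id, and cluster.
sample_output = """\
     id  cluster
0   101        2
1   102        0
"""

def parse_cluster_output(text):
    """Map each id to its cluster label from whitespace-aligned rows."""
    assignments = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # header, blank line, or other non-data line
        row_number, item_id, cluster = fields
        if not (row_number.isdigit() and cluster.isdigit()):
            continue  # e.g. filler rows in a shortened table
        assignments[item_id] = int(cluster)
    return assignments

assignments = parse_cluster_output(sample_output)
print(assignments)
```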
