The objective of this repository is to replicate, in Python, code that was originally written in Lua.
This code reads a .csv file and generates a summary of each column. For numeric (Num) columns, the summary is the median (the middle value of the sorted numbers seen so far) and the standard deviation (a measure of the spread of the numbers). Note that Num is a reservoir sampler, so it keeps only a bounded sample of the numbers it has seen. For symbolic (Sym) columns, the summary is the mode (the most common symbol) and the entropy (the effort required to recreate a signal).
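As a rough illustration of the summaries described above, here is a minimal sketch and not the repository's actual implementation. The class layout, the `capacity` and `seed` parameters, the use of Algorithm R for the reservoir, and the use of the sample standard deviation from the `statistics` module are all assumptions made for this example. The code in `code\Csv.py` may compute these values differently.

```python
import math
import random
import statistics
from collections import Counter


class Num:
    """Summarize a numeric column, keeping at most `capacity` values (assumed default)."""

    def __init__(self, capacity=256, seed=1):
        self.capacity = capacity
        self.n = 0                      # how many numbers have been seen in total
        self.kept = []                  # the bounded reservoir of sampled numbers
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def add(self, x):
        self.n += 1
        if len(self.kept) < self.capacity:
            self.kept.append(x)
        else:
            # Reservoir sampling (Algorithm R): the new value replaces a kept
            # value with probability capacity / n.
            j = self.rng.randrange(self.n)
            if j < self.capacity:
                self.kept[j] = x

    def median(self):
        return statistics.median(self.kept)

    def stdev(self):
        # Sample standard deviation of the kept values (an assumption here).
        return statistics.stdev(self.kept) if len(self.kept) > 1 else 0.0


class Sym:
    """Summarize a symbolic column by counting each symbol."""

    def __init__(self):
        self.counts = Counter()

    def add(self, s):
        self.counts[s] += 1

    def mode(self):
        # Assumes at least one symbol has been added.
        return self.counts.most_common(1)[0][0]

    def entropy(self):
        # Shannon entropy in bits: -sum(p * log2(p)) over symbol frequencies.
        total = sum(self.counts.values())
        return -sum(c / total * math.log2(c / total) for c in self.counts.values())
```

For instance, adding 10, 20, 30, 40, 50 to a `Num` gives a median of 30, and adding the symbols a, a, b, c to a `Sym` gives a mode of a and an entropy of 1.5 bits.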
For an example .csv in the required format, check out \data\auto93.csv
- To install necessary packages, run
pip install -r requirements.txt
- To run the program, navigate to
\CSC510-HW_37
and run
python code\Csv.py
- Tests are contained in
\CSC510-HW_37\code\Tests.py
Example output is shown below:
The output of running \CSC510-HW_37\code\Tests.py
is shown below:
- M M Abid Naziri
- Nikhil Mehra
- Bella Samuelsson
- Parth Katlana
- Heidi Reichert