This project consists of several Python scripts and a build script to fetch, process, and analyze data from various sources using Elasticsearch.
- Prerequisites
- Installation
- Usage
- Scripts
- Build Script
- Dependencies
- API Endpoints
- README
- Geo-data Reference
- Elasticsearch
- Elasticsearch - Jupyter
- Mastodon
- Testing
Ensure you have the following installed:
- Python 3.7+
- Pip
- Clone the repository:
  git clone <repository-url>
  cd <repository-directory>
- Install the required Python packages:
  pip3 install -r requirements.txt
To run any of the scripts, use the following command:
python3 <script_name>.py
- Fetches accident data from the Elasticsearch index with support for scroll functionality.
- Fetches liquor data from the Elasticsearch index.
- Fetches data from the Mastodon social network index in Elasticsearch.
- Fetches sensor data from the Elasticsearch index with support for scroll functionality.
- Fetches weather data from the Elasticsearch index with support for scroll functionality.
- Fetches all data from a specified Fission function using the scroll API.
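These scripts share the same scroll pattern. Below is a minimal sketch using the elasticsearch8 client pinned in requirements.txt; the host, credentials, and the index name "sensors" are illustrative assumptions, not values taken from the scripts:

```python
# Minimal scroll-fetch sketch. Host, credentials, and index name are assumptions.
from elasticsearch8 import Elasticsearch

es = Elasticsearch(
    "https://localhost:9200",
    basic_auth=("elastic", "your_elasticsearch_password"),
    verify_certs=False,
)

# Open a scroll context kept alive for 2 minutes between batches.
resp = es.search(index="sensors", scroll="2m", size=5000, query={"match_all": {}})
scroll_id = resp["_scroll_id"]
hits = resp["hits"]["hits"]

while hits:
    for hit in hits:  # process the current batch
        print(hit["_source"])
    resp = es.scroll(scroll_id=scroll_id, scroll="2m")
    scroll_id = resp["_scroll_id"]
    hits = resp["hits"]["hits"]

es.clear_scroll(scroll_id=scroll_id)  # release the server-side scroll context
```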
The build.sh script installs the required packages and prepares the deployment package:
#!/bin/sh
pip3 install -r ${SRC_PKG}/requirements.txt -t ${SRC_PKG} && cp -r ${SRC_PKG} ${DEPLOY_PKG}
Run the build script using:
sh build.sh
The project dependencies are listed in requirements.txt:
elasticsearch8==8.11.0
Install them using:
pip3 install -r requirements.txt
The following API endpoints are available for fetching data:
- /liquor/data
- /accidents/data
- /mastodon/data
- /sensors/data
- /weather/data
e.g. curl "http://127.0.0.1:9090/sensors/data" | jq '.'
Parameters
- scroll_id: identifies the scroll context from a previous call (may be omitted on the first call)
- size: the number of documents to return per call (default: 5000)
e.g. curl "http://127.0.0.1:9090/sensors/data?size=100" | jq '.'
Returns
- data: the batch of fetched documents
- scroll_id: pass this back to continue the scroll on the next call
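Putting the two together, here is a small client sketch that pages through an endpoint until the data runs out; the base URL and the data/scroll_id keys come from the examples above:

```python
# Page through an endpoint using the scroll_id returned by each call.
import requests

BASE = "http://127.0.0.1:9090"

def fetch_all(endpoint, size=5000):
    docs, scroll_id = [], None
    while True:
        params = {"size": size}
        if scroll_id:
            params["scroll_id"] = scroll_id
        resp = requests.get(BASE + endpoint, params=params)
        resp.raise_for_status()
        body = resp.json()
        if not body.get("data"):
            break  # no more documents
        docs.extend(body["data"])
        scroll_id = body["scroll_id"]  # used for the next search
    return docs

sensors = fetch_all("/sensors/data", size=100)
```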
This repository contains several Python scripts for creating and inserting data into Elasticsearch indices. The scripts use the Elasticsearch Python client to interact with the Elasticsearch service. Each script serves a specific purpose, such as creating indices or bulk inserting data from JSON or CSV files.
- Python 3.x
- Elasticsearch
- Python packages: elasticsearch, python-dotenv
- Clone the repository.
- Install the required Python packages:
  pip install elasticsearch python-dotenv
- Ensure Elasticsearch is running and accessible.
- Creates an index for Mastodon social data in Elasticsearch.
- Inserts Mastodon social data into the created Elasticsearch index from JSON files.
- Inserts filtered Mastodon social data into Elasticsearch.
- Creates an index for pedestrian counting data.
- Inserts pedestrian counting data from a CSV file into the Elasticsearch index.
- Creates an index for weather station data.
- Inserts weather station data from a CSV file into the Elasticsearch index.
- Inserts liquor license data from a CSV file into Elasticsearch.
- Inserts road crash data from a CSV file into Elasticsearch.
- Creates an index for road crash data.
- Creates an index for liquor license data.
- Creates an index for filtered Mastodon social data.
- Create the necessary indices by running the respective create_* scripts.
- Insert data by running the insert_* scripts with the appropriate data files.
To create and populate the Mastodon social data index:
python create_mastodon.py
python insert_mastodon.py
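For orientation, a minimal sketch of the create/insert pattern these scripts follow; the index name, mapping fields, and file name are illustrative assumptions, not taken from the actual scripts:

```python
# Sketch of the create_*/insert_* pattern. Index name, mappings, and the
# JSON file name are assumptions for illustration only.
import json
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(
    "https://localhost:9200",
    basic_auth=("elastic", "your_elasticsearch_password"),
    verify_certs=False,
)

# create_*: define the index and its mappings
es.indices.create(
    index="mastodon",
    mappings={"properties": {"content": {"type": "text"}, "created_at": {"type": "date"}}},
)

# insert_*: bulk-load documents from a JSON file
with open("mastodon_data.json") as f:
    records = json.load(f)

bulk(es, ({"_index": "mastodon", "_source": r} for r in records))
```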
Create a .env file in the root directory with the following content:
ELASTIC_PSW=your_elasticsearch_password
Be sure to replace your_elasticsearch_password with your actual Elasticsearch password.
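The scripts can then read the password via python-dotenv; a minimal sketch, assuming the default elastic username and a local cluster:

```python
# Load ELASTIC_PSW from .env and connect. Host and username are assumptions.
import os
from dotenv import load_dotenv
from elasticsearch import Elasticsearch

load_dotenv()  # reads .env from the current directory
es = Elasticsearch(
    "https://localhost:9200",
    basic_auth=("elastic", os.environ["ELASTIC_PSW"]),
    verify_certs=False,
)
print(es.ping())  # True if the cluster is reachable
```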
This project is licensed under the MIT License.
For more details, refer to the individual script files.
This folder contains the front-end of the project which is a Jupyter Notebook showcasing the visualisation for the data that we have collected.
- The relationship between the locations of car crashes and the locations of pubs.
- The relationship between the weather and pedestrian flow.
- How the weather could affect people's mood and emotions.
- Changes in the sentiment score over a year.
This file generates a word cloud based on the data that we have retrieved from Mastodon.
To run this file, install the packages below first:
pip install wordcloud
pip install matplotlib
pip install nltk
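A minimal sketch of the word-cloud pipeline these packages imply; the input file name and its "content" field are assumptions for illustration:

```python
# Build a word cloud from Mastodon post text. The input file name and
# the "content" field are assumptions.
import json
import nltk
from nltk.corpus import stopwords
from wordcloud import WordCloud
import matplotlib.pyplot as plt

nltk.download("stopwords")

with open("mastodon_data.json") as f:
    posts = json.load(f)

text = " ".join(post.get("content", "") for post in posts)
wc = WordCloud(
    width=800,
    height=400,
    stopwords=set(stopwords.words("english")),
    background_color="white",
).generate(text)

plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
```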
This project is designed to collect and process data from various Mastodon servers, specifically focusing on the aus.social and mastodon.social servers. It aims to provide insights into the usage patterns and trends within these communities.
- harvester.py: Script for collecting data from specified Mastodon servers.
- request.py: Handles API requests.
- server.py: Sets up and runs the server.
- cleaner.py: Cleans and processes the data.
- token.json: Stores tokens for accessing Mastodon servers.
  - MASTODON_SOCIAL_TOKEN: Token for the mastodon.social server.
  - MASTODON_AU_TOKEN: Token for the aus.social server.
- Ensure Python and all necessary dependencies are installed.
- Configure the token.json file, ensuring the tokens are up to date.
Navigate to the project directory in the command line and run the following command to start the server:
python harvester.py
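For reference, here is a minimal sketch of the kind of request the harvester makes; the token.json layout follows the key names listed above, and the public-timeline endpoint is the standard Mastodon API (how harvester.py actually uses it is not shown here):

```python
# Fetch recent public posts from mastodon.social with a stored token.
# token.json is assumed to look like:
#   {"MASTODON_SOCIAL_TOKEN": "...", "MASTODON_AU_TOKEN": "..."}
import json
import requests

with open("token.json") as f:
    tokens = json.load(f)

resp = requests.get(
    "https://mastodon.social/api/v1/timelines/public",
    headers={"Authorization": f"Bearer {tokens['MASTODON_SOCIAL_TOKEN']}"},
    params={"limit": 40},  # Mastodon caps timeline pages at 40 statuses
)
resp.raise_for_status()
for status in resp.json():
    print(status["created_at"], status["account"]["acct"])
```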
This repository contains unit test Python scripts for creating data and inserting it into Elasticsearch indices. These scripts simulate a server using the mock_client library and then perform operational checks against the simulated server.
- Python 3.x
- Elasticsearch
- Python packages: elasticsearch, python-dotenv
- unittest (included in the Python standard library)
- Clone the repository.
- Install the required packages:
  ./install.sh
- Ensure Elasticsearch is running and accessible.
Tests whether the server is running successfully and echoes the status code.
Runs all available unit tests for the Elasticsearch server.
To run the tests:
./UT.sh
./test_api.sh
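As an illustration of the mocked-server approach described above, a minimal unittest sketch; the endpoint and response shape follow the API section, while the use of unittest.mock to stand in for the live server is an assumption about how the simulation works:

```python
# A unit test that checks an endpoint against a mocked server response.
import unittest
from unittest import mock

import requests

class TestSensorsEndpoint(unittest.TestCase):
    @mock.patch("requests.get")
    def test_server_is_up(self, mock_get):
        # Simulate the server replying with a healthy status and a scroll_id.
        mock_get.return_value.status_code = 200
        mock_get.return_value.json.return_value = {"data": [], "scroll_id": "abc"}

        resp = requests.get("http://127.0.0.1:9090/sensors/data")
        self.assertEqual(resp.status_code, 200)
        self.assertIn("scroll_id", resp.json())

if __name__ == "__main__":
    unittest.main()
```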
Create a .env file in the root directory with the following content:
ELASTIC_PSW=your_elasticsearch_password
Be sure to replace your_elasticsearch_password with your actual Elasticsearch password.
This project is licensed under the MIT License.
For more details, refer to the individual script files.