This repository contains a Natural Language Understanding (NLU) engine developed using Ludwig AI, a high-level, declarative machine learning framework. Ludwig streamlines the process of training and serving machine learning models, enabling rapid prototyping and development without requiring extensive programming skills. For more details, please refer to https://ludwig.ai/latest/.
The Hexabot Ludwig NLU engine is designed to process and analyze text input, extracting key information such as intent, entities, and sentiment to facilitate downstream tasks like chatbot interactions.
This project is ideal for developers, data scientists, and organizations looking to build efficient and scalable NLU capabilities for chatbots, virtual assistants, or other intelligent systems.
- Simplified Configuration: Configure and customize models effortlessly using YAML files, eliminating the need for complex coding.
- Scalable and Flexible: Easily adaptable to various domains and use cases, ensuring efficient performance across diverse datasets and requirements.
- Versatile Task Support: Capable of handling a wide range of NLU tasks, including multi-class classification, slot filling, and intent recognition.
- Python 3.8 or higher
- Ludwig AI
- GPU (optional but recommended for faster training)
- Docker
The repository uses environment variables to configure the training and serving processes. Two example .env files are provided:
- .env.train.example: used for training models
- .env.serve.example: used for model serving and inference
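For illustration, a training env file might look like the sketch below; the variable names here are hypothetical placeholders, so consult .env.train.example for the exact keys this repository expects.

# Hypothetical sketch only -- see .env.train.example for the real keys.
MODEL_NAME=my-nlu-model        # name under which the trained model is stored
DATASET_PATH=/data/train.csv   # path to the training dataset
OUTPUT_DIRECTORY=/results      # where experiment artifacts are written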
Set up your virtual environment by running the following commands:
python3 -m venv venv
source venv/bin/activate
Install the necessary dependencies by running the following command:
pip install -r requirements.txt
Train your own model locally using the following command:
ludwig experiment --config /src/config.yaml \
  --dataset /data/train.csv \
  --output_directory /results
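The config file declares the model's input and output features. As a minimal sketch of what /src/config.yaml could contain for intent classification (the feature names, encoder choice, and column names below are illustrative assumptions, not this repository's actual configuration, and exact syntax varies slightly across Ludwig versions):

# Illustrative Ludwig config -- adjust names and encoder to your dataset.
input_features:
  - name: text          # column in train.csv holding the utterance
    type: text
    encoder:
      type: parallel_cnn
output_features:
  - name: intent        # column holding the intent label
    type: category
trainer:
  epochs: 10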
Test out your trained model using the following command. Please remember to adjust the path to your model accordingly by modifying the model_path argument in the command below.
ludwig predict \
  --model_path /results/experiment_run_0/model \
  --dataset /data/predict.csv \
  --output_directory /predictions
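The prediction dataset only needs the input feature columns. Assuming the illustrative text column from the config sketch above, /data/predict.csv could look like:

text
How do I reset my password?
What are your opening hours?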
Visualize key metrics for your trained model using the following command. Please remember to adjust the paths accordingly by modifying the training_statistics argument in the command below.
ludwig visualize --visualization learning_curves \
  --ground_truth_metadata /results/experiment_run_0/model/training_set_metadata.json \
  --training_statistics /results/experiment_run_0/training_statistics.json \
  --file_format png \
  --output_directory /results/visualizations
Set up a serving API locally using the following command. Please remember to adjust the path to your model accordingly by modifying the model_path argument in the command below.
ludwig serve --model_path /results/experiment_run_0/model
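Once the server is running (on port 8000 by default), you can send requests to Ludwig's /predict endpoint, passing each input feature as a form field. Assuming the illustrative text input feature from the config sketch above:

curl http://0.0.0.0:8000/predict -X POST -F "text=How do I reset my password?"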
Train your model in a dockerized environment using the following command. The model's name is set as an environment variable, so please modify it accordingly. Remember to adjust the path to your dataset as well.
docker compose -f docker-compose.train.yml up
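If the Compose file reads its settings from an env file derived from the provided example (a plausible setup; check docker-compose.train.yml for the exact file it expects), you would prepare it before running the command above:

cp .env.train.example .env.train   # then edit the copy to set your model name and dataset path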
Use the following command to test your trained model in a dockerized environment.
docker compose -f docker-compose.predict.yml up
Visualize key metrics for your trained model in a dockerized environment using the following command.
docker compose -f docker-compose.visualize.yml up
Set up a serving API in a dockerized environment using the following command. There are two supported modes for serving models: either download Hugging Face models, or use your own locally trained models for inference.
To serve a Hugging Face model, adjust your configuration accordingly and set up the correct environment variables.
docker compose -f docker-compose.serve-hf.yml up
To serve a locally trained model, likewise adjust your configuration and set up the correct environment variables.
docker compose -f docker-compose.serve-local.yml up
You can upload your trained models to the Hugging Face Hub to make them publicly accessible or to share them with collaborators. The Hugging Face Command Line Interface (CLI) simplifies the process.
Set up an SSH key for Hugging Face first. For further instructions, please refer to https://huggingface.co/docs/hub/security-git-ssh
Create a new repository using the Hugging Face CLI.
huggingface-cli repo create <repo-name>
Clone your newly created repository using the following commands:
git clone [email protected]:<username>/<repo-name>.git
cd <repo-name>
Set up Git LFS (Large File Storage) to manage large files (e.g., model weights).
git lfs install
git lfs track "model/model_weights"
Copy all your model files into the cloned repository directory, then add, commit, and push them:
git add .
git commit -m "Upload trained model with weights"
git push
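Alternatively, recent versions of the Hugging Face CLI can upload a folder directly, without the git/LFS workflow. A convenience sketch, assuming your model files live in a local model/ directory:

huggingface-cli upload <username>/<repo-name> ./model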