Tweet Sentiment Analyzer Service

A FastAPI-based web service that deploys a pre-trained sentiment extraction model from the Kaggle "Tweet Sentiment Extraction" competition. This service offers two deployment options:

  • Encapsulated FastAPI: Deploys the model directly within a FastAPI application.
  • NVIDIA Triton Inference Server: Uses Triton for optimized inference, with a FastAPI client as a proxy.

Table of Contents

  • Overview
  • Project Structure
  • Requirements
  • Setup
  • Running the Service
  • API Usage
  • Testing
  • Performance Measurement and Optimization
  • Reports
  • License

Overview

The Tweet Sentiment Extraction Service provides an API for extracting sentiment-based text from tweets. It uses a pre-trained RoBERTa model fine-tuned for sentiment extraction, inspired by Chris Deotte's approach from the Kaggle competition. The service is built using FastAPI, TensorFlow, and tokenizers, and it supports GPU acceleration through Docker and Triton Inference Server.

Project Structure

.
├── config                   # Model configuration files
├── data                     # Dataset files
├── docker                   # Docker configurations for deployment
│   ├── docker-compose.yml   # Docker Compose file for multi-container setup
│   ├── Dockerfile.encapsulated # Dockerfile for encapsulated FastAPI deployment
│   └── Dockerfile.triton    # Dockerfile for Triton-based deployment
├── environment.yml          # Conda environment file for encapsulated setup
├── models                   # Pre-trained model weights
├── report                   # Test and benchmark reports, including Report.md
├── requirements.txt         # Python requirements for Triton deployment
├── src                      # Source code for FastAPI application and utilities
├── static                   # HTML UI files
├── tests                    # Test suite for functionality and performance, includes Tests.md
├── triton_models            # Triton model repository
└── utils                    # Utility scripts (e.g., for model conversion)

Requirements

  • Python 3.10
  • CUDA-compatible GPU for Dockerized GPU acceleration
  • CUDA Toolkit compatible with TensorFlow and Triton (tested with CUDA 11.8)
  • Docker and NVIDIA Docker for GPU support
  • Docker Compose

Setup

Local Setup

  1. Clone the repository:

    git clone https://github.com/amd-rezaei/TweetSentimentExtractor.git
    cd TweetSentimentExtractor
  2. Set up the Conda environment (for the encapsulated setup):

    conda env create -f environment.yml
    conda activate senta
  3. Set Environment Variables (optional): Adjust any paths in .env to customize file locations if needed.

Docker Setup

The project uses two Dockerfiles: Dockerfile.encapsulated for a direct FastAPI-based deployment and Dockerfile.triton for a Triton-based deployment.

  • Encapsulated Docker Image:

    • Based on NVIDIA CUDA 11.8 with CUDNN for TensorFlow support.
    • Installs essential tools, Miniconda, and Python 3.10.
    • Sets up the Conda environment specified in environment.yml.
    • Entrypoint: start_encapsulated.sh, which initializes the FastAPI app.
  • Triton Docker Image:

    • Based on NVIDIA Triton Inference Server with Python support.
    • Installs supervisor for service management and creates a Python virtual environment for dependencies.
    • Entrypoint: start_triton.sh, which starts the Triton server and the FastAPI proxy.
  1. Build from scratch

To build the images using Docker Compose:

docker-compose -f docker/docker-compose.yml up --build
  2. Use available images

Pull Images:

docker pull ahmadrezaei96/triton:latest
docker pull ahmadrezaei96/encapsulated:latest

Running the Service

Encapsulated FastAPI

This option deploys the model directly within FastAPI, providing straightforward inference without the additional layer of Triton Inference Server. It is best suited when you want direct model access and lower operational complexity.

Run Command

To deploy the encapsulated FastAPI service, use the following command:

docker-compose -f docker/docker-compose.yml up -d encapsulated

This will start the service, making it accessible at http://localhost:9001.
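For orientation, the sketch below shows the general shape of the encapsulated deployment: a Pydantic request schema, a model loaded once at startup, and a /predict route that reuses it. It is illustrative only; the loader and inference calls are trivial placeholders, not the actual code in src.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
state = {}


class PredictRequest(BaseModel):
    text: str
    sentiment: str


@app.on_event("startup")
def load_model() -> None:
    # In the real service this loads the fine-tuned RoBERTa weights and tokenizer;
    # a placeholder object stands in here to keep the sketch self-contained.
    state["model"] = object()


@app.post("/predict")
def predict(req: PredictRequest) -> dict:
    # The real implementation runs the model to select the sentiment span;
    # here the full text is echoed back as a stand-in.
    return {"text": req.text, "selected_text": req.text}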

Triton Deployment

This option deploys the model using NVIDIA Triton Inference Server, optimized for high-performance model inference. A FastAPI client proxy is also set up to interact with Triton, separating the inference server and client layers.

Run Command

To deploy the service with Triton, use the following command:

docker-compose -f docker/docker-compose.yml up -d triton

This will start the service, making it accessible at http://localhost:9000.
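Under the hood, the FastAPI proxy forwards tokenized inputs to Triton. The sketch below shows what such a call can look like with the tritonclient library; the model name, tensor names, shapes, and the Triton HTTP port (8000 by default inside the container) are assumptions for illustration, while the real definitions live in triton_models.

import numpy as np
import tritonclient.http as httpclient

# Connect to Triton's HTTP endpoint (default port 8000 inside the container).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Hypothetical tokenized inputs, padded to a fixed sequence length of 96.
ids = httpclient.InferInput("input_ids", [1, 96], "INT32")
ids.set_data_from_numpy(np.zeros((1, 96), dtype=np.int32))
mask = httpclient.InferInput("attention_mask", [1, 96], "INT32")
mask.set_data_from_numpy(np.ones((1, 96), dtype=np.int32))

# Hypothetical model and output tensor names; see triton_models for the real ones.
result = client.infer(model_name="sentiment_extractor", inputs=[ids, mask])
start_logits = result.as_numpy("start_logits")
print(start_logits.shape)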

Docker Compose Commands

For streamlined deployment and management of both the encapsulated FastAPI and Triton-based services, Docker Compose can be used. The following commands help build, run, and tear down the services efficiently:

  1. Build the Images without Cache:

    docker-compose -f docker/docker-compose.yml build --no-cache
  2. Run the Services in Detached Mode:

    docker-compose -f docker/docker-compose.yml up -d
  3. Stop and Remove Containers:

    docker-compose -f docker/docker-compose.yml down

API Usage

The examples below target the encapsulated service at http://localhost:9001/predict; use http://localhost:9000/predict instead for the Triton deployment.

Endpoints

  • POST /predict: Extracts sentiment-based text from a tweet.
  • GET /: Returns the main HTML page.

Example Request

Request

curl -X POST "http://localhost:9001/predict" -H "Content-Type: application/json" -d '{"text": "I love the sunny weather!", "sentiment": "positive"}'

Response

{
  "text": "I love the sunny weather!",
  "selected_text": "love the sunny weather"
}
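The same request can also be sent from Python with the requests library (switch to port 9000 for the Triton deployment):

import requests

payload = {"text": "I love the sunny weather!", "sentiment": "positive"}
response = requests.post("http://localhost:9001/predict", json=payload)
response.raise_for_status()
print(response.json())  # {"text": "...", "selected_text": "..."}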

Testing

Testing with Docker Compose

To automate testing on container startup, set RUN_TESTS_ON_START=true in your docker-compose.yml file. When enabled, this will trigger the entrypoint to automatically run pytest on startup.

services:
  encapsulated:
    environment:
      - RUN_TESTS_ON_START=true

Manual Testing Commands

You can run tests manually within each container. Below are the commands for both the encapsulated and Triton containers.

1. Encapsulated FastAPI Testing

In the encapsulated FastAPI container, activate the Conda environment first, as pytest is installed within it. Here’s how:

docker exec -it <encapsulated_container_name> /bin/bash
source /opt/conda/etc/profile.d/conda.sh
conda activate senta
pytest 

2. Triton Deployment Testing

For the Triton container, you can directly use pytest if it’s installed globally or within a virtual environment. Access the container and run:

docker exec -it <triton_container_name> pytest

This verifies the functionality of the service in both deployment environments. More details can be found in tests/Tests.md.

Performance Measurement and Optimization

Key Optimizations

  • Latency Measurement: Tracks response time for /predict to identify bottlenecks (see the timing sketch after this list).
  • Docker Image Optimization: Multi-stage builds reduce image size and improve deployment time.
  • Model Warm-Up: Initial inference at startup minimizes first-request latency.
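A quick way to observe both the measured latency and the effect of warm-up is to time a few successive /predict calls from a client; without warm-up, the first call typically carries the one-off graph and CUDA initialization cost. A minimal sketch, assuming the encapsulated service on port 9001:

import time

import requests

URL = "http://localhost:9001/predict"  # use port 9000 for the Triton deployment
payload = {"text": "I love the sunny weather!", "sentiment": "positive"}

# Time several identical requests; the first is usually the slowest when no warm-up ran.
for i in range(5):
    start = time.perf_counter()
    requests.post(URL, json=payload, timeout=30)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"request {i}: {elapsed_ms:.1f} ms")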

Additional Conceptual Optimizations

  1. Batch Processing: Batching reduces redundant computations for high-throughput scenarios.
  2. TensorRT Conversion: Improves inference speed and reduces memory usage with TensorRT.
  3. Cache Frequent Requests: Caches common queries to reduce repeated computation.
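As a sketch of item 3, identical (text, sentiment) pairs could be memoized in-process so repeated queries skip the model entirely; run_model below is a trivial stand-in for the real inference call, not a function from src.

from functools import lru_cache


def run_model(text: str, sentiment: str) -> str:
    # Stand-in for the actual RoBERTa inference call.
    return text


@lru_cache(maxsize=4096)
def cached_predict(text: str, sentiment: str) -> str:
    # Repeated (text, sentiment) pairs are served from the LRU cache.
    return run_model(text, sentiment)


cached_predict("I love the sunny weather!", "positive")
cached_predict("I love the sunny weather!", "positive")  # cache hit
print(cached_predict.cache_info())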

Future Optimizations

  1. Enhanced Concurrency and Dynamic Batching: For Triton, enabling dynamic batching optimizes handling of high volumes of concurrent requests. FastAPI's asynchronous design already supports concurrency, but additional tuning can maximize connection limits.
  2. Mixed Precision: Using FP16 precision reduces memory usage and improves processing speed (see the sketch after this list).
  3. Distributed Model Serving: Load balancing across instances or GPUs for high traffic.
  4. Model Distillation: Creates lighter model versions for faster inference on limited resources.
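As a sketch of the mixed-precision idea (item 2 above), TensorFlow's Keras mixed-precision API can set a global FP16 compute policy before the model is built or loaded; the actual benefit depends on the GPU (Tensor Cores) and should be validated against model accuracy.

import tensorflow as tf

# Compute in float16 while keeping variables in float32; must be set before
# the model is constructed or loaded.
tf.keras.mixed_precision.set_global_policy("mixed_float16")
print(tf.keras.mixed_precision.global_policy())  # mixed_float16 policy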

Reports

Performance and benchmark comparisons between TensorFlow and TensorRT can be found in report/Report.md, with additional test insights in report_test_encapsulated.txt and report_test_triton.txt.

License

This project is licensed under the MIT License. See the LICENSE file for more information.