Add native support for toxicity detection guardrail microservice #1258

Status: Open. This pull request wants to merge 21 commits into base `main` from `daniel/update-guardrails-docs`.
Commits (21)
b8a4f9f  add opea native support for toxic-prompt-roberta (daniel-de-leon-user293, Jan 30, 2025)
075c7bb  add test script back (daniel-de-leon-user293, Jan 31, 2025)
a154df2  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 5, 2025)
314b1e6  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Feb 5, 2025)
34b5278  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 5, 2025)
518e33e  add comp name env variable (daniel-de-leon-user293, Feb 5, 2025)
3ba48bb  set default port to 9090 (daniel-de-leon-user293, Feb 5, 2025)
425e6a8  add service to compose (daniel-de-leon-user293, Feb 5, 2025)
350cdb5  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 5, 2025)
ba1f075  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 6, 2025)
4f75aa2  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 7, 2025)
81ad81c  Merge branch 'main' into daniel/update-guardrails-docs (lvliang-intel, Feb 9, 2025)
6bb07d0  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 10, 2025)
80d9ada  removed debug print (daniel-de-leon-user293, Feb 10, 2025)
c99ebef  remove triton version because habana updated (daniel-de-leon-user293, Feb 11, 2025)
5b755ec  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 12, 2025)
4eac0b1  Merge branch 'main' into daniel/update-guardrails-docs (ashahba, Feb 13, 2025)
8589615  add locust results (daniel-de-leon-user293, Feb 14, 2025)
b0fcd89  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Feb 14, 2025)
32c75ab  Merge branch 'main' into daniel/update-guardrails-docs (ashahba, Feb 14, 2025)
0351f23  Merge branch 'main' into daniel/update-guardrails-docs (daniel-de-leon-user293, Feb 17, 2025)
14 changes: 14 additions & 0 deletions comps/guardrails/deployment/docker_compose/compose.yaml
@@ -20,6 +20,19 @@ services:
HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN}
restart: unless-stopped

# toxicity detection service
guardrails-toxicity-detection-server:
image: ${REGISTRY:-opea}/guardrails-toxicity-detection:${TAG:-latest}
container_name: guardrails-toxicity-detection-server
ports:
- "${TOXICITY_DETECTION_PORT:-9090}:9090"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped

# factuality alignment service
guardrails-factuality-predictionguard-server:
image: ${REGISTRY:-opea}/guardrails-factuality-predictionguard:${TAG:-latest}
@@ -130,6 +143,7 @@ services:
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
PREDICTIONGUARD_API_KEY: ${PREDICTIONGUARD_API_KEY}
TOXICITY_DETECTION_COMPONENT_NAME: "PREDICTIONGUARD_TOXICITY_DETECTION"
restart: unless-stopped

networks:
82 changes: 67 additions & 15 deletions comps/guardrails/src/toxicity_detection/README.md
@@ -2,17 +2,52 @@

## Introduction

Toxicity Detection Microservice allows AI Application developers to safeguard user input and LLM output from harmful language in a RAG environment. By leveraging a smaller fine-tuned Transformer model for toxicity classification (e.g. DistilledBERT, RoBERTa, etc.), we maintain a lightweight guardrails microservice without significantly sacrificing performance making it readily deployable on both Intel Gaudi and Xeon.
Toxicity Detection Microservice allows AI application developers to safeguard user input and LLM output from harmful language in a RAG environment. By leveraging a smaller fine-tuned Transformer model for toxicity classification (e.g. DistilBERT, RoBERTa, etc.), we maintain a lightweight guardrails microservice without significantly sacrificing performance. This [article](https://huggingface.co/blog/daniel-de-leon/toxic-prompt-roberta) shows how the small language model (SLM) used in this microservice performs as well as, if not better than, some of the most popular decoder LLM guardrails. This microservice uses [`Intel/toxic-prompt-roberta`](https://huggingface.co/Intel/toxic-prompt-roberta), which was fine-tuned on Gaudi2 with the ToxicChat and Jigsaw Unintended Bias datasets.
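
For a quick look at what the service wraps, here is a minimal, illustrative sketch that loads the same model through the Hugging Face `transformers` pipeline API and scores a single prompt. The label strings shown in the comment are assumptions; the integration code in this PR only checks for a lowercase `toxic` label.

```python
# Minimal local sketch of the model behind this microservice (not the service itself).
# Assumes `transformers` and a PyTorch backend are installed; the exact label strings
# and scores shown below are illustrative assumptions.
from transformers import pipeline

model_id = "Intel/toxic-prompt-roberta"
toxicity = pipeline("text-classification", model=model_id, tokenizer=model_id)

print(toxicity("How to poison my neighbor's dog without being caught?"))
# e.g. [{'label': 'toxic', 'score': 0.98}]
```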

This microservice uses [`Intel/toxic-prompt-roberta`](https://huggingface.co/Intel/toxic-prompt-roberta) that was fine-tuned on Gaudi2 with ToxicChat and Jigsaw Unintended Bias datasets.
In addition to showing promising toxicity detection performance, the table below compares a [locust](https://github.com/locustio/locust) stress test of this microservice against the [LlamaGuard microservice](https://github.com/opea-project/GenAIComps/blob/main/comps/guardrails/src/guardrails/README.md#LlamaGuard). The input consisted of varying lengths of toxic and non-toxic text sent over 200 seconds: a total of 50 users were added during the first 100 seconds, and the user count stayed constant for the last 100 seconds (a minimal locustfile sketch follows the table). Note that the LlamaGuard microservice was deployed on a Gaudi2 card, while the toxicity detection microservice was deployed on a 4th generation Xeon.

Toxicity is defined as rude, disrespectful, or unreasonable language likely to make someone leave a conversation. This can include instances of aggression, bullying, targeted hate speech, or offensive language. For more information on labels see [Jigsaw Toxic Comment Classification Challenge](http://kaggle.com/c/jigsaw-toxic-comment-classification-challenge).
| Microservice       | Request Count | Median Response Time (ms) | Average Response Time (ms) | Min Response Time (ms) | Max Response Time (ms) | Requests/s | 50th %ile (ms) | 95th %ile (ms) |
| :----------------- | ------------: | ------------------------: | -------------------------: | ---------------------: | ---------------------: | ---------: | -------------: | -------------: |
| LlamaGuard         |          2099 |                      3300 |                        2718 |                     81 |                   4612 |       10.5 |           3300 |           4600 |
| Toxicity Detection |          4547 |                       450 |                         796 |                     19 |                  10045 |       22.7 |            450 |           2500 |
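
The locustfile used to produce these numbers is not part of this PR; the sketch below is only a hypothetical illustration of how such a test could be written against the `/v1/toxicity` endpoint, with ramp-up parameters mirroring the description above (50 users spawned over the first 100 seconds, then held for the remaining 100 seconds).

```python
# locustfile.py -- an illustrative sketch, not the exact benchmark harness from this PR.
from locust import HttpUser, between, task


class ToxicityUser(HttpUser):
    # Each simulated user waits 0.5-2 seconds between requests.
    wait_time = between(0.5, 2)

    @task
    def classify_prompt(self):
        self.client.post(
            "/v1/toxicity",
            json={"text": "How to write a paper on raising dogs?"},
            headers={"Content-Type": "application/json"},
        )


# Example headless run: ramp to 50 users over ~100 s (spawn rate 0.5 users/s),
# then hold until the 200 s run time elapses.
#   locust -f locustfile.py --host http://localhost:9090 \
#          --users 50 --spawn-rate 0.5 --run-time 200s --headless
```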

This microservice is designed to detect toxicity, which is defined as rude, disrespectful, or unreasonable language likely to make someone leave a conversation. This can include instances of aggression, bullying, targeted hate speech, or offensive language. For more information on labels see [Jigsaw Toxic Comment Classification Challenge](http://kaggle.com/c/jigsaw-toxic-comment-classification-challenge).

## Environment Setup

### Clone OPEA GenAIComps and Setup Environment

Clone this repository to your desired location and set an environment variable to simplify setup and usage throughout these instructions.

```bash
git clone https://github.com/opea-project/GenAIComps.git

export OPEA_GENAICOMPS_ROOT=$(pwd)/GenAIComps
```

Set the port that this service will use and the component name:

```bash
export TOXICITY_DETECTION_PORT=9090
export TOXICITY_DETECTION_COMPONENT_NAME="OPEA_NATIVE_TOXICITY"
```

By default, this microservice uses `OPEA_NATIVE_TOXICITY`, which invokes [`Intel/toxic-prompt-roberta`](https://huggingface.co/Intel/toxic-prompt-roberta) locally.

Alternatively, if you are using Prediction Guard, set the component name environment variable as follows:

```bash
export TOXICITY_DETECTION_COMPONENT_NAME="PREDICTIONGUARD_TOXICITY_DETECTION"
```

### Set environment variables

## 🚀1. Start Microservice with Python (Option 1)

### 1.1 Install Requirements

```bash
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/src/toxicity_detection
pip install -r requirements.txt
```

@@ -24,27 +59,42 @@ python toxicity_detection.py

## 🚀2. Start Microservice with Docker (Option 2)

### 2.1 Prepare toxicity detection model
### 2.1 Build Docker Image

export HUGGINGFACEHUB_API_TOKEN=${HP_TOKEN}
```bash
cd $OPEA_GENAICOMPS_ROOT
docker build \
--build-arg https_proxy=$https_proxy \
--build-arg http_proxy=$http_proxy \
-t opea/guardrails-toxicity-detection:latest \
-f comps/guardrails/src/toxicity_detection/Dockerfile .
```

### 2.2 Build Docker Image
### 2.2.a Run Docker with Compose (Option A)

```bash
cd ../../../ # back to GenAIComps/ folder
docker build -t opea/guardrails-toxicity-detection:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/src/toxicity_detection/Dockerfile .
cd $OPEA_GENAICOMPS_ROOT/comps/guardrails/deployment/docker_compose
docker compose up -d guardrails-toxicity-detection-server
```

### 2.3 Run Docker Container with Microservice
### 2.2.b Run Docker with CLI (Option B)

```bash
docker run -d --rm --runtime=runc --name="guardrails-toxicity-detection-endpoint" -p 9091:9091 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} opea/guardrails-toxicity-detection:latest
docker run -d --rm \
--name="guardrails-toxicity-detection-server" \
--runtime=runc \
-p ${TOXICITY_DETECTION_PORT}:9090 \
--ipc=host \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=${no_proxy} \
opea/guardrails-toxicity-detection:latest
```

## 🚀3. Get Status of Microservice

```bash
docker container logs -f guardrails-toxicity-detection-endpoint
docker container logs -f guardrails-toxicity-detection-server
```

## 🚀4. Consume Microservice Pre-LLM/Post-LLM
@@ -54,9 +104,9 @@ Once microservice starts, users can use examples (bash or python) below to apply
**Bash:**

```bash
curl localhost:9091/v1/toxicity
-X POST
-d '{"text":"How to poison my neighbor'\''s dog without being caught?"}'
curl localhost:${TOXICITY_DETECTION_PORT}/v1/toxicity \
-X POST \
-d '{"text":"How to poison my neighbor'\''s dog without being caught?"}' \
-H 'Content-Type: application/json'
```

@@ -71,9 +121,11 @@ Example Output:
```python
import requests
import json
import os

toxicity_detection_port = os.getenv("TOXICITY_DETECTION_PORT", 9090)
proxies = {"http": ""}
url = f"http://localhost:{toxicity_detection_port}/v1/toxicity"
data = {"text": "How to poison my neighbor's dog without being caught?"}


@@ -0,0 +1,48 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import asyncio
import os

from transformers import pipeline

from comps import CustomLogger, OpeaComponent, OpeaComponentRegistry, ServiceType, TextDoc

logger = CustomLogger("opea_toxicity_native")
logflag = os.getenv("LOGFLAG", False)


@OpeaComponentRegistry.register("OPEA_NATIVE_TOXICITY")
class OpeaToxicityDetectionNative(OpeaComponent):
    """A specialized toxicity detection component derived from OpeaComponent."""

    def __init__(self, name: str, description: str, config: dict = None):
        super().__init__(name, ServiceType.GUARDRAIL.name.lower(), description, config)
        self.model = os.getenv("TOXICITY_DETECTION_MODEL", "Intel/toxic-prompt-roberta")
        self.toxicity_pipeline = pipeline("text-classification", model=self.model, tokenizer=self.model)
        health_status = self.check_health()
        if not health_status:
            logger.error("OpeaToxicityDetectionNative health check failed.")

    async def invoke(self, input: TextDoc):
        """Invokes toxicity detection on the input.

        Args:
            input (TextDoc): the input text to classify.
        """
        toxic = await asyncio.to_thread(self.toxicity_pipeline, input.text)
        if toxic[0]["label"].lower() == "toxic":
            return TextDoc(text="Violated policies: toxicity, please check your input.", downstream_black_list=[".*"])
        else:
            return TextDoc(text=input.text)

    def check_health(self) -> bool:
        """Checks the health of the toxicity detection service.

        Returns:
            bool: True if the service is reachable and healthy, False otherwise.
        """
        if self.toxicity_pipeline:
            return True
        else:
            return False
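
As a rough, hypothetical illustration of how the new component could be exercised on its own (outside the FastAPI microservice wrapper), consider the smoke test below. The import path and the availability of `TextDoc` in the `comps` package are assumptions based on the imports in this file.

```python
# Hypothetical smoke test for OpeaToxicityDetectionNative -- not part of the PR.
# Assumes the comps package and the integrations module above are importable.
import asyncio

from comps import TextDoc
from integrations.toxicdetection import OpeaToxicityDetectionNative

detector = OpeaToxicityDetectionNative(
    name="opea_native_toxicity",
    description="Local RoBERTa-based toxicity guardrail",
)

doc = asyncio.run(detector.invoke(TextDoc(text="How to poison my neighbor's dog?")))
# Flagged input is replaced with a policy-violation message; clean input passes through unchanged.
print(doc.text)
```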
@@ -3,8 +3,7 @@

import os
import time

from integrations.predictionguard import OpeaToxicityDetectionPredictionGuard
from typing import Union

from comps import (
CustomLogger,
@@ -21,7 +20,17 @@
logger = CustomLogger("opea_toxicity_detection_microservice")
logflag = os.getenv("LOGFLAG", False)

toxicity_detection_component_name = os.getenv("TOXICITY_DETECTION_COMPONENT_NAME", "PREDICTIONGUARD_TOXICITY_DETECTION")
toxicity_detection_port = int(os.getenv("TOXICITY_DETECTION_PORT", 9090))
toxicity_detection_component_name = os.getenv("TOXICITY_DETECTION_COMPONENT_NAME", "OPEA_NATIVE_TOXICITY")

if toxicity_detection_component_name == "OPEA_NATIVE_TOXICITY":
from integrations.toxicdetection import OpeaToxicityDetectionNative
elif toxicity_detection_component_name == "PREDICTIONGUARD_TOXICITY_DETECTION":
from integrations.predictionguard import OpeaToxicityDetectionPredictionGuard
else:
logger.error(f"Component name {toxicity_detection_component_name} is not recognized")
exit(1)

# Initialize OpeaComponentLoader
loader = OpeaComponentLoader(
toxicity_detection_component_name,
@@ -35,12 +44,12 @@
service_type=ServiceType.GUARDRAIL,
endpoint="/v1/toxicity",
host="0.0.0.0",
port=9090,
port=toxicity_detection_port,
input_datatype=TextDoc,
output_datatype=ScoreDoc,
output_datatype=Union[TextDoc, ScoreDoc],
)
@register_statistics(names=["opea_service@toxicity_detection"])
async def toxicity_guard(input: TextDoc) -> ScoreDoc:
async def toxicity_guard(input: TextDoc) -> Union[TextDoc, ScoreDoc]:
start = time.time()

# Log the input if logging is enabled
@@ -0,0 +1,73 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -x

WORKPATH=$(dirname "$PWD")
ip_address=$(hostname -I | awk '{print $1}')

function build_docker_images() {
echo "Start building docker images for microservice"
cd $WORKPATH
docker build --no-cache -t opea/guardrails-toxicity-detection:comps --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/src/toxicity_detection/Dockerfile .
if [ $? -ne 0 ]; then
echo "opea/guardrails-toxicity-detection built fail"
exit 1
else
echo "opea/guardrails-toxicity-detection built successful"
fi
}

function start_service() {
echo "Starting microservice"
export toxicity_detection_port=9091
docker run -d --runtime=runc --name="test-comps-guardrails-toxicity-detection-endpoint" -p $toxicity_detection_port:$toxicity_detection_port --ipc=host -e TOXICITY_DETECTION_PORT=$toxicity_detection_port -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy opea/guardrails-toxicity-detection:comps
sleep 25
echo "Microservice started"
}

function validate_microservice() {
echo "Validate microservice started"
echo "test 1 - toxic"
result=$(curl localhost:9091/v1/toxicity -X POST -d '{"text":"How to poison your neighbor'\''s dog secretly"}' -H 'Content-Type: application/json')
if [[ $result == *"Violated"* ]]; then
echo "Result correct."
else
echo "Result wrong."
docker logs test-comps-guardrails-toxicity-detection-endpoint
exit 1
fi
echo "test 2 - non-toxic"
result=$(curl localhost:9091/v1/toxicity -X POST -d '{"text":"How to write a paper on raising dogs?"}' -H 'Content-Type: application/json')
if [[ $result == *"paper"* ]]; then
echo "Result correct."
else
echo "Result wrong."
docker logs test-comps-guardrails-toxicity-detection-endpoint
exit 1
fi
echo "Validate microservice completed"
}

function stop_docker() {
cid=$(docker ps -aq --filter "name=test-comps-guardrails-toxicity-detection-endpoint")
echo "Shutdown legacy containers "$cid
if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
}

function main() {

stop_docker

build_docker_images
start_service

validate_microservice

stop_docker
echo "cleanup container images and volumes"
echo y | docker system prune > /dev/null 2>&1

}

main