This project runs through the production phase of the MLOps life cycle by containerizing a web API that serves a machine learning model and deploying it to Azure Kubernetes Service (AKS).
The application was built with FastAPI, and model inputs and outputs are defined with Pydantic. Outputs are also cached in Redis for repeat inputs.
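A minimal sketch of this API layer is shown below. The module structure, field names, Redis connection details, and the model call are illustrative assumptions rather than the exact implementation (the `/project` prefix seen in the deployed URLs is assumed to be added by the cluster's routing):

```python
import hashlib
import json
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel
from redis import asyncio as aioredis

app = FastAPI()
cache = aioredis.from_url("redis://localhost:6379")  # assumed Redis address; the real one is cluster-specific


class SentimentRequest(BaseModel):
    # Pydantic model validating the request body: {"text": ["...", "..."]}
    text: List[str]


def run_model(texts: List[str]) -> list:
    # Stand-in for the actual sentiment model call
    return [[{"label": "POSITIVE", "score": 0.5}] for _ in texts]


@app.get("/health")
async def health():
    return {"status": "healthy"}


@app.post("/bulk-predict")
async def bulk_predict(request: SentimentRequest):
    # Key the cache on the input texts so repeat inputs are served from Redis
    key = hashlib.sha256(json.dumps(request.text).encode()).hexdigest()
    cached = await cache.get(key)
    if cached is not None:
        return json.loads(cached)
    predictions = run_model(request.text)
    await cache.set(key, json.dumps(predictions))
    return predictions
```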
The application endpoints were tested locally with pytest.
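As an illustration, such tests might look like the following (assuming the app object is importable from `src.main`; the actual module path and assertions in the test suite may differ):

```python
from fastapi.testclient import TestClient

from src.main import app  # assumed import path for the FastAPI app

client = TestClient(app)


def test_health_returns_200():
    response = client.get("/health")
    assert response.status_code == 200


def test_bulk_predict_returns_label_and_score_per_input():
    response = client.post(
        "/bulk-predict", json={"text": ["I am awesome", "This stinks"]}
    )
    assert response.status_code == 200
    for prediction in response.json():
        for entry in prediction:
            assert {"label", "score"} <= entry.keys()
```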
The application was containerized with a Dockerfile using `python:3.11-slim` as the base image.
The application's deployment was first tested locally with minikube before being deployed to AKS. After deploying to AKS, load testing of the application was performed with k6 and the application's traffic was monitored with Grafana.
This app requires Azure authentication and access to the W255 organization on Azure.
- Authenticate to Azure with your [email protected] email:

  ```bash
  az login --tenant berkeleydatasciw255.onmicrosoft.com
  ```
- Authenticate to the AKS cluster:

  ```bash
  az aks get-credentials --name w255-aks --resource-group w255 --overwrite-existing
  ```
- Set the kubectl context and namespace for the AKS cluster:

  ```bash
  kubectl config use-context w255-aks
  kubectl config set-context --current --namespace=cynthiaxu04
  ```
- Run the `build-push.sh` script to:
  - set the image prefix, i.e. the DNS-normalized form of the [email protected] email
  - set the image name, `project`
  - set the ACR domain name, `w255mids.azurecr.io`
  - get the latest git commit hash, which is used as the image tag
  - build and push the latest Docker image to ACR
  - pull the latest Docker image from ACR based on the image tag
- Deploy to the AKS cluster:

  ```bash
  kubectl apply -k .k8s/overlays/prod
  ```
- Wait approximately 45 seconds for the pods to launch. Check that they are running and ready with:

  ```bash
  kubectl get deployments
  ```
- Once deployed, copy and paste the given URL into your browser of choice:
https://cynthiaxu04.w255mids.com
- Access the application endpoints at the following URLs:
https://cynthiaxu04.w255mids.com/project/health
https://cynthiaxu04.w255mids.com/project/bulk-predict
- To send input to the sentiment analysis model, open a new terminal window and use the following command (a Python equivalent is sketched after the expected output below):

  ```bash
  curl -X 'POST' \
    'https://cynthiaxu04.w255mids.com/project/bulk-predict' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    "text": ["I am awesome", "This stinks"]
  }'
  ```
The expected output is:

```json
[
  [
    {"label": "POSITIVE", "score": 0.9964176416397095},
    {"label": "NEGATIVE", "score": 0.003582375356927514}
  ]
]
```
Any invalid inputs will generate an error.
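The same request can also be made from Python; this is only a convenience sketch and assumes the `requests` package is installed and the service is reachable at the URL above:

```python
import requests

URL = "https://cynthiaxu04.w255mids.com/project/bulk-predict"
payload = {"text": ["I am awesome", "This stinks"]}

response = requests.post(URL, json=payload, timeout=30)
response.raise_for_status()

# Each inner list holds the label/score pairs for one input sentence
for prediction in response.json():
    print(prediction)
```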
- To test the application's performance, run the load test:

  ```bash
  k6 run -e NAMESPACE=cynthiaxu04 load.js
  ```
- Access Grafana with the command:

  ```bash
  kubectl port-forward -n prometheus svc/grafana 3000:3000
  ```
- With resource limits of cpu=1100m and memory=1Gi, the Grafana metrics for the service and workload show P50 and P90 latencies under 2 seconds, and a P99 latency usually under 3 seconds with occasional spikes to 4 seconds. A throughput of 25 requests/s was also achieved.