
# TT-Inference-Server

Tenstorrent Inference Server (`tt-inference-server`) is the repository of available model APIs for deployment on Tenstorrent hardware.

## Official Repository

https://github.com/tenstorrent/tt-inference-server

## Getting Started

Please follow the setup instructions for the model you want to serve; each Model Name in the tables below links to the corresponding implementation.

**Note:** models with Status 🔍 preview are under active development. If you encounter setup or stability problems, please file an issue and our team will address it.
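Once a model is up, it can typically be queried over HTTP. As a minimal sketch, assuming the server exposes an OpenAI-compatible completions endpoint (the URL, port, and model name below are illustrative assumptions; substitute the values from your model's setup instructions):

```python
import json
from urllib import request

# Assumed endpoint: adjust host, port, and path to match your deployment.
BASE_URL = "http://localhost:7000/v1/completions"

# Example payload for a completions-style API; the model name is one of
# the entries from the LLMs table below.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "prompt": "What is Tenstorrent hardware?",
    "max_tokens": 64,
    "temperature": 0.7,
}

# Build the request without sending it; uncomment the last two lines to
# actually POST against a running server.
req = request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url, req.get_method())
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```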

## LLMs

| Model Name | Model URL | Hardware | Status | Minimum Release Version |
|---|---|---|---|---|
| Qwen2.5-72B-Instruct | HF Repo | TT-QuietBox & TT-LoudBox | 🔍 preview | v0.0.2 |
| Qwen2.5-72B | HF Repo | TT-QuietBox & TT-LoudBox | 🔍 preview | v0.0.2 |
| Qwen2.5-7B-Instruct | HF Repo | n150 | 🔍 preview | v0.0.2 |
| Qwen2.5-7B | HF Repo | n150 | 🔍 preview | v0.0.2 |
| Llama-3.3-70B-Instruct | HF Repo | TT-QuietBox & TT-LoudBox | ✅ supported | v0.0.1 |
| Llama-3.3-70B | HF Repo | TT-QuietBox & TT-LoudBox | ✅ supported | v0.0.1 |
| Llama-3.2-11B-Vision-Instruct | HF Repo | n300 | 🔍 preview | v0.0.1 |
| Llama-3.2-11B-Vision | HF Repo | n300 | 🔍 preview | v0.0.1 |
| Llama-3.2-3B-Instruct | HF Repo | n150 | 🔍 preview | v0.0.1 |
| Llama-3.2-3B | HF Repo | n150 | 🔍 preview | v0.0.1 |
| Llama-3.2-1B-Instruct | HF Repo | n150 | 🔍 preview | v0.0.1 |
| Llama-3.2-1B | HF Repo | n150 | 🔍 preview | v0.0.1 |
| Llama-3.1-70B-Instruct | HF Repo | TT-QuietBox & TT-LoudBox | ✅ supported | v0.0.1 |
| Llama-3.1-70B | HF Repo | TT-QuietBox & TT-LoudBox | ✅ supported | v0.0.1 |
| Llama-3.1-8B-Instruct | HF Repo | n150 | ✅ supported | v0.0.1 |
| Llama-3.1-8B | HF Repo | n150 | ✅ supported | v0.0.1 |

## CNNs

| Model Name | Model URL | Hardware | Status | Minimum Release Version |
|---|---|---|---|---|
| YOLOv4 | GH Repo | n150 | 🔍 preview | v0.0.1 |