llama.cpp is an open-source C/C++ implementation of inference for Meta's LLaMA family of models. It is designed to run large language models efficiently on a wide range of hardware, including consumer-grade devices.
Platform-specific instructions and scripts used for LLM-Inference-Bench.
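As a general point of reference, the sketch below shows one common way to build llama.cpp and run its bundled `llama-bench` tool against a GGUF model. The CMake options, binary paths, and model file shown here are assumptions for illustration; the exact flags and scripts used by LLM-Inference-Bench on each platform may differ.

```bash
# Clone and build llama.cpp. The GGML_CUDA option enables the CUDA backend
# on NVIDIA GPUs; build flag names have varied across llama.cpp versions.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

# Run the bundled benchmark tool against a GGUF model (path is a placeholder):
#   -p    prompt (prefill) length in tokens
#   -n    number of tokens to generate
#   -ngl  number of layers to offload to the GPU
./build/bin/llama-bench -m /path/to/model.gguf -p 512 -n 128 -ngl 99
```

`llama-bench` reports prefill and generation throughput (tokens per second) for each configuration, which is the kind of measurement the platform-specific scripts in this repository automate.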