This project provides a framework for benchmarking various models, including but not limited to Large Language Models (LLMs). Benchmarking has two steps: preparing a dataset for classification measurement, then generating classification metrics from the resulting predictions.
## Table of Contents

- Prerequisites
- Setup
- Step 1: Preparing Dataset for Classification Measurement
- Step 2: Generating Classification Metrics
- Usage
## Prerequisites

Before you begin, ensure you have met the following requirements:

- You have installed Kotlin and Jupyter Notebook.
- You have an OpenAI API key (if benchmarking the LLM AgentRoutingSpec Resolver). Set the `OPENAI_API_KEY` environment variable to your OpenAI API key.
- You have Ollama installed on your local machine (if benchmarking the Vector AgentRoutingSpec Resolver). The default model is `all-minilm`.
## Setup

- Clone the repository:

  ```bash
  git clone https://github.com/eclipse-lmos/lmos-router.git
  cd lmos-router
  ```

- Set the `OPENAI_API_KEY` environment variable (if applicable):

  ```bash
  export OPENAI_API_KEY=your_openai_api_key
  ```
## Step 1: Preparing Dataset for Classification Measurement

The first step is to prepare the dataset by running the `LLMResolverBenchmark.kt` or `VectorResolverBenchmark.kt` script. The script reads an input CSV file, processes each record to generate a prediction using the specified model, and writes the results to an output CSV file.
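Conceptually, this is a read–predict–write loop over the records. The sketch below illustrates that flow in Python (the actual benchmark scripts are Kotlin); the column names `query`, `expected_agent`, and `predicted_agent`, and the `predict` stub, are hypothetical and only for illustration:

```python
import csv
import io

def run_benchmark(input_csv: str, predict) -> str:
    """Read records, add a model prediction to each, return the output CSV text."""
    reader = csv.DictReader(io.StringIO(input_csv))
    rows = []
    for record in reader:
        # 'query' / 'predicted_agent' are hypothetical column names.
        record["predicted_agent"] = predict(record["query"])
        rows.append(record)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Trivial stand-in for the model call:
sample = "query,expected_agent\nreset my password,account-agent\n"
print(run_benchmark(sample, predict=lambda q: "account-agent"))
```

The output CSV keeps every input column and appends the prediction, which is the shape the metrics step consumes.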
## Step 2: Generating Classification Metrics

The second step uses the prediction file generated in Step 1 to compute classification metrics. This is done in a Jupyter Notebook.

- Open the Jupyter Notebook:

  ```bash
  jupyter notebook benchmarks/benchmark.ipynb
  ```

- Follow the instructions in the notebook to load the prediction file and generate the classification metrics.
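The notebook's job is essentially to compare the expected and predicted labels column by column. If you want a quick sanity check outside Jupyter, a minimal pure-Python equivalent (accuracy plus per-label precision and recall; the label lists here are made-up examples) might look like this:

```python
def classification_metrics(expected, predicted):
    """Return overall accuracy and {label: (precision, recall)}."""
    assert len(expected) == len(predicted)
    accuracy = sum(e == p for e, p in zip(expected, predicted)) / len(expected)
    per_label = {}
    for label in set(expected) | set(predicted):
        tp = sum(1 for e, p in zip(expected, predicted) if e == p == label)
        fp = sum(1 for p in predicted if p == label) - tp  # predicted but wrong
        fn = sum(1 for e in expected if e == label) - tp   # missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        per_label[label] = (precision, recall)
    return accuracy, per_label

acc, per_label = classification_metrics(
    ["a", "a", "b", "b"],  # expected labels
    ["a", "b", "b", "b"],  # predicted labels
)
print(acc)             # 0.75
print(per_label["b"])  # precision 2/3, recall 1.0
```

In practice the notebook will typically use a library such as scikit-learn for the same computation; this sketch only shows what the numbers mean.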
## Usage

- Run the respective resolver script to generate the prediction file:

  - For the LLM AgentRoutingSpec Resolver:

    ```bash
    kotlinc src/main/kotlin/llm/LLMResolverBenchmark.kt -include-runtime -d LLMResolverBenchmark.jar
    java -jar LLMResolverBenchmark.jar
    ```

  - For the Vector AgentRoutingSpec Resolver:

    ```bash
    kotlinc src/main/kotlin/vector/VectorResolverBenchmark.kt -include-runtime -d VectorResolverBenchmark.jar
    java -jar VectorResolverBenchmark.jar
    ```

- Open the Jupyter Notebook to generate the classification metrics:

  ```bash
  jupyter notebook benchmarks/benchmark.ipynb
  ```