
# Model Benchmarking Framework

This project provides a framework for benchmarking various models, including but not limited to Language Models (LLMs). The benchmarking process involves preparing datasets for classification measurement and generating classification metrics.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Setup](#setup)
- [Step 1: Preparing Dataset for Classification Measurement](#step-1-preparing-dataset-for-classification-measurement)
- [Step 2: Generating Classification Metrics](#step-2-generating-classification-metrics)
- [Usage](#usage)

## Prerequisites

Before you begin, ensure you have met the following requirements:

- You have Kotlin and Jupyter Notebook installed.
- You have an OpenAI API key (if benchmarking the LLM AgentRoutingSpec Resolver). Set the `OPENAI_API_KEY` environment variable to your OpenAI API key.
- You have Ollama installed on your local machine (if benchmarking the Vector AgentRoutingSpec Resolver). The default model is `all-minilm`.
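If you are benchmarking the Vector AgentRoutingSpec Resolver, you may want to fetch the embedding model ahead of time. A possible setup, assuming the standard Ollama CLI is on your `PATH` (replace the API key placeholder with your actual key):

```shell
# Pull the default embedding model used by the Vector resolver.
ollama pull all-minilm

# Provide the OpenAI API key for the LLM resolver (if applicable).
export OPENAI_API_KEY=your_openai_api_key
```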

## Setup

1. Clone the repository:

   ```shell
   git clone https://github.com/eclipse-lmos/lmos-router.git
   cd lmos-router
   ```

2. Set the `OPENAI_API_KEY` environment variable (if applicable):

   ```shell
   export OPENAI_API_KEY=your_openai_api_key
   ```

## Step 1: Preparing Dataset for Classification Measurement

The first step prepares the dataset by running either `LLMResolverBenchmark.kt` or `VectorResolverBenchmark.kt`. The script reads an input CSV file, processes each record to generate a prediction with the specified model, and writes the results to an output CSV file.
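The read-predict-write loop described above can be sketched as follows. This is a minimal illustration, not the actual benchmark code: the column layout (`query,expected`), the output columns, and the stub `resolveAgent` function are all assumptions standing in for the real LLM or Vector resolver.

```kotlin
import java.io.File

// Stub standing in for the real LLM or Vector AgentRoutingSpec resolver (assumption).
fun resolveAgent(query: String): String =
    if ("bill" in query.lowercase()) "billing-agent" else "general-agent"

// Read an input CSV (assumed columns: query,expected), predict an agent for
// each record, and write query,expected,predicted rows to the output CSV.
fun runBenchmark(input: File, output: File) {
    val rows = input.readLines()
        .drop(1) // skip the header row
        .map { line ->
            val (query, expected) = line.split(",", limit = 2)
            val predicted = resolveAgent(query)
            "$query,$expected,$predicted"
        }
    output.writeText("query,expected,predicted\n" + rows.joinToString("\n"))
}
```

The output file from this step is what the notebook in Step 2 consumes.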

## Step 2: Generating Classification Metrics

The second step uses the prediction file generated in Step 1 to compute classification metrics in a Jupyter Notebook.

### Notebook: `benchmarks/benchmark.ipynb`

1. Open the Jupyter Notebook:

   ```shell
   jupyter notebook benchmarks/benchmark.ipynb
   ```

2. Follow the instructions in the notebook to load the prediction file and generate the classification metrics.
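To make the metrics concrete, here is a minimal sketch of accuracy, precision, and recall over (expected, predicted) pairs from the Step 1 prediction file. The exact metrics and column names used by the notebook may differ; this is only an illustration of the computation.

```kotlin
// Overall accuracy: fraction of pairs where the prediction matched.
fun accuracy(pairs: List<Pair<String, String>>): Double =
    if (pairs.isEmpty()) 0.0
    else pairs.count { (expected, predicted) -> expected == predicted }.toDouble() / pairs.size

// Precision for one label: of everything predicted as `label`, how much was correct.
fun precision(pairs: List<Pair<String, String>>, label: String): Double {
    val predictedAsLabel = pairs.filter { (_, predicted) -> predicted == label }
    if (predictedAsLabel.isEmpty()) return 0.0
    return predictedAsLabel.count { (expected, _) -> expected == label }.toDouble() / predictedAsLabel.size
}

// Recall for one label: of everything that truly was `label`, how much was found.
fun recall(pairs: List<Pair<String, String>>, label: String): Double {
    val actuallyLabel = pairs.filter { (expected, _) -> expected == label }
    if (actuallyLabel.isEmpty()) return 0.0
    return actuallyLabel.count { (_, predicted) -> predicted == label }.toDouble() / actuallyLabel.size
}
```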

## Usage

1. Run the respective resolver script to generate the prediction file:

   - For the LLM AgentRoutingSpec Resolver:

     ```shell
     kotlinc src/main/kotlin/llm/LLMResolverBenchmark.kt -include-runtime -d LLMResolverBenchmark.jar
     java -jar LLMResolverBenchmark.jar
     ```

   - For the Vector AgentRoutingSpec Resolver:

     ```shell
     kotlinc src/main/kotlin/vector/VectorResolverBenchmark.kt -include-runtime -d VectorResolverBenchmark.jar
     java -jar VectorResolverBenchmark.jar
     ```

2. Open the Jupyter Notebook to generate the classification metrics:

   ```shell
   jupyter notebook benchmarks/benchmark.ipynb
   ```