Perplexity

This folder contains code for measuring a model's per-token perplexity.

Overview

The code measures the per-token perplexity of a language model. The Amber and Crystal models are currently supported.
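As a reminder of what is being measured, per-token perplexity is the exponential of the mean negative log-likelihood over the tokens of a sequence. The snippet below is a minimal sketch of that definition only, not the code in this folder: the function name `perplexity` and the sample log-probabilities are hypothetical, and it assumes natural-log token probabilities have already been obtained from a model.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for a list of natural-log
    token probabilities. Hypothetical helper, for illustration only."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Made-up log-probabilities for a 4-token sequence (not real model output).
log_probs = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.25)]
ppl = perplexity(log_probs)
```

A model that assigns every token the same probability 1/k has perplexity k, which is a handy sanity check when validating any implementation.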

Directory Structure

single_ckpt_ppl_eval.py is the main entry point for calculating perplexity on a single model. It uses the Python modules in the utils/ folder.

The utils/ folder contains helper functions for model/dataset IO:

  • data_utils.py: Dataset IO utils
  • model_utils.py: Model loader

We provide a sample dataset at ./data/wikitext.txt, which contains a 1,000-line random sample from the wikitext-2-v1 train split. By default, the perplexity results are saved to ./results.json.

Installation

  1. Clone and enter the folder:
    git clone https://github.com/LLM360/Analysis360.git
    cd Analysis360/analysis/metrics/ppl
  2. Install dependencies:
    pip install -r requirements.txt

Quick Start

Perplexity evaluation

Example usage is provided in demo.ipynb, which can be run on a single A100 80G GPU.