This repo is for running inverse scaling examples. There is a colab set up for it, which you can find in the task spreadsheet.
To run on NYU's Greene cluster:
- Follow these Getting Started instructions to get connected to Greene.
- Follow these Singularity instructions up until "Install packages", with the following differences:
  - Instead of `cuda11.2-cudnn8-devel-ubuntu20.04.sif`, use `cuda11.3.0-cudnn8-devel-ubuntu20.04.sif`
  - Instead of `overlay-7.5GB-300K.ext3.gz`, use `overlay-10GB-400K.ext3`
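For reference, fetching and unpacking the larger overlay typically looks like this (the `/scratch/work/public/overlay-fs-ext3` path is the standard Greene location and is an assumption here; adjust if your cluster keeps overlays elsewhere):

```shell
# Assumed standard Greene overlay location; verify the path on your cluster.
cp -rp /scratch/work/public/overlay-fs-ext3/overlay-10GB-400K.ext3.gz .
# Unpack it; this leaves overlay-10GB-400K.ext3 in the current directory.
gunzip overlay-10GB-400K.ext3.gz
```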
- Activate the Singularity image with the overlay.
  - Remember to run `source /ext3/env.sh` (or whatever you called it when setting up the image) to activate the Python environment.
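Concretely, an interactive session with the overlay mounted might look like the sketch below. The file names come from the steps above, but the `--nv` flag, the `rw` overlay mode, and the `/scratch/work/public/singularity` image path are assumptions based on the standard Greene workflow; adjust them to your setup.

```shell
# Launch the container with the overlay mounted read-write (assumed paths).
singularity exec --nv \
    --overlay overlay-10GB-400K.ext3:rw \
    /scratch/work/public/singularity/cuda11.3.0-cudnn8-devel-ubuntu20.04.sif \
    /bin/bash

# Inside the container, activate the Python environment:
source /ext3/env.sh
```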
- Remember to `cd` to `/ext3` and run `git clone https://github.com/naimenz/inverse-scaling-eval-pipeline` to get a copy of the code.
- Run `pip install .` to install the `inverse-scaling-eval-pipeline` package.
- Run `python -m pip install torch==1.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html` to install the correct version of PyTorch.
- Copy the `example.sbatch` script included under the `/ext3/inverse-scaling-eval-pipeline/scripts` directory to somewhere outside the image, e.g. your `home` or `scratch` directory.
- There are two options for pointing to your data:
  - Put your data in `/ext3/inverse-scaling-eval-pipeline/data` and use the option `--data` as in the script.
  - Put your data elsewhere and use the option `--dataset-path` to point to it.
- For `--exp-dir`, give the absolute path of the directory you want the results to be saved in.
- Remember to add the flag `--use-gpu` only for HuggingFace models (GPT-2, GPT-Neo), and to add the flag `--batch-size n` (with n > 1) only for OpenAI API models (GPT-3).
- Submit your `.sbatch` file as a job with `sbatch example.sbatch`.
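Putting the flags together, a minimal sbatch script might look like the sketch below. The `#SBATCH` resource values, the container paths, and the `python -m eval_pipeline.main` entry point are all assumptions — copy the real invocation from the `example.sbatch` shipped under `scripts/`; only the flags (`--dataset-path`, `--exp-dir`, `--use-gpu`) are the ones documented above.

```shell
#!/bin/bash
#SBATCH --job-name=inverse-scaling   # assumption: adjust resources to your job
#SBATCH --gres=gpu:1                 # a GPU is needed for --use-gpu runs
#SBATCH --time=02:00:00
#SBATCH --mem=16GB

# Launch the container and run the evaluation.
# NOTE: "python -m eval_pipeline.main" is a placeholder entry point -- take the
# real command from the example.sbatch shipped with the repo.
singularity exec --nv \
    --overlay overlay-10GB-400K.ext3:ro \
    /scratch/work/public/singularity/cuda11.3.0-cudnn8-devel-ubuntu20.04.sif \
    /bin/bash -c "
source /ext3/env.sh
python -m eval_pipeline.main \
    --dataset-path /scratch/\$USER/my_task.csv \
    --exp-dir /scratch/\$USER/results/my_task \
    --use-gpu
"
```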
- Run the plotting file by activating the Singularity image and running `python /ext3/inverse-scaling-eval-pipeline/eval_pipeline/plot_loss.py <path/to/results/dir>`.
Let me know which parts of these instructions are incorrect/unclear!