
GPU Testing on PrimeIntellect

[Image: PrimeIntellect interface]

Welcome to the PrimeIntellect GPU testing documentation! PrimeIntellect is an innovative aggregator platform that simplifies the use of virtual machines (VMs) equipped with PyTorch images, enabling users to efficiently set up and execute deep learning models across various GPU configurations.

Model Optimization Techniques: Pruning and Distillation

Pruning

Pruning makes models smaller and more efficient by removing structural components from the network. The key methods include (a minimal code sketch follows the list):

  • Dropping Layers: Each layer in a neural network processes data differently, contributing to its ability to learn distinct aspects of the data. By selectively removing layers, the model can become leaner without significantly impacting performance.
  • Attention Heads: These are the components of transformer models that let each token attend to the rest of the sequence in parallel, capturing context instead of processing words in isolation. Pruning redundant attention heads can streamline the model while largely preserving its contextual awareness.
  • Embedding Channels: These are the dimensions of the vector representations the model works with internally. Reducing the embedding width shrinks every weight matrix tied to it, decreasing the model's size and the amount of computation per input.
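
As an illustration of the simplest of these ideas, here is a minimal sketch of depth pruning (dropping layers) with Hugging Face transformers. The GPT-2 placeholder and the keep-every-other-layer rule are assumptions chosen for brevity; real pipelines such as Nvidia's Minitron decide what to remove based on importance estimates rather than a fixed pattern.

    # Minimal depth-pruning sketch: keep every other transformer block.
    # "gpt2" is only a small placeholder model; real pipelines score each
    # layer's importance before deciding what to drop.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2")

    kept_blocks = torch.nn.ModuleList(
        block for i, block in enumerate(model.transformer.h) if i % 2 == 0
    )
    model.transformer.h = kept_blocks          # halve the network's depth
    model.config.n_layer = len(kept_blocks)

    print(f"Pruned model now has {model.config.n_layer} transformer blocks")

A pruned model typically needs a short retraining or distillation pass afterwards to recover the lost accuracy, which is where the next technique comes in.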

Distillation

Distillation is a technique used to transfer knowledge from a large language model (LLM) to a smaller language model (SLM). This method involves training the smaller model to mimic the behavior and predictions of the larger one, effectively condensing the knowledge without needing the same computational resources.
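
The snippet below is a minimal sketch of the standard distillation loss, in which the student is trained to match the teacher's softened output distribution while still fitting the ground-truth labels. The temperature and weighting values are illustrative assumptions, not settings taken from this repository.

    # Knowledge-distillation loss sketch: a soft KL term against the teacher
    # plus the usual cross-entropy on hard labels. temperature and alpha are
    # illustrative defaults.
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        soft_teacher = F.log_softmax(teacher_logits / temperature, dim=-1)
        soft_student = F.log_softmax(student_logits / temperature, dim=-1)
        kl = F.kl_div(soft_student, soft_teacher, reduction="batchmean",
                      log_target=True) * temperature ** 2
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kl + (1 - alpha) * ce

When distilling language models, the same loss is applied per token; the temperature controls how much of the teacher's uncertainty the student gets to see.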

Understanding the Test Framework

[Image: GPU test results]

The image above summarizes our testing metrics across different GPU models. For each GPU configuration, you can see the hardware specifications (CPU, memory, disk size, VRAM) alongside performance metrics (time, RAM usage, accuracy, and cost) for running specific deep learning models such as Nvidia Minitron 8B, Mistral v0.3 (7B), Llama 3.1 (8B), and Gemma 2 (9B). This data is crucial for evaluating the efficiency and cost-effectiveness of deploying models on the different GPU setups offered by PrimeIntellect.

Accuracy Comparison Table

I conducted an experiment where I performed inference on the MMLU (Massive Multitask Language Understanding) dataset using various models. The goal was to evaluate the accuracy of each model across a wide range of subjects. After obtaining the inference results, I applied a natural language processing (NLP) algorithm to compare the AI-generated answers with the expected correct answers from the dataset.
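
For reference, the snippet below is a minimal sketch of such an inference loop using the Hugging Face datasets and transformers libraries. The subject, model name, prompt format, and generation settings are illustrative assumptions rather than the exact setup used to produce the results below.

    # MMLU inference sketch: load one subject, prompt the model with the
    # question and its four choices, and collect the generated answers.
    from datasets import load_dataset
    from transformers import pipeline

    dataset = load_dataset("cais/mmlu", "abstract_algebra", split="test")
    generator = pipeline("text-generation",
                         model="mistralai/Mistral-7B-Instruct-v0.3",
                         device_map="auto")

    predictions = []
    for example in dataset:
        choices = "\n".join(f"{chr(65 + i)}. {c}"
                            for i, c in enumerate(example["choices"]))
        prompt = f"{example['question']}\n{choices}\nAnswer:"
        output = generator(prompt, max_new_tokens=32, do_sample=False)
        predictions.append(output[0]["generated_text"][len(prompt):].strip())

The text generated after the prompt is what gets compared against the reference answer in the next step.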

The comparison was done using a fuzzy string matching algorithm, which assesses the similarity between the AI's answer and the correct answer.

This function works by first preprocessing the answers to ensure consistency in comparison. It then calculates a similarity ratio using the fuzzy string matching technique. The function checks whether the AI's answer contains the correct answer, or vice versa, and whether the similarity ratio exceeds a threshold of 80%. Additionally, it verifies if the correct answer index is present in the AI's response.
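
A minimal sketch of that check is shown below, assuming the rapidfuzz library; the function name and exact preprocessing are illustrative and may differ from the code in this repository.

    # Sketch of the answer-checking logic: containment in either direction,
    # a fuzzy similarity ratio above 80%, or the correct answer's index
    # appearing in the model's response.
    from rapidfuzz import fuzz

    def is_answer_correct(ai_answer, correct_answer, correct_index,
                          threshold=80):
        ai = ai_answer.strip().lower()
        ref = correct_answer.strip().lower()
        if ref in ai or ai in ref:
            return True
        if fuzz.ratio(ai, ref) >= threshold:
            return True
        return str(correct_index).strip().lower() in ai.split()

    # Example: is_answer_correct("The answer is Paris.", "Paris", "B") -> True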

By applying this algorithm, I was able to accurately determine the correctness of the AI-generated answers, allowing for a more nuanced evaluation of the model's performance on the MMLU dataset.

Subject/model Minitron Mistral Llama Gemma
abstract_algebra 0.364 0.545 0.455 0.273
anatomy 0.071 0.357 0.143 0.571
astronomy 0.063 0.313 0.063 0.375
business_ethics 0.091 0.727 0.091 0.273
clinical_knowledge 0.069 0.414 0.241 0.345
college_biology 0.063 0.375 0.250 0.500
college_chemistry 0.250 0.250 0.250 0.625
college_computer_science 0.091 0.273 0.182 0.273
college_mathematics 0.091 0.273 0.273 0.182
college_medicine 0.136 0.409 0.273 0.409
college_physics 0.182 0.273 0.364 0.273
computer_security 0.091 0.455 0.273 0.364
conceptual_physics 0.231 0.346 0.385 0.423
econometrics 0.083 0.667 0.333 0.333
electrical_engineering 0.188 0.375 0.313 0.375
elementary_mathematics 0.195 0.293 0.317 0.341
formal_logic 0.143 0.214 0.286 0.214
global_facts 0.100 0.200 0.500 0.200
high_school_biology 0.125 0.438 0.156 0.313
high_school_chemistry 0.091 0.182 0.091 0.182
high_school_computer_science 0.111 0.778 0.111 0.111
high_school_european_history 0.056 0.556 0.111 0.556
high_school_geography 0.136 0.455 0.409 0.545
high_school_government_and_politics 0.238 0.476 0.381 0.476
high_school_macroeconomics 0.070 0.512 0.186 0.349
high_school_mathematics 0.069 0.448 0.414 0.241
high_school_microeconomics 0.115 0.385 0.115 0.269
high_school_physics 0.000 0.235 0.000 0.118
high_school_psychology 0.200 0.617 0.367 0.483
high_school_statistics 0.043 0.435 0.261 0.391
high_school_us_history 0.045 0.545 0.136 0.409
high_school_world_history 0.038 0.462 0.192 0.231
human_aging 0.087 0.348 0.391 0.348
human_sexuality 0.083 0.250 0.250 0.250
international_law 0.077 0.462 0.077 0.231
jurisprudence 0.091 0.364 0.091 0.182
logical_fallacies 0.056 0.500 0.278 0.278
machine_learning 0.364 0.545 0.273 0.455
management 0.182 0.455 0.455 0.636
marketing 0.280 0.560 0.440 0.640
medical_genetics 0.091 0.636 0.182 0.455
miscellaneous 0.430 0.628 0.593 0.500
moral_disputes 0.158 0.368 0.132 0.237
moral_scenarios 0.130 0.380 0.090 0.360
nutrition 0.152 0.273 0.242 0.394
philosophy 0.088 0.382 0.206 0.471
prehistory 0.029 0.343 0.114 0.343
professional_accounting 0.065 0.258 0.065 0.194
professional_law 0.012 0.165 0.076 0.271
professional_medicine 0.065 0.419 0.161 0.290
professional_psychology 0.087 0.493 0.116 0.333
public_relations 0.083 0.500 0.167 0.250
security_studies 0.037 0.222 0.037 0.296
sociology 0.045 0.409 0.045 0.409
us_foreign_policy 0.091 0.455 0.000 0.636
virology 0.222 0.389 0.278 0.167
world_religions 0.316 0.632 0.579 0.579
Average (%) 12.558 41.604 23.258 35.484
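
The per-subject figures are the fraction of questions judged correct, while the final row is the column mean expressed as a percentage. A minimal sketch of that aggregation, assuming an unweighted mean over subjects stored in a dictionary (the variable names and the two sample values are illustrative):

    # Aggregate per-subject accuracies (fractions) into a percentage average.
    subject_scores = {"abstract_algebra": 0.364, "anatomy": 0.071}  # etc.

    average_pct = 100 * sum(subject_scores.values()) / len(subject_scores)
    print(f"Average accuracy: {average_pct:.3f}%")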

Connecting to the VM

How to Connect to the VM

After launching an instance on PrimeIntellect, follow these steps to connect to your VM:

  1. Download the Private Key: Once your VM is ready, download the private key provided by PrimeIntellect. This key is necessary to securely connect to your VM.
  2. Change Permissions on the Private Key: Before using the key, restrict its permissions; SSH refuses to use a private key that is readable by other users. Open a terminal on your computer, navigate to the directory where you downloaded the key, and execute the following command:
    chmod 400 [your-key-name].pem
  3. Connect to the VM: With the key's permissions set, you're ready to connect to the VM. In the same terminal window, use the connection command provided by PrimeIntellect. It will look something like this:
    ssh -i [your-key-name].pem ubuntu@[vm-ip-address]

Replace [your-key-name] with the name of your key file and [vm-ip-address] with the IP address provided for your VM.

Setting Up Your Test Environment

Once you've connected to a VM, setting up and running the test scripts is straightforward. Follow these steps:

  1. Clone the Repository
    • Open your VM's terminal.
    • Execute the following command to clone the repository containing the test scripts:
      git clone https://github.com/Hugo-SEQUIER/prime-intellect-test.git
  2. Prepare the Scripts
    • Ensure that all scripts in the cloned repository are executable by running:
      find prime-intellect-test -type f -name "*.sh" -exec chmod +x {} \;

Now all you have to do is navigate to the model's directory and run the training script:

cd prime-intellect-test
cd llama_8b
./training.sh

Conclusion

By following this guide, you can leverage PrimeIntellect's VMs to perform comprehensive benchmarks on different GPUs using the pre-configured PyTorch images. This will aid in making informed decisions about which GPU configuration best suits your deep learning tasks in terms of performance and cost efficiency.
