Commit

Release v2.0.0 (#64)
What's New:

New Features & Enhancements
- Introduced Multistage Attack: We've added a new `multistage_depth` parameter to the `start_testing()` function, allowing users to specify the depth of a dialogue during testing, enabling more sophisticated and targeted LLM red-teaming strategies.
- Refactored Sycophancy Attack: The `sycophancy_test` has been renamed to `sycophancy`, transforming it into a multistage attack for increased effectiveness in uncovering model vulnerabilities.
- Enhanced Logical Inconsistencies Attack: The `logical_inconsistencies_test` has been renamed to `logical_inconsistencies` and restructured as a multistage attack to better detect and exploit logical weaknesses within language models.
- New Multistage Harmful Behavior Attack: Introducing `harmful_behaviour_multistage`, a more nuanced version of the original harmful behavior attack, designed for deeper penetration testing.
- Innovative System Prompt Leakage Attack: We've developed a new multistage attack, `system_prompt_leakage`, leveraging jailbreak examples from a dataset to target and exploit model internals.
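
The idea of a dialogue depth limit can be illustrated with a minimal sketch. This is not LLAMATOR's implementation: only the name `multistage_depth` comes from the release notes, while `run_multistage_dialogue`, the callables it takes, and the toy attacker and target are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (attack prompt, target response)


def run_multistage_dialogue(
    attacker: Callable[[List[Turn]], str],
    target: Callable[[str], str],
    is_broken: Callable[[str], bool],
    multistage_depth: int,
) -> Tuple[bool, List[Turn]]:
    """Hypothetical helper: alternate attacker and target turns.

    The dialogue stops as soon as the target's response is judged broken,
    or after `multistage_depth` turns, whichever comes first.
    """
    history: List[Turn] = []
    for _ in range(multistage_depth):
        prompt = attacker(history)    # the attacker sees the dialogue so far
        response = target(prompt)     # the tested system answers
        history.append((prompt, response))
        if is_broken(response):
            return True, history
    return False, history


# Toy stand-ins (assumptions): the target gives in on the third attempt.
def toy_attacker(history: List[Turn]) -> str:
    return f"attempt {len(history) + 1}"


def toy_target(prompt: str) -> str:
    return "SECRET" if prompt == "attempt 3" else "I cannot help with that."


def leaked(response: str) -> bool:
    return "SECRET" in response


deep_broken, deep_history = run_multistage_dialogue(
    toy_attacker, toy_target, leaked, multistage_depth=5
)
shallow_broken, shallow_history = run_multistage_dialogue(
    toy_attacker, toy_target, leaked, multistage_depth=2
)
```

In this toy setup a depth of 5 gives the attacker enough turns to succeed, while a depth of 2 ends the dialogue before the target gives in, which is the trade-off a depth parameter controls.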

Improvements & Refinements
- Conducted extensive refactoring for improved code efficiency and maintainability across the framework.
- Made numerous small improvements and optimizations to enhance overall performance and user experience.

---------

Co-authored-by: Timur Nizamov <[email protected]>
Co-authored-by: Nikita Ivanov <[email protected]>
3 people authored Jan 14, 2025
1 parent 35382c6 commit 0404080
Showing 48 changed files with 2,377 additions and 1,266 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.1.1
current_version = 2.0.0
commit = False
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+))?
2 changes: 2 additions & 0 deletions .gitignore
@@ -87,5 +87,7 @@ report.xml

# CMake
cmake-build-*/

*/artifacts/
/examples/chrome-data/
/venv/
84 changes: 46 additions & 38 deletions CONTRIBUTING.md
@@ -49,11 +49,11 @@ pre-commit install

### Run tests

1) Go to tests/test_local_llamator.py
1) Go to `tests/test_local_llamator.py`.

2) Create .env from .env.example and fill in the necessary fields.
2) Create `.env` from `.env.example` and fill in the necessary fields.

3) Run the function to perform testing depending on your LLM client
3) Run the function to perform testing depending on your LLM client.

## Making Changes

@@ -62,21 +62,21 @@ pre-commit install
```bash
git checkout -b your-branch-name
```

2. Make your changes to the code and add or modify unit tests as necessary.

3. Run tests again
3. Run tests again.

4. Commit Your Changes
4. Commit Your Changes.

Keep your commits as small and focused as possible and include meaningful commit messages.
```bash
git add .
git commit -m "Add a brief description of your change"
```

5. Push the changes you did to GitHub
5. Push the changes you did to GitHub.

6.
```bash
git push origin your-branch-name
```
@@ -86,83 +86,91 @@ pre-commit install
The easiest way to contribute to the LLAMATOR project is by creating a new test!
This can be easily achieved by:

#### 1. Create a Test File
* Navigate to the attacks directory.
#### 1. Create a Test File:
* Navigate to the `attacks` directory.
* Create a new python file, naming it after the specific attack or the dataset it utilizes.

#### 2. Set Up Your File
#### 2. Set Up Your File.

The easiest way is to copy the existing attack (py file in the attacks directory)
and change the elements in it according to your implementation
and change the elements in it according to your implementation.

#### 3. Creating datasets with texts for attacks
For multi-stage attack implementation see "What Drives the Multi-stage?" notes in [docs](https://romiconez.github.io/llamator/attacks_description.html).

All files containing attack texts or prompts must be in parquet format.
#### 3. Creating datasets with texts for attacks.

These files are stored in the attack_data folder.
All files containing attack texts or prompts must be in `.parquet` format.

#### 3. Add your attack file name to the attack_loader.py file:
```text
from .attacks import (
dynamic_test,
translation,
typoglycemia,
dan,
These files are stored in the `attack_data` folder.

#### 3. Add your attack file name to the `attack_loader.py` file:
```python
from ..attacks import ( # noqa
aim,
self_refine,
ethical_compliance,
ucar,
base64_injection,
complimentary_transition,
dan,
ethical_compliance,
harmful_behavior,
base64_injection
linguistic,
logical_inconsistencies,
past_tense,
ru_dan,
ru_typoglycemia,
ru_ucar,
sycophancy,
typoglycemia,
ucar,
#TODO: YOUR TEST HERE
)
```

#### 4. Add your attack name to the initial_validation.py file:
```text
#### 4. Add your attack name to the docstring of `start_testing()` in `main.py` and `initial_validation.py` file:
```python
AvailableTests = [
"aim_jailbreak",
"base64_injection",
"complimentary_transition",
"do_anything_now_jailbreak",
"RU_do_anything_now_jailbreak",
"ethical_compliance",
"harmful_behavior",
"past_tense",
"linguistic_evasion",
"sycophancy_test",
"typoglycemia_attack",
"logical_inconsistencies",
"past_tense",
"RU_do_anything_now_jailbreak",
"RU_typoglycemia_attack",
"ucar",
"RU_ucar",
"sycophancy",
"typoglycemia_attack",
"ucar",
#TODO: YOUR TEST HERE
]
```
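
A name list like the one above is typically what lets a framework reject unknown test names before a run starts. The sketch below shows that idea only; `AVAILABLE_TESTS` is a shortened subset of the list, and `find_unknown_tests` is a hypothetical helper, not a function taken from `initial_validation.py`.

```python
from typing import Iterable, List

# Shortened subset of the names listed above (assumption: for illustration only).
AVAILABLE_TESTS: List[str] = [
    "aim_jailbreak",
    "logical_inconsistencies",
    "sycophancy",
    "ucar",
]


def find_unknown_tests(requested: Iterable[str]) -> List[str]:
    """Hypothetical helper: return the requested names that are not registered."""
    return [name for name in requested if name not in AVAILABLE_TESTS]


# A renamed attack requested under its old name is reported as unknown.
unknown = find_unknown_tests(["sycophancy", "sycophancy_test"])
```

Under this assumption, an attack that is implemented but missing from the list would be rejected at validation time, which is why the registration step matters.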

#### 5. Add your attack description to the attack_descriptions.json file:
#### 5. Add your attack to the `attack_descriptions.json` and `attack_descriptions.md` files.

#### 6. Open a PR! Submit your changes for review by opening a pull request.

## Submitting a pull request
## Submitting a pull request.

1. Update your branch
1. Update your branch.

Fetch any new changes from the base branch and rebase your branch.
```bash
git fetch origin
git rebase origin/main
```

2. Submit a Pull Request
2. Submit a Pull Request.

Go to GitHub and submit a pull request from your branch to the project main branch.

3. Request Reviews
3. Request Reviews.

Request reviews from other contributors listed as maintainers. If you receive feedback, make any necessary changes and push them.

4. Merge
4. Merge.

Once your pull request is approved, it will be merged into the main branch.
51 changes: 37 additions & 14 deletions README.md
@@ -1,13 +1,18 @@
# LLAMATOR

## Description 📖
Red Teaming python-framework for testing chatbots and LLM-systems

Red teaming python-framework for testing vulnerabilities of chatbots based on large language models (LLM). Supports testing of Russian-language RAG systems.
[![License: CC BY-NC-SA 4.0](https://img.shields.io/badge/License-CC_BY--NC--SA_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/llamator)](https://pypi.org/project/llamator)
[![PyPI](https://badge.fury.io/py/llamator.svg)](https://badge.fury.io/py/llamator)
[![Downloads](https://pepy.tech/badge/llamator)](https://pepy.tech/project/llamator)
[![Downloads](https://pepy.tech/badge/llamator/month)](https://pepy.tech/project/llamator)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

## Install 🚀

```bash
pip install llamator==1.1.1
pip install llamator==2.0.0
```

## Documentation 📚
@@ -16,29 +21,47 @@ Documentation Link: [https://romiconez.github.io/llamator](https://romiconez.git

## Examples 💡

* 📄 [RAG Chatbot testing via API (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-api.ipynb)
* 🧙‍♂️ [Gandalf bot testing via Selenium (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-selenium.ipynb)
* 💬 [Telegram bot testing via Telethon (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-telegram.ipynb)
* 📱 [WhatsApp client testing via Selenium (ENG)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-whatsapp.ipynb)
* 🔗 [LangChain client testing with custom attack (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-langchain-custom-attack.ipynb)
* 📄 [RAG bot testing via REST API](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-api.ipynb)
* 🧙‍♂️ [Gandalf web bot testing via Selenium](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-selenium.ipynb)
* 💬 [Telegram bot testing via Telethon](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-telegram.ipynb)
* 📱 [WhatsApp bot testing via Selenium](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-whatsapp.ipynb)
* 🔗 [LangChain client testing with custom attack](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-langchain-custom-attack.ipynb)

## Supported Clients 🛠️

* 🌐 All LangChain clients
* 🧠 OpenAI-like API
* ⚙️ Custom Class (Telegram, Selenium, etc.)
* ⚙️ Custom Class (Telegram, WhatsApp, Selenium, etc.)

## Unique Features 🌟

* 🛡️ Support for custom attacks from the user
* 📊 Results of launching each attack in CSV format
* 📈 Report with attack requests and responses for all tests in Excel format
* 📄 Test report document available in DOCX format
* ️🗡 Support for custom attacks from the user
* 👜 Large selection of attacks on RAG / Agent / Prompt in English and Russian
* 🛡 Custom configuration of chat clients
* 📊 History of attack requests and responses in Excel and CSV format
* 📄 Test report document in DOCX format

## OWASP Classification 🔒

* 💉 [LLM01: Prompt Injection and Jailbreaks](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM01_PromptInjection.md)
* 🕵 [LLM07: System Prompt Leakage](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM07_SystemPromptLeakage.md)
* 🎭 [LLM09: Misinformation](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM09_Misinformation.md)

## Community 🌍

* 📣 [Telegram Channel — AI Security Lab](https://t.me/aisecuritylab)
* 💬 [Telegram Chat — LLAMATOR | AI Red Team Community](https://t.me/llamator)

## Supported by 🚀

* [AI Security Lab ITMO](https://ai.itmo.ru/aisecuritylab)
* [Raft Security](https://raftds.ru/)
* [AI Talent Hub](https://ai.itmo.ru/)

## License 📜

© Roman Neronov, Timur Nizamov, Nikita Ivanov

This project is licensed under the terms of the **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International** license. See the [LICENSE](LICENSE) file for details.

[![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)