Release v2.0.0 #64

Merged
merged 232 commits into from
Jan 14, 2025
232 commits
6eb0894
initial commit
nizamovtimur Sep 5, 2024
727387f
Framework foundation implemented, base tests adapted for framework us…
RomiconEZ Sep 5, 2024
d915bd6
add api and selenium examples to `notebooks/`
nizamovtimur Sep 6, 2024
8f0dbe7
add some info to TODO
nizamovtimur Sep 6, 2024
e0985df
Add DeepSeek Client
Sep 6, 2024
cede2e8
Add model description to ClientConfig
Sep 6, 2024
98f9d38
Add sycophancy test
Sep 6, 2024
5545450
Merge remote-tracking branch 'origin/add-examples' into feature/sycop…
Sep 6, 2024
6b1eed7
Rename deepseel sycophancy_test notebook example
Sep 6, 2024
c7d1aa9
refactor duplicated clients, add `model_description` to `ClientBase`
nizamovtimur Sep 7, 2024
a5470a8
fix api example
nizamovtimur Sep 7, 2024
0d2992a
yet another examples formatting fix
nizamovtimur Sep 7, 2024
6ac8bb5
Merge pull request #1 from RomiconEZ/add-examples
nizamovtimur Sep 7, 2024
354e6de
Delete hello.py
RomiconEZ Sep 7, 2024
0861aa4
Add dan, ucar, translation attacks
NickoJo Sep 8, 2024
0fa112b
Add exception handler for json loads
Sep 9, 2024
6d697d7
Add base64_injection, corrected utils, translation attack
NickoJo Sep 9, 2024
4d23e2e
Update prompt for authority -bias
Sep 10, 2024
29bb21f
Add notebook for testing with deepseek
Sep 10, 2024
ca140ef
add new attacks to examples
nizamovtimur Sep 9, 2024
d48aa9e
edit README.md
nizamovtimur Sep 10, 2024
ff1b1c0
fix attacks list in examples
nizamovtimur Sep 10, 2024
60fcdb6
implement ethical compiance attacks
bulatovv Sep 10, 2024
6dccfb5
update architecture
nizamovtimur Sep 10, 2024
b739f21
Updated langchain and related libraries to the latest versions.
RomiconEZ Sep 10, 2024
33d09b4
Unified base64 attack & util
NickoJo Sep 10, 2024
eada2c6
Created Ru&En versions of dan, ucar, self_refine attacks
NickoJo Sep 10, 2024
d4091b6
Created Ru&En versions of dan, ucar, self_refine attacks
NickoJo Sep 10, 2024
3d93486
Corrected attack_loader
NickoJo Sep 10, 2024
6f76186
Merge remote-tracking branch 'origin/testing_artifacts' into feature/…
Sep 11, 2024
6fa80d0
gg
nizamovtimur Sep 11, 2024
2685382
fix ipynb
nizamovtimur Sep 11, 2024
20a5b8d
Merge pull request #5 from RomiconEZ/testing_artifacts
nizamovtimur Sep 11, 2024
36c7d2a
Merge branch 'main' into edit-readme-examples
nizamovtimur Sep 11, 2024
8ae0c1c
add english versions of attacks
bulatovv Sep 11, 2024
919ff93
actualize example
nizamovtimur Sep 11, 2024
97c2a40
Merge remote-tracking branch 'origin/main' into feature/sycophancy-test
Sep 11, 2024
c8636e7
fix prepare_attack_data
Sep 11, 2024
71346f4
merge main
NickoJo Sep 11, 2024
16aa0c0
fix conflicts
bulatovv Sep 11, 2024
4e32aac
fix gitignore
bulatovv Sep 11, 2024
f9bc034
fix merge conflict
bulatovv Sep 11, 2024
e83216a
Corrected translation attack
NickoJo Sep 11, 2024
9f16967
Corrected translation attack #2
NickoJo Sep 11, 2024
256ff0c
Updated creation of artifacts dir.
RomiconEZ Sep 11, 2024
cd08b82
fix attack data format
Sep 11, 2024
d0cd595
Merge pull request #6 from RomiconEZ/testing_artifacts
nizamovtimur Sep 11, 2024
85a16a4
add config
Sep 11, 2024
37260f5
run linters
bulatovv Sep 11, 2024
b6096ad
Merge pull request #2 from RomiconEZ/feature/jailbreak_attacks
nizamovtimur Sep 11, 2024
b3b4e19
Create env example. Add env var to openai client test.
RomiconEZ Sep 11, 2024
d4c4e8b
remove redundant line from gitignore
bulatovv Sep 11, 2024
c03ef9a
Merge pull request #8 from RomiconEZ/testing_artifacts
nizamovtimur Sep 11, 2024
5a7ad37
Merge pull request #7 from RomiconEZ/feature/sycophancy-test
nizamovtimur Sep 11, 2024
d866b16
Merge pull request #4 from RomiconEZ/ethical-compliance
nizamovtimur Sep 11, 2024
d516add
Merge branch 'main' into edit-readme-examples
nizamovtimur Sep 11, 2024
fc85ce9
add report_error for json exceptions
Sep 11, 2024
eae6f94
add all tests checking
nizamovtimur Sep 11, 2024
2e2b738
fixed dan & self-refine attack
NickoJo Sep 11, 2024
d20eb20
fix try except on upper level
Sep 11, 2024
cdec2ff
Merge pull request #10 from RomiconEZ/feature/correction_attacks
nizamovtimur Sep 11, 2024
9553698
update examples
nizamovtimur Sep 11, 2024
1179d04
Merge branch 'main' into Branch_eae6f941
nizamovtimur Sep 11, 2024
cace3e0
fix
nizamovtimur Sep 11, 2024
094eb08
refactor DRY with refusal checking
nizamovtimur Sep 11, 2024
a23acb9
fix
nizamovtimur Sep 11, 2024
61ee8bf
refactor test_descriptions
nizamovtimur Sep 11, 2024
23e1206
fix commas
nizamovtimur Sep 11, 2024
689eb5f
add `test_description` to `generate_summary`
nizamovtimur Sep 11, 2024
8693eb5
add missing periods in docstrings
nizamovtimur Sep 11, 2024
b285af2
Merge pull request #11 from RomiconEZ/refactor-refusal
nizamovtimur Sep 12, 2024
67d0a48
add ensure_ascii and returns to exceptions
Sep 12, 2024
52c6f29
Merge pull request #12 from RomiconEZ/refactor-test-descriptions
nizamovtimur Sep 12, 2024
cfe1f8d
Merge branch 'main' into edit-readme-examples
nizamovtimur Sep 12, 2024
73d3633
fix logging attack_data in error case
Sep 12, 2024
45d02d8
add ru_typoglycemia test
Sep 12, 2024
d28f196
fix `get_system_prompts_summary` if there are no system prompts
nizamovtimur Sep 12, 2024
dbae60a
Merge remote-tracking branch 'remotes/origin/feature/sycophancy' into…
nizamovtimur Sep 12, 2024
5c0b212
fix new attack
nizamovtimur Sep 12, 2024
e44394f
fix all
nizamovtimur Sep 12, 2024
6e4b808
change roadmap
nizamovtimur Sep 12, 2024
929b5c7
fix
nizamovtimur Sep 12, 2024
56c23cf
Merge pull request #3 from RomiconEZ/edit-readme-examples
nizamovtimur Sep 12, 2024
fcde67a
Added to perform the attack: amnesia, authoritative_role_impersonatio…
RomiconEZ Sep 12, 2024
df84459
Merge pull request #13 from RomiconEZ/add_dynamic_attack
nizamovtimur Sep 12, 2024
84876bd
fix some bugs
nizamovtimur Sep 12, 2024
9976948
edit architecture
nizamovtimur Sep 12, 2024
9b39292
edit examples
nizamovtimur Sep 12, 2024
5aca3db
fix
nizamovtimur Sep 12, 2024
14acd28
Merge pull request #14 from RomiconEZ/edit-examples
RomiconEZ Sep 12, 2024
ae541e3
Fix docstring in main. The correct construction of documentation base…
RomiconEZ Sep 12, 2024
0d99433
Publish package on PyPI. Documentation has been improved and completed.
RomiconEZ Sep 12, 2024
1a1ee6c
Run pre-commit
RomiconEZ Sep 12, 2024
306ba32
Fix version to -dev
RomiconEZ Sep 12, 2024
81187b8
Fix version
RomiconEZ Sep 12, 2024
4fd8a65
Bump version: 0.0.1-dev → 0.0.2-dev
RomiconEZ Sep 12, 2024
d572b63
Update README
RomiconEZ Sep 12, 2024
7f168a6
Bump version: 0.0.2-dev → 0.0.3-dev
RomiconEZ Sep 12, 2024
a6a7889
Merge pull request #15 from RomiconEZ/add_doc
RomiconEZ Sep 12, 2024
cef1126
Update setup_dev_env.sh for correct doc page building
RomiconEZ Sep 12, 2024
7190baa
Merge pull request #16 from RomiconEZ/add_doc
RomiconEZ Sep 12, 2024
242075d
Update dependency for sphinx
RomiconEZ Sep 13, 2024
645250d
Merge pull request #17 from RomiconEZ/add_doc
RomiconEZ Sep 13, 2024
27ac023
Add Doc to README
RomiconEZ Sep 13, 2024
d5a250a
Merge pull request #18 from RomiconEZ/add_doc
RomiconEZ Sep 13, 2024
df9a334
Update tqdm version
RomiconEZ Oct 19, 2024
63f6fa2
Refactor attack classes and update testing functionality
RomiconEZ Oct 20, 2024
2bf71fd
Fix ethical_compliance prompt, update README.
RomiconEZ Oct 20, 2024
7eab632
actualize examples
nizamovtimur Oct 22, 2024
7435f60
add telegram examples
nizamovtimur Oct 22, 2024
810b21d
fix static answer waiting
nizamovtimur Oct 22, 2024
09fe8be
Merge pull request #19 from RomiconEZ/num_attack_attempts
RomiconEZ Oct 22, 2024
5aeaa7b
Updated the name of the csv files with attack reports. Improved forma…
RomiconEZ Oct 25, 2024
405d3f6
Dependencies updated: python-docx added
RomiconEZ Oct 25, 2024
4a2ca1d
Pre-commit done
RomiconEZ Oct 25, 2024
8a08963
Disabling test execution in github actions
RomiconEZ Oct 30, 2024
25121b3
Changed function names in pytest test files
RomiconEZ Oct 30, 2024
5436d6b
Merge pull request #21 from RomiconEZ/docx-report
RomiconEZ Oct 30, 2024
e2f210f
Added Past Tense attack, performed minor changes, removed self_refine…
NickoJo Oct 30, 2024
d681828
Added Past Tense to attacks_description.md
NickoJo Oct 30, 2024
6915b0c
actualize notebooks and add telegram example
nizamovtimur Oct 31, 2024
1c9a3e0
add doctrings for test and `.env.example` for examples
nizamovtimur Oct 31, 2024
82a4209
Merge pull request #23 from RomiconEZ/tg-example
nizamovtimur Oct 31, 2024
41a607f
Corrected attacks
NickoJo Oct 31, 2024
894910c
Merge branch 'main' into attacks_refactoring
nizamovtimur Nov 1, 2024
0f08ff6
fix missing comma and actualize examples
nizamovtimur Nov 1, 2024
b54e29d
enhance prompt for translation attack and list of refusal words
nizamovtimur Nov 1, 2024
1d82adb
add logging shutdown
nizamovtimur Nov 1, 2024
1bb2a82
Merge pull request #25 from RomiconEZ/logging-shutdown
nizamovtimur Nov 1, 2024
46f4912
corrected check in ethical_compliance.py and utils
NickoJo Nov 1, 2024
2ef3b3c
fix some funny issues
nizamovtimur Nov 1, 2024
c370c56
Merge pull request #22 from RomiconEZ/attacks_refactoring
nizamovtimur Nov 1, 2024
ca824fb
add `llamator-langchain-custom-attack.ipynb` example notebook
nizamovtimur Nov 1, 2024
3d9a219
fix docs
nizamovtimur Nov 2, 2024
c836cbd
Update howtos.md
nizamovtimur Nov 2, 2024
13f11a1
Merge pull request #26 from RomiconEZ/custom-attack
RomiconEZ Nov 3, 2024
73aa9db
Merge branch 'main' into disable_test_check
RomiconEZ Nov 3, 2024
008c88c
Merge pull request #27 from RomiconEZ/disable_test_check
RomiconEZ Nov 3, 2024
fe4309f
Updated README. Updated documentation. Documentation is now built onl…
RomiconEZ Nov 3, 2024
362083b
Merge pull request #28 from RomiconEZ/update_doc_and_md
RomiconEZ Nov 3, 2024
af315e4
Updated README-dev
RomiconEZ Nov 3, 2024
b0b1041
Updated LLAMATOR version to 1.0.0
RomiconEZ Nov 3, 2024
6d75d25
enhance title in README and CONTRIBUTING
nizamovtimur Nov 3, 2024
69414b6
fix
nizamovtimur Nov 3, 2024
6b16cab
Merge pull request #31 from RomiconEZ/enhance-title
nizamovtimur Nov 3, 2024
01b5856
refactor LLM-as-a-judge attacks: `ethical_compliance` and `harmful_be…
nizamovtimur Nov 5, 2024
f3bfb66
fix all by pre-commit
nizamovtimur Nov 5, 2024
b771e5c
Update refusal words
nizamovtimur Nov 6, 2024
be36d02
Merge pull request #32 from RomiconEZ/refactror-llm-as-a-judge-attacks
nizamovtimur Nov 6, 2024
c6af4a3
add new sycophancy test and `are_responses_coherent()` method
nizamovtimur Nov 7, 2024
740de8c
fix grammar
nizamovtimur Nov 8, 2024
1a6859f
fix
nizamovtimur Nov 8, 2024
2894809
Merge pull request #33 from RomiconEZ/new-sycophancy
nizamovtimur Nov 8, 2024
c4648ee
fix sycophancy system prompt and description
nizamovtimur Nov 8, 2024
877f69c
Update attack_descriptions.json
nizamovtimur Nov 8, 2024
afba178
Update sycophancy.py
nizamovtimur Nov 8, 2024
9b20582
Merge pull request #34 from RomiconEZ/fix-sycophancy
nizamovtimur Nov 8, 2024
970ffba
corrected dan attack
NickoJo Nov 9, 2024
d1d5e4c
add logical inconsistencies test
nizamovtimur Nov 10, 2024
9e41f9c
Merge pull request #37 from RomiconEZ/logical-inconsistencies
nizamovtimur Nov 14, 2024
0df99de
corrected dan/ucar attack
NickoJo Nov 21, 2024
55586b0
fix "ru" column name
nizamovtimur Nov 22, 2024
78fb357
Merge branch 'main' into attacks_correction
nizamovtimur Nov 22, 2024
6426946
Merge pull request #35 from RomiconEZ/attacks_correction
nizamovtimur Nov 22, 2024
d2cdeec
Main merge with release v1.0.2 (#40)
RomiconEZ Nov 26, 2024
9091e66
refactor llm vs llm attacks with new system prompts
nizamovtimur Nov 28, 2024
1e28033
add real example to the sycophancy attack model system prompt
nizamovtimur Nov 29, 2024
9b801a2
change ethical judge prompt
nizamovtimur Nov 30, 2024
0bb5501
corrected/improved datasets + checks in attacks
NickoJo Dec 1, 2024
3cdf416
structured dataset
NickoJo Dec 1, 2024
946b0b5
fix little bug
nizamovtimur Dec 2, 2024
0be9adb
Merge pull request #43 from RomiconEZ/dataset-improve
nizamovtimur Dec 2, 2024
128396b
Merge branch 'main' into refactor-llm-judging
nizamovtimur Dec 2, 2024
894cf97
fix after merge main
nizamovtimur Dec 2, 2024
4b7d26d
run pre-commit fix
nizamovtimur Dec 2, 2024
1e118fd
Merge pull request #42 from RomiconEZ/refactor-llm-judging
nizamovtimur Dec 2, 2024
e459d6f
Add llamator example with WhatsApp integration.
RomiconEZ Dec 9, 2024
a63849b
Add author info in README
RomiconEZ Dec 10, 2024
0ab608b
add fitting datasets to `num_attempts`
nizamovtimur Dec 10, 2024
d8d2203
Merge pull request #44 from RomiconEZ/add-data-fit-to-num_attempts
nizamovtimur Dec 10, 2024
3f9238d
Add WhatsApp example in README
RomiconEZ Dec 10, 2024
c000a1a
Add WhatsApp example in Doc
RomiconEZ Dec 10, 2024
87b55f3
WhatsApp example
RomiconEZ Dec 10, 2024
ccb2159
Add model_description to ClientWhatsAppSelenium init
RomiconEZ Dec 10, 2024
32473ab
Merge pull request #45 from RomiconEZ/whatsapp-example
nizamovtimur Dec 11, 2024
847805c
Main - release v1.1.0 (#46)
RomiconEZ Dec 12, 2024
08d595b
Release v1.1.1 (#47)
RomiconEZ Dec 12, 2024
c8156be
Release v1.1.1
RomiconEZ Dec 12, 2024
5a59018
Merge branch 'release'
RomiconEZ Dec 13, 2024
e50a277
rewrite all examples notebooks in english
nizamovtimur Dec 18, 2024
b275bcc
Merge pull request #50 from RomiconEZ/translate-examples
nizamovtimur Dec 20, 2024
b2511b9
fix attack model system prompt
nizamovtimur Dec 24, 2024
c5c2ac5
Merge pull request #52 from RomiconEZ/small-fix-examples
nizamovtimur Dec 25, 2024
da2b320
Multi stage attack (#51)
nizamovtimur Dec 26, 2024
651999f
move `stop_criterion` from loop
nizamovtimur Dec 27, 2024
f7c7e69
fix sycophancy and logical_inconsistencies naming
nizamovtimur Dec 27, 2024
d3bfe8c
rename `translation.py` to `linguistic.py`
nizamovtimur Dec 27, 2024
28db862
Merge pull request #54 from RomiconEZ/refactor-multistages
nizamovtimur Dec 27, 2024
86541a6
Add refine_attack_prompt func to MultiStageInteractionSession.
RomiconEZ Dec 27, 2024
584d362
Add refine_tested_client_prompt and refine_attacker_prompt funcs to M…
RomiconEZ Dec 27, 2024
830b408
enhance whatsapp example
nizamovtimur Dec 28, 2024
a10b227
sync logic with handling tested_client_response before passing it to…
nizamovtimur Dec 28, 2024
0a3a55c
pre-commit
nizamovtimur Dec 28, 2024
87e44cd
Merge pull request #55 from RomiconEZ/refine_attack_prompt
nizamovtimur Dec 28, 2024
a854cda
Merge pull request #56 from RomiconEZ/enhance-whatsapp-example
nizamovtimur Dec 28, 2024
3adc800
added harmful_behavior_multistage.py
NickoJo Dec 26, 2024
1c63774
corrected harmful_behavior_multistage.py according to the new logic i…
NickoJo Dec 28, 2024
6bb723b
corrected attack
NickoJo Dec 28, 2024
fed170e
corrected attack
NickoJo Dec 28, 2024
30c91c5
add `harmful_behavior_multistage` to docs
nizamovtimur Dec 29, 2024
f00faf1
Merge pull request #53 from RomiconEZ/multi-stage-attack
nizamovtimur Dec 29, 2024
024ee01
Add param history_limit. Add kwargs and args to all attacks. Upgrade …
RomiconEZ Jan 11, 2025
8a38171
Adjust colour theme for dark mode for doc
RomiconEZ Jan 11, 2025
b0aecb9
Adjust colour theme for light mode for doc
RomiconEZ Jan 11, 2025
6e9a368
Run pre-commit
RomiconEZ Jan 11, 2025
6927cb1
Rename history_limit to multistage_depth. Change constant history_lim…
RomiconEZ Jan 11, 2025
a24b957
Run pre-commit
RomiconEZ Jan 11, 2025
2edf701
Update examples to use param - multistage_depth
RomiconEZ Jan 12, 2025
c4425cb
fix missing `multistage_depth` in `attack_registry.py` and actualize …
nizamovtimur Jan 12, 2025
905cd04
add `system_prompt_leakage` attack with dataset (#58)
nizamovtimur Jan 12, 2025
881bb04
corrected refine_kwargs
NickoJo Jan 12, 2025
daea518
typo fix + pre-commit
NickoJo Jan 12, 2025
0cda5e0
Merge pull request #61 from RomiconEZ/hb-issue-fix
nizamovtimur Jan 12, 2025
d448a8d
Merge branch 'main' into multistage_depth
nizamovtimur Jan 12, 2025
55c21db
add to `multistage_depth` to `system_prompt_leakage`
nizamovtimur Jan 12, 2025
5c243d3
actualize example
nizamovtimur Jan 12, 2025
b10034b
Update text blocks in ipynb examples. Add more test as comment to tes…
RomiconEZ Jan 12, 2025
c9b38d1
Merge pull request #60 from RomiconEZ/multistage_depth
RomiconEZ Jan 13, 2025
8f83521
Add more info to attacks docs (#62)
nizamovtimur Jan 13, 2025
f0a8441
Release v2.0.0
RomiconEZ Jan 13, 2025
fcb7f9a
Update llamator version and run pre-commit
RomiconEZ Jan 13, 2025
e560aa2
Bump major version to 2.0.0
RomiconEZ Jan 13, 2025
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.1.1
current_version = 2.0.0
commit = False
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+))?
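
The `parse` pattern in this hunk splits a version string into named parts. The following is a minimal sketch of how that pattern behaves when applied with Python's `re` module directly; the bump tool itself is not invoked, and the helper name `parse_version` is hypothetical.

```python
import re

# Pattern copied from the `parse` line of .bumpversion.cfg.
VERSION_RE = re.compile(
    r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+))?"
)


def parse_version(version: str) -> dict:
    """Return the named parts of a version string (hypothetical helper)."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    # groupdict() contains only the named groups; `release` is None when absent.
    return match.groupdict()


print(parse_version("2.0.0"))
print(parse_version("0.0.1-dev"))
```

The optional `release` group is what allows suffixed versions such as `0.0.1-dev`, which appear earlier in the commit history, to parse alongside plain releases such as `2.0.0`.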
2 changes: 2 additions & 0 deletions .gitignore
@@ -87,5 +87,7 @@ report.xml

# CMake
cmake-build-*/

*/artifacts/
/examples/chrome-data/
/venv/
84 changes: 46 additions & 38 deletions CONTRIBUTING.md
@@ -49,11 +49,11 @@ pre-commit install

### Run tests

1) Go to tests/test_local_llamator.py
1) Go to `tests/test_local_llamator.py`.

2) Create .env from .env.example and fill in the necessary fields.
2) Create `.env` from `.env.example` and fill in the necessary fields.

3) Run the function to perform testing depending on your LLM client
3) Run the function to perform testing depending on your LLM client.

## Making Changes

@@ -62,21 +62,21 @@ pre-commit install
```bash
git checkout -b your-branch-name
```

2. Make your changes to the code and add or modify unit tests as necessary.

3. Run tests again
3. Run tests again.

4. Commit Your Changes
4. Commit Your Changes.

Keep your commits as small and focused as possible and include meaningful commit messages.
```bash
git add .
git commit -m "Add a brief description of your change"
```

5. Push the changes you did to GitHub
5. Push the changes you did to GitHub.

6.
```bash
git push origin your-branch-name
```
@@ -86,83 +86,91 @@ pre-commit install
The easiest way to contribute to the LLAMATOR project is by creating a new test!
This can be easily achieved by:

#### 1. Create a Test File
* Navigate to the attacks directory.
#### 1. Create a Test File:
* Navigate to the `attacks` directory.
* Create a new python file, naming it after the specific attack or the dataset it utilizes.

#### 2. Set Up Your File
#### 2. Set Up Your File.

The easiest way is to copy the existing attack (py file in the attacks directory)
and change the elements in it according to your implementation
and change the elements in it according to your implementation.

#### 3. Creating datasets with texts for attacks
For multi-stage attack implementation see "What Drives the Multi-stage?" notes in [docs](https://romiconez.github.io/llamator/attacks_description.html).

All files containing attack texts or prompts must be in parquet format.
#### 3. Creating datasets with texts for attacks.

These files are stored in the attack_data folder.
All files containing attack texts or prompts must be in `.parquet` format.

#### 3. Add your attack file name to the attack_loader.py file:
```text
from .attacks import (
dynamic_test,
translation,
typoglycemia,
dan,
These files are stored in the `attack_data` folder.

#### 3. Add your attack file name to the `attack_loader.py` file:
```python
from ..attacks import ( # noqa
aim,
self_refine,
ethical_compliance,
ucar,
base64_injection,
complimentary_transition,
dan,
ethical_compliance,
harmful_behavior,
base64_injection
linguistic,
logical_inconsistencies,
past_tense,
ru_dan,
ru_typoglycemia,
ru_ucar,
sycophancy,
typoglycemia,
ucar,

#TODO: YOUR TEST HERE
)
```

#### 4. Add your attack name to the initial_validation.py file:
```text
#### 4. Add your attack name to the docstring of `start_testing()` in `main.py` and `initial_validation.py` file:
```python
AvailableTests = [
"aim_jailbreak",
"base64_injection",
"complimentary_transition",
"do_anything_now_jailbreak",
"RU_do_anything_now_jailbreak",
"ethical_compliance",
"harmful_behavior",
"past_tense",
"linguistic_evasion",
"sycophancy_test",
"typoglycemia_attack",
"logical_inconsistencies",
"past_tense",
"RU_do_anything_now_jailbreak",
"RU_typoglycemia_attack",
"ucar",
"RU_ucar",
"sycophancy",
"typoglycemia_attack",
"ucar",

#TODO: YOUR TEST HERE
]
```
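
One plausible reason the name has to be registered is so that requested tests can be checked before a run starts. The sketch below illustrates that idea only; `find_unknown_tests` is a hypothetical helper, not LLAMATOR's actual validation code, and the list is an abbreviated copy of the one above.

```python
# Abbreviated copy of the list above, for illustration only.
AVAILABLE_TESTS = [
    "aim_jailbreak",
    "base64_injection",
    "sycophancy",
    "ucar",
]


def find_unknown_tests(requested: list[str], available: list[str]) -> list[str]:
    """Return requested test names that are not registered (hypothetical helper)."""
    known = set(available)
    # Preserve the caller's order so the error report is easy to read.
    return [name for name in requested if name not in known]


print(find_unknown_tests(["ucar", "my_new_attack"], AVAILABLE_TESTS))
```

Under this assumption, a new attack whose name is missing from the list would be reported as unknown even though its module exists.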

#### 5. Add your attack description to the attack_descriptions.json file:
#### 5. Add your attack to the `attack_descriptions.json` and `attack_descriptions.md` files.

#### 6. Open a PR! Submit your changes for review by opening a pull request.

## Submitting a pull request
## Submitting a pull request.

1. Update your branch
1. Update your branch.

Fetch any new changes from the base branch and rebase your branch.
```bash
git fetch origin
git rebase origin/main
```

2. Submit a Pull Request
2. Submit a Pull Request.

Go to GitHub and submit a pull request from your branch to the project main branch.

3. Request Reviews
3. Request Reviews.

Request reviews from other contributors listed as maintainers. If you receive feedback, make any necessary changes and push them.

4. Merge
4. Merge.

Once your pull request is approved, it will be merged into the main branch.
51 changes: 37 additions & 14 deletions README.md
@@ -1,13 +1,18 @@
# LLAMATOR

## Description 📖
Red Teaming python-framework for testing chatbots and LLM-systems

Red teaming python-framework for testing vulnerabilities of chatbots based on large language models (LLM). Supports testing of Russian-language RAG systems.
[![License: CC BY-NC-SA 4.0](https://img.shields.io/badge/License-CC_BY--NC--SA_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/llamator)](https://pypi.org/project/llamator)
[![PyPI](https://badge.fury.io/py/llamator.svg)](https://badge.fury.io/py/llamator)
[![Downloads](https://pepy.tech/badge/llamator)](https://pepy.tech/project/llamator)
[![Downloads](https://pepy.tech/badge/llamator/month)](https://pepy.tech/project/llamator)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

## Install 🚀

```bash
pip install llamator==1.1.1
pip install llamator==2.0.0
```
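
After installing, the version that was actually picked up can be checked from Python. This is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of LLAMATOR.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed llamator version, or None when it is not installed.
print(installed_version("llamator"))
```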

## Documentation 📚
@@ -16,29 +21,47 @@ Documentation Link: [https://romiconez.github.io/llamator](https://romiconez.git

## Examples 💡

* 📄 [RAG Chatbot testing via API (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-api.ipynb)
* 🧙‍♂️ [Gandalf bot testing via Selenium (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-selenium.ipynb)
* 💬 [Telegram bot testing via Telethon (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-telegram.ipynb)
* 📱 [WhatsApp client testing via Selenium (ENG)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-whatsapp.ipynb)
* 🔗 [LangChain client testing with custom attack (RU)](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-langchain-custom-attack.ipynb)
* 📄 [RAG bot testing via REST API](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-api.ipynb)
* 🧙‍♂️ [Gandalf web bot testing via Selenium](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-selenium.ipynb)
* 💬 [Telegram bot testing via Telethon](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-telegram.ipynb)
* 📱 [WhatsApp bot testing via Selenium](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-whatsapp.ipynb)
* 🔗 [LangChain client testing with custom attack](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-langchain-custom-attack.ipynb)

## Supported Clients 🛠️

* 🌐 All LangChain clients
* 🧠 OpenAI-like API
* ⚙️ Custom Class (Telegram, Selenium, etc.)
* ⚙️ Custom Class (Telegram, WhatsApp, Selenium, etc.)

## Unique Features 🌟

* 🛡️ Support for custom attacks from the user
* 📊 Results of launching each attack in CSV format
* 📈 Report with attack requests and responses for all tests in Excel format
* 📄 Test report document available in DOCX format
* ️🗡 Support for custom attacks from the user
* 👜 Large selection of attacks on RAG / Agent / Prompt in English and Russian
* 🛡 Custom configuration of chat clients
* 📊 History of attack requests and responses in Excel and CSV format
* 📄 Test report document in DOCX format

## OWASP Classification 🔒

* 💉 [LLM01: Prompt Injection and Jailbreaks](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM01_PromptInjection.md)
* 🕵 [LLM07: System Prompt Leakage](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM07_SystemPromptLeakage.md)
* 🎭 [LLM09: Misinformation](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM09_Misinformation.md)

## Community 🌍

* 📣 [Telegram Channel — AI Security Lab](https://t.me/aisecuritylab)
* 💬 [Telegram Chat — LLAMATOR | AI Red Team Community](https://t.me/llamator)

## Supported by 🚀

* [AI Security Lab ITMO](https://ai.itmo.ru/aisecuritylab)
* [Raft Security](https://raftds.ru/)
* [AI Talent Hub](https://ai.itmo.ru/)

## License 📜

© Roman Neronov, Timur Nizamov, Nikita Ivanov

This project is licensed under the terms of the **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International** license. See the [LICENSE](LICENSE) file for details.

[![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)