What's New
New Features & Enhancements
- Introduced Multistage Attack: We've added a novel `multistage_depth` parameter to the `start_testing()` function, allowing users to specify the depth of a dialogue during testing, enabling more sophisticated and targeted LLM Red teaming strategies.
- Refactored Sycophancy Attack: The `sycophancy_test` has been renamed to `sycophancy`, transforming it into a multistage attack for increased effectiveness in uncovering model vulnerabilities.
- Enhanced Logical Inconsistencies Attack: The `logical_inconsistencies_test` has been renamed to `logical_inconsistencies` and restructured as a multistage attack to better detect and exploit logical weaknesses within language models.
- New Multistage Harmful Behavior Attack: Introducing `harmful_behaviour_multistage`, a more nuanced version of the original harmful behavior attack, designed for deeper penetration testing.
- Innovative System Prompt Leakage Attack: We've developed a new multistage attack, `system_prompt_leakage`, which leverages jailbreak examples from a dataset to target and exploit model internals.
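To illustrate what the new `multistage_depth` parameter controls, here is a minimal conceptual sketch of a bounded multi-turn attack loop. Note that `run_multistage_attack` and its stub replies are hypothetical stand-ins, not the real LLAMATOR API; they only show how a depth value caps the length of the attacker/target dialogue:

```python
# Conceptual sketch of a multistage (multi-turn) attack dialogue.
# All names below are illustrative, NOT the real LLAMATOR API: the
# framework drives actual attack and tested models, while this stub
# only demonstrates how `multistage_depth` bounds the dialogue length.

def run_multistage_attack(initial_prompt: str, multistage_depth: int = 3) -> list:
    """Drive a bounded multi-turn dialogue and return its history."""
    history = []
    prompt = initial_prompt
    for turn in range(multistage_depth):
        # Stand-ins for the tested model's reply and the attacker's follow-up.
        reply = f"model reply to: {prompt}"
        history.append((prompt, reply))
        prompt = f"follow-up {turn + 1} escalating on: {reply}"
    return history

dialogue = run_multistage_attack("initial probe", multistage_depth=2)
print(len(dialogue))  # the dialogue never exceeds multistage_depth turns
```

A deeper dialogue gives the attacker model more turns to build context and escalate, which is why the multistage variants above tend to uncover vulnerabilities that single-shot attacks miss.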
Improvements & Refinements
- Conducted extensive refactoring for improved code efficiency and maintainability across the framework.
- Made numerous small improvements and optimizations to enhance overall performance and user experience.
Community Engagement
- Join Our Telegram Chat: We have created a LLAMATOR channel on Telegram where we encourage all users to share feedback, discuss findings, and contribute to our community. You can find us here: @llamator
Get Involved
We value your input in making LLAMATOR the best tool for LLM Red teaming. Your feedback is essential as we continue to evolve and improve. If you have suggestions, encounter any issues, or want to share your experiences using LLAMATOR 2.0.0, please don't hesitate to reach out!
Thank you for choosing LLAMATOR. Let's make AI security better together!