detector: StringDetector matchtype fullmatch #1084

Open
leondz opened this issue Jan 17, 2025 · 1 comment
Labels
detectors work on code that inherits from or manages Detector new plugin Describes an entirely new probe, detector, generator or harness

Comments

@leondz
Collaborator

leondz commented Jan 17, 2025

Summary

Detector that checks whether the generator output is exactly a specific string

Add a configurable option to strip() the generator output before matching

Add a configurable option to still match when the output is truncated before the entire target string is produced (to handle max_tokens clipping and similar)

Useful for e.g. shield models, or probes that specify a complete expectedoutput string
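
To make the three requested behaviours concrete, here is a minimal standalone sketch. It does not use garak's actual Detector classes: the function name `fullmatch_detect`, its parameters (`strip`, `allow_truncated`, `case_sensitive`), and the hit = 1.0 / miss = 0.0 scoring are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def fullmatch_detect(
    output: Optional[str],
    targets: Iterable[str],
    strip: bool = True,
    allow_truncated: bool = False,
    case_sensitive: bool = False,
) -> float:
    """Hypothetical helper: 1.0 if `output` equals any target string, else 0.0.

    All parameter names here are illustrative, not garak's API.
    """
    if output is None:
        return 0.0

    # Optionally drop leading/trailing whitespace before comparing.
    text = output.strip() if strip else output
    if not case_sensitive:
        text = text.lower()

    for target in targets:
        candidate = target if case_sensitive else target.lower()
        # Full match: the whole output is the target string.
        if text == candidate:
            return 1.0
        # Truncation tolerance: a non-empty output that is a prefix of the
        # target is treated as a match that was clipped, e.g. by max_tokens.
        if allow_truncated and text and candidate.startswith(text):
            return 1.0
    return 0.0
```

With these assumptions, `fullmatch_detect(" unsafe\n", ["unsafe"])` counts as a hit because of the strip step, while `fullmatch_detect("unsafe content", ["unsafe"])` does not, since the output carries extra text beyond the target.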

@leondz leondz added detectors work on code that inherits from or manages Detector new plugin Describes an entirely new probe, detector, generator or harness labels Jan 17, 2025
@leondz leondz changed the title detector: fullmatch detector: StringDetector matchtype fullmatch Jan 17, 2025
@Eric-Hacker
Contributor

It seems that some of this was just added with the startswith StringDetector. The strip() option would be a good thing to add.

The option to do a partial match can be controlled by the detector by specifying a shorter string to be detected with startswith.
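
As a rough illustration of that approach, the sketch below matches on a shortened prefix. The function `startswith_detect` and its `strip` parameter are hypothetical stand-ins, not the existing StringDetector code.

```python
from typing import Iterable


def startswith_detect(output: str, prefixes: Iterable[str], strip: bool = True) -> float:
    """Hypothetical helper: 1.0 if `output` begins with any prefix, else 0.0."""
    # Optionally drop surrounding whitespace so " unsafe" still matches.
    text = output.strip() if strip else output
    return 1.0 if any(text.startswith(p) for p in prefixes) else 0.0


# Using the shortened string "uns" in place of the full target "unsafe":
complete = startswith_detect("unsafe", ["uns"])    # full output matches
clipped = startswith_detect("uns", ["uns"])        # truncated output matches too
unrelated = startswith_detect("unsure", ["uns"])   # also matches: same prefix
```

The shorter the configured string, the more truncation it tolerates, but it also accepts any other output that shares the prefix, so the chosen length is a trade-off.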

The Shields detector is designed to support the match string(s) and matchtype as optional parameters so it is ready to be run time configured for whatever the expected output would be. I thought about adding more docs to show this, but thought that might get to be too much? It could also be part of a tutorial or something.

If a fullmatch is still needed, then the Shields detector should be updated to support it. Even if not, I think Shields should use strip(), as I already ran into one situation where that would have been helpful. If agreed, assign the issue to me and tell me exactly which way to go. And by exactly, I mean if fullmatch is somehow going to have a partial option, then please provide a better name that doesn't sound so silly, because I can't think of one. 😏
