Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn if api_key in Config #1049

Conversation

erickgalinkin
Copy link
Collaborator

Fixes #927

Look for the presence of api_key as a key when loading config files, raise a warning to the user if the file is readable by group or others.

Verification

List the steps needed to make sure this thing works

---
run:
    seed:
    eval_threshold: 0.5
    generations: 1
    probe_tags:

plugins:
    model_type: test
    model_name: Lipsum
    detector_spec: auto
    api_key: test
    probe_spec: test.Test

reporting:
    report_prefix: test

run garak --config config.yaml

NOTE: The warning does not get surfaced to the user directly, which shouldn't be the case when we warn, but it is written to the log. This is an artifact of our default verbosity and it may make sense to print the warning here.

Copy link
Collaborator

@jmartin-tech jmartin-tech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me, a couple minor enhancement ideas.

garak/_config.py Outdated
Comment on lines 121 to 122
if not isinstance(d, dict):
return False
Copy link
Collaborator

@jmartin-tech jmartin-tech Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there are any current places where this would occur, however should also process any list that may contain a dict objects?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively one could go the other way and raise ValueError("_key_exists() can only process dicts"). I think it's probably beneficial to pick one way (dicts+lists of dicts) or the other (dicts or gtfo)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like the idea of enforcing dict, the way this is written is looking for something at the root which is a value and returns False at the base of the nested dict. I don't think we should be processing list objects here at all, since we're explicitly looking for a key that will end up being used by the config.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern would be if some future plugin offered something like:

detectors:
  fake:
    MultiCallDetector:
      endpoints:
        - uri: https://localhost:4000/
          api_key: value1
        - uri: https://localhost:4001/
          api_key: value2
        - uri: https://localhost:4002/
          api_key: value3

This under the current pattern would be reported as not having any api_key values when in fact there are multiple.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. That's pretty easy to solve, I think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed lists in f02d735 @jmartin-tech

Would love your review here.

garak/_config.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@leondz leondz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty cool, made some optional suggestions

garak/_config.py Outdated
Comment on lines 121 to 122
if not isinstance(d, dict):
return False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively one could go the other way and raise ValueError("_key_exists() can only process dicts"). I think it's probably beneficial to pick one way (dicts+lists of dicts) or the other (dicts or gtfo)

garak/_config.py Outdated Show resolved Hide resolved
garak/_config.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@jmartin-tech jmartin-tech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List support looks good when tested on macOS and Linux. Windows however does not reflect permissions correctly for the short term I could see a guard to only test this on *nix platforms, however I do believe proper guidance would also be a preferred action on Windows.

Result I am seeing is that even when a Windows file has inheritance disabled and permissions restricted only to the executing user the file warning is always displayed.

It might be worth adding a unit test in test_config.py that creates a temporary file that sets specific ownership values and can evaluate enforcement or at least document expectations in all supported operating systems.

This seems complete reasonable for yaml based config, in a future task json based config passed as *_option_file arguments should probably get similar treatment. Consider that when #913 is addressed this may be the right time to ensure all json config file permissions are validated.

garak/_config.py Outdated Show resolved Hide resolved
@erickgalinkin
Copy link
Collaborator Author

List support looks good when tested on macOS and Linux. Windows however does not reflect permissions correctly for the short term I could see a guard to only test this on *nix platforms, however I do believe proper guidance would also be a preferred action on Windows.

Result I am seeing is that even when a Windows file has inheritance disabled and permissions restricted only to the executing user the file warning is always displayed.

It might be worth adding a unit test in test_config.py that creates a temporary file that sets specific ownership values and can evaluate enforcement or at least document expectations in all supported operating systems.

This seems complete reasonable for yaml based config, in a future task json based config passed as *_option_file arguments should probably get similar treatment. Consider that when #913 is addressed this may be the right time to ensure all json config file permissions are validated.

That seems kind of weird re: the Windows stuff. Should definitely add a UT -- great point. Perhaps it makes sense to check and only do the check if os.name != 'nt'?

Copy link
Collaborator

@jmartin-tech jmartin-tech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@erickgalinkin erickgalinkin merged commit f70716f into NVIDIA:main Jan 7, 2025
9 checks passed
@erickgalinkin erickgalinkin deleted the 927-security-warn-if-api_key-given-in-config-and-config-is-world-readable branch January 7, 2025 00:39
@github-actions github-actions bot locked and limited conversation to collaborators Jan 7, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

security: warn if api_key given in config and config is world-readable
3 participants