Assess fidelity of secret pattern matching #209

poppysec · 2024-12-05T12:37:48Z

A considerable number of the patterns in signatures.yml match on secret variable names, such as OPENAI_API_KEY, in addition to or in place of the secret key or token itself. This is an artefact of the previous secret-blocking implementation.

- Amazon:
  - Access Key: (?:A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA|ABIA|ACCA)[A-Z0-9]{16}
  - Secret Access Key Variable: (?i)(amazon|amz|aws)[-_]{0,1}(secret)[-_]{0,1}((access)[-_]{0,1}){0,1}key
  # - Cognito User Pool ID: (?i)us-[a-z]{2,}-[a-z]{4,}-\d{1,}
  - RDS Password: (?i)(rds\-master\-password|db\-password)
  - S3 Private Key Variable: (?i)AWS_S3_PRIVATE_KEY|s3_key|S3_PRIVATE_KEY
  - Security Token Header Variable: (?i)X-Amz-Security-Token
  - API Gateway Key Source Header Variable: (?i)x-amazon-apigateway-api-key-source
  - S3 Bucket: (?i)AWS_S3_BUCKET|s3_bucket
  - SNS Confirmation URL: (?i)https:\/\/sns\.[a-z0-9-]+\.amazonaws\.com\/?Action=ConfirmSubscription&Token=[a-zA-Z0-9-=_]+
  - SES SMTP Password Variable: (?i)ses_smtp_password
  - AWS Private Key Variable: (?i)ec2\-private\-key|EC2_PRIVATE_KEY
  - MWS Token: (amzn\.mws\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})
  - AppSync GraphQL Key: \bda2-[a-z0-9]{26}

- Microsoft:
  - Azure API Key Variable: (?i)Ocp-Apim-Subscription-Key
  - Azure Functions Key Header Variable: (?i)x-functions-key
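
To make the difference concrete, here is a minimal Python sketch that runs two of the Amazon patterns listed above against made-up input lines. The two regexes are copied from the list; the input lines and their values are illustrative placeholders, not real credentials, and this is not how the matching code in the project is actually wired up.

```python
import re

# Two patterns copied verbatim from the list above.
ACCESS_KEY = re.compile(
    r"(?:A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA|ABIA|ACCA)[A-Z0-9]{16}"
)
SECRET_KEY_VARIABLE = re.compile(
    r"(?i)(amazon|amz|aws)[-_]{0,1}(secret)[-_]{0,1}((access)[-_]{0,1}){0,1}key"
)

# Hypothetical input lines; the values are placeholders.
line_with_secret = 'aws_secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"'
line_with_key_id = 'key_id = "AKIAIOSFODNN7EXAMPLE"'

name_match = SECRET_KEY_VARIABLE.search(line_with_secret)
value_match = ACCESS_KEY.search(line_with_key_id)

# The "variable" pattern hits the name only; the secret value is never captured.
print(name_match.group(0))
# The value pattern hits the key itself, regardless of the variable name.
print(value_match.group(0))
```

The first pattern reports a finding but gives the caller nothing to encrypt except the variable name, which is the behaviour this issue is concerned with.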

Now with on-the-fly encryption, we must be precise about which strings are encrypted: if we obfuscate an entire line in a user's code prompt, including the variable name, the LLM could produce mangled output. We also want to avoid adding spurious claims to the response that some number of nonexistent secrets were encrypted.

This task will focus on assessing the changes needed to how the patterns are matched, in order to improve matching fidelity. For example, we can still detect on OPENAI_API_KEY : <key>, but the key itself should be in a separate matching group so that it alone can be extracted and encrypted.
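
A rough sketch of the separate-matching-group idea in Python, under stated assumptions: the pattern, the `secret` group name, the `redact_secrets` helper and the `REDACTED<n>` placeholder are all hypothetical, and the placeholder simply stands in for the real encryption step, which this sketch does not implement.

```python
import re

# Hypothetical pattern: the variable name provides context, while the named
# group "secret" captures only the value. The value character class and the
# minimum length are assumptions made for this example.
OPENAI_KEY = re.compile(
    r"""OPENAI_API_KEY\s*[:=]\s*["']?(?P<secret>[A-Za-z0-9_\-]{20,})["']?"""
)


def redact_secrets(text, placeholder="REDACTED<{}>"):
    """Replace only the captured secret, leaving the variable name intact.

    Returns the rewritten text and the number of secrets actually replaced.
    """
    count = 0

    def _swap(match):
        nonlocal count
        count += 1
        start, end = match.span("secret")
        base = match.start()
        whole = match.group(0)
        # Rebuild the matched text with just the secret span substituted.
        return whole[: start - base] + placeholder.format(count) + whole[end - base:]

    return OPENAI_KEY.sub(_swap, text), count


prompt = 'OPENAI_API_KEY = "sk-abcdefghijklmnopqrstuvwxyz123456"\nprint("hello")'
redacted, n = redact_secrets(prompt)
print(redacted)
print(n)

# A bare mention of the variable name, with no value, is not counted.
untouched, zero = redact_secrets("OPENAI_API_KEY is read from the environment")
print(zero)
```

Because the count is incremented only when the value group matches, a line that merely mentions the variable name would not contribute to the "prevented N secrets" message.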


lukehinds · 2024-12-09T10:49:57Z

@lukehinds plan for roadmap

poppysec · 2024-12-11T18:39:46Z

Doing some testing: a Jupyter notebook with the following cells, containing GPT-4o-generated secrets:

x = 'AIzaSyD-EXAMPLEk72b3gHs4TTGEXAMPLEKEY'
my_stripe_secret_key = 'sk_test_4eC39HqLyjWDarjtT1zdp7dcEXAMPLE'
gh_token = 'ghp_1a2B3c4D5eF6G7H8I9J0KLmnopQrSTuVWxyzEXAMPLE'
slack_token = 'xoxb-123456789012-9876543210987-abcdefGHIJKLMNOPQRSTUvwxYz'

btc_address = '1BoatSLRHtKNngkdXEeobR76b53LETtpyT'
eth_addresses = ['0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359', '0x32Be343B94f860124dC4fEe278FDCBD38C102D88']
xrp_address = 'rEb8TK3gBgk5auZkwc6sHnwrGVJH8DuaLh'
solana_address = '4o5YuSJ2dhJkVzUVAsxP2hbr4R8RLyX5NXSTFcHbSjaT'

btc_wif = '5HueCGU8rMjxEXAMPLEGonPnLC5EXAMPL3dENwRYEXAMPLEKKn9W'
my_jwt = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c'

The EXAMPLE ones might cause issues with pattern matching, we will see.

🛡️ Codegate prevented 9 secrets from being leaked by redacting them.

CodeGate Security Analysis

The provided code snippet contains several security concerns related to sensitive information handling. Here are the key issues and recommendations for addressing them:

API Keys and Secrets:

x: An API key that appears to be related to Google services (e.g., Google Maps API).

my_stripe_secret_key: A Stripe secret key, which should be kept confidential and not exposed in public code repositories.

gh_token: A GitHub token. Tokens should not be hard-coded in the source code as they can be used to access repositories and perform actions on behalf of the user.

Recommendations:

Use environment variables or a secure vault to store and access API keys, secret keys, and tokens.

Implement access controls to ensure that only authorized personnel can view or modify these credentials.

Cryptocurrency Addresses:

btc_address: A Bitcoin address.

eth_addresses: Ethereum addresses.

xrp_address: XRP (Ripple) address.

solana_address: Solana address.

Recommendations:

Ensure that cryptocurrency addresses are used securely and are not exposed in public repositories to prevent unauthorized transactions.

Consider using a secure wallet or service to manage cryptocurrency addresses and transactions.

Private Keys:

btc_wif: A Bitcoin private key in Wallet Import Format (WIF).

Recommendations:

Never hard-code private keys in your source code. They should be handled securely using environment variables or secure storage solutions.

Consider using a hardware wallet or a secure key management service to store private keys.

JWT Tokens:

my_jwt: A JSON Web Token (JWT).

Recommendations:

JWTs should be stored securely and not hard-coded in the source code. They should be transmitted over secure channels (e.g., HTTPS) and validated server-side.

Ensure that JWTs have appropriate expiration times and are refreshed as needed.

Best Practices for Secure Implementation:

Avoid hard-coding sensitive information such as API keys, secret keys, tokens, and private keys in the source code.

Use environment variables or secure vaults to manage sensitive information.

Implement proper access controls and authentication mechanisms to protect sensitive data.

Regularly review and update security policies and procedures to ensure compliance with industry standards and best practices.

References:

OWASP Top Ten: https://owasp.org/www-project-top-ten/

Secure Coding Practices: https://cheatsheetseries.owasp.org/cheatsheets/Secure_Coding_Cheat_Sheet.html

NIST Cybersecurity Framework: https://www.nist.gov/cybersecurity/framework

Two secrets on the same line (in a list) were detected separately, which is great!
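
For reference, per-match detection on a single line can be sketched with `re.finditer`. The Ethereum pattern below is an assumption based on the standard address shape (`0x` plus 40 hex characters) and is not necessarily the one in signatures.yml; the input is the `eth_addresses` cell from the notebook.

```python
import re

# Assumed pattern: "0x" followed by exactly 40 hexadecimal characters.
ETH_ADDRESS = re.compile(r"\b0x[a-fA-F0-9]{40}\b")

line = (
    "eth_addresses = ['0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359', "
    "'0x32Be343B94f860124dC4fEe278FDCBD38C102D88']"
)

# finditer yields one match object per occurrence, each with its own span,
# so every address can be replaced independently of the rest of the line.
matches = [(m.span(), m.group(0)) for m in ETH_ADDRESS.finditer(line)]
for span, value in matches:
    print(span, value)
```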

Ones missed:
XRP - rEb8TK3gBgk5auZkwc6sHnwrGVJH8DuaLh
Stripe secret key - sk_test_4eC39HqLyjWDarjtT1zdp7dcEXAMPLE
Google API key - AIzaSyD-EXAMPLEk72b3gHs4TTGEXAMPLEKEY
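
One possible explanation for the missed Google key, not yet checked against signatures.yml: a commonly cited pattern for Google API keys expects `AIza` followed by exactly 35 more characters (39 in total), and the generated sample is shorter than that. The pattern below is an assumption used only to illustrate the length mismatch.

```python
import re

# Assumed pattern (commonly cited for Google API keys); the one in
# signatures.yml may differ.
GOOGLE_API_KEY = re.compile(r"AIza[0-9A-Za-z_\-]{35}")

sample = "AIzaSyD-EXAMPLEk72b3gHs4TTGEXAMPLEKEY"

# The sample has fewer than 39 characters, so the fixed-length pattern fails.
print(len(sample))
print(GOOGLE_API_KEY.search(sample))

# Padding the sample to 39 characters makes the same pattern match,
# which points at length rather than the EXAMPLE substrings.
padded = sample + "x" * (39 - len(sample))
print(GOOGLE_API_KEY.search(padded) is not None)
```

If this is the cause, the miss is a property of the generated test data rather than of the pattern, so test fixtures would need realistic lengths.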

We can look at this again later when revamping the code base, once #209 gets underway.

lukehinds · 2025-01-06T11:53:29Z

Makes sense to perhaps fold this into the refactor as well: #423

poppysec added the Must-Have label Dec 5, 2024

poppysec self-assigned this Dec 5, 2024

This was referenced Dec 5, 2024

Remove regular expressions for secret variable names #210

Merged

Further removal of overbroad regex #212

Merged

lukehinds removed the Must-Have label Dec 17, 2024

lukehinds added a commit that referenced this issue Jan 6, 2025

Amazon Secret Key is matching paths

92631d7

We can look at this again later when revamping the code based on, when #209 gets underway

lukehinds mentioned this issue Jan 6, 2025

Amazon Secret Key is matching paths #490

Merged

lukehinds unassigned poppysec Jan 6, 2025

lukehinds added the feature-request label Jan 13, 2025
