[DOCS-9479] Set up SDS for Cloud Storage #27589

Open
wants to merge 1 commit into base: may/sds-expansion-cloud-storage

@maycmlee maycmlee commented Feb 12, 2025

What does this PR do? What is the motivation?

Adds a Set Up SDS for Cloud Storage doc.

Child of: #27159

DOCS-9479

Merge instructions

Merge readiness:

  • Ready for merge

Merge queue is enabled in this repo. To have it automatically merged after it receives the required reviews, create the PR (from a branch that follows the <yourname>/description naming convention) and then add the following PR comment:

/merge

Additional notes

@maycmlee maycmlee added the WORK IN PROGRESS No review needed, it's a wip ;) label Feb 12, 2025
@maycmlee maycmlee requested a review from a team as a code owner February 12, 2025 20:06
@github-actions github-actions bot added Architecture Everything related to the Doc backend Images Images are added/removed with this PR Guide Content impacting a guide labels Feb 12, 2025
@maycmlee maycmlee changed the base branch from master to may/sds-expansion-cloud-storage February 12, 2025 20:13

@janine-c janine-c left a comment

Hey May, this looks as awesome as part 1! Again, these are all small readability suggestions that you can take or leave. Let me know how I can help you get these out the door, as always 🙂


## Disable Sensitive Data Scanner
{{< callout url="https://www.datadoghq.com/product-preview/data-security/" >}}
Limited Availability Scanning support for Amazon S3 buckets and RDS instances is in Limited Availability. To enroll, click Request Access.

Removing a repetition of "Limited Availability," but up to you if that was there for a reason 🙂

Suggested change
- Limited Availability Scanning support for Amazon S3 buckets and RDS instances is in Limited Availability. To enroll, click Request Access.
+ Scanning support for Amazon S3 buckets and RDS instances is in Limited Availability. To enroll, click Request Access.


To turn off Sensitive Data Scanner entirely, set the toggle to **off** for each Scanning Group so that they are disabled.
Deploy Datadog Agentless scanners in your environment to scan for sensitive information in your cloud storage resources. Agentless scanners are EC2 instances that you control and run within your environment and use [Remote Configuration][1] to retrieve a list of S3 buckets and RDS instances, as well as their dependencies. They scan many types of text files, such as CSVs and JSONs in your S3 buckets and tables in your RDS instances.

Tiny grammatical nitpick here: between "Agentless scanners" and "you," I'm not sure which one applies to the verb "use" - that is, I'm not sure if it's the scanner or the user who uses Remote Configuration based on the wording here. Would it be possible to clarify the actions a bit here?

When an Agentless scanner finds a match with any of the [SDS library rules][2], the rule type and location of the match is sent to Datadog by the scanning instance. **Note**: Cloud storage resources and their files are only read in your environment - no sensitive data that was scanned is sent back to Datadog.

Editing to remove the passive voice:

Suggested change
- When an Agentless scanner finds a match with any of the [SDS library rules][2], the rule type and location of the match is sent to Datadog by the scanning instance. **Note**: Cloud storage resources and their files are only read in your environment - no sensitive data that was scanned is sent back to Datadog.
+ When an Agentless scanner finds a match with any of the [SDS library rules][2], the scanning instance sends the rule type and location of the match to Datadog. **Note**: Cloud storage resources and their files are only read in your environment - no sensitive data that was scanned is sent back to Datadog.
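
For readers of this thread, the behavior this paragraph describes (report the rule type and match location, never the matched data) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Datadog's implementation: the rule names, regular expressions, `Finding` fields, and the `s3://example-bucket/users.csv` path are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration only; the real SDS library rules
# are not shown in this PR.
RULES = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


@dataclass(frozen=True)
class Finding:
    """Metadata about a match: which rule fired and where, not what matched."""
    rule: str
    resource: str
    line: int


def scan_text(resource: str, text: str) -> list[Finding]:
    """Scan text locally and return only rule names and line locations."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                # The matched string itself is deliberately not stored.
                findings.append(Finding(rule_name, resource, line_no))
    return findings


sample = "id,contact\n1,alice@example.com\n2,none\n"
findings = scan_text("s3://example-bucket/users.csv", sample)
```

The point of the sketch is the shape of `Finding`: it has no field that could hold the scanned content, which mirrors the doc's note that no sensitive data is sent back.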


In the Sensitive Data Scanner [Summary page][3], you can see what cloud storage resources have been scanned and any matches found, including the rules that matched it.

Changing "it" to "them" to match the plural "cloud storage resources" and "matches" 🙂

Suggested change
- In the Sensitive Data Scanner [Summary page][3], you can see what cloud storage resources have been scanned and any matches found, including the rules that matched it.
+ In the Sensitive Data Scanner [Summary page][3], you can see what cloud storage resources have been scanned and any matches found, including the rules that matched them.


To use Sensitive Data Scanner in your AWS environments, you need to:

1. Enable Remote Configuration. Remote Configuration allows Datadog to send information to scanners, such as which cloud storage resources should be scanned. See [Enabling Remote Configuration][4] for instructions on how to set up Remote Configuration.

It's a little odd to be in a section called "Enable Remote Configuration" and then have a link to a section that's called "Enabling Remote Configuration" - they're so similar, I can imagine someone getting confused. Maybe consider differentiating the two a bit from each other, so a user doesn't think that they're just going to get linked to the place they already are?

1. Click on the plus icon for the account you want to enable sensitive data scanning.
1. Select that you want to add the scanner using CloudFormation.
1. Select the AWS region in the dropdown menu,
1. Select an API key that is already configured for Remote Configuration. If the API key you select does not have Remote Configuration enabled, Remote Configuration is automatically enabled for that key upon selection.

Higher up on the page, it says "Only admins have permissions to enable Remote Configuration for individual API keys." I'm curious if this would still happen if the user following these steps wasn't an admin. Maybe they wouldn't be able to get this far in the process in the first place?

1. Toggle the **Enable Sensitive Data Scanning** to add the scanner to the account.

Suggested change
- 1. Toggle the **Enable Sensitive Data Scanning** to add the scanner to the account.
+ 1. Toggle **Enable Sensitive Data Scanning** on to add the scanner to the account.


### Manually deploy scanners using Terraform {#manually-deploy-scanners-using-terraform}

You can deploy Agentless scanners using the Terraform Module Datadog Agentless Scanner Module. See [Datadog Agentless Scanner Module][7] for more information. Datadog recommends that you choose one of these two setup options if you manually deploy scanners.

Suggested change
- You can deploy Agentless scanners using the Terraform Module Datadog Agentless Scanner Module. See [Datadog Agentless Scanner Module][7] for more information. Datadog recommends that you choose one of these two setup options if you manually deploy scanners.
+ You can deploy Agentless scanners using the [Terraform Module Datadog Agentless Scanner][7]. Datadog recommends that you choose one of these two setup options if you manually deploy scanners:


I think this is more concise, and doesn't repeat the word "module" as much? But do what is technically right here 🙂


## Add scanning groups

In the [Cloud Storage][6] settings page, the **Scanning Groups** section is read-only. All SDS library rules are applied within the scanning group.

This doesn't seem to match the heading in that it doesn't instruct users how to add scanning groups. It sounds like it's not possible, if the section is read-only? I wonder if the heading above it is necessary, or if this can just stand alone as its own little detail.


## Disable Agentless scanning

1. Navigate to [Sensitive Data Scanner][6] settings page.

Suggested change
- 1. Navigate to [Sensitive Data Scanner][6] settings page.
+ 1. Navigate to the [Sensitive Data Scanner][6] settings page.
