Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify Azure storage permissions #763

Merged
merged 7 commits into from
Sep 18, 2024
Merged

Clarify Azure storage permissions #763

merged 7 commits into from
Sep 18, 2024

Conversation

kbatuigas
Copy link
Contributor

@kbatuigas kbatuigas commented Sep 12, 2024

Description

Resolves https://github.com/redpanda-data/documentation-private/issues/2472
Resolves https://github.com/redpanda-data/documentation-private/issues/2484
Resolves https://github.com/redpanda-data/documentation-private/issues/2486

  • Specify minimum permissions required for Tiered Storage with Azure
  • Recommend using a custom role with managed identities, per "least privilege," instead of the built-in Storage Blob Data Contributor role

Review deadline: 16 September

Page previews

Use Tiered Storage > Configure object storage (Microsoft ABS/ADLS)
IAM Roles
Object Storage Properties > cloud_storage_api_endpoint

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)

Copy link

netlify bot commented Sep 12, 2024

Deploy Preview for redpanda-docs-preview ready!

Name Link
🔨 Latest commit ec8ddbd
🔍 Latest deploy log https://app.netlify.com/sites/redpanda-docs-preview/deploys/66eb166f4982170008529931
😎 Deploy Preview https://deploy-preview-763--redpanda-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@@ -33,6 +33,7 @@ Optional API endpoint.

- AWS: When blank, this is automatically generated using <<cloud_storage_region,region>> and <<cloud_storage_bucket,bucket>>. Otherwise, this uses the value assigned.
- GCP: Uses `storage.googleapis.com`.
- Azure: If you have enabled hierarchical namespaces for your storage account and use a custom endpoint, use <<cloud_storage_azure_adls_endpoint,`cloud_storage_azure_adls_endpoint`>>.
Copy link
Contributor Author

@kbatuigas kbatuigas Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently in our Tiered Storage doc we have a note that says

For hierarchical namespaces created with a custom endpoint, set cloud_storage_azure_adls_endpoint and cloud_storage_azure_adls_port. If you haven’t configured custom endpoints in Azure, there’s no need to edit these properties.

  • Should both 1) HNS enabled and 2) custom endpoints set up be True in order to use cloud_storage_azure_adls_endpoint and cloud_storage_azure_adls_port? (Is it possible for users to have HNS enabled, but not have custom endpoints for their Azure storage account?)
  • If Azure users do not have HNS enabled, do they then need to configure cloud_storage_api_endpoint (as with AWS/GCP users)?
  • We specify that these endpoint properties are "Optional: No", but at the same time we say "Default: null" (and cloud_storage_api_endpoint is also described as "Optional API endpoint"). This is confusing for readers. Can we clarify what config properties are required to use Azure for tiered storage vs what are optional, and in which scenarios?

Copy link
Contributor

@andijcr andijcr Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it's a bit messy but the jist is that:

redpanda uses an endpoint to connect to Azure Blob Storage (the main one).
Users can enable HNS in their storage account. If so, redpanda will auto-discover it and use an additional endpoint to perform List/Delete operations instead of the main one.

these endpoints are in the form [cloud_storage_azure_storage_account].blob.core.windows.net for main and [cloud_storage_azure_storage_account].dfs.core.windows.net for the additional HNS endpoint.

In addition, users can have custom endpoints (I think, for example, when they deploy in China), and in that case, they can use them by setting cloud_storage_api_endopoint for the main and cloud_storage_azure_adls_endpoint if they have HNS enabled.

Note: cloud_storage_api_endopoint and cloud_storage_azure_adls_endpoint are independent of one another. We don't require that cloud_storage_api_endopoint be set if cloud_storage_azure_adls_endpoint is set. However, in a real-world scenario, if you have HNS enabled and cloud_storage_api_endopoint set, I would expect that cloud_storage_azure_adls_endpoint, too, would need to be set.

If HNS is not enabled, cloud_storage_azure_adls_endpoint is just ignored.

We specify that these endpoint properties are "Optional: No",

this seems wrong, redpanda will fallback to the default [cloud_storage_azure_storage_account].blob.core.windows.net [...] without problems

@kbatuigas kbatuigas marked this pull request as ready for review September 12, 2024 20:53
@kbatuigas kbatuigas requested a review from a team as a code owner September 12, 2024 20:53
@@ -695,7 +695,7 @@ CAUTION: Do not set an object storage property to an empty string `""` or to `nu

- For information about how to grant access from an internet IP range (if you need to open additional routes/ports between your broker nodes and Azure Blob Storage; for example, in a hybrid cloud deployment), see the https://learn.microsoft.com/en-us/azure/storage/common/storage-network-security?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json&tabs=azure-portal#grant-access-from-an-internet-ip-range[Microsoft documentation^].

- For more information about shared key authentication, see the https://learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-shared-key[Microsoft documentation^].
Copy link
Contributor Author

@kbatuigas kbatuigas Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have gotten some feedback that "Shared Key" might be more of an implementation detail and is not all that helpful for our end users. Is it better to remove this entirely? I'm not sure that "account access key" and "shared key" are interchangeable in the context of ABS.

Copy link

@WillemKauf WillemKauf Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Microsoft docs seem to use "access key" as the term for the generated tokens allowing users access to ABS, while "Shared Key" authorization is the process by which they are used- at least, that's how I'm interpreting the docs.

I think the changes you have made here (using Shared Key authenication to refer to the process and account access keys as the term for the keys themselves) are in line with their messaging.

@@ -582,7 +582,7 @@ NOTE: The `serviceAccount` annotations and the `statefulset` Pod labels are esse

CAUTION: Do not set an object storage property to an empty string `""` or to `null` as a way to reset it to its default value.

To configure access to ABS/ADLS with shared keys:
To configure access to ABS/ADLS with account access keys:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to verify: as far as I can tell, Azure account access keys grant full access to storage account data.

If so, then these instructions we provide for AWS S3 shared keys don't have an equivalent for Azure, correct?

To configure access to an S3 bucket with access keys instead of an IAM role:

  1. Grant a user the following permissions to read and create objects on the bucket to be used with the cluster (or on all buckets): GetObject, DeleteObject, PutObject, PutObjectTagging, ListBucket.

Meaning there is no way to modify permissions if using account access keys, and if they want to go by "least privilege," they must use managed identities instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SMEs--please provide this guidance ( @deniscoady @andijcr or @WillemKauf ) Thx.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meaning there is no way to modify permissions if using account access keys, and if they want to go by "least privilege," they must use managed identities instead?

Good catch. yes, the granular permissions are only for azure managed identities

@@ -124,7 +124,28 @@ If you are using Microsoft Azure as your cloud provider, you must satisfy the fo
. https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities?pivots=identity-mi-methods-azp#create-a-user-assigned-managed-identity[Create a user-assigned managed identity]^.
. https://learn.microsoft.com/en-us/azure/storage/common/storage-account-create?tabs=azure-portal#create-a-storage-account-1[Create an Azure storage account]^.
. https://learn.microsoft.com/en-us/azure/storage/blobs/blob-containers-portal#create-a-container[Create a container]^ in the storage account.
. Assign the identity, with the https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/storage#storage-blob-data-contributor[`Storage Blob Data Contributor`]^ role, either during the creation of the storage account, or for an existing storage account. See the official https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/qs-configure-portal-windows-vm#user-assigned-managed-identity[Azure Managed Identities] documentation for more guidance.
. It is recommended to assign a https://learn.microsoft.com/en-us/azure/role-based-access-control/custom-roles[custom role^] to the identity that has the following minimum set of permissions:
Copy link
Contributor

@Feediver1 Feediver1 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is for prereqs--things you must complete before attempting the task. This bullet says "it is recommended..." so not truly a prereq? If a prereq, maybe say "You must..." It also would be helpful to users if you add a line explaining why these custom roles are recommended (or required--however you decide to present)...also, if not an actual requirement, maybe list outside the scope of prereqs, or, identify this prereq as "optional".

@kbatuigas kbatuigas requested a review from voutilad September 18, 2024 13:06
@kbatuigas kbatuigas requested a review from Feediver1 September 18, 2024 13:08
Copy link
Contributor

@Feediver1 Feediver1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor update suggestion

@kbatuigas kbatuigas merged commit c943f90 into main Sep 18, 2024
7 checks passed
@kbatuigas kbatuigas deleted the 2482_ts-azure-config branch September 18, 2024 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants