-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify Azure storage permissions #763
Conversation
✅ Deploy Preview for redpanda-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
@@ -33,6 +33,7 @@ Optional API endpoint. | |||
|
|||
- AWS: When blank, this is automatically generated using <<cloud_storage_region,region>> and <<cloud_storage_bucket,bucket>>. Otherwise, this uses the value assigned. | |||
- GCP: Uses `storage.googleapis.com`. | |||
- Azure: If you have enabled hierarchical namespaces for your storage account and use a custom endpoint, use <<cloud_storage_azure_adls_endpoint,`cloud_storage_azure_adls_endpoint`>>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently in our Tiered Storage doc we have a note that says
For hierarchical namespaces created with a custom endpoint, set cloud_storage_azure_adls_endpoint and cloud_storage_azure_adls_port. If you haven’t configured custom endpoints in Azure, there’s no need to edit these properties.
- Should both 1) HNS enabled and 2) custom endpoints set up be True in order to use
cloud_storage_azure_adls_endpoint
andcloud_storage_azure_adls_port
? (Is it possible for users to have HNS enabled, but not have custom endpoints for their Azure storage account?) - If Azure users do not have HNS enabled, do they then need to configure
cloud_storage_api_endpoint
(as with AWS/GCP users)? - We specify that these endpoint properties are "Optional: No", but at the same time we say "Default: null" (and
cloud_storage_api_endpoint
is also described as "Optional API endpoint"). This is confusing for readers. Can we clarify what config properties are required to use Azure for tiered storage vs what are optional, and in which scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it's a bit messy but the jist is that:
redpanda uses an endpoint to connect to Azure Blob Storage (the main one).
Users can enable HNS in their storage account. If so, redpanda will auto-discover it and use an additional endpoint to perform List/Delete operations instead of the main one.
these endpoints are in the form [cloud_storage_azure_storage_account].blob.core.windows.net
for main and [cloud_storage_azure_storage_account].dfs.core.windows.net
for the additional HNS endpoint.
In addition, users can have custom endpoints (I think, for example, when they deploy in China), and in that case, they can use them by setting cloud_storage_api_endopoint
for the main and cloud_storage_azure_adls_endpoint
if they have HNS enabled.
Note: cloud_storage_api_endopoint
and cloud_storage_azure_adls_endpoint
are independent of one another. We don't require that cloud_storage_api_endopoint
be set if cloud_storage_azure_adls_endpoint
is set. However, in a real-world scenario, if you have HNS enabled and cloud_storage_api_endopoint
set, I would expect that cloud_storage_azure_adls_endpoint
, too, would need to be set.
If HNS is not enabled, cloud_storage_azure_adls_endpoint
is just ignored.
We specify that these endpoint properties are "Optional: No",
this seems wrong, redpanda will fallback to the default [cloud_storage_azure_storage_account].blob.core.windows.net
[...] without problems
@@ -695,7 +695,7 @@ CAUTION: Do not set an object storage property to an empty string `""` or to `nu | |||
|
|||
- For information about how to grant access from an internet IP range (if you need to open additional routes/ports between your broker nodes and Azure Blob Storage; for example, in a hybrid cloud deployment), see the https://learn.microsoft.com/en-us/azure/storage/common/storage-network-security?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json&tabs=azure-portal#grant-access-from-an-internet-ip-range[Microsoft documentation^]. | |||
|
|||
- For more information about shared key authentication, see the https://learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-shared-key[Microsoft documentation^]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have gotten some feedback that "Shared Key" might be more of an implementation detail and is not all that helpful for our end users. Is it better to remove this entirely? I'm not sure that "account access key" and "shared key" are interchangeable in the context of ABS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Microsoft docs seem to use "access key" as the term for the generated tokens allowing users access to ABS, while "Shared Key" authorization is the process by which they are used- at least, that's how I'm interpreting the docs.
I think the changes you have made here (using Shared Key authenication to refer to the process and account access keys as the term for the keys themselves) are in line with their messaging.
@@ -582,7 +582,7 @@ NOTE: The `serviceAccount` annotations and the `statefulset` Pod labels are esse | |||
|
|||
CAUTION: Do not set an object storage property to an empty string `""` or to `null` as a way to reset it to its default value. | |||
|
|||
To configure access to ABS/ADLS with shared keys: | |||
To configure access to ABS/ADLS with account access keys: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to verify: as far as I can tell, Azure account access keys grant full access to storage account data.
If so, then these instructions we provide for AWS S3 shared keys don't have an equivalent for Azure, correct?
To configure access to an S3 bucket with access keys instead of an IAM role:
- Grant a user the following permissions to read and create objects on the bucket to be used with the cluster (or on all buckets): GetObject, DeleteObject, PutObject, PutObjectTagging, ListBucket.
Meaning there is no way to modify permissions if using account access keys, and if they want to go by "least privilege," they must use managed identities instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SMEs--please provide this guidance ( @deniscoady @andijcr or @WillemKauf ) Thx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meaning there is no way to modify permissions if using account access keys, and if they want to go by "least privilege," they must use managed identities instead?
Good catch. yes, the granular permissions are only for azure managed identities
@@ -124,7 +124,28 @@ If you are using Microsoft Azure as your cloud provider, you must satisfy the fo | |||
. https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities?pivots=identity-mi-methods-azp#create-a-user-assigned-managed-identity[Create a user-assigned managed identity]^. | |||
. https://learn.microsoft.com/en-us/azure/storage/common/storage-account-create?tabs=azure-portal#create-a-storage-account-1[Create an Azure storage account]^. | |||
. https://learn.microsoft.com/en-us/azure/storage/blobs/blob-containers-portal#create-a-container[Create a container]^ in the storage account. | |||
. Assign the identity, with the https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/storage#storage-blob-data-contributor[`Storage Blob Data Contributor`]^ role, either during the creation of the storage account, or for an existing storage account. See the official https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/qs-configure-portal-windows-vm#user-assigned-managed-identity[Azure Managed Identities] documentation for more guidance. | |||
. It is recommended to assign a https://learn.microsoft.com/en-us/azure/role-based-access-control/custom-roles[custom role^] to the identity that has the following minimum set of permissions: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is for prereqs--things you must complete before attempting the task. This bullet says "it is recommended..." so not truly a prereq? If a prereq, maybe say "You must..." It also would be helpful to users if you add a line explaining why these custom roles are recommended (or required--however you decide to present)...also, if not an actual requirement, maybe list outside the scope of prereqs, or, identify this prereq as "optional".
modules/reference/pages/properties/object-storage-properties.adoc
Outdated
Show resolved
Hide resolved
Co-authored-by: Joyce Fee <[email protected]>
modules/reference/pages/properties/object-storage-properties.adoc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor update suggestion
Co-authored-by: Joyce Fee <[email protected]>
Description
Resolves https://github.com/redpanda-data/documentation-private/issues/2472
Resolves https://github.com/redpanda-data/documentation-private/issues/2484
Resolves https://github.com/redpanda-data/documentation-private/issues/2486
Review deadline: 16 September
Page previews
Use Tiered Storage > Configure object storage (Microsoft ABS/ADLS)
IAM Roles
Object Storage Properties > cloud_storage_api_endpoint
Checks