Alarm api_unauthorized for HeadBucket/Object from SSM agent #6141
Index: terraform/gitlab/gitlab.tf.json.template.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/terraform/gitlab/gitlab.tf.json.template.py b/terraform/gitlab/gitlab.tf.json.template.py
--- a/terraform/gitlab/gitlab.tf.json.template.py (revision f1a3d58efe03021f754f89a1f8f03484574e3aaf)
+++ b/terraform/gitlab/gitlab.tf.json.template.py (date 1712694300553)
@@ -345,7 +345,10 @@
'edu-ucsc-gi-azul-*',
'*.azul.data.humancellatlas.org',
]
- )
+ ) + [
+ f'amazon-ssm-packages-{aws.region_name}',
+ f'aws-ssm-document-attachments-{aws.region_name}'
+ ]
)
},
@@ -949,7 +952,9 @@
's3:HeadObject'
],
'resources': [
+ f'arn:aws:s3:::amazon-ssm-packages-{aws.region_name}',
f'arn:aws:s3:::amazon-ssm-packages-{aws.region_name}/*',
+ f'arn:aws:s3:::aws-ssm-document-attachments-{aws.region_name}',
f'arn:aws:s3:::aws-ssm-document-attachments-{aws.region_name}/*'
]
} |
Title changed from "api_unauthorized for HeadBucket/Object from AssumedRole azul-gitlab" to "api_unauthorized for HeadBucket/Object from SSM agent"
|
https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent-minimum-s3-permissions.html Assignee to follow up this PR with another one that lists all the buckets documented at the link above. |
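The second hunk of the patch above lists each bucket twice, once as a bare bucket ARN and once with a `/*` suffix, so that both bucket-level and object-level requests match the policy. A minimal sketch of generating such pairs is shown below. The helper name `s3_arns` and the region `us-east-1` are assumptions made for illustration, and only the two bucket prefixes from the patch are used; the remaining buckets on the linked AWS page are not reproduced here.

```python
def s3_arns(bucket_prefixes, region):
    """Return a bucket-level and an object-level ARN for each bucket prefix.

    Hypothetical helper, not part of the Azul code base.
    """
    arns = []
    for prefix in bucket_prefixes:
        bucket = f'arn:aws:s3:::{prefix}-{region}'
        arns.append(bucket)         # matches bucket-level requests such as HeadBucket
        arns.append(bucket + '/*')  # matches object-level requests such as HeadObject
    return arns


# Only the two prefixes that appear in the patch above; the region is an assumption.
prefixes = ['amazon-ssm-packages', 'aws-ssm-document-attachments']
resources = s3_arns(prefixes, 'us-east-1')
```

Any bucket whose name does not follow the `<prefix>-<region>` pattern would have to be added to the list separately.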
For demo, show absence of matching log events for one week after this lands in a main deployment. |
Originally posted on another issue #6134 (comment): From the spike experiments for #6134 it appears that AccessDenied requests occur in two cases. First, when a newly created instance starts up for the first time (two AccessDenied requests). Second, when a new version of the SSM agent becomes available and the agent installs it automatically: the agent checks for new versions twice daily, uninstalling the old version incurs two AccessDenied requests, and installing the new version incurs another two. We typically observe one SSM agent update per week, so we expect four false AccessDenied alarms per week. If new versions are released more frequently, we could observe up to eight false AccessDenied alarms (2 * (2 + 2), i.e. two updates in one week). Assignee to try s3:* in the IAM policy. |
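The arithmetic in the comment above can be written out as a short sketch. The constant and function names are hypothetical; the per-event counts are the ones reported from the spike experiments.

```python
# Counts reported in the comment above.
UNINSTALL_DENIALS = 2  # uninstalling the old agent version
INSTALL_DENIALS = 2    # installing the new agent version


def expected_false_alarms(updates_per_week: int) -> int:
    """Expected false AccessDenied alarms per week caused by agent updates."""
    return updates_per_week * (UNINSTALL_DENIALS + INSTALL_DENIALS)


typical = expected_false_alarms(1)     # one agent update per week
worst_case = expected_false_alarms(2)  # two agent updates in one week
```

With one update per week this gives four alarms, and with two updates it gives eight, matching the figures above. The two requests at first startup of a new instance are not included.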
Originally posted on another issue #6134 (comment): Assignee to draft an AWS support request on Google Docs. |
AWS Support case has been created. |
Assignee to monitor AWS Support ticket and follow up if necessary. |
AWS Support responded. They said that this is a known issue and that there is nothing we can do to prevent it. They also said that they have urged the service team responsible to look into it and that they will keep us posted with the details. |
Followed up with some questions, awaiting response. |
Note that this was already closed with a fix in stable, but the fix was ineffective so we went back to AWS Support. Spike to continue to monitor the AWS Support ticket. |
@hannes-ucsc: "AWS responded stating that they are working on a fix, but can't release an ETA for it. Assignee to continue to monitor the support ticket." |
AWS responded,
still waiting for an upstream resolution. |
Assignee to periodically check with AWS Support. |
AWS replied:
|
Assignee to verify Amazon's fix in two weeks' time. |
After AWS released the fix, we haven't observed any of the 'weekly' SSM |
AWS Support ticket has been marked as resolved. |
@hannes-ucsc: "It turns out that rebooting GitLab did trigger more AccessDenied trail events. We are using a version that is several months old (3.3.987) while the newest version is 3.3.1345.0. First, we need to install the latest version and reboot several times with that version installed. Before that though, we need to reboot with the old version in order to ensure that we have a reliable reproduction. We should also investigate why the version currently in use is so old. We may need to hard-code the version in the Terraform config." |
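When checking whether the installed agent is behind the newest release, note that the two version strings quoted above do not order correctly as plain strings, because "987" sorts after "1345" character by character. A minimal sketch that compares them numerically is shown below; `parse_version` is a hypothetical helper, and it assumes purely numeric, dot-separated version strings.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split('.'))


installed = parse_version('3.3.987')   # version found on the instance
latest = parse_version('3.3.1345.0')   # newest version at the time of the comment

# Tuples compare element by element, so 987 < 1345 decides the result.
is_outdated = installed < latest
```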
A recent upgrades PR updated the instance AMI to a more recent version. Booting from that AMI confirmed that the latest version of the amazon-ssm-agent package still isn't installed (the instance still uses 3.3.987), which means that the package manager used by the instance isn't updating at least amazon-ssm-agent to the latest available version. We associate this outdated version with the false positive alarms.
Index: terraform/gitlab/gitlab.tf.json.template.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/terraform/gitlab/gitlab.tf.json.template.py b/terraform/gitlab/gitlab.tf.json.template.py
--- a/terraform/gitlab/gitlab.tf.json.template.py (revision e39203ce85b62e52483c49716357ffb70e962120)
+++ b/terraform/gitlab/gitlab.tf.json.template.py (date 1734720734323)
@@ -1609,7 +1609,8 @@
'docker',
'amazon-cloudwatch-agent',
'amazon-ecr-credential-helper',
- 'dracut-fips'
+ 'dracut-fips',
+ ['amazon-ssm-agent', '3.3.1345.0'],
],
'ssh_authorized_keys': [] if config.deployment.is_stable else operator_keys,
'bootcmd': [
|
Assignee to try specifying the fully qualified URL in cloud-config user data, as suggested by https://docs.aws.amazon.com/systems-manager/latest/userguide/agent-install-al2.html#quick-install-al2 |
The following patch installed amazon-ssm-agent v3.3.1345.0 (the latest version) …
diff --git a/terraform/gitlab/gitlab.tf.json.template.py b/terraform/gitlab/gitlab.tf.json.template.py
index e9fabf394..d10ddda9d 100644
--- a/terraform/gitlab/gitlab.tf.json.template.py
+++ b/terraform/gitlab/gitlab.tf.json.template.py
@@ -1609,7 +1609,8 @@ emit_tf({} if config.terraform_component != 'gitlab' else {
'docker',
'amazon-cloudwatch-agent',
'amazon-ecr-credential-helper',
- 'dracut-fips'
+ 'dracut-fips',
+ 'https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm'
],
'ssh_authorized_keys': [] if config.deployment.is_stable else operator_keys,
'bootcmd': [
… which didn't prevent the false positive AccessDenied alarms caused by the SSM agent. |
@hannes-ucsc: "Assignee to open new support ticket, referring to the old one, providing the above evidence that the issue persists." |