[Enhancement] Enhance CloudWatch Agent Startup on Windows AMIs via Dependency Updates #444

Merged
okankoAMZ merged 2 commits into main on Dec 16, 2024


okankoAMZ
Contributor

@okankoAMZ okankoAMZ commented Dec 10, 2024

Description of the Issue

The CloudWatch Agent service on Windows fails to start automatically on certain custom AMIs created using Windows sysprep. This issue occurs due to a race condition between the agent's startup and the initialization of network drivers during system boot.

Description of Changes

This PR modifies the CloudWatch Agent's service configuration to improve its startup reliability:

  1. Removed the dependency on the "LanmanServer" service.
  2. Added new service dependencies: "Tcpip" and "Dnscache".
  3. Implemented more detailed failure action settings in the service configuration.
  4. Added Start='install' to the ServiceControl element.

Rationale:

  • Removed "LanmanServer" dependency: Eliminates a non-critical wait during startup, reducing potential delays.
  • Added "Tcpip" and "Dnscache" dependencies: Windows now starts the TCP/IP driver and the DNS Client service before the CloudWatch Agent, so the network stack is up when the agent starts. This addresses the race condition with network drivers.
  • Failure action settings: Provides specific instructions for Windows to handle potential startup failures, improving service recovery.
  • Start='install' addition: Tells the installer to start the service during installation rather than only registering it.
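
For illustration, the dependency and ServiceControl changes described above might look roughly like the WiX fragment embedded in the sketch below. The fragment is hypothetical and is not copied from this PR's diff: the element Ids, the service name "AmazonCloudWatchAgent", and every attribute value other than the two dependency names and Start="install" are assumptions. The failure action settings are left out. The Python around it only parses the fragment and reports what it declares.

```python
import xml.etree.ElementTree as ET

# WiX v3 namespace; the fragment below is a hypothetical illustration,
# not the actual installer source from this PR.
WIX_NS = "http://schemas.microsoft.com/wix/2006/wi"

FRAGMENT = f"""
<Component xmlns="{WIX_NS}" Id="AgentService" Guid="*">
  <ServiceInstall Id="AgentServiceInstall" Name="AmazonCloudWatchAgent"
                  Type="ownProcess" Start="auto" ErrorControl="normal">
    <!-- New dependencies; "LanmanServer" is no longer listed. -->
    <ServiceDependency Id="Tcpip" />
    <ServiceDependency Id="Dnscache" />
  </ServiceInstall>
  <ServiceControl Id="AgentServiceControl" Name="AmazonCloudWatchAgent"
                  Start="install" Stop="both" Remove="uninstall" />
</Component>
"""

NS = {"wix": WIX_NS}


def service_dependencies(xml_text):
    """Return the Ids of all ServiceDependency children of ServiceInstall."""
    root = ET.fromstring(xml_text)
    deps = root.findall("wix:ServiceInstall/wix:ServiceDependency", NS)
    return [dep.get("Id") for dep in deps]


def started_on_install(xml_text):
    """Return True if ServiceControl asks the installer to start the service."""
    root = ET.fromstring(xml_text)
    control = root.find("wix:ServiceControl", NS)
    return control is not None and control.get("Start") == "install"


if __name__ == "__main__":
    print(service_dependencies(FRAGMENT))
    print(started_on_install(FRAGMENT))
```

A check like this could run against the real installer source in CI to catch an accidental reintroduction of the "LanmanServer" dependency, though the PR itself does not describe such a check.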

License

By submitting this pull request, I confirm that you are permitted to use, modify, copy, and redistribute this contribution under the terms of your choice.

Tests

Testing was performed on a Windows Server 2016 instance using the latest AMI in us-east-1:

  • Installed MSI and used the default configuration for the .json file.
  • Verified that when pushing the JSON config via parameter, the agent service started immediately after the AmazonCloudWatch-ManageAgent document completed.
  • Confirmed that CloudWatch metrics were successfully streamed from the Agent to the service.
  • Created an image from the working instance and launched it.
  • Launched an initial instance and 10 additional instances for testing.
  • Used SSM Run Command to execute the following PowerShell script on all instances:
    Get-Service -Name "Amazon CloudWatch*"
  • All executions returned success, confirming that the CloudWatch Agent was running on instances launched from the AMI.
    [Screenshot 2024-12-11 at 3:42:29 PM]
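
The Run Command check above could be scripted along these lines. This is a minimal sketch, not the procedure actually used for this PR: it assumes the stock AWS-RunPowerShellScript document, and the helper names and the shape of the status mapping are hypothetical. The first helper only builds the request arguments, so it can be inspected without AWS credentials.

```python
def build_run_command_request(instance_ids):
    """Build keyword arguments for the SSM SendCommand call.

    Assumes the AWS-RunPowerShellScript document, which takes the
    PowerShell lines to execute in its "commands" parameter.
    """
    return {
        "InstanceIds": list(instance_ids),
        "DocumentName": "AWS-RunPowerShellScript",
        "Parameters": {"commands": ['Get-Service -Name "Amazon CloudWatch*"']},
    }


def not_running(status_by_instance):
    """Return the instance IDs whose reported service status is not Running.

    status_by_instance is a hypothetical mapping of instance ID to the
    Status value printed by Get-Service (for example "Running" or "Stopped").
    """
    return [
        instance_id
        for instance_id, status in status_by_instance.items()
        if status != "Running"
    ]
```

With boto3 installed and credentials configured, the request could be sent with `boto3.client("ssm").send_command(**build_run_command_request(ids))`. An empty list from `not_running` would correspond to the "all executions returned success" result reported above.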

@okankoAMZ okankoAMZ marked this pull request as ready for review December 11, 2024 20:43
@okankoAMZ okankoAMZ requested a review from a team as a code owner December 11, 2024 20:43
@okankoAMZ okankoAMZ changed the title from "[BugFix] Update Windows Service Dependencies" to "[Enhancement] Enhance CloudWatch Agent Startup on Windows AMIs via Dependency Updates" Dec 11, 2024
Contributor

@chadpatel chadpatel left a comment


Has this been tested with the impacted customer?

@okankoAMZ okankoAMZ merged commit 762da41 into main Dec 16, 2024
4 checks passed
dricross added a commit that referenced this pull request Jan 3, 2025
4 participants