Add ability to provide the agent-id in enroll API #4226

blakerouse · 2024-12-17T20:15:00Z

Describe the enhancement:

Add the ability to specify the agent-id of the enrolling Elastic Agent.

Describe a specific use case for the enhancement or feature:

On serverless, an Elastic Agent is static, but the pod has no persistent storage, so the Elastic Agent cannot keep its enrollment information across restarts. Customers have reported the same problem in other setups: their integrations need no persistent storage, and requiring it only to hold the enrollment information is not possible for them.

To provide a stable Elastic Agent in the agents list in Kibana, this would allow an Elastic Agent to enroll with the ID it wants to have. It would also replace an existing Elastic Agent that already has the same ID.

The newly enrolled Elastic Agent will replace the previous Elastic Agent and prevent it from communicating with the Fleet Server any more.

Describe any security issues:

This does open the possibility that a bad actor who has the enrollment token and the ID of an Elastic Agent could enroll over the top of it and cut off that Elastic Agent's communication, because the other Elastic Agent would become the newly communicating Elastic Agent.

To prevent this, an additional replace-token would be added to the enrollment API. This would be any unique value, stored as a bcrypt hash on the Elastic Agent record. If an Elastic Agent is enrolled without this token, no other Elastic Agent is allowed to enroll with the same ID (trying to do so would return an error). If an Elastic Agent is enrolled with the replace token and it's the first enrollment, it would enroll successfully. On a later enrollment to replace the Elastic Agent, the exact same replace token must be provided; if it matches the stored bcrypt hash, the enrollment is treated as a replacement of the Elastic Agent and allowed to complete.
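
The rules above can be sketched as a small decision function. This is only an illustration, not Fleet Server code: the names `AgentRecord`, `EnrollError`, `enroll`, and the `generation` counter are hypothetical, the registry is an in-memory dict, and PBKDF2 from the Python standard library stands in for bcrypt so that the sketch runs without third-party packages.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Dict, Optional


class EnrollError(Exception):
    """Raised when an enrollment that reuses an ID is rejected (hypothetical)."""


@dataclass
class AgentRecord:
    agent_id: str
    # Salt and hash of the replace token; both None when enrolled without one.
    token_salt: Optional[bytes] = None
    token_hash: Optional[bytes] = None
    # Counts how often this ID was (re-)enrolled; illustrative only.
    generation: int = 1


def _hash_token(token: str, salt: bytes) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA256 is used here as a stand-in for bcrypt.
    return hashlib.pbkdf2_hmac("sha256", token.encode("utf-8"), salt, 100_000)


def enroll(agents: Dict[str, AgentRecord], agent_id: str,
           replace_token: Optional[str] = None) -> AgentRecord:
    existing = agents.get(agent_id)
    if existing is None:
        # First enrollment with this ID always succeeds.
        record = AgentRecord(agent_id)
        if replace_token is not None:
            record.token_salt = os.urandom(16)
            record.token_hash = _hash_token(replace_token, record.token_salt)
        agents[agent_id] = record
        return record

    if existing.token_salt is None or existing.token_hash is None:
        # Enrolled without a replace token: the ID can never be taken over.
        raise EnrollError("agent ID is already enrolled without a replace token")
    if replace_token is None:
        raise EnrollError("a replace token is required to replace this agent")

    candidate = _hash_token(replace_token, existing.token_salt)
    # Constant-time comparison of the stored hash and the candidate hash.
    if not hmac.compare_digest(candidate, existing.token_hash):
        raise EnrollError("replace token does not match")

    # Token matches: the new agent replaces the old one under the same ID.
    record = AgentRecord(agent_id, existing.token_salt, existing.token_hash,
                         existing.generation + 1)
    agents[agent_id] = record
    return record
```

In this sketch, a second `enroll(agents, "pod-0", "same-token")` call replaces the first record, while a call with a different token, or any call against an ID that was first enrolled without a token, raises `EnrollError`.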

michel-laterman · 2024-12-19T22:50:14Z

I think we already provide this through the enrollment_id in the API:

fleet-server/model/openapi.yml, lines 150 to 155 in 0687ea5:

    enrollment_id:
      type: string
      description: |
        The enrollment ID of the agent.
        To replace an agent on enroll fail.
        The existing agent with a matching enrollment_id will be deleted if it never checked in. The new agent will be enrolled with the enrollment_id.

It was added with #2655

blakerouse · 2024-12-20T15:09:32Z

@michel-laterman "The existing agent with a matching enrollment_id will be deleted if it never checked in." What if it has checked in?

michel-laterman · 2025-01-07T21:54:52Z

@blakerouse and I had a brief conversation about this.

We've decided to add an ID field to enrollment requests that is distinct from the existing enrollment_id value.
If this field is used and indicates an existing agent, that agent's current policy & existing API keys are used by the "new agent".
If the agent does not exist, it's treated as a new enrollment.
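
A rough sketch of that behaviour, under stated assumptions: `Agent`, `enroll_with_id`, and the in-memory `agents` dict are hypothetical names for illustration, and the generated API key is a random placeholder rather than a real Elasticsearch API key.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Agent:
    agent_id: str
    policy_id: str
    api_key: str


def enroll_with_id(agents: Dict[str, Agent], requested_policy_id: str,
                   agent_id: Optional[str] = None) -> Tuple[Agent, bool]:
    """Return (agent, reused); reused is True when an existing agent was found."""
    if agent_id is not None and agent_id in agents:
        # Existing agent: the "new agent" takes over its current policy
        # and existing API key; the requested policy is not applied.
        return agents[agent_id], True

    # Unknown or missing ID: treated as a new enrollment.
    new_id = agent_id or str(uuid.uuid4())
    # Placeholder key for illustration only.
    agent = Agent(new_id, requested_policy_id, api_key=uuid.uuid4().hex)
    agents[new_id] = agent
    return agent, False
```

Enrolling twice with the same `agent_id` returns the same record both times, so a restarted pod would show up as the same entry rather than a new one.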

This way we neither break nor get blocked on the existing scale tests when delivering this feature. As a follow-up, we can see whether the scale tests can switch to the new ID value, and then deprecate enrollment_id (cc @juliaElastic).

I've also looked a bit more into opamp to see how it handles duplicate IDs. In short, this type of workflow (where we may have more than one pod that are "the same agent") isn't supported.
This is fairly clear from the implications of the duplicate websockets connection section.

When sending a message, an agent can specify its own instance_uid value or request one from the server.
The server can also force agents to use a new instance_uid value at any time.

Additionally AgentToServer messages are expected to be sequential (indicated by sequence_num) as a mechanism for detecting missed messages.

Supporting this workflow is something we'll need to handle once we start supporting opamp.
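
To illustrate why two pods sharing one identity clash with sequential messages, here is a minimal sketch of gap detection on sequence_num. It assumes each message's sequence_num is expected to be exactly one more than the previous one for the same instance_uid; `find_sequence_gaps` and the tuple layout are hypothetical and not part of any opamp library.

```python
from typing import Dict, Iterable, List, Tuple


def find_sequence_gaps(
    messages: Iterable[Tuple[str, int]],
) -> List[Tuple[str, int, int]]:
    """Return (instance_uid, expected, received) for every out-of-order message.

    `messages` yields (instance_uid, sequence_num) pairs in arrival order.
    """
    last_seen: Dict[str, int] = {}
    gaps: List[Tuple[str, int, int]] = []
    for instance_uid, sequence_num in messages:
        previous = last_seen.get(instance_uid)
        # The first message for an instance_uid sets the baseline.
        if previous is not None and sequence_num != previous + 1:
            gaps.append((instance_uid, previous + 1, sequence_num))
        last_seen[instance_uid] = sequence_num
    return gaps
```

A single agent sending 1, 2, 3 produces no gaps. Two pods that share an instance_uid and each count from 1 interleave their numbers, so the server sees repeated or out-of-order values and would treat them as missed messages.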

jlind23 · 2025-01-08T06:37:14Z

@nimarezainia do you think this is something we could piggyback on in order to migrate an Agent from one cluster to another using the same ID in the enroll command?

blakerouse · 2025-01-08T19:49:38Z

@jlind23 After our discussion of the security implications, I have added a section to the description about the additional agent-token API option for enrollment. Hopefully this implementation would alleviate those implications.

@elastic/product-security Could you give the security implications a review?

blakerouse added the Team:Elastic-Agent-Control-Plane label Dec 17, 2024

blakerouse self-assigned this Dec 17, 2024

blakerouse mentioned this issue Dec 17, 2024

Add ability to define the specific agent-id during enroll elastic/elastic-agent#6361

Open

kpollich assigned kpollich and unassigned kpollich Jan 7, 2025

blakerouse linked a pull request Jan 8, 2025 that will close this issue

Add ability to enroll with a specific ID #4290

Open

7 tasks
