Propose using a different schema to represent Events in a span #11999

Closed
awangc opened this issue Dec 14, 2024 · 6 comments
Labels
enhancement New feature or request

Comments


awangc commented Dec 14, 2024

Component(s)

exporter/elasticsearch

Is your feature request related to a problem? Please describe.

When Span Events are stored in Elasticsearch, the event name becomes the key under which the event's attributes are stored. For example, given events named "my-event-1" and "my-event-2", Elasticsearch ends up with Events.my-event-1.time, Events.my-event-2.time, etc. This does not seem to follow the OpenTelemetry Collector data format for Span events, which models them as an array of Span_Event, where each Span_Event contains fields such as time, name, and an array of attributes.

The issue I see with this approach is that if name takes arbitrary values (e.g. random UUIDs), the number of keys can grow without bound.
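To make the concern concrete, here is a minimal sketch of a name-keyed encoding. It is not the exporter's actual code: `encode_events_by_name` is a hypothetical helper, and only the `Events.<name>.time` path is taken from the description above.

```python
import uuid


def encode_events_by_name(events):
    """Key each event's fields under its name, e.g. Events.<name>.time.

    Hypothetical illustration only, not the exporter's implementation.
    """
    doc = {}
    for ev in events:
        doc[f"Events.{ev['name']}.time"] = ev["time"]
    return doc


# 100 events whose names are random UUIDs (the worst case described above).
events = [
    {"name": str(uuid.uuid4()), "time": "2024-12-14T00:00:00Z"}
    for _ in range(100)
]
doc = encode_events_by_name(events)

# Every distinct event name contributes its own field path.
print(len(doc))
```

With this layout the number of distinct field paths tracks the number of distinct event names, so unbounded names mean unbounded fields.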

Describe the solution you'd like

Store span events as an array in Elasticsearch, in which each element is an object with the fields time, name, and an array of attributes (with the dropped attribute count as another possible field, as in the Span_Event class).

Admittedly, this format may require nested objects, which may bring their own performance issues, but it more closely resembles the data layout of OpenTelemetry pdata.
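A rough sketch of the proposed array layout, for comparison. The field names (`Events`, `name`, `time`, `attributes`, `key`, `value`, `dropped_attributes_count`) and both helpers are assumptions chosen for illustration, not an agreed schema.

```python
def encode_events_as_array(events):
    """Encode events as an array of objects with a fixed set of field names.

    Hypothetical layout: attributes become key/value pairs so that attribute
    names, like event names, are stored as values rather than as field names.
    """
    return {
        "Events": [
            {
                "name": ev["name"],
                "time": ev["time"],
                "attributes": [
                    {"key": k, "value": v}
                    for k, v in sorted(ev.get("attributes", {}).items())
                ],
                "dropped_attributes_count": ev.get("dropped_attributes_count", 0),
            }
            for ev in events
        ]
    }


def field_paths(value, prefix=""):
    """Collect distinct dotted field paths; list elements share the parent's path."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            paths |= field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            paths |= field_paths(child, prefix)
    else:
        paths.add(prefix)
    return paths


events = [
    {"name": "my-event-1", "time": "2024-12-14T00:00:00Z", "attributes": {"a": "1"}},
    {"name": "my-event-2", "time": "2024-12-14T00:00:01Z", "attributes": {"b": "2"}},
]
print(sorted(field_paths(encode_events_as_array(events))))
```

In this sketch the set of field paths stays the same no matter how many distinct event names appear, at the cost of needing array (possibly nested) handling on the Elasticsearch side.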

Describe alternatives you've considered

No response

Additional context

The schema proposed above would follow the same format as spans: we have Span.Name and Span.Attributes, and we would have Event.Name and Event.Attributes. It also more closely represents the Event as defined in OpenTelemetry.

@awangc awangc added the enhancement New feature or request label Dec 14, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

Contributor

atoulme commented Dec 31, 2024

I believe this needs to be addressed in https://github.com/open-telemetry/opentelemetry-proto

@atoulme atoulme transferred this issue from open-telemetry/opentelemetry-collector-contrib Dec 31, 2024
Author

awangc commented Jan 6, 2025

@atoulme This issue is about the way elasticsearchexporter chooses to encode Events: it uses the Event name as the key when storing Span Events in the Default mapping mode (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/elasticsearchexporter/internal/objmodel/objmodel.go#L202). This can cause a field mapping explosion in Elasticsearch if Event names grow indiscriminately (e.g. random UUIDs). I do not think this is pertinent to opentelemetry-proto?

Member

dmathieu commented Jan 6, 2025

Should this issue be in the core repo? It's about an exporter that's hosted in contrib.

Author

awangc commented Jan 6, 2025

@dmathieu Oh, you're right, this issue should be in the contrib repository. My mistake. Let me close this one and open it over there. Thanks!

@awangc awangc closed this as completed Jan 6, 2025
Member

dmathieu commented Jan 6, 2025

For the record, the new issue: open-telemetry/opentelemetry-collector-contrib#37028
