Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sdk-common: add ComponentRegistry #375

Closed

Conversation

iRevive
Copy link
Contributor

@iRevive iRevive commented Nov 16, 2023

Reference Link
Spec https://opentelemetry.io/docs/specs/otel/trace/api/#get-a-tracer
Java implementation ComponentRegistry.java

The spec says:

Note

Tracers are identified by name, version, and schema_url fields. When more than one Tracer of the same name, version, and schema_url is created, it is unspecified whether or under which conditions the same or different Tracer instances are returned. It is a user error to create Tracers with different attributes but the same identity.

Java SDK uses ComponentRegistry to prevent duplicates.

Here is an example of how ComponentRegistry could be used in the SDK module:
https://github.com/typelevel/otel4s/pull/325/files#diff-96194a2d20a59c6f3717b447b20305201f52f19802d11b677580d86839d69d4bR40.

@iRevive iRevive added the module:sdk Features and improvements to the sdk module label Nov 16, 2023
@iRevive iRevive force-pushed the sdk-trace/components-registry branch 3 times, most recently from 4d28519 to dcf0fd1 Compare November 16, 2023 17:55
@iRevive iRevive force-pushed the sdk-trace/components-registry branch from dcf0fd1 to befcb77 Compare November 16, 2023 17:57
@armanbilge
Copy link
Member

it is unspecified whether or under which conditions the same or different Tracer instances are returned

Since it's unspecified, what are the pros/cons of of caching and returning the same instances?

schemaUrl: Option[String],
attributes: Attributes
): F[A] =
cache.evalModify { cache =>
Copy link
Member

@armanbilge armanbilge Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is a bit more defensive than the Java implementation from what I can make out.

Specifically, you are guaranteeing that buildComponent(...) will never be run twice for the same scope. However, the Java implementation doesn't actually seem to be guaranteeing that.

https://github.com/open-telemetry/opentelemetry-java/blob/9ac678e81bce7c12820acdb846d22d7957b8b15f/sdk/common/src/main/java/io/opentelemetry/sdk/internal/ComponentRegistry.java#L115-L121

Copy link
Contributor

@NthPortal NthPortal Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you're mistaken. it took a bit of digging, but it seems that ConcurrentHashMap#computeIfAbsent is guaranteed to execute the function exactly once if the key is absent, and exactly zero times if already present. this is specific to ConcurrentHashMap and not true of all ConcurrentMap implementations (e.g. ConcurrentSkipListMap)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha you're right, thanks for taking a close look. I missed that the component is being added to two collections, and the lock is specifically for the allComponents collection.

@iRevive
Copy link
Contributor Author

iRevive commented Nov 16, 2023

Since it's unspecified, what are the pros/cons of of caching and returning the same instances?

It all depends on the use cases.

Let's say you have several independent components within the library and each of them can be allocated independently.
If they all use the same tracer (name, version, schemaUrl), it makes sense to cache the instance.

I assume, in Java world, especially with DI and reflective allocations, the caching may be useful.
In the Scala world, I'm not sure, to be honest.

@armanbilge
Copy link
Member

it makes sense to cache the instance.

Yes, but why: what exactly is the optimization? Are we just saving memory, or are we saving some other (scarce) resource by sharing?

Copy link
Contributor

@NthPortal NthPortal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests could maybe be filled out a tiny bit more, but LGTM

schemaUrl: Option[String],
attributes: Attributes
): F[A] =
cache.evalModify { cache =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps this is nitpicky, but I feel like it would be clearer without the name shadowing. perhaps map instead of cache for the function parameter?

cache.evalModify { cache =>
val key = Key(name, version, schemaUrl)

cache.get(key) match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some part of me is going "but could this be implemented using updatedWith?" but idk if it makes the code any cleaner or is worth the effort

v1 <- registry.get(name, None, None, Attributes.Empty)
v2 <- registry.get(name, None, None, attributes)
v3 <- registry.get(name, Some(version), None, attributes)
v4 <- registry.get(name, Some(version), Some(schemaUrl), attributes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a case in each of the three tests for name and schema only, as well as a fourth test for name and schema only

schemaUrl: Option[String],
attributes: Attributes
): F[A] =
cache.evalModify { cache =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after thinking about it, it might be worth having a call to cache.get outside of the evalModify so that we can opportunistically avoid blocking if the cache is already populated

@iRevive
Copy link
Contributor Author

iRevive commented Nov 21, 2023

what exactly is the optimization? Are we just saving memory, or are we saving some other (scarce) resource by sharing?

I still don't have an answer, to be honest. We also preserve the uniqueness of the instruments (meters, tracers). Perhaps if we allow multiple meters with the same name/version/schemaUrl, it may lead affect the exported values.

So, since we don't know the component's purpose (at least yet). Should we hold off on the PR?

@armanbilge
Copy link
Member

So, since we don't know the component's purpose (at least yet). Should we hold off on the PR?

I don't have a strong opinion. On the one hand, the Java SDK did it, so that's a precedent that it might be useful. On the other hand, this is adding an (almost certainly inconsequential) overhead due to synchronization and keeping stuff in memory indefinitely ...

@iRevive
Copy link
Contributor Author

iRevive commented Nov 21, 2023

Let's move forward without component registry. It can be easily added later.

@iRevive
Copy link
Contributor Author

iRevive commented Nov 29, 2023

Closing for now.

@iRevive iRevive closed this Nov 29, 2023
@iRevive iRevive deleted the sdk-trace/components-registry branch November 29, 2024 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:sdk Features and improvements to the sdk module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants