add document defining an OpenTelemetry Collector #4313

Open, wants to merge 32 commits into base: main


codeboten
Contributor

Changes

Adds a definition of an OpenTelemetry Collector


- An OpenTelemetry Collector _MUST_ accept an OpenTelemetry Collector Config file.
- An OpenTelemetry Collector _MUST_ be able to be compiled with any and all
additional Collector plugins that the user wishes to include.

Member

Should this specification define what an OpenTelemetry Collector plugin is? Is it any component of type receiver, processor, exporter, extension, or config map provider?

Member

Specification needs to define all the terms used in this definition, otherwise it does not remove ambiguity.

Contributor Author

Renamed plugin to component and added a section to define OpenTelemetry Collector component.

Comment on lines 24 to 25
For a library to be considered an OpenTelemetry Collector component, it _MUST_
implement the [Component interface](https://github.com/open-telemetry/opentelemetry-collector/blob/main/component/component.go)

Member

The Collector also accepts confmap.Providers and confmap.Converters, which do not accept this interface. Do we consider those out of scope?

Member

I think they do need to be considered in scope. Interoperability of those components is important.

Contributor Author

> The Collector also accepts confmap.Providers and confmap.Converters, which do not accept this interface. Do we consider those out of scope?

I wonder if including them would allow us to avoid having to include a definition for a config file, wdyt?

Member

Providers handle configuration abstractly, so they help remove the need to specify how configuration should be represented, but they don't solve the schema part (which I don't feel like we need to solve tbh).

Comment on lines +9 to +12
The goal of this document is for users to be able to easily switch between
OpenTelemetry Collector Distros while also ensuring that components produced by
the OpenTelemetry Collector SIG are able to work with any vendor who claims
support for an OpenTelemetry Collector.

Member

I'm not sure I understand this goal. If a vendor produces a collector distribution that has a subset of available components because those are the components relevant to their service offerings and that they're willing to support, where do any other components (whether hosted in an OTel repo or not) fit into that picture? Do we mean that a distribution must offer end users the ability to modify its source and create their own build? We should be explicit about that if that is the case.

Given that the licensing of the collector's source code does not require that distribution of derivative works happen in source form I'm not sure that we have much ability here to enforce such a requirement. We can certainly try to use the "OpenTelemetry" mark as a cudgel, but I'm not sure it'll be as effective as may be desirable since the terms "collector" and "distribution" are very broad. It could perhaps be argued that "OpenTelemetry Collector" is a protectable mark and maybe even that "Collector" has acquired secondary meaning in this limited scope, but protecting such a mark against genericization is going to be a Sisyphean task.

Member

I see this definition as separate from the term Distribution defined below. A Distribution is a specific compiled OpenTelemetry Collector with a specific set of OpenTelemetry Collector Components that the maintainer (the user in this case) decided to add. It is an OpenTelemetry Collector because the maintainer was able to bring their chosen OpenTelemetry Collector components to it.

Something is not an OpenTelemetry Collector if it cannot support OpenTelemetry Collector Components. Maybe the word "additional" below is unnecessary and could be removed?

Member

> Given that the licensing of the collector's source code does not require that distribution of derivative works happen in source form I'm not sure that we have much ability here to enforce such a requirement

We potentially have leverage over:

  • Trademark usage if "OpenTelemetry Collector" becomes a trademark
  • What we list on our registry and website and what we promote
  • What wording can be used in 'official' OTel events

I think we have enough leverage here to make this worth it.

Member

I'm not sure that having a registered "OpenTelemetry Collector" mark is sufficient here as nominative fair use would allow anyone preparing a distribution (in the colloquial sense, not Distribution however we seek to define it) to identify it as such. The Linux Foundation trademark usage guidelines also call out specifically this sort of usage as acceptable for indicating products are related to or based on the project that produces the product bearing their marks.

Obviously the project can control what it puts on its website and what marketing collateral is used in conjunction with events operated by LF/CNCF, but that doesn't seem like effective leverage over an actor who has no need or interest in such things.

Member

I would generally say if someone isn't interested in "playing nice" then it doesn't really matter what we say or what we don't say. The solution to enforceable marks is offering certification and conformance suites that are attached to actual trademarks (e.g., "OTLP Inside" or whatever). This document is guidance for the community as much as it is guidance for external parties.

Member

What @austinlparker said, along with my thinking that it's important to define this, and include this requirement, to make clear why opentelemetry.io would or wouldn't list project Y as a Collector or Distribution.

to: collector/README.md
--->

# OpenTelemetry Collector

Member

OpenTelemetry Collector is never defined. Is it a source code artifact? A binary?

Member

That's one of the things I was trying to get at here. Since there's no binary plugin mechanism it seems that the source would need to be available for it to be extended in the manner contemplated, but that's not clear or explicit in the current state.

Member

> Since there's no binary plugin mechanism it seems that the source would need to be available for it to be extended in the manner contemplated, but that's not clear or explicit in the current state.

Is the lack of binary plugin mechanism something that the OpenTelemetry Collector SIG wants to solve? Are there technical blockers?

Binary and dynamically loaded plugins seem to be an established pattern. For example:

Member

The Fluent Bit example is not necessarily apposite as it involves a C application dynamically loading shared libraries built with the Go toolchain using CGO (which is generally prohibited in the Collector codebase).

Go does have a native plugin mechanism, though it comes with many caveats and is widely regarded as a bad idea that can't be dropped due to compatibility guarantees. Its documentation sums up its litany of restrictions in this way, which sounds a lot like a suggestion to use something like ocb:

> Together, these restrictions mean that, in practice, the application and its plugins must all be built together by a single person or component of a system. In that case, it may be simpler for that person or component to generate Go source files that blank-import the desired set of plugins and then compile a static executable in the usual way.
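
The "blank-import, then compile a static executable" approach described in that quote can be sketched roughly as follows. This is an illustration only: a real build would place each component in its own package and pull it in through a generated file of blank imports, while this single-file version simulates that with `init` functions. The `Factory` type, the `Register` function, and the module path in the comment are all hypothetical and are not the collector's or ocb's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// Factory is a hypothetical constructor that describes a component.
type Factory func() string

// registry holds every component compiled into this binary.
var registry = map[string]Factory{}

// Register adds a factory under a type name and panics on duplicates,
// so a conflicting build fails loudly at startup.
func Register(typ string, f Factory) {
	if _, exists := registry[typ]; exists {
		panic("duplicate component type: " + typ)
	}
	registry[typ] = f
}

// In a multi-package build, each init below would live in its own package,
// and a generated file would include lines such as:
//
//	import _ "example.com/hypothetical/otlpreceiver"
//
// The blank import runs the package's init, which registers the component.
func init() { Register("otlp", func() string { return "otlp receiver" }) }
func init() { Register("debug", func() string { return "debug exporter" }) }

// RegisteredTypes returns the compiled-in component types in sorted order.
func RegisteredTypes() []string {
	types := make([]string, 0, len(registry))
	for typ := range registry {
		types = append(types, typ)
	}
	sort.Strings(types)
	return types
}

func main() {
	for _, typ := range RegisteredTypes() {
		fmt.Printf("%s: %s\n", typ, registry[typ]())
	}
}
```

The set of components is fixed when the binary is compiled, which is why adding a component means rebuilding rather than loading something at runtime.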

@jaronoff97 left a comment

Thank you Alex 🙇

Contributor

@tedsuo left a comment

Thanks Alex, this definition really captures the heart of it. A+

Signed-off-by: Alex Boten <[email protected]>
@carlosalberto
Contributor

Overall LGTM, but I suggest there's a (minor) clarification around this comment:

> Do we mean that a distribution must offer end users the ability to modify its source and create their own build? We should be explicit about that if that is the case.

As the bottom part mentions that, for vendor distros, this is a SHOULD instead.

Signed-off-by: Alex Boten <[email protected]>
@codeboten
Contributor Author

> Overall LGTM, but I suggest there's a (minor) clarification around this comment:
>
> > Do we mean that a distribution must offer end users the ability to modify its source and create their own build? We should be explicit about that if that is the case.
>
> As the bottom part mentions that, for vendor distros, this is a SHOULD instead.

Updated SHOULD to MUST, PTAL.

Comment on lines +60 to +62
of an OpenTelemetry Collector with a specific set of components and features. A
Distribution author _MUST_ provide users with tools and/or documentation for adding
their own components to the Distribution's components. Note that the resulting

Member

This seems problematic to me as a MUST. This is, in effect, a requirement that distributions be made available in source form with a license that permits modification (and presumably distribution, though that's not clear). The license under which the Collector is released does not require this and I'm suspicious of the ability to use trademark protections to prevent someone from using the phrasing "Foo Distribution for OpenTelemetry Collector" given that's literally the first "Correct" example in the Linux Foundation trademark guidelines and is a textbook case of nominative fair use.

I think if we want an identifier for compatible distributions that can be effectively controlled we will need a distinctive mark for a compatibility certification that can be granted to distributions that satisfy its requirements, similar to what @tedsuo seems to be describing here.

Beyond those concerns, this requirement also seems excessively vague. What qualifies as "tools and/or documentation"? Is a link to https://go.dev/dl/ sufficient? This probably requires a definition similar to "Corresponding Source" from AGPL-3, which again reinforces the limitations that come from not having this be part of the license under which the collector source code is made available.

Member

The spirit here is to allow users to reuse their components when they move from one distribution to another: the engineering investments made should not be lost. If the distribution is open source and there's clear documentation how to add a new component to it, that's good enough for me. If the distribution is not open source but allows me to enter the Go module name on a web interface somewhere and get a binary out, that's also fine.

I'd see that binary as "tainted" (to use the kernel terminology) and the final binary might not be officially supported (with SLAs) by a service provider, but as an end-user, I'm not locked in.

Member

This is a specification, so I don't think it's appropriate to leave ambiguity here and rely on interpretation of the "spirit" of the requirement. This was changed from SHOULD to MUST in response to a comment seeking clarification that we intended to require distributions to allow users to modify their source and build new, modified, binaries.

> If the distribution is not open source but allows me to enter the Go module name on a web interface somewhere and get a binary out, that's also fine.

I would not expect, and do not think it reasonable to expect, that any vendor offering a closed-source, binary-only distribution will allow users to provide arbitrary code to be built into a new "tainted" binary by that vendor. Doing so would allow for a user to cause a vendor to distribute binaries built from code licensed under terms the vendor has no opportunity to review and which may require, for instance, that any code it is compiled with be distributed under the same terms.

Member

> This is a specification, so I don't think it's appropriate to leave ambiguity here and rely on interpretation of the "spirit" of the requirement

I agree, my comment was more to provide the background, hoping that it would trigger ideas for a new wording.

Member

This is defining what OpenTelemetry considers a distribution. To be considered a distribution by the project, I'd think there is free rein over restrictions. That is different from a trademark, which would mean the project could actually stop someone else from saying, "This is an OTel Collector Distribution". So this wouldn't offer that protection, but would instead define for others what the project will itself consider and call a distribution.

I'd still support rephrasing this to not require docs/tooling if the distribution works with OTel docs and tooling. That may mean "requiring" the documentation of all components within a distribution (otherwise how else would a user define an equivalent ocb configuration).

github-actions bot commented Jan 1, 2025

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 1, 2025
@jpkrohling jpkrohling removed the Stale label Jan 2, 2025
[Collector components](#opentelemetry-collector-components) that
the user wishes to include.

## OpenTelemetry Collector configuration file

Member

Is this section really required when it is already defined above that it must accept an OpenTelemetry Collector configuration?

Besides the redundancy, it makes this a living document that someone would have to remember to update whenever a new top-level key is added to the collector configuration file, just to keep the "minimum structure" accurate.

implement a [Component interface](https://pkg.go.dev/go.opentelemetry.io/collector/component#Component)
defined by the OpenTelemetry Collector SIG.

Components require a unique identifier as a `type` string to be included in an OpenTelemetry

Member

Similar concern as above when you could just say it must implement the Component interface. And is this paragraph referring to the type ID in Component which says:

The component ID (combination type + name) is unique for a given component.Kind.

So multiple components can use the same identifier if they are of different Kinds?
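
One reading of the quoted sentence is that uniqueness is scoped by Kind, so the same identifier may be reused across Kinds. This matches common collector configurations in which `otlp` names both a receiver and an exporter. The sketch below illustrates that reading with hypothetical `Kind`, `ID`, and `Registry` types declared locally. They only mirror the idea and are not the collector's own types.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for the category of a component.
type Kind string

const (
	KindReceiver Kind = "receiver"
	KindExporter Kind = "exporter"
)

// ID combines a component type with an optional name, written "type/name".
type ID struct {
	Type string
	Name string
}

// String renders the ID as "type" or "type/name".
func (id ID) String() string {
	if id.Name == "" {
		return id.Type
	}
	return id.Type + "/" + id.Name
}

// scopedID is the uniqueness key: the same ID may appear once per Kind.
type scopedID struct {
	kind Kind
	id   ID
}

// Registry rejects an ID only if it was already used within the same Kind.
type Registry struct {
	seen map[scopedID]struct{}
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{seen: make(map[scopedID]struct{})}
}

// Add records the ID for the given Kind, or returns an error on a duplicate.
func (r *Registry) Add(kind Kind, id ID) error {
	key := scopedID{kind: kind, id: id}
	if _, dup := r.seen[key]; dup {
		return fmt.Errorf("duplicate %s ID %q", kind, id.String())
	}
	r.seen[key] = struct{}{}
	return nil
}

func main() {
	r := NewRegistry()
	fmt.Println(r.Add(KindReceiver, ID{Type: "otlp"}))            // accepted
	fmt.Println(r.Add(KindExporter, ID{Type: "otlp"}))            // accepted: different Kind
	fmt.Println(r.Add(KindReceiver, ID{Type: "otlp", Name: "2"})) // accepted: different name
	fmt.Println(r.Add(KindReceiver, ID{Type: "otlp"}))            // rejected: same Kind and ID
}
```

If the specification instead intends `type` strings to be globally unique, the key would drop `kind`, which is the ambiguity the question above points at.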

@carlosalberto
Contributor

Trying to make progress, I want to share some of what I have (informally) discussed with some people regarding the last contentious point:

> This seems problematic to me as a MUST. This is, in effect, a requirement that distributions be made available in source form with a license that permits modification (and presumably distribution, though that's not clear). The license under which the Collector is released does not require this and I'm suspicious of the ability to use trademark protections to prevent someone from using the phrasing "Foo Distribution for OpenTelemetry Collector"

A SHOULD instead of a MUST would make this statement kind of worthless, as we really want distros to support this. It is not about enforcing anything, legally speaking, as we can’t legally enforce any part of the Specification today. This is more about conveying the OTEL vision, hopefully motivating vendors to implement this, given custom component support is important to achieve vendor neutrality.



This way we can tell “what is considered a distro and what is not”, when we get users looking for help.

Please raise your voice, as otherwise we should merge this soon, given it has been open for a while and it already has more than enough approvals (we can always tune things in follow ups, mind you).

@austinlparker
Member

> Trying to make progress, I want to share some of what I have (informally) discussed with some people regarding the last contentious point:
>
> > This seems problematic to me as a MUST. This is, in effect, a requirement that distributions be made available in source form with a license that permits modification (and presumably distribution, though that's not clear). The license under which the Collector is released does not require this and I'm suspicious of the ability to use trademark protections to prevent someone from using the phrasing "Foo Distribution for OpenTelemetry Collector"
>
> A SHOULD instead of a MUST would make this statement kind of worthless, as we really want distros to support this. It is not about enforcing anything, legally speaking, as we can’t legally enforce any part of the Specification today. This is more about conveying the OTEL vision, hopefully motivating vendors to implement this, given custom component support is important to achieve vendor neutrality.
>
> This way we can tell “what is considered a distro and what is not”, when we get users looking for help.
>
> Please raise your voice, as otherwise we should merge this soon, given it has been open for a while and it already has more than enough approvals (we can always tune things in follow ups, mind you).

I continue to support this as a MUST. As an example of why it's a valuable inclusion, I have been building an application to allow for local management of OpenTelemetry Collectors in my spare time. Since I can rely on the component manifest to be present through the CLI, it could easily be used to manage a distro of the Collector as well. You can see a screenshot of what it looks like below:

(Screenshot of the application, taken 2025-01-22; image not included.)

Being able to support an ecosystem around our software is truly one of the most important long-term bets we can make on the health of OpenTelemetry, thus it is crucial that we stick to our guns about things like this. If people want to do something else, they can do something else. They can call it a distribution and wave it around, we just won't include it on the website. We can't control what other people do, but we can control what we do.

@reyang
Member

reyang commented Jan 22, 2025

> Trying to make progress, I want to share some of what I have (informally) discussed with some people regarding the last contentious point:
>
> > This seems problematic to me as a MUST. This is, in effect, a requirement that distributions be made available in source form with a license that permits modification (and presumably distribution, though that's not clear). The license under which the Collector is released does not require this and I'm suspicious of the ability to use trademark protections to prevent someone from using the phrasing "Foo Distribution for OpenTelemetry Collector"
>
> A SHOULD instead of a MUST would make this statement kind of worthless, as we really want distros to support this. It is not about enforcing anything, legally speaking, as we can’t legally enforce any part of the Specification today. This is more about conveying the OTEL vision, hopefully motivating vendors to implement this, given custom component support is important to achieve vendor neutrality.
>
> This way we can tell “what is considered a distro and what is not”, when we get users looking for help.
>
> Please raise your voice, as otherwise we should merge this soon, given it has been open for a while and it already has more than enough approvals (we can always tune things in follow ups, mind you).

I feel we might be heading in the wrong direction. This is what I'm seeing:

  1. We don't have a mechanism to allow binary plugins, we can only support plugins in "source code + recompile the whole thing".
  2. We want users to be able to add their own plugins.
  3. We want vendors to be able to provide plugins. Some vendors don't want to make their source publicly available (and that's fine, this is what Apache 2 License is about).

If we can solve 1) (support binary plugins), then we don't have problem 2 and 3. Which I think is the right direction to go.

From the conversation here, it looks like if we don't try to solve 1), we are forced to pick a battle between 2 and 3. I doubt this approach will work out.

@austinlparker
Member

> If we can solve 1) (support binary plugins), then we don't have problem 2 and 3. Which I think is the right direction to go.
>
> From the conversation here, it looks like if we don't try to solve 1), we are forced to pick a battle between 2 and 3. I doubt this approach will work out.

If at some point in the future we solve point 1 then we can re-evaluate this decision. However, given the current state of things, I feel like this proposal clarifies what's important to us.

@reyang
Member

reyang commented Jan 22, 2025

> > If we can solve 1) (support binary plugins), then we don't have problem 2 and 3. Which I think is the right direction to go.
> >
> > From the conversation here, it looks like if we don't try to solve 1), we are forced to pick a battle between 2 and 3. I doubt this approach will work out.
>
> If at some point in the future we solve point 1 then we can re-evaluate this decision. However, given the current state of things, I feel like this proposal clarifies what's important to us.

Agree, I feel it is important to clarify in the doc regarding "what's the current state" and "where we want to be", so the readers won't be confused. And we should also make sure there is room for things to be improved/fixed/evolved without breaking changes and big surprises.
