Should db.system be db.system.name? #1581

jack-berg · 2024-11-14T22:39:21Z

Area(s)

area:db

Is your change request related to a problem? Please describe.

As we approach stability, I wanted to throw out a question of whether we should rename db.system to db.system.name in order to preserve the ability to use db.system as a namespace.

I don't see any need to include other db.system.* attributes on metrics / spans. But supposing semantic conventions expand to database servers (e.g. resource conventions to represent a database server), it's possible we might want to include other attributes along the lines of db.system.version.
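To make the concern concrete, here is a minimal sketch of the collision, assuming the rule that a name should not be both a leaf attribute and a namespace for other attributes. The helper `find_namespace_conflicts` and the attribute `db.system.version` are hypothetical and only serve as illustration.

```python
def find_namespace_conflicts(names):
    """Return attribute names that another name also uses as a namespace prefix."""
    name_set = set(names)
    conflicts = set()
    for name in name_set:
        parts = name.split(".")
        # Every proper dotted prefix of a name acts as a namespace.
        for i in range(1, len(parts)):
            prefix = ".".join(parts[:i])
            if prefix in name_set:
                conflicts.add(prefix)
    return sorted(conflicts)


# db.system.version is a hypothetical future attribute, not a defined one.
current = ["db.system", "db.system.version"]
renamed = ["db.system.name", "db.system.version"]

print(find_namespace_conflicts(current))  # ['db.system']
print(find_namespace_conflicts(renamed))  # []
```

With the rename, db.system is only ever a namespace, so further db.system.* attributes could be added later without a conflict.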

Describe the solution you'd like

Consider renaming db.system to db.system.name.

Describe alternatives you've considered

No response

Additional context

No response

lmolkova · 2024-11-19T23:02:52Z

I have an arbitrary (and optional) rule of thumb for attribute names to follow the {domain}.{thing}.{a property of the thing} pattern.

The db.system.name follows this pattern and if we defined it from scratch, I'd prefer to add .name.

maryliag · 2024-11-22T17:36:25Z

Agree that adding .name is a better option

jack-berg · 2024-11-22T18:22:30Z

I can open a PR. Can a maintainer assign it to me?

trask · 2024-11-22T18:26:13Z

@lmolkova and I added this to Monday's semconv meeting to get more feedback, since it would affect several semantic conventions:

db.system
messaging.system
rpc.system
gen_ai.system
feature_flag.system

dyladan · 2024-11-25T16:10:55Z

Discussed (very) briefly with feature flag sig and nobody had an objection

dyladan · 2024-11-25T16:37:15Z

I think this change might be a good way to future-proof for entities. Assuming we will want to use the same dictionary of attributes for entities, we might have things like system version, region, etc. for entities in the future.

trask · 2024-12-07T02:19:39Z

> I have an arbitrary (and optional) rule of thumb for attribute names to follow the {domain}.{thing}.{a property of the thing} pattern.

an interesting exception to this rule appears to be db.namespace and code.namespace (not any problem, just wanted to drop a note about it)

codefromthecrypt · 2025-01-02T00:56:47Z

@jack-berg this db change is being conflated with genai in this PR, was that your intent? #1613

codefromthecrypt · 2025-01-02T01:00:27Z

The main problem with conflating these, as I tried to allude to in comments, is that genai is more like a REST API in practice.

So, provider is like a cloud provider, and usually the cloud provider isn't versioned. It is one or more API groups within it that are versioned, if that happens.

I feel a need to be on the defense that some decision made about databases will be projected out to genai, even if it wouldn't be projected out to normal cloud APIs which usually genai systems are implemented as.

I wish semantically we could decouple these things, so that we can focus more on progress than defense or accident of coupling. an LLM is not a database

lmolkova · 2025-01-03T18:17:57Z

We're trying to stay consistent across domains. One way to achieve it is to follow the same naming patterns. The naming pattern *.system (or *.provider) is problematic since it's not specific enough and would block future evolution.

We should fix it across the repo and it has nothing to do with how similar or different the technologies are. We can pick different names (e.g. rpc.system becomes rpc.protocol.name) but they should still be specific.
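A minimal sketch of what such a rename could mean for attribute data, assuming only the two renames named in this thread (db.system to db.system.name from the issue, rpc.system to rpc.protocol.name from the example above). Neither is final, and `migrate_attributes` and its `keep_old` flag are hypothetical helpers, not part of any OpenTelemetry API.

```python
# Proposed renames mentioned in this thread; none of them is final.
PROPOSED_RENAMES = {
    "db.system": "db.system.name",
    "rpc.system": "rpc.protocol.name",
}


def migrate_attributes(attributes, renames=PROPOSED_RENAMES, keep_old=False):
    """Return a copy of `attributes` with the renamed keys applied.

    If keep_old is True, the old key is emitted alongside the new one,
    which is one way to soften a breaking rename for existing consumers.
    If the new key is already present in `attributes`, whichever entry is
    processed last wins.
    """
    migrated = {}
    for key, value in attributes.items():
        new_key = renames.get(key)
        if new_key is None:
            migrated[key] = value  # not affected by any rename
            continue
        migrated[new_key] = value
        if keep_old:
            migrated[key] = value
    return migrated


span_attributes = {"db.system": "postgresql", "db.namespace": "orders"}
print(migrate_attributes(span_attributes))
# {'db.system.name': 'postgresql', 'db.namespace': 'orders'}
```

Every dashboard, query, or processor that still reads the old key would need the same mapping, which is where the cost of a cross-domain rename shows up.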

jack-berg · 2025-01-03T21:08:46Z

> was that your intent?

When I opened this issue I was making an observation about a single domain, but I think extracting out common modeling rules adds rails which improve our productivity and consistency. Of course, the trick is deciding where it makes sense to have common modeling rules and when a rule is / isn't applicable to a particular situation.

codefromthecrypt · 2025-01-03T22:35:42Z

unsolicited 2p again: concrete examples especially in each domain are useful not just in defining, but also when breaking attributes. Otherwise, the cost to break isn't understood and we contribute to a general sense of perhaps arbitrary instability.

If we can't prove 3 things that require breaking out a name into a struct, probably best to think twice. This applies to all domains, and each one that uses provider has a different context for it. So, I mean real values, not ones invented in one domain in support of a proposed change in a different domain. End users would be best placed to suggest these problems.

In summary, it is easy in abstract to justify a model change when not put to use, but it is better to use real problems reported by users to motivate cross-cutting changes.

End of unsolicited advice, hope some of it is considered while rolling out this change to several domains.

jack-berg · 2025-01-03T23:51:03Z

> If we can't prove 3 things that require breaking out a name into a struct, probably best to think twice.

I think there are competing concerns. Change for change's sake has bad optics and is hard on users. But given that semantic-conventions is fundamentally a taxonomy project, there is some real value in having conceptual purity: things like symmetry across domains and following rules of thumb to ensure that we don't get boxed in in the future. I suspect that in the long term too much slop would compound into a taxonomy that people don't like and want to replace. To me, it's a classic short-term vs. long-term trade-off. Tough to balance.

Ideally, as we get more experience, we extract out the common rules / guidelines (i.e. extending what we already have in places like naming considerations) so that we have a higher likelihood of getting it right the first time (good for users) while also having consistency and maintaining ability to evolve.

codefromthecrypt · 2025-01-04T02:04:18Z

conceptual purity != naming convention purity across different concepts.

Putting things into unrelated buckets to satisfy naming convention purity reduces conceptual purity. It doesn't serve one domain to be forced into unrelated and conflated buckets due to another domain. It is a non-goal imho to trade concepts for naming conventions of unrelated tech.

Let's focus on this when trying to optimize, as really what's good for users is things being coherent and also not drifty.

lmolkova · 2025-01-04T02:24:18Z

The original gen_ai.system was introduced based on db|messaging|rpc.system. We did it because this was a pattern we had for years. Changing this pattern in one place but not the other is a no-go for me.

We should expect more of these (e.g. *.operation.name -> operation.name).
All these conventions are experimental. DB is on the finish line, GenAI is in its early days.

Having patterns and common attributes/guidance across conventions is important and we're going to push for consistency in experimental conventions.

codefromthecrypt · 2025-01-04T02:29:50Z

Yes, we can break db etc because .name or other things we feel are important, but some person in TC should ask themselves "should we" or "when should we" especially as the blast radius of work is far outside this org.

If the TC decides to not consider the blast of work as a part of what's important in conventions, that's a hazard for anyone participating. I'm speaking into the void as I think most people impacted by this personally are not subscribed to this issue. However, a simple string search on github should show that renaming widely used attributes has an impact on people who are not on this issue to see it.

lmolkova · 2025-01-04T02:42:41Z

you should ask @open-telemetry/specs-semconv-maintainers who are responsible for this repo.

jsuereth · 2025-01-06T14:26:59Z

To both @jack-berg and @lmolkova's points here - We are making a long term vs. short term trade-off. Semantic Conventions is a long-term focused project. The value of conventions is that they are stable + consistent. Specifically they need to be stable for a long period of time, and consistent across a large body of domains and vendors.

For this specific issue, @lmolkova and the HTTP/DB semconv folks have been carving that path for us recently, including the "meta" conventions we use. Additional good reading comes from the Systems semconv group, that finally outlined some principles around our ideas of "t-shaped" conventions: #1618

> some person in TC should ask themselves "should we" or "when should we" especially as the blast radius of work is far outside this org.

That is entirely the point of this issue - answering that question. This is something we ask ourselves and debate a LOT within semconv. It's ultimately some of the hardest decision making within OpenTelemetry (that I've been involved with) and one we don't take lightly. But please, don't mistake disagreement for lack of care or thought.

We understand, and have been reinforced in that understanding, the rippling implications for OpenTelemetry of changing semantic conventions. However, you should envision the final state of Semconv more as a 2.0 of instrumentation for OpenTelemetry, as evidenced in the work the Java Agent took for HTTP semconv. We do not want to leave folks relying on de-facto stable semantic conventions in the dust. However, we do need to change and evolve what we have. It's been where a majority of the 'core' effort in semconv has been placed.

Now directly to this issue. DB and LLM semantic conventions are not stable and in fact, are rapidly changing. I think there are two needs here:

1. Instrumentation that folks can rely on and that keeps up to date with the wildly changing landscape.
2. Stable semantic conventions that provide consistent ways to observe and maintain LLM systems.

Semantic conventions focuses on the second. The first SHOULD be provided before coming to semantic conventions. Semantic Conventions doesn't aim to immediately replace these, but slowly provide a stable alternative to where folks are today.

> I feel a need to be on the defense that some decision made about databases will be projected out to genai, even if it wouldn't be projected out to normal cloud APIs which usually genai systems are implemented as.

I'm not sure I appreciate the combative nature about "defense against decisions" here. This is an open community building consensus towards conventions across that community. Of course decisions around those general conventions will impact everyone, this is why we have a "general" semconv meeting and discuss/raise issues there. Even if unable to attend these meetings, the key PRs are called out in notes so folks can follow along and comment. All of these general principles we define, we're working towards PRs to encode them, e.g. #1707. You are more than welcome to participate in these discussions, PRs and offer thoughts.

jack-berg added the enhancement and triage:needs-triage labels Nov 14, 2024

github-actions bot added the area:db label Nov 14, 2024

trask added this to Database Client Semantic Conventions Nov 14, 2024

lmolkova linked a pull request Nov 26, 2024 that will close this issue

[DO NOT MERGE] Rename db|messaging|gen_ai.system to *.provider.name, rpc.system to rpc.protocol.name #1613

Open

3 tasks

dyladan mentioned this issue Nov 26, 2024

Rename feature_flag.system back to feature_flag.provider_name #1614

Merged

3 tasks

SylvainJuge mentioned this issue Dec 2, 2024

Implement code-generation hints to drop/rename attributes in case of a collision #1462

Open
