release-24.3: ui, apiutil, server: surface multiple indexes within a range in the hot ranges page #134988

angles-n-daemons · 2024-11-12T18:26:37Z

Contained with the three linked PRs are the changes required to surface all of the contents within a range in the hot ranges DB console page.

Backport:

1/1 commits from "sql: using catalog reader, allow descriptor scanning by span" (sql: using catalog reader, allow descriptor scanning by span #133190)
1/1 commits from "apiutil, roachpb: create utilities to map descriptors to ranges" (apiutil, roachpb: create utilities to map descriptors to ranges #133840)
2/2 commits from "ui, server: modify hot ranges api and table to use new contents approx." (ui, server: modify hot ranges api and table to use new contents approx. #134106)

Please see individual PRs for details.

/cc @cockroachdb/release

Fixes: #130997

Release justification: fixes an important bug within the db console

Historically, to estimate which tables and indexes existed within the span of a range, we used the Collection's `GetAllDescriptors` function and returned the values which fell in between the start and end keys of the range's span. This approach had the benefit of being precise, but the drawback of being computationally expensive - since the system can have hundreds of nodes and theoretically millions of tables, using `GetAllDescriptors` begins to become a bottleneck. This approach was modified so that instead of pulling all descriptors for the system when computing range contents, the system instead used the start key of the range to identify the exact table + index at that point within the keyspace ([change](https://github.com/cockroachdb/cockroach/pull/77277/files)). This works if each table, index has their own range, but quickly breaks down if tables share a range. Consider the following layout of data: ``` T1=Table 1 T2=Table 2 R1=Range1 └─────T1─────┴─────T2────┴─────T3─────┘ └────────┴─────────R1───────┴─────────┘ ``` Since the start key of the range falls with Table 1, the system associates the range with only Table 1, despite it containing Tables 2 and 3. Using this information, it becomes necessary to identify a set of descriptors within a certain span. This PR introduces the `ScanDescriptorsInSpans` function which does just that, allows the user to specify a set of spans whose descriptors are important and then return a catalog including those descriptors. It does this by translating the span keys into descriptor span keys and scanning them from the descriptors table. For example given a span `[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for the Users table is `5`, it will generate a descriptor span `[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`. This descriptor too comes with its drawbacks in that within the descriptor space, keys are scoped by table, and not necessarily indexes. That means in a following PR, the status server will be responsible for taking these descriptors, which include all indexes in the tables pulled, and filtering it down to only the indexes which appear in the specified range. The bulk of the changeset is in updating the datadriven tests to test this behavior, the primary area of focus for review should be the `pkg/sql/catalog/internal/catkv/catalog_reader.go` file (~75 LOC). Epic: CRDB-43151 Fixes: cockroachdb#130997 Release note: None

Previously each range correlated to a single table, or even a single index in a database, so all that was required to identify which tables, indexes were in the range were to look at the start key of the range and map it accordingly. With range coalescing however, it's possible for one, or many, tables, indexes and the like to reside within the same range. To properly identify the contents of a range, this PR adds the following utilities: 1. A utility function which turns a range into a span, and clamps it to its tenant's table space. 2. A utility function which takes the above spans and uses the catalog and new descriptor by span utility to turn those spans into a set of table descriptors ordered by id. 3. A utility function which transforms those table descriptors into a set of (database, table, index) names which deduplicate and identify each index uniquely. 4. A utility function, which merges the ranges and indexes into a map keyed by RangeID whose values are the above index names. 5. A primary entrypoint for consumers from which a set of ranges can be passed in and a mapping from those ranges to indexes can be returned. A variety of caveats come with this approach. It attempts to scan the desciptors all at once, but it still will scan a sizable portion of the descriptors table if the request is large enough. This makes no attempt to describe system information which does not have a descriptor. It will describe system tables which appear in the descriptors table, but it will not try to explain "tables" which do not have descriptors (example tsdb), or any other information stored in the keyspace without a descriptor (PseudoTableIDs, GossipKeys for example). Throughout this work, many existing utilities were duplicated, and then un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you see anything that seems to already exist, feel free to point it out accordingly. Epic: CRDB-43151 Fixes: cockroachdb#130997 Release note: None

Previously each range correlated to a single table, or even a single index in a database, so all that was required to identify which tables, indexes were in the range were to look at the start key of the range and map it accordingly. With range coalescing however, it's possible for one, or many, tables, indexes and the like to reside within the same range. To properly identify the contents of a range, this PR adds the following utilities: 1. A utility function which turns a range into a span, and clamps it to its tenant's table space. 2. A utility function which takes the above spans and uses the catalog and new descriptor by span utility to turn those spans into a set of table descriptors ordered by id. 3. A utility function which transforms those table descriptors into a set of (database, table, index) names which deduplicate and identify each index uniquely. 4. A utility function, which merges the ranges and indexes into a map keyed by RangeID whose values are the above index names. 5. A primary entrypoint for consumers from which a set of ranges can be passed in and a mapping from those ranges to indexes can be returned. A variety of cavets come with this approach. It attempts to scan the desciptors all at once, but it still will scan a sizable portion of the descriptors table if the request is large enough. This makes no attempt to describe system information which does not have a descriptor. It will describe system tables which appear in the descriptors table, but it will not try to explain "tables" which do not have descriptors (example tsdb), or any other information stored in the keyspace without a descriptor (PseudoTableIDs, GossipKeys for example). Throughout this work, many existing utilities were duplicated, and then un-duplicated (`keys.TableDataMin`, `roachpb.Span.Overlap`, etc). If you see anything that seems to already exist, feel free to point it out accordingly. Epic: none Fixes: cockroachdb#130997 Release note: None

This change is the last in a set of commits to change the hot ranges page from only showing one table, index per range to many. It builds on top of the changes in the catalog reader (10b9ee0) and the range utilities (109219d) to surface a set of tables, indexes for each range. The primary changes in this commit specifically are the modification of the status server to use the new `rangeutil` utilities, and changing the wire, presentation format of the information. Epic: CRDB-43151 Fixes: cockroachdb#130997 Release note (bug fix): changes the table, index contents of the hot ranges page in DB console.

blathers-crl · 2024-11-12T18:26:42Z

blathers-crl · 2024-11-12T18:26:44Z

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.}

cockroach-teamcity · 2024-11-12T18:26:52Z

This change is

dhartunian · 2024-11-15T16:54:19Z

@angles-n-daemons can you explain why the same commit appears twice in this PR? It looks like the UI PR contained a slightly different version of a commit that was already merged. That's a bit confusing here.

angles-n-daemons · 2024-11-19T14:47:33Z

I'm not entirely sure. My original thought is that this could have been caused by a rebase onto a moving head. If PR 3 branched off of 2 where a different commit hash existed due to 2's own amending, rebasing, it's possible that PR 3 believed there were multiple commits for 2 which it should include.

That being said, I tried to recreate this locally and had little luck. It could also have been a merge instead of a rebase, but it's tricky to tell. I notice some code that was removed at one point is added as one of the commits. It doesn't affect any functionality, but it is strange.

angles-n-daemons · 2024-12-02T16:58:19Z

I was unable to reproduce the duplicate commit locally. Looking into the individual commits themselves, the second one seems to add in some code which was removed in an amend. It could be a strange rebasing issue where some information was removed, which created changed the commit references.

dhartunian · 2024-12-04T18:27:21Z

pkg/server/serverpb/status.proto

    string database_name = 5;
-    // index_name indicates the index name for current range.
+    // index_name has been deprecated in favor of indexes = 17;


is there a reason that only one field has deprecated = true?

No this was an accident. I'll make the corresponding change to master.

angles-n-daemons added 4 commits November 12, 2024 13:23

angles-n-daemons requested review from a team as code owners November 12, 2024 18:26

angles-n-daemons requested review from kyle-a-wong and removed request for a team November 12, 2024 18:26

blathers-crl bot added the backport Label PR's that are backports to older release branches label Nov 12, 2024

angles-n-daemons requested a review from dhartunian November 12, 2024 18:27

rafiss removed the request for review from a team December 2, 2024 16:06

dhartunian approved these changes Dec 4, 2024

View reviewed changes

angles-n-daemons merged commit 480af61 into cockroachdb:release-24.3 Dec 6, 2024
20 of 21 checks passed

angles-n-daemons deleted the backport24.3-133190-133840-134106 branch December 6, 2024 20:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

release-24.3: ui, apiutil, server: surface multiple indexes within a range in the hot ranges page #134988

release-24.3: ui, apiutil, server: surface multiple indexes within a range in the hot ranges page #134988

angles-n-daemons commented Nov 12, 2024

blathers-crl bot commented Nov 12, 2024 •

edited by angles-n-daemons

Loading

blathers-crl bot commented Nov 12, 2024

cockroach-teamcity commented Nov 12, 2024

dhartunian commented Nov 15, 2024

angles-n-daemons commented Nov 19, 2024

angles-n-daemons commented Dec 2, 2024

dhartunian Dec 4, 2024

angles-n-daemons Dec 6, 2024

release-24.3: ui, apiutil, server: surface multiple indexes within a range in the hot ranges page #134988

release-24.3: ui, apiutil, server: surface multiple indexes within a range in the hot ranges page #134988

Conversation

angles-n-daemons commented Nov 12, 2024

blathers-crl bot commented Nov 12, 2024 • edited by angles-n-daemons Loading

blathers-crl bot commented Nov 12, 2024

cockroach-teamcity commented Nov 12, 2024

dhartunian commented Nov 15, 2024

angles-n-daemons commented Nov 19, 2024

angles-n-daemons commented Dec 2, 2024

dhartunian Dec 4, 2024

Choose a reason for hiding this comment

angles-n-daemons Dec 6, 2024

Choose a reason for hiding this comment

blathers-crl bot commented Nov 12, 2024 •

edited by angles-n-daemons

Loading