'SHOW TABLES' and similar commands result in an error when run from the in-console SQL Shell/SQL CLI #139512

Open
cvansicklecrdb opened this issue Jan 21, 2025 · 2 comments · May be fixed by #139532
Labels
branch-release-24.2: Used to mark GA and release blockers, technical advisories, and bugs for 24.2
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-support: Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs
P-1: Issues/test failures with a fix SLA of 1 month
T-sql-foundations: SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

cvansicklecrdb commented Jan 21, 2025

Describe the problem

We have received several reports of users being unable to run SHOW TABLES; (and similar commands) in the in-console SQL shell, across several clusters. We have observed the same behavior in the SQL CLI. Testing so far has shown that the in-situ shell is not affected.

Example error output:

show tables from search_prod;

ERROR: executing stmt 1: run-query-via-api: read-roles: batch timestamp 1736961131.075117398,0 must be after replica GC threshold 1736964071.150720255,0 (r8: /Table/{4-5})
SQLSTATE: XXUUU

Again, this doesn’t repro in the in-situ shell AFAICT.

To Reproduce
There seem to be environmental conditions that must be in place for the query to fail. Those are not yet known, but a live reproduction is available internally on request.

On an affected cluster, however:

  1. Log in to the Admin Console
  2. Go to the SQL Shell
  3. Run SHOW DATABASES;
  4. Find a database and run SHOW TABLES FROM <database>;
  5. See the error

  Note: Even on affected clusters, this won't always fail. Waiting at least 5 minutes usually causes it to start failing.

Additional data / screenshots
The error indicates that the batch timestamp is earlier than the replica GC threshold, and given the timestamps provided that is true. The anomaly is that, in our testing, the batch timestamp was also about 2 days older than the current time. For example, in one test we ran the statement on the 18th, but the batch timestamp was from the 16th and the replica GC threshold was from the 18th.
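As a rough check of that arithmetic, the sketch below decodes the two timestamps from the example error output above. It assumes each timestamp has the form `<unix seconds>.<nanoseconds>,<logical>`; the helper names (`parse_hlc`, `decode_error`, `to_utc`) are made up for illustration and are not part of CockroachDB.

```python
import re
from datetime import datetime, timezone

# Error text copied from the example output in this issue.
ERROR = (
    "batch timestamp 1736961131.075117398,0 must be after "
    "replica GC threshold 1736964071.150720255,0 (r8: /Table/{4-5})"
)


def parse_hlc(text):
    """Split '<seconds>.<nanos>,<logical>' into (seconds, nanos).

    Assumption: the part before the comma is wall time and the part
    after it is a logical counter, which is ignored here.
    """
    wall, _logical = text.split(",")
    secs, nanos = wall.split(".")
    return int(secs), int(nanos)


def decode_error(message):
    """Return (batch, gc_threshold) as (seconds, nanos) tuples."""
    pattern = r"batch timestamp (\S+) must be after replica GC threshold (\S+)"
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("message does not look like a GC threshold error")
    return parse_hlc(match.group(1)), parse_hlc(match.group(2))


def to_utc(hlc):
    """Convert the whole-second part of a timestamp to a UTC datetime."""
    return datetime.fromtimestamp(hlc[0], tz=timezone.utc)


batch, gc = decode_error(ERROR)
lag_seconds = gc[0] - batch[0]  # how far the batch is behind the threshold

print("batch timestamp:", to_utc(batch).isoformat())
print("GC threshold:   ", to_utc(gc).isoformat())
print("batch is behind the GC threshold by", lag_seconds, "seconds")
```

For the example error, the batch timestamp decodes to 2025-01-15 17:12:11 UTC and the GC threshold to 2025-01-15 18:01:11 UTC, so the batch is 2940 seconds (49 minutes) behind the threshold. Comparing the decoded batch timestamp with the wall-clock time of the failing query shows how stale the chosen timestamp is.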

If a node in your cluster encountered a fatal error, supply the contents of the
log directories (at minimum of the affected node(s), but preferably all nodes).

Environment:

  • CockroachDB version v24.2.8
  • Host machines:
      1. AWS 4 vCPU
      2. 1280 GiB
      3. 16 GiB (memory)
      4. 16000 IOPS
      5. m6i.xlarge
  • Client app, known failures in:
      1. In-console SQL Shell
      2. SQL CLI
      3. DBeaver

Jira issue: CRDB-46693

@cvansicklecrdb cvansicklecrdb added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels Jan 21, 2025

blathers-crl bot commented Jan 21, 2025

Hi @cvansicklecrdb, please add branch-* labels to identify which branch(es) this C-bug affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@rafiss rafiss added the branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 label Jan 21, 2025
@exalate-issue-sync exalate-issue-sync bot added the P-1 Issues/test failures with a fix SLA of 1 month label Jan 21, 2025
rafiss (Collaborator) commented Jan 22, 2025

I made a PR that will address this error, and it will be backported: #139532

In the meantime, the way to work around this issue is to perform an operation that refreshes the role cache. Running a command like BEGIN; CREATE USER tmp; DROP USER tmp; COMMIT; (creating and dropping a user in the same transaction) causes the cache to refresh, so CRDB picks a newer timestamp that is after the GC threshold.
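If the workaround needs to be scripted, a minimal sketch might look like the following. It is not an official tool: the function names are hypothetical, `cursor` is assumed to be any object with a DB-API style `execute(sql)` method, and the connection is assumed to be in autocommit mode so that the explicit BEGIN and COMMIT are sent as written. The account running it must be allowed to create and drop users, and the throwaway name must not belong to an existing user.

```python
import re


def role_cache_refresh_statements(user="tmp"):
    """Build the statements from the workaround comment, in order.

    `user` is a throwaway name; it is restricted to a simple identifier
    because it is interpolated directly into the SQL text.
    """
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", user):
        raise ValueError("user must be a simple lowercase identifier")
    return [
        "BEGIN",
        f"CREATE USER {user}",
        f"DROP USER {user}",
        "COMMIT",
    ]


def run_workaround(cursor, user="tmp"):
    """Create and drop a user in one transaction to refresh the role cache.

    Assumption: the cursor's connection is in autocommit mode, so the
    driver does not wrap these statements in a transaction of its own.
    """
    for statement in role_cache_refresh_statements(user):
        cursor.execute(statement)
```

Passing a different `user` value avoids a clash if a user named `tmp` already exists on the cluster.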

2 participants