Add an `ilab` tab to the CPT dashboard UI #124

dbutenhof · 2024-10-17T20:26:09Z

Type of change

Description

This uses #123 (the ilab API endpoint, which in turn relies on #122, the crucible_svc) to display a tab to view and analyze RHEL AI InstructLab CPT data extracted from a Crucible controller.

This tab can graph Crucible metrics and display metadata. Subsequent PRs will provide filtering, graph comparisons, and metric value tables.

This is chained from #122 (Crucible service) -> #140 (unit test framework) -> #146 (crucible unit tests) -> #123 (ilab API) -> #155 (API unit tests) -> #158 (functional test framework) -> #124 (ilab UI)

Related Tickets & Documents

Various Jira stories under Epic PANDA-496.

Checklist before requesting a review

I have performed a self-review of my code.
If it is a core feature, I have added thorough tests.

Testing

Tested manually against a local backend.

github-actions · 2024-11-17T18:43:23Z

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions · 2024-12-18T18:46:45Z

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions · 2025-01-18T18:40:38Z

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 10 days.

This encapsulates substantial logic for interpreting the Crucible Common Data Model OpenSearch schema on behalf of the CPT dashboard API components. By itself, it does nothing.

This uses `black`, `isort` and `flake8` to check code quality, although failure is ignored until we've cleaned it up (which has begun in PR cloud-bulldozer#139 against the `revamp` branch). Minimal unit testing is introduced, generating a code coverage report. The text summary is added to the Action summary page, and the more detailed HTML report is stored as an artifact for download. NOTE: The GitHub Action environment is unhappy with `uvicorn` 0.15; upgrading to the latest 0.32.x seems to work and hasn't obviously broken anything else.
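The "check but ignore failure" behavior described above could be approximated locally with a small wrapper like the one below. This is only a sketch under assumptions: the names `ADVISORY_CHECKS` and `run_advisory_checks` are hypothetical, the tool options are a guess at a reasonable invocation, and the real workflow presumably expresses this in the GitHub Actions configuration rather than in Python.

```python
import subprocess
import sys

# Assumed invocations (not taken from this PR); each tool is run as a module
# of the current interpreter so the same environment is used for all three.
ADVISORY_CHECKS = {
    "black": [sys.executable, "-m", "black", "--check", "."],
    "isort": [sys.executable, "-m", "isort", "--check-only", "."],
    "flake8": [sys.executable, "-m", "flake8", "."],
}


def run_advisory_checks(commands):
    """Run each command and record its exit code without ever raising.

    `commands` maps a label to an argv list. A command whose executable
    cannot be found is recorded as None instead of aborting the run.
    """
    results = {}
    for label, argv in commands.items():
        try:
            # check=False: a non-zero exit code is reported, not raised.
            results[label] = subprocess.run(argv, check=False).returncode
        except FileNotFoundError:
            results[label] = None  # executable not installed
    return results
```

Calling `run_advisory_checks(ADVISORY_CHECKS)` returns a dictionary of exit codes; a CI wrapper could print them and still exit with status 0 until the cleanup is complete.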

`crucible_svc.py` test coverage is now at 97%. While the remaining 3% is worth some effort later, the law of diminishing returns means it would require significant additional effort, and since subsequent ILAB PRs will change some of the service code anyway, it's good enough for now.

Provide the `api/v1/ilab` API endpoint to allow a client to query collected data on a Crucible CDM OpenSearch instance through the `crucible_svc` service layer. It is backed by the Crucible layer added in cloud-bulldozer#122, so only the final commit represents changes in this PR.

This covers 100% of the `ilab.py` API module using `FastAPI`'s `TestClient`. This proved ... interesting ... as the FastAPI and Starlette versions we use are incompatible with the underlying `httpx` version: `TestClient` initialization fails in a way that can't be worked around. (Starlette passes an unknown keyword parameter.) After some experimentation, I ended up "unlocking" all the API-related packages in `pyproject.toml` to `"*"` and letting `poetry update` resolve them, then "re-locked" them to those versions. The resulting combination of modules works for unit testing, and appears to work in a real `./local-compose.sh` deployment as well.

This adds a mechanism to "can" and restore a small prototype ILAB (Crucible CDM) OpenSearch database in a pod along with the dashboard back end, front end, and functional tests. The functional tests run entirely within the pod, with no exposed ports and with unique container and pod names, allowing for the possibility of simultaneous runs (e.g., in CI) on the same system. This also has utilities for diagnosing a CDM (v7) datastore and cloning a limited subset, along with creating an OpenSearch snapshot from that data to bootstrap the functional test pod. Only a few functional test cases are implemented here, as a demonstration. More will be added separately.

This relies on the ilab API in cloud-bulldozer#123, which in turn builds on the crucible service in cloud-bulldozer#122.

This adds the basic UI to support comparison of the metrics of two InstructLab runs. This compares only the primary metrics of the two runs, in a relative timeline graph. This is backed by cloud-bulldozer#125, which is backed by cloud-bulldozer#124, which is backed by cloud-bulldozer#123, which is backed by cloud-bulldozer#122. These represent a series of steps towards a complete InstructLab UI and API, and will be reviewed and merged from cloud-bulldozer#122 forward.
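As a rough illustration of the "relative timeline" idea, the sketch below rebases each run's samples to an offset from that run's first sample, so that two runs recorded at different times can share one X axis. The sample layout (a list of `(timestamp, value)` tuples), the function name `to_relative`, and the data values are all assumptions made for illustration; they are not the dashboard's actual data model or API.

```python
from datetime import datetime, timedelta


def to_relative(samples):
    """Rebase (timestamp, value) samples to seconds since the first sample.

    `samples` is a hypothetical list of (datetime, float) tuples; the real
    Crucible metric format is not shown in this PR.
    """
    if not samples:
        return []
    ordered = sorted(samples, key=lambda s: s[0])
    start = ordered[0][0]
    return [((ts - start).total_seconds(), value) for ts, value in ordered]


# Two made-up runs that started on different days.
run_a_start = datetime(2024, 10, 1, 9, 0, 0)
run_b_start = datetime(2024, 10, 3, 14, 30, 0)
run_a = [(run_a_start + timedelta(seconds=10 * i), 100.0 + i) for i in range(3)]
run_b = [(run_b_start + timedelta(seconds=10 * i), 90.0 + 2 * i) for i in range(3)]

rel_a = to_relative(run_a)
rel_b = to_relative(run_b)

# Both series now start at offset 0.0 and can be drawn on a shared X axis.
print(rel_a)
print(rel_b)
```

Plotting elapsed time rather than wall-clock time is what lets the primary metrics of two runs be overlaid even when the runs happened days apart.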

dbutenhof force-pushed the ilab3 branch 2 times, most recently from 021ca1f to 176ddee on October 18, 2024 16:52

dbutenhof mentioned this pull request Oct 18, 2024

Add backend support for multi-run graphing #125

Draft

7 tasks

dbutenhof mentioned this pull request Oct 24, 2024

Add a UI for multi-run comparison graphs #127

Draft

7 tasks

This was referenced Nov 6, 2024

API infrastructure to support InstructLab tab display of Crucible metric statistics #129

Draft

Add statistical summary charts #130

Draft

Add a metadata flyover to the comparison page #131

Draft

Support selection of multiple metrics #132

Draft

dbutenhof mentioned this pull request Nov 13, 2024

Allow selecting multiple metrics on compare page #133

Draft

7 tasks

github-actions bot added the Stale label Nov 17, 2024

dbutenhof removed the Stale label Nov 18, 2024

dbutenhof self-assigned this Nov 18, 2024

This was referenced Nov 18, 2024

Improve formatting of delta time X axis #134

Draft

Allow "bare metal" deployment for testing #136

Draft

Add a metric label template mechanism #137

Draft

github-actions bot added the Stale label Dec 18, 2024

dbutenhof removed the Stale label Dec 19, 2024

dbutenhof mentioned this pull request Jan 13, 2025

ILAB date picker updates when no ILAB jobs found #153

Draft

7 tasks

github-actions bot added the Stale label Jan 18, 2025

dbutenhof removed the Stale label Jan 20, 2025

dbutenhof added 5 commits January 28, 2025 10:55

Add new Crucible backend service

15dba24

This encapsulates substantial logic for interpreting the Crucible Common Data Model OpenSearch schema on behalf of the CPT dashboard API components. By itself, it does nothing.

dbutenhof and others added 2 commits January 29, 2025 08:42

Add an ilab UI tab

35dfb4f

This relies on the ilab API in cloud-bulldozer#123, which in turn builds on the crucible service in cloud-bulldozer#122.

dbutenhof force-pushed the ilab3 branch from 176ddee to 35dfb4f on January 29, 2025 15:25

dbutenhof mentioned this pull request Jan 31, 2025

Add support for CDM v8 #144

Draft

7 tasks
