Add LAYERS data to XLSX output #1490

Open
mjherzog opened this issue Dec 31, 2024 · 4 comments
Labels
enhancement (New feature or request), high priority, outputs (This issue is related to one of the SCIO output files/), web-ui

Comments

@mjherzog
Member

For Docker image Scans we always need the LAYERS data. Currently the SCIO XLSX output includes the LAYERS sheet when the image is scanned directly on SCIO, but it does not include LAYERS when we download the XLSX output for a Docker image Scan that was uploaded to SCIO with the load_inventory pipeline.
I also do not see the Codebase data in the UI for the Scans uploaded with the load_inventory pipeline.

The key enhancement is to get the LAYERS data in the XLSX output but it would be nice to also be able to see the Codebase data in the UI if possible.

mjherzog added the enhancement, high priority, web-ui, and outputs labels Dec 31, 2024
tdruez added a commit that referenced this issue Dec 31, 2024
@tdruez
Contributor

tdruez commented Dec 31, 2024

but the XLSX output does not include LAYERS when we download XLSX output for a Docker image Scan that was uploaded to SCIO with the load_inventory pipeline.

The layers data is extracted during the docker analysis pipeline and it is stored in the project extra_data.images field.

This value, when available, is presented in the UI and included as the LAYERS sheet in the XLSX output.

The export of the LAYERS sheet was implemented in #735 and only contains a subset of the whole images JSON data structure that is stored on the Project.
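To illustrate why the sheet only carries a subset, here is a minimal sketch of flattening a nested images structure into one row per layer. The shape of the `images` data, the column names in `LAYER_COLUMNS`, and the `flatten_layers` helper are all assumptions made for illustration. They are not the actual ScanCode.io implementation.

```python
# Hypothetical column subset; the real LAYERS sheet columns may differ.
LAYER_COLUMNS = ("image_id", "layer_id", "created_by", "size")


def flatten_layers(images):
    """Return one flat row per layer, keeping only the LAYER_COLUMNS keys."""
    rows = []
    for image in images:
        for layer in image.get("layers", []):
            # The image identifier is copied onto every row of that image.
            row = {"image_id": image.get("image_id")}
            for column in LAYER_COLUMNS[1:]:
                row[column] = layer.get(column)
            rows.append(row)
    return rows


# Assumed example data, not taken from a real project.
images = [
    {
        "image_id": "sha256:aaa",
        "tags": ["example:latest"],  # image-level detail dropped by the flat rows
        "layers": [
            {"layer_id": "l1", "created_by": "ADD rootfs", "size": 100, "comment": "x"},
            {"layer_id": "l2", "created_by": "RUN apt", "size": 50},
        ],
    }
]

rows = flatten_layers(images)
```

Anything outside the chosen columns (here `tags` and `comment`) does not survive the export, which is the kind of data loss expected from a flat sheet.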

The issue here is that the input.load_inventory_from_xlsx function does not load the LAYERS sheet back into the project extra_data.images field.

As anticipated in #735 (comment)

There will be some data loss due to the XLSX format limitation.

XLSX is not the ideal format to export and load project data. This is especially true for the LAYERS sheet.
The ScanCode.io JSON output is preferred to load project data without any loss.

We can add a bit of code to load the available layers subset in the Project extra_data field though.
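As a rough sketch of what loading the subset back could look like, assuming the LAYERS rows have already been read into dicts and that each row carries an `image_id` column (both assumptions; the `rows_to_images` helper and the column names are hypothetical and not the code from the actual fix):

```python
def rows_to_images(rows):
    """Group flat layer rows by image_id into a nested images-like list."""
    images = {}
    for row in rows:
        image_id = row.get("image_id")
        # Create the image entry the first time its id is seen.
        image = images.setdefault(image_id, {"image_id": image_id, "layers": []})
        layer = {key: value for key, value in row.items() if key != "image_id"}
        image["layers"].append(layer)
    # Dicts keep insertion order, so images come back in first-seen order.
    return list(images.values())


# Assumed example rows, as they might be read from a LAYERS sheet.
layer_rows = [
    {"image_id": "sha256:aaa", "layer_id": "l1", "size": 100},
    {"image_id": "sha256:aaa", "layer_id": "l2", "size": 50},
    {"image_id": "sha256:bbb", "layer_id": "l3", "size": 10},
]

extra_data = {}
extra_data["images"] = rows_to_images(layer_rows)
```

Only what was written to the sheet can be restored this way, so the result is still a subset of the original images data.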


I also do not see the Codebase data in the UI for the Scans uploaded with the load_inventory pipeline.

The codebase data panel presents the data available on disk; that data is not available when loading from an SBOM (or XLSX).

tdruez added a commit that referenced this issue Dec 31, 2024
@tdruez
Contributor

tdruez commented Dec 31, 2024

We can add a bit of code to load the available layers subset in the Project extra_data field though.

Added in #1491

@tdruez tdruez closed this as completed Dec 31, 2024
@mjherzog
Member Author

mjherzog commented Jan 2, 2025

I have run several tests with v34.9.3, using new projects with the load_inventory pipeline, but I do not see a LAYERS sheet in the XLSX output. I could not see the extra_data in the Web UI, but I verified that we have reasonable-looking layers data in the JSON files.

To clarify: we loaded the JSON data into SCIO, not XLSX, so the data is available for SCIO to collect during the load_inventory pipeline.

@mjherzog mjherzog reopened this Jan 3, 2025
@tdruez
Contributor

tdruez commented Jan 6, 2025

I could not see the extra_data in the Web UI, but I verified that we have reasonable-looking layers data in the JSON files.

The extra_data field was not supported in the load_inventory pipeline for either JSON or XLSX inputs.

See #926 (comment) for the previous discussion on this topic.

I've revisited this and implemented a solution, see #926 (comment)

The extra_data content should now properly be loaded from the JSON file into the Project and will be available to craft the LAYERS sheet on XLSX export.
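For readers who want to check what a JSON file would contribute, here is a minimal sketch. It assumes the JSON output stores the project extra_data inside a `headers` entry whose `tool_name` is `scanpipe`; that layout and the `get_header_extra_data` helper are assumptions for illustration, not confirmed in this thread.

```python
import json


def get_header_extra_data(scan_json_text, tool_name="scanpipe"):
    """Return the extra_data dict from the first matching header, or {}."""
    data = json.loads(scan_json_text)
    for header in data.get("headers", []):
        if header.get("tool_name") == tool_name:
            # A missing or null extra_data is treated as empty.
            return header.get("extra_data") or {}
    return {}


# Assumed example input, not taken from a real scan.
sample = json.dumps(
    {
        "headers": [
            {
                "tool_name": "scanpipe",
                "extra_data": {
                    "images": [{"image_id": "sha256:aaa", "layers": []}]
                },
            }
        ],
        "packages": [],
    }
)

project_extra_data = {}
project_extra_data.update(get_header_extra_data(sample))
```

If `images` shows up in the result, the layers data is present in the file and should be available for the LAYERS sheet once it is loaded into the Project.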
