back2source: Run pipeline for Rust source and binaries #1475

Open
Tracked by #1437
pombredanne opened this issue Dec 12, 2024 · 9 comments

@pombredanne
Member

No description provided.

@tdruez
Contributor

tdruez commented Jan 15, 2025

@pombredanne We need the list of inputs to make progress on this one.

@AyanSinhaMahapatra
Member

@tdruez I have a list; I'll upload a CSV for this soon.

Maybe we should keep the inputs (CSVs with source/binary links), the scan outputs (JSON), and the report CSVs all at https://github.com/aboutcode-org/back2source-data in a structured format?

I can start a PR there.

@tdruez
Contributor

tdruez commented Jan 15, 2025

@AyanSinhaMahapatra See #1476 (comment) for the supported CSV input example.

@AyanSinhaMahapatra
Member

@tdruez I've added aboutcode-org/back2source-data#6 with the input files, reports and scans for Rust. It is ready for your review. Is this an okay structure to store the data?

I'm adding the scan data too, as it could help identify scan improvements that would in turn improve the SCIO d2d pipelines.

@tdruez
Contributor

tdruez commented Jan 17, 2025

@AyanSinhaMahapatra Thanks for the listing at https://github.com/aboutcode-org/back2source-data/blob/fe50e25c82b70ceea9feba224bd50624ba77e47c/input-data/rust.csv
It only contains 25 projects, though, and we generally run back2source on 100 projects.
Should we add more projects here or go ahead with only 25?

@pombredanne
Member Author

@tdruez @AyanSinhaMahapatra 25 projects is a good enough start for now IMHO. We can expand this later; these projects with binaries are harder to find for Rust and a few other ecosystems. In the near future we should run this at scale on many projects, like all Rust crates!

@AyanSinhaMahapatra
Member

> We can expand this later; these projects with binaries are harder to find for Rust and a few other ecosystems

One approach I've used for Rust, which could also help for other ecosystems, is to find a GitHub Action that releases Rust binaries and source, and then search GitHub repos for workflows that use it. That yields releases with both Rust binary and source archives.

For Rust, this was taiki-e/upload-rust-binary-action@v1 and tar: unix
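A rough sketch of how that search could be scripted, for what it's worth. It only builds the request URL for the GitHub REST code search endpoint and does not send it. The two search strings are the ones quoted above; the `path:.github/workflows` qualifier and the helper name `build_code_search_url` are my own assumptions for illustration, and the endpoint needs an authenticated request as far as I know.

```python
from urllib.parse import urlencode

# Search strings taken from the comment above.
ACTION_REF = "taiki-e/upload-rust-binary-action@v1"
ARCHIVE_HINT = "tar: unix"


def build_code_search_url(action_ref, extra_term=None, per_page=100):
    """Return a GitHub code search URL for workflows that mention `action_ref`.

    Hypothetical helper: restricting the search to .github/workflows is an
    assumption, not something stated in this thread.
    """
    terms = [f'"{action_ref}"']
    if extra_term:
        terms.append(f'"{extra_term}"')
    terms.append("path:.github/workflows")
    query = " ".join(terms)
    params = urlencode({"q": query, "per_page": per_page})
    return "https://api.github.com/search/code?" + params


if __name__ == "__main__":
    # Print the URL; sending it (with a token) is left to the caller.
    print(build_code_search_url(ACTION_REF, ARCHIVE_HINT))
```

The repositories in the search results would still need a manual check that their releases ship both a binary archive and a source archive.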

@chinyeungli
Contributor

Created a list of projects that use Rust:

back2source_rust_projects_list-2025-01-23.csv

@tdruez
Contributor

tdruez commented Jan 24, 2025

Ran on the 100-project list provided at https://github.com/user-attachments/files/18531274/back2source_rust_projects_list-2025-01-23.csv

  1. Use back2source_rust_projects_list-2025-01-23.csv as the input list.

  2. Batch create the projects:

         docker compose -f /opt/scancodeio/docker-compose.yml run --rm \
             --volume $PWD:/input-data:ro \
             web scanpipe batch-create \
             --input-list /input-data/back2source_rust_projects_list-2025-01-23.csv \
             --pipeline map_deploy_to_develop:Rust \
             --label back2source-Rust-v2 \
             --execute --async

  3. Use the "Report" action on the list of projects filtered by the "back2source-Rust-v2" label.

Results: back2source-report-Rust.xlsx
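
If this run needs repeating for other ecosystems, the batch-create step above could be wrapped in a small script. This is only a sketch: the helper name `batch_create_command` and its parameters are made up for illustration, and it uses nothing beyond the flags shown in the command above. It builds the argument list and prints it rather than running anything.

```python
import os
import shlex


def batch_create_command(input_csv, ecosystem, label,
                         compose_file="/opt/scancodeio/docker-compose.yml",
                         input_dir="."):
    """Build the `scanpipe batch-create` invocation used in the steps above.

    Hypothetical helper: it only mirrors the flags from the command in this
    comment and assumes the CSV lives directly inside `input_dir`.
    """
    host_dir = os.path.abspath(input_dir)  # stands in for $PWD in the shell command
    return [
        "docker", "compose", "-f", compose_file, "run", "--rm",
        "--volume", f"{host_dir}:/input-data:ro",
        "web", "scanpipe", "batch-create",
        "--input-list", f"/input-data/{input_csv}",
        "--pipeline", f"map_deploy_to_develop:{ecosystem}",
        "--label", label,
        "--execute", "--async",
    ]


if __name__ == "__main__":
    cmd = batch_create_command(
        "back2source_rust_projects_list-2025-01-23.csv",
        ecosystem="Rust",
        label="back2source-Rust-v2",
    )
    # Print a copy-pasteable command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems if the input directory contains spaces.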

Projects
Status: In progress

4 participants