Vulnerability locations index #94

Open
bstee615 opened this issue Mar 9, 2022 · 0 comments
bstee615 commented Mar 9, 2022

Hello, my name is Ben Steenhoek and I am a PhD student at Iowa State University studying deep learning-based vulnerability detection. Thank you for making this dataset available and easy to use.

I want to use your corpus of DARPA Cyber Grand Challenge programs to train a neural network to detect buggy code, such as null-pointer dereferences or buffer overflows. To do this, I provide the model with each program's source code and the location of its vulnerability. For example, if the vulnerability causes a crash, I mark the statement that triggers it, such as the null-pointer dereference behind a segmentation fault. To collect a large dataset of vulnerable programs, I can only use vulnerability locations that are available in a machine-readable format such as XML or CSV.
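
To make the request concrete, here is a minimal sketch of the kind of index that would work for this purpose and how it would be turned into per-line labels. Everything in it is an assumption for illustration: the column names (`challenge`, `file`, `line`, `cwe`), the challenge names, and the file paths are hypothetical and do not describe anything this repo currently provides.

```python
import csv
import io
from collections import defaultdict

# Hypothetical index: one row per vulnerable statement.
# Column names, challenge names, and paths are illustrative assumptions.
SAMPLE_INDEX = """challenge,file,line,cwe
EXAMPLE_00001,src/main.c,42,CWE-476
EXAMPLE_00001,src/parse.c,17,CWE-121
EXAMPLE_00002,src/service.c,108,CWE-476
"""


def load_index(text):
    """Map (challenge, file) -> {line number: CWE id}."""
    index = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["challenge"], row["file"])
        index[key][int(row["line"])] = row["cwe"]
    return dict(index)


def label_lines(source, vulnerable_lines):
    """Return one 0/1 label per source line (1 = marked vulnerable)."""
    line_count = len(source.splitlines())
    return [1 if n in vulnerable_lines else 0 for n in range(1, line_count + 1)]


index = load_index(SAMPLE_INDEX)
```

With an index like this, `label_lines(source_text, index[(challenge, path)])` would give the statement-level labels the model is trained on. XML or any other structured format would serve equally well; CSV is used here only because it is short to show.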

Since the Cyber Grand Challenge evaluated several systems, I would expect that some level of automated checking exists. However, I cannot find a machine-readable index of vulnerability locations: this repo only includes a natural-language description of each vulnerability in README.md. How can I access a machine-readable index of the vulnerability locations? I would be grateful for your help in making use of this wonderful dataset.
