Hello, my name is Ben Steenhoek and I am a PhD student at Iowa State University studying deep learning-based vulnerability detection. Thank you for making this dataset available and easy to use.
I want to use your corpus of programs for the DARPA Cyber Grand Challenge to train a neural network model to detect buggy code, such as null-pointer dereferences or buffer overflows. To do this, I provide the model with the source code of the program and the location of the vulnerability. For example, if the vulnerability is a crash, I mark the statement which causes the crash, such as a segmentation fault caused by a null pointer dereference. In order to collect a large dataset of vulnerable programs, I can only use the vulnerability location if it's in a machine-readable format such as XML or CSV.
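As a rough illustration of what I mean, a CSV index with one row per vulnerable statement would be enough for my purposes. The sketch below is only an assumption about what such an index could look like: the column names (`challenge`, `file`, `line`, `kind`), the challenge name, and the paths are hypothetical, and this repo does not provide such a file as far as I can tell.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical index contents. The columns, the challenge name, and the
# paths are made up for illustration; no such file exists in this repo.
EXAMPLE_INDEX = """challenge,file,line,kind
Example_Challenge,src/main.c,42,null-pointer-dereference
Example_Challenge,src/parse.c,108,buffer-overflow
"""


@dataclass(frozen=True)
class VulnLocation:
    challenge: str
    file: str
    line: int  # 1-based line number of the statement that crashes
    kind: str


def load_index(text):
    """Parse the CSV text into a list of VulnLocation records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        VulnLocation(row["challenge"], row["file"], int(row["line"]), row["kind"])
        for row in reader
    ]


def label_lines(source, locations, file):
    """Return (line_text, is_vulnerable) pairs for one source file."""
    flagged = {loc.line for loc in locations if loc.file == file}
    return [
        (text, number in flagged)
        for number, text in enumerate(source.splitlines(), start=1)
    ]
```

With an index like this, I could call `label_lines` on each source file to get a per-statement label for training.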
Since the Cyber Grand Challenge evaluated several systems, I would expect there to be some level of automated checking. However, I do not see a machine-readable index of vulnerable locations. This repo only includes a natural-language description of each vulnerability in README.md. How can I access a machine-readable index of the vulnerability locations? I would be grateful for your help in making use of this wonderful dataset.