layout | permalink | challenge-id | status | sidenav | card-image | agency-logo | challenge-title | tagline | agency | partner-agencies-federal | partners-non-federal | external-url | total-prize-offered-cash | type-of-challenge | submission-start | submission-end | submission-link | prize | legal-authority | fiscal-year | challenge-manager | challenge-manager-email | point-of-contact | description | prizes | rules | judging | how-to-enter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
front-matter-data |
/challenge/artificial-intelligence-applications-to-autonomous-cybersecurity-challenge/ |
1049 |
closed |
true |
/assets/images/cards/AITAC.png |
dod_seal.jpg |
Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) Challenge |
Advanced malware prediction and prevention utilizing state-of-the-art Artificial Intelligence (AI) at the endpoint. |
Department of Defense - Naval Information Warfare Systems Command |
Department of Energy, Oak Ridge National Lab, Cybersecurity Research Group |
$150,000 |
Software and apps; Technology demonstration and hardware; Analytics, visualizations, and algorithms |
06/28/2019 02:30 PM |
09/30/2019 05:00 PM |
true |
Direct Prize Authority - DOD |
FY19, FY20 |
Michael Karlbom |
Description

See the FREQUENTLY ASKED QUESTIONS document for this challenge (AI-ATAC-Prize-Challenge-FAQ-2-05AUG19.pdf).
The Naval Information Warfare Systems Command (NAVWARSYSCOM) and the Program Executive Office for Command, Control, Communications, Computers, and Intelligence (PEO C4I) are conducting the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC, pronounced “AI attack”) Challenge (hereinafter “the Challenge”). The Navy’s Information Assurance and Cybersecurity Program Office (PMW 130) seeks to automate the Security Operations Center (SOC) using artificial intelligence and machine learning (AI/ML), beginning with the endpoint. Modern malware strains, especially sophisticated malware created by advanced persistent threat (APT) groups, mutate faster than signature-based protection tools can adapt. PMW 130 solicits white papers describing endpoint-based security technologies, along with the corresponding tools, for evaluation in the AI ATAC Prize Challenge competition.
Prizes

The Challenge winners will be notified via email. NAVWARSYSCOM will announce the winners on the Challenge.gov website and via appropriate channels.
NAVWARSYSCOM has established $150,000 as the total amount set aside for cash prizes under this Challenge. A $100,000 first place cash prize will be awarded to the winning entry. A $50,000 second place cash prize will be awarded to the second place winner. In the unlikely event of a tie, NAVWARSYSCOM will determine an equitable method of distributing the cash prizes.
If a prize goes to a team of Participants, NAVWARSYSCOM will award the cash prize to the individual/team’s point of contact registered on the Challenge website, for further distribution to the team, as the team members see fit.
NAVWARSYSCOM may award, pursuant to Title 10 U.S.C. § 2371b, a follow-on production contract or transaction to one or more participants who successfully demonstrated an effective AI/ML approach under this Challenge. This Challenge, however, does not in any way obligate NAVWARSYSCOM to procure any of the items within the scope of this Challenge from the winners.
Tax treatment of prizes will be handled in accordance with U.S. Internal Revenue Service guidelines. The winner must provide a U.S. taxpayer identification number (e.g., an SSN, TIN, or EIN) to receive the cash prize.
Rules

Each Participant (individual Participant, team of Participants, or commercial entity) shall submit one entry in response to this Challenge. Team entries or commercial entity entries must identify an individual as the primary point of contact and prize recipient. By submitting an entry, a Participant authorizes his or her name to be released to the media if the Participant wins the prize.
The submission package must include:
- white paper
- corresponding tool
Note that the tool and white paper must contain only unclassified material.
In order for an entry to be considered, both the white paper and corresponding tool must be submitted no later than 30 September 2019, in accordance with these submission guidelines.
White Paper Submission Guidelines:
White papers should provide an overview of the proposed technology and technical approach (e.g., architecture, deployment overview, algorithm description, model description, performance requirements, endpoint footprint, existing results, etc.), the benefits and novelty of the approach within the context of existing academic and commercially available technologies, and the dependencies necessary (e.g., data, platform, network connectivity, etc.) to operate the proposed technology. White papers must be no more than six pages in length. All white papers must be submitted along with the Participant’s tool per the instructions outlined in the tool submission guidelines below. Additionally, where appropriate, use protective markings such as “Do Not Publicly Release – Trade Secret” or “Do Not Publicly Release – Confidential Proprietary Business Information” in the header or footer of the submission.
Tool Submission Guidelines:
Software for endpoint agents and/or management appliances, or hardware for management appliances, must be shipped by trackable, non-postal delivery (FedEx, UPS, DHL, etc.) and received no later than 30 September 2019 at 1700 EDT at the following address:

Cybersecurity Research Group
Oak Ridge National Laboratory
Attn: AI ATAC Evaluation Team
1 Bethel Valley Road, Building 6012, Room 209
Oak Ridge, TN 37830
All questions regarding the Challenge should be sent via email to [email protected] no later than 30 August 2019, 1700 EDT. Questions submitted after this deadline may not be addressed.
Terms and Conditions:
These terms and conditions apply to all participants in the Challenge.
Agreement to Terms
The Participant agrees to comply with and be bound by the AI ATAC Challenge Background and Rules (“the Rules”) as well as the Terms and Conditions contained herein. The Participant also agrees that the decisions of the Government in connection with all matters relating to this Challenge are binding and final.
Eligibility
The Challenge is open to individual Participants, teams of Participants, and commercial entities. Commercial entities must be incorporated in and maintain a primary place of business in the United States (U.S.). Individual Participants and all members of teams of Participants must be U.S. citizens or U.S. Permanent Residents and be 18 years of age or older as of 08 July 2019. All Participants (commercial entities or individuals) must have a Social Security Number (SSN), Taxpayer Identification Number (TIN), or Employer Identification Number (EIN) in order to receive a prize. Eligibility is subject to verification before any prize is awarded.
Federal Government employees, PMW 130 support contractors and their employees, and Oak Ridge National Laboratory (ORNL) employees are not eligible to participate in this Challenge.
Violation of the rules contained herein or intentional or consistent activity that undermines the spirit of the Challenge may result in disqualification. The Challenge is void wherever restricted or prohibited by law.
Data Rights
NAVWARSYSCOM does not require that Participants relinquish or otherwise grant license rights to intellectual property developed or delivered under the Challenge. NAVWARSYSCOM requires sufficient data rights/intellectual property rights to use, release, display, and disclose the white paper and/or tool, but only to the evaluation team members, and only for purposes of evaluating the Participant submission. The evaluation team does not plan to retain entries after the Challenge is completed but does plan to retain data and aggregate performance statistics resulting from the evaluation of those entries. By accepting these Terms and Conditions, the Participant consents to the use of data submitted to the evaluation team for these purposes.
NAVWARSYSCOM may contact Participants, at no additional cost to the Government, to discuss the means and methods used in solving the Challenge, even if Participants did not win the Challenge. Such contact does not imply any sort of contractual commitment with the Participant.
Because of the number of anticipated Challenge entries, NAVWARSYSCOM cannot and will not make determinations on whether or not third-party materials in the Challenge submissions have protectable intellectual property interests. By participating in this Challenge, each Participant (whether participating individually, as a team, or as a commercial entity) warrants and assures the Government that any data used for the purpose of submitting an entry for this Challenge were obtained legally and through authorized access to such data. By entering the Challenge and submitting the Challenge materials, the Participant agrees to indemnify and hold the Government harmless against any claim, loss, or risk of loss for patent or copyright infringement with respect to such third-party interests.
This Challenge does not replace or supersede any other written contracts and/or written challenges that the Participant has or will have with the Government, which may require delivery of any materials the Participant is submitting herein for this Challenge effort.
This Challenge constitutes the entire understanding of the parties with respect to the Challenge. NAVWARSYSCOM may update the terms of the Challenge from time to time without notice. Participants are strongly encouraged to check the website frequently.
If any provision of this Challenge is held to be invalid or unenforceable under applicable federal law, it will not affect the validity or enforceability of the remainder of the Terms and Conditions of this Challenge.
Results of Challenge
Winners will be announced on the Challenge.gov website and notified via email.
Release of Claims
The Participant agrees to release and forever discharge any and all manner of claims, equitable adjustments, actions, suits, debts, appeals, and all other obligations of any kind, whether past or present, known or unknown, that have or may arise from, are related to or are in connection with, directly or indirectly, this Challenge or the Participant’s submission.
Compliance with Laws
The Participant agrees to follow and comply with all applicable federal, state and local laws, regulations and policies.
Governing Law
This Challenge is subject to all applicable federal laws and regulations. ALL CLAIMS ARISING OUT OF OR RELATING TO THESE TERMS WILL BE GOVERNED BY THE FEDERAL LAWS AND REGULATIONS OF THE UNITED STATES OF AMERICA.
Judging

Scope:
The Challenge evaluation will focus on AI/ML technologies that detect malware on an endpoint. The following describes the scope of the candidate technologies:
- This evaluation is focused on detecting malware at the endpoint. Network-based tools (such as network intrusion detection systems) or network-based systems that reconstruct and analyze files from network traffic are not eligible.
- Malware detection tools will classify malware on the host:
  - within the filesystem statically or dynamically, including addition, alteration or replacement of authorized operating system, application, or user files; or
  - with fileless techniques (e.g., identifying fileless malware that sits in memory as executable instructions); or
  - based on other host activities (e.g., logs indicating encrypted command and control communications).
- Technologies are expected to have an artificial intelligence and/or machine learning component (e.g., static or dynamic analysis and classifier), with the option to include other complementary approaches, such as signature-based detection.
- Technologies must operate on-premises, whether solely at the endpoint or in coordination with a local network-level appliance or VM that emulates a cloud analytic capability.
- To measure how AI/ML improves malware detection, the test data will use public and private benign and malicious samples.
Test Evaluation Process:
The evaluation team will review the white papers to better understand each tool. The test evaluation itself will involve delivering multiple test samples to virtual machines (VMs) running the Participant’s malware detection technology. For each test sample (a benign or malicious file), the following procedures will be performed on the tool:
- A VM will be instantiated with the Participant’s malware detection technology installed and running.
- The following information will be recorded for evaluation during a set time interval in which the sample (a file or fileless malware/benign-ware) is present on the VM:
  - The classification decision for the sample, under the following constraints:
    - The label must identify the sample as benign or malicious and be time-stamped. If no label is given in the time interval, or the tool does not produce a log or result for the sample, the classification will be interpreted as a benign label with the maximum timestamp for that time interval.
    - At most one label shall be given per sample. Other peripheral information beyond the required classification decision (e.g., taxonomy, confidence score, etc.) may also be provided, but will not be used in scoring of the Participant technology.
    - The output classification and any peripheral information must be programmatically accessible by the evaluation team. Documentation of the classification logging approach and format must be provided by the Participant (a hypothetical log-record sketch follows this list). For example, time-tagged logging content may be:
      - Sent to a logging server/SIEM, such as Splunk.
      - Logged to syslog, a local file, or Windows event logs.
  - Resource usage by only the Participant technology for the duration of the test sample’s VM, as follows (a sampling sketch follows the note below):
    - CPU usage per second
    - Volatile memory usage per second
    - Disk I/O per second
    - Network I/O per second
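The Challenge does not prescribe a log schema, only that each sample receive at most one time-stamped benign/malicious label that the evaluation team can read programmatically. As a purely hypothetical illustration (the field names, file path, and JSON-lines format below are assumptions, not requirements of the Challenge), a minimal Python sketch of such a log record might look like this:

```python
# Illustrative only: the Challenge requires a time-stamped benign/malicious label
# per sample that is programmatically accessible; this schema is hypothetical.
import json
from datetime import datetime, timezone

def write_classification(log_path, sample_id, label, confidence=None):
    """Append one time-stamped classification decision to a local JSON-lines log."""
    if label not in ("benign", "malicious"):
        raise ValueError("label must be 'benign' or 'malicious'")
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # required time stamp
        "sample_id": sample_id,    # hypothetical sample identifier
        "label": label,            # required benign/malicious decision
        "confidence": confidence,  # optional peripheral info (not used in scoring)
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

# Example: record a malicious verdict for one test sample.
write_classification("aiatac_agent.jsonl", "sample-0042", "malicious", confidence=0.97)
```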
Note: If the Participant’s technology uses an on-premises appliance for dynamic or static analysis, or AI/ML analysis (either a hardware appliance or a virtual machine), resource usage of the management console will be recorded in addition to that of the VM containing the test sample. If the on-premises appliance only manages endpoint agents, its usage will not be recorded.
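The Challenge likewise does not specify the evaluation team's measurement tooling. The sketch below illustrates per-second sampling of the four quantities listed above, assuming the third-party psutil library; it samples system-wide counters, so isolating usage to only the Participant technology's own processes would require additional per-process accounting not shown here.

```python
# A minimal sketch of per-second resource sampling, assuming psutil is installed.
# System-wide counters are used here purely for illustration.
import time
import psutil

def sample_resources(duration_seconds=60):
    """Record CPU, memory, disk I/O, and network I/O once per second."""
    samples = []
    prev_disk = psutil.disk_io_counters()
    prev_net = psutil.net_io_counters()
    for _ in range(duration_seconds):
        time.sleep(1)
        disk = psutil.disk_io_counters()
        net = psutil.net_io_counters()
        samples.append({
            "cpu_percent": psutil.cpu_percent(),                # CPU usage over the last second
            "memory_used_bytes": psutil.virtual_memory().used,  # volatile memory in use
            "disk_bytes": (disk.read_bytes - prev_disk.read_bytes)
                        + (disk.write_bytes - prev_disk.write_bytes),  # disk I/O this second
            "net_bytes": (net.bytes_sent - prev_net.bytes_sent)
                       + (net.bytes_recv - prev_net.bytes_recv),       # network I/O this second
        })
        prev_disk, prev_net = disk, net
    return samples
```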
Scoring:
Each candidate technology will be scored based on its aggregate performance when tested against benign and malicious file samples. The score will be a cost estimate that sums the following:
- Simulated attack cost: A cost function will be used to compute the estimated attack cost of each malicious file based on the file’s negative effects. The functions will be increasing in time, so that early detection incurs lower costs than later detection. Note that in the case of a true positive (correct classification of the malicious sample), only the attack cost up to the detection time is accrued. In the case of a false negative (no alert on a malicious sample), the attack cost for the whole test period will be incurred. For false positive (alerting on a benign sample) and true negative (correctly not alerting on a benign sample) scenarios, there is no attack and thus no attack costs will be accrued.
- Simulated security operator cost: Costs for handling alerts will be computed. This includes only the costs of handling true positive and false positive scenarios. These will be estimated from actual manpower costs incurred by security operations centers to triage, investigate, and document host alerts. In the case of a true negative or false negative scenario, there are no alerts and hence no operator costs are accrued.
- Simulated host resource cost: Costs for the use of host resources will be accrued throughout the duration of the event. The resources monitored on each host are as follows:
- CPU usage per second
- Volatile memory usage per second
- Disk I/O per second
- Network I/O per second
The simulated costs of each resource itemized above will be estimated from current rates. These simulated resource costs are accrued regardless of correct (true positive, true negative) or incorrect (false positive, false negative) classification. Note that in the case of a true negative (correctly not alerting on a benign sample), only host resource costs (no attack cost and no operator costs) will be accrued.
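The actual cost functions and rates are not published in the Challenge materials, so the following Python sketch is notional: attack_cost, OPERATOR_COST_PER_ALERT, and RESOURCE_RATES are placeholder values chosen only to illustrate how the true positive, false positive, false negative, and true negative rules above combine into a single total cost.

```python
# Notional scoring sketch; the official cost functions and rates are not published,
# so every numeric value below is a placeholder, not a Challenge parameter.

OPERATOR_COST_PER_ALERT = 200.0   # hypothetical SOC triage/investigation cost per alert
RESOURCE_RATES = {                # hypothetical dollar rates per unit of resource use
    "cpu_seconds": 0.001,
    "memory_byte_seconds": 1e-12,
    "disk_bytes": 1e-10,
    "network_bytes": 1e-10,
}

def attack_cost(elapsed_seconds):
    """Placeholder increasing cost function: later detection costs more."""
    return 5.0 * elapsed_seconds

def sample_score(is_malicious, alerted, detection_time, test_period, resource_usage):
    """Total simulated cost for one test sample."""
    cost = 0.0
    if is_malicious:
        # True positive: attack cost accrues only up to the detection time.
        # False negative: attack cost accrues for the whole test period.
        cost += attack_cost(detection_time if alerted else test_period)
    if alerted:
        # True positives and false positives both generate an alert to handle.
        cost += OPERATOR_COST_PER_ALERT
    # Host resource costs accrue regardless of classification outcome.
    cost += sum(RESOURCE_RATES[k] * v for k, v in resource_usage.items())
    return cost

def total_score(per_sample_results):
    """Sum of per-sample costs; the lowest total across all samples wins."""
    return sum(sample_score(**r) for r in per_sample_results)
```

Under this structure, a tool that never alerts avoids operator costs but accrues the full attack cost for every malicious sample, while a tool that alerts on everything avoids attack costs but pays the operator cost for every benign sample.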
THE WINNERS OF THE CHALLENGE WILL BE THE PARTICIPANTS WITH THE LOWEST COMPUTED TOTAL COST IN ACCORDANCE WITH THE SCORING ABOVE.
Environment:
The evaluation environment will consist of VMs. Submitted technologies do not have to be compatible with all tested platforms, but must be compatible with at least one. The two evaluation platforms are:
- Windows 10 Enterprise 64-bit
- CentOS 7 64-bit
The operating systems will have the following additional user software installed as a baseline:
- Windows: Microsoft Office, Java, Adobe PDF Reader
- CentOS: OpenOffice, Java, Adobe PDF Reader
The host-based endpoint agents must be installable on the Windows VM through a GUI, console, or PowerShell, using standard Windows installation procedures. On CentOS, an RPM-compatible or RHEL-distribution-compatible installer must be provided. Documentation describing the installation procedures should also be provided when submitting the tool, in accordance with the tool submission guidelines above.
If the technology requires an on-premises management appliance or console, then software and/or hardware components for the management appliance should be provided upon submission of the white paper no later than 30 September 2019 at 1700 EDT. The components to run the management console must include documentation necessary for installation and a three-month evaluation license if needed. The software and/or hardware components for the on-premises console must be one of the following:
- An exported virtual machine image (e.g., .ova, .qcow2, etc.) that can be run with a libvirt-compatible hypervisor (e.g., QEMU, XenServer, VMware, VirtualBox, Emulab, etc.)
- The software packages needed to install the appliance, rather than the entire virtual machine image.
- If not an OVA or similar image, a standalone hardware appliance that will not be connected to external cloud services.
How to Enter

See the Rules section above for submission guidelines.