layout | permalink | challenge-id | status | sidenav | card-image | agency-logo | challenge-title | tagline | agency | partner-agencies-federal | partners-non-federal | external-url | total-prize-offered-cash | type-of-challenge | submission-start | submission-end | submission-link | prize | fiscal-year | legal-authority | challenge-manager | challenge-manager-email | point-of-contact | description | prizes | rules | judging | how-to-enter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
front-matter-data |
/challenge/differential-privacy-synthetic-data-challenge/ |
74 |
closed |
true |
/assets/images/cards/SyntheticData_PSCR.jpg |
NIST_logo.png |
Differential Privacy Synthetic Data Challenge |
Propose an algorithm to develop differentially private synthetic datasets to enable the protection of personally identifiable information (PII) while maintaining a dataset's utility for analysis. |
Department of Commerce - National Institute of Standards and Technology |
$150,000 |
Software and apps |
10/31/2018 04:00 PM |
05/20/2019 12:00 AM |
true |
FY18, FY19 |
America COMPETES |
Terese Manley |
<h4>Overview</h4> <p><strong>THANK YOU to all the competitors in the NIST DIFFERENTIAL PRIVACY SYNTHETIC DATA CHALLENGE!</strong></p> <p><strong>March 14, 2019 Informational Webinar Resources:</strong></p> <ul> <li><a href="https://youtu.be/dxvyaZwYJeQ" target="_blank" rel="noopener">Webinar video recording</a></li> </ul> <p><strong>March 6, 2019</strong></p> <p>The NIST technical project lead, Christine Task of Knexus Research Corporation, presented a challenge summary at the Simons Institute workshop event <a href="https://simons.berkeley.edu/privacy2019-1" target="_blank" rel="noopener">Data Privacy: From Foundations to Applications</a> held March 4-8, 2019. <a href="{{ site.baseurl }}/assets/document-library/CTask_Knexus_Simons_NISTChallenge_Slides-030619.pdf">View the slides</a>.</p> <p><strong>Jan. 15, 2019 Informational Webinar Resources:</strong></p> <ul> <li><a href="https://youtu.be/f5ig0qBByFo">Webinar video recording</a></li> <li><a href="{{ site.baseurl }}/assets/document-library/NIST_Jan-15-Webinar-Questions.pdf">Q&A Session document</a></li> <li><a href="{{ site.baseurl }}/assets/document-library/NIST_Jan-15-2019-Webinar-Presentation.pdf">Presentation slides</a></li> </ul> <p><strong>Nov. 
13, 2018 Informational Webinar Resources:</strong></p> <ul> <li><a href="https://youtu.be/6sU-NFTsR-I">Webinar video recording</a></li> <li><a href="{{ site.baseurl }}/assets/document-library/Nist_Nov-13-Webinar-Questions.pdf">Q&A Session document</a></li> <li><a href="{{ site.baseurl }}/assets/document-library/NIST_Nov-13-Webinar-presentation.pdf">Presentation slides</a></li> </ul> <p>Are you a mathematician or data scientist interested in a new challenge? Then join this exciting data privacy competition with <strong>up to $150,000 in prizes</strong>, where participants will create new or improved differentially private synthetic data generation tools. When a data set has important public value but contains sensitive personal information and can’t be directly shared with the public, privacy-preserving synthetic data tools solve the problem by producing new, artificial data that can serve as a practical replacement for the original sensitive data, with respect to common analytics tasks such as clustering, classification and regression. By mathematically proving that a synthetic data generator satisfies the rigorous Differential Privacy guarantee, we can be confident that the synthetic data it produces won’t contain any information that can be traced back to specific individuals in the original data. The “Differential Privacy Synthetic Data Challenge” will entail a sequence of three marathon matches run on the Topcoder platform, asking contestants to design and implement their own synthetic data generation algorithms, mathematically prove their algorithm satisfies differential privacy, and then enter it to compete against others’ algorithms on empirical accuracy over real data, with the prospect of advancing research in the field of Differential Privacy.</p> <p>If you’re not a differential privacy expert, and you’d like to learn, join the Topcoder community for tutorials to help you catch up and compete! 
<strong>Join, learn, and compete for $150,000 in prizes!</strong></p> <h4>How Important Is This?</h4> <p>This challenge is focused on proactively protecting individual privacy while allowing for public safety data to be used by researchers for positive purposes and outcomes. NIST’s PSCR (public safety communications research) has strong commitments to both public safety research and the preservation of security and privacy, including the use of de-identification. </p> <p>There is no absolute guarantee that data will not be misused. Even a dataset that protects individual identities may, if it gets into the wrong hands, be used for ill purposes. Weaknesses in the security of the original data can threaten the privacy of individuals. </p> <p>It is well known that privacy in data release is an important area for the Federal Government (which has an Open Data Policy), state governments, the public safety sector and many commercial non-governmental organizations. Developments coming out of this competition would hopefully drive major advances in the practical applications of differential privacy for these organizations.</p> <p>The purpose of this series of competitions is to provide a platform for researchers to develop more advanced differentially private methods that can substantially improve the privacy protection and utility of the resulting datasets.</p> <h4>Get Involved - How to Participate</h4> <p><em>Note: All submissions for this challenge are being collected through the <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder website</a>.</em></p> <p>The Differential Privacy Synthetic Data Challenge is phase 2 of <a href="https://www.herox.com/UnlinkableDataChallenge/overview">The Unlinkable Data Challenge: Advancing Methods in Differential Privacy</a>, where competitors wrote concept papers to identify new approaches to de-identification and inform the final coding design of this challenge. 
Participants of this challenge will create an algorithm and participate in a sequence of Marathon Matches. Throughout each marathon match, participants will design and implement their own differentially private synthetic data generation algorithm, mathematically prove that their algorithm satisfies differential privacy, and enter it to compete against others’ algorithms on empirical accuracy over real data, with the prospect of advancing the understanding in the field of Differential Privacy.</p> <p>This is a multi-phased contest with three marathon matches. Competitors may enter the contest at any point to participate between November 2018 and April 2019. Topcoder will bring the registrations from previous matches to the next matches.</p> <p>The marathon mechanism will provide participants with immediate feedback about the quality of their submission using an online leaderboard, which allows teams to repeatedly improve and validate the capabilities of their algorithms through several phases of increasing difficulty. In each marathon, participants are able to make changes to their algorithm, team with other Topcoder members and watch their opponents move up and down the leaderboard. The final stage in each marathon match will be a sequestered stage where participants submit their final code to be rigorously tested and evaluated. Where a competitor's algorithm falls with respect to the utility-privacy frontier curve will determine who wins at each marathon match.</p> <h4>Prize Schedule</h4> <p>The Differential Privacy Synthetic Data Challenge consists of a sequence of three Marathon Matches with increasing difficulty. Each marathon match is approximately two months in length. 
For all prize eligibility, the submissions must be certified (see Match contest rule \#8) in advance and the participants will have 24 hours to supply their code base to re-confirm their certifications.</p> <h4>Summary of Important Dates</h4> <p>Pre-registration Begins: October 5, 2018</p> <p>Challenge Launch: October 31, 2018 @ 9 a.m. ET</p> <p>Match \#1: competitors submit October 31 - November 29, 2018</p> <p>Match \#2: competitors submit January 11 - February 9, 2019</p> <p>Match \#3: competitors submit March 10 - April 23, 2019</p> <p>Final Scoring Deadline: April 2019 (Refer to Challenge Timeline on <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder website</a> under each Match page)</p> <p>Final Winners Announced: May 20, 2019 (Interim Match Winners posted under the "Winners" tab)</p> <h4>How Do I Win?</h4> <p>To be eligible for an award, your submission must, at minimum:</p> <ul> <li>Meet the eligibility requirements in the NIST PSCR Official Challenge Rules posted on Challenge.gov under the “Rules” tab.</li> <li>Satisfy compliance review based on completion of the Certification Process (Match contest rule \#8) at each match, and if you are a top scorer, submit a report for Final Review (Match contest rule \#9).</li> <li>Score higher than your competitors!</li> </ul> <h4>Prizes</h4> <p>A total prize purse of up to $150,000 is available for this Challenge. Here is a breakdown:</p> <h4><strong>Final Prizes</strong></h4> <p>For the final prize of each match the top participants (1st through 5th place) will be invited via email to the final round of testing, the sequestered phase, by providing their solution in the form of a Dockerfile within 7 days of the end of the submission phase. This Dockerfile will contain their solution (source code) which will be run against the sequestered set of data and ground truth. 
A set of analytics will be run against the generated privatized data, and the same analytics will be run against the raw dataset. The results of the analytics (clustering, categorization, and linear regression) of the two sets of data will be compared using appropriate similarity scores, and a total accuracy score will be computed that represents the final score for that participant. The top leaders will be required to provide written documents defining their solution and providing a clear, correct proof that their solution satisfies differential privacy (see Match contest rule \#9, Final Review). </p> <h4><strong>Progressive Prizes</strong></h4> <p>There are multiple ways for Challenge leaders to win progressive prizes halfway through each match and prior to the final round. Progressive prizes are announced when the Challenge is publicly launched and awarded based on the participants' placement on the provisional leaderboard at an exact time (approximately halfway through the challenge), and their provisional testing score (not the sequestered results). These prizes are awarded to the Top 4 pre-certified participants from the provisional leaderboard once their code has been inspected. 
There are 4 progressive prizes available for each marathon match at $1,000 each.</p> <h4><strong>Additional Prizes</strong></h4> <p>An additional prize of $4,000 may be awarded to each of the top 5 award winning teams at the end of the final marathon match who agree to provide and do provide their full code solution in an open source repository for use by all interested parties.</p> <h4><strong>Prize Summary</strong></h4> <table style="border-collapse:collapse; width:100%; height:246px;" border="1"> <tbody> <tr style="height:18px;"> <td style="width:33.3333%; height:18px;"><strong>Match 1</strong></td> <td style="width:33.3333%; height:18px;"><strong>Match 2</strong></td> <td style="width:33.3333%; height:18px;"><strong>Match 3</strong></td> </tr> <tr style="height:174px;"> <td style="width:33.3333%; height:174px;"> <p>1st Place: $10,000</p> <p>2nd Place: $7,000</p> <p>3rd Place: $5,000</p> <p>4th Place: $2,000</p> <p>5th Place: $1,000</p> </td> <td style="width:33.3333%; height:174px;"> <p>1st Place: $15,000</p> <p>2nd Place: $10,000</p> <p>3rd Place: $5,000</p> <p>4th Place: $3,000</p> <p>5th Place: $2,000</p> </td> <td style="width:33.3333%; height:174px;"> <p>1st Place: $25,000</p> <p>2nd Place: $15,000</p> <p>3rd Place: $10,000</p> <p>4th Place: $5,000</p> <p>5th Place: $3,000</p> </td> </tr> <tr style="height:18px;"> <td style="width:33.3333%; height:18px;"> <p>Progressive Prize: 4 x $1,000</p> </td> <td style="width:33.3333%; height:18px;">Progressive Prize: 4 x $1,000</td> <td style="width:33.3333%; height:18px;">Progressive Prize: 4 x $1,000</td> </tr> <tr style="height:18px;"> <td style="width:33.3333%; height:18px;">Total: $29,000</td> <td style="width:33.3333%; height:18px;">Total: $39,000</td> <td style="width:33.3333%; height:18px;">Total: $62,000</td> </tr> </tbody> </table> <p><em>* An additional prize of $4,000 may be awarded to each of the top 5 award winning teams at the end of the final marathon match who agree to provide and do provide their full code solution in an open source 
repository for use by all interested parties.</em></p> <p><em><strong>Total Prize Purse for Differential Privacy Synthetic Data Challenge: $150,000.</strong></em></p> <h4>Questions?</h4> <p>For questions about the Official Rules or Challenge, contact <a href="mailto:[email protected]">[email protected]</a> with “Differential Privacy Synthetic Data Challenge” in the subject.</p> |
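To make the differential privacy guarantee described in the Overview more concrete, the following is a minimal Python sketch of one textbook approach: perturb a histogram with Laplace noise, then sample synthetic records from the noisy counts. It is not the challenge's test harness and not any competitor's algorithm. The function names (`laplace_noise`, `synthesize`, `l1_error`), the toy single-column data, and the simple error measure are all hypothetical. The sketch assumes neighboring datasets differ by adding or removing one record (so the histogram has L1 sensitivity 1), that the category domain and output size are public, and that delta is 0. Python's `random` module is used only for illustration; it is not suitable for a real privacy-critical release.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def synthesize(records, domain, epsilon, n_out, rng):
    """Sample n_out synthetic values from a Laplace-noised histogram.

    Assumption: add/remove-one-record neighbors, so each record changes
    exactly one count by 1 and noise of scale 1/epsilon suffices.
    `domain` must be fixed in advance, not derived from the private data.
    """
    counts = Counter(records)
    noisy = [
        max(0.0, counts.get(value, 0) + laplace_noise(1.0 / epsilon, rng))
        for value in domain
    ]
    # Clamping and sampling are post-processing of the noisy counts.
    if sum(noisy) == 0:
        noisy = [1.0] * len(domain)  # fall back to uniform weights
    return rng.choices(domain, weights=noisy, k=n_out)


def l1_error(real, synth, domain):
    # Hypothetical utility measure: L1 distance between the two
    # empirical distributions (0 = identical, 2 = disjoint).
    r, s = Counter(real), Counter(synth)
    return sum(abs(r[v] / len(real) - s[v] / len(synth)) for v in domain)


if __name__ == "__main__":
    rng = random.Random(2019)
    domain = ["fire", "medical", "police"]
    real = ["fire"] * 200 + ["medical"] * 500 + ["police"] * 300
    # The three epsilon values named in Match contest rule #6.
    for epsilon in (0.1, 1, 10):
        synth = synthesize(real, domain, epsilon, len(real), rng)
        print(epsilon, round(l1_error(real, synth, domain), 3))
```

Smaller values of epsilon add more noise and so give stronger privacy at the cost of accuracy, which is why the rules reward algorithms that stay accurate at small epsilon. A real entry would also need a written proof and a multi-column data model, both of which this sketch omits.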
<p><strong>Overview</strong></p> <p>The total prize purse is up to $150,000. For details of prizes, see the "Prizes" section in the <strong>Overview</strong> tab.</p> <p><strong>Match \#1 Final Prize Winners</strong></p> <p>Congratulations to the winners of Match \#1! <a href="https://apps.topcoder.com/forums/?module=Thread&threadID=929417&start=0">See the Topcoder Match \#1 forum for final analysis!</a></p> <ul> <li>1st ($10,000) – Team pfr (jonathanps*); members Jonathan Sculley and Paul Froissart</li> <li>2nd ($7,000) – Team DPSyn (ninghui*); members Ninghui Li, Zhikun Zhang and Tianhao Wang from Purdue University</li> <li>3rd ($5,000) – Team RMcKenna (rmckenna*); member Ryan McKenna</li> <li>4th ($2,000) – Team UCLANESL (manisrivastava*); members Mani Srivastava (UCLA), Moustafa Alzantot (UCLA), Supriyo Chakraborty (IBM Research) and Nathaniel Snyder (UCLA)</li> <li>5th ($1,000) – Team PrivBayes (privbayes*); members Boling Ding, Xiaokui Xiao, Jun Zhao, Ergute Bao and Xuejun Zhao</li> </ul> <p>Congratulations to the 4 Progressive Prize winners of Match \#1 (announced mid-point of the match), earning $1,000 each: <em>Team RMcKenna</em> (rmckenna*), <em>Team pfr</em> (jonathanps*), <em>Team DP-D</em> (eceva*), and <em>Team Epsilon-delta</em> (brettbj*)</p> <p>*Topcoder handle</p> <p><strong>Match \#2 Final Prize Winners</strong></p> <p>Congratulations to the winners of Match \#2! 
You can read the full winners announcement on the <a href="https://apps.topcoder.com/forums/?module=Thread&threadID=932743&start=0&mc=1#2323757" target="_blank" rel="noopener">Topcoder site</a> (account login required).</p> <ul> <li>1st Place: $15,000 - Team pfr (jonathanps*); members Jonathan Sculley and Paul Froissart</li> <li>2nd Place: $10,000 - Team DPSyn (ninghui*); members Ninghui Li, Zhikun Zhang and Tianhao Wang from Purdue University</li> <li>3rd Place: $5,000 - Team PrivBayes (privbayes*); members Boling Ding, Xiaokui Xiao, Jun Zhao, Ergute Bao and Xuejun Zhao</li> <li>4th Place: $3,000 - Team RMcKenna (rmckenna*); member Ryan McKenna</li> <li>5th Place: $2,000 - Team John Gardner (gardn999*); member John Gardner</li> </ul> <p>Congratulations to the 4 Progressive Prize winners of Match \#2 (announced mid-point of the match), earning $1,000 each: Team RMcKenna (rmckenna*), Team pfr (jonathanps*), Team John Gardner (gardn999*), and Team DPSyn (ninghui*)</p> <p>*Topcoder handle</p> <p><strong>Match \#3 Final Prize Winners</strong></p> <p>Congratulations to the winners of Match \#3! 
You can read the full winners announcement on the <a href="https://apps.topcoder.com/forums/?module=ThreadList&forumID=643125&mc=345" target="_blank" rel="noopener">Topcoder forum</a> (account login required). Check back on the forum on May 31st for news about where contestants will post their open source code.</p> <ul type="disc"> <li>1st Place: $25,000 - Ryan McKenna (rmckenna*); member Ryan McKenna</li> <li>2nd Place: $15,000 - Team DPSyn (ninghui*); members Ninghui Li, Zhikun Zhang and Tianhao Wang from Purdue University</li> <li>3rd Place: $10,000 - Team PrivBayes (privbayes*); members Boling Ding, Xiaokui Xiao, Jun Zhao, Ergute Bao and Xuejun Zhao</li> <li>4th Place: $5,000 - Team John Gardner (gardn999*); member John Gardner</li> <li>5th Place: $3,000 - Team UCLANESL (manisrivastava*); members Mani Srivastava (UCLA), Moustafa Alzantot (UCLA), Supriyo Chakraborty (IBM Research) and Nathaniel Snyder (UCLA)</li> </ul> <p>Congratulations to the 4 Progressive Prize winners of Match \#3 (announced mid-point of the match), earning $1,000 each: <em>Ryan McKenna</em> (rmckenna*), <em>Team PrivBayes</em> (privbayes*), <em>Team John Gardner</em> (gardn999*), and <em>Team DPSyn</em> (ninghui*)</p> <p>*Topcoder handle</p> |
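As a quick check on the figures above, the sketch below adds up the published prize amounts: five final prizes per match, four $1,000 progressive prizes per match, and the optional $4,000 open source prize for each of the top 5 teams. The constant and function names are hypothetical; only the dollar amounts come from this page.

```python
# Final prize amounts per match, 1st through 5th place, as published.
FINAL_PRIZES = {
    "Match 1": [10_000, 7_000, 5_000, 2_000, 1_000],
    "Match 2": [15_000, 10_000, 5_000, 3_000, 2_000],
    "Match 3": [25_000, 15_000, 10_000, 5_000, 3_000],
}
# Four progressive prizes of $1,000 in every match.
PROGRESSIVE_PER_MATCH = 4 * 1_000
# Up to five open source prizes of $4,000 after the final match.
OPEN_SOURCE_BONUS = 5 * 4_000


def match_total(match):
    # Final prizes plus progressive prizes for one match.
    return sum(FINAL_PRIZES[match]) + PROGRESSIVE_PER_MATCH


def total_purse():
    # All three matches plus the optional open source prizes.
    return sum(match_total(m) for m in FINAL_PRIZES) + OPEN_SOURCE_BONUS


if __name__ == "__main__":
    for match in FINAL_PRIZES:
        print(match, match_total(match))  # 29000, 39000, 62000
    print("Total purse", total_purse())   # 150000
```

The per-match totals of $29,000, $39,000 and $62,000 sum to $130,000, and the open source prizes account for the remaining $20,000 of the "up to $150,000" purse.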
<h4>Match Contest Rules</h4> <h4><em>(The NIST Official Rules are explained below the Match Contest Rules.)</em></h4> <p>These rules pertain to the competition hosted on the <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder website</a>.</p> <p>In the event of any discrepancy or inconsistency between the terms and conditions of the official rules located at Challenge.gov and these Guidelines or any other Competition materials, the terms and conditions of the NIST Official Rules on Challenge.gov as specified within the Challenge Specific Agreement shall control.</p> <p>1. This Challenge will tentatively run from October 31, 2018 to May 6, 2019 and be divided into a sequence of three matches. Each match will introduce new constraints to the scoring function, described later. </p> <p>2. Upon registration for the challenge, the participants will receive one (1) or more datasets bifurcated into training and testing data in addition to instructions or pseudo code to generate the ground truth for the training set for all matches to date.</p> <p>3. For match 1 the ground truth will be generated using only Clustering Analysis.</p> <p>4. For match 2 the ground truth will be generated using Clustering and Classification. </p> <p>5. For match 3 the ground truth will be generated using Clustering, Classification and Regression Analysis. </p> <p>6. Inside the JSON submission payload, participants will receive a value of delta from NIST/Topcoder and will submit their synthetic datasets at three different values of epsilon: 0.1, 1, 10. A test harness (shell script and/or Jar file) will be provided to sequentially run the participant solutions with different epsilon values that will concatenate and format the output for submission to Topcoder. Participants with algorithms that produce high quality privatized synthetic data on smaller values of epsilon will receive higher scores.</p> <p>7. 
Upon proper submission, participants will be redirected to the leaderboard where they will see their handle and the new score. </p> <p>8. Certification Process: All participants will be required to submit a complete written explanation of their algorithm, and a clear, correct mathematical proof that their solution satisfies differential privacy, to a dedicated mailbox. This document will be reviewed by NIST staff or their delegates (who may request clarification or rewriting if documents are unclear or underspecified). Participants will receive “certification” that they have a basic understanding of differential privacy, or a brief explanation why their algorithm or proof are incorrect. An indicator will display on the leaderboard to all participants who have had their approach certified.</p> <p>- For participants with higher than normal scores for their specified epsilon and delta, the participant algorithm and code must be provided as often as requested by NIST staff or their delegates who will review for non-intentional mistakes. This is a courtesy to ensure the participant still qualifies for prize eligibility.</p> <p>- Certification status will be used in determining the provisional score for entries.</p> <p>- Only certified approaches are eligible for cash prizes (see rule \#9, Final Review).</p> <p>9. Final Review: Before awarding cash prizes to participants who place in prize-winning ranks at the end of each Marathon Match, these participants must submit a report containing their complete algorithm description and mathematical privacy proof, along with their source code. They will have 7 days notification to provide this report. This report will be subject to a final review to verify that their algorithm correctly satisfies differential privacy at the stated values of delta and epsilon, and that the code correctly implements the stated algorithm. 
The review will be performed by Topcoder and NIST staff or their delegates, which may include NIST certified subject matter experts from outside NIST. If a participant is placed in a prize winning rank but fails to provide the report, or the review determines that their solution does not satisfy differential privacy, then they will not receive a prize, and it will be awarded to the participant with the next best performance who successfully completes the final review.</p> <p>10. An indicator will be present on the leaderboard to identify submissions with intention to publicly share their solutions at the end of the final match. Having this indicator will not increase or decrease participants’ chances of winning a prize.</p> <p>11. Relinquish - Topcoder/NIST is allowing registered participants or teams to “relinquish”. Government employees, students or any persons who are not allowed to take monetary prizes or feel they may have a conflict of interest are welcome to participate in the challenge while agreeing to forego the monetary prizes (by responding to the Relinquish post in the challenge forum). Relinquish means the member will compete, and their solutions scored, but they will not be eligible for a prize. Once a person or team relinquishes, we post their name to a forum thread labeled “Relinquished Competitors”. Members who relinquish must submit their implementation code and methods to maintain leaderboard status. Winners who are identified as relinquished will still be publicly recognized in the final winner announcements based on their placement on the leaderboard and the prize award, normally allotted to the placement of the relinquished team, will be given to the next certified team on the leaderboard. 
Throughout the challenge, the challenge’s online leaderboard will display rankings and accomplishments, giving them various opportunities to have their work viewed and appreciated by stakeholders from industry, government and academic communities.</p> <p>12. All participants must register on the challenge platform, hosted by Topcoder as members. If they are already members, they must be in good standing (not banned or suspended).</p> <p>13. Teaming is allowed, participants are permitted to form teams for this competition. After forming a team, participants of the same team are permitted to collaborate with each other. To form a team, a Topcoder member may recruit other Topcoder members, and register the team by completing the Topcoder Teaming Form for this challenge. Each team must declare a Captain. All participants in a team must individually register for this Competition and accept Topcoder's Terms and Conditions prior to joining the team. Team Captains must apportion prize distribution percentages for each teammate on the Teaming Form. The sum of all prize portions must equal 100%. The minimum permitted size of a team is 1 member, the maximum permitted team size is 5 members. Only team Captains may submit a solution to the Competition. Topcoder members participating in a team will not receive a rating for this Competition. Notwithstanding Topcoder rules and conditions to the contrary, solutions submitted by any Topcoder member who is a member of a team but is not the Captain of the team may be deleted and is ineligible for award. The deadline for forming teams is 11:59pm ET on the 21st day following the date that Registration and Submission opens as shown on the Challenge Details page for each match. Topcoder will prepare a Teaming Agreement for each team that has completed the Topcoder Teaming Form and distribute it to each member of the team. Teaming Agreements must be electronically signed by each team member to be considered valid. 
All Teaming Agreements are void, unless electronically signed by all team members by 11:59pm ET of the 28th day following the date that Registration & Submission opens as shown on the Challenge Details page. Any Teaming Agreement received after this period is void. Teaming Agreements may not be changed in any way after signature. The registered teams will be listed in the contest forum thread titled “Registered Teams”.</p> <p>14. Organizations such as companies may compete as one competitor if they are registered as a team and follow all challenge and platform rules.</p> <p>15. In this match participants may use any programming language and libraries, including commercial solutions, provided Topcoder/NIST are able to run it free of any charge. You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by NIST. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see “Ways to Win: Final Prizes” section). Submissions will be deleted/destroyed after they are confirmed. Topcoder/NIST will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run free of cost, and with all necessary licenses pre-installed in your solution. Topcoder/NIST are not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.</p> <p>16. If the solution includes licensed software (e.g. commercial software, open source software, etc), participants must include the full license agreements with the submission. Include licenses in a folder labeled “Licenses”. 
Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution.</p> <p>17. External datasets and pre-trained models are allowed for use in the competition provided the following are satisfied: (a) The external data and pre-trained models are unencumbered with legal restrictions that conflict with its use in the competition. And (b) The data source or data used to train the pre-trained models is defined in the submission description.</p> <h4> </h4> <h4>NIST Official Rules (Challenge-specific Agreement)</h4> <p><strong>PLEASE READ THIS CAREFULLY! You ("Innovator” or “Participant”) and NIST ("Challenge Sponsor”) are entering into this Challenge-Specific Agreement ("CSA”) for this particular incentive-based competition ("Challenge”) only. In order to participate in this Challenge, Innovator must accept these terms, and therefore should take the time to understand them. This CSA includes the NIST Official Rules.</strong></p> <p>1. If Innovator clicks "Accept" and proceeds to register for this Challenge, this CSA will be a valid and binding agreement between Innovator and Challenge Sponsor, and is in addition to the existing Topcoder Terms of Use for all purposes relating to this Challenge. Innovator should print and keep a copy of this CSA. 
No provisions that Innovator may have agreed to that are specific to any other individual challenge will apply.</p> <p>In the event of any discrepancy or inconsistency between the terms and conditions of the official rules and disclosures or other statements contained in any Competition materials, including but not limited to the Competition submission form, Competition website and use terms, Topcoder terms of participation, advertising (including but not limited to television, print, radio or online ads), the terms and conditions of the NIST Official Rules on Challenge.gov website as specified within this Challenge Specific Agreement shall control.</p> <p>2.<strong> America COMPETES Reauthorization Act of 2010:</strong> All challenge and prize competitions shall be performed in accordance with the America COMPETES Reauthorization Act of 2010, Pub. Law 111-358, title I, § 105(a), Jan. 4, 2011, as amended, codified at 15 U.S.C. § 3719 (hereinafter “America COMPETES Act”).</p> <p>3.<strong> Eligibility:</strong> Each Competition Participant (individual, team, or legal entity) is required to register on the Topcoder NIST Differential Privacy Synthetic Data Challenge [Unlinkable Data Challenge-Stage 2] website. There shall be one Official Representative for each Competition Participant. The Official Representative must provide a username (which may serve as a team or affiliation name), email address, and affirm that he/she has read and consents to be governed by the Competition Rules. At NIST’s discretion, any violation of this rule will be grounds for disqualification from the Competition. Multiple individuals and/or legal entities may collaborate as a team to submit a single entry, in which case the designated Official Representative will be responsible for meeting all entry and evaluation requirements. Participation is subject to all U.S. federal, state and local laws and regulations. 
Participants, including individuals and private entities, must not have been convicted of a felony criminal violation under any Federal law within the preceding 24 months and must not have any unpaid Federal tax liability that has been assessed, for which all judicial and administrative remedies have been exhausted or have lapsed, and that is not being paid in a timely manner pursuant to an agreement with the authority responsible for collecting the tax liability. Participants must not be suspended, debarred, or otherwise excluded from doing business with the Federal Government. Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity. Any other individuals or legal entities involved with the design, production, execution, distribution or evaluation of the NIST Differential Privacy Synthetic Data Challenge are not eligible to participate.</p> <p>To be eligible for a cash prize:</p> <p>a. A Participant (whether an individual, team, or legal entity) must have registered to participate and complied with all the requirements under section 3719 of title 15, United States Code as contained herein.</p> <p>b. At the time of Entry, the Official Representative (individual or team lead, in the case of a group project) must be age 18 or older and a U.S. citizen or permanent resident of the United States or its territories.</p> <p>c. In the case of a private entity, the business shall be incorporated in and maintain a primary place of business in the United States or its territories.</p> <p>d. Participants may not be a Federal entity or Federal employee acting within the scope of their employment. NIST employees are not eligible to participate. 
Non-NIST Federal employees acting in their personal capacities should consult with their respective agency ethics officials to determine whether their participation in this Competition is permissible.</p> <p>e. A Participant shall not be deemed ineligible because the Participant consulted with Federal employees or used Federal facilities in preparing its submission to the NIST Differential Privacy Synthetic Data Challenge prize competition if the Federal employees and facilities are made available to all Participants on an equitable basis.</p> <p>4. <strong>Submissions:</strong> By participating in this Challenge, Innovator may submit to Challenge Sponsor submission materials ("Submission”), as outlined in these NIST Official Rules on Challenge.gov and the Challenge Guidelines specific to this Challenge on Topcoder.com. By submitting a Submission, Innovator thereby agrees to provide reasonable assistance and additional information concerning the Submission to Challenge Sponsor, if requested.</p> <p>5. <strong>Warranties:</strong> By submitting an Entry, the Participant represents and warrants that all information submitted is true and complete to the best of the Participant’s knowledge, that the Participant has the right and authority to submit the Entry on the Participant’s own behalf or on behalf of the persons and entities that the Participant specifies within the Entry, and that the Entry (both the information and materials submitted in the Entry and the underlying technology/method/idea/treatment protocol/solution described in the Entry):</p> <p>a. Is the Participant’s own original work, or is submitted by permission with full and proper credit given within the Entry;</p> <p>b. Does not contain trade secrets (the Participant’s or anyone else’s);</p> <p>c. 
Does not knowingly violate or infringe upon the patent rights, industrial design rights, copyrights, trademarks, rights of privacy, publicity or other intellectual property or other rights of any person or entity;</p> <p>d. Does not contain malicious code, such as viruses, malware, timebombs, cancelbots, worms, Trojan horses or other potentially harmful programs or other material or information;</p> <p>e. Does not and will not violate any applicable law, statute, ordinance, rule or regulation, including, without limitation, United States export laws and regulations, including but not limited to, the International Traffic in Arms Regulations and the Department of Commerce Export Regulations; and</p> <p>f. Does not trigger any reporting or royalty or other obligation to any third party.</p> <p>6. <strong>Intellectual Property:</strong> Any applicable intellectual property rights to an Entry will remain with the Participant. By participating in the prize competition, the Participant is not granting any rights in any patents, pending patent applications, or copyrights related to the technology described in the Entry. However, by submitting an Entry, the Participant is granting NIST, NASA, and any parties acting on their behalf certain limited rights as set forth herein.</p> <p>a. By submitting an Entry, the Participant grants to NIST, NASA, and any parties acting on their behalf the right to review the Entry, to describe the Entry in any materials created in connection with this competition, and to screen and evaluate the Entry. NIST and NASA, and any parties acting on their behalf will also have the right to publicize Participant’s name and, as applicable, the names of Participant’s team members and/or Organization which participated in submitting the Entry following the conclusion of the Competition. </p> <p>b. 
As part of its submission, the Participant must provide written consent granting NIST, NASA, and any parties acting on their behalf, a royalty-free, non-exclusive, irrevocable, worldwide license to display publicly and use for promotional purposes the Participant’s entry (“demonstration license”). This demonstration license includes posting or linking to the Participant’s entry on NIST and NASA’s websites, including the Competition Website, and partner websites, and inclusion of the Participant’s Entry in any other media, worldwide.</p> <p>7. <strong>Trade Secret Information:</strong> By making a submission to this prize competition, the Participant agrees that no part of its submission includes any Trade Secret information, ideas or products. All submissions to this prize competition are deemed non-proprietary. Since NIST does not wish to receive or hold any submitted materials “in confidence” it is agreed that, with respect to the Participant’s Entry, no confidential or fiduciary relationship or obligation of secrecy is established between NIST, NASA, or any parties acting on their behalf and the Participant, the Participant’s team, or the company or institution the Participant represents when submitting an Entry, or any other person or entity associated with any part of the Participant’s Entry.</p> <p>8. <strong>Liability:</strong> Participants shall agree to assume any and all risks and waive claims against the Federal Government and its related entities, except in the case of willful misconduct, for any injury, death, damage, or loss of property, revenue, or profits, whether direct, indirect, or consequential, arising from participation in this prize competition, whether the injury, death, damage, or loss arises through negligence or otherwise.</p> <p>9. <strong>Insurance:</strong> Participants are not required to obtain liability insurance for this Competition.</p> <p>10. 
<strong>Indemnification:</strong> Participants shall agree to indemnify the Federal Government against third party claims for damages arising from or related to Challenge activities.</p> <p>11. <strong>Changes and Cancellation:</strong> Challenge Sponsor has the right to make updates and/or make any changes at any time during the Challenge. Innovators are responsible for regularly reviewing the official rules on Challenge.gov website and updates on the <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder site</a> to ensure they are meeting all rules and requirements of the Challenge. Challenge Sponsor has the right to cancel the Challenge at any time, without warning or explanation, and to subsequently remove the Prize completely.</p> <p>12. <strong>Payments:</strong> The prize competition winners will be paid prizes directly from NIST. Prior to payment, winners will be required to verify eligibility. The verification process with the agency includes providing the full legal name, tax identification number or social security number, routing number and banking account to which the prize money can be deposited directly.</p> <p>13. <strong>Existing Laws:</strong> The Federal Government shall not, by virtue of conducting this prize competition, be responsible for compliance by Participants in the prize competition with Federal law, including licensing, export control, and nonproliferation laws, and related regulations.</p> <p>Participation is subject to all U.S. federal, state and local laws and regulations. Participants are responsible for checking applicable laws and regulations in their jurisdiction(s) before participating in the prize competition to ensure that their participation is legal. 
Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity.</p> <p>14. <strong>Registration and Submissions:</strong> Submissions must be made online (only), via upload to the Topcoder website, at the specific dates and times listed in the Summary of Important Dates on Challenge.gov website and <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder site</a>. No late submissions will be accepted.</p> <p>15. <strong>Selection of Winners:</strong> Based on the winning criteria, prizes will be awarded per the Judging Criteria section in the Challenge Guidelines. In the case of a tie, the winner(s) will be selected based on the highest votes from the judges.</p> <p>16. <strong>Judging:</strong> The final determination of the winners will be made at the sole discretion of NIST. Scores and feedback from NIST will not be shared. </p> <p>17. <strong>Progressive Prize Awards:</strong> There are multiple ways for Challenge leaders to win progressive prizes halfway through each marathon match and prior to the final round. Progressive prizes are announced when the Challenge is publicly launched and awarded based on the participants placement on the provisional leaderboard at an exact time (approximately half way through the challenge), and their provisional testing score (not the sequestered results). These prizes are awarded to the Top 4 pre-certified participants from the provisional leaderboard once their code has been inspected. There are 4 progressive prizes available for each marathon match at $1000 each.</p> <p>a. A Marathon Match is defined as a long-running crowdsourcing challenge, using the Topcoder website, where participants solve complex algorithms on large data sets. Participants submit their software code to create and optimize their algorithm. 
</p> <p>b. A Leaderboard is an online, public display of the competition on the Topcoder website. Each marathon match will have a leaderboard where participants will receive immediate feedback about the quality of their submission. Participants can repeatedly improve their score and validate the capabilities of their algorithm by viewing the leaderboard. The leaderboard will display the participants handle, ranking and provisional score, at any given point in time. Additionally, the leaderboard will identify those teams or individuals who agree to make their submission (code) publicly available at the conclusion of this challenge; this is an option, not required to participate.</p> <p>18. <strong>Additional Prize Awards: </strong>Additional funds are available to the NIST Judge panel, in the form of Additional Prizes, up to $20,000 and assessed at the end of the final marathon match. An additional prize of $4000 may be awarded to each of the top 5 award winning teams who agree to provide and do provide their full code solution in an open source repository for use by all interested parties.</p> <p>19. <strong>Privacy Advisory:</strong> The Topcoder.com website is hosted by a private entity and is not a service of NIST. The solicitation and collection of your personal or individually identifiable information is subject to the host’s privacy and security policies and will not be shared with NIST unless you win the Challenge. Challenge winners’ personally identifiable information must be made available to NIST in order to collect an award.</p> |
<p><strong>Judging and Scoring for Final Prizes</strong></p> <p>NIST staff or their delegates will review entries based on the Certification Process and Final Review Process. A submission that fails to meet the compliance criteria will be ineligible to win prizes in this contest. Submissions that pass the final compliance review will be scored based on accuracy during a sequestered evaluation: the submitter's code will be run against a blind (sequestered) dataset and similarity metrics will be computed between analytics results on the synthetic privatized set and the original dataset.</p> |
<p>All submissions must be made on the <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder website</a>. All contest details, including timeline and marathon match details can be found on the <a href="https://www.topcoder.com/community/data-science/Differential-Privacy-Synthetic-Data-Challenge">Topcoder website</a>.</p> |
THANK YOU to all the competitors in the NIST DIFFERENTIAL PRIVACY SYNTHETIC DATA CHALLENGE!
March 14, 2019 Informational Webinar Resources:
March 6, 2019
The NIST technical project lead, Christine Task of Knexus Research Corporation, presented a challenge summary at the Simons Institute workshop event Data Privacy: From Foundations to Applications held March 4-8, 2019. View the slides.
Jan. 15, 2019 Informational Webinar Resources:
Nov. 13, 2018 Informational Webinar Resources:
Are you a mathematician or data scientist interested in a new challenge? Then join this exciting data privacy competition with up to $150,000 in prizes, where participants will create new or improved differentially private synthetic data generation tools. When a data set has important public value but contains sensitive personal information and can’t be directly shared with the public, privacy-preserving synthetic data tools solve the problem by producing new, artificial data that can serve as a practical replacement for the original sensitive data, with respect to common analytics tasks such as clustering, classification and regression. By mathematically proving that a synthetic data generator satisfies the rigorous Differential Privacy guarantee, we can be confident that the synthetic data it produces won’t contain any information that can be traced back to specific individuals in the original data. The “Differential Privacy Synthetic Data Challenge” will entail a sequence of three marathon matches run on the Topcoder platform, asking contestants to design and implement their own synthetic data generation algorithms, mathematically prove their algorithm satisfies differential privacy, and then enter it to compete against others’ algorithms on empirical accuracy over real data, with the prospect of advancing research in the field of Differential Privacy.
If you’re not a differential privacy expert, and you’d like to learn, join the Topcoder community for tutorials to help you catch up and compete! Join, learn, and compete for $150,000 in prizes!
This challenge is focused on proactively protecting individual privacy while allowing for public safety data to be used by researchers for positive purposes and outcomes. NIST’s PSCR (public safety communications research) has strong commitments to both public safety research and the preservation of security and privacy, including the use of de-identification.
There is no absolute protection that data will not be misused. Even a dataset that protects individual identities may, if it gets into the wrong hands, be used for ill purposes. Weaknesses in the security of the original data can threaten the privacy of individuals.
It is well known that privacy in data release is an important area for the Federal Government (which has an Open Data Policy), state governments, the public safety sector and many commercial non-governmental organizations. Developments coming out of this competition would hopefully drive major advances in the practical applications of differential privacy for these organizations.
The purpose of this series of competitions is to provide a platform for researchers to develop more advanced differentially private methods that can substantially improve the privacy protection and utility of the resulting datasets.
Note: All submissions for this challenge are being collected through the Topcoder website.
The Differential Privacy Synthetic Data Challenge is phase 2 of The Unlinkable Data Challenge: Advancing Methods in Differential Privacy, where competitors wrote concept papers to identify new approaches to de-identification and inform the final coding design of this challenge. Participants of this challenge will create an algorithm and participate in a sequence of Marathon Matches. Throughout each marathon match, participants will design and implement their own differentially private synthetic data generation algorithm, mathematically prove that their algorithm satisfies differential privacy, and enter it to compete against others’ algorithms on empirical accuracy over real data, with the prospect of advancing the understanding in the field of Differential Privacy.
This is a multi-phased contest with three marathon matches. Competitors may enter the contest at any point to participate between November 2018 and April 2019. Topcoder will bring the registrations from previous matches to the next matches.
The marathon mechanism will provide participants with immediate feedback about the quality of their submission using an online leaderboard, which allows teams to repeatedly improve and validate the capabilities of their algorithms through several phases of increasing difficulty. In each marathon, participants are able to make changes to their algorithm, team with other Topcoder members and watch their opponents move up and down the leaderboard. The final stage in each marathon match will be a sequestered stage where participants submit their final code to be rigorously tested and evaluated. Where a competitor's algorithm falls with respect to the utility-privacy frontier curve will determine who wins at each marathon match.
The Differential Privacy Synthetic Data Challenge consists of a sequence of three Marathon Matches with increasing difficulty. Each marathon match is approximately two months in length. For all prize eligibility, the submissions must be certified (see Match contest rule #8) in advance and the participants will have 24 hours to supply their code base to re-confirm their certifications.
Pre-registration Begins: October 5, 2018
Challenge Launch: October 31, 2018 @ 9 a.m. ET
Match #1: competitors submit October 31 - November 29, 2018
Match #2: competitors submit January 11 - February 9, 2019
Match #3: competitors submit March 10 - April 23, 2019
Final Scoring Deadline: April 2019 (Refer to Challenge Timeline on Topcoder website under each Match page)
Final Winners Announced: May 20, 2019 (Interim Match Winners posted under the "Winners" tab)
To be eligible for an award, your submission must, at minimum:
- Meet the eligibility requirements in the NIST PSCR Official Challenge Rules posted on Challenge.gov under the “Rules” tab.
- Satisfy compliance review based on completion of the Certification Process (Match contest rule #8) at each match, and if you are a top scorer, submit a report for Final Review (Match contest rule #9).
- Score higher than your competitors!
A total prize purse of up to $150,000 is available for this Challenge. Here is a breakdown:
For the final prize of each match the top participants (1st through 5th place) will be invited via email to the final round of testing, the sequestered phase, by providing their solution in the form of a Dockerfile within 7 days of the end of the submission phase. This Dockerfile will contain their solution (source code) which will be run against the sequestered set of data and ground truth. A set of analytics will be run against the generated privatized data, and the same analytics will be run against the raw dataset. The results of the analytics (clustering, categorization, and linear regression) of the two sets of data will be compared using appropriate similarity scores, and a total accuracy score will be computed that represents the final score for that participant. The top leaders will be required to provide written documents defining their solution and providing a clear, correct proof that their solution satisfies differential privacy (see Match contest rule #9, Final Review).
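The comparison described above (run the same analytic on the raw data and on the privatized data, then score how closely the results agree) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the actual analytics and similarity scores used by the judges are not specified here, and the one-variable least-squares fit, the `similarity` function, and the toy data points are all hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def similarity(raw_points, synthetic_points):
    """Hypothetical score in (0, 1]; 1.0 means identical fitted coefficients."""
    raw_fit = fit_line(raw_points)
    synthetic_fit = fit_line(synthetic_points)
    # Sum of absolute differences between (slope, intercept) pairs.
    distance = sum(abs(r - s) for r, s in zip(raw_fit, synthetic_fit))
    return 1.0 / (1.0 + distance)


# Toy data, invented for illustration only.
raw = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]        # exactly y = 2x + 1
synthetic = [(0, 1.5), (1, 3.0), (2, 5.5), (3, 7.0)]  # a noisy stand-in

print(similarity(raw, raw))
print(similarity(raw, synthetic))
```

A score defined this way rewards synthetic data whose fitted model stays close to the model fitted on the raw data, which is the general idea behind comparing analytics results rather than comparing records directly.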
There are multiple ways for Challenge leaders to win progressive prizes halfway through each match and prior to the final round. Progressive prizes are announced when the Challenge is publicly launched and awarded based on the participants’ placement on the provisional leaderboard at an exact time (approximately halfway through the challenge), and their provisional testing score (not the sequestered results). These prizes are awarded to the Top 4 pre-certified participants from the provisional leaderboard once their code has been inspected. There are 4 progressive prizes available for each marathon match at $1000 each.
An additional prize of $4000 may be awarded to each of the top 5 award winning teams at the end of the final marathon match who agree to provide and do provide their full code solution in an open source repository for use by all interested parties.
| | Match 1 | Match 2 | Match 3 |
|---|---|---|---|
| 1st Place | $10,000 | $15,000 | $25,000 |
| 2nd Place | $7,000 | $10,000 | $15,000 |
| 3rd Place | $5,000 | $5,000 | $10,000 |
| 4th Place | $2,000 | $3,000 | $5,000 |
| 5th Place | $1,000 | $2,000 | $3,000 |
| Progressive Prizes | 4 x $1,000 | 4 x $1,000 | 4 x $1,000 |
| Total | $29,000 | $39,000 | $62,000 |
* An additional prize of $4000 may be awarded to each of the top 5 award winning teams at the end of the final marathon match who agree to provide and do provide their full code solution in an open source repository for use by all interested parties.
Total Prize Purse for Differential Privacy Synthetic Data Challenge: $150,000.
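As a quick check on the arithmetic, the per-match totals and the overall purse can be recomputed from the individual prize amounts listed above. The sketch below uses only figures stated in this section; the variable names are illustrative.

```python
# Final prize amounts in US dollars, 1st through 5th place, per match.
final_prizes = {
    "Match 1": [10_000, 7_000, 5_000, 2_000, 1_000],
    "Match 2": [15_000, 10_000, 5_000, 3_000, 2_000],
    "Match 3": [25_000, 15_000, 10_000, 5_000, 3_000],
}
PROGRESSIVE_PER_MATCH = 4 * 1_000  # four progressive prizes of $1,000 each
OPEN_SOURCE_PRIZES = 5 * 4_000     # up to five additional prizes of $4,000 each

# Each match total is its five final prizes plus its progressive prizes.
match_totals = {
    match: sum(amounts) + PROGRESSIVE_PER_MATCH
    for match, amounts in final_prizes.items()
}
total_purse = sum(match_totals.values()) + OPEN_SOURCE_PRIZES

print(match_totals)
print(total_purse)
```

The three match totals sum to $130,000, and the optional open source prizes account for the remaining $20,000 of the "up to $150,000" purse.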
For questions about the Official Rules or Challenge, contact [email protected] with “Differential Privacy Synthetic Data Challenge” in the subject.
Overview
The total prize purse is up to $150,000. For details of prizes, see the "Prizes" section in the Overview tab.
Match #1 Final Prize Winners
Congratulations to the winners of Match #1! See the Topcoder Match #1 forum for final analysis!
- 1st ($10,000) – Team pfr (jonathanps*); members Jonathan Sculley and Paul Froissart
- 2nd ($7,000) – Team DPSyn (ninghui*); members Ninghui Li, Zhikun Zhang and Tianhao Wang from Purdue University
- 3rd ($5,000) – Team RMcKenna (rmckenna*); member Ryan McKenna
- 4th ($2,000) – Team UCLANESL (manisrivastava*); members Mani Srivastava (UCLA), Moustafa Alzantot (UCLA), Supriyo Chakraborty (IBM Research) and Nathaniel Snyder (UCLA)
- 5th ($1,000) – Team PrivBayes (privbayes*); members Boling Ding, Xiaokui Xiao, Jun Zhao, Ergute Bao and Xuejun Zhao
Congratulations to the 4 Progressive Prize winners of Match #1 (announced mid-point of the match), earning $1,000 each: Team RMcKenna (rmckenna*), Team pfr (jonathanps*), Team DP-D (eceva*), and Team Epsilon-delta (brettbj*)
*Topcoder handle
Match #2 Final Prize Winners
Congratulations to the winners of Match #2! You can read the full winners announcement on the Topcoder site (account login required).
- 1st Place: $15,000 - Team pfr (jonathanps*); members Jonathan Sculley and Paul Froissart
- 2nd Place: $10,000 - Team DPSyn (ninghui*); members Ninghui Li, Zhikun Zhang and Tianhao Wang from Purdue University
- 3rd Place: $5,000 - Team PrivBayes (privbayes*); members Boling Ding, Xiaokui Xiao, Jun Zhao, Ergute Bao and Xuejun Zhao
- 4th Place: $3,000 - Team RMcKenna (rmckenna*); member Ryan McKenna
- 5th Place: $2,000 - Team John Gardner (gardn999*); member John Gardner
Congratulations to the 4 Progressive Prize winners of Match #2 (announced mid-point of the match), earning $1,000 each: Team RMcKenna (rmckenna*), Team pfr (jonathanps*), Team John Gardner (gardn999*), and Team DPSyn (ninghui*)
*Topcoder handle
Match #3 Final Prize Winners
Congratulations to the winners of Match #3! You can read the full winners announcement on the Topcoder forum (account login required). Check back on the forum on May 31st for news about where contestants will post their open source code.
- 1st Place: $25,000 - Ryan McKenna (rmckenna*); member Ryan McKenna
- 2nd Place: $15,000 - Team DPSyn (ninghui*); members Ninghui Li, Zhikun Zhang and Tianhao Wang from Purdue University
- 3rd Place: $10,000 - Team PrivBayes (privbayes*); members Boling Ding, Xiaokui Xiao, Jun Zhao, Ergute Bao and Xuejun Zhao
- 4th Place: $5,000 - Team John Gardner (gardn999*); member John Gardner
- 5th Place: $3,000 - Team UCLANESL (manisrivastava*); members Mani Srivastava (UCLA), Moustafa Alzantot (UCLA), Supriyo Chakraborty (IBM Research) and Nathaniel Snyder (UCLA)
Congratulations to the 4 Progressive Prize winners of Match #3 (announced mid-point of the match), earning $1,000 each: Ryan McKenna (rmckenna*), Team PrivBayes (privbayes*), Team John Gardner (gardn999*), and Team DPSyn (ninghui*)
*Topcoder handle
These rules pertain to the competition hosted on the Topcoder website.
In the event of any discrepancy or inconsistency between the terms and conditions of the official rules located at Challenge.gov and these Guidelines or any other Competition materials, the terms and conditions of the NIST Official Rules on Challenge.gov as specified within the Challenge Specific Agreement shall control.
1. This Challenge will tentatively run from October 31, 2018, to May 6, 2019, and be divided into a sequence of three matches. Each match will introduce new constraints to the scoring function, described later.
2. Upon registration for the challenge, the participants will receive one (1) or more datasets bifurcated into training and testing data in addition to instructions or pseudo code to generate the ground truth for the training set for all matches to date.
3. For match 1 the ground truth will be generated using only Clustering Analysis.
4. For match 2 the ground truth will be generated using Clustering and Classification.
5. For match 3 the ground truth will be generated using Clustering, Classification and Regression Analysis.
6. Inside the JSON submission payload, participants will receive a value of delta from NIST/Topcoder and will submit their synthetic datasets at three different values of epsilon: 0.1, 1, 10. A test harness (shell script and/or Jar file) will be provided to sequentially run the participant solutions with different epsilon values; the harness will concatenate and format the output for submission to Topcoder. Participants with algorithms that produce high-quality privatized synthetic data on smaller values of epsilon will receive higher scores.
7. Upon proper submission, participants will be redirected to the leaderboard where they will see their handle and the new score.
8. Certification Process: All participants will be required to submit a complete written explanation of their algorithm, and a clear, correct mathematical proof that their solution satisfies differential privacy, to a dedicated mailbox. This document will be reviewed by NIST staff or their delegates (who may request clarification or rewriting if documents are unclear or underspecified). Participants will receive “certification” that they have a basic understanding of differential privacy, or a brief explanation why their algorithm or proof are incorrect. An indicator will display on the leaderboard to all participants who have had their approach certified.
- For participants with higher than normal scores for their specified epsilon and delta, the participant algorithm and code must be provided as often as requested by NIST staff or their delegates who will review for non-intentional mistakes. This is a courtesy to ensure the participant still qualifies for prize eligibility.
- Certification status will be used in determining the provisional score for entries.
- Only certified approaches are eligible for cash prizes (see rule #9, Final Review).
9. Final Review: Before awarding cash prizes to participants who place in prize-winning ranks at the end of each Marathon Match, these participants must submit a report containing their complete algorithm description and mathematical privacy proof, along with their source code. They will be given 7 days' notice to provide this report. This report will be subject to a final review to verify that their algorithm correctly satisfies differential privacy at the stated values of delta and epsilon, and that the code correctly implements the stated algorithm. The review will be performed by Topcoder and NIST staff or their delegates, which may include NIST-certified subject matter experts from outside NIST. If a participant is placed in a prize-winning rank but fails to provide the report, or the review determines that their solution does not satisfy differential privacy, then they will not receive a prize, and it will be awarded to the participant with the next best performance who successfully completes the final review.
10. An indicator will be present on the leaderboard to identify submissions with intention to publicly share their solutions at the end of the final match. Having this indicator will not increase or decrease participants’ chances of winning a prize.
11. Relinquish - Topcoder/NIST is allowing registered participants or teams to “relinquish”. Government employees, students or any persons who are not allowed to take monetary prizes or feel they may have a conflict of interest are welcome to participate in the challenge while agreeing to forgo the monetary prizes (by responding to the Relinquish post in the challenge forum). Relinquish means the member will compete and their solutions will be scored, but they will not be eligible for a prize. Once a person or team relinquishes, we post their name to a forum thread labeled “Relinquished Competitors”. Members who relinquish must submit their implementation code and methods to maintain leaderboard status. Winners who are identified as relinquished will still be publicly recognized in the final winner announcements based on their placement on the leaderboard, and the prize award normally allotted to the placement of the relinquished team will be given to the next certified team on the leaderboard. Throughout the challenge, the challenge’s online leaderboard will display rankings and accomplishments, giving participants various opportunities to have their work viewed and appreciated by stakeholders from industry, government and academic communities.
12. All participants must register on the challenge platform, hosted by Topcoder as members. If they are already members, they must be in good standing (not banned or suspended).
13. Teaming is allowed: participants are permitted to form teams for this competition. After forming a team, participants of the same team are permitted to collaborate with each other. To form a team, a Topcoder member may recruit other Topcoder members, and register the team by completing the Topcoder Teaming Form for this challenge. Each team must declare a Captain. All participants in a team must individually register for this Competition and accept Topcoder's Terms and Conditions prior to joining the team. Team Captains must apportion prize distribution percentages for each teammate on the Teaming Form. The sum of all prize portions must equal 100%. The minimum permitted size of a team is 1 member; the maximum permitted team size is 5 members. Only team Captains may submit a solution to the Competition. Topcoder members participating in a team will not receive a rating for this Competition. Notwithstanding Topcoder rules and conditions to the contrary, solutions submitted by any Topcoder member who is a member of a team but is not the Captain of the team may be deleted and are ineligible for award. The deadline for forming teams is 11:59pm ET on the 21st day following the date that Registration and Submission opens as shown on the Challenge Details page for each match. Topcoder will prepare a Teaming Agreement for each team that has completed the Topcoder Teaming Form and distribute it to each member of the team. Teaming Agreements must be electronically signed by each team member to be considered valid. All Teaming Agreements are void, unless electronically signed by all team members by 11:59pm ET of the 28th day following the date that Registration & Submission opens as shown on the Challenge Details page. Any Teaming Agreement received after this period is void. Teaming Agreements may not be changed in any way after signature. The registered teams will be listed in the contest forum thread titled “Registered Teams”.
14. Organizations such as companies may compete as one competitor if they are registered as a team and follow all challenge and platform rules.
15. In this match participants may use any programming language and libraries, including commercial solutions, provided Topcoder/NIST are able to run it free of any charge. You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by NIST. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see “Ways to Win: Final Prizes” section). Submissions will be deleted/destroyed after they are confirmed. Topcoder/NIST will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run free of cost, and with all necessary licenses pre-installed in your solution. Topcoder/NIST are not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.
16. If the solution includes licensed software (e.g., commercial software or open source software), participants must include the full license agreements with the submission. Include licenses in a folder labeled “Licenses”. Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution.
17. External datasets and pre-trained models are allowed for use in the competition provided the following are satisfied: (a) the external data and pre-trained models are unencumbered by legal restrictions that conflict with their use in the competition; and (b) the data source, or the data used to train the pre-trained models, is defined in the submission description.
PLEASE READ THIS CAREFULLY! You (“Innovator” or “Participant”) and NIST (“Challenge Sponsor”) are entering into this Challenge-Specific Agreement (“CSA”) for this particular incentive-based competition (“Challenge”) only. In order to participate in this Challenge, Innovator must accept these terms, and therefore should take the time to understand them. This CSA includes the NIST Official Rules.
1. If Innovator clicks "Accept" and proceeds to register for this Challenge, this CSA will be a valid and binding agreement between Innovator and Challenge Sponsor, and is in addition to the existing Topcoder Terms of Use for all purposes relating to this Challenge. Innovator should print and keep a copy of this CSA. No provisions that Innovator may have agreed to that are specific to any other individual challenge will apply.
In the event of any discrepancy or inconsistency between the terms and conditions of the official rules and disclosures or other statements contained in any Competition materials, including but not limited to the Competition submission form, Competition website and use terms, Topcoder terms of participation, advertising (including but not limited to television, print, radio or online ads), the terms and conditions of the NIST Official Rules on Challenge.gov website as specified within this Challenge Specific Agreement shall control.
2. America COMPETES Reauthorization Act of 2010: All challenge and prize competitions shall be performed in accordance with the America COMPETES Reauthorization Act of 2010, Pub. Law 111-358, title I, § 105(a), Jan. 4, 2011, as amended, codified at 15 U.S.C. § 3719 (hereinafter “America COMPETES Act”).
3. Eligibility: Each Competition Participant (individual, team, or legal entity) is required to register on the Topcoder NIST Differential Privacy Synthetic Data Challenge [Unlinkable Data Challenge-Stage 2] website. There shall be one Official Representative for each Competition Participant. The Official Representative must provide a username (which may serve as a team or affiliation name) and an email address, and must affirm that he/she has read and consents to be governed by the Competition Rules. At NIST’s discretion, any violation of this rule will be grounds for disqualification from the Competition. Multiple individuals and/or legal entities may collaborate as a team to submit a single entry, in which case the designated Official Representative will be responsible for meeting all entry and evaluation requirements. Participation is subject to all U.S. federal, state and local laws and regulations. Participants, including individuals and private entities, must not have been convicted of a felony criminal violation under any Federal law within the preceding 24 months and must not have any unpaid Federal tax liability that has been assessed, for which all judicial and administrative remedies have been exhausted or have lapsed, and that is not being paid in a timely manner pursuant to an agreement with the authority responsible for collecting the tax liability. Participants must not be suspended, debarred, or otherwise excluded from doing business with the Federal Government. Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity. Any other individuals or legal entities involved with the design, production, execution, distribution or evaluation of the NIST Differential Privacy Synthetic Data Challenge are not eligible to participate.
To be eligible for a cash prize:
a. A Participant (whether an individual, team, or legal entity) must have registered to participate and complied with all the requirements under section 3719 of title 15, United States Code as contained herein.
b. At the time of Entry, the Official Representative (individual or team lead, in the case of a group project) must be age 18 or older and a U.S. citizen or permanent resident of the United States or its territories.
c. In the case of a private entity, the business shall be incorporated in and maintain a primary place of business in the United States or its territories.
d. Participants may not be a Federal entity or Federal employee acting within the scope of their employment. NIST employees are not eligible to participate. Non-NIST Federal employees acting in their personal capacities should consult with their respective agency ethics officials to determine whether their participation in this Competition is permissible.
e. A Participant shall not be deemed ineligible because the Participant consulted with Federal employees or used Federal facilities in preparing its submission to the NIST Differential Privacy Synthetic Data Challenge prize competition if the Federal employees and facilities are made available to all Participants on an equitable basis.
4. Submissions: By participating in this Challenge, Innovator may submit to Challenge Sponsor submission materials (“Submission”), as outlined in these NIST Official Rules on Challenge.gov and the Challenge Guidelines specific to this Challenge on Topcoder.com. By submitting a Submission, Innovator thereby agrees to provide reasonable assistance and additional information concerning the Submission to Challenge Sponsor, if requested.
5. Warranties: By submitting an Entry, the Participant represents and warrants that all information submitted is true and complete to the best of the Participant’s knowledge, that the Participant has the right and authority to submit the Entry on the Participant’s own behalf or on behalf of the persons and entities that the Participant specifies within the Entry, and that the Entry (both the information and materials submitted in the Entry and the underlying technology/method/idea/treatment protocol/solution described in the Entry):
a. Is the Participant’s own original work, or is submitted by permission with full and proper credit given within the Entry;
b. Does not contain trade secrets (the Participant’s or anyone else’s);
c. Does not knowingly violate or infringe upon the patent rights, industrial design rights, copyrights, trademarks, rights of privacy, publicity or other intellectual property or other rights of any person or entity;
d. Does not contain malicious code, such as viruses, malware, timebombs, cancelbots, worms, Trojan horses or other potentially harmful programs or other material or information;
e. Does not and will not violate any applicable law, statute, ordinance, rule or regulation, including, without limitation, United States export laws and regulations, including but not limited to, the International Traffic in Arms Regulations and the Department of Commerce Export Regulations; and
f. Does not trigger any reporting or royalty or other obligation to any third party.
6. Intellectual Property: Any applicable intellectual property rights to an Entry will remain with the Participant. By participating in the prize competition, the Participant is not granting any rights in any patents, pending patent applications, or copyrights related to the technology described in the Entry. However, by submitting an Entry, the Participant is granting NIST, NASA, and any parties acting on their behalf certain limited rights as set forth herein.
a. By submitting an Entry, the Participant grants to NIST, NASA, and any parties acting on their behalf the right to review the Entry, to describe the Entry in any materials created in connection with this competition, and to screen and evaluate the Entry. NIST and NASA, and any parties acting on their behalf will also have the right to publicize Participant’s name and, as applicable, the names of Participant’s team members and/or Organization which participated in submitting the Entry following the conclusion of the Competition.
b. As part of its submission, the Participant must provide written consent granting NIST, NASA, and any parties acting on their behalf, a royalty-free, non-exclusive, irrevocable, worldwide license to display publicly and use for promotional purposes the Participant’s entry (“demonstration license”). This demonstration license includes posting or linking to the Participant’s entry on NIST and NASA’s websites, including the Competition Website, and partner websites, and inclusion of the Participant’s Entry in any other media, worldwide.
7. Trade Secret Information: By making a submission to this prize competition, the Participant agrees that no part of its submission includes any Trade Secret information, ideas or products. All submissions to this prize competition are deemed non-proprietary. Since NIST does not wish to receive or hold any submitted materials “in confidence” it is agreed that, with respect to the Participant’s Entry, no confidential or fiduciary relationship or obligation of secrecy is established between NIST, NASA, or any parties acting on their behalf and the Participant, the Participant’s team, or the company or institution the Participant represents when submitting an Entry, or any other person or entity associated with any part of the Participant’s Entry.
8. Liability: Participants shall agree to assume any and all risks and waive claims against the Federal Government and its related entities, except in the case of willful misconduct, for any injury, death, damage, or loss of property, revenue, or profits, whether direct, indirect, or consequential, arising from participation in this prize competition, whether the injury, death, damage, or loss arises through negligence or otherwise.
9. Insurance: Participants are not required to obtain liability insurance for this Competition.
10. Indemnification: Participants shall agree to indemnify the Federal Government against third party claims for damages arising from or related to Challenge activities.
11. Changes and Cancellation: Challenge Sponsor has the right to make updates and/or make any changes at any time during the Challenge. Innovators are responsible for regularly reviewing the official rules on Challenge.gov website and updates on the Topcoder site to ensure they are meeting all rules and requirements of the Challenge. Challenge Sponsor has the right to cancel the Challenge at any time, without warning or explanation, and to subsequently remove the Prize completely.
12. Payments: The prize competition winners will be paid prizes directly from NIST. Prior to payment, winners will be required to verify eligibility. The verification process with the agency includes providing the full legal name, tax identification number or social security number, routing number and banking account to which the prize money can be deposited directly.
13. Existing Laws: The Federal Government shall not, by virtue of conducting this prize competition, be responsible for compliance by Participants in the prize competition with Federal law, including licensing, export control, and nonproliferation laws, and related regulations.
Participation is subject to all U.S. federal, state and local laws and regulations. Participants are responsible for checking applicable laws and regulations in their jurisdiction(s) before participating in the prize competition to ensure that their participation is legal. Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity.
14. Registration and Submissions: Submissions must be made online only, via upload to the Topcoder website, at the specific dates and times listed in the Summary of Important Dates on the Challenge.gov website and the Topcoder site. No late submissions will be accepted.
15. Selection of Winners: Based on the winning criteria, prizes will be awarded per the Judging Criteria section in the Challenge Guidelines. In the case of a tie, the winner(s) will be the tied entry or entries receiving the most votes from the judges.
16. Judging: The final determination of the winners will be made at the sole discretion of NIST. Scores and feedback from NIST will not be shared.
17. Progressive Prize Awards: There are multiple ways for Challenge leaders to win progressive prizes halfway through each marathon match and prior to the final round. Progressive prizes are announced when the Challenge is publicly launched and are awarded based on each participant's placement on the provisional leaderboard at an exact time (approximately halfway through the challenge) and on their provisional testing score (not the sequestered results). These prizes are awarded to the top 4 pre-certified participants from the provisional leaderboard once their code has been inspected. There are 4 progressive prizes available for each marathon match, at $1,000 each.
a. A Marathon Match is defined as a long-running crowdsourcing challenge, using the Topcoder website, where participants solve complex algorithms on large data sets. Participants submit their software code to create and optimize their algorithm.
b. A Leaderboard is an online, public display of the competition on the Topcoder website. Each marathon match will have a leaderboard where participants will receive immediate feedback about the quality of their submission. Participants can repeatedly improve their score and validate the capabilities of their algorithm by viewing the leaderboard. The leaderboard will display each participant's handle, ranking, and provisional score at any given point in time. Additionally, the leaderboard will identify those teams or individuals who agree to make their submission (code) publicly available at the conclusion of this challenge; this is optional and not required to participate.
18. Additional Prize Awards: Additional funds of up to $20,000 are available to the NIST Judge panel in the form of Additional Prizes, assessed at the end of the final marathon match. An additional prize of $4,000 may be awarded to each of the top 5 award-winning teams who agree to provide, and do provide, their full code solution in an open source repository for use by all interested parties.
19. Privacy Advisory: The Topcoder.com website is hosted by a private entity and is not a service of NIST. The solicitation and collection of your personal or individually identifiable information is subject to the host’s privacy and security policies, and that information will not be shared with NIST unless you win the Challenge. Challenge winners’ personally identifiable information must be made available to NIST in order to collect an award.
Judging and Scoring for Final Prizes
NIST staff or their delegates will review entries based on the Certification Process and Final Review Process. A submission that fails to meet the compliance criteria will be ineligible to win prizes in this contest. Submissions that pass the final compliance review will be scored based on accuracy during a sequestered evaluation: the submitter's code will be run against a blind (sequestered) dataset and similarity metrics will be computed between analytics results on the synthetic privatized set and the original dataset.
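As a rough illustration of what "similarity metrics between analytics results" can mean, the sketch below compares the distribution of a single column in an original dataset and a synthetic one. It is a minimal, hypothetical example only: the function names, the toy records, and the choice of total variation distance are assumptions made for illustration and are not the scoring method used in this challenge, which is defined on the Topcoder website.

```python
from collections import Counter

def marginal(records, column):
    """Empirical distribution of one column as {value: fraction of records}."""
    counts = Counter(row[column] for row in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

def similarity(original, synthetic, column):
    """Return 1 minus the total variation distance between the two marginals.

    1.0 means the column is distributed identically in both datasets;
    0.0 means the two datasets share no values in that column.
    This metric is an illustrative assumption, not the official score.
    """
    p = marginal(original, column)
    q = marginal(synthetic, column)
    values = set(p) | set(q)
    tvd = 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in values)
    return 1.0 - tvd

# Toy data (hypothetical): four records with a single categorical column.
original = [{"age": "18-34"}, {"age": "18-34"}, {"age": "35-64"}, {"age": "65+"}]
synthetic = [{"age": "18-34"}, {"age": "35-64"}, {"age": "35-64"}, {"age": "65+"}]

score = similarity(original, synthetic, "age")
print(score)
```

In this toy case one of the four records falls in a different age group, so the marginals differ and the score drops below 1.0. A real evaluation would typically combine many such comparisons across columns and queries.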
All submissions must be made on the Topcoder website. All contest details, including the timeline and marathon match details, can be found on the Topcoder website.