---
title: Wikipedia Network Analysis
layout: page
category: demo
order: 3
---
First, you need to download and extract the following two files from a preprocessed version of the Wikipedia links dataset.
Add their paths to `config.toml` as `wiki-links` and `wiki-titles`.
Then, run the following command:

{% highlight bash %}
./wiki-page-rank config.toml
{% endhighlight %}
Note that the calculations using this dataset require about 7.5 GB of memory.
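The listings that follow pair page titles with scores. As a rough illustration of how such scores can arise, here is a minimal power-iteration sketch of PageRank on a toy graph. It is not the implementation behind `wiki-page-rank`: the function name, the damping factor of 0.85, and the `seed` option (which sends every random jump to one page, a variant often called personalized PageRank) are all assumptions made for illustration. The sketch normalizes scores to sum to 1, so it does not attempt to reproduce the scale of the numbers shown on this page.

```python
def page_rank(links, damping=0.85, iterations=100, seed=None):
    """Rank pages by power iteration over {page: [pages it links to]}.

    With seed=None, a random jump lands on any page uniformly.
    With seed set, every jump returns to that page (a personalized variant).
    Returns (page, score) pairs sorted by descending score.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    if seed is None:
        jump = {p: 1.0 / n for p in pages}
    else:
        jump = {p: 1.0 if p == seed else 0.0 for p in pages}

    rank = dict(jump)
    for _ in range(iterations):
        # Every page first receives its share of the random jumps.
        nxt = {p: (1.0 - damping) * jump[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling page: hand its rank back via the jump distribution.
                for p in pages:
                    nxt[p] += damping * rank[page] * jump[p]
        rank = nxt
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)


# Toy graph with hypothetical page names.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
global_ranks = page_rank(toy_links)
seeded_ranks = page_rank(toy_links, seed="D")
```

In the toy graph, passing `seed="D"` raises the score of `D` relative to the uniform run, because all random jumps now land there.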
{% highlight bash %}
- United_States 0.00220614
- 2007 0.00140843
- 2008 0.00135685
- Geographic_coordinate_system 0.00125164
- United_Kingdom 0.00100978
- 2006 0.000865724
- France 0.000732849
- Wikimedia_Commons 0.0007239
- Wiktionary 0.000658065
- Canada 0.000648715
- 2005 0.000616681
- England 0.000603392
- Biography 0.000599893
- Germany 0.000584397
- United_States_postal_abbreviations 0.00055105
- Australia 0.000528317
- English_language 0.000518112
- World_War_II 0.000506347
- Japan 0.00048433
- List_of_U._S._postal_abbreviations 0.000469503
- Europe 0.000463587
- India 0.000449639
- 2004 0.000436032
- Italy 0.00040331
- Music_genre 0.000397291
{% endhighlight %}
{% highlight bash %}
- Computer_science 5.86886e+06
- Computer 59995
- Programming_language 42832
- Computational_complexity_theory 41272
- Human-computer_interaction 38284
- Algorithm 38085
- Category_theory 37330
- Data_structure 37262
- Quantum_computer 36760
- Cognitive_science 36476
- Type_theory 36325
- Computational_science 35673
- David_Kahn 35494
- Microarchitecture 35413
- Mathematics 33997
- Compiler 31904
- 2008 31262
- 2006 29557
- Digital_object_identifier 28396
- Software_engineering 28217
- Artificial_intelligence 27915
- Computer_programming 27734
- Operating_system 26752
- Association_for_Computing_Machinery 26725
- Information_systems 26389
{% endhighlight %}
{% highlight bash %}
- Bill_Gates 5.85817e+06
- United_States 38556
- 2007 34182
- Microsoft 32890
- 2008 32463
- BASIC 28863
- Seattle 28858
- Altair_8800 28635
- 2006 28191
- Time_Person_of_the_Year 27929
- Philanthropy 27777
- United_States_Microsoft_antitrust_case 26578
- 2005 22684
- Operating_system 22267
- Steve_Ballmer 21174
- Executive_officer 20650
- IBM 20511
- Ray_Ozzie 20459
- 2004 20233
- Jeff_Raikes 20171
- Richard_Rashid 20166
- Craig_Mundie 19996
- United_States_dollar 19975
- Brian_Kevin_Turner 19854
- Steven_Sinofsky 19838
{% endhighlight %}
{% highlight bash %}
- University_of_Illinois_at_Urbana-Champaign 5.87068e+06
- United_States 38425
- Illinois_Fighting_Illini 32101
- UIUC_College_of_Engineering 28506
- 2007 27630
- UIUC_campus 26562
- Illinois_Fighting_Illini_football 25158
- Beckman_Institute 25084
- Ohio_State_University 24697
- Big_Ten_Conference 24235
- University_of_Wisconsin-Madison 23458
- Harvard_University 22926
- U.S._News_&_World_Report 22882
- Stanford_University 22729
- Illinois 22679
- UIUC_Main_Campus 22654
- Illinois_Fighting_Illini_men's_basketball 22000
- Massachusetts_Institute_of_Technology 21995
- Yale_University 21961
- Chief_Illiniwek 21694
- College_and_university_rankings 21567
- National_Center_for_Supercomputing_Applications 21542
- 2006 21217
- Marching_Illini 21132
- Kenney_Gym 21125
{% endhighlight %}
{% highlight bash %}
- Pizza 5.86632e+06
- United_States 51673
- Sicilian_pizza 38670
- Pissaladière 37739
- Bread 37418
- Tarte_flambée 34569
- Italy 33079
- Ricotta 31228
- Olive_oil 31101
- Farinata 30548
- Yeast 30492
- Chicago-style_pizza 29426
- Wikimedia_Commons 29343
- Pita 29341
- Crème_fraîche 29097
- Masonry_oven 28999
- Calzone 28429
- Naan 28414
- Paratha 28092
- Quesadilla 27985
- Green_onion_pancake 27889
- New_Haven-style_pizza 27786
- Focaccia 27617
- History_of_pizza 27600
- Pizza_delivery 27596
{% endhighlight %}
{% highlight bash %}
- Beer 5.86645e+06
- Pale_lager 31153
- Yeast 28062
- Alcoholic_beverage 26486
- Belgian_beer 26128
- Brewery 24892
- United_States 23183
- Alcohol_by_volume 21898
- Schnapps 21772
- Stout 21684
- Brewing 21470
- 2008 20995
- Wheat_beer 20959
- Beer_in_the_United_States 20899
- 2007 20413
- Malt 20279
- Lambic 19963
- Beer_in_Denmark 19640
- List_of_countries_by_beer_consumption_per_capita 19563
- German_beer 19470
- American_beer 19446
- Beer_in_Africa 19417
- Beer_in_Ireland 19279
- Cask_ale 19191
- Beer_in_Poland 19141
{% endhighlight %}