HIP-WG · eightysteele · Oct 21, 2012
diff --git a/doc/bmc_article/bmc_article.tex b/doc/bmc_article/bmc_article.tex
@@ -123,7 +123,7 @@
        \email{Rutger A Vos\correspondingauthor - [email protected]}%
       \and
          Aaron Steele\correspondingauthor$^2$%
-         \email{Aaron Steele\correspondingauthor - [email protected]}
+         \email{Aaron Steele\correspondingauthor - [email protected]}
       }
 
 
@@ -135,7 +135,7 @@
 
 \address{%
     \iid(1)Naturalis Biodiversity Center, Einsteinweg 2, Leiden, the Netherlands\\
-    \iid(2)UC Berkeley, Berkeley, USA
+    \iid(2)University of California Berkeley, Berkeley, USA
 }%
 
 \maketitle
@@ -329,7 +329,7 @@ \section*{Results and Discussion}
 		need to be scalars we concatenate the keys with | and the values with , 
 		(for example). Here's the result we would then emit:
 
-		 A       => 1,1 # the first integer is the node ID, the second its tip count
+		 A       => 1,1 % the first integer is the node ID, the second its tip count
 		 C       => 2,1
 		 A|C     => 3,2
 		 A|C     => 4,2
@@ -363,10 +363,32 @@ \section*{Results and Discussion}
   - performance
 
   % this describes at a high level Aaron's code
-  \subsection*{Name of the Clojure implementation}
-  - using the clojure implementation
-  - web front-end
-  - performance
+  \subsection*{Clojure}
+
+  Our implementation is written in Clojure, a dynamic programming language 
+  that compiles to bytecode and runs on the Java Virtual Machine. Clojure can 
+  call Java frameworks such as Apache Hadoop directly, which makes it well 
+  suited to implementing distributed MapReduce algorithms efficiently. The 
+  implementation also uses Cascalog, a high-performance data processing 
+  library for querying ``Big Data'' on Hadoop, either on a cluster or on a 
+  local machine from the interactive Clojure REPL.
+
+  \subsubsection*{Implementation details}
+
+  As input, our implementation takes two files: the phylogenetic tree, which 
+  has been transformed and labelled in a post-order traversal from the tips 
+  to the root, and a file containing the tips from which to prune. The output 
+  is the taxon bipartition table described above. The algorithm first maps 
+  each node to its tip, then combines and merges the resulting tips to build 
+  the final bipartition table. The MapReduce job can be launched from the 
+  command line on a Hadoop cluster or interactively from the REPL.
+
+  \subsubsection*{Runtime performance}
+
+  Here we will give a brief overview of how performance improves as the input 
+  data grow, since the Hadoop overhead is eclipsed. We should also mention 
+  that combining other sources of Big Data, such as spatial data via GADM 
+  native Java bindings, taxonomy synonyms, etc., could be done much faster.
 
 %%%%%%%%%%%%%%%%%%%%%%
 \section*{Conclusions}
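
Note on the "Implementation details" paragraph added in this diff: the map-then-combine step it describes can be illustrated with a small single-machine sketch. This is not the authors' Clojure/Cascalog code. It is a plain Python illustration under stated assumptions: the tree is given as a child-to-parent dictionary whose node IDs follow a post-order labelling (so every child ID is smaller than its parent ID), and the second input is a set of tip labels to retain. The function name `bipartition_table` and its parameters are hypothetical. Rows use the `key => value` convention shown earlier in the article: tip labels joined with `|`, and node ID and tip count joined with `,`.

```python
from collections import defaultdict


def bipartition_table(parent, tip_label, keep):
    """Build (key, value) rows for a pruned tree.

    parent    -- dict mapping child node ID to parent node ID; IDs are
                 assumed to follow a post-order labelling (child < parent)
    tip_label -- dict mapping tip node ID to its taxon label
    keep      -- set of taxon labels to retain (assumed pruning input)
    """
    tips_below = defaultdict(set)

    # "Map" step: every retained tip is mapped to its own label.
    for node_id, label in tip_label.items():
        if label in keep:
            tips_below[node_id].add(label)

    # "Combine" step: visiting nodes in ascending ID order means each
    # child is handled before its parent, so tip sets bubble up to the root.
    all_nodes = set(parent) | set(parent.values()) | set(tip_label)
    for node_id in sorted(all_nodes):
        if node_id in parent and tips_below[node_id]:
            tips_below[parent[node_id]] |= tips_below[node_id]

    # Emit one row per node that still subtends at least one retained tip.
    rows = []
    for node_id in sorted(all_nodes):
        tips = sorted(tips_below[node_id])
        if tips:
            rows.append(("|".join(tips), f"{node_id},{len(tips)}"))
    return rows


if __name__ == "__main__":
    # Hypothetical tree ((A,B),C) with post-order IDs A=1, B=2, (A,B)=3, C=4, root=5.
    parent = {1: 3, 2: 3, 3: 5, 4: 5}
    tip_label = {1: "A", 2: "B", 4: "C"}
    for key, value in bipartition_table(parent, tip_label, {"A", "C"}):
        print(f"{key:<8}=> {value}")
```

In this toy input, tip B is pruned, so nodes 1 and 3 both end up with the key `A`. Rows that share a key are the ones a later reduce step could merge; how the actual implementation resolves them is not specified here.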