
Agg comment and file cleanup
polyfractal authored and clintongormley committed Jan 5, 2015
1 parent ead31c9 commit 1731b70
Showing 8 changed files with 24 additions and 87 deletions.
5 changes: 2 additions & 3 deletions 03_Aggregations.asciidoc
@@ -33,9 +33,8 @@ _near real-time_, just like search.
This is extremely powerful for reporting and dashboards. Instead of performing
_rollups_ of your data (_that crusty Hadoop job that takes a week to run_),
you can visualize your data in real time, allowing you to respond immediately.

-// Perhaps mention "not precalculated, out of date, and irrelevant"?
-// Perhaps "aggs are calculated in the context of the user's search, so you're not showing them that you have 10 4 star hotels on your site, but that you have 10 4 star hotels that *match their criteria*".
+Your report changes as your data changes, rather than being pre-calculated, out of
+date and irrelevant.

Finally, aggregations operate alongside search requests.((("aggregations", "operating alongside search requests"))) This means you can
both search/filter documents _and_ perform analytics at the same time, on the
13 changes: 0 additions & 13 deletions 300_Aggregations/10_facets.asciidoc

This file was deleted.

30 changes: 20 additions & 10 deletions 300_Aggregations/20_basic_example.asciidoc
@@ -6,6 +6,13 @@ and their syntax,((("aggregations", "basic example", id="ix_basicex"))) but aggr
Once you learn how to think about aggregations, and how to nest them appropriately,
the syntax is fairly trivial.

+[NOTE]
+=========================
+A complete list of aggregation buckets and metrics can be found at the http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html[online
+reference documentation]. We'll cover many of them in this chapter, but glance
+over it after finishing so you are familiar with the full range of capabilities.
+=========================

So let's just dive in and start with an example. We are going to build some
aggregations that might be useful to a car dealer. Our data will be about car
transactions: the car model, manufacturer, sale price, when it sold, and more.
@@ -40,32 +47,35 @@ using a simple aggregation. We will do this using a `terms` bucket:

[source,js]
--------------------------------------------------
-GET /cars/transactions/_search?search_type=count <1>
+GET /cars/transactions/_search?search_type=count
{
-    "aggs" : { <2>
-        "colors" : { <3>
+    "aggs" : { <1>
+        "colors" : { <2>
            "terms" : {
-                "field" : "color" <4>
+                "field" : "color" <3>
            }
        }
    }
}
--------------------------------------------------
// SENSE: 300_Aggregations/20_basic_example.json

-// Add the search_type=count thing as a sidebar, so it doesn't get in the way
-<1> Because we don't care about search results, we are going to use the `count`
-<<search-type,search_type>>, which((("count search type"))) will be faster.
-<2> Aggregations are placed under the ((("aggs parameter")))top-level `aggs` parameter (the longer `aggregations`
+<1> Aggregations are placed under the ((("aggs parameter")))top-level `aggs` parameter (the longer `aggregations`
will also work if you prefer that).
-<3> We then name the aggregation whatever we want: `colors`, in this example
-<4> Finally, we define a single bucket of type `terms`.
+<2> We then name the aggregation whatever we want: `colors`, in this example
+<3> Finally, we define a single bucket of type `terms`.

Aggregations are executed in the context of search results,((("searching", "aggregations executed in context of search results"))) which means it is
just another top-level parameter in a search request (for example, using the `/_search`
endpoint). Aggregations can be paired with queries, but we'll tackle that later
in <<_scoping_aggregations>>.
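To build intuition for what the `terms` bucket in the request above computes, here is a client-side sketch in Python. It is illustrative only, not from the commit, and the documents are invented: a `terms` bucket is conceptually a tally of the unique values of a field, one bucket per value, each with a document count.

```python
from collections import Counter

# Hypothetical documents standing in for the indexed car transactions
docs = [
    {"color": "red"}, {"color": "blue"}, {"color": "green"},
    {"color": "blue"}, {"color": "red"}, {"color": "red"},
]

# A terms bucket over "color" is conceptually a per-value tally:
# each unique color becomes a bucket with a count of matching documents.
colors = Counter(doc["color"] for doc in docs)

for value, count in colors.most_common():
    print(value, count)
```

Elasticsearch computes this distributed across shards, of course; the sketch only mirrors the shape of the result (one bucket per unique term, each carrying a document count).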

+[NOTE]
+=========================
+You'll notice that we used the `count` <<search-type,search_type>>.((("count search type")))
+Because we don't care about search results -- just the aggregation totals -- the
+`count` search_type will be faster because it omits the fetch phase.
+=========================

Next we define a name for our aggregation. Naming is up to you;
the response will be labeled with the name you provide so that your application
1 change: 0 additions & 1 deletion 300_Aggregations/21_add_metric.asciidoc
@@ -82,7 +82,6 @@ and what field we want the average to be calculated on (`price`):
--------------------------------------------------
<1> New `avg_price` element in response

-// Would love to have a graph under each example showing how the data can be displayed (later, i know)
Although the response has changed minimally, the data we get out of it has grown
substantially.((("avg_price metric (example)"))) Before, we knew there were four red cars. Now we know that the
average price of red cars is $32,500. This is something that you can plug directly
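The arithmetic behind the `avg` metric is simply a mean over the documents in the enclosing bucket. A minimal sketch; the four prices are hypothetical, chosen only so the average matches the $32,500 figure in the text:

```python
# Four hypothetical red-car sale prices: 130000 / 4 = 32500
red_prices = [10000, 20000, 30000, 70000]

# The avg metric is the arithmetic mean of the field within the bucket
avg_price = sum(red_prices) / len(red_prices)
print(avg_price)  # 32500.0
```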
50 changes: 0 additions & 50 deletions 300_Aggregations/28_bucket_metric_list.asciidoc

This file was deleted.

6 changes: 1 addition & 5 deletions 300_Aggregations/30_histogram.asciidoc
@@ -12,9 +12,6 @@ undoubtedly had a few bar charts in it. The histogram works by specifying an interval. If we were histogramming sale
prices, you might specify an interval of 20,000. This would create a new bucket
every $20,000. Documents are then sorted into buckets.

-// Perhaps "demonstrate" that a car of 28,000 gets dropped into the "20,000" bucket, while a car of 15,000 gets dropped into the "0" bucket
-// Delete "Just like the ...."

For our dashboard, we want a bar chart of car sale prices, but we
also want to know the top-selling make per price range. This is easily accomplished
using a `terms` bucket ((("terms bucket", "nested in a histogram bucket")))((("buckets", "nested in other buckets", "terms bucket nested in histogram bucket")))nested inside the `histogram`:
@@ -48,11 +45,10 @@ interval that defines the bucket size.
<2> A `terms` bucket is nested inside each price range, which will show us the
top make per price range.

-// Make the point that the upper limit is exclusive
As you can see, our query is built around the `price` aggregation, which contains
a `histogram` bucket. This bucket requires a numeric field to calculate
buckets on, and an interval size. The interval defines how "wide" each bucket
-is. An interval of 20000 means we will have the ranges `[0-20000, 20000-40000, ...]`.
+is. An interval of 20000 means we will have the ranges `[0-19999, 20000-39999, ...]`.
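The deleted review comment above asked for a demonstration of this bucketing rule, so here is one as a Python sketch (the transactions are invented): a document's bucket key is its value rounded down to the nearest multiple of the interval, and a size-1 terms tally nested inside each bucket yields the top make per price range.

```python
from collections import Counter, defaultdict

# Hypothetical car transactions
sales = [
    {"price": 15000, "make": "honda"},
    {"price": 28000, "make": "ford"},
    {"price": 25000, "make": "ford"},
    {"price": 80000, "make": "bmw"},
]

interval = 20000
buckets = defaultdict(Counter)

for sale in sales:
    # Histogram key: price rounded down to a multiple of the interval,
    # so 28000 lands in the 20000 bucket and 15000 lands in the 0 bucket
    # (each bucket's upper limit is exclusive).
    key = (sale["price"] // interval) * interval
    buckets[key][sale["make"]] += 1

# Nested terms bucket with size=1: the single most common make per range
for key in sorted(buckets):
    make, count = buckets[key].most_common(1)[0]
    print(key, make, count)
```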

Next, we define a nested bucket inside the histogram. This is a `terms` bucket
over the `make` field. There is also a new `size` parameter, which defines the number of terms we want to generate. A `size` of `1` means we want only the top make
2 changes: 0 additions & 2 deletions 300_Aggregations/40_scope.asciidoc
@@ -137,8 +137,6 @@ by adding a search bar.((("dashboards", "adding a search bar"))) This allows th
of the graphs (which are powered by aggregations, and thus scoped to the query)
update in real time. Try that with Hadoop!

-//<TODO> Maybe add two screenshots of a Kibana dashboard that changes considerably

[float]
=== Global Bucket

4 changes: 1 addition & 3 deletions 302_Example_Walkthrough.asciidoc
@@ -5,6 +5,4 @@ include::300_Aggregations/21_add_metric.asciidoc[]

include::300_Aggregations/22_nested_bucket.asciidoc[]

-include::300_Aggregations/23_extra_metrics.asciidoc[]
-
-include::300_Aggregations/28_bucket_metric_list.asciidoc[]
+include::300_Aggregations/23_extra_metrics.asciidoc[]
