First round of phase 2 changes to sync up with version 2.x.

zhaofanfan2019 · Dec 17, 2015 · e1688d9
1 parent 25c4adc
commit e1688d9
Showing 32 changed files with 212 additions and 167 deletions.
diff --git a/010_Intro/10_Installing_ES.asciidoc b/010_Intro/10_Installing_ES.asciidoc
@@ -54,35 +54,30 @@ You should see a response like this:
 [source,js]
 --------------------------------------------------
 {
-   "status": 200,
-   "name": "Shrunken Bones",
-   "version": {
-      "number": "1.4.0",
-      "lucene_version": "4.10"
-   },
-   "tagline": "You Know, for Search"
+  "name" : "Tom Foster",
+  "cluster_name" : "elasticsearch",
+  "version" : {
+    "number" : "2.1.0",
+    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
+    "build_timestamp" : "2015-11-18T22:40:03Z",
+    "build_snapshot" : false,
+    "lucene_version" : "5.3.1"
+  },
+  "tagline" : "You Know, for Search"
 }
 --------------------------------------------------
 // SENSE: 010_Intro/10_Info.json
 
-This means that your Elasticsearch _cluster_ is up and running, and we can
-start experimenting with it.
+This means that you have an Elasticsearch node up and running, and you can
+start experimenting with it. A _node_ is a running instance of Elasticsearch.
+((("nodes", "defined"))) A _cluster_ is ((("clusters", "defined")))a group of
+nodes with the same `cluster.name` that are working together to share data
+and to provide failover and scale. (A single node, however, can form a cluster
+all by itself.)
 
 TIP: See that View in Sense link at the bottom of the example? <<sense, Install the Sense console>>
 to run the examples in this book against your own Elasticsearch cluster and view the results. 
 
-A _node_ is a running instance of Elasticsearch.((("nodes", "defined"))) A _cluster_ is ((("clusters", "defined")))a group of
-nodes with the same `cluster.name` that are working together to share data
-and to provide failover and scale, although a single node can form a cluster
-all by itself.
-
-You should change the default `cluster.name` to something appropriate to you,
-like your own name, to stop ((("clusters", "changing default name")))your nodes from trying to join another cluster on
-the same network with the same name!
-
-You can do this by editing the `elasticsearch.yml` file in the `config/`
-directory and then restarting Elasticsearch.  
-
 When Elasticsearch is running in the foreground, you can stop it by pressing Ctrl-C.
 
 [[sense]]

diff --git a/020_Distributed_Cluster/15_Add_an_index.asciidoc b/020_Distributed_Cluster/15_Add_an_index.asciidoc
@@ -65,25 +65,30 @@ If we were to check the
 [source,js]
 --------------------------------------------------
 {
-   "cluster_name":          "elasticsearch",
-   "status":                "yellow", <1>
-   "timed_out":             false,
-   "number_of_nodes":       1,
-   "number_of_data_nodes":  1,
-   "active_primary_shards": 3,
-   "active_shards":         3,
-   "relocating_shards":     0,
-   "initializing_shards":   0,
-   "unassigned_shards":     3 <2>
+  "cluster_name": "elasticsearch",
+  "status": "yellow", <1>
+  "timed_out": false,
+  "number_of_nodes": 1,
+  "number_of_data_nodes": 1,
+  "active_primary_shards": 3,
+  "active_shards": 3,
+  "relocating_shards": 0,
+  "initializing_shards": 0,
+  "unassigned_shards": 3, <2>
+  "delayed_unassigned_shards": 0,
+  "number_of_pending_tasks": 0,
+  "number_of_in_flight_fetch": 0,
+  "task_max_waiting_in_queue_millis": 0,
+  "active_shards_percent_as_number": 50
 }
 --------------------------------------------------
 
 <1> Cluster `status` is `yellow`.
-<2> Our three replica shards have not been allocated to a node.
+<2> The replica shards have not been allocated to a node.
 
 A cluster health of `yellow` means that all _primary_ shards are up and
 running (the cluster is capable of serving any request successfully) but
-not  all _replica_ shards are active.  In fact, all three of our replica shards
+not  all _replica_ shards are active.  In fact, all three replica shards
 are currently `unassigned`&#x2014;they haven't been allocated to a node. It
 doesn't make sense to store copies of the same data on the same node. If we
 were to lose that node, we would lose all copies of our data.

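Reviewer note: the new `active_shards_percent_as_number` values in these health responses (50 with one node, 100 once the second node joins) are consistent with dividing the active shards by all shards the cluster expects. Below is a minimal Python sketch of that arithmetic. It assumes the percentage is `active_shards` over the sum of active, initializing, and unassigned shards; that formula is inferred from the sample responses, not taken from the Elasticsearch source, and the helper name is hypothetical.

```python
def active_shards_percent(health: dict) -> float:
    """Return the share of expected shards that are active, as a percentage.

    Assumption: total = active + initializing + unassigned, inferred from
    the sample cluster-health responses in this commit.
    """
    active = health["active_shards"]
    total = active + health["initializing_shards"] + health["unassigned_shards"]
    # A cluster with no shards has nothing left to allocate.
    if total == 0:
        return 100.0
    return 100.0 * active / total


# Values copied from the two health responses in this commit.
single_node = {"active_shards": 3, "initializing_shards": 0, "unassigned_shards": 3}
two_nodes = {"active_shards": 6, "initializing_shards": 0, "unassigned_shards": 0}

print(active_shards_percent(single_node))  # 50.0
print(active_shards_percent(two_nodes))    # 100.0
```
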
diff --git a/020_Distributed_Cluster/20_Add_failover.asciidoc b/020_Distributed_Cluster/20_Add_failover.asciidoc
@@ -12,11 +12,12 @@ in exactly the same way as you started the first one (see
 <<running-elasticsearch>>), and from the same directory. Multiple nodes can
 share the same directory.
 
-As long as the second node has the same `cluster.name` as the first node (see
-the `./config/elasticsearch.yml` file), it should automatically discover and
-join the cluster run by the first node. If it doesn't, check the logs to find
-out what went wrong.  It may be that multicast is disabled on your network, or
-that a firewall is preventing your nodes from communicating.
+When you run a second node on the same machine, it automatically discovers 
+and joins the cluster as long as it has the same `cluster.name` as the first node (see
+the `./config/elasticsearch.yml` file). However, for nodes running on different machines
+to join the same cluster, you need to configure a list of unicast hosts the nodes can contact
+to join the cluster. For more information about how Elasticsearch nodes find each other, see https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html[Zen Discovery]
+in the Elasticsearch Reference. 
 
 ***************************************
 
@@ -38,16 +39,21 @@ shards (all three primary shards and all three replica shards) are active:
 [source,js]
 --------------------------------------------------
 {
-   "cluster_name":          "elasticsearch",
-   "status":                "green", <1>
-   "timed_out":             false,
-   "number_of_nodes":       2,
-   "number_of_data_nodes":  2,
-   "active_primary_shards": 3,
-   "active_shards":         6,
-   "relocating_shards":     0,
-   "initializing_shards":   0,
-   "unassigned_shards":     0
+  "cluster_name": "elasticsearch",
+  "status": "green", <1>
+  "timed_out": false,
+  "number_of_nodes": 2,
+  "number_of_data_nodes": 2,
+  "active_primary_shards": 3,
+  "active_shards": 6,
+  "relocating_shards": 0,
+  "initializing_shards": 0,
+  "unassigned_shards": 0,
+  "delayed_unassigned_shards": 0,
+  "number_of_pending_tasks": 0,
+  "number_of_in_flight_fetch": 0,
+  "task_max_waiting_in_queue_millis": 0,
+  "active_shards_percent_as_number": 100
 }
 --------------------------------------------------
 <1> Cluster `status` is `green`.

diff --git a/060_Distributed_Search/15_Search_options.asciidoc b/060_Distributed_Search/15_Search_options.asciidoc
@@ -82,40 +82,14 @@ discuss it in detail in <<scale>>.
 [[search-type]]
 ==== search_type
 
-While `query_then_fetch` is the default((("query_then_fetch search type")))((("search options", "search_type")))((("search_type"))) search type, other search types can
-be specified for particular purposes, for example:
+The default search type is `query_then_fetch` ((("query_then_fetch search type")))((("search options", "search_type")))((("search_type"))). In some cases, you might want to explicitly set the `search_type`
+to `dfs_query_then_fetch` to improve the accuracy of relevance scoring: 
 
 [source,js]
 --------------------------------------------------
-GET /_search?search_type=count
+GET /_search?search_type=dfs_query_then_fetch
 --------------------------------------------------
 
-`count`::
-
-The `count` search type has only a `query` phase.((("count search type")))  It can be used when you
-don't need search results, just a document count or
-<<aggregations,aggregations>> on documents matching the query.
-
-`query_and_fetch`::
-
-The `query_and_fetch` search type ((("query_and_fetch serch type")))combines the query and fetch phases into a
-single step.  This is an internal optimization that is used when a search
-request targets a single shard only, such as when a
-<<search-routing,`routing`>> value has been specified. While you can choose
-to use this search type manually, it is almost never useful to do so.
-
-`dfs_query_then_fetch` and `dfs_query_and_fetch`::
-
-The `dfs` search types((("dfs search types"))) have a prequery phase that fetches the term
-frequencies from all involved shards in order to calculate global term
+The `dfs_query_then_fetch` search type has a prequery phase that fetches the term
+frequencies from all involved shards to calculate global term
 frequencies. We discuss this further in <<relevance-is-broken>>.
-
-`scan`::
-
-The `scan` search type is((("scan search type"))) used in conjunction with the `scroll` API ((("scroll API")))to
-retrieve large numbers of results efficiently. It does this by disabling
-sorting.  We discuss _scan-and-scroll_ in the next section.
-
-
-
-
diff --git a/300_Aggregations/20_basic_example.asciidoc b/300_Aggregations/20_basic_example.asciidoc
@@ -46,12 +46,13 @@ using a simple aggregation.  We will do this using a `terms` bucket:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
-    "aggs" : { <1>
-        "colors" : { <2>
+    "size" : 0,
+    "aggs" : {
+        "popular_colors" : {
             "terms" : {
-              "field" : "color" <3>
+              "field" : "color"
             }
         }
     }
@@ -71,9 +72,11 @@ in <<_scoping_aggregations>>.
 
 [NOTE]
 =========================
-You'll notice that we used the `count` <<search-type,search_type>>.((("count search type")))
-Because we don't care about search results--the aggregation totals--the 
-`count` search_type will be faster because it omits the fetch phase.
+You'll notice that we set the `size` to zero. We 
+don't care about the search results themselves and
+returning zero hits speeds up the query. Setting
+`size: 0` is the equivalent of using the `count` 
+search type in Elasticsearch 1.x.
 =========================
 
 Next we define a name for our aggregation.  Naming is up to you;
@@ -115,7 +118,7 @@ Let's execute that aggregation and take a look at the results:
    }
 }
 --------------------------------------------------
-<1> No search hits are returned because we used the `search_type=count` parameter
+<1> No search hits are returned because we set the `size` parameter to `0`.
 <2> Our `colors` aggregation is returned as part of the `aggregations` field.
 <3> The `key` to each bucket corresponds to a unique term found in the `color` field.
 It also always includes `doc_count`, which tells us the number of docs containing the term.

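Reviewer note: the aggregation chapters in this commit all apply the same mechanical change, dropping `search_type=count` from the query string and adding `"size" : 0` to the request body. The sketch below shows that rewrite in Python for anyone updating their own 1.x-style requests. The function name `migrate_count_request` is hypothetical, and the sketch only covers the simple case shown in these examples.

```python
from typing import Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def migrate_count_request(path: str, body: dict) -> Tuple[str, dict]:
    """Rewrite a 1.x-style `search_type=count` request to use `"size": 0`."""
    parts = urlsplit(path)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if ("search_type", "count") not in params:
        return path, body  # nothing to migrate

    # Drop only the count search type; keep any other query parameters.
    kept = [(k, v) for k, v in params if (k, v) != ("search_type", "count")]
    new_path = urlunsplit(parts._replace(query=urlencode(kept)))

    # Put "size" first, as the updated examples do; other keys are preserved.
    new_body = {"size": 0}
    new_body.update({k: v for k, v in body.items() if k != "size"})
    return new_path, new_body


old_path = "/cars/transactions/_search?search_type=count"
old_body = {"aggs": {"popular_colors": {"terms": {"field": "color"}}}}

new_path, new_body = migrate_count_request(old_path, old_body)
print(new_path)  # /cars/transactions/_search
print(new_body)
```

A request that does not use `search_type=count` is returned unchanged, so the helper can be run over a mixed set of examples.
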
diff --git a/300_Aggregations/21_add_metric.asciidoc b/300_Aggregations/21_add_metric.asciidoc
@@ -14,8 +14,9 @@ Let's go ahead and add ((("average metric")))an `average` metric to our car exam
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs": {
       "colors": {
          "terms": {

diff --git a/300_Aggregations/22_nested_bucket.asciidoc b/300_Aggregations/22_nested_bucket.asciidoc
@@ -12,8 +12,9 @@ color:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs": {
       "colors": {
          "terms": {

diff --git a/300_Aggregations/23_extra_metrics.asciidoc b/300_Aggregations/23_extra_metrics.asciidoc
@@ -9,8 +9,9 @@ max price for each make:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs": {
       "colors": {
          "terms": {

diff --git a/300_Aggregations/30_histogram.asciidoc b/300_Aggregations/30_histogram.asciidoc
@@ -20,8 +20,9 @@ To do this, we use a `histogram` and a nested `sum` metric:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs":{
       "price":{
          "histogram":{ <1>
@@ -119,8 +120,9 @@ an `extended_stats` ((("extended_stats metric")))metric:
 
 [source,js]
 ----
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+  "size" : 0,
   "aggs": {
     "makes": {
       "terms": {

diff --git a/300_Aggregations/35_date_histogram.asciidoc b/300_Aggregations/35_date_histogram.asciidoc
@@ -41,8 +41,9 @@ how many cars were sold each month?
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs": {
       "sales": {
          "date_histogram": {
@@ -136,8 +137,9 @@ additional parameters that will provide this behavior:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs": {
       "sales": {
          "date_histogram": {
@@ -188,8 +190,9 @@ which car type is bringing in the most money to our business:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+   "size" : 0,
    "aggs": {
       "sales": {
          "date_histogram": {

diff --git a/300_Aggregations/40_scope.asciidoc b/300_Aggregations/40_scope.asciidoc
@@ -15,8 +15,9 @@ Let's look at one of our first aggregation examples:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+    "size" : 0,
     "aggs" : {
         "colors" : {
             "terms" : {
@@ -34,8 +35,9 @@ query is internally translated as follows:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+    "size" : 0,
     "query" : {
         "match_all" : {}
     },
@@ -65,7 +67,7 @@ a `match` query):
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search  <1>
+GET /cars/transactions/_search
 {
     "query" : {
         "match" : {
@@ -82,10 +84,9 @@ GET /cars/transactions/_search  <1>
 }
 --------------------------------------------------
 // SENSE: 300_Aggregations/40_scope.json
-<1> We are omitting `search_type=count` so((("search_type", "count"))) that search hits are returned too.
 
-By omitting the `search_type=count` this time, we can see both the search
-results and the aggregation results:
+Since we aren't specifying `"size" : 0`, both the search
+results and the aggregation results are returned:
 
 [source,js]
 --------------------------------------------------
@@ -155,8 +156,9 @@ aggregations inside it as usual:
 
 [source,js]
 --------------------------------------------------
-GET /cars/transactions/_search?search_type=count
+GET /cars/transactions/_search
 {
+    "size" : 0,
     "query" : {
         "match" : {
             "make" : "ford"