First round of phase 2 changes to sync up with version 2.x.
debadair committed Dec 17, 2015
1 parent 25c4adc commit e1688d9
Showing 32 changed files with 212 additions and 167 deletions.
37 changes: 16 additions & 21 deletions 010_Intro/10_Installing_ES.asciidoc
@@ -54,35 +54,30 @@ You should see a response like this:
[source,js]
--------------------------------------------------
{
"status": 200,
"name": "Shrunken Bones",
"version": {
"number": "1.4.0",
"lucene_version": "4.10"
},
"tagline": "You Know, for Search"
"name" : "Tom Foster",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "2.1.0",
"build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
"build_timestamp" : "2015-11-18T22:40:03Z",
"build_snapshot" : false,
"lucene_version" : "5.3.1"
},
"tagline" : "You Know, for Search"
}
--------------------------------------------------
// SENSE: 010_Intro/10_Info.json

This means that your Elasticsearch _cluster_ is up and running, and we can
start experimenting with it.
This means that you have an Elasticsearch node up and running, and you can
start experimenting with it. A _node_ is a running instance of Elasticsearch.
((("nodes", "defined"))) A _cluster_ is ((("clusters", "defined")))a group of
nodes with the same `cluster.name` that are working together to share data
and to provide failover and scale. (A single node, however, can form a cluster
all by itself.)

TIP: See that View in Sense link at the bottom of the example? <<sense, Install the Sense console>>
to run the examples in this book against your own Elasticsearch cluster and view the results.

A _node_ is a running instance of Elasticsearch.((("nodes", "defined"))) A _cluster_ is ((("clusters", "defined")))a group of
nodes with the same `cluster.name` that are working together to share data
and to provide failover and scale, although a single node can form a cluster
all by itself.

You should change the default `cluster.name` to something appropriate to you,
like your own name, to stop ((("clusters", "changing default name")))your nodes from trying to join another cluster on
the same network with the same name!

You can do this by editing the `elasticsearch.yml` file in the `config/`
directory and then restarting Elasticsearch.

When Elasticsearch is running in the foreground, you can stop it by pressing Ctrl-C.

[[sense]]
29 changes: 17 additions & 12 deletions 020_Distributed_Cluster/15_Add_an_index.asciidoc
@@ -65,25 +65,30 @@ If we were to check the
[source,js]
--------------------------------------------------
{
"cluster_name": "elasticsearch",
"status": "yellow", <1>
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 3,
"active_shards": 3,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 3 <2>
"cluster_name": "elasticsearch",
"status": "yellow", <1>
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 3,
"active_shards": 3,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 3, <2>
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 50
}
--------------------------------------------------

<1> Cluster `status` is `yellow`.
<2> Our three replica shards have not been allocated to a node.
<2> The replica shards have not been allocated to a node.

A cluster health of `yellow` means that all _primary_ shards are up and
running (the cluster is capable of serving any request successfully) but
not all _replica_ shards are active. In fact, all three of our replica shards
not all _replica_ shards are active. In fact, all three replica shards
are currently `unassigned`&#x2014;they haven't been allocated to a node. It
doesn't make sense to store copies of the same data on the same node. If we
were to lose that node, we would lose all copies of our data.
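The relationship between shard counts and cluster status described above can be sketched in a few lines of Python. This is only an illustration: `cluster_health` and its arguments are hypothetical names, not part of any Elasticsearch API, and the sketch ignores initializing and relocating shards. The `red` case (at least one primary shard inactive) is not covered in this hunk and is included only for completeness.

```python
def cluster_health(primaries_total, primaries_active,
                   replicas_total, replicas_active):
    """Derive a simplified health summary from shard counts (hypothetical helper)."""
    total = primaries_total + replicas_total
    active = primaries_active + replicas_active

    if primaries_active < primaries_total:
        status = "red"      # some primary shards are not active
    elif replicas_active < replicas_total:
        status = "yellow"   # all primaries active, some replicas not
    else:
        status = "green"    # every primary and replica is active

    percent = 100.0 * active / total if total else 100.0
    return {
        "status": status,
        "active_primary_shards": primaries_active,
        "active_shards": active,
        # Simplification: anything not active is counted as unassigned.
        "unassigned_shards": total - active,
        "active_shards_percent_as_number": percent,
    }


# One node: three primaries are active, three replicas have nowhere to go.
single_node = cluster_health(3, 3, 3, 0)
# Two nodes: the replicas can be allocated to the second node.
two_nodes = cluster_health(3, 3, 3, 3)
```

With three primaries and three unallocated replicas, the sketch reports `yellow` with 50 percent of shards active, which matches the response shown above.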
36 changes: 21 additions & 15 deletions 020_Distributed_Cluster/20_Add_failover.asciidoc
@@ -12,11 +12,12 @@ in exactly the same way as you started the first one (see
<<running-elasticsearch>>), and from the same directory. Multiple nodes can
share the same directory.
As long as the second node has the same `cluster.name` as the first node (see
the `./config/elasticsearch.yml` file), it should automatically discover and
join the cluster run by the first node. If it doesn't, check the logs to find
out what went wrong. It may be that multicast is disabled on your network, or
that a firewall is preventing your nodes from communicating.
When you run a second node on the same machine, it automatically discovers
and joins the cluster as long as it has the same `cluster.name` as the first node (see
the `./config/elasticsearch.yml` file). However, for nodes running on different machines
to join the same cluster, you need to configure a list of unicast hosts the nodes can contact
to join the cluster. For more information about how Elasticsearch nodes find each other, see https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html[Zen Discovery]
in the Elasticsearch Reference.
***************************************

@@ -38,16 +39,21 @@ shards (all three primary shards and all three replica shards) are active:
[source,js]
--------------------------------------------------
{
"cluster_name": "elasticsearch",
"status": "green", <1>
"timed_out": false,
"number_of_nodes": 2,
"number_of_data_nodes": 2,
"active_primary_shards": 3,
"active_shards": 6,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0
"cluster_name": "elasticsearch",
"status": "green", <1>
"timed_out": false,
"number_of_nodes": 2,
"number_of_data_nodes": 2,
"active_primary_shards": 3,
"active_shards": 6,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
--------------------------------------------------
<1> Cluster `status` is `green`.
36 changes: 5 additions & 31 deletions 060_Distributed_Search/15_Search_options.asciidoc
@@ -82,40 +82,14 @@ discuss it in detail in <<scale>>.
[[search-type]]
==== search_type

While `query_then_fetch` is the default((("query_then_fetch search type")))((("search options", "search_type")))((("search_type"))) search type, other search types can
be specified for particular purposes, for example:
The default search type is `query_then_fetch` ((("query_then_fetch search type")))((("search options", "search_type")))((("search_type"))). In some cases, you might want to explicitly set the `search_type`
to `dfs_query_then_fetch` to improve the accuracy of relevance scoring:

[source,js]
--------------------------------------------------
GET /_search?search_type=count
GET /_search?search_type=dfs_query_then_fetch
--------------------------------------------------

`count`::

The `count` search type has only a `query` phase.((("count search type"))) It can be used when you
don't need search results, just a document count or
<<aggregations,aggregations>> on documents matching the query.

`query_and_fetch`::

The `query_and_fetch` search type ((("query_and_fetch search type")))combines the query and fetch phases into a
single step. This is an internal optimization that is used when a search
request targets a single shard only, such as when a
<<search-routing,`routing`>> value has been specified. While you can choose
to use this search type manually, it is almost never useful to do so.

`dfs_query_then_fetch` and `dfs_query_and_fetch`::

The `dfs` search types((("dfs search types"))) have a prequery phase that fetches the term
frequencies from all involved shards in order to calculate global term
The `dfs_query_then_fetch` search type has a prequery phase that fetches the term
frequencies from all involved shards to calculate global term
frequencies. We discuss this further in <<relevance-is-broken>>.
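To illustrate why global term frequencies matter, here is a minimal sketch that compares a per-shard inverse document frequency with one computed from statistics pooled across shards, which is roughly what the prequery phase makes possible. The shard counts are made up, and the `idf` formula is assumed to follow Lucene's classic TF/IDF similarity; neither comes from this commit.

```python
import math


def idf(doc_freq, num_docs):
    """Inverse document frequency, assuming Lucene's classic TF/IDF formula."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical per-shard statistics for a single term.
shards = [
    {"num_docs": 100, "doc_freq": 1},   # the term is rare on this shard
    {"num_docs": 100, "doc_freq": 50},  # the term is common on this shard
]

# Without a dfs phase, each shard scores with its own local statistics.
local_idfs = [idf(s["doc_freq"], s["num_docs"]) for s in shards]

# With a dfs phase, statistics are pooled first, so every shard
# scores the term with the same global value.
global_doc_freq = sum(s["doc_freq"] for s in shards)
global_num_docs = sum(s["num_docs"] for s in shards)
global_idf = idf(global_doc_freq, global_num_docs)
```

In this made-up case the same term would be weighted very differently on the two shards when scored locally, while the pooled value falls between the two local values.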

`scan`::

The `scan` search type is((("scan search type"))) used in conjunction with the `scroll` API ((("scroll API")))to
retrieve large numbers of results efficiently. It does this by disabling
sorting. We discuss _scan-and-scroll_ in the next section.




19 changes: 11 additions & 8 deletions 300_Aggregations/20_basic_example.asciidoc
@@ -46,12 +46,13 @@ using a simple aggregation. We will do this using a `terms` bucket:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"aggs" : { <1>
"colors" : { <2>
"size" : 0,
"aggs" : {
"popular_colors" : {
"terms" : {
"field" : "color" <3>
"field" : "color"
}
}
}
@@ -71,9 +72,11 @@ in <<_scoping_aggregations>>.

[NOTE]
=========================
You'll notice that we used the `count` <<search-type,search_type>>.((("count search type")))
Because we don't care about search results--the aggregation totals--the
`count` search_type will be faster because it omits the fetch phase.
You'll notice that we set the `size` to zero. We
don't care about the search results themselves and
returning zero hits speeds up the query. Setting
`size: 0` is the equivalent of using the `count`
search type in Elasticsearch 1.x.
=========================
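The rewrite applied throughout this commit (dropping `?search_type=count` from the URL and adding `"size" : 0` to the request body) can be sketched as a small helper. This is a hypothetical illustration, not a tool that ships with Elasticsearch: `migrate_count_request` is an invented name, and the request body is assumed to be a Python dict already parsed from JSON.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def migrate_count_request(path, body=None):
    """Rewrite a 1.x-style `search_type=count` request as a `size: 0` request.

    Hypothetical helper: returns a (path, body) tuple and leaves requests
    that do not use `search_type=count` unchanged.
    """
    parts = urlsplit(path)
    params = parse_qsl(parts.query, keep_blank_values=True)

    if ("search_type", "count") not in params:
        return path, dict(body or {})

    # Drop only the search_type=count parameter; keep everything else.
    kept = [(k, v) for k, v in params if (k, v) != ("search_type", "count")]
    new_path = urlunsplit(parts._replace(query=urlencode(kept)))

    # Put "size" first, mirroring the layout used in the updated examples.
    new_body = {"size": 0}
    new_body.update(body or {})
    new_body["size"] = 0  # a count request never returns hits
    return new_path, new_body


old_path = "/cars/transactions/_search?search_type=count"
old_body = {"aggs": {"colors": {"terms": {"field": "color"}}}}
new_path, new_body = migrate_count_request(old_path, old_body)
```

Here `new_path` no longer carries a query string and `new_body` starts with `"size": 0`, which is the same shape as the updated aggregation requests in this commit.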

Next we define a name for our aggregation. Naming is up to you;
@@ -115,7 +118,7 @@ Let's execute that aggregation and take a look at the results:
}
}
--------------------------------------------------
<1> No search hits are returned because we used the `search_type=count` parameter
<1> No search hits are returned because we set the `size` parameter to `0`.
<2> Our `colors` aggregation is returned as part of the `aggregations` field.
<3> The `key` to each bucket corresponds to a unique term found in the `color` field.
It also always includes `doc_count`, which tells us the number of docs containing the term.
3 changes: 2 additions & 1 deletion 300_Aggregations/21_add_metric.asciidoc
@@ -14,8 +14,9 @@ Let's go ahead and add ((("average metric")))an `average` metric to our car exam

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"colors": {
"terms": {
3 changes: 2 additions & 1 deletion 300_Aggregations/22_nested_bucket.asciidoc
@@ -12,8 +12,9 @@ color:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"colors": {
"terms": {
3 changes: 2 additions & 1 deletion 300_Aggregations/23_extra_metrics.asciidoc
@@ -9,8 +9,9 @@ max price for each make:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"colors": {
"terms": {
6 changes: 4 additions & 2 deletions 300_Aggregations/30_histogram.asciidoc
@@ -20,8 +20,9 @@ To do this, we use a `histogram` and a nested `sum` metric:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs":{
"price":{
"histogram":{ <1>
@@ -119,8 +120,9 @@ an `extended_stats` ((("extended_stats metric")))metric:

[source,js]
----
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"makes": {
"terms": {
9 changes: 6 additions & 3 deletions 300_Aggregations/35_date_histogram.asciidoc
@@ -41,8 +41,9 @@ how many cars were sold each month?

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"sales": {
"date_histogram": {
@@ -136,8 +137,9 @@ additional parameters that will provide this behavior:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"sales": {
"date_histogram": {
@@ -188,8 +190,9 @@ which car type is bringing in the most money to our business:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"sales": {
"date_histogram": {
16 changes: 9 additions & 7 deletions 300_Aggregations/40_scope.asciidoc
@@ -15,8 +15,9 @@ Let's look at one of our first aggregation examples:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"aggs" : {
"colors" : {
"terms" : {
@@ -34,8 +35,9 @@ query is internally translated as follows:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"query" : {
"match_all" : {}
},
@@ -65,7 +67,7 @@ a `match` query):

[source,js]
--------------------------------------------------
GET /cars/transactions/_search <1>
GET /cars/transactions/_search
{
"query" : {
"match" : {
@@ -82,10 +84,9 @@ GET /cars/transactions/_search <1>
}
--------------------------------------------------
// SENSE: 300_Aggregations/40_scope.json
<1> We are omitting `search_type=count` so((("search_type", "count"))) that search hits are returned too.

By omitting the `search_type=count` this time, we can see both the search
results and the aggregation results:
Since we aren't specifying `"size" : 0`, both the search
results and the aggregation results are returned:

[source,js]
--------------------------------------------------
@@ -155,8 +156,9 @@ aggregations inside it as usual:

[source,js]
--------------------------------------------------
GET /cars/transactions/_search?search_type=count
GET /cars/transactions/_search
{
"size" : 0,
"query" : {
"match" : {
"make" : "ford"