[[relevance-conclusion]]
=== Relevance tuning is the last 10%

In this chapter, we looked at how Lucene generates scores based on TF/IDF.
Understanding the score-generation process is critical so you can tune,
modulate, attenuate, and manipulate the score for your particular business
domain.
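
Seeing the score breakdown for a real document makes this process concrete.
One way to do that (a minimal sketch; the index name, field, and query string
below are placeholders) is to set the `explain` parameter on a search, which
asks Lucene to return the calculation behind each hit's score:

[source,js]
--------------------------------------------------
GET /my_index/_search?explain=true <1>
{
    "query": {
        "match": {
            "title": "quick brown fox"
        }
    }
}
--------------------------------------------------
<1> The index and query are hypothetical; `explain=true` adds an
    `_explanation` section to each hit, detailing the term frequency,
    inverse document frequency, and field-length norm behind its score.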

In practice, simple combinations of queries will get you good search results.
But to get _great_ search results, you'll often have to start tinkering with
the tuning methods described earlier in this chapter.

Often, applying a boost on a strategic field, or rearranging a query to
emphasize a particular clause, will be sufficient to make your results great.
Sometimes you'll need more-invasive changes. This is usually the case if your
scoring requirements diverge heavily from Lucene's word-based TF/IDF model
(for example, if you want to score based on time or distance), as sketched
below.
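
As an illustration of both ends of that spectrum (a sketch only; the index,
fields, and numbers are hypothetical), the following query boosts the `title`
field, and then layers a `gauss` decay function over a `date` field so that
newer documents score higher:

[source,js]
--------------------------------------------------
GET /my_index/_search
{
    "query": {
        "function_score": {
            "query": {
                "multi_match": {
                    "query":  "quick brown fox",
                    "fields": [ "title^2", "body" ] <1>
                }
            },
            "functions": [
                {
                    "gauss": {
                        "date": { <2>
                            "origin": "now",
                            "scale":  "30d"
                        }
                    }
                }
            ]
        }
    }
}
--------------------------------------------------
<1> The `^2` makes matches in the strategic `title` field count for twice
    as much as matches in `body`.
<2> The `gauss` decay lowers the score of documents whose `date` is further
    from `now`, folding recency into relevance.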

With that said, relevance tuning is a rabbit hole that you can easily fall
into and never emerge from. The concept of "most relevant" is a nebulous
target to hit, and different people often have different ideas about document
ranking. It is easy to get into a cycle of constant fiddling without any
apparent progress.

We encourage you to avoid this (very tempting) behavior and instead properly
instrument your search results. Monitor how often your users click the top
result, the top ten, and the first page; how often they execute a secondary
query without selecting a result first; how often they click a result and
immediately go back to the search results; and so forth.
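
One lightweight way to capture these signals (a sketch under the assumption
that you log every click as a document, with the clicked result's rank stored
in a numeric field) is to aggregate over the positions that users actually
click:

[source,js]
--------------------------------------------------
GET /search_clicks/_search <1>
{
    "size": 0, <2>
    "aggs": {
        "click_positions": {
            "histogram": {
                "field":    "position",
                "interval": 1
            }
        }
    }
}
--------------------------------------------------
<1> `search_clicks` and its `position` field are placeholders for whatever
    click-logging scheme you use.
<2> We want only the aggregation counts, not the click documents themselves.

If the buckets for positions one through five dominate this histogram, your
ranking is working; a long tail of clicks on deep positions suggests that
users are hunting for what they want.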

These are all indicators of how relevant your search results are to the user.
If your query is returning highly relevant results, the user will select one
of the top-five results, find what they want, and leave. Irrelevant results
cause the user to click around and try new search queries.

Once you have instrumentation in place, tuning your query is simple: make a
change, monitor its effect on your users, and repeat as necessary. The tools
outlined in this chapter are just that: tools. You have to use them
appropriately to propel your search results into the _great_ category, and
the only way to do that is with strong measurement of user behavior.