Test for duplicates #44

Open
nlehuby opened this issue Jun 27, 2018 · 2 comments · May be fixed by #48

Comments

@nlehuby
Contributor

nlehuby commented Jun 27, 2018

When a geocoder is used for autocomplete search, the results should not include duplicates: they confuse the user, who has no way to choose between them.

How can we use geocoder-tester to check that we don't get duplicate results?

@nlehuby
Contributor Author

nlehuby commented Jun 27, 2018

Here is a proposal about this:

  • Qwant@137cc88
  • a new parameter to add to the tests: max_matches
  • if more results match the expected values than max_matches allows, the test fails
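To make the proposal concrete, here is a minimal sketch of the counting logic. It is not the code from the linked commit: the function names and the flat dictionaries standing in for geocoder results are assumptions made for illustration only.

```python
def count_matches(results, expected):
    """Count the results whose fields all equal the expected values."""
    return sum(
        1
        for result in results
        if all(result.get(field) == value for field, value in expected.items())
    )


def check_max_matches(results, expected, max_matches):
    """Return True when at most max_matches results match the expectation."""
    return count_matches(results, expected) <= max_matches


# Hypothetical response: the first two entries look identical to the user.
results = [
    {"label": "Reuilly (Indre) (Reuilly)", "type": "poi"},
    {"label": "Reuilly (Indre) (Reuilly)", "type": "poi"},
    {"label": "Indre Oslofjord (Oslo)", "type": "poi"},
]
expected = {"label": "Reuilly (Indre) (Reuilly)"}

# Two results match while only one is allowed, so the check fails.
print(check_max_matches(results, expected, max_matches=1))
```

Note that a check like this only sees the results actually returned, which is why it depends on the limit parameter.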

We are not really satisfied with this solution, because

  • if you want to test for duplicates, you have to modify all the test files to activate this feature
  • you also need to set the limit parameter (with the default value only 1 result is checked, so no duplicate can ever be found)

Any better ideas on how to handle this?

@antoine-de
Contributor

I created a PR on our fork to handle this:
Qwant#26

The idea is to add an option to check the duplicates: --check-dupplicates=10

This runs geocoder-tester as usual and, for each query, after the tests on the expected fields, also checks that no two objects in the response are duplicates.

Two results are duplicates when the user cannot tell them apart, so we implemented a check that is quite specific to how Qwant displays autocomplete responses:

  • for a poi, we consider the object's label + its address
  • for the other objects, only the label (or the name if there is no label)
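The two rules above can be sketched as follows. This is an illustration and not the implementation from the PR: `duplicate_key` and `find_duplicates` are hypothetical names, the objects are assumed to be flat dictionaries with `label`, `name`, `type` and `addr` keys, and including the type in the key is an assumption based on the sample report in this comment.

```python
from collections import defaultdict


def duplicate_key(obj):
    """Build the tuple a user would see, used to compare two objects."""
    if obj.get("type") == "poi":
        # POIs are told apart by their label and their address.
        return (obj.get("label"), "poi", obj.get("addr"))
    # Other objects: the label, or the name when there is no label.
    return (obj.get("label") or obj.get("name"), obj.get("type"))


def find_duplicates(objects):
    """Group objects by key and keep the groups with more than one member."""
    groups = defaultdict(list)
    for obj in objects:
        groups[duplicate_key(obj)].append(obj)
    return {key: group for key, group in groups.items() if len(group) > 1}


response = [
    {"id": "osm:node:1854248363", "type": "poi",
     "label": "Reuilly (Indre) (Reuilly)",
     "addr": "Sentier des Tournelles (Reuilly)"},
    {"id": "osm:node:4498318505", "type": "poi",
     "label": "Reuilly (Indre) (Reuilly)",
     "addr": "Sentier des Tournelles (Reuilly)"},
    {"id": "osm:way:233882196", "type": "poi",
     "label": "Indre Oslofjord (Oslo)", "addr": "Tøyengata (Oslo)"},
]

for key, group in find_duplicates(response).items():
    print(key, [obj["id"] for obj in group])
```

Only the two Reuilly entries are reported here, because the single Oslo entry has no twin in this shortened response.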

For the moment this mechanism is hardcoded in get_label_for_dupplicates, and we still need to work out how to make it more generic. But since it is an opt-in CLI parameter, maybe we can first add it to the main geocoder-tester repository and make it more generic if the need arises.

This adds more test errors, and the reports are formatted like this:
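One possible direction for a more generic version, given purely as a sketch: let the caller declare which fields identify an object of each type. Nothing here exists in geocoder-tester; `make_key_function`, `FIELDS_BY_TYPE` and `DEFAULT_FIELDS` are hypothetical names, and the fallback from label to name is left out for brevity.

```python
# Hypothetical configuration: the fields a user sees for each object type.
FIELDS_BY_TYPE = {"poi": ("label", "addr")}
DEFAULT_FIELDS = ("label",)


def make_key_function(fields_by_type, default_fields):
    """Return a function mapping an object to its comparison key."""
    def key(obj):
        obj_type = obj.get("type")
        fields = fields_by_type.get(obj_type, default_fields)
        return (obj_type,) + tuple(obj.get(field) for field in fields)
    return key


key = make_key_function(FIELDS_BY_TYPE, DEFAULT_FIELDS)

poi_a = {"type": "poi", "label": "Indre Oslofjord (Oslo)", "addr": "Tøyengata (Oslo)"}
poi_b = {"type": "poi", "label": "Indre Oslofjord (Oslo)", "addr": "Tøyengata (Oslo)"}
street = {"type": "street", "label": "Tøyengata (Oslo)"}

print(key(poi_a) == key(poi_b))   # same label and address
print(key(poi_a) == key(street))  # different type and fields
```

With this shape, another geocoder could supply its own mapping without touching the comparison code.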

Duplicates found in the response
# Search was: indre
## Entry ('Reuilly (Indre) (Reuilly)', 'poi', 'Sentier des Tournelles (Reuilly)') has been found for:
           label           |         id          | type | osm_id | housenumber | street | postcode |  city   | country |        lat        |        lon         |               addr               | poi_types 
———————————————————————————|—————————————————————|——————|————————|—————————————|————————|——————————|—————————|—————————|———————————————————|————————————————————|——————————————————————————————————|———————————
 Reuilly (Indre) (Reuilly) | osm:node:1854248363 | poi  |   _    |      _      |   _    |  36260   | Reuilly |    _    | 47.08530172468403 | 2.0474608578328177 | Sentier des Tournelles (Reuilly) |  railway  
 Reuilly (Indre) (Reuilly) | osm:node:4498318505 | poi  |   _    |      _      |   _    |  36260   | Reuilly |    _    | 47.08529686318019 | 2.047508718499927  | Sentier des Tournelles (Reuilly) |  railway  

## Entry ('Indre Oslofjord (Oslo)', 'poi', 'Tøyengata (Oslo)') has been found for:
         label          |        id         | type | osm_id | housenumber | street | postcode | city | country |        lat        |        lon         |       addr       | poi_types 
————————————————————————|———————————————————|——————|————————|—————————————|————————|——————————|——————|—————————|———————————————————|————————————————————|——————————————————|———————————
 Indre Oslofjord (Oslo) | osm:way:233882196 | poi  |   _    |      _      |   _    |    _     | Oslo |    _    | 59.91907628783925 | 10.771447863393677 | Tøyengata (Oslo) |  garden   
 Indre Oslofjord (Oslo) | osm:way:233882197 | poi  |   _    |      _      |   _    |    _     | Oslo |    _    | 59.91908412491954 | 10.771563565673642 | Tøyengata (Oslo) |  garden   

Would anyone be interested in this? Should I also open a PR on the central repository?

@antoine-de antoine-de linked a pull request Sep 24, 2018 that will close this issue