Update elastic analyzer for autocomplete search #3

2019-06-02T12:12:31Z

KevinMidboe commented

2019-06-02 12:12:31 +00:00

(Migrated from github.com)

Our mapping and search using `match_phrase_prefix` work very well, but there is room for improvement.

Currently we don't support queries that do not exactly match the beginning of the result.
For example:

- "interste" would **match** "Interstellar", because both start the same
- "interset" would **not match** "Interstellar", because the last two characters are swapped
- "terstellar" would **not match** "Interstellar", because the start is missing from the query
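
The three cases above can be reproduced with a rough stand-in for prefix matching. This is only a sketch: `prefix_match` is a hypothetical helper that compares a single lowercased term, not the actual behaviour of `match_phrase_prefix` inside Elasticsearch, which also handles multi-word phrases and analyzers.

```python
def prefix_match(query, title):
    """Return True if the title starts with the query, ignoring case.

    Simplified, single-term approximation of prefix matching.
    """
    return title.lower().startswith(query.lower())


if __name__ == "__main__":
    for query in ("interste", "interset", "terstellar"):
        # Only the first query is a true prefix of the title.
        print(query, prefix_match(query, "Interstellar"))
```

Under this simplification, any typo or missing leading character makes the prefix test fail outright, which is the limitation described in this issue.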

What looks like a good solution is to analyze the field as ngrams. An ngram analyzer creates many more terms for each input, all pointing to the same document. Taking into account multiple matching terms that point to the same document could improve the hit rate for slightly misspelled queries.
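
To illustrate why ngrams could help, here is a minimal sketch in plain Python. It does not use Elasticsearch at all: `ngrams` and `overlap` are hypothetical helpers, the gram size of 3 is an assumed value, and the overlap ratio is only a stand-in for real relevance scoring.

```python
def ngrams(text, n=3):
    """Return the set of character n-grams of the lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def overlap(query, title, n=3):
    """Fraction of the query's n-grams that also occur in the title."""
    query_grams = ngrams(query, n)
    if not query_grams:
        return 0.0
    return len(query_grams & ngrams(title, n)) / len(query_grams)


if __name__ == "__main__":
    for query in ("interste", "interset", "terstellar"):
        # Each query shares at least some trigrams with the title.
        print(query, round(overlap(query, "Interstellar"), 2))
```

With trigrams, "terstellar" shares all of its grams with "Interstellar" and "interset" still shares four of its six, so both queries would have terms pointing at the document even though neither is a prefix of the title.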

Research:

https://qbox.io/blog/an-introduction-to-ngrams-in-elasticsearch



Reference: KevinMidboe/seasoned#3