AnalyzingSuggester, FuzzySuggester and AnalyzingInfixSuggester.
Using an analyzer is powerful because it lets you customize exactly
how suggestions are matched: you can normalize case, apply stemming, match across
different synonym forms, etc.
One of the most common things you'll do with your analyzer is to remove stop
words using StopFilter. Unfortunately, if you try this, you'll quickly notice
that the stop filter is too aggressive: it happily removes the last token even
if the user isn't done typing it yet. For example, if the user has typed "a",
you'd expect suggestions like apple, aardvark, etc., but you won't get them
because StopFilter removed the "a" token.
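
To make the problem concrete, here is a minimal plain-Java sketch of what a stop filter does to a partially typed query. It does not use Lucene at all: the class name NaiveStopDemo, the whitespace tokenizer and the four-word stop list are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class NaiveStopDemo {
    // Hypothetical stop-word list, for illustration only.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "of");

    // Split on whitespace, lowercase, and drop every stop word,
    // including the last token, which the user may still be typing.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("a"));       // nothing left to look up
        System.out.println(analyze("the app")); // only "app" survives
    }
}
```

With the query "a" the token list comes back empty, so the suggester has nothing to match against.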
You could try using StopFilter only while indexing, which was my first attempt
with the suggestions at jirasearch.mikemccandless.com, but then, at least for
AnalyzingInfixSuggester, you'll fail to get matches when you pass
allTermsRequired=true because the suggester then requires that even stop words
find matches.
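
The mismatch can be sketched in plain Java as follows. This is only a toy model of "all terms required" matching, not AnalyzingInfixSuggester's real implementation: the class AllTermsDemo, the matches helper, and the rule that earlier tokens must match exactly while the last token matches as a prefix are assumptions for illustration.

```java
import java.util.List;

final class AllTermsDemo {
    // Earlier query tokens must equal an indexed token; the last one
    // only needs to be a prefix of an indexed token.
    static boolean matches(List<String> indexed, List<String> query,
                           boolean allTermsRequired) {
        int hits = 0;
        for (int i = 0; i < query.size(); i++) {
            String q = query.get(i);
            boolean last = i == query.size() - 1;
            boolean hit = indexed.stream()
                    .anyMatch(t -> last ? t.startsWith(q) : t.equals(q));
            if (hit) {
                hits++;
            }
        }
        return allTermsRequired ? hits == query.size() : hits > 0;
    }

    public static void main(String[] args) {
        // "the quick fox" after stop words were removed at index time.
        List<String> indexed = List.of("quick", "fox");
        // "the qu" analyzed at lookup time without any stop filter.
        List<String> query = List.of("the", "qu");
        System.out.println(matches(indexed, query, true));
        System.out.println(matches(indexed, query, false));
    }
}
```

Because "the" was never indexed, requiring all terms rejects a suggestion that the "qu" prefix alone would have found.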
Finally, you could use the new SuggestStopFilter at lookup time: this filter is
just like StopFilter except when the token is the very last token. In that case
it checks the token's end offset, and if the offset indicates that the token
ended without any further non-token characters after it, the token is
preserved. The token is also marked as a keyword, so that any later stem
filters won't change it. This way a query "a" can find "apple", but a query
"a " (with a trailing space) will find nothing because the "a" will be removed.
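
The end-offset rule described above can be sketched in plain Java like this. It is an illustration of the idea only, not the Lucene filter's source: the class SuggestStopDemo, the regex tokenizer and the stop list are assumptions, and the keyword-marking step is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SuggestStopDemo {
    // Hypothetical stop-word list, for illustration only.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "of");
    // A token is a run of letters or digits; anything else is a separator.
    static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    static List<String> analyze(String text) {
        List<String> kept = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String token = m.group().toLowerCase(Locale.ROOT);
            // Only the last token can end exactly where the input ends;
            // if it does, the user may still be typing it.
            boolean endsAtInputEnd = m.end() == text.length();
            if (!STOP_WORDS.contains(token) || endsAtInputEnd) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(analyze("a"));     // "a" is kept
        System.out.println(analyze("a "));    // trailing space: "a" is dropped
        System.out.println(analyze("the a")); // "the" is dropped, "a" is kept
    }
}
```

Comparing the token's end offset with the input length is what distinguishes "a" from "a " here; no lookahead into what the user might type next is needed.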
I've pushed SuggestStopFilter to jirasearch.mikemccandless.com and it seems to
be working well so far!