Hi All,

I am implementing an address search with Hibernate Search, which is based on
Lucene. The class definition is as follows:

@Indexed
public class Address implements Cloneable
{
    @DocumentId
    private int id;

    @Field
    private String addrCountry;

    private String addrDesc;

    @Field
    private String addrLineOne;

    private String addrLineTwo;

    @Field
    private String addrCity;
    ......

As you can see, addrCountry, addrLineOne and addrCity are the searchable
fields. I am using the default analyzer for both indexing and searching, so I
think a country name like "United States" is indexed as two terms, "united"
and "states".

In addition, at search time a keyword like "United States" or "Salt Lake
City" is tokenized into two or three single-word terms.

As a result, any address whose fields contain "united" or "city" is returned,
for example "United Kingdom", when what I actually want is "United States".

My expected results are as follows:

If someone searches for "united", it should return "united states" and
"united kingdom".

If someone searches for "united states", it should return "united states" and
not "united kingdom".
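
To make the expected results above concrete, here is a minimal plain-Java
sketch of the matching rule I have in mind: a field value matches only if the
query words appear in it consecutively and in order. This is not Lucene or
Hibernate Search code; the class ExpectedMatch and its methods are
hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the desired behaviour, not a Lucene API.
class ExpectedMatch {

    // Lower-case the text and split it on whitespace.
    static List<String> words(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // True if the query words occur in the field value as one contiguous,
    // ordered run of words (a phrase match rather than an any-word match).
    static boolean matches(String fieldValue, String query) {
        List<String> queryWords = words(query);
        if (queryWords.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(words(fieldValue), queryWords) >= 0;
    }
}
```

With this rule, "united" matches both "United States" and "United Kingdom",
while "united states" matches only "United States".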

I would like the analyzer to generate terms that contain multiple words, for
example to keep "united states" as the single term "united states". As far as
I understand, StandardAnalyzer would split "united states" into "united" and
"states"?

A different example: if the search keyword is "parking lot in Salt Lake City",
the generated search terms need to be "parking lot" and "Salt Lake City", not
"parking", "lot", "salt", "lake" and "city".

Is there an analyzer that can help me implement this requirement? A
dictionary-based solution would be preferable, because then I could maintain
the list of search terms that consist of multiple words.
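
As a rough sketch of what I mean by a dictionary-based solution, the plain-Java
class below groups words into multi-word terms using a greedy longest-match
lookup against a phrase dictionary. It is not a Lucene Analyzer; the class name
PhraseDictionaryTokenizer and the dictionary contents are my own assumptions,
and stop-word removal (for example dropping "in") is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of Lucene or Hibernate Search.
class PhraseDictionaryTokenizer {
    private final Set<String> phrases = new HashSet<>();
    private int maxWords = 1; // word count of the longest dictionary phrase

    PhraseDictionaryTokenizer(String... dictionary) {
        for (String phrase : dictionary) {
            String[] words = phrase.trim().toLowerCase(Locale.ROOT).split("\\s+");
            phrases.add(String.join(" ", words)); // normalise inner whitespace
            maxWords = Math.max(maxWords, words.length);
        }
    }

    // Emit each dictionary phrase as one term; every other word is its own term.
    List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return terms;
        }
        String[] words = trimmed.split("\\s+");
        int i = 0;
        while (i < words.length) {
            int matched = 1; // fall back to a single-word term
            // Try the longest candidate first so "salt lake city" wins over
            // any shorter phrase starting at the same word.
            for (int len = Math.min(maxWords, words.length - i); len > 1; len--) {
                String candidate =
                        String.join(" ", Arrays.copyOfRange(words, i, i + len));
                if (phrases.contains(candidate)) {
                    matched = len;
                    break;
                }
            }
            terms.add(String.join(" ", Arrays.copyOfRange(words, i, i + matched)));
            i += matched;
        }
        return terms;
    }
}
```

For example, with the dictionary "parking lot", "salt lake city" and "united
states", tokenizing "parking lot in Salt Lake City" yields the terms "parking
lot", "in" and "salt lake city". The same tokenization would have to be applied
at both index time and search time for the terms to line up.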

Thanks,

Ian
