I'm looking for suggestions on an analyzer decision. I have my own thoughts on this already, but would like some initial feedback first.

The scenario:

- An index of geographic information: cities, towns, states, neighborhoods, zipcodes, generic names, etc. Examples are "New York, NY", "New York", "Midtown", "10012", "The Big Apple".
- I have these mapped to underlying geographic data points: census data, postal data, mapping data, etc.
- I want some of these to take precedence over others when conflicting/matching terms exist, e.g. "Washington" should score Washington D.C. higher than the state of Washington. This would be decided on an item-by-item basis, not dictated by one broad field.
- I need the right mix for searches to work as I expect. As an example, a search for "Wedgewood WA" would ideally not match "Wedgewood GA".

I'm starting with the StandardAnalyzer and thinking of possibly extending it to carry some of the business rules that should act as tie-breakers.

Comments appreciated.

Thanks,
jeff r.
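
P.S. To make the expected behavior concrete, here is a toy sketch in plain Python. It is not Lucene code and not a proposed implementation: the entries, the boost values, and the names `PLACES`, `tokenize`, and `search` are all made up for illustration. It assumes every query term must match and that ties are broken by a per-item boost.

```python
# Toy illustration only: plain Python, not Lucene code.
# All entries, names, and boost values are invented for this example.

PLACES = [
    # (display name, per-item boost)
    ("Washington, DC", 2.0),
    ("Washington (state)", 1.0),
    ("Wedgewood, WA", 1.0),
    ("Wedgewood, GA", 1.0),
]


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def search(query, places=PLACES):
    """Return names containing every query token, highest boost first."""
    wanted = set(tokenize(query))
    hits = [
        (boost, name)
        for name, boost in places
        if wanted <= set(tokenize(name))  # all query tokens must be present
    ]
    hits.sort(key=lambda hit: -hit[0])  # stable sort keeps list order on ties
    return [name for _, name in hits]


print(search("Washington"))    # ['Washington, DC', 'Washington (state)']
print(search("Wedgewood WA"))  # ['Wedgewood, WA']
```

The open question is how much of this belongs in the analyzer versus in per-document boosts and query construction.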