Hi,

Thanks for the reply. It's an index of tweets, so really any word is a candidate for this, which would mean a significant increase in index size. My volumes are really small, so that shouldn't be a problem, although performance and scalability remain a concern.
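
To make the index-side option concrete, here is a minimal sketch of the two-token idea from Erick's reply, in plain Java with no Lucene classes. The class and method names (`TweetTokens`, `tokenize`) are hypothetical, and whitespace splitting merely stands in for the analyzer; in a real Lucene analysis chain this would presumably be a token filter that emits the bare word at the same position as the prefixed one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Lucene API: illustrates emitting two tokens
// (with and without the leading '@' or '#') at indexing time.
class TweetTokens {

    // Splits on whitespace; for each mention or hashtag, also emits the bare word.
    static List<String> tokenize(String tweet) {
        List<String> out = new ArrayList<>();
        for (String tok : tweet.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue; // empty input yields one empty string from split()
            }
            out.add(tok);
            char first = tok.charAt(0);
            if ((first == '@' || first == '#') && tok.length() > 1) {
                out.add(tok.substring(1)); // extra token without the prefix
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello @foo and #bar"));
    }
}
```

In this sketch only prefixed tokens are doubled, so the extra index size depends on how many mentions and hashtags the tweets contain. Note that it does not yet distinguish a plain `foo` from one derived from `@foo`, which matters only if a bare-word query must exclude mentions and hashtags.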
I have control over the query. Another solution would be to translate a query on "foo" into "foo OR #foo OR @foo". WDYT?

Thanks!
S.

On Tue, Nov 5, 2013 at 2:17 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> If your universe of items you want to match this way is small,
> consider something akin to synonyms. Your indexing process
> emits two tokens, with and without the @ or # which should
> cover your situation.
>
> FWIW,
> Erick
>
>
> On Tue, Nov 5, 2013 at 2:40 AM, Stéphane Nicoll
> <stephane.nic...@gmail.com> wrote:
>
> > Hi,
> >
> > I am building an application that indexes tweets and offers some basic
> > search facilities on them.
> >
> > I am trying to find a combination where the following would work:
> >
> > * foo matches the word foo, a mention (@foo) or the hashtag (#foo)
> > * @foo matches only the mention
> > * #foo matches only the hashtag
> >
> > It should match complete words, so I used the WhitespaceAnalyzer for
> > indexing.
> >
> > Any recommendation for this use case?
> >
> > Thanks!
> > S.
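
The query-side rewrite proposed at the top of this message, checked against the three matching rules from the original question, could be sketched as follows. This is plain Java under stated assumptions: `TweetQuery`, `expand`, `toQueryString` and `matches` are hypothetical names, whitespace splitting stands in for whitespace-based indexing, and the generated string has not been verified against any particular Lucene query parser.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not a Lucene API: rewrites a user term at query time.
class TweetQuery {

    // A bare word expands to the word, the hashtag and the mention;
    // a term that already starts with '@' or '#' is left as-is.
    static Set<String> expand(String term) {
        if (term.startsWith("@") || term.startsWith("#")) {
            return new LinkedHashSet<>(Collections.singletonList(term));
        }
        return new LinkedHashSet<>(Arrays.asList(term, "#" + term, "@" + term));
    }

    // Joins the expanded terms into an OR query string.
    static String toQueryString(String term) {
        return String.join(" OR ", expand(term));
    }

    // Simulates matching against whitespace-separated tokens of a tweet.
    static boolean matches(String tweet, String term) {
        Set<String> wanted = expand(term);
        for (String tok : tweet.trim().split("\\s+")) {
            if (wanted.contains(tok)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(toQueryString("foo"));
        System.out.println(matches("ping @foo", "foo"));
        System.out.println(matches("ping @foo", "#foo"));
    }
}
```

With this approach the index keeps one token per word, so it does not grow; the cost moves to query time, where each bare word becomes three terms.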