Millions of Facets

2019-09-02 Thread Hicks, Matt
I'm writing a messaging platform and indexing the messages in Lucene. To determine which messages a user is "subscribed to", I add their ID as a "subscribed" facet. This is problematic as we scale up, since it means that each message could potentially have millions of subscribers for the pu

Re: Beginner Question: Tokenized and full phrase

2019-09-02 Thread Erick Erickson
In the Lucene context you simply have tokens. In the analyzed case (i.e. text), the tokens are whatever the analysis chain you construct splits the incoming stream into. In the string case the token is the entire input. That’s just the way it works. You have two choices: 1> Use two fields, one

Beginner Question: Tokenized and full phrase

2019-09-02 Thread Roland Käser
Hello, we use Lucene to index POJOs which are stored in the database. The index primarily contains text fields. After some work with Lucene I came across a strange restriction: I can only assign string or text fields to the document to be indexed. One only indexes the whole string, the other