Re: posting list strings

2013-07-14 Thread Sriram Sankar
The large majority of terms in my index are not text terms. For example, I have connection terms. Suppose user 543 and user 664 are connected. Then the doc corresponding to user 543 will have a term connection:664 indexed. It is not useful to do prefix matching on this - and ideally I'd not wan
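
To make the connection-term example concrete, here is a minimal sketch in plain Python (not Lucene code) of an exact-match posting list. The helper name `index_connections` is hypothetical, and indexing the connection in both directions is an assumption; the message only states that the doc for user 543 carries the term connection:664. Lookup is a single exact-key hit, so no prefix matching is involved.

```python
from collections import defaultdict

def index_connections(edges):
    """Build a map from 'connection:<user>' terms to sorted doc ids."""
    postings = defaultdict(set)
    for a, b in edges:
        # The doc for user a carries the term connection:b.
        postings[f"connection:{b}"].add(a)
        # Assumption: connections are symmetric, so index the reverse too.
        postings[f"connection:{a}"].add(b)
    return {term: sorted(docs) for term, docs in postings.items()}

postings = index_connections([(543, 664), (543, 700)])
print(postings["connection:664"])
```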

Re: posting list strings

2013-07-14 Thread Lance Norskog
Is there a Trie-based term index? Seems like this would be smaller, and very fast on non-leading wildcards.

On 07/09/2013 02:34 PM, Uwe Schindler wrote:
> Hi, You can replace the term by their hash directly in the analyzer chain. Just write a custom TermToBytesRef attribute that hashes the term
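
The idea of replacing each term with its hash can be sketched outside Lucene as follows. This is only an illustration in plain Python, not the analyzer-chain attribute described in the quoted reply; `hash_term`, the choice of SHA-1, and the 8-byte width are all assumptions made for the example.

```python
import hashlib

def hash_term(term: str, width: int = 8) -> bytes:
    """Map a term to a fixed-width byte key (hypothetical helper)."""
    # Truncating the digest keeps keys short; collisions become possible
    # but are unlikely at 8 bytes for moderately sized vocabularies.
    return hashlib.sha1(term.encode("utf-8")).digest()[:width]

key = hash_term("connection:664")
print(len(key))
```

Hashed keys no longer sort in term order, so prefix and range matching stop working on them. That trade-off only suits terms that are always looked up exactly.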

MemoryIndex in Lucene 4.x

2013-07-14 Thread cischmidt77
I use Lucene/MemoryIndex for a large number of queries against data in a streaming system. I'm looking to upgrade from v3.5 to 4.x, but it seems that using MemoryIndex is roughly 25% slower based on a benchmark I built using our internal queries and a sample of 1000 documents to run against. I hav