You could also look at MemoryIndex or InstantiatedIndex, both in
Lucene's contrib area. I was also wondering whether you might
gain from using TermDocs or TermVectors directly.
--
Ian.
On Tue, Jul 27, 2010 at 9:34 PM, Geir Gullestad Pettersen
wrote:
> Thanks for your fee
RAMDirectory instances seem useful, but as the index gets larger, Java
heap sizes can become a problem in terms of garbage collection pauses. Some
customers are looking to use data grid products such as IBM WebSphere
eXtreme Scale or Oracle Coherence to act as the directory for the
index. This stores the ind
Thanks for your feedback, Ian.
I have written a first implementation of this service that works well. You
mentioned something about technologies for speeding up Lucene, something I
am interested in knowing more about. Would you, or anyone else, mind
elaborating a bit, or giving me some pointers?
So, if I've understood this correctly, you've got some text and want
to loop through a list of words and/or phrases, and see which of those
match the text.
e.g.
text "some random article about something or other of some random length"
words
some - matches
many - no match
article - matches
word
Hi,
I'm about to write an application that does very simple text analysis,
namely dictionary-based entity extraction. The alternative is to do
in-memory matching with substring:
String text; // could be any size, but normally "news paper length"
List matches;
for( String wordOrPhrase : dictionary
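
For reference, the in-memory substring approach described in this message could be sketched roughly as follows. This is only a minimal illustration and not the poster's actual code: the class name DictionaryMatcher and the method findMatches are hypothetical, and case-insensitive matching is an assumption. It uses plain substring search, so "some" would also match inside "something"; whole-word matching would need extra boundary checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Lucene.
public class DictionaryMatcher {

    // Returns the dictionary entries that occur in the text as a
    // substring. Matching is case-insensitive (an assumption).
    public static List<String> findMatches(String text, List<String> dictionary) {
        String haystack = text.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String wordOrPhrase : dictionary) {
            if (haystack.contains(wordOrPhrase.toLowerCase(Locale.ROOT))) {
                matches.add(wordOrPhrase);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample text and words taken from the example earlier in the thread.
        String text = "some random article about something or other of some random length";
        List<String> dictionary = Arrays.asList("some", "many", "article");
        System.out.println(findMatches(text, dictionary)); // prints [some, article]
    }
}
```

Each lookup scans the whole text, so the cost grows with the dictionary size times the text length. That is fine for small dictionaries, but it is the part an index-based approach such as MemoryIndex is meant to speed up.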