Re: Index doubling in size when adding extra terms

2009-07-15 Thread Michael McCandless
It looks like your "text_substrings" field will have many more unique terms than the original text, right? And, since it's indexed (I assume), the docIDs will in fact be stored twice (once in the postings for your original text and once in the postings for text_substrings). So I think it's expected t
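To make the size argument concrete, here is a rough sketch in plain Python (not Lucene code). It assumes the substrings field holds every suffix of every word, which is only a guess at the original poster's scheme; `word_suffixes`, `build_postings` and `size` are hypothetical helpers invented for this illustration. It builds a toy inverted index for both fields and compares the number of unique terms and the number of stored doc ID entries.

```python
from collections import defaultdict


def word_suffixes(word, min_len=1):
    """All suffixes of `word` with at least `min_len` characters (assumed scheme)."""
    return [word[i:] for i in range(len(word) - min_len + 1)]


def plain_terms(text):
    """Terms for the original "text" field: lowercased whitespace tokens."""
    return text.lower().split()


def substring_terms(text):
    """Terms for the assumed "text_substrings" field: every suffix of every word."""
    return [s for w in text.lower().split() for s in word_suffixes(w)]


def build_postings(docs, analyze):
    """Toy inverted index: map each term to the sorted doc IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def size(postings):
    """Return (number of unique terms, total number of doc ID entries)."""
    return len(postings), sum(len(ids) for ids in postings.values())


docs = [
    "the quick brown fox jumped over the lazy dogs",
    "the lazy dogs slept",
]

text_postings = build_postings(docs, plain_terms)
substring_postings = build_postings(docs, substring_terms)

print("text:            terms=%d doc entries=%d" % size(text_postings))
print("text_substrings: terms=%d doc entries=%d" % size(substring_postings))
```

Every whole word is its own longest suffix, so under this assumption the second field repeats all terms and doc IDs of the first and adds the shorter suffixes on top. A real index also stores frequencies, positions and a compressed term dictionary, so the actual growth factor will differ from what this toy count suggests.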

RE: Index doubling in size when adding extra terms

2009-07-15 Thread Uwe Schindler
> To: java-user@lucene.apache.org
> Subject: Index doubling in size when adding extra terms
>
> I have added a new field to each document in my index containing
> substrings of another field to speed up initial-wildcard searches.
>
> Each document has a field "text" wh

Index doubling in size when adding extra terms

2009-07-15 Thread Gregory Tarr
I have added a new field to each document in my index containing substrings of another field to speed up initial-wildcard searches. Each document has a field "text" which might contain "the quick brown fox jumped over the lazy dogs". The new field, "text_substrings", would then contain "the quick u
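For readers unfamiliar with the trick, the following is a minimal sketch of why a substrings field helps with initial wildcards. It is plain Python rather than Lucene code, and it assumes the field stores every suffix of every word (the example value above is cut off, so the exact scheme is not known). `build_substring_terms` and `prefix_matches` are hypothetical names. The idea is that a query like `*ick*` on "text" can be rewritten as the prefix query `ick*` on the suffix field, and a prefix query only needs a binary search into the sorted term list instead of a scan over all terms.

```python
import bisect


def word_suffixes(word):
    """All non-empty suffixes of `word` (assumed scheme for text_substrings)."""
    return [word[i:] for i in range(len(word))]


def build_substring_terms(text):
    """Sorted unique suffix terms, standing in for a sorted term dictionary."""
    terms = set()
    for word in text.lower().split():
        terms.update(word_suffixes(word))
    return sorted(terms)


def prefix_matches(sorted_terms, prefix):
    """Terms starting with `prefix`, located by binary search on the sorted list."""
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    return matches


terms = build_substring_terms("the quick brown fox jumped over the lazy dogs")

# "*ick*" on the original field becomes the prefix query "ick*" here.
print(prefix_matches(terms, "ick"))
# "*o*" becomes "o*".
print(prefix_matches(terms, "o"))
```

The cost of this speedup is the one discussed in the replies: the suffix field carries many more terms than the original text.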