> From: Chris Hostetter [mailto:[EMAIL PROTECTED]
> : Setting writer.setMaxFieldLength(5000) (default is 10,000)
> : seems to eliminate the risk for an OutOfMemoryError,
>
> that's because it now gives up after parsing 5000 tokens.
>
> : To me, it appears that simply calling
> :new Field("c
g lots of issues.
> >
> > -
> > AZ
> >
> > On 9/1/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
> >>
> >> I can't answer the question of why the same token
> >> takes up memory, but I've indexed far more than
> >> 20M
I'm creating a tokenized "content" Field from a plain text file
using an InputStreamReader and new Field("content", in);
The text file is large, 20 MB, and contains zillions of lines,
each with the same 100-character token.
That causes an OutOfMemoryError.
Given that all tokens are the *same*,
Kalle and Patrick: many thanks for the suggestions!
Caching the IndexSearcher in the ServletContext sounds like a very good idea.
However, I have to index a number of databases, each with a different Lucene
index. So keeping an IndexSearcher for each may come with a prohibitive
memory cost. But as
> From: Karl Wettin [mailto:[EMAIL PROTECTED]
> On 29 Aug 2007, at 12:29, Per Lindberg wrote:
>
> >> how about using a RangeQuery and pick the hit with the
> >> greatest document number?
> >
> > Yep, that did the trick! There seems to be no Filter that can
> From: Karl Wettin [mailto:[EMAIL PROTECTED]
> On 28 Aug 2007, at 17:48, Per Lindberg wrote:
>
> > Now, I want to search the content, and return only the
> > LATEST found document with each id. To complicate
> > things a bit, I want the latest before a given date. In o
trick. The query
syntax does not seem to support a question like "for each
value of the id field among the found hits, give me the one
with the highest date less than x"...
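
One way to get that result is to post-process the hits in application
code instead of in the query. Below is a minimal sketch in plain Java,
under these assumptions: each hit exposes its "id" and "date" values,
and dates are stored as fixed-width strings such as "20070828", so that
string order matches date order. The Hit class and the latestBefore
method are hypothetical names for illustration, not part of the Lucene
API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LatestPerId {

    /** Hypothetical stand-in for a search hit: just the "id" and "date" field values. */
    public static final class Hit {
        public final String id;
        public final String date; // assumed fixed-width, e.g. "20070828"

        public Hit(String id, String date) {
            this.id = id;
            this.date = date;
        }
    }

    /** For each id, keeps the hit with the greatest date strictly less than cutoff. */
    public static Map<String, Hit> latestBefore(List<Hit> hits, String cutoff) {
        Map<String, Hit> best = new TreeMap<>(); // sorted by id for stable output
        for (Hit h : hits) {
            if (h.date.compareTo(cutoff) >= 0) {
                continue; // on or after the cutoff date: not a candidate
            }
            Hit current = best.get(h.id);
            if (current == null || h.date.compareTo(current.date) > 0) {
                best.put(h.id, h); // first or later-dated hit for this id
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", "20070801"));
        hits.add(new Hit("a", "20070820"));
        hits.add(new Hit("a", "20070905")); // after the cutoff, ignored
        hits.add(new Hit("b", "20070815"));

        for (Map.Entry<String, Hit> e : latestBefore(hits, "20070901").entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().date);
        }
        // prints:
        // a -> 20070820
        // b -> 20070815
    }
}
```

This is a single pass over the hits with one map entry per distinct id,
so the cost grows with the number of hits returned, not with the size
of the index.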
Cheers,
Per Lindberg