l_text" field and only read _the_start_ of it?
Otherwise, I'm thinking I'll go with an extra 1st page field for the too-huge
documents.
-Paul
> -Original Message-
> From: Mike Sokolov [mailto:soko...@ifactory.com]
> Sent: Saturday, June 23, 2012 7:16 PM
> To: java-user@lucene.apache.org
> Cc: Jack Krupansky
> Subject: Re: Fast way to get the start of document
>
I got the sense from Paul's post that he wanted a solution that didn't
require changing his index, although I'm not sure there is one. Paul, if
you're willing to re-index, you could also store the length of the text
as a numeric field, retrieve that and use it to drive the decision about
whethe
Simply have two fields, "full_body" and "limited_body". The former would
index but not store the full document text from Tika (the "content"
metadata). The latter would store but not necessarily index the first 10K or
so characters of the full text. Do searches on the full body field and
highli
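
A minimal sketch of the two-field split described above, in plain Java with no Lucene calls. The class name, the method names, and the exact 10,000-character cap are assumptions made for illustration; only the field names "full_body" and "limited_body" come from the message. The map stands in for whatever document object the indexer builds: the "full_body" value would be indexed but not stored, and the "limited_body" value would be stored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives the two field values from one extracted text.
class TwoFieldSplit {
    // Assumed cap, taken from the "first 10K or so characters" suggestion.
    static final int LIMIT = 10_000;

    // Returns at most the first `limit` characters of `text`.
    // If the cut would fall between a surrogate pair, it backs up one char
    // so the stored prefix never ends in half a character.
    public static String limitedBody(String text, int limit) {
        if (text.length() <= limit) {
            return text;
        }
        int end = limit;
        if (end > 0 && Character.isHighSurrogate(text.charAt(end - 1))) {
            end--;
        }
        return text.substring(0, end);
    }

    // Builds the field-name-to-value pairs for one document.
    public static Map<String, String> splitFields(String fullText) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("full_body", fullText);                        // index, do not store
        fields.put("limited_body", limitedBody(fullText, LIMIT)); // store
        return fields;
    }
}
```

With this split, reading back the stored value for a hit only ever loads the capped prefix, however large the original document was.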