Ok. Just to follow up: I performed the same steps with another of our indexes
and did not have the same issue:
Opening index @ /lucenedata/index4
Segments file=segments_85 numSegments=1 version=FORMAT_HAS_PROX [Lucene 2.4]
1 of 1: name=_42 docCount=3986767
compound=true
hasProx=true
Michael McCandless-2 wrote:
>
> That exception seems to indicate that the fdx file being opened by
> FieldsReader is 0 length (it's trying to read the first int from that
> file).
>
> Is the exception repeatable, if you try again to call
> IndexReader.open?
>
> It's odd that CheckIndex finds
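Michael's diagnosis above (a 0-length fdx file tripping FieldsReader on its first int read) suggests a quick check you can run before anything else: look for zero-byte .fdx (stored-fields index) files in the index directory. A minimal sketch; the /tmp directory and segment name here are stand-ins for a real index path:

```shell
# Simulate an index directory containing an empty stored-fields index
# file (stand-in for a real path like /lucenedata/index4).
mkdir -p /tmp/idxcheck && : > /tmp/idxcheck/_42.fdx

# List any zero-length .fdx files -- a hit here would explain the
# "read past EOF" when FieldsReader tries to read the first int.
find /tmp/idxcheck -name '*.fdx' -size 0
```

If this prints nothing against the real index, the fdx files are non-empty and the corruption lies elsewhere.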
Toke Eskildsen wrote:
>
> A quick check when a corrupt index problem is encountered:
> Does any of your machines run Java 1.6.0_04-1.6.0_10b25?
>
Thanks Toke.
As I mentioned in my response to Erick, this is complicated by the fact that
the error occurs within a Java stored procedure in Oracle. Th
Erick Erickson wrote:
>
> I guess my first question, based on your statement that you ran
> checkindex from a different machine would be whether you have
> the same version of Lucene installed on both machines? And how
> did you get your index where it is now? Did you optimize it in place
> or d
Greetings all. I have an index that I have optimized and when I try to open
the index I get this:
java.io.IOException: read past EOF
at
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
at
org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInte
Michael McCandless-2 wrote:
>
>
> How did you delete the documents? EG, by docID using IndexReader, by
> Term or Query using IndexWriter?
>
> And when you said your previous index had 14488449 docs, was numDocs()
> or maxDoc()?
>
>
I deleted by docID. I got the number from numDocs().
Jus
Ganesh - yahoo wrote:
>
> Optimize will remove the deletes and rearrange the document numbers.
>
> Have you done some deletes before deleting 1.3 million docs?
>
>
No, that is the crazy part. I haven't done anything to this index since it
was first compiled until I did the deletes. That is
Ok. This is crazy. I have an index with 14,488,449 docs in it. Today I did a
CheckIndex on it and everything looked fine. I made a copy of the index, ran
a delete on about 1.3 million docs and then did an optimize and now my doc
count is 38449.
The index was originally built with 2.3, but I am no
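For what it's worth, the arithmetic alone shows something is badly off, assuming no earlier deletions: removing roughly 1.3 million docs from 14,488,449 should leave about 13.2 million, nowhere near 38,449. A trivial sketch (figures taken from the thread):

```java
// Expected post-optimize doc count, assuming the index had no
// deletions before this pass.
public class DocCountCheck {
    public static void main(String[] args) {
        int before = 14488449;   // numDocs() before the delete pass
        int deleted = 1300000;   // approximate docs deleted by docID
        int expected = before - deleted;
        System.out.println(expected); // 13188449, not the observed 38449
    }
}
```

A gap that large points at the deletes hitting far more documents than intended (e.g., stale docIDs) rather than optimize itself.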
Just an FYI in case anyone runs into something similar.
Essentially I had indexes that I have been searching from a Java stored
procedure in Oracle without issue for a while. All of a sudden, I started
getting the error I alluded to above when there were more than a certain
number of terms (4,5, o
OK, a little more information:
I run this query via a Java stored procedure within Oracle. However, I just
ran the same query using the same code compiled in a separate class from the
command line on a different server that has the same filesystem mounted.
The queries ran fine from there.
So I am wondering
Greetings all. I am having an issue that is driving me mad.
I have many indexes ranging in size from 500K docs to 40mil docs. When I do
a simple query containing multiple terms on any of the indexes, I get this:
java.lang.ArrayIndexOutOfBoundsException
at org.apache.lucene.util.ScorerDoc
Thanks Erick. That is what I was assuming, but I couldn't confirm whether it
was worth going down those paths to achieve what I was hoping. Your essay was
very informative about realistic expectations with the FieldSelector.
I actually just got through reading the discussion on deprecating Hits, which
ess
karl wettin-3 wrote:
>
>
> I might be missing something here -- can't you just add the age field
> to the index and include that in your query?
>
>
Thanks for the response Karl:
I just used the age field as an example, but in reality the structured data
is copious and complex relationshi
Greetings all. I have read many posts concerning similar use cases, but I am
still a little hazy on the best way to achieve what I need to do. Here is
the background:
2 million documents with multiple sections, some sections contain structured
data, some unstructured.
We parse the docs and place
First off Karl, thanks for your reply and your time.
karl wettin-3 wrote:
>
> One could also say you are classifying your data based on keywords in
> the text?
>
I probably didn't explain myself very well or, more specifically, provide a
good example. In my case, there really isn't any relatio
Greetings all. I am indexing a set of documents where I am extracting terms
and mapping them to a controlled vocabulary and then placing the matched
vocabulary in a keyword field. What I want to know is if there is a way to
store the original term location with the keyword field?
Example Text: "T