Sorting with ParallelReader
Hi Guys, does anybody know if it is possible for results to be sorted when using the ParallelReader?

Best Regards,
Ivan

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: sharing SearchIndexer
Mark Miller schrieb:
> simon litwan wrote:
>> hi all
>>
>> i tried to reuse the IndexSearcher among all of the threads that are doing
>> searches, as described in
>> http://wiki.apache.org/lucene-java/LuceneFAQ#head-48921635adf2c968f7936dc07d51dfb40d638b82
>>
>> this works fine, but our application does continuous indexing, so the index
>> is changing, and the IndexSearcher initialized at startup does not seem to
>> be notified to reload the index.
>>
>> is there a way to force the IndexSearcher to reload the index if the index
>> has changed?
>>
>> thanks in advance
>>
>> simon
>
> You want to reopen the Reader under the IndexSearcher, or open a new
> IndexSearcher.

I want to reopen the Reader under the IndexSearcher when the index has changed. is there a way to do so?

simon
Re: sharing SearchIndexer
Simon

There is nothing in Lucene to detect that an index has changed and automagically reopen an IndexReader. You can do the notification from your indexing thread, or every nnn mins, or whatever makes sense for your application. Note that IndexReader.reopen() does nothing if the index has not changed - see the javadocs.

--
Ian.

On Fri, Sep 26, 2008 at 8:41 AM, simon litwan <[EMAIL PROTECTED]> wrote:
> I want to reopen the Reader under the IndexSearcher when the index has
> changed. is there a way to do so?
>
> simon
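Ian's suggestion can be sketched roughly as follows. This is a minimal sketch, not Lucene API: it assumes IndexReader.reopen() (available in Lucene 2.4 / the 2.3-era trunk discussed here), and the SearcherHolder class, its method names, and the synchronization strategy are all illustrative.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

// Holds the shared searcher; the indexing thread (or a timer) calls
// refresh() after commits, and search threads call get().
public class SearcherHolder {
    private IndexSearcher searcher;

    public SearcherHolder(IndexSearcher initial) {
        this.searcher = initial;
    }

    // reopen() returns the same reader instance if the index is unchanged,
    // so this is cheap to call often.
    public synchronized void refresh() throws IOException {
        IndexReader oldReader = searcher.getIndexReader();
        IndexReader newReader = oldReader.reopen();
        if (newReader != oldReader) {
            searcher = new IndexSearcher(newReader);
            // Safe only once no in-flight searches still use the old reader;
            // a real application needs reference counting or a delayed close.
            oldReader.close();
        }
    }

    public synchronized IndexSearcher get() {
        return searcher;
    }
}
```

The key point is the identity check: reopen() only builds a new reader (sharing unchanged segments with the old one) when the index actually changed.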
Re: sharing SearchIndexer
Ian Lea schrieb:
> There is nothing in Lucene to detect that an index has changed and
> automagically reopen an IndexReader. You can do the notification from
> your indexing thread, or every nnn mins, or whatever makes sense for
> your application. Note that IndexReader.reopen() does nothing if the
> index has not changed - see the javadocs.

thanks very much. i will try it this way.

cheers

simon
Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)
Ari Miller wrote:
> According to
> https://issues.apache.org/jira/browse/LUCENE-1282?focusedCommentId=12596949#action_12596949
> (Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene), a workaround for
> the bug which causes the CorruptIndexException was put into the 2.3 branch
> and 2.4. However, we are still experiencing this issue (intermittent
> creation of a corrupt index) with a 2.3-SNAPSHOT from maven. Was the
> workaround put into 2.3-SNAPSHOT? Are there other issues which would cause
> the same error (detailed below)? We would prefer to avoid upgrading to JDK
> 6u10 (http://java.sun.com/javase/downloads/ea/6u10/6u10RC.jsp) until it is
> a final release, thus the use of the 2.3-SNAPSHOT dated July 22.

Can you look in the Lucene core JAR's manifest and report back what version information you see? (It should contain the timestamp that the JAR was built.)

I committed this workaround to the 2.3 branch on May 22. We don't currently have any automated build that pushes snapshot builds into maven on the 2.3 branch; my guess is that 2.3-SNAPSHOT in maven was from the last trunk build before 2.3 was released (which would not contain this fix), though I'm not sure why it has the timestamp July 22.

Mike
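For reference, one way to read the version information out of the JAR's manifest (a sketch; the JAR filename is illustrative, substitute your actual artifact):

```shell
# Print the manifest of the Lucene core JAR without extracting it.
# The Implementation-Version / build timestamp entries identify the build.
unzip -p lucene-core-2.3-SNAPSHOT.jar META-INF/MANIFEST.MF
```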
Re: Caused by: java.io.IOException: read past EOF on Slave
Can you describe the sequence of steps that your replication process goes through? Also, which filesystem is the index being accessed through?

Mike

rahul_k123 wrote:
> First of all, thanks to all the people who helped me in getting the Lucene
> replication setup working; right now it's live in our production :-)
>
> Everything is working fine, except that I am seeing some exceptions on the
> slaves. The following is the one occurring most often on the slaves:
>
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>   at java.lang.Thread.run(Thread.java:619)
> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index [data_dir/index]: [read past EOF]
>   at com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
>   ... 12 more
> Caused by: java.io.IOException: read past EOF
>   at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
>   at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>   at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
>   at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
>   at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
>   at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
>   at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>   at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>
> and the second one is:
>
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>   at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.IllegalArgumentException: attempt to access a deleted document
>   at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
>   at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>   at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>
> This one is on the master index. Any help is appreciated.
>
> Thanks.
> --
> View this message in context: http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Re: Sorting with ParallelReader
Sorry about the spam with this thread. We started using ParallelReader in our app and had a bug in the app with the sorts. I tested ParallelReader with a simple standalone app and discovered that sorting works perfectly, the same way as with the other Readers. Sorry once again.

Best Regards,
Ivan

Ivan Vasilev wrote:
> Hi Guys, does anybody know if it is possible for results to be sorted when
> using the ParallelReader?
Re: sharing SearchIndexer
Ian Lea schrieb:
> There is nothing in Lucene to detect that an index has changed and
> automagically reopen an IndexReader. You can do the notification from
> your indexing thread, or every nnn mins, or whatever makes sense for
> your application. Note that IndexReader.reopen() does nothing if the
> index has not changed - see the javadocs.

would it make sense for Lucene to introduce this as a feature?

Cheers

Michael
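What such a feature amounts to in application code today can be sketched as a background polling thread. This is a minimal sketch, not Lucene API: it assumes IndexReader.isCurrent() and reopen() are available (Lucene 2.4-era), and the class name, field names, and polling interval are illustrative.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;

// Daemon thread that periodically checks whether the on-disk index has
// changed and, if so, swaps in a reopened reader.
public class ReaderRefresher extends Thread {
    private IndexReader reader;

    public ReaderRefresher(IndexReader initial) {
        this.reader = initial;
        setDaemon(true);
    }

    public synchronized IndexReader getReader() {
        return reader;
    }

    public void run() {
        try {
            while (true) {
                synchronized (this) {
                    // isCurrent() is false once a writer has committed changes
                    if (!reader.isCurrent()) {
                        IndexReader newReader = reader.reopen();
                        if (newReader != reader) {
                            // Assumes no searches are still using the old
                            // reader; real code needs reference counting.
                            reader.close();
                            reader = newReader;
                        }
                    }
                }
                Thread.sleep(60 * 1000L); // poll once a minute (tune to taste)
            }
        } catch (IOException e) {
            // log and give up in a real application
        } catch (InterruptedException e) {
            // shutdown requested
        }
    }
}
```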
How to restore corrupted index
We have an application in which the index will be updated frequently. During development we found that the index files get corrupted, i.e. more than one .cfs file, plus files with other extensions (e.g. .frq, .fnm, .nrm), remain in the index directory. Is there any way to ensure this does not occur at all, or, if it happens, a way to recover the index data? It would be a great help if someone can advise.

Regards,

Chaula
Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)
On Sep 26, 2008, at 6:30 AM, Michael McCandless wrote:
> I committed this workaround to the 2.3 branch on May 22. We don't
> currently have any automated build that pushes snapshot builds into maven
> on the 2.3 branch; my guess is that 2.3-SNAPSHOT in maven was from the
> last trunk build before 2.3 was released (which would not contain this
> fix), though I'm not sure why it has the timestamp July 22.

Yes, a 2.3-SNAPSHOT would definitely be from before 2.3.0.
Re: 2.4 release candidate 2
Looks good.

On Sep 25, 2008, at 11:11 AM, Michael McCandless wrote:
> Hi,
>
> I just created the second release candidate for Lucene 2.4, here:
>
> http://people.apache.org/~mikemccand/staging-area/lucene2.4rc2
>
> These are the fixes since RC1:
>
> * Issues with CheckIndex (LUCENE-1402)
> * Removed new yet deprecated ctors for IndexWriter, and made
>   autoCommit=false the default for the new ctors (LUCENE-1401)
> * Cases where optimize throws an IOException because a BG merge had
>   problems, yet fails to include the root cause exception (LUCENE-1397)
> * Improved PhraseQuery.toString (LUCENE-1396)
> * NullPointerException in NearSpansUnordered.isPayloadAvailable (LUCENE-1404)
> * A bunch of small javadoc issues, unnecessary import lines, missing
>   copyright headers
>
> Please continue testing and reporting any issues you find! Thanks.
>
> Mike
Re: How to restore corrupted index
There is the CheckIndex tool included in the distribution for checking/fixing bad indexes, but it can't solve everything.

The bigger question is why it is happening to begin with. Can you describe your indexing process? How do you know the index is actually corrupted? Are you seeing exceptions when opening it?

-Grant

On Sep 26, 2008, at 6:49 AM, Chaula Ganatra wrote:
> We have an application in which the index will be updated frequently.
> During development we found that the index files get corrupted, i.e.
> more than one .cfs file, plus files with other extensions (e.g. .frq,
> .fnm, .nrm), remain in the index directory. Is there any way to ensure
> this does not occur at all, or, if it happens, a way to recover the
> index data?

--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
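For reference, the tool is typically run from the command line, roughly like this (a sketch; the JAR name and index path are illustrative, and the -fix behavior described applies to the 2.4-era tool):

```shell
# Diagnose a suspect index with the CheckIndex tool shipped in lucene-core.
java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex /path/to/index

# With -fix, CheckIndex removes segments it cannot read. This makes the
# index usable again but permanently loses the documents in those segments,
# so back the index directory up first.
java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex /path/to/index -fix
```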
Re: How to restore corrupted index
You say that there are multiple files, but you don't say if the index still works. Does it? If using the index gives you unexpected results, can you tell us what the failure modes are?

Best
Erick

On Fri, Sep 26, 2008 at 6:49 AM, Chaula Ganatra <[EMAIL PROTECTED]> wrote:
> We have an application in which index will be updated frequently.
>
> During development time, found that index files gets corrupted, i.e.
> more than one cfs files, some other extension files e.g. frq, fnm, nrm
> remain in the index directory.
>
> Is there any way that such issue does not occur at all or if it happens
> we can recover the index data again?
>
> Regards,
>
> Chaula
RE: How to restore corrupted index
I found one case where such multiple files remain: when we call writer.optimize() it throws an exception, and multiple files remain in the index dir.

After that, when we add a document to the index by calling writer.addDocument, it throws java.lang.NegativeArraySizeException.

Regards,
Chaula

-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]]
Sent: 26 September, 2008 6:02 PM
To: java-user@lucene.apache.org
Subject: Re: How to restore corrupted index

There is the CheckIndex tool included in the distribution for checking/fixing bad indexes, but it can't solve everything.

The bigger question is why it is happening to begin with. Can you describe your indexing process? How do you know the index is actually corrupted? Are you seeing exceptions when opening it?

-Grant
Re: How to restore corrupted index
Can you post the full stack trace in both cases?

Mike

Chaula Ganatra wrote:
> I found one case where such multiple files remain: when we call
> writer.optimize() it throws an exception, and multiple files remain in
> the index dir. After that, when we add a document to the index by
> calling writer.addDocument, it throws
> java.lang.NegativeArraySizeException.
Search warmup from Tomcat
Hello all,

I need to warm up the searcher object from my JSP pages. Currently I have a static object, and I frequently check whether the index got updated; if so, I close the searcher and re-open it. These JSP pages are invoked by the User. When the User performs a search operation, some searches are faster and some are slower, because the searcher object is being updated. How do I warm up my searcher object without the intervention of a User request?

Regards
Ganesh
RE: How to restore corrupted index
It was a Reader on the same index, which I did not close, that caused the exception in writer.optimize().

Chaula

-----Original Message-----
From: Michael McCandless [mailto:[EMAIL PROTECTED]]
Sent: 26 September, 2008 7:17 PM
To: java-user@lucene.apache.org
Subject: Re: How to restore corrupted index

Can you post the full stack trace in both cases?

Mike
Re: How to restore corrupted index
It's perfectly fine to have a reader open on an index while an IndexWriter runs optimize. Which version of Lucene are you using? And which OS & filesystem?

Mike

Chaula Ganatra wrote:
> It was a Reader on the same index, which I did not close, that caused
> the exception in writer.optimize().
Getting all found document ids from a search result
Hello you all,

is it somehow possible to get all document ids found by a search, not only 50 or 100?

If it is possible and someone knows it, please help me :-)

Thanks and best regards,

Gregor

TREND MICRO Deutschland GmbH, Lise-Meitner-Str. 4, D-85716 Unterschleissheim, Germany
Geschaeftsfuehrer: Raimund Genes, Amtsgericht Muenchen - HRB 114739

The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system.
Re: Search warmup from Tomcat
Ganesh:
> I need to warm up the searcher object from my JSP pages. Currently I have
> a static object, and I frequently check whether the index got updated; if
> so, I close the searcher and re-open it. These JSP pages are invoked by
> the User. When the User performs a search operation, some searches are
> faster and some are slower, because the searcher object is being updated.
>
> How do I warm up my searcher object without the intervention of a User
> request.

I would use a set of predefined queries which are frequently typed into your searcher. Typically, queries with stopwords are good to use in a warmup phase.

--
Asbjørn A. Fellinghaug
[EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
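A warmup along those lines can be sketched as follows. This is a minimal sketch against the Lucene 2.x API: the field name "contents", the canned query strings, and the Warmer class are all placeholders; in practice you would use queries mined from your own search logs, and run warm() on a freshly reopened searcher before swapping it in for user traffic.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

// Runs a handful of representative queries against a new searcher so that
// the OS file cache and Lucene's internal caches are populated before
// real users hit it.
public class Warmer {
    // Placeholder queries; replace with frequent queries from your logs.
    private static final String[] WARMUP_QUERIES = {
        "quick brown fox", "lucene index", "replication setup"
    };

    public static void warm(IndexSearcher searcher) throws Exception {
        QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
        for (int i = 0; i < WARMUP_QUERIES.length; i++) {
            Query q = parser.parse(WARMUP_QUERIES[i]);
            // Discard the hits; the side effect of running the search is
            // what warms the caches.
            searcher.search(q, null, 10);
        }
    }
}
```

Calling this from the background thread that reopens the searcher, rather than from a JSP, is what removes the warmup cost from the user's request.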
RE: How to restore corrupted index
Lucene 2.2.0, Windows XP

-----Original Message-----
From: Michael McCandless [mailto:[EMAIL PROTECTED]]
Sent: 26 September, 2008 8:00 PM
To: java-user@lucene.apache.org
Subject: Re: How to restore corrupted index

It's perfectly fine to have a reader open on an index while an IndexWriter runs optimize. Which version of Lucene are you using? And which OS & filesystem?

Mike
Re: Getting all found document ids from a search result
Gregor,

You could loop through the results or collect them using a custom HitCollector.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
> From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Friday, September 26, 2008 10:31:37 AM
> Subject: Getting all found document ids from a search result
>
> is it somehow possible to get all document ids found by a search, not only
> 50 or 100?
>
> If it is possible and someone knows it, please help me :-)
>
> Gregor
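The HitCollector approach can be sketched like this against the Lucene 2.x API (the class name and the use of a List are illustrative; for very large result sets a BitSet keyed by doc id would be more memory-friendly):

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.search.HitCollector;

// Collects every matching document id, with no paging limit. Lucene calls
// collect() once per matching document.
public class AllDocIdsCollector extends HitCollector {
    private final List docIds = new ArrayList();

    public void collect(int doc, float score) {
        docIds.add(new Integer(doc));
    }

    public List getDocIds() {
        return docIds;
    }
}

// Usage:
//   AllDocIdsCollector collector = new AllDocIdsCollector();
//   searcher.search(query, collector);
//   List allIds = collector.getDocIds();
```

Note that collect() receives ids in index order with no score sorting, and doc ids are only stable until the index changes, so use them promptly.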
Re: sharing SearchIndexer
I think somebody provided a patch (might have been a whole new IndexReader impl?) many moons ago (2005?), but it never attracted enough interest to get committed. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Michael Wechner <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Friday, September 26, 2008 6:49:24 AM > Subject: Re: sharing SearchIndexer > > Ian Lea schrieb: > > Simon > > > > There is nothing in lucene to detect that an index has changed and > > automagically reopen an IndexReader. > > > > You can do the notification from your indexing thread, or every nnn > > mins, or whatever makes sense for your application. Note that > > IndexReader.reopen() does nothing if the index has not changed - see > > the javadocs. > > would it make sense for Lucene to introduce this as a feature? > > Cheers > > Michael > > -- > > Ian.
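The reopen pattern discussed in this thread could be sketched roughly as below, using the Lucene 2.3+ API (IndexReader.reopen() arrived in 2.3). The holder class and its names are made up for illustration; real code would also need reference counting so searches still in flight on the old reader are not cut off:

```java
// Sketch: a shared searcher that is refreshed after the index changes.
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;

public class SearcherHolder {
    private IndexReader reader;
    private IndexSearcher searcher;

    public SearcherHolder(Directory dir) throws IOException {
        reader = IndexReader.open(dir);
        searcher = new IndexSearcher(reader);
    }

    public synchronized IndexSearcher getSearcher() {
        return searcher;
    }

    // Call this from the indexing thread (or a timer) after each commit.
    // IndexReader.reopen() returns the same instance when nothing has
    // changed, and a new reader (sharing unchanged segments) otherwise.
    public synchronized void maybeRefresh() throws IOException {
        IndexReader newReader = reader.reopen();
        if (newReader != reader) {
            // Closing the old reader here assumes no search is still
            // using it; production code should ref-count instead.
            reader.close();
            reader = newReader;
            searcher = new IndexSearcher(reader);
        }
    }
}
```

The call to maybeRefresh() is exactly the "notification from your indexing thread, or every nnn mins" that Ian describes.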
Re: Index time Document Boosting and Query Time Sorts
Cheers All 2008/9/24 Karl Wettin <[EMAIL PROTECTED]> > > 24 sep 2008 kl. 12.40 skrev Grant Ingersoll: > > One side note based on your example, below: index-time boosting does not >> have much granularity (only 255 values); in other words, there is a loss of >> precision. You therefore >> want to make sure your boosts are different enough that you can >> distinguish between the two. Maybe 1/(2*depth) or something like that. You >> can alter how these 255 values are encoded, but that is fairly advanced >> stuff. >> > > Just a note: the granularity is 255 only if you turn off length > normalization; otherwise it's something like 25. > > karl -- d i n o k o r a h Tel: +44 7956 66 52 83 --- 51°21'50.5902"N 0°6'11.8116"W
Re: Caused by: java.io.IOException: read past EOF on Slave
Michael: I just started testing 2.4rc2 running inside OJVM. I found a similar stack trace during indexing: IW 3 [Root Thread]: flush: segment=_3 docStoreSegment=_3 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=2 numBufDelTerms=2 IW 3 [Root Thread]: index before flush _1:C2->_1 _2:C2->_2 IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2 IW 3 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=264 docs/MB=7,943.758 new/old=0.237% IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted docIDs and 0 deleted queries on 3 segments. IW 3 [Root Thread]: hit exception flushing deletes Exception in thread "Root Thread" java.io.IOException: read past EOF at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java) at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java) at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java) at org.apache.lucene.index.TermBuffer.read(TermBuffer.java) at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java) at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java) at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java) at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java) at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java) at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java) at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java) at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918) at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java) at org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308) I'll reinstall with full debug info to see all line numbers in the Lucene 
java code. Is there a list of semantic changes in the BufferedIndexInput code? I mean, does it do sequential or random I/O, for example? But anyway, I just compiled with the latest code and ran my test suites; I'll investigate the problem a bit more. Best regards, Marcelo. On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > Can you describe the sequence of steps that your replication process goes > through? > > Also, which filesystem is the index being accessed through? > > Mike > > rahul_k123 wrote: > >> >> First of all, thanks to all the people who helped me in getting the lucene >> replication setup working and right now its live in our production :-) >> >> Everything working fine, except that i am seeing some exceptions on >> slaves. >> >> The following is the one which is occuring more often on slaves >> >> at >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) >> at >> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >> at java.util.concurrent.FutureTask.run(FutureTask.java:138) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) >> at java.lang.Thread.run(Thread.java:619) >> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index >> [data_dir/index]: [read past EOF] >> at >> >> com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964) >> ... 
12 more >> Caused by: java.io.IOException: read past EOF >> at >> >> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146) >> at >> >> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38) >> at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66) >> at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89) >> at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147) >> at >> org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659) >> at >> >> org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257) >> at >> org.apache.lucene.index.IndexReader.document(IndexReader.java:525) >> >> and the second one is >> >> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) >> at >> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >> at java.util.concurrent.FutureTask.run(FutureTask.java:138) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) >> at java.lang.Thread.run(Thread.java:619) >> Caused by: java.lang.IllegalArgumentEx
Re: CorruptIndexException workaround in 2.3-SNAPSHOT? (Attn: Michael McCandless)
Confirmed that the manifest date on the 2.3-SNAPSHOT is much older than the file date: Implementation-Version: 2.3-SNAPSHOT 613047 - hudson - 2008-01-18 04:11:25 Is there an available SNAPSHOT of the 2.3 branch with this fix? I've downloaded the 2.4 SNAPSHOT to see if this will resolve the corruption issue -- based on the date in the Manifest, I'm assuming this is a slightly later version than 2.4rc2. 2.4 SNAPSHOT Implementation-Version: 2.4-SNAPSHOT 699151 - 2008-09-26 02:10:14 2.4 RC2 Implementation-Version: 2.4.0-rc2 698976 - 2008-09-25 10:12:47 Best, Ari On Fri, Sep 26, 2008 at 3:30 AM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > Ari Miller wrote: > >> According to >> https://issues.apache.org/jira/browse/LUCENE-1282?focusedCommentId=12596949#action_12596949 >> (Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene), a workaround >> for the bug which causes the CorruptIndexException was put into the >> 2.3 branch and 2.4. >> However, we are still experiencing this issue (intermittent creation >> of a corrupt index) with a 2.3-SNAPSHOT from maven. >> Was the workaround put into 2.3-SNAPSHOT? Are there other issues >> which would cause the same error (detailed below)? >> >> We would prefer to avoid upgrading to JDK 6u10 >> (http://java.sun.com/javase/downloads/ea/6u10/6u10RC.jsp) until it is >> a final release, thus the use of the 2.3-SNAPSHOT dated July 22. > > Can you look in the Lucene core JAR's manifest and report back what version > information you see? (It should contain the timestamp that the JAR was > built). > > I committed this workaround to the 2.3 branch on May 22. > > We don't currently have any automated build that pushes snapshot builds into > maven on the 2.3 branch; my guess is that 2.3-SNAPSHOT in maven was from the > last trunk build before 2.3 was released (which would not contain this fix), > though I'm not sure why it has timestamp July 22. 
> > Mike
Re: How to restore corrupted index
Mike, As part of my goal of trying to use Lucene as the primary storage mechanism (perhaps not the best idea), what do you think is the best way to handle storing data in Lucene and preventing corrupted data, the way something like an SQL database handles corrupted data? Or is there simply no good way to do this? Jason On Fri, Sep 26, 2008 at 10:30 AM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > It's perfectly fine to have a reader open on an index, while an IndexWriter > runs optimize. > > Which version of Lucene are you using? And which OS & filesystem? > > Mike > [earlier thread history snipped]
How to get the index file entries (IndexReader) in different order?
Hi all, The index has millions of entries. I need to display the index content in a JTable with columns (term, field, freq), and the user can choose the sorting order: (field, freq, term), (freq, term, field), etc. What is the best way to manage the index sorting? I only need some entries at a time -- the ones displayed in the table. At the moment I sort the needed entries in a TreeMap while reading the whole index file, which is quite slow. Any idea would be welcome. Best wishes to all, JCD -- View this message in context: http://www.nabble.com/How-to-get-the-index-file-entries-%28IndexReader%29--in-different-order--tp19691563p19691563.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.
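One way to avoid reading the whole index into a TreeMap is to stream terms from IndexReader.terms(), which already yields them sorted by field and then term text, and only materialize the page currently shown in the table. A rough Lucene 2.x sketch follows; the class and method names are illustrative, and other sort orders (e.g. by freq) would still need an external sort or a precomputed view:

```java
// Sketch: page through (field, term, docFreq) rows straight from the
// term dictionary without loading it all into memory.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

public class TermTable {
    // Return rows [offset, offset+count) as "field \t text \t docFreq",
    // in the dictionary's natural order (field, then term text).
    public static List page(IndexReader reader, int offset, int count)
            throws IOException {
        List rows = new ArrayList();
        TermEnum te = reader.terms();
        try {
            int row = 0;
            while (rows.size() < count && te.next()) {
                if (row++ >= offset) {
                    Term t = te.term();
                    rows.add(t.field() + "\t" + t.text() + "\t" + te.docFreq());
                }
            }
        } finally {
            te.close();
        }
        return rows;
    }
}
```

For deep pages this still scans from the start; reader.terms(new Term(field, text)) can seek directly to a term if the UI remembers the last term of the previous page.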
Re: How to restore corrupted index
Corrupted data in what sense? EG if you don't trust your IO system to store data properly? Mike Jason Rutherglen wrote: Mike, As part of my goal of trying to use Lucene as primary storage mechanism (perhaps not the best idea), what do you think is the best way to handle storing data in Lucene and preventing corrupted data the way something like an SQL database handles corrupted data? Or is there simply no good way to do this? Jason [earlier thread history snipped]
Re: Caused by: java.io.IOException: read past EOF on Slave
The following are the steps: 1. We do indexing every 5 minutes on the master, and when indexing is done a snapshot is taken. 2. On the slave we have a cronjob which runs snappuller every 3 minutes to check for new snapshots, and installs a snapshot on the slave if it finds a new one. 3. Master and slave are continuously serving search requests. (I am not using SOLR for indexing.) The file system is ext3. Thanks in advance. Michael McCandless-2 wrote: > > Can you describe the sequence of steps that your replication process > goes through? > > Also, which filesystem is the index being accessed through? > > Mike > > rahul_k123 wrote: > >> First of all, thanks to all the people who helped me in getting the >> lucene replication setup working and right now its live in our production :-) >> >> Everything working fine, except that i am seeing some exceptions on >> slaves. >> [stack traces snipped; quoted in full earlier in this thread] -- View this message in context: http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19691799.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Re: How to restore corrupted index
OK. I really need to see those stack traces to better understand this issue. Also, does the issue still happen on 2.3, or 2.4 RC2? Mike Chaula Ganatra wrote: Lucene 2.2.0, windows XP -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: 26 September, 2008 8:00 PM To: java-user@lucene.apache.org Subject: Re: How to restore corrupted index It's perfectly fine to have a reader open on an index, while an IndexWriter runs optimize. Which version of Lucene are you using? And which OS & filesystem? Mike [earlier thread history snipped]
Re: How to restore corrupted index
I'm thinking more in terms of the CRC32 checks performed on database pages. Is there a way to incorporate this technique in Lucene in a way that does not affect performance too much? The question is: when is the CRC32 check performed, and to which files is it applied, if any? On Fri, Sep 26, 2008 at 12:13 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > Corrupted data in what sense? > > EG if you don't trust your IO system to store data properly? > > Mike > [earlier thread history snipped]
Re: Caused by: java.io.IOException: read past EOF on Slave
This one looks spooky! Is it easily repeated? If you could print out which 2 terms you had tried to delete, and then zip up the index just before deleting those docs (after closing the writer) and send it to me, I can try to understand what's wrong with the index. It looks as if the *.tis file for one of the segments is truncated. If you capture the series of add/update/delete documents, can you get a standalone Java test to show this? Does this test create an entirely new index? We did change the index format in 2.4 to use "true" UTF-8 encoding for all text content; not sure that this applies here (to BufferedIndexInput it's all bytes), but it may. BufferedIndexInput in general can do random IO, especially when reading the term dict file (*.tis). Mike Marcelo Ochoa wrote: Michael: I just started testing 2.4rc2 running inside OJVM. I found a similar stack trace during indexing: [flush log and "read past EOF" stack trace snipped; quoted in full earlier in this thread] I'll reinstall with full debug info to see all line numbers in the Lucene java code. Is there a list of semantic changes in the BufferedIndexInput code? I mean, does it do sequential or random I/O, for example? But anyway, I just compiled with the latest code and ran my test suites; I'll investigate the problem a bit more. Best regards, Marcelo. On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless <[EMAIL PROTECTED]> wrote: Can you describe the sequence of steps that your replication process goes through? Also, which filesystem is the index being accessed through? Mike rahul_k123 wrote: [original message and stack traces snipped]
Re: Caused by: java.io.IOException: read past EOF on Slave
Which version of Lucene is this? Looks like 2.3.x -- what's the x? Can you run your app server with assertions enabled for org.apache.lucene.*? It may catch something sooner. Can you try running CheckIndex after the snapshot is produced, just to see if there is any corruption? Your first exception (on the slave) seems like the *.fdx file of that one segment is somehow truncated, or, you are passing an out-of-bounds document number to IndexReader.document. Your 2nd one (on the master) looks like an invalid (deleted) document number is being passed to IndexReader.document. What is the context of these IndexReader.document(...) calls? How are you getting the doc numbers that you're passing to them? In both cases, an invalid doc number would explain your exception. Are you doing any search caching, where you cache hits and then much later try to load the documents for each hit, or, something? More questions below... rahul_k123 wrote: The following are steps.. 1.We do indexing every 5 minutes on master and when indexing is done a snapshot is taken The IndexWriter is definitely closed before the snapshot is taken? Are you creating a new index, or, just adding to an existing one? 2. On slave we have a cronjob which runs snappuller every 3 minutes to check for new snapshots and installs it on slave if it finds new one Sounds OK. Does this entail a restart of the reader after the snapshot is installed? (I am not using SOLR for indexing ) The file system is ext3 OK. Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
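Mike's two suggestions above (assertions and CheckIndex) look roughly like this from the command line. This is a sketch only: the jar name, index path, and server class are placeholder assumptions, not taken from the thread.

```shell
# Run CheckIndex against the snapshot before installing it on a slave;
# it walks each segment and reports any corruption it finds.
java -cp lucene-core-2.3.1.jar org.apache.lucene.index.CheckIndex /path/to/index

# Run the app server JVM with assertions enabled for Lucene classes only
# (the "..." suffix enables assertions for the package and all subpackages).
java -ea:org.apache.lucene... -cp app.jar com.example.Server
```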
Re: How to restore corrupted index
OK, it does sound like you're primarily protecting against an untrustworthy storage system (or, maybe, Lucene bugs ;). Probably the best option is to do this fully externally, i.e., compute the digest yourself, store it away in a separate Lucene field, then test the digest when loading the field. Mike Jason Rutherglen wrote: I'm thinking more in terms of CRC32 checks performed on database pages. Is there a way to incorporate this technique in a way that does not affect performance too much in Lucene? The question is, when is the CRC32 check performed, and to which files is it applied, if any? On Fri, Sep 26, 2008 at 12:13 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: Corrupted data in what sense? E.g. if you don't trust your IO system to store data properly? Mike Jason Rutherglen wrote: Mike, As part of my goal of trying to use Lucene as a primary storage mechanism (perhaps not the best idea), what do you think is the best way to handle storing data in Lucene and preventing corrupted data the way something like an SQL database does? Or is there simply no good way to do this? Jason On Fri, Sep 26, 2008 at 10:30 AM, Michael McCandless <[EMAIL PROTECTED]> wrote: It's perfectly fine to have a reader open on an index while an IndexWriter runs optimize. Which version of Lucene are you using? And which OS & filesystem? Mike Chaula Ganatra wrote: It was the Reader on the same index, which I did not close, so it gave an exception in writer.optimize() Chaula -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: 26 September, 2008 7:17 PM To: java-user@lucene.apache.org Subject: Re: How to restore corrupted index Can you post the full stack trace in both cases? Mike Chaula Ganatra wrote: I found one case where such multiple files remain: when we call writer.optimize() it throws an exception and multiple files remain in the index dir.
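A minimal sketch of the externally computed digest Mike describes, using java.util.zip.CRC32 from the JDK. The class and method names here are hypothetical; in Lucene you would store the digest value as a separate stored field next to the payload and re-check it after IndexReader.document() returns the document.

```java
import java.util.zip.CRC32;

public class DigestedPayload {

    // Compute a CRC32 over the payload bytes; store this value alongside
    // the payload (e.g. in a separate stored Lucene field).
    static long digest(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // On load, recompute and compare; a mismatch means the stored bytes
    // were corrupted somewhere between write and read.
    static boolean verify(byte[] payload, long storedDigest) {
        return digest(payload) == storedDigest;
    }

    public static void main(String[] args) {
        byte[] data = "some stored document bytes".getBytes();
        long d = digest(data);
        System.out.println(verify(data, d));   // true: payload intact
        data[0] ^= 0xFF;                       // simulate a single-byte corruption
        System.out.println(verify(data, d));   // false: corruption detected
    }
}
```

CRC32 is cheap but only detects corruption; for tamper resistance you would swap in a cryptographic hash such as SHA-256 via java.security.MessageDigest.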
Once the index is in that state with the multiple files, when we add a document by calling writer.addDocument it throws java.lang.NegativeArraySizeException Regards, Chaula -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: 26 September, 2008 6:02 PM To: java-user@lucene.apache.org Subject: Re: How to restore corrupted index There is the CheckIndex tool included in the distribution for checking/fixing bad indexes, but it can't solve everything. The bigger question is why it is happening to begin with. Can you describe your indexing process? How do you know the index is actually corrupted? Are you seeing exceptions when opening it? -Grant On Sep 26, 2008, at 6:49 AM, Chaula Ganatra wrote: We have an application in which the index will be updated frequently. During development we found that the index files get corrupted, i.e. more than one cfs file, plus files with some other extensions, e.g. frq, fnm, nrm, remain in the index directory. Is there any way to prevent this from happening at all, or, if it happens, to recover the index data? It would be a great help if someone can.
Regards, Chaula -- Grant Ingersoll http://www.lucidimagination.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
ANNOUNCE: Application Period Opens for Travel Assistance to ApacheCon US 2008
NOTE: This is a cross-posted announcement to all Lucene sub-projects; please confine any replies to [EMAIL PROTECTED] - The Travel Assistance Committee is taking applications from those wanting to attend ApacheCon US 2008, held between the 3rd and 7th of November 2008 in New Orleans. The Travel Assistance Committee is looking for people who would like to attend ApacheCon US 2008 but need some financial support in order to get there. There are VERY few places available and the criteria are high; that aside, applications are open to all open source developers who feel that their attendance would benefit themselves, their project(s), the ASF, and open source in general. Financial assistance is available for flights, accommodation and entrance fees, either in full or in part, depending on circumstances. It is intended that all our ApacheCon events will be covered, so it may be prudent for those in Europe and/or Asia to wait until an event closer to them comes up. You are all welcome to apply for ApacheCon US, of course, but there must be compelling reasons for you to attend an event further from your home location for your application to be considered above those closer to the event location. More information can be found on the main Apache website at http://www.apache.org/travel/index.html - where you will also find a link to the application form and details for submitting. Time is very tight for this event, so applications are open now and will close on the 2nd of October 2008, to give enough time for travel arrangements to be made. Good luck to all those who apply. Regards, The Travel Assistance Committee - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Caused by: java.io.IOException: read past EOF on Slave
Mike: Actually there are more issues at first glance with the OJVMDirectory integration. Note: I am creating an index with two simple documents:
INFO: Performing: SELECT /*+ DYNAMIC_SAMPLING(0) RULE NOCACHE(T1) */ T1.rowid,F1,extractValue(F2,'/emp/name/text()') "name",extractValue(F2,'/emp/@id') "id" FROM LUCENE.T1 for update nowait
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
FINE: Document indexed,tokenized indexed,tokenized indexed,tokenized>
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
FINE: Document indexed,tokenized indexed,tokenized indexed,tokenized>
IW 10 [Root Thread]: flush: segment=_0 docStoreSegment=_0 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=2 numBufDelTerms=0
IW 10 [Root Thread]: index before flush
IW 10 [Root Thread]: DW: flush postings as segment _0 numDocs=2
IW 10 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=166 docs/MB=12,633.446 new/old=0.149%
IFD [Root Thread]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IW 10 [Root Thread]: LMP: findMerges: 1 segments
IW 10 [Root Thread]: LMP: level -1.0 to 2.2741578: 1 segments
IW 10 [Root Thread]: CMS: now merge
IW 10 [Root Thread]: CMS: index: _0:C2->_0
IW 10 [Root Thread]: CMS: no more merges pending; now return
IW 10 [Root Thread]: now flush at close
IW 10 [Root Thread]: flush: segment=null docStoreSegment=_0 docStoreOffset=2 flushDocs=false flushDeletes=true flushDocStores=true numDocs=0 numBufDelTerms=0
IW 10 [Root Thread]: index before flush _0:C2->_0
IW 10 [Root Thread]: flush shared docStore segment _0
IW 10 [Root Thread]: DW: closeDocStore: 2 files to flush to segment _0 numDocs=2
IW 10 [Root Thread]: CMS: now merge
IW 10 [Root Thread]: CMS: index: _0:C2->_0
IW 10 [Root Thread]: CMS: no more merges pending; now return
IW 10 [Root Thread]: now call final commit()
IW 10 [Root Thread]: startCommit(): start sizeInBytes=0
IW 10 [Root Thread]: startCommit index=_0:C2->_0 changeCount=2
IW 10 [Root Thread]: now sync _0.fnm
IW 10 [Root Thread]: now sync _0.frq
IW 10 [Root Thread]: now sync _0.prx
IW 10 [Root Thread]: now sync _0.tis
IW 10 [Root Thread]: now sync _0.tii
IW 10 [Root Thread]: now sync _0.nrm
IW 10 [Root Thread]: now sync _0.fdx
IW 10 [Root Thread]: now sync _0.fdt
IW 10 [Root Thread]: done all syncs
IW 10 [Root Thread]: commit: pendingCommit != null
IFD [Root Thread]: now checkpoint "segments_2" [1 segments ; isCommit = true]
IFD [Root Thread]: deleteCommits: now decRef commit "segments_1"
IFD [Root Thread]: delete "segments_1"
IW 10 [Root Thread]: commit: done
IW 10 [Root Thread]: at close: _0:C2->_0
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIIndexCreate
FINER: RETURN 0
Index created. And when I try to read the index I get:
INFO: Analyzer: [EMAIL PROTECTED]
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
INFO: qryStr: DESC(name:ravi)
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
INFO: storing cachingFilter: -1378376940 and searcher: 781713581 qryStr: DESC(name:ravi)
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex getSort
INFO: using sort: ,
Exception in thread "Root Thread" java.lang.IndexOutOfBoundsException: Index: 6, Size: 4
	at java.util.ArrayList.RangeCheck(ArrayList.java)
	at java.util.ArrayList.get(ArrayList.java)
	at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java)
	at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java)
	at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
	at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
	at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
	at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
	at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
	at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java)
	at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java)
	at org.apache.lucene.search.Similarity.idf(Similarity.java)
	at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java)
	at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java)
	at org.apache.lucene.search.Query.weight(Query.java)
	at org.apache.lucene.search.Hits.<init>(Hits.java:85)
	at org.apache.lucene.search.Searcher.search(Searcher.java)
	at org.apache.lucene.indexer.LuceneDomainIndex.ODCIStart(LuceneDomainIndex.java)
Which definitely means that something is not being saved correctly in the OJVM directory BLOB storage :( These are my files:
SQL> select file_size,name from it1$t;
 FILE_SIZE NAME
---------- ------------
        10 parameters
         1 updateCount
        28 segments_1
        20 segments.gen
         8 _0.frq
         8 _0.prx
       103 _0.tis
        35 _0.tii
        12 _0.nrm
        22 _0.f
ApacheCon US promo
Cross-posting... Just wanted to let everyone know that there will be a number of Lucene/ Solr/Mahout/Tika related talks, training sessions, and Birds of a Feather (BOF) gatherings at ApacheCon New Orleans this fall. Details: When: November 3-7 Where: Sheraton, New Orleans, USA URL: http://us.apachecon.com/c/acus2008/ Lucene: Advanced Indexing Techniques by Michael Busch: http://us.apachecon.com/c/acus2008/sessions/7 Lucene Boot Camp (2 day hands-on training by me): http://us.apachecon.com/c/acus2008/sessions/69 Solr: Solr out of the Box by Chris Hostetter: http://us.apachecon.com/c/acus2008/sessions/9 Beyond the Box by Hoss: http://us.apachecon.com/c/acus2008/sessions/10 Solr Boot Camp (1 day hands-on training by Erik Hatcher): http://us.apachecon.com/c/acus2008/sessions/91 Mahout: Intro to Mahout and Machine Learning (by me): http://us.apachecon.com/c/acus2008/sessions/11 Tika: Content Analysis for ECM with Apache Tika by Paolo Mottadelli : http://us.apachecon.com/c/acus2008/sessions/12 There's also one more Lucene session that is TBD, but it will be on that same Wednesday as everything else. Chances are it will be an intro to Lucene type talk. BOFs: http://wiki.apache.org/apachecon/BirdsOfaFeatherUs08 Cheers, Grant - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Please help to interpret Lucene Boost results
I am baffled by the results of the following queries. Could it have something to do with the boost factor? All of these queries are performed in the same environment with the same crawled index/data. A. query1 = +(content:(Pepsi)) resulted in 228 hits. B. query2 = +(content:(Pepsi) ) +(host:(ca)^10 ) resulted in 398 hits. C. query3 = +(host:(ca)^10 ) resulted in 212 hits. Two questions (strictly just one): 1. Since query1 (any content containing Pepsi) yielded 228 hits, how could the more limiting query2 (all docs that have Pepsi in them AND a domain of ca) yield more hits (398)? 2. Since there are only 212 hits for Canadian domains, how can query2 return 398 hits? Thanks for any pointers! Cheers, student_t -- View this message in context: http://www.nabble.com/Please-help-to-interpret-Lucene-Boost-results-tp19695313p19695313.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Please help to interpret Lucene Boost results
On Freitag, 26. September 2008, student_t wrote: > A. query1 = +(content:(Pepsi)) I guess this is the string input you use for your queries, isn't it? It's more helpful to look at the toString() output of the parsed query to see how Lucene interpreted your input. Regards Daniel -- http://www.danielnaber.de - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Please help to interpret Lucene Boost results
That certainly doesn't look right. What analyzers are you using at index and query time? Two things that will help track down what's really happening: 1> query.toString() is your friend. 2> get a copy of the excellent Luke tool and have it do its explain magic on your query. Watch that the analyzer you choose when querying is what you expect If neither of those things sheds any light on the problem, let us know what you find Best Erick On Fri, Sep 26, 2008 at 3:55 PM, student_t <[EMAIL PROTECTED]> wrote: > > I am baffled by the results of the following queries. Can it be something > to > do with the boosting factor? All of these queries are performed in the same > environment with the same crawled index/data. > > A. query1 = +(content:(Pepsi)) resulted in 228 > hits. > B. query2 = +(content:(Pepsi) ) +(host:(ca)^10 ) resulted in 398 hits. > C. query3 = +(host:(ca)^10 )resulted in 212 > hits. > > Two questions (strictly just one): > 1. query1 of any content contains Pepsi yielded 228 hits, how could a more > limiting query2 (give me all docs that have Pepsi in it with a domain of > ca) > yield more hits (398)? > 2. Since there are 212 hits of Canadian domains, how can query2 return 398 > hits? > > Thanks for any pointers! > Cheers, > student_t > > > -- > View this message in context: > http://www.nabble.com/Please-help-to-interpret-Lucene-Boost-results-tp19695313p19695313.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >
Re: Caused by: java.io.IOException: read past EOF on Slave
I will try the other stuff and will let you know. This is how we do the search: we get the Hits in one call, and we make another call to get the data from Lucene. My guess is that when it gets the matching Hits it is getting them from the master, and when it tries to retrieve the actual data it is hitting the slave, which doesn't have the updated data yet. I will test this scenario and let you know. Some answers below too. Michael McCandless-2 wrote: > > > Which version of Lucene is this? Looks like 2.3.x -- what's the x? > > 2.3.1 > > Can you run your app server with assertions enabled for > org.apache.lucene.*? It may catch something sooner. > > > Can you try running CheckIndex after the snapshot is produced, just to > see if there is any corruption? > > Your first exception (on the slave) seems like the *.fdx file of that > one segment is somehow truncated, or, you are passing an out-of-bounds > document number to IndexReader.document. > > Your 2nd one (on the master) looks like an invalid (deleted) document > number is being passed to IndexReader.document. > > What is the context of these IndexReader.document(...) calls? How are > you getting the doc numbers that you're passing to them? In both > cases, an invalid doc number would explain your exception. Are you > doing any search caching, where you cache hits and then much later try > to load the documents for each hit, or, something? > > More questions below... > > rahul_k123 wrote: > >> >> The following are the steps.. >> >> 1. We do indexing every 5 minutes on the master, and when indexing is done a >> snapshot is taken > > The IndexWriter is definitely closed before the snapshot is taken? > > Are you creating a new index, or, just adding to an existing one? Adding > to the existing one > >> 2. On the slave we have a cronjob which runs snappuller every 3 minutes >> to check >> for new snapshots and installs any new one it finds on the slave > > Sounds OK.
Does this entail a restart of the reader after the > snapshot is installed? Yes: before reading we check > IndexReader.isCurrent(). > >> (I am not using SOLR for indexing ) >> >> >> The file system is ext3 > > OK. > > Mike > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19697398.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
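The "check isCurrent(), then reopen" pattern rahul describes can be sketched generically. This is plain Java with a hypothetical `Index` class standing in for a reader bound to one index version; in Lucene 2.3 the real calls would be IndexReader.isCurrent() and IndexReader.reopen() (reopen() returns the same reader if nothing changed).

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an index reader stamped with the version it saw.
class Index {
    final long version;
    Index(long version) { this.version = version; }
    boolean isCurrent(long latestVersion) { return version == latestVersion; }
}

public class ReaderHolder {
    // All search threads share this one reference.
    private final AtomicReference<Index> current =
            new AtomicReference<Index>(new Index(0));

    // Before searching: if the on-disk version moved on, swap in a fresh reader.
    Index acquire(long latestVersion) {
        Index r = current.get();
        if (!r.isCurrent(latestVersion)) {
            Index fresh = new Index(latestVersion);  // reopen() in real Lucene
            current.compareAndSet(r, fresh);         // lose the race harmlessly
            r = current.get();
        }
        return r;
    }

    public static void main(String[] args) {
        ReaderHolder h = new ReaderHolder();
        System.out.println(h.acquire(0).version);  // 0: index unchanged, same reader
        System.out.println(h.acquire(3).version);  // 3: index advanced, reopened
    }
}
```

In real code the retired reader must also be closed once in-flight searches drain; that bookkeeping is omitted here.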
Re: Please help to interpret Lucene Boost results
Hi Dan, Thanks for your suggestion. I will definitely check that out. -student_t Daniel Naber-10 wrote: > > On Freitag, 26. September 2008, student_t wrote: > >> A. query1 = +(content:(Pepsi)) > > I guess this is the string input you use for your queries, isn't it? It's > more helpful to look at the toString() output of the parsed query to see > how Lucene interpreted your input. > > Regards > Daniel > > -- > http://www.danielnaber.de > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Please-help-to-interpret-Lucene-Boost-results-tp19695313p19697619.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Please help to interpret Lucene Boost results
Hi Erick, Thanks a bunch for your pointers. I will need to find out the analyzers used at index and query time. But is it critical to use the same analyzers at those two times? I had tested with lucli on some of my local segment data and the result sets appeared reasonable. Is Luke part of Lucene contrib? I recall there is a GUI that lets you view the indices. Would you please elaborate? Thanks again! student_t Erick Erickson wrote: > > That certainly doesn't look right. What analyzers are you using at index > and query time? > > Two things that will help track down what's really happening: > > 1> query.toString() is your friend. > 2> get a copy of the excellent Luke tool and have it do its explain magic > on > your query. Watch that the analyzer you choose when querying is what you > expect > > If neither of those things sheds any light on the problem, let us know > what > you find > > Best > Erick > > On Fri, Sep 26, 2008 at 3:55 PM, student_t <[EMAIL PROTECTED]> wrote: > >> >> I am baffled by the results of the following queries. Can it be something >> to >> do with the boosting factor? All of these queries are performed in the >> same >> environment with the same crawled index/data. >> >> A. query1 = +(content:(Pepsi)) resulted in >> 228 >> hits. >> B. query2 = +(content:(Pepsi) ) +(host:(ca)^10 ) resulted in 398 >> hits. >> C. query3 = +(host:(ca)^10 )resulted in >> 212 >> hits. >> >> Two questions (strictly just one): >> 1. query1 of any content contains Pepsi yielded 228 hits, how could a >> more >> limiting query2 (give me all docs that have Pepsi in it with a domain of >> ca) >> yield more hits (398)? >> 2. Since there are 212 hits of Canadian domains, how can query2 return >> 398 >> hits? >> >> Thanks for any pointers!
>> Cheers, >> student_t >> >> >> -- >> View this message in context: >> http://www.nabble.com/Please-help-to-interpret-Lucene-Boost-results-tp19695313p19695313.html >> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >> >> >> - >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> >> > > -- View this message in context: http://www.nabble.com/Please-help-to-interpret-Lucene-Boost-results-tp19695313p19697605.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Search warmup from Tomcat
In my opinion: set up two IndexSearchers, say #1 holding the old index data to serve queries while #2 is updating, then switch queries over to #2 once the update is complete. 2008/9/26 Asbjørn A. Fellinghaug <[EMAIL PROTECTED]> > Ganesh: > > Hello all, > > > > I need to warm up the searcher object from my JSP pages. Currently I have a > > static object, and I frequently check whether the index got updated; > > if so, I close the indexer and re-open it. These JSP pages > > are invoked by the user. When the user performs a search operation, some are > > faster and some are slower. This is because the searcher object is being > > updated. > > > > How do I warm up my searcher object without the intervention of a user > > request? > > I would have used a set of predefined queries which are frequently typed > into your searcher. Typically, queries with stopwords are good to use in a > warmup phase. > > -- > Asbjørn A. Fellinghaug > [EMAIL PROTECTED] > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > -- Sorry for my English!! 明 Please help me correct my English expression and error in syntax
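The two-searcher scheme suggested above can be sketched as an atomic swap. This is plain Java; `Searcher` here is a hypothetical placeholder for an IndexSearcher bound to one index snapshot, so the warmup queries and old-searcher cleanup are only indicated in comments.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical placeholder for an IndexSearcher over one index snapshot.
class Searcher {
    final String snapshot;
    Searcher(String snapshot) { this.snapshot = snapshot; }
    String search(String q) { return "hit for '" + q + "' in " + snapshot; }
}

public class DoubleBufferedSearch {
    // User queries always go to whatever this reference holds (searcher #1).
    private final AtomicReference<Searcher> active =
            new AtomicReference<Searcher>(new Searcher("snapshot-old"));

    String query(String q) { return active.get().search(q); }

    // Build searcher #2 against the new snapshot off to the side, warm it
    // with predefined queries, then flip user traffic over in one step.
    void refresh(String newSnapshot) {
        Searcher warmed = new Searcher(newSnapshot);
        warmed.search("warmup query");  // run warmup queries BEFORE the swap
        active.set(warmed);             // atomic cut-over; users never see the
                                        // cold searcher (close the old one later)
    }

    public static void main(String[] args) {
        DoubleBufferedSearch s = new DoubleBufferedSearch();
        System.out.println(s.query("pepsi"));  // served by snapshot-old
        s.refresh("snapshot-new");
        System.out.println(s.query("pepsi"));  // served by snapshot-new
    }
}
```

The design point is that warming happens on the standby searcher, so user requests never pay the first-query cost after an index update.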