Problem with Add method
This code generates an error; kindly tell me what parameters should be used with the constructors.

Document doc = new Document();
doc.add(Field.Keyword("id", keywords[i]));
doc.add(Field.UnIndexed("country", unindexed[i]));
doc.add(Field.UnStored("contents", unstored[i]));
doc.add(Field.Text("city", text[i]));
writer.addDocument(doc);
Deprecated API
I am studying LIA, but there is a problem with the code. When I run it I get errors related to the use of deprecated APIs. Kindly suggest the right APIs, and also how to handle this situation with other code.

package lia.indexing;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import junit.framework.TestCase;
import java.io.IOException;

/**
 *
 */
public abstract class BaseIndexingTestCase extends TestCase {
  protected String[] keywords = {"1", "2"};
  protected String[] unindexed = {"Netherlands", "Italy"};
  protected String[] unstored = {"Amsterdam has lots of bridges", "Venice has lots of canals"};
  protected String[] text = {"Amsterdam", "Venice"};
  protected Directory dir;

  protected void setUp() throws IOException {
    String indexDir = System.getProperty("java.io.tmpdir", "tmp") +
        System.getProperty("file.separator") + "index-dir";
    dir = FSDirectory.getDirectory(indexDir, true);
    addDocuments(dir);
  }

  protected void addDocuments(Directory dir) throws IOException {
    IndexWriter writer = new IndexWriter(dir, getAnalyzer(), true);
    writer.setUseCompoundFile(isCompound());
    for (int i = 0; i < keywords.length; i++) {
      Document doc = new Document();
      doc.add(Field.Keyword("id", keywords[i]));
      doc.add(Field.UnIndexed("country", unindexed[i]));
      doc.add(Field.UnStored("contents", unstored[i]));
      doc.add(Field.Text("city", text[i]));
      writer.addDocument(doc);
    }
    writer.optimize();
    writer.close();
  }

  protected Analyzer getAnalyzer() {
    return new SimpleAnalyzer();
  }

  protected boolean isCompound() {
    return true;
  }

  public void testIndexWriter() throws IOException {
    IndexWriter writer = new IndexWriter(dir, getAnalyzer(), false);
    assertEquals(keywords.length, writer.docCount());
    writer.close();
  }

  public void testIndexReader() throws IOException {
    IndexReader reader = IndexReader.open(dir);
    assertEquals(keywords.length, reader.maxDoc());
    assertEquals(keywords.length, reader.numDocs());
    reader.close();
  }
}
Re: Problem with Add method
Which Lucene version do you use? If it's 2.2, then Field.Keyword, Field.UnIndexed etc. were removed. You should instead do:

Document doc = new Document();
doc.add(new Field("id", keywords[i], Store.NO, Index.UN_TOKENIZED));
doc.add(new Field("country", unindexed[i], Store.YES, Index.UN_TOKENIZED));
etc...

On Nov 29, 2007 10:25 AM, Liaqat Ali <[EMAIL PROTECTED]> wrote:
> This code generates an error; kindly tell me what parameters should be used with the constructors.
>
> Document doc = new Document();
> doc.add(Field.Keyword("id", keywords[i]));
> doc.add(Field.UnIndexed("country", unindexed[i]));
> doc.add(Field.UnStored("contents", unstored[i]));
> doc.add(Field.Text("city", text[i]));
> writer.addDocument(doc);

--
Regards,
Shai Erera
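For reference, a sketch of the usual one-for-one translation of the removed factory methods onto the Field constructor (based on the Field.Store/Field.Index constants in Lucene 2.x; note that the old Keyword and UnIndexed variants also stored their value, so Store.YES preserves the original behaviour -- double-check against the Field javadoc for your exact version):

Document doc = new Document();
// Field.Keyword(name, value): stored, indexed as a single untokenized term
doc.add(new Field("id", keywords[i], Field.Store.YES, Field.Index.UN_TOKENIZED));
// Field.UnIndexed(name, value): stored only, never indexed
doc.add(new Field("country", unindexed[i], Field.Store.YES, Field.Index.NO));
// Field.UnStored(name, value): indexed and tokenized, not stored
doc.add(new Field("contents", unstored[i], Field.Store.NO, Field.Index.TOKENIZED));
// Field.Text(name, value): stored, indexed and tokenized
doc.add(new Field("city", text[i], Field.Store.YES, Field.Index.TOKENIZED));
writer.addDocument(doc);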
Re: CorruptIndexException
That exception means your index was written with a newer version of Lucene than the version you are using to open the IndexReader. It looks like you used the unreleased (2.3 dev) version of Lucli from the Lucene trunk and then went back to an older Lucene JAR (maybe 2.2?) for accessing it? In general, writing an index with a newer version of Lucene and then trying to access it using an older version doesn't work (whereas the opposite does). I'm afraid you either have to switch to 2.3-dev for reading your index (but beware it could have sneaky bugs...), or rebuild your index with the 2.2 version of Lucene and use the 2.2 Lucli in the future.

Mike

"Melanie Langlois" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I used Lucli to optimize my index while my application was stopped. After restarting my application, I could not search my index anymore; I got the following exception:
>
> org.apache.lucene.index.CorruptIndexException: Unknown format version: -4
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:204)
>         at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:190)
>         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:610)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:185)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:167)
>
> I have two questions:
> - Why does it occur? Should I use another tool to access the index outside of my application?
> - Is there a way to recover?
>
> Thanks,
>
> Mélanie
Error with lucene-core-2.2.0.jar
Hi all, I'm using a program that uses the Lucene library. I've downloaded the lucene-core-2.2.0.jar file and I'm trying it, but I get this error while trying to index my documents:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.document.Document.add(Lorg/apache/lucene/document/Field;)V

I've searched among old threads and I think that the error is caused by a wrong version of the library. So, which one must I use?

Thanks.
Ale
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
"Bill Janssen" <[EMAIL PROTECTED]> wrote: > > Hmmm ... how many chunks of "about 50 pages" do you do before > > hitting this? Roughly how many docs are in the index when it > > happens? > > Oh, gosh, not sure. I'm guessing it's about half done. Ugh, OK. If we could boil this down to a smaller set that is easily reproducible (and transferable to me) then I could try to track it down. Do you have another PPC machine to reproduce this on? (To rule out bad RAM/hard-drive on the first one). Can you try running with the trunk version of Lucene (2.3-dev) and see if the error still occurs? EG you can download this AM's build here: http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/288/artifact/artifacts Another thing to try is turning on the infoStream (IndexWriter.setInfoStream(...)) and capture & post the resulting log. It will be very large since it takes quite a while for the error to occur... > So, I ran the same codebase with lucene-core-2.2.0.jar on an Intel > Mac Pro, OS X 10.5.0, Java 1.5, and no exception is raised. > Different corpus, about 5 pages instead of 2. This is > reinforcing my thinking that it's a big-endian issue. That's a good question. Lucene is endian independent: all writes to files boil eventually down to a writeByte/writeBytes calls in o.a.l.store.IndexOutput such that the ordering is controlled by Lucene, not the underlying CPU architecture. That said, it is clearly a difference in your test so it seems like a compelling lead... is it possible to run this different corpus back on the PPC machine, to rule out a corpus difference leading to the exception? > I've got 1735 documents, 18969 pages -- average page size 10.9, max > page size 1235 (a physics textbook), 578 one-page documents. These > are Web pages, PDFs, articles, photos, scanned stuff, technical > papers, etc. I index six documents at a time, so I guess I'm > averaging about 65 pages per chunk. For each document, I index the > whole text of the document as a Lucene Document, and I index the > text of each page separately as a Document. I use the "contents" > fields and "pagecontents" fields for those two uses. I also add > metadata information to each: "title", multiple "author" fields, > "date", "abstract", etc. OK, sounds like a nice rich corpus :) Are you using term vectors, stored fields, payloads on any of these? Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Deprecated API
Have a look at the Field.java class and it's constructors. The other option is to look at what was deprecated on Lucene 1.9 and then look at Lucene 2.x. Also, I have some up to date example files of indexing, etc. at http://www.lucenebootcamp.com (follow the link to the SVN repository) which you can check out and compare. Cheers, Grant On Nov 29, 2007, at 3:42 AM, Liaqat Ali wrote: i m studying LIA. but there is a problem with code. When i run the code i get errorsThe errors are related with the use of deprecated APIs.Kindly suggest me the right APIs and also instructions how to handle this situation with other code.. package lia.indexing; import org.apache.lucene.store.Directory; import org.apache.lucene.store.FSDirectory; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.index.IndexWriter; import org.apache.lucene.index.IndexReader; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.SimpleAnalyzer; import junit.framework.TestCase; import java.io.IOException; /** * */ public abstract class BaseIndexingTestCase extends TestCase { protected String[] keywords = {"1", "2"}; protected String[] unindexed = {"Netherlands", "Italy"}; protected String[] unstored = {"Amsterdam has lots of bridges", "Venice has lots of canals"}; protected String[] text = {"Amsterdam", "Venice"}; protected Directory dir; protected void setUp() throws IOException { String indexDir = System.getProperty("java.io.tmpdir", "tmp") + System.getProperty("file.separator") + "index-dir"; dir = FSDirectory.getDirectory(indexDir, true); addDocuments(dir); } protected void addDocuments(Directory dir) throws IOException { IndexWriter writer = new IndexWriter(dir, getAnalyzer(), true); writer.setUseCompoundFile(isCompound()); for (int i = 0; i < keywords.length; i++) { Document doc = new Document(); doc.add(Field.Keyword("id", keywords[i])); doc.add(Field.UnIndexed("country", unindexed[i])); doc.add(Field.UnStored("contents", unstored[i])); doc.add(Field.Text("city", text[i])); writer.addDocument(doc); } writer.optimize(); writer.close(); } protected Analyzer getAnalyzer() { return new SimpleAnalyzer(); } protected boolean isCompound() { return true; } public void testIndexWriter() throws IOException { IndexWriter writer = new IndexWriter(dir, getAnalyzer(), false); assertEquals(keywords.length, writer.docCount()); writer.close(); } public void testIndexReader() throws IOException { IndexReader reader = IndexReader.open(dir); assertEquals(keywords.length, reader.maxDoc()); assertEquals(keywords.length, reader.numDocs()); reader.close(); } } - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] -- Grant Ingersoll http://lucene.grantingersoll.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
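For completeness, here is roughly what the book's addDocuments() loop looks like once ported to the Field constructor, using the same Store/Index mapping as in Shai's reply in the earlier thread (my translation against the 2.2 API, not the official LIA update):

protected void addDocuments(Directory dir) throws IOException {
  IndexWriter writer = new IndexWriter(dir, getAnalyzer(), true);
  writer.setUseCompoundFile(isCompound());
  for (int i = 0; i < keywords.length; i++) {
    Document doc = new Document();
    doc.add(new Field("id", keywords[i], Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("country", unindexed[i], Field.Store.YES, Field.Index.NO));
    doc.add(new Field("contents", unstored[i], Field.Store.NO, Field.Index.TOKENIZED));
    doc.add(new Field("city", text[i], Field.Store.YES, Field.Index.TOKENIZED));
    writer.addDocument(doc);
  }
  writer.optimize();
  writer.close();
}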
how to kill IndexSearcher object after every search
Hi All, Is there any possibility to kill the IndexSearcher object after every search?
--
View this message in context: http://www.nabble.com/how-to-kill-IndexSearcher-object-after-every-search-tf4897436.html#a14026451
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Re: Compute the co-occurence beteen a phrase and a word
You run your SpanQuery, and get back the Spans. From there, you need to load the document (either by reanalyzing the tokens or by using Term Vectors) and then you just have to setup your window around the position match. Unfortunately, I don't think there is a better way in Lucene to get those terms in a window around a given position. You might be able to if you altered Lucene to support moving both forward and backwards over positions, but I am not sure how difficult this is to do w/o looking more into it (and it isn't high on my list at the moment.) Also, I adhere to Hoss' philosophy on private email: http://people.apache.org/~hossman/#private_q -Grant On Nov 28, 2007, at 10:46 AM, bigdoginuk wrote: Hi, thanks for the reply. But can anyone give me some more hints? I have checked SpanQuery, but still haven't found out a solution. Thanks. Grant Ingersoll-6 wrote: Have a look at SpanQuery and it's derivatives. You will need to do some post-processing as well. -Grant On Nov 28, 2007, at 6:41 AM, bigdoginuk wrote: Hi all, I want to compute the co-occurence frequency between a word and a phrase( this phrase contains some words, and the words in it should be successive and in order). It's like an NEAR operation (like setting slop at 3...) Does anyone know how to implement this? Thanks in advance. Rooney -- View this message in context: http://www.nabble.com/Compute-the-co-occurence-beteen-a-phrase-and-a-word-tf4887952.html#a13990651 Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] -- Grant Ingersoll http://lucene.grantingersoll.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] -- View this message in context: http://www.nabble.com/Compute-the-co-occurence-beteen-a-phrase-and-a-word-tf4887952.html#a13995126 Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] -- Grant Ingersoll http://lucene.grantingersoll.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
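A rough sketch of the first half of that recipe against the 2.x span API (the field name and words are made up for illustration, "reader" is an already-open IndexReader, and the window extraction from term vectors or re-analysis is left exactly as Grant describes):

// needs org.apache.lucene.index.Term and org.apache.lucene.search.spans.*
SpanQuery phrase = new SpanNearQuery(new SpanQuery[] {
    new SpanTermQuery(new Term("contents", "lots")),
    new SpanTermQuery(new Term("contents", "bridges"))
  }, 3, true);                        // slop 3, in order: behaves like a NEAR on the phrase
Spans spans = phrase.getSpans(reader);
while (spans.next()) {
  int doc = spans.doc();              // document containing the phrase match
  int start = spans.start();          // position of the first matching term
  int end = spans.end();              // one past the last matching position
  // Look at positions [start - N, end + N) in this document -- via its term
  // vector or by re-analyzing the stored text -- and count how often the
  // other word falls inside that window.
}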
Closing index searchers ...
Hi, My application needs to close/open the index searcher periodically so that newly added documents are visible. Is there a way to determine if there are any pending searches running against an index searcher or do I have to do my own reference counting? Thank you.
Re: Closing index searchers ...
I had the same issue, and ended up doing my own reference counting using an "acquire/release" strategy. I used a single instance per searcher: every "acquire" counts +1 and every "release" counts -1. When an index is switched, the instance receives a "dispose" signal; each release then checks whether there are still in-flight searches, and once all releases have been made, the last release closes the searcher. The interface looked like this:

public interface Acquirable<R> {
  public R acquire();
  public void release();
  public boolean isAcquired();
  public boolean dispose();
}

In my implementation, I use a ThreadLocal to attach the searcher's referenced instance (although it's a single instance per index switch). Hope it helps...

German-K

On Nov 29, 2007 12:15 PM, Dragon Fly <[EMAIL PROTECTED]> wrote:
> Hi,
>
> My application needs to close/open the index searcher periodically so that
> newly added documents are visible. Is there a way to determine if there are
> any pending searches running against an index searcher or do I have to do my
> own reference counting? Thank you.
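A minimal sketch of one way to implement that idea around an IndexSearcher (my own illustration using java.util.concurrent.atomic, not German's actual code; it assumes the manager stops handing out this instance before calling dispose()):

import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.lucene.search.IndexSearcher;

public class AcquirableSearcher {
  private final IndexSearcher searcher;
  private final AtomicInteger refCount = new AtomicInteger(0);
  private volatile boolean disposed = false;

  public AcquirableSearcher(IndexSearcher searcher) {
    this.searcher = searcher;
  }

  public IndexSearcher acquire() {            // +1 before every search
    refCount.incrementAndGet();
    return searcher;
  }

  public void release() throws IOException {  // -1 after every search
    if (refCount.decrementAndGet() == 0 && disposed) {
      searcher.close();                       // last release of a retired searcher closes it
    }
  }

  public void dispose() throws IOException {  // called when a new index version is swapped in
    disposed = true;
    if (refCount.get() == 0) {
      searcher.close();                       // nobody is searching, safe to close now
    }
  }
}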
Re: how to kill IndexSearcher object after every search
Yes, you just call "close()" method. But, why would you like to do that? The performance tips remarks exactly the opposite, keeping it alive as long as possible favors internal lucene's caching of terms, query and other internal objects. On Nov 29, 2007 11:14 AM, Sebastin <[EMAIL PROTECTED]> wrote: > > Hi All, >Is there any possibility to kill the IndexSearcher Object after every > search. > -- > View this message in context: > http://www.nabble.com/how-to-kill-IndexSearcher-object-after-every-search-tf4897436.html#a14026451 > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Error with lucene-core-2.2.0.jar
I'm confused about what's going on here. Could you post the raw Java code that produces this error?

Best
Erick

On Nov 29, 2007 5:32 AM, ing.sashaa <[EMAIL PROTECTED]> wrote:
> Hi all,
> I'm using a program that uses the Lucene library. I've downloaded the
> lucene-core-2.2.0.jar file and I'm trying it, but I get this error while
> trying to index my documents:
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.lucene.document.Document.add(Lorg/apache/lucene/document/Field;)V
>
> I've searched among old threads and I think that the error is caused by a
> wrong version of the library. So, which one must I use?
>
> Thanks.
> Ale
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> Do you have another PPC machine to reproduce this on? (To rule out > bad RAM/hard-drive on the first one). I'll dig up an old laptop and try it there. > Another thing to try is turning on the infoStream > (IndexWriter.setInfoStream(...)) and capture & post the resulting log. > It will be very large since it takes quite a while for the error to > occur... I can do that. > Lucene is endian independent: all writes to files boil eventually down > to a writeByte/writeBytes calls in o.a.l.store.IndexOutput such that > the ordering is controlled by Lucene, not the underlying CPU > architecture. I was actually thinking about the implementation of the bitstrings, rather than data storage proper. > Are you using term vectors, > stored fields, payloads on any of these? Stored fields. I store a document ID (a 23-character string) for each. Bill - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> > Another thing to try is turning on the infoStream > > (IndexWriter.setInfoStream(...)) and capture & post the resulting log. > > It will be very large since it takes quite a while for the error to > > occur... > > I can do that. Here's what I see: Optimizing... merging segments _ram_a (1 docs) _ram_b (1 docs) _ram_c (1 docs) _ram_d (1 docs) _ram_e (1 docs) _ram_f (1 docs) _ram_g (1 docs) _ram_h (1 docs) _ram_i (1 docs) into _1va (9 docs) [EMAIL PROTECTED] main: now checkpoint "segments_3ql" [isCommit = true] [EMAIL PROTECTED] main: IncRef "_1v4.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v4_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v5.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v5_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v6.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v6_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v7.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v7_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v8.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v8_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v9.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1va.fnm": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.fdx": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.fdt": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.tii": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.tis": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.frq": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.prx": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1va.nrm": pre-incr count is 0 [EMAIL PROTECTED] main: deleteCommits: now remove commit "segments_3qk" [EMAIL PROTECTED] main: DecRef "_1v4.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v4_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v5.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v5_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v6.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v6_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v7.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v7_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v8.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v8_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v9.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "segments_3qk": pre-decr count is 1 [EMAIL PROTECTED] main: delete "segments_3qk" [EMAIL PROTECTED] main: now checkpoint "segments_3qm" [isCommit = true] [EMAIL PROTECTED] main: IncRef "_1v4.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v4_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v5.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v5_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v6.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v6_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v7.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v7_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v8.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v8_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v9.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1va.cfs": pre-incr count is 0 [EMAIL PROTECTED] main: deleteCommits: now remove commit "segments_3ql" [EMAIL PROTECTED] main: DecRef 
"_1v4.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v4_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v5.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v5_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v6.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v6_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v7.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v7_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v8.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v8_1.del": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1v9.cfs": pre-decr count is 2 [EMAIL PROTECTED] main: DecRef "_1va.fnm": pre-decr count is 1 [EMAIL PROTECTED] main: delete "_1va.fnm" [EMAIL PROTECTED] main: DecRef "_1va.fdx": pre-decr count is 1 [EMAIL PROTECTED] main: delete "_1va.fdx" [EMAIL PROTECTED] main: DecRef "_1va.fdt": pre-decr count is 1 [EMAIL PROTECTED] main: delete "_1va.fdt" [EMAIL PROTECTED] main: DecRef "_1va.tii": pre-decr count is 1 [EMAIL PROTECTED] main: delete "_1va.tii" [EMAIL PROTECTED] main: DecRef "_1va.tis": pre-decr count is 1 [EMAIL PROTECTED] main: delete "_1va.tis" [EMAIL PROTECTED] main: DecRef "_1va.frq": pre-decr count is 1 [EMAIL PROTECTED] main: delete "_1va.frq" [EMAIL PROTECTED]
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> > Another thing to try is turning on the infoStream > > (IndexWriter.setInfoStream(...)) and capture & post the resulting log. > > It will be very large since it takes quite a while for the error to > > occur... > > I can do that. Here's a more complete dump. I've modified the code so that I now remove any existing versions of the document before re-indexing it and its pages. Bill /Library/Java/Home/bin/java '-Dcom.parc.uplib.indexing.debugMode=true' '-Dcom.parc.uplib.indexing.indexProperties=contents:title:categories$,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email-message-id*:email-guid*:email-subject:email-from-name:email-from-address*:email-attachment-to*:email-thread-index*:email-references$,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music-genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*' -classpath "/local/uplib/share/UpLib-1.7/code/lucene-core-2.2.0.jar:/local/uplib/share/UpLib-1.7/code/LuceneIndexing.jar" -Dorg.apache.lucene.writeLockTimeout=2 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index" update /local/janssen-uplib/docs 01174-15-2815-270 01174-15-2552-042 01173-98-5675-575 01173-98-4457-188 01173-83-8266-533 01173-80-8759-205 updating doc_root_dir is /local/janssen-uplib/docs Working on document /local/janssen-uplib/docs/01174-15-2815-270 Adding header 'apparent-mime-type' I to 01174-15-2815-270 Adding header 'authors' IT to 01174-15-2815-270 Adding header 'categories' I (article) to 01174-15-2815-270 Adding header 'date' I (20070317) to 01174-15-2815-270 Adding header 'sha-hash' I to 01174-15-2815-270 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (3566): human nature Full-Mental Nudit page 1 (3100): I know what you're thinking: W Using charset utf8 for contents.txt Using language en for contents.txt Added 01174-15-2815-270 (3 versions) Working on document /local/janssen-uplib/docs/01174-15-2552-042 Adding header 'abstract' IT to 01174-15-2552-042 Adding header 'apparent-mime-type' I to 01174-15-2552-042 Adding header 'categories' I (photo) to 01174-15-2552-042 Adding header 'date' I (20070316) to 01174-15-2552-042 Adding header 'sha-hash' I to 01174-15-2552-042 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Added 01174-15-2552-042 (1 versions) Working on document /local/janssen-uplib/docs/01173-98-5675-575 Adding header 'apparent-mime-type' I to 01173-98-5675-575 Adding header 'authors' IT to 01173-98-5675-575 Adding header 'categories' I (article) to 01173-98-5675-575 Adding header 'categories' I (medical) to 01173-98-5675-575 Adding header 'date' I (20070313) to 01173-98-5675-575 Adding header 'sha-hash' I to 01173-98-5675-575 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (2730): March 13, 2007 DOW JONES REPRI page 1 (4445): But just how far -- and how fa page 2 (2638): "We don't sell snow tires," sa page 3 (981): A spokeswoman for Rite Aid say Using charset utf8 for contents.txt Using language en for contents.txt Added 01173-98-5675-575 (5 versions) Working on document /local/janssen-uplib/docs/01173-98-4457-188 Adding header 'apparent-mime-type' I to 01173-98-4457-188 Adding header 'authors' IT to 01173-98-4457-188 Adding header 'categories' I (article) to 01173-98-4457-188 Adding header 'date' I (19911006) to 01173-98-4457-188 Adding header 
'sha-hash' I to 01173-98-4457-188 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (2897): The Economics of the Colonial merging segments _ram_0 (1 docs) _ram_1 (1 docs) _ram_2 (1 docs) _ram_3 (1 docs) _ram_4 (1 docs) _ram_5 (1 docs) _ram_6 (1 docs) _ram_7 (1 docs) _ram_8 (1 docs) _ram_9 (1 docs) into _1v9 (10 docs) flush 6 buffered deleted terms on 6 segments. [EMAIL PROTECTED] main: now checkpoint "segments_3qj" [isCommit = true] [EMAIL PROTECTED] main: IncRef "_1v4.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v4_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v5.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v5_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v6.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v6_1.del": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v7.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v7_1.del": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1v8.cfs": pre-incr count is 1 [EMAIL PROTECTED] main: IncRef "_1v8_1.del": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1v9.fnm": pre-incr count is 0 [EMAIL PROTECTED] main: IncRef "_1v9
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> Can you try running with the trunk version of Lucene (2.3-dev) and see > if the error still occurs? EG you can download this AM's build here: > > > http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/288/artifact/artifacts Still there. Here's the dump with last night's build: /Library/Java/Home/bin/java '-Dcom.parc.uplib.indexing.debugMode=true' '-Dcom.parc.uplib.indexing.indexProperties=contents:title:categories$,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email-message-id*:email-guid*:email-subject:email-from-name:email-from-address*:email-attachment-to*:email-thread-index*:email-references$,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music-genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*' -classpath "/local/uplib/share/UpLib-1.7/code/lucene-core-2.3-2007-11-29_02-49-31.jar:/local/uplib/share/UpLib-1.7/code/LuceneIndexing.jar" -Dorg.apache.lucene.writeLockTimeout=2 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index" update /local/janssen-uplib/docs 01179-00-0750-547 01178-90-9186-558 01178-81-4212-772 01178-81-3305-217 01178-73-1029-141 01178-72-8365-803 updating doc_root_dir is /local/janssen-uplib/docs IFD [main]: setInfoStream [EMAIL PROTECTED] IW 0 [main]: setInfoStream: dir=org.apache.lucene.store.FSDirectory@/local/janssen-uplib/index autoCommit=true [EMAIL PROTECTED] [EMAIL PROTECTED] ramBufferSizeMB=16.0 maxBuffereDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=1 index=_21:c19686 _22:c92 IW 0 [main]: setMaxFieldLength 2147483647 Working on document /local/janssen-uplib/docs/01179-00-0750-547 Adding header 'abstract' IT to 01179-00-0750-547 Adding header 'apparent-mime-type' I to 01179-00-0750-547 Adding header 'authors' IT to 01179-00-0750-547 Adding header 'categories' I (ebooks) to 01179-00-0750-547 Adding header 'categories' I (economics) to 01179-00-0750-547 Adding header 'categories' I (paper) to 01179-00-0750-547 Adding header 'citation' I to 01179-00-0750-547 Adding header 'date' I (20070128) to 01179-00-0750-547 Adding header 'sha-hash' I to 01179-00-0750-547 Adding header 'title' IT (Heterogeneity in Price Stickiness and the Real Effects of Monetary Shocks) to 01179-00-0750-547 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (2181): Heterogeneity in Price Stickin page 1 (2927): 1 Introduction There is ample page 2 (3135): In the presence of strategic c page 3 (3128): Motivated by those questions, page 4 (3214): ploring the tractability of th page 5 (2491): model with Taylor staggered wa page 6 (1548): real rigidities (Ball and Rome page 7 (3098): 2.2 Calibrating the sectoral d page 8 (1913): distribution of price stickine page 9 (1952): reported in Table 1. Hencefort page 10 (1635): Figure 2 presents analogous re page 11 (1743): In the absence of strategic co page 12 (2806): Corollary 1 For an arbitrary h page 13 (2380): 2.4.2 Growth rate shocks In th page 14 (2962): price changes. With heterogene page 15 (3265): ties and heterogeneity in the page 16 (1962): complementarities. The results page 17 (751): to the response of the heterog page 18 (489): economies are embedded into th page 19 (3295): 2.6 Fitting IRFs with an ident page 20 (2066): Table 3a: Best-Fitting Duratio page 21 (2444): This is an important step beca page 22 (1976): where ? is the discount factor page 23 (1183): Et "? Ct+1 Ct ¦?? 
It Pt Pt+1 # page 24 (2188): can be rewritten as: Pk,t = £ page 25 (1370): pt = Z 1 0 f (k) pk,tdk, (10) page 26 (3269): Heterogeneity in price stickin page 27 (3117): Irrespective of the net effect page 28 (2084): set of parameters involve high page 29 (575): 0 5 10 15 20 25 30 35 40 0 x 1 page 30 (2185): output and falling prices in a page 31 (2358): price changes that minimizes t page 32 (2689): These results are fully consis page 33 (3600): different sources of real rigi page 34 (3168): work in a model with heterogen page 35 (2557): single equation estimation of page 36 (1326): Taking the limit as Æ ? 0 in e page 37 (1796): The output gap is constant at page 38 (1066): The corresponding path for the page 39 (1347): 4) Proof of Corollaries 1 and page 40 (2421): Therefore, for ? Å 0, the expe page 41 (1343): p (t) = Z 1 0 f (k) ? ?? ?? R page 42 (2117): As ? ? 0, this clearly converg page 43 (1375): model around the zero inflatio page 44 (1497): pt = Z 1 0 f (k) pk,tdk, yt = page 45 (1128): Table A.3: Best-Fitting Durati page 46 (898): Multiplying by f (k) ?k and in page 47 (1072): Now, from (23): ?kxk,t = pk,t page 48 (268): Finally, let ¹t ? pt ? pt?1 de page 49 (1694): References [1] Altissimo, F
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
Are you still getting the original exception too or just the Array out of bounds one now? Also, are you doing anything else to the index while this is happening? The code at the point in the exception below is trying to properly handle deleted documents. -Grant On Nov 29, 2007, at 1:34 PM, Bill Janssen wrote: Can you try running with the trunk version of Lucene (2.3-dev) and see if the error still occurs? EG you can download this AM's build here: http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/288/artifact/artifacts Still there. Here's the dump with last night's build: /Library/Java/Home/bin/java '- Dcom.parc.uplib.indexing.debugMode=true' '- Dcom.parc.uplib.indexing.indexProperties=contents:title:categories $,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email- message-id*:email-guid*:email-subject:email-from-name:email-from- address*:email-attachment-to*:email-thread-index*:email-references $,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music- genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*' - classpath "/local/uplib/share/UpLib-1.7/code/lucene- core-2.3-2007-11-29_02-49-31.jar:/local/uplib/share/UpLib-1.7/code/ LuceneIndexing.jar" -Dorg.apache.lucene.writeLockTimeout=2 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index" update /local/janssen-uplib/docs 01179-00-0750-547 01178-90-9186-558 01178-81-4212-772 01178-81-3305-217 01178-73-1029-141 01178-72-8365-803 updating doc_root_dir is /local/janssen-uplib/docs IFD [main]: setInfoStream deletionPolicy [EMAIL PROTECTED] IW 0 [main]: setInfoStream: dir=org.apache.lucene.store.FSDirectory@/ local/janssen-uplib/index autoCommit=true [EMAIL PROTECTED] mergeScheduler [EMAIL PROTECTED] ramBufferSizeMB=16.0 maxBuffereDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=1 index=_21:c19686 _22:c92 IW 0 [main]: setMaxFieldLength 2147483647 Working on document /local/janssen-uplib/docs/01179-00-0750-547 Adding header 'abstract' IT to 01179-00-0750-547 Adding header 'apparent-mime-type' I to 01179-00-0750-547 Adding header 'authors' IT to 01179-00-0750-547 Adding header 'categories' I (ebooks) to 01179-00-0750-547 Adding header 'categories' I (economics) to 01179-00-0750-547 Adding header 'categories' I (paper) to 01179-00-0750-547 Adding header 'citation' I to 01179-00-0750-547 Adding header 'date' I (20070128) to 01179-00-0750-547 Adding header 'sha-hash' I to 01179-00-0750-547 Adding header 'title' IT (Heterogeneity in Price Stickiness and the Real Effects of Monetary Shocks) to 01179-00-0750-547 Created empty doc Document01179-00-0750-547> stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (2181): Heterogeneity in Price Stickin page 1 (2927): 1 Introduction There is ample page 2 (3135): In the presence of strategic c page 3 (3128): Motivated by those questions, page 4 (3214): ploring the tractability of th page 5 (2491): model with Taylor staggered wa page 6 (1548): real rigidities (Ball and Rome page 7 (3098): 2.2 Calibrating the sectoral d page 8 (1913): distribution of price stickine page 9 (1952): reported in Table 1. Hencefort page 10 (1635): Figure 2 presents analogous re page 11 (1743): In the absence of strategic co page 12 (2806): Corollary 1 For an arbitrary h page 13 (2380): 2.4.2 Growth rate shocks In th page 14 (2962): price changes. With heterogene page 15 (3265): ties and heterogeneity in the page 16 (1962): complementarities. 
The results page 17 (751): to the response of the heterog page 18 (489): economies are embedded into th page 19 (3295): 2.6 Fitting IRFs with an ident page 20 (2066): Table 3a: Best-Fitting Duratio page 21 (2444): This is an important step beca page 22 (1976): where ? is the discount factor page 23 (1183): Et "? Ct+1 Ct ¦?? It Pt Pt+1 # page 24 (2188): can be rewritten as: Pk,t = £ page 25 (1370): pt = Z 1 0 f (k) pk,tdk, (10) page 26 (3269): Heterogeneity in price stickin page 27 (3117): Irrespective of the net effect page 28 (2084): set of parameters involve high page 29 (575): 0 5 10 15 20 25 30 35 40 0 x 1 page 30 (2185): output and falling prices in a page 31 (2358): price changes that minimizes t page 32 (2689): These results are fully consis page 33 (3600): different sources of real rigi page 34 (3168): work in a model with heterogen page 35 (2557): single equation estimation of page 36 (1326): Taking the limit as Æ ? 0 in e page 37 (1796): The output gap is constant at page 38 (1066): The corresponding path for the page 39 (1347): 4) Proof of Corollaries 1 and page 40 (2421): Therefore, for ? Å 0, the expe page 41 (1343): p (t) = Z 1 0 f (k) ? ?? ?? R page 42 (2117): As ? ? 0, this clearly converg page 43 (1375): mode
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> Are you still getting the original exception too or just the Array out
> of bounds one now? Also, are you doing anything else to the index
> while this is happening? The code at the point in the exception below
> is trying to properly handle deleted documents.

Just the array-out-of-bounds one, now.

The current version of the code creates a writer, then deletes all old Lucene 'Document' instances belonging to the specified UpLib doc-id, using that writer, then re-indexes that UpLib doc-id (which consists of one-to-N Lucene 'Document's). After doing the six UpLib documents, it calls optimize().

I'm going back to the old code now. It uses the 2.0 APIs, so it uses an IndexReader to delete the existing instances, then closes that reader (which if I understand it properly should flush the index back to disk), then creates a new writer to re-index the same documents, then does the optimize with that writer, which is where the CorruptIndexException started coming up. I'm going to run that again with 2.0, then with last night's build.

I'm not sure if the success with 2.0 meant that a corrupted index wasn't being detected, or if it wasn't being corrupted in the first place.

Bill
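For readers following the thread, the single-writer flow Bill describes maps onto the 2.2+ IndexWriter API roughly like this (a sketch only; buildDocuments() and the "id" field name stand in for the UpLib-specific parts, which are not shown in the thread):

IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), false);
for (String docId : uplibDocIds) {
  // drop any previously indexed Lucene Documents for this UpLib doc-id
  writer.deleteDocuments(new Term("id", docId));
  // re-add the one-to-N Lucene Documents that make up this UpLib document
  for (Document page : buildDocuments(docId)) {
    writer.addDocument(page);
  }
}
writer.optimize();
writer.close();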
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
"Bill Janssen" <[EMAIL PROTECTED]> wrote: > Here's the dump with last night's build: Those logs look healthy up until the exception. One odd thing is when you instantiate your writer, your index has 2 segments in it. I expected only 1 since each time you visit your index you leave it optimized. (Or, maybe you're not calling setInfoStream immediately after opening the writer?). The error still only happens on the one PPC machine, even after upgrading to trunk? EG not on an Intel box? Have you tried another PPC machine? > I'm going back to the old code now. It uses the 2.0 APIs, so it > uses an IndexReader to delete the existing instances, then closes > that reader (which if I understand it properly should flush the > index back to disk), then creates a new writer to re-index the same > documents, then does the optimize with that writer, which is where > the CorruptIndexException started coming up. I'm going to run that > again with 2.0, then with last night's build. Could you post this part of the code (deleting) too? > I'm not sure if the success with 2.0 meant that a corrupted index > wasn't being detected, or if it wasn't being corrupted in the first > place. Likely the corruption really isn't happening. That particular check for "docs out of order" is present in 2.0 as well. Is it possible to whittle down your test to a smaller set of documents? EG if you only re-index one document at a time, does the exception happen sooner? Ideally we can reduce this to a test I can reproduce then I can track it down... Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
On Nov 29, 2007, at 2:26 PM, Bill Janssen wrote:
>> Are you still getting the original exception too or just the Array out
>> of bounds one now? Also, are you doing anything else to the index
>> while this is happening? The code at the point in the exception below
>> is trying to properly handle deleted documents.
>
> Just the array-out-of-bounds one, now.
>
> The current version of the code creates a writer, then deletes all old
> Lucene 'Document' instances belonging to the specified UpLib doc-id,
> using that writer, then re-indexes that UpLib doc-id (which consists of
> one-to-N Lucene 'Document's). After doing the six UpLib documents, it
> calls optimize().

I'm curious what happens if you call optimize after doing the deletion but before the re-indexing. Also, could you try out the CheckIndex tool in 2.3-dev before and after the deletes?
Boost One Term Query
Boosting a one-term query does not have an effect on the score. For example:

apple

has the same score as:

apple^3

But repeating the term will raise the score:

apple apple apple

I expected the score to go up when boosting a one-term query. Is that a wrong expectation? Thanks!
--
View this message in context: http://www.nabble.com/Boost-One-Term-Query-tf4900128.html#a14035572
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> Could you post this part of the code (deleting) too? Here it is: private static void remove (File index_file, String[] doc_ids, int start) { String number; String list; Term term; TermDocs matches; if (debug_mode) System.err.println("index file is " + index_file + " and it " + (index_file.exists() ? "exists." : "does not exist.")); try { if (index_file.exists() && (doc_ids.length > start)) { IndexReader reader = IndexReader.open(index_file); try { for (int i = start; i < doc_ids.length; i++) { term = new Term("id", doc_ids[i]); int deleted = reader.deleteDocuments(term); System.out.println("Deleted " + deleted + " existing instances of " + doc_ids[i]); } } finally { reader.close(); } } } catch (Exception e) { if (debug_mode) { e.printStackTrace(System.err); } else { System.out.println("* LuceneIndexing 'remove' raised " + e.getClass() + " with message " + e.getMessage()); System.err.println("LuceneIndexing 'remove': caught a " + e.getClass() + "\n with message: " + e.getMessage()); System.out.flush(); } System.exit(JAVA_EXCEPTION); } System.out.flush(); } private static void update (File index_file, File doc_root_dir, String[] ids, int start) { ExtractIndexingInfo.DocumentIterator docit; String number; remove (index_file, ids, start); try { // Now add the documents to the index IndexWriter writer = new IndexWriter(index_file, new StandardAnalyzer(), !index_file.exists()); if (debug_mode) writer.setInfoStream(System.err); writer.setMaxFieldLength(Integer.MAX_VALUE); try { for (int i = start; i < ids.length; i ++) { docit = build_document_iterator(doc_root_dir, ids[i]); int count = 0; while (docit.hasNext()) { writer.addDocument((Document)(docit.next())); count += 1; } System.out.println("Added " + docit.id + " (" + count + " versions)"); System.out.flush(); } } finally { // And close the index System.out.println("Optimizing..."); // See http://www.gossamer-threads.com/lists/lucene/java-dev/47895 about optimize // Can fail if low on disk space writer.optimize(); writer.close(); } } catch (Exception e) { if (debug_mode) { e.printStackTrace(System.err); } else { System.out.println("* Lucene search engine raised " + e.getClass() + " with message " + e.getMessage()); System.err.println(" 'update' caught a " + e.getClass() + "\n with message: " + e.getMessage()); System.out.flush(); } System.exit(JAVA_EXCEPTION); } System.out.flush(); } - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> Have you tried another PPC machine? No. It's in another location, but perhaps I can get it tomorrow. On the other hand, the success when using 2.0 makes it likely to me that the machine isn't the problem. OK, I've reverted to my original codebase (where I first create a reader and do the deletions, then create a writer and do the additions and optimize), and it works fine with lucene-core-2.0.0, but fails with lucene-core-2.3.-whatever (last night's build). Here's the dump: indexing with /Library/Java/Home/bin/java -Dcom.parc.uplib.indexing.debugMode=true "-Dcom.parc.uplib.indexing.indexProperties=contents:title:categories$,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email-message-id*:email-guid*:email-subject:email-from-name:email-from-address*:email-attachment-to*:email-thread-index*:email-references$,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music-genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*" -classpath "/local/uplib/share/UpLib-1.7/code/lucene-core-2.3-2007-11-29_02-49-31.jar:/local/uplib/share/UpLib-1.7/code/LuceneIndexing.jar" -Dorg.apache.lucene.writeLockTimeout=2 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index" update /local/janssen-uplib/docs 01160-06-3246-773 01159-97-2914-663 01159-89-7507-719 01159-89-5614-073 01159-89-1159-244 01159-89-0665-499 thr001: acquiring lock: LuceneIndex... thr001: acquired lock: LuceneIndex* thr001: releasing lock: LuceneIndex* thr001: indexing output is stored/uncompressed,indexed stored/uncompressed,indexed> Added 01160-06-3246-773 (1 versions) Working on document /local/janssen-uplib/docs/01159-97-2914-663 Adding header 'apparent-mime-type' I to 01159-97-2914-663 Adding header 'authors' IT to 01159-97-2914-663 Adding header 'categories' I (cartoon) to 01159-97-2914-663 Adding header 'date' I (19951004) to 01159-97-2914-663 Adding header 'sha-hash' I to 01159-97-2914-663 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Added 01159-97-2914-663 (1 versions) Working on document /local/janssen-uplib/docs/01159-89-7507-719 Adding header 'apparent-mime-type' I to 01159-89-7507-719 Adding header 'sha-hash' I to 01159-89-7507-719 Adding header 'title' IT (Photoshop Metal Texture) to 01159-89-7507-719 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (580): Tutorials\xa5 News\xa5 Exclusives\xa5 S page 1 (1680): On a new layer create a gradie page 2 (1118): Scrapes and scratches are irre page 3 (470): Bevel Settings\xa5 Contour Settin Using charset utf8 for contents.txt Using language en for contents.txt Added 01159-89-7507-719 (5 versions) Working on document /local/janssen-uplib/docs/01159-89-5614-073 Adding header 'apparent-mime-type' I to 01159-89-5614-073 Adding header 'sha-hash' I to 01159-89-5614-073 Adding header 'title' IT (Creating Virtual Mats and Frames with The GIMP) to 01159-89-5614-073 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (663): All photographs and articles o page 1 (600): Although real mats and frames page 2 (999): The Procedure First of all you page 3 (615): Run the Add Mat script (Script page 4 (693): in the GIMP toolbox Pattern: U page 5 (703): 3D lighted/shaded appearance. page 6 (719): Bevel Fill Color, pops up a di page 7 (714): texture afterwards. 
Default: o page 8 (797): recommended, especially if you page 9 (461): moving outwards, as in adding page 10 (67): 11 Creating Virtual Mats and F page 11 (67): 12 Creating Virtual Mats and F page 12 (378): Time to add a frame. Run Scrip page 13 (498): in Frame Fill Color FG color: page 14 (717): and background colors, not in page 15 (685): the pattern to for texturing t page 16 (721): added along the inner boundary page 17 (1006): leave a selection in place cov page 18 (904): A drop shadow on the entire fr page 19 (629): threshold sliders to the right page 20 (901): Bump Map" and fill it with whi page 21 (786): image window, do a Select All page 22 (393): In the Layers dialog, choose t page 23 (937): "Keep Trans." option near the page 24 (239): Last modified: Mon May 9 23:36 Using charset utf8 for contents.txt Using language en for contents.txt Added 01159-89-5614-073 (26 versions) Working on document /local/janssen-uplib/docs/01159-89-1159-244 Adding header 'apparent-mime-type' I to 01159-89-1159-244 Adding header 'authors' IT to 01159-89-1159-244 Adding header 'categories' I (ebooks) to 01159-89-1159-244 Adding header 'categories' I (article) to 01159-89-1159-244 Adding header 'date' I (20050100) to 01159-89-1159-244 Adding header 's
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
> Also, could you try out the CheckIndex tool in 2.3-dev before and
> after the deletes?

Great idea! I don't suppose there's a jar file of it?

Bill
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
So, it's a little clearer. I get the Array-out-of-bounds exception if I'm re-indexing some already indexed documents -- if there are deletions involved. I get the CorruptIndexException if I'm indexing freshly -- no deletions. Here's an example of that (with the latest nightly). I removed the existing index, then reindexed the collection six UpLib docs at a time, till I hit the corruption. Bill /Library/Java/Home/bin/java -Dcom.parc.uplib.indexing.debugMode=true "-Dcom.parc.uplib.indexing.indexProperties=contents:title:categories$,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email-message-id*:email-guid*:email-subject:email-from-name:email-from-address*:email-attachment-to*:email-thread-index*:email-references$,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music-genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*" -classpath "/local/uplib/share/UpLib-1.7/code/lucene-core-2.3-2007-11-29_02-49-31.jar:/local/uplib/share/UpLib-1.7/code/LuceneIndexing.jar" -Dorg.apache.lucene.writeLockTimeout=2 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index" update /local/janssen-uplib/docs 01113-86-6099-767 01113-86-5485-936 01113-86-0975-795 01113-62-2881-882 01113-44-7730-580 01113-44-7684-477 thr002: acquiring lock: LuceneIndex... thr002: acquired lock: LuceneIndex* thr002: releasing lock: LuceneIndex* thr002: indexing output is stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (4219): Question: My chives have grown Using charset utf8 for contents.txt Using language en for contents.txt Added 01113-86-6099-767 (2 versions) Working on document /local/janssen-uplib/docs/01113-86-5485-936 Adding header 'abstract' IT to 01113-86-5485-936 Adding header 'apparent-mime-type' I to 01113-86-5485-936 Adding header 'authors' IT to 01113-86-5485-936 Adding header 'categories' I (paper) to 01113-86-5485-936 Adding header 'categories' I (sensepad) to 01113-86-5485-936 Adding header 'citation' I to 01113-86-5485-936 Adding header 'date' I (20040524) to 01113-86-5485-936 Adding header 'sha-hash' I to 01113-86-5485-936 Adding header 'title' IT (Designing Interaction, not Interfaces) to 01113-86-5485-936 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (3855): Designing Interaction, not Int page 1 (5688): Figure 1. Interaction as a phe page 2 (5831): Interaction models can be eval page 3 (5770): Reification turns concepts and page 4 (5558): Figure 6. A mock-up of the DPI page 5 (5963): In joint work with Yves Guiard page 6 (6819): I propose making interactions page 7 (5622): Graphical Application. Proc. 
A Using charset utf8 for contents.txt Using language en for contents.txt Added 01113-86-5485-936 (9 versions) Working on document /local/janssen-uplib/docs/01113-86-0975-795 Adding header 'apparent-mime-type' I to 01113-86-0975-795 Adding header 'categories' I (article) to 01113-86-0975-795 Adding header 'date' I (20050414) to 01113-86-0975-795 Adding header 'sha-hash' I to 01113-86-0975-795 Adding header 'source' IT to 01113-86-0975-795 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (1851): About sponsorship Simplifying page 1 (2900): Latvia and Lithuania, Estonia' page 2 (3088): How much fairness is gained fo page 3 (5317): At the time of its reform, Est page 4 (1101): In part, the tax system is bur Using charset utf8 for contents.txt Using language en for contents.txt Added 01113-86-0975-795 (6 versions) Working on document /local/janssen-uplib/docs/01113-62-2881-882 Adding header 'apparent-mime-type' I to 01113-62-2881-882 Adding header 'categories' I (article) to 01113-62-2881-882 Adding header 'date' I (20050328) to 01113-62-2881-882 Adding header 'keywords' I (neuroeconomics) to 01113-62-2881-882 Adding header 'sha-hash' I to 01113-62-2881-882 Adding header 'title' IT (Neuroeconomics: Why Logic Often Takes a Backseat) to 01113-62-2881-882 Created empty doc Document stored/uncompressed,indexed stored/uncompressed,indexed> Using charset utf8 for contents.txt Using language en for contents.txt page 0 (2957): Close Window MARCH 28, 2005 EC page 1 (3856): these attacks on rationality ? page 2 (484): Even believers in neuroeconomi Using charset utf8 for contents.txt Using language en for contents.txt Added 01113-62-2881-882 (4 versions) Working on document /local/janssen-uplib/docs/01113-44-7730-580 Adding header 'apparent-mime-type' I to 01113-44-7730-580 Adding header 'categories' I (flowport) to 01113-44-7730-580 Adding header 'categories' I (receipt) to 01113-44-7730-580
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
This is in the nightly JAR. It's o.a.l.index.CheckIndex (it defines a static main).

Mike

"Bill Janssen" <[EMAIL PROTECTED]> wrote:
> > Also, could you try out the CheckIndex tool in 2.3-dev before and
> > after the deletes?
>
> Great idea! I don't suppose there's a jar file of it?
>
> Bill
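For anyone else who wants to try it, running that static main against an index directory should look roughly like this (the jar name is whatever nightly build you downloaded, and the index path here is just Bill's example from earlier in the thread):

java -classpath lucene-core-2.3-2007-11-29_02-49-31.jar org.apache.lucene.index.CheckIndex /local/janssen-uplib/index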
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
Just a theory (make that a guess), Mike, but is it possible that the one merge scheduler is hitting a synchronization issue with the deletedDocuments bit vector? That is one thread is cleaning it up and the other is accessing and they aren't synchronizing their access? This doesn't explain the original problem, but maybe this one? On Nov 29, 2007, at 4:46 PM, Bill Janssen wrote: Have you tried another PPC machine? No. It's in another location, but perhaps I can get it tomorrow. On the other hand, the success when using 2.0 makes it likely to me that the machine isn't the problem. OK, I've reverted to my original codebase (where I first create a reader and do the deletions, then create a writer and do the additions and optimize), and it works fine with lucene-core-2.0.0, but fails with lucene-core-2.3.-whatever (last night's build). Here's the dump: indexing with /Library/Java/Home/bin/java - Dcom.parc.uplib.indexing.debugMode=true "- Dcom.parc.uplib.indexing.indexProperties=contents:title:categories $,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email- message-id*:email-guid*:email-subject:email-from-name:email-from- address*:email-attachment-to*:email-thread-index*:email-references $,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music- genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*" - classpath "/local/uplib/share/UpLib-1.7/code/lucene- core-2.3-2007-11-29_02-49-31.jar:/local/uplib/share/UpLib-1.7/code/ LuceneIndexing.jar" -Dorg.apache.lucene.writeLockTimeout=2 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index" update /local/janssen-uplib/docs 01160-06-3246-773 01159-97-2914-663 01159-89-7507-719 01159-89-5614-073 01159-89-1159-244 01159-89-0665-499 thr001: acquiring lock: LuceneIndex... 
thr001: acquired lock: LuceneIndex*
thr001: releasing lock: LuceneIndex*
thr001: indexing output is
IFD [main]: setInfoStream deletionPolicy [EMAIL PROTECTED]
IW 0 [main]: setInfoStream: dir=org.apache.lucene.store.FSDirectory@/local/janssen-uplib/index autoCommit=true [EMAIL PROTECTED] mergeScheduler [EMAIL PROTECTED] ramBufferSizeMB=16.0 maxBuffereDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=1 index=_4j:c19686
IW 0 [main]: setMaxFieldLength 2147483647
Working on document /local/janssen-uplib/docs/01160-06-3246-773
Adding header 'apparent-mime-type' I to 01160-06-3246-773
Adding header 'authors' IT to 01160-06-3246-773
Adding header 'categories' I (cartoon) to 01160-06-3246-773
Adding header 'date' I (19951005) to 01160-06-3246-773
Adding header 'sha-hash' I to 01160-06-3246-773
Created empty doc
Document01160-06-3246-773> stored/uncompressed,indexed stored/uncompressed,indexed>
Added 01160-06-3246-773 (1 versions)
Working on document /local/janssen-uplib/docs/01159-97-2914-663
Adding header 'apparent-mime-type' I to 01159-97-2914-663
Adding header 'authors' IT to 01159-97-2914-663
Adding header 'categories' I (cartoon) to 01159-97-2914-663
Adding header 'date' I (19951004) to 01159-97-2914-663
Adding header 'sha-hash' I to 01159-97-2914-663
Created empty doc
Document01159-97-2914-663> stored/uncompressed,indexed stored/uncompressed,indexed>
Added 01159-97-2914-663 (1 versions)
Working on document /local/janssen-uplib/docs/01159-89-7507-719
Adding header 'apparent-mime-type' I to 01159-89-7507-719
Adding header 'sha-hash' I to 01159-89-7507-719
Adding header 'title' IT (Photoshop Metal Texture) to 01159-89-7507-719
Created empty doc
Document01159-89-7507-719> stored/uncompressed,indexed stored/uncompressed,indexed>
Using charset utf8 for contents.txt
Using language en for contents.txt
page 0 (580): Tutorials\xa5 News\xa5 Exclusives\xa5 S
page 1 (1680): On a new layer create a gradie
page 2 (1118): Scrapes and scratches are irre
page 3 (470): Bevel Settings\xa5 Contour Settin
Using charset utf8 for contents.txt
Using language en for contents.txt
Added 01159-89-7507-719 (5 versions)
Working on document /local/janssen-uplib/docs/01159-89-5614-073
Adding header 'apparent-mime-type' I to 01159-89-5614-073
Adding header 'sha-hash' I to 01159-89-5614-073
Adding header 'title' IT (Creating Virtual Mats and Frames with The GIMP) to 01159-89-5614-073
Created empty doc
Document01159-89-5614-073> stored/uncompressed,indexed stored/uncompressed,indexed>
Using charset utf8 for contents.txt
Using language en for contents.txt
page 0 (663): All photographs and articles o
page 1 (600): Although real mats and frames
page 2 (999): The Procedure First of all you
page 3 (615): Run the Add Mat script (Script
page 4 (693): in the GIMP toolbox Pattern: U
page 5 (703): 3D lighted/shaded appearance.
page 6 (719): Bevel Fill Color, pops up a di
page 7 (714): texture afterwards. Default: o
page 8 (797): recommended, especially if you
page 9 (461): moving outwards, as in adding
page 10 (67): 11 Creating Virtual Mats and F
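(For readers unfamiliar with the pattern Bill describes above, his original codebase, a reader pass for the deletions followed by a writer pass for the additions and an optimize, corresponds roughly to the sketch below against the Lucene 2.x API. This is not UpLib's actual code; the "id" field name, the SimpleAnalyzer, and the helper method are all made up for illustration.

    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.store.Directory;

    public class UpdateSketch {
        // Delete any existing copies of the documents, then add the new versions.
        static void update(Directory dir, String[] ids, Document[] newDocs) throws Exception {
            // Pass 1: deletions through an IndexReader.
            IndexReader reader = IndexReader.open(dir);
            for (int i = 0; i < ids.length; i++) {
                reader.deleteDocuments(new Term("id", ids[i]));  // hypothetical "id" keyword field
            }
            reader.close();

            // Pass 2: additions through an IndexWriter on the existing index, then optimize.
            IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), false);
            for (int i = 0; i < newDocs.length; i++) {
                writer.addDocument(newDocs[i]);
            }
            writer.optimize();
            writer.close();
        }
    }

The point of interest for this thread is that the same delete-then-add sequence works against lucene-core-2.0.0 but fails against the 2.3 nightly.)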
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
"Bill Janssen" <[EMAIL PROTECTED]> wrote: > No. It's in another location, but perhaps I can get it tomorrow. > On the other hand, the success when using 2.0 makes it likely to me > that the machine isn't the problem. Yeah good point. Seems like a long shot (wishful thinking on my part!). Your errors seem to happen around the same area (~20K docs). If you skip the first say ~18K docs does the error still happen? We need to somehow narrow this down. Or is there any way I could get a temporary account to log into this box and try to track this down? (If indeed it doesn't happen on an x86 box -- I unfortunately don't have access to a PPC machine). Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
"Grant Ingersoll" <[EMAIL PROTECTED]> wrote: > Just a theory (make that a guess), Mike, but is it possible that the > one merge scheduler is hitting a synchronization issue with the > deletedDocuments bit vector? That is one thread is cleaning it up and > the other is accessing and they aren't synchronizing their access? Well, in trunk I think we are hitting the bit vector in synchronized contexts, correctly. (I sure think/hope so :). Also, in the context of merging, the deleted docs bit vector is read only. This sure does spookily sound like LUCENE-140!! I hope that one is not coming back from the dead! Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
I have PPC and Intel access if that helps. Just need a test case. On Nov 29, 2007, at 5:37 PM, Michael McCandless wrote: "Bill Janssen" <[EMAIL PROTECTED]> wrote: No. It's in another location, but perhaps I can get it tomorrow. On the other hand, the success when using 2.0 makes it likely to me that the machine isn't the problem. Yeah good point. Seems like a long shot (wishful thinking on my part!). Your errors seem to happen around the same area (~20K docs). If you skip the first say ~18K docs does the error still happen? We need to somehow narrow this down. Or is there any way I could get a temporary account to log into this box and try to track this down? (If indeed it doesn't happen on an x86 box -- I unfortunately don't have access to a PPC machine). Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: CorruptIndexException
Thank you, I did indeed use a newer version of Lucli by mistake.

-----Original Message-----
From: Michael McCandless [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 29, 2007 6:30 PM
To: java-user@lucene.apache.org
Subject: Re: CorruptIndexException

That exception means your index was written with a newer version of
Lucene than the version you are using to open the IndexReader.

It looks like you used the unreleased (2.3 dev) version of Lucli from
the Lucene trunk and then went back to an older Lucene JAR (maybe 2.2?)
for accessing it?

In general, writing an index with a newer version of Lucene and then
trying to access it using an older version of Lucene doesn't work
(whereas the opposite does).

I'm afraid you either have to switch to 2.3-dev for reading your index
(but beware it could have sneaky bugs ...), or rebuild your index with
the 2.2 version of Lucene and use the 2.2 Lucli in the future.

Mike

"Melanie Langlois" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I used Lucli to optimize my index while my application was stopped. After
> restarting my application, I could not search my index anymore; I got the
> following exception:
>
> org.apache.lucene.index.CorruptIndexException: Unknown format version: -4
>         at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:204)
>         at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:190)
>         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:610)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:185)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:167)
>
> I have two questions:
>
> - Why does it occur? Should I use another tool to access the index
>   outside of my application?
> - Is there a way to recover?
>
> Thanks,
>
> Mélanie

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
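(As a concrete illustration of the failure mode Mike explains, the version check happens when the reader opens the segments file, so code along the lines of the sketch below, run against a 2.2 JAR on an index touched by a 2.3-dev tool, is where the exception surfaces. The directory path is just a placeholder; catching the exception only lets you report it cleanly, it does not make the index readable by the older JAR.

    import org.apache.lucene.index.CorruptIndexException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class OpenIndexSketch {
        public static void main(String[] args) throws Exception {
            Directory dir = FSDirectory.getDirectory("/path/to/index");  // placeholder path
            try {
                IndexReader reader = IndexReader.open(dir);
                System.out.println("Opened index with " + reader.numDocs() + " docs");
                reader.close();
            } catch (CorruptIndexException e) {
                // Thrown when the on-disk format is newer than this Lucene JAR understands,
                // e.g. "Unknown format version: -4" after optimizing with a 2.3-dev Lucli.
                System.err.println("Index was written by a newer Lucene version: " + e.getMessage());
            }
        }
    }

The recovery options are exactly the two Mike lists: read the index with the same (newer) version that wrote it, or rebuild it with the older version you intend to keep using.)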