Re: parameter create in IndexWriter

2006-09-06 Thread jacky
I am afraid I don't understand. Entering the wrong path? That will rarely happen, since the index path is always hard-coded in the config file. And usually, users need to create a brand new index and then append new documents to it. Best Regards. jacky

Indexing MS Powerpoint files with Lucene

2006-09-06 Thread Venkateshprasanna
Is there any filter available for extracting text from MS Powerpoint files and indexing them? The Lucene website suggests the POI project, which, it seems, does not support PPT files as of now. Regards, Venkateshprasanna -- View this message in context: http://www.nabble.com/which-way-to-index-

Re: parameter create in IndexWriter

2006-09-06 Thread Daniel Noll
jacky wrote: > Hi, today I found a funny thing: if the "create" parameter of IndexWriter is set to false and there are no index segments in the Directory yet, an IOException is thrown. > I am confused about why it does not use this logic: if a segments file exists, append to it,

parameter create in IndexWriter

2006-09-06 Thread jacky
Hi, today I found a funny thing: if the "create" parameter of IndexWriter is set to false and there are no index segments in the Directory yet, an IOException is thrown. I am confused about why it does not use this logic: if a segments file exists, append to it; otherwise create it. I kno
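
The "append if it exists, create otherwise" logic asked about above can be sketched in plain Java. This is only an illustration, not Lucene code: the class and method names (`IndexCreateFlag`, `shouldCreate`) are hypothetical, and it assumes an existing index can be recognised by a file whose name starts with "segments" in the index directory.

```java
import java.io.File;

// Hypothetical helper, not part of Lucene: decides the value of the
// "create" flag before an IndexWriter is opened.
final class IndexCreateFlag {

    // Assumption: an existing index contains a file whose name starts
    // with "segments". Returns true when no such file is found.
    static boolean shouldCreate(File indexDir) {
        String[] names = indexDir.list(); // null if the path is missing or not a directory
        if (names == null) {
            return true;
        }
        for (String name : names) {
            if (name.startsWith("segments")) {
                return false; // index found: append to it
            }
        }
        return true; // nothing there yet: create a new index
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : "index");
        System.out.println("create = " + shouldCreate(dir));
    }
}
```

The returned boolean would then be passed as the `create` argument of the IndexWriter constructor. As far as I know, Lucene releases of this period also offer `IndexReader.indexExists`, which is probably the better check in real code.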

Re: Atomic index/search for a phrase

2006-09-06 Thread Venkateshprasanna
Which is more efficient with respect to performance? Indexing a phrase as it is and searching with the help of a TermQuery, or storing only single words in the index and making use of quoted search phrases? Regards, Venkateshprasanna If you index "A Phrase" as untokenized, you would find i

Re: Update index

2006-09-06 Thread Doron Cohen
"WATHELET Thomas" <[EMAIL PROTECTED]> wrote on 23/08/2006 00:49:25: > Is it possible to update fields in an existing index. > If yes how to proceed. Unfortunately no. To update a (document's) field that document must be removed and re-added. -

Re: Indexer large file and hi performance indexing

2006-09-06 Thread Doron Cohen
"HODAC, Olivier" <[EMAIL PROTECTED]> wrote on 06/09/2006 03:04:15: > hello, > I am designing an application whose bottleneck is the indexing process. Indexing objects blocks the user's actions. Furthermore, I need to index a large amount of documents (3 per day) and save them on the

Re: parser question

2006-09-06 Thread Mark Miller
Yes, it's ANDing them. The queries 'software engineer', 'software OR engineer', and 'software AND engineer' all return the same results. The generated queries for them are, respectively, '(field:software field:engineer)', '(field:software field:engineer)' and '(+field:software +field:engineer)'.

Re: parser question

2006-09-06 Thread Erick Erickson
Also, watch your query: the OR is case-sensitive. If you lowercase the entire string, the 'OR' gets lowercased too, in which case it's not interpreted as an operator. On 9/6/06, Chris Salem <[EMAIL PROTECTED]> wrote: i set the default operator to AND, but if i have a query with an OR in it it d
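
To illustrate the case-sensitivity point, here is a rough sketch in plain Java that lowercases the user's terms while leaving the uppercase operators alone. It is not Lucene code: `QueryCaseNormalizer` and `lowercaseTerms` are made-up names, and the naive whitespace split assumes a simple query with no quoted phrases, field prefixes or ranges.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: lowercases query terms but
// keeps the uppercase boolean operators AND, OR and NOT unchanged.
final class QueryCaseNormalizer {

    static String lowercaseTerms(String query) {
        StringBuilder out = new StringBuilder();
        // Naive assumption: tokens are separated by whitespace only.
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            boolean isOperator =
                token.equals("AND") || token.equals("OR") || token.equals("NOT");
            out.append(isOperator ? token : token.toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowercaseTerms("Software OR Engineer"));
    }
}
```

A lowercase "or" typed by the user stays lowercase here, so it would still be treated as an ordinary term rather than an operator.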

Re: parser question

2006-09-06 Thread Mark Miller
Are you sure it is ANDing them? field:software field:engineer indicates an OR operation. +field:software +field:engineer indicates an AND operation. - Mark Chris Salem wrote: i set the default operator to AND, but if i have a query with an OR in it it doesn't work, for example, if i ha

parser question

2006-09-06 Thread Chris Salem
I set the default operator to AND, but if I have a query with an OR in it, it doesn't work. For example, if I have the query 'software OR engineer', the parser interprets it as 'field:software field:engineer' and ANDs them. How would I fix this? Chris Salem 440.946.5214 x5458 [EMAIL PROTECTED]

Re: How does PhraseQuery search for quoted phrases?

2006-09-06 Thread Erik Hatcher
On Sep 6, 2006, at 5:37 AM, Venkateshprasanna wrote: How does PhraseQuery search for quoted phrases when the index does not store these phrases as they are? Is there any analyzer that indexes the phrases? The analyzer is responsible for some additional data about terms, specifically in this

Indexer large file and hi performance indexing

2006-09-06 Thread HODAC, Olivier
Hello, I am designing an application whose bottleneck is the indexing process. Indexing objects blocks the user's actions. Furthermore, I need to index a large amount of documents (3 per day) and save them on the file system. The first developments have been initiated with Lucene 1.4.3 and

Re: Keep hits in results

2006-09-06 Thread jacky
Oh, that is not good news. So when the index is updated, the held searcher will get the old documents? It is indeed not a good idea to keep the hits. I had thought Lucene did this work for users. Best Regards. jacky - Original Message - From: "Erik Hatcher" <

How does PhraseQuery search for quoted phrases?

2006-09-06 Thread Venkateshprasanna
How does PhraseQuery search for quoted phrases when the index does not store these phrases as they are? Is there any analyzer that indexes the phrases? -- View this message in context: http://www.nabble.com/How-does-PhraseQuery-search-for-quoted-phrases--tf2225757.html#a6167885 Sent from the Luce

Re: Keep hits in results

2006-09-06 Thread Erik Hatcher
On Sep 6, 2006, at 4:41 AM, jacky wrote: Erik, thanks! You are right! It is not a good idea to hold on to a Hits object since the index will be updated. So, when I keep a Hits object and the index is then updated, the searcher will be auto-updated too, right? No, Lucene itself has no "auto-update" of

Re: Keep hits in results

2006-09-06 Thread jacky
Erik, thanks! You are right! It is not a good idea to hold on to a Hits object since the index will be updated. So, when I keep a Hits object and the index is then updated, the searcher will be auto-updated too, right? Best Regards. jacky - Original Message - From: "Erik Hatch

Re: which way to index pdf,word,excel

2006-09-06 Thread Christiaan Fluit
Have a look at Aperture: http://aperture.sourceforge.net/ It provides components for crawling and text and metadata extraction. It's still in alpha stage though. The development code in CVS has already improved a lot over the last official alpha release. Chris -- James liu wrote: i wanna fin

Re: Keep hits in results

2006-09-06 Thread Erik Hatcher
On Sep 6, 2006, at 12:56 AM, jacky wrote: hi, The following words are quoted from "lucene in action": "There are a couple of implementation approaches: 1. Keep the original Hits and IndexSearcher instances available while the user is navigating the search results. 2. Requery each time

Re: obtaining the number of documents stored in a .cfs file

2006-09-06 Thread Volodymyr Bychkoviak
One more note: this should be in the package 'org.apache.lucene.index' because it uses some package-visible classes :) Volodymyr Bychkoviak wrote: I've used the following code to recover the index. Note: it only works with .cfs files.
String path = // path to index
File file = new File(path);

Re: obtaining the number of documents stored in a .cfs file

2006-09-06 Thread Volodymyr Bychkoviak
I've used the following code to recover the index. Note: it only works with .cfs files.
String path = // path to index
File file = new File(path);
Directory directory = FSDirectory.getDirectory(file, false);
String[] files = file.list(new FilenameFilter() { public boolean accept(File

Re: Keep hits in results

2006-09-06 Thread Doron Cohen
Yes, you're right, my mistake. (So only the stateless/stateful more/less simple consideration remains). "jacky" <[EMAIL PROTECTED]> wrote on 05/09/2006 23:57:44: > doron, Thanks! > But in lucene api: For performance reasons it is recommended to > open only one > IndexSearcher and use it for all