Re: Lucene search returns Zip file Name

2007-01-05 Thread Aslam Bari
Hi Erick, thanks for the reply. Can you tell me how to change the indexer to store custom fields, so that I can store two new fields like "Main File Name" and "Real File Name"? That way I can store the zip file name as the main file, and the actual file where the data was found as the real file. Thanks... Erick Erickson wrote: > > Y

Re: Is DWF extractor for Lucene exists?

2007-01-05 Thread Aslam Bari
Hi Grant, thanks for the reply. Can you tell me how to change the indexer to store custom fields, so that I can store two new fields like "Main File Name" and "Real File Name"? That way I can store the zip file name as the main file, and the actual file where the data was found as the real file. Thanks... Grant Ingersoll-6 wrote: > >

Standards Compliant Browser Campaign

2007-01-05 Thread EDMOND KEMOKAI
Hi guys, sorry about the off-topic posting, but I thought this mailing list consists of the audience for this campaign. A campaign has been launched to discourage web developers and webmasters from using IE hacks to obscure the browser's shortcomings. Please read the appeal from the address below

Re: Memory Allocation

2007-01-05 Thread Erick Erickson
Only by specifying the memory allowed for the JVM as far as I know (-Xmx). Erick On 1/5/07, Ryan O'Hara <[EMAIL PROTECTED]> wrote: I just started using SpellChecker and I have encountered a java heap space exception, which may or may not be related to using SpellChecker. I was wondering if t
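
As a rough illustration of the -Xmx advice above: the heap limit is a JVM-wide setting rather than something Lucene exposes, so one way to confirm what limit a search process is actually running under is to ask the JVM itself. The class name HeapReport and the 512m value in the comment are only assumptions for this sketch.

```java
// Minimal sketch: report the maximum heap the JVM will try to use.
// Start the JVM with, for example, "java -Xmx512m HeapReport" to raise the limit
// (512m is only an illustrative value, not a recommendation).
class HeapReport {

    /** Returns the JVM's maximum heap size in megabytes, as reported by the runtime. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

If the reported value is small compared with the size of the data being loaded at search time, a larger -Xmx value is the first thing to try.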

Re: All readers must have same maxDoc: 16651064!=16507074

2007-01-05 Thread Rob Staveley (Tom)
Oh I get it now. That was a great explanation. Thanks Doron. Doron Cohen wrote: Rob, "Rob Staveley (Tom)" <[EMAIL PROTECTED]> wrote on 05/01/2007 06:18:10: I'm attempting to delete documents matching a term on a ParallelReader and got the error message above, presumably while addi

RE: Clustering Lucene with 40 Servers

2007-01-05 Thread Steve Harris
I don't know if terracotta is the right solution for you or not but... --> Their examples talk of a 4 node cluster. This is way too small for my needs. No 4 node limit. It was just a sample probably. Cheers, Steve -Original Message- From: Biggy [mailto:[EMAIL PROTECTED] Sent: Wednesday

Re: All readers must have same maxDoc: 16651064!=16507074

2007-01-05 Thread Doron Cohen
Rob, "Rob Staveley (Tom)" <[EMAIL PROTECTED]> wrote on 05/01/2007 06:18:10: > I'm attempting to delete documents matching a term on a ParallelReader and > got the error message above, presumably while adding directories to the > ParallelReader. See the javadocs for ParallelReader - http://lucene.

Memory Allocation

2007-01-05 Thread Ryan O'Hara
I just started using SpellChecker and I have encountered a java heap space exception, which may or may not be related to using SpellChecker. I was wondering if there is a way to allocate the amount of memory Lucene uses during a search? Thanks, Ryan --

All readers must have same maxDoc: 16651064!=16507074

2007-01-05 Thread Rob Staveley (Tom)
I'm attempting to delete documents matching a term on a ParallelReader and got the error message above, presumably while adding directories to the ParallelReader. I'm puzzled, because I don't need to have the same maxDoc (and numDoc) in index directories for a ParallelMultiSearcher, so what's the

Re: Query

2007-01-05 Thread Erik Hatcher
On Jan 5, 2007, at 4:25 AM, maarsh wrote: Hi, I am not getting this error when I use Lucene 1.4.3, but only with Lucene 2.0.0. Is there something I need to do for this? Certainly there is something you'll need to do to get it working :) But as far as the problem being something Lucene-spe

Re: efficient ways of updating document

2007-01-05 Thread Erick Erickson
Also, look at the IndexModifier class, which hides a number of the ugly details. Under the covers, I think (although I haven't looked) that it pretty much does exactly what you outlined, at least the recommendation that you do your deletes and adds in batches leads me to think so. Best Erick On

Re: Lucene search returns Zip file Name

2007-01-05 Thread Erick Erickson
You only get things out of an index that you put in there. At index time, you need to associate file names with content. Something like indexing the text of each file in the zip file as a separate lucene document, perhaps with the associated zip file name and the real file name. Best Erick On 1
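
The suggestion above (one document per file inside the zip, each carrying both names) might be sketched as follows. This uses only java.util.zip and a hypothetical EntryRecord class as a stand-in for a Lucene document; the names mainFileName and realFileName mirror the field names proposed in the thread, and turning each record into actual Lucene fields is left out. It assumes Java 8 or later and UTF-8 text content.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipEntryRecords {

    /** Hypothetical stand-in for one indexed document: one record per file in the zip. */
    static final class EntryRecord {
        final String mainFileName; // name of the zip archive
        final String realFileName; // name of the file inside the archive
        final String contents;     // text of that file (assumed UTF-8)

        EntryRecord(String mainFileName, String realFileName, String contents) {
            this.mainFileName = mainFileName;
            this.realFileName = realFileName;
            this.contents = contents;
        }
    }

    /** Reads every non-directory entry and pairs its text with both file names. */
    static List<EntryRecord> readEntries(String zipName, InputStream zipBytes) {
        List<EntryRecord> records = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(zipBytes)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zin.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String text = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                records.add(new EntryRecord(zipName, entry.getName(), text));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    /** Builds a tiny two-file zip in memory so the sketch can run without any files on disk. */
    static byte[] sampleZip() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(out)) {
            zout.putNextEntry(new ZipEntry("a.txt"));
            zout.write("hello lucene".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
            zout.putNextEntry(new ZipEntry("b.txt"));
            zout.write("second file".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (EntryRecord r : readEntries("docs.zip", new ByteArrayInputStream(sampleZip()))) {
            System.out.println(r.mainFileName + " -> " + r.realFileName + ": " + r.contents);
        }
    }
}
```

At index time, each record would become its own document with the two names stored alongside the indexed text, so a hit can report the inner file rather than only the archive.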

Re: Is DWF extractor for Lucene exists?

2007-01-05 Thread Grant Ingersoll
Lucene works with text; we don't provide any custom extractors. You will need to parse DWF (I'm not even sure what a DWF file is) and construct the Document/Fields that you want based on your content. -Grant On Jan 5, 2007, at 7:09 AM, Aslam Bari wrote: Dear All, I need to index and sear

Re: Lucene index update

2007-01-05 Thread Erick Erickson
Try http://www.getopt.org/luke/ or just google lucene luke. If you're going to be playing with the indexes, I really, really, really recommend that you get a copy of Luke and get familiar with it. It allows you to examine your indexes, explains queries, shows the effects of different analyzers on

Re: efficient ways of updating document

2007-01-05 Thread Marcelo Ochoa
John: I have implemented a batch (delete, insert, update) operation for the Oracle Lucene Domain Index using OJVMDirectory, see patch: http://issues.apache.org/jira/browse/LUCENE-724 The strategy used in this solution is to enqueue all operations on the table which has a column indexed by Lucene

Is DWF extractor for Lucene exists?

2007-01-05 Thread Aslam Bari
Dear All, I need to index and search DWF files. Is this possible in Lucene? If yes, which extractor should I use to index the DWF files? Thanks...

Lucene search returns Zip file Name

2007-01-05 Thread Aslam Bari
Dear all, I'm using Lucene to index zip files. Suppose a zip file contains 4 files. All files get indexed well with the URI of the zip file, which means that when I search for any content, the result comes back and the result file name is the zip file, but I need to know the real file name in which the data was found. How to get t

Re: getting the maximum Hits doc

2007-01-05 Thread Nils Höller
On Thursday, 04.01.2007, 14:38 -0600, Dennis Kubes wrote: > Hits should be sorted according to score. Getting the first document > should give you the one with the highest score. > > Dennis Hi, thanks for your comment. I found the mistake. Now the Hits are sorted. Nils > > Nils Höl

Re: getting the maximum Hits doc

2007-01-05 Thread Erik Hatcher
Maybe your MySearcher is doing something different than the "Hits IndexSearcher.search()" method? On Jan 4, 2007, at 3:38 PM, Dennis Kubes wrote: Hits should be sorted according to score. Getting the first document should give you the one with the highest score. Dennis Nils Höller wrote

Re: Lucene index update

2007-01-05 Thread Ivan Vasilev
Thanks a lot for your help, Erick. I made a simple class that uses your idea of extracting data that is indexed but unstored by using the TermDocs/TermEnum classes. It works properly. Yes, I also think that in some cases the approach of re-indexing documents will be faster. It depends on the docu

Re: Query

2007-01-05 Thread maarsh
Hi, I am not getting this error when I use Lucene 1.4.3, but only with Lucene 2.0.0. Is there something I need to do for this? regards maarsh On 1/4/07, Erik Hatcher <[EMAIL PROTECTED]> wrote: This is not a Lucene issue, but rather a 3rd party tool you're using, which seems to have instrumente

Re: efficient ways of updating document

2007-01-05 Thread Otis Gospodnetic
John, Batch deletion followed by batch addition is the best practise. Ning Li made some changes that improve the performance of non-batch mass delete/add operations, but I'm not sure what the state of those changes is (whether they are still in Lucene's JIRA, or whether they are in CVS). Otis
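
The "batch deletion followed by batch addition" pattern above can be illustrated without any Lucene calls. In the sketch below, BatchUpdater and its in-memory list are hypothetical stand-ins for an index in which adding a document never overwrites an older one with the same id, which is the reason the delete pass is needed at all. Pending updates are queued, then applied as one delete pass followed by one add pass. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchUpdater {
    // Hypothetical in-memory "index": each element is {id, text}; adds only append.
    private final List<String[]> index = new ArrayList<>();
    // Updates waiting to be applied, keyed by document id (last queued text wins).
    private final Map<String, String> pending = new LinkedHashMap<>();

    int deletePasses = 0;
    int addPasses = 0;

    /** Queues a new version of a document instead of touching the index right away. */
    void queueUpdate(String id, String text) {
        pending.put(id, text);
    }

    /** Applies all queued updates: one pass of deletes, then one pass of adds. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass 1: remove every old version of the documents about to be re-added.
        deletePasses++;
        index.removeIf(doc -> pending.containsKey(doc[0]));
        // Pass 2: append the new versions.
        addPasses++;
        for (Map.Entry<String, String> e : pending.entrySet()) {
            index.add(new String[] {e.getKey(), e.getValue()});
        }
        pending.clear();
    }

    /** Number of documents currently stored under the given id. */
    int count(String id) {
        int n = 0;
        for (String[] doc : index) {
            if (doc[0].equals(id)) {
                n++;
            }
        }
        return n;
    }

    /** Text of the first document with the given id, or null if there is none. */
    String textOf(String id) {
        for (String[] doc : index) {
            if (doc[0].equals(id)) {
                return doc[1];
            }
        }
        return null;
    }

    int size() {
        return index.size();
    }
}
```

However many updates are queued, each flush performs a single delete pass and a single add pass, rather than alternating between the two for every document.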