RE: Linking two different indexes

2007-03-25 Thread Damien McCarthy
Hi Mike, IndexWriter provides an addIndexes() method, which should do what you are looking for, if I understand correctly. Damien -Original Message- From: Yakn [mailto:[EMAIL PROTECTED] Sent: 25 March 2007 03:02 To: java-user@lucene.apache.org Subject: Linking two different indexes I am t

Re: MergeFactor and MaxBufferedDocs value should ...?

2007-03-25 Thread Erick Erickson
I should add that in my situation, the number of documents that fit in RAM is...er...problematical to determine. My current project is composed of books that I chose to index a single book at a time. Unfortunately, answering the question "how big is a book" doesn't help much; they range from 2

RE: Linking two different indexes

2007-03-25 Thread Yakn
Thanks Damien, I believe that addIndex(index) is only going to add the index into the new indexes. But how do I actually link the document either at search time or index time from the url in the database indexes and the Nutch index? So to explain my problem a little better Nutch Index

Re: Reverse search

2007-03-25 Thread markharw00d
On app startup: 1) parse all Queries and place in an array. 2) Create a RAMIndex containing a doc for each query with content consisting of the query's terms (see Query.extractTerms). For optimal performance only index the most rare term for queries with multiple mandatory criteria e.g. Phrase
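A minimal sketch of the idea in the message above, not markharw00d's actual code. It assumes every stored query is a plain set of mandatory terms, and it swaps the in-memory Lucene index for a HashMap keyed on each query's rarest term. The class name, the StoredQuery type, the termFrequency map and the whitespace tokenizer are all hypothetical; the matching step for an incoming document is likewise an assumption, since the snippet is cut off before it describes one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration of "reverse search"; no Lucene dependency. */
class ReverseSearchSketch {

    /** Hypothetical stored query: every term is mandatory. */
    static final class StoredQuery {
        final String name;
        final Set<String> requiredTerms;

        StoredQuery(String name, String... terms) {
            this.name = name;
            this.requiredTerms = new LinkedHashSet<>(Arrays.asList(terms));
        }
    }

    // Stands in for the in-memory index: rarest term -> queries filed under it.
    private final Map<String, List<StoredQuery>> queriesByRarestTerm = new HashMap<>();

    /**
     * termFrequency is assumed corpus statistics (term -> document count).
     * Terms missing from the map are treated as frequency 0, i.e. rarest.
     */
    ReverseSearchSketch(List<StoredQuery> queries, Map<String, Integer> termFrequency) {
        for (StoredQuery query : queries) {
            String rarest = null;
            int lowest = Integer.MAX_VALUE;
            for (String term : query.requiredTerms) {
                int frequency = termFrequency.getOrDefault(term, 0);
                if (rarest == null || frequency < lowest) {
                    rarest = term;
                    lowest = frequency;
                }
            }
            if (rarest != null) { // queries without terms are skipped
                queriesByRarestTerm.computeIfAbsent(rarest, k -> new ArrayList<>()).add(query);
            }
        }
    }

    /** Returns the names of stored queries whose terms all occur in the text, sorted. */
    List<String> matchingQueries(String documentText) {
        // Naive tokenizer: lowercase, split on non-word characters.
        Set<String> docTerms =
                new HashSet<>(Arrays.asList(documentText.toLowerCase().split("\\W+")));
        List<String> matches = new ArrayList<>();
        for (String term : docTerms) {
            // Only queries filed under a term the document contains are candidates.
            for (StoredQuery candidate
                    : queriesByRarestTerm.getOrDefault(term, Collections.emptyList())) {
                if (docTerms.containsAll(candidate.requiredTerms)) {
                    matches.add(candidate.name);
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }
}
```

Filing each query under a single rare term keeps the candidate list short: a document only triggers the full check for queries whose rarest term it actually contains.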

Re: index word files ( doc )

2007-03-25 Thread Antony Bowesman
I've been using Ryan's textmining in preference to POI, as internally TM uses POI and the Word6 extractor, so it handles a greater variety of files. Ryan, thanks for fixing your site. Do you have any plans/ideas on how to parse the 'fast-saved' files and any ideas on Word files older than the Wor

Re: index word files ( doc )

2007-03-25 Thread Ryan Ackley
Yes I do have plans for adding fast save support and support for more file formats. The time frame for this happening is the next couple of months. I'm playing with the idea of offering a commercial version. I want to continue to support the open source community so I want to keep it open source

Re: index word files ( doc )

2007-03-25 Thread Daniel Noll
Ryan Ackley wrote: As the author of both Word POI and textmining.org, I recommend using textmining.org. POI is for general purpose manipulation of Word documents. textmining's only purpose is extracting text. I wish the two would collaborate though. It's true that POI contains code for writin

Re: Linking two different indexes

2007-03-25 Thread Daniel Noll
Yakn wrote: Thanks Damien, I believe that addIndex(index) is only going to add the index into the new indexes. But how do I actually link the document either at search time or index time from the url in the database indexes and the Nutch index? So to explain my problem a little better Nutch Inde

how to search over another search

2007-03-25 Thread Mohammad Norouzi
Hi, I have two separate indexes, but some fields are common to both. Now I want to search one index and then apply the result to the second one. What solution do you suggest? What happens with the fields? I mean the first document has some fields that are not present in the second

Re: index word files ( doc )

2007-03-25 Thread Antony Bowesman
Ryan Ackley wrote: Yes I do have plans for adding fast save support and support for more file formats. The time frame for this happening is the next couple of months. That would be good when it comes. It would be nice if it could handle a 'brute force' mode where in the event of problems, it