Re: Indexing in multi-threaded environment

2005-05-03 Thread Chris Lamprecht
Hi Sodel, You could use a single queue, where one thread pulls things off the queue and any number of threads put things on the queue. You can index say 1000 documents each to RAMDirectories in multiple threads, then enqueue the RAMDirectories. When the queue reaches a certain size, the single t

Re: Indexing of virtual "made up" documents

2005-05-03 Thread Erik Hatcher
On Apr 30, 2005, at 7:01 AM, Daniel Stephan wrote: Erik, thank you very much for your help! I am not in the position to build the indexing (other features are in line before that), yet, but I will try Lucene for it. Looks very good :) What I did not ask the last time, because it just occurred to

Re: PerFieldSimilarity

2005-05-03 Thread Erik Hatcher
On May 3, 2005, at 5:57 PM, Robichaud, Jean-Philippe wrote: Hi Everyone, I've been searching the archive without success to answer this one: is it possible to specify one similarity class per field, just like we can do with an analyzer? I know I can change the similarity of the searcher, bu

indexing to multiple indexes

2005-05-03 Thread Omar Didi
Hi guys, I am writing an indexing application that writes to 4 different indexes. The way it works is the following: I write to one index every 1 documents and then close that index and call optimize() as well, at the same time I write to the second index and close it after 1 docs and s

RE: Implementation of a ScoreObject ?

2005-05-03 Thread Robichaud, Jean-Philippe
I would gladly help. I fear that my Java skills are probably a little limited for the task, but hey, why not. I would certainly need some guidance as to where to start from. I'm just too unfamiliar with complex query structures and scoring methodology. While I'm pretty sure reading the entir

PerFieldSimilarity

2005-05-03 Thread Robichaud, Jean-Philippe
Hi Everyone, I've been searching the archive without success to answer this one: is it possible to specify one similarity class per field, just like we can do with an analyzer? I know I can change the similarity of the searcher, but that restricts me to breaking some complex queries into different

Re: Compass 0.4 Released

2005-05-03 Thread Gusenbauer Stefan
[EMAIL PROTECTED] wrote: Hi, "We are pleased to announce the initial release of Compass, a new concept in semantic Search Engine/Object Mapping (OSEM) technology. Compass is a Java framework, built on top of the Lucene Search Engine, making it simple to map your Java object m

RE: Indexing in multi-threaded environment

2005-05-03 Thread Mufaddal Khumri
Hi, Calls to IndexWriter.addIndexes are synchronized. Your code should not have to do anything more than just call it. I believe roughly this will be the scenario that you are looking for: - while(there is more data) - spawn a thread to handle creating documents for this data

Fwd: getting document metadata

2005-05-03 Thread Pablo Gomes Ludermir
Forgot to send to the list. -- Forwarded message -- From: Pablo Gomes Ludermir <[EMAIL PROTECTED]> Date: May 3, 2005 9:07 PM Subject: Re: getting document metadata To: Luke Shannon <[EMAIL PROTECTED]> I actually would like to have a single field on the Document object, named CON

Re: getting document metadata

2005-05-03 Thread Luke Shannon
Hi Pablo; Can you give a little more detail? I don't understand what you mean when you say "indexing the path when adding the document to the index". If you get a Lucene document using LucenePDFDocument class (http://www.pdfbox.org/javadoc/index.html), the document returned will contain a field

RE: Indexing in multi-threaded environment

2005-05-03 Thread Peter Veentjer - Anchor Men
You should only give a single thread access to the IndexWriter. I have created an indexupdater that stores all the delete and write requests, and once in a while a thread (triggered by Quartz) processes the requests in a single batch. Another way would be synchronizing the indexupdater and only

getting document metadata

2005-05-03 Thread Pablo Gomes Ludermir
Hello all, I would like to retrieve some document metadata after the search, i.e. the documents that are returned in the Hits would be PDFs and I would be able to get some info using PDFBox. But I am not sure about indexing the path when adding the document to the index (I do some processing with

Indexing in multi-threaded environment

2005-05-03 Thread Sodel Vazquez-Reyes
Hi, I am starting my application in a multi-threaded environment. Could somebody show me any examples that serialize calls to IndexWriter.addDocument(Document)? My idea is to use RAMDirectory-based indexes in parallel, one in each thread, and merge them into a single index on the disk using Ind

Re: HTML parser??

2005-05-03 Thread Damian Gajda
Hello, > These documents are in German. They contain different special > characters, and different ways of writing these special characters, like "ö", > "ö" and "ö". Does somebody know a parsing engine that has no problems > with all these different ways to write these special characters? I've

Re: Compass 0.4 Released

2005-05-03 Thread Erik Hatcher
On May 3, 2005, at 8:18 AM, Joseph B. Ottinger wrote: Well, it's been on theserverside.com for about thirty minutes already... :) This message was delayed as it was sent from an unsubscribed address and I had to moderate it in. So I saw it first :) Erik On Tue, 3 May 2005 [EMAIL PROTECTE

Re: HTML parser??

2005-05-03 Thread Erik Hatcher
On May 3, 2005, at 4:35 AM, Bartosch Warzecha wrote: Hello, I'm building a search engine for HTML documents, and I've got an HTML-parsing problem. These documents are in German. They contain different special characters, and different ways of writing these special characters, like "ö", "

Re: Compass 0.4 Released

2005-05-03 Thread Joseph B. Ottinger
Well, it's been on theserverside.com for about thirty minutes already... :) On Tue, 3 May 2005 [EMAIL PROTECTED] wrote: Hi, "We are pleased to announce the initial release of Compass, a new concept in semantic Search Engine/Object Mapping (OSEM) technology. Compass is a Java framework, built

Compass 0.4 Released

2005-05-03 Thread kimchy . compass
Hi, "We are pleased to announce the initial release of Compass, a new concept in semantic Search Engine/Object Mapping (OSEM) technology. Compass is a Java framework, built on top of the Lucene Search Engine, making it simple to map your Java object model into a Search Engine. Providing

HTML parser??

2005-05-03 Thread Bartosch Warzecha
Hello, I'm building a search engine for HTML documents, and I've got an HTML-parsing problem. These documents are in German. They contain different special characters, and different ways of writing these special characters, like "ö", "ö" and "ö". Does somebody know a parsing engine that has