Lucene in clustered environment (Tomcat)

2005-06-06 Thread Ben
Hi, I would like to use Lucene in a clustered environment. What are the things that I should consider and do? I would like to use the same ordinary index storage for all the nodes in the cluster; is that possible? Thanks, Ben

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Chris Hostetter
: was computing the score. This was a big performance gain. About 2x and : since it's the slowest part of our app it was a nice one. :) : : We were using a TermQuery though. I believe that one search on one BooleanQuery containing 20 TermQueries should be faster than 20 searches on 20 TermQuerie

RE: Is there a thai language analyzer available

2005-06-06 Thread Alex Kiselevski
Randy, did you find any solution for a Hebrew analyzer? Alex Kiselevski +9.729.776.4346 (desk) +9.729.776.1504 (fax) AMDOCS > INTEGRATED CUSTOMER MANAGEMENT -Original Message- From: Randy Darling [mailto:[EMAIL PROTECTED] Sent: Monday, June 06, 2005 10:22 PM To: java-user@lucene.apac

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Kevin Burton
Matt Quail wrote: We have a system where I'll be given 10 or 20 unique keys. I assume you mean you have one unique-key field, and you are given 10-20 values to find for this one field? Internally I'm creating a new Term and then calling IndexReader.termDocs() on this term. Then if te

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Kevin Burton
Chris Hostetter wrote: I haven't profiled either of these suggestions but: 1) have you tried constructing a BooleanQuery of all 10-20 terms? Is the total time to execute the search, and access each Hit slower than your termDocs approach? Actually using any type of query was very slow. Th

RE: RE REQUEST: SPECIFIC HIT

2005-06-06 Thread Karthik N S
Hi Apologies. The problem is not with Optics or 'O' , Since the 3rd and 6th Document is Indexed as Document 3 contains = ELECTRONICS DIGITAL CAMERA 0PTICS Document 6 contains = ELECTRONICS DIGITAL CAMERA OPTICS CABEL A search Criteria for 'digital camera Optics' should return

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Matt Quail
We have a system where I'll be given 10 or 20 unique keys. I assume you mean you have one unique-key field, and you are given 10-20 values to find for this one field? Internally I'm creating a new Term and then calling IndexReader.termDocs() on this term. Then if termdocs.next() matches

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Chris Hostetter
I haven't profiled either of these suggestions but: 1) have you tried constructing a BooleanQuery of all 10-20 terms? Is the total time to execute the search, and access each Hit slower than your termDocs approach? 2) have you tried sorting your terms first, then opening a TermDocs on the

Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Kevin Burton
Hey. I'm trying to figure out the FASTEST way to solve this problem. We have a system where I'll be given 10 or 20 unique keys, which are stored as non-tokenized fields within Lucene. Each key represents a unique document. Internally I'm creating a new Term and then calling IndexReader.te

Re: Calculating idf across multiple indexes

2005-06-06 Thread yahootintin . 11533894
Hmmm... I'll look into that. I thought the MultiSearcher would still need access to each index. Does the RemoteSearchable avoid that? Will it allow me to delegate searching to multiple boxes and then collate the results correctly? Thanks for the tip about the RemoteSearchable. --- java-us

log4j:WARN No appenders could be found for logger

2005-06-06 Thread afonseca
Hi! I'm a newbie in Java, and not a real coder. I'm implementing a digital library (Windows) with 2 open-source packages: a server application called FEDORA (www.fedora.info) and a JSP interface called ELATED (http://elated.sourceforge.net). When I start the fedora server I get: c:\fedora-2.0\server\bin>

Re: Calculating idf across multiple indexes

2005-06-06 Thread Daniel Naber
On Tuesday 07 June 2005 00:49, [EMAIL PROTECTED] wrote: > The problem is that if I tell Lucene about only one of the indexes > it has no way of knowing what the total document frequency is across the > other index servers. Can't you use ParallelMultiSearcher and/or RemoteSearchable? Regards Dan

Re: Calculating idf across multiple indexes

2005-06-06 Thread yahootintin . 11533894
Hi Daniel, The problem is that if I tell Lucene about only one of the indexes it has no way of knowing what the total document frequency is across the other index servers. Does that make sense? I think my collator will need to calculate the idf somehow. Thanks. --- java-user@lucene.apa

Re: Calculating idf across multiple indexes

2005-06-06 Thread Daniel Naber
On Tuesday 07 June 2005 00:02, [EMAIL PROTECTED] wrote: > How are others working around > this issue? This has been fixed in the development version of Lucene. It's already quite stable, so I suggest trying it (needs to be checked out from SVN). Regards Daniel -- http://www.danielnaber.de

Calculating idf across multiple indexes

2005-06-06 Thread yahootintin . 11533894
Hi, Due to the size of my index, I need to break it into several different segments. I have a service that gets a query from the user and contacts each index searcher service asynchronously and waits for the results. The results are then collated and returned to the user. The problem is tha
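
One way to picture the collation problem in this thread: if each index searcher service reports its local document frequency and document count for a term, the collator can sum them and compute one idf for the whole collection. The sketch below is only an illustration under that assumption. The class and method names are hypothetical (they are not Lucene classes), and the formula ln(numDocs / (docFreq + 1)) + 1 is the classic default-similarity idf, so check it against the Similarity actually in use.

```java
final class GlobalIdf {
    // Classic idf formula: ln(numDocs / (docFreq + 1)) + 1.
    static double idf(long docFreq, long numDocs) {
        return Math.log((double) numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Hypothetical collator step: sum the per-index statistics first,
    // then compute a single idf for the combined collection.
    static double globalIdf(int[] perIndexDocFreq, int[] perIndexNumDocs) {
        long docFreq = 0;
        long numDocs = 0;
        for (int df : perIndexDocFreq) {
            docFreq += df;
        }
        for (int n : perIndexNumDocs) {
            numDocs += n;
        }
        return idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // Two indexes of 10 documents each; the term occurs only in the first.
        int[] docFreqs = {9, 0};
        int[] numDocs = {10, 10};
        System.out.println("local idf, index 1: " + idf(docFreqs[0], numDocs[0]));
        System.out.println("local idf, index 2: " + idf(docFreqs[1], numDocs[1]));
        System.out.println("global idf:         " + globalIdf(docFreqs, numDocs));
    }
}
```

With purely local statistics the two indexes would weight the same term very differently, which is why scores from separate searchers do not collate cleanly until the statistics are combined.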

Relative term frequency?

2005-06-06 Thread Andy Liu
Is there a way to calculate term frequency scores that are relative to the number of terms in the field of the document? We want to override tf() in this way to curb keyword spamming in web pages. In Similarity, only the document's term frequency is passed into the tf() method: float tf(int freq
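
A rough sketch of what "relative" term frequency could mean here, assuming the field's term count is available from somewhere else (for example, stored at index time), since tf() itself only receives the raw frequency. The class and method names are hypothetical and this is plain arithmetic, not a Lucene Similarity subclass. The square root mirrors the classic default tf, which is sqrt(freq).

```java
final class RelativeTf {
    // Classic default tf: depends on the raw count only.
    static double plainTf(int freq) {
        return Math.sqrt(freq);
    }

    // Hypothetical relative tf: the share of the field's terms taken by this term.
    // Repeating a keyword only helps if it also makes up a larger share of the field.
    static double relativeTf(int freq, int fieldLength) {
        if (fieldLength <= 0) {
            return 0.0; // empty or unknown field length: no contribution
        }
        return Math.sqrt((double) freq / fieldLength);
    }

    public static void main(String[] args) {
        // A short page and a page ten times longer with ten times the repetitions.
        System.out.println("plain tf, short page:    " + plainTf(5));
        System.out.println("plain tf, long page:     " + plainTf(50));
        System.out.println("relative tf, short page: " + relativeTf(5, 10));
        System.out.println("relative tf, long page:  " + relativeTf(50, 100));
    }
}
```

Under the plain formula the longer page scores higher simply by repeating the term more often; under the relative one both pages score the same because the term takes the same share of the field.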

issues with concurrent indexing and searching with HitCollector

2005-06-06 Thread Peter Kim
Hi, I did a quick google search and couldn't find any info on this... I seem to be having a problem when I try to execute a search using a HitCollector while the index is being indexed. Does it make sense that I could be getting this error because the index is being merged while the HitCollector i

Is there a thai language analyzer available

2005-06-06 Thread Randy Darling
Does anyone know of a good Thai language analyzer for Lucene? I saw this email out there for a Thai language analyzer: http://mail-archives.apache.org/mod_mbox/jakarta-lucene-dev/200402.mbox/ [EMAIL PROTECTED] But it looks like it requires modifying the existing parser which I would prefer no

Re: Indexing and Hit Highlighting OCR Data

2005-06-06 Thread Steven Rowe
There is a proposal to extend indexing (item #11 in the API Changes section): http://wiki.apache.org/jakarta-lucene/Lucene2Whiteboard An excerpt: 11. (Hard) Make indexing more flexible, so that one could e.g., not store positions or even frequencies, or alternately, to store extra inf

Free IR testbed

2005-06-06 Thread Andrew Boyd
Can someone point me to a free IR testbed? I was hoping for a testbed that has at least 500k+ documents. I did see TRAC, which looks like a for-pay testbed. Thanks, Andrew

Re: searches and updates concurrency problem

2005-06-06 Thread Daniel Naber
On Monday 06 June 2005 11:11, Stefano Buliani wrote: > My problem is that the index update procedure and the searches could run > simultaneously, and, if they do, they corrupt the index file. Search is a read-only thing, so why should it corrupt the index? Even having several writers at the same

Re: searches and updates concurrency problem

2005-06-06 Thread Aalap Parikh
Hi, As per my understanding of Lucene, I think concurrent search and update to an index should not corrupt the index, given that only a single index-modifying operation is executing at any point of time. So in short, you can have multiple search operations and not more than one index update (add a
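
The rule described in this reply (any number of searches, but only one index-modifying operation at a time) can be enforced in application code by funnelling every update through a single lock. The sketch below assumes all updates happen inside one JVM. SingleWriterGuard is a hypothetical helper, not part of Lucene, and the Runnable stands in for whatever add or delete call the application makes.

```java
import java.util.concurrent.locks.ReentrantLock;

final class SingleWriterGuard {
    private final ReentrantLock writeLock = new ReentrantLock();

    // Runs one index-modifying operation at a time; other callers block until it finishes.
    // Searches are not routed through this guard and are never blocked by it.
    void modify(Runnable indexUpdate) {
        writeLock.lock();
        try {
            indexUpdate.run();
        } finally {
            writeLock.unlock(); // always release, even if the update throws
        }
    }

    // True while some thread is inside modify().
    boolean isModifying() {
        return writeLock.isLocked();
    }

    public static void main(String[] args) {
        SingleWriterGuard guard = new SingleWriterGuard();
        guard.modify(() -> System.out.println("updating index, locked: " + guard.isModifying()));
        System.out.println("after update, locked: " + guard.isModifying());
    }
}
```

This only serializes writers within one process; updates coming from several machines or JVMs would need some other form of coordination.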

Re: RE REQUEST: SPECIFIC HIT

2005-06-06 Thread Paul Elschot
On Monday 06 June 2005 08:40, Karthik N S wrote: > Hi > > Guys. > > Apologies. > > with reference to my last mail dated Mon, 14 Mar 2005 > > http://mail-archives.apache.org/mod_mbox/lucene-java-user/200503.mbox/%3COBE > [EMAIL PROTECTED] > > I would like to again request some Help in the se

Re: FieldCache and Sort

2005-06-06 Thread Yonik Seeley
Why do we keep the lookup array around? The actual field value is needed to sort results from multiple searchers (multisearcher). -Yonik On 6/1/05, John Wang <[EMAIL PROTECTED]> wrote: > Hi: > >In the current Lucene sorting implementation, FieldCache is used to > retrieve 2 arrays, the looku

RE: deleting on a keyword field

2005-06-06 Thread Max Pfingsthorn
Hi! Thanks for all the replies. I do know that the readers should be reopened, but that is not the problem. I try to remove some docs, and add their new versions again to incrementally update the index. After updating the index with the same document twice, I opened the index in Luke. There I s

Re: searches and updates concurrency problem

2005-06-06 Thread Thomas Plümpe
Thanks Stefano, well done. It's not certain that the concurrently running updates and searches do corrupt the index (a full update that removes the files and rebuilds them can obviously do so), but I'm sure they'll get the point and will tell us how things like that are handled. Best, Thomas >

Re: searches and updates concurrency problem

2005-06-06 Thread Maik Schreiber
> My problem is that the index update procedure and the searches could run > simultaneously, and, if they do, they corrupt the index file. > Is there a way to let Lucene handle this concurrency automatically (like > stop the searches till the update is finished)? Lucene does not handle this by its

searches and updates concurrency problem

2005-06-06 Thread Stefano Buliani
Hi everyone, I'm new to Lucene and have just installed it. My problem is that the index update procedure and the searches could run simultaneously, and, if they do, they corrupt the index file. Is there a way to let Lucene handle this concurrency automatically (like stop the searches till the update i

RE REQUEST: SPECIFIC HIT

2005-06-06 Thread Karthik N S
Hi Guys. Apologies. With reference to my last mail dated Mon, 14 Mar 2005 http://mail-archives.apache.org/mod_mbox/lucene-java-user/200503.mbox/[EMAIL PROTECTED] I would like to again request some help with the search concepts. I have indexed documents successfully and they would be Docume