Re: constructing smaller phrase queries given a multi-word query

2006-10-19 Thread Chris Hostetter
: eg. "rowling goblet of fire" - need to match "rowling" in 1 field &
: "goblet of fire" in another
: "hilary duff most wanted" - need to match "hilary duff" in 1 field &
: "most wanted" in another
: > Why not just index those separate fields into yet a third field and
: > search there?
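The splitting described in the quoted examples can be sketched as follows. This is only an illustration, assuming the query is cut at a single point into two contiguous phrases; the class and method names (`PhraseSplitSketch`, `twoWaySplits`) are hypothetical, and turning each half into a per-field phrase query is left out because the thread does not name the fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists every way to cut a
// multi-word query into a left phrase and a right phrase.
final class PhraseSplitSketch {

    // Returns pairs {left, right}; a one-word or empty query yields no pairs.
    static List<String[]> twoWaySplits(String query) {
        String[] words = query.trim().split("\\s+");
        List<String[]> splits = new ArrayList<>();
        for (int cut = 1; cut < words.length; cut++) {
            String left = String.join(" ", Arrays.copyOfRange(words, 0, cut));
            String right = String.join(" ", Arrays.copyOfRange(words, cut, words.length));
            splits.add(new String[] {left, right});
        }
        return splits;
    }

    public static void main(String[] args) {
        // Each printed pair is one candidate: left phrase for one field,
        // right phrase for the other.
        for (String[] pair : twoWaySplits("rowling goblet of fire")) {
            System.out.println("[" + pair[0] + "] + [" + pair[1] + "]");
        }
    }
}
```

Each candidate pair would then be tried against the two fields; which pair wins is a ranking decision the thread leaves open.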

Re: constructing smaller phrase queries given a multi-word query

2006-10-19 Thread Mekin Maheshwari
On 10/19/06, Erick Erickson <[EMAIL PROTECTED]> wrote: What is the use case you're trying to solve? It doesn't make sense to me that you want to take a query from a user and split it over fields under the covers. Well I am planning on doing exactly that, given that we have seen some amount of u

Re: Lucene document length

2006-10-19 Thread Daniel Noll
Mª Paz Belmonte López wrote: Hello, I'm working with Lucene. I need to get the document length. I have looked at the documentation and I can't find anything. What exactly do you mean by the length of a "document"? A document has a number of fields, and you could get the length of a field if the

Re: Don't use the same index for updating and searching

2006-10-19 Thread Chris Hostetter
This doesn't really sound right ... were you by any chance using NFS (or some other network storage mechanism) in the original implementation? : So, this turned out not to be stable! After a while (a day or so, or two), : any index would get corrupted, because a segment would disappear. : : I c

RE: implementing our own Scorer (BM25)

2006-10-19 Thread J.Zhu
Hi, we are having a discussion in java-dev@lucene.apache.org about implementing probabilistic language modelling approaches such as BM25 in Lucene. Hope you can join us there. Jianhan -----Original Message----- From: beatriz ramos [mailto:[EMAIL PROTECTED] Sent: 19 October 2006 16:36 To: java-u

Re: implementing our own Scorer (BM25)

2006-10-19 Thread beatriz ramos
Excuse me, I don't want to write a very long email. This is the BM25 Scorer formula: log((N-f+0.5)/(f+0.5)) · (k1 + 1) · c / (c+k1·( (1-b)+b·l/L)) where N = total number of documents f = inverse frequency (number of documents which contain the term)
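The formula in this message can be checked in isolation with a few lines of plain Java. This is a minimal sketch, not Lucene code and not the poster's implementation. The preview cuts off before c, l, L, k1 and b are defined, so the usual BM25 readings are assumed here: c is the term's frequency in the document, l the document length, L the average document length, and k1 and b the tuning constants. The class and method names are hypothetical.

```java
// Hypothetical standalone helper; it does not use or extend any Lucene class.
final class Bm25Sketch {

    // Per-term score: log((N-f+0.5)/(f+0.5)) * (k1+1) * c / (c + k1*((1-b) + b*l/L))
    // N = total number of documents, f = number of documents containing the term.
    // Assumed meanings (not stated in the truncated message):
    // c = term frequency in the document, l = document length,
    // L = average document length, k1 and b = tuning constants.
    static double termScore(long N, long f, double c, double l, double L,
                            double k1, double b) {
        double idf = Math.log((N - f + 0.5) / (f + 0.5));
        double lengthNorm = (1 - b) + b * l / L;
        return idf * (k1 + 1) * c / (c + k1 * lengthNorm);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration only.
        double score = termScore(1000, 10, 3.0, 120.0, 100.0, 1.2, 0.75);
        System.out.println("BM25 term score: " + score);
    }
}
```

A document's score for a multi-term query would be the sum of these per-term scores. Note that the logarithm goes negative when a term appears in more than half of the documents, which is a property of this form of the formula.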

Lucene document length

2006-10-19 Thread Mª Paz Belmonte López
Hello, I'm working with Lucene. I need to get the document length. I have looked at the documentation and I can't find anything. Do you have any idea? Thanks.

Don't use the same index for updating and searching

2006-10-19 Thread Hes Siemelink
Hi, I posted a while ago about sudden FileNotFoundExceptions. In a nutshell: my Lucene index went corrupt after a couple of days under heavy load on a Linux server with missing segment files. The problem kept occurring, but I haven't found the cause. I couldn't reproduce it with a simulated load o

Re: implementing our own Scorer (BM25)

2006-10-19 Thread Grant Ingersoll
Please provide more information about what you have done so far. On Oct 19, 2006, at 9:10 AM, beatriz ramos wrote: Hello, I'm trying to implement my own scoring algorithm with Lucene but I don't get any results. Lucene documentation explains how to implement new scoring, modifying Query,

RE: Preventing merging by IndexWriter

2006-10-19 Thread Johan Stuyts
> I just searched for 'faceted' on the e-mails I've seen since > I subscribed to > the list, and there are certainly discussions out there... I did already, but... > This thread might be particularly useful, started 15-May-2006 > *Aggregating category hits it seems I missed this one. Thanks.

implementing our own Scorer (BM25)

2006-10-19 Thread beatriz ramos
Hello, I'm trying to implement my own scoring algorithm with Lucene but I don't get any results. Lucene documentation explains how to implement new scoring, modifying Query, Weight and Scorer classes. I have tried this but it doesn't work. Do you have any idea? I need some example to understand the

Re: termpositions at index time...

2006-10-19 Thread Erick Erickson
Thanks. That's very similar to what we're doing, and I'd love to see some technical details too... Erick On 10/19/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: On Oct 18, 2006, at 4:50 PM, Erick Erickson wrote: > We're indexing books. I need to > a> return books ordered by relevancy > b> for an

Re: constructing smaller phrase queries given a multi-word query

2006-10-19 Thread Erick Erickson
What is the use case you're trying to solve? It doesn't make sense to me that you want to take a query from a user and split it over fields under the covers. Why not just index those separate fields into yet a third field and search there? Or why not just put it all into one field in the fir

Re: farsi parser

2006-10-19 Thread Pierrick Brihaye
Hi, pc123 wrote: Do we have a farsi parser available? If "arabic" parser is available, how can i customize it to farsi/persian. If you are talking about Aramorph for Java (http://www.nongnu.org/aramorph/english/index.html), you are on the wrong track. Aramorph is a *morphological* analyz

Delivery Status Notification (Failure)

2006-10-19 Thread postmaster
This is an automatically generated Delivery Status Notification. Delivery to the following recipients failed. java-user@lucene.apache.org Reporting-MTA: dns;av.mimer.no Received-From-MTA: dns;av.mimer.no Arrival-Date: Tue, 17 Oct 2006 06:52:24 +0200 Final-Recipient: rfc822;java-user@lu

farsi analyser

2006-10-19 Thread pc123
sorry i meant farsi analyser instead of farsi parser.

farsi parser

2006-10-19 Thread pc123
Do we have a farsi parser available? If "arabic" parser is available, how can i customize it to farsi/persian.

RE: NativeFSLockFactory problem

2006-10-19 Thread Frank Kunemann
Hi Mike, no problem. Just good to know it's not my fault this time... ;) Regards, Frank -----Original Message----- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: Thursday, October 19, 2006 12:03 PM To: java-user@lucene.apache.org Subject: Re: NativeFSLockFactory problem Frank Kuneman

Re: termpositions at index time...

2006-10-19 Thread Erik Hatcher
On Oct 18, 2006, at 4:50 PM, Erick Erickson wrote: We're indexing books. I need to a> return books ordered by relevancy b> for any single book, return the number of hits in each chapter (which, of course, may be many pages). I think your application deserves a good look at XTF:

Re: NativeFSLockFactory problem

2006-10-19 Thread Michael McCandless
Frank Kunemann wrote: Hi all, I'm trying to use the new class NativeFSLockFactory, but as you can guess I have a problem using it. Don't know what I'm doing wrong, so here is the code: There is a serious bug with NativeFSLockFactory as it now stands -- it's precisely the issue you've come ac