Re: Lucene scoring: coord_q_d factor

2006-12-14 Thread Karl Koch
: java-user@lucene.apache.org Subject: Re: Lucene scoring: coord_q_d factor > Karl Koch wrote: > > If I do not misunderstand that extract, I would say it suggests the > combination of coordination level matching with IDF. I am interested in your > view and in the views of others who read this.

Re: Lucene scoring: coord_q_d factor

2006-12-13 Thread Karl Koch
oord_q_d factor > On Wednesday 13 December 2006 16:42, Karl Koch wrote: > > Do you know about any papers that discuss this? > > Coordination is called co-ordination in the original idf paper by > K. Spärck Jones, A statistical interpretation of term specificity > and

Re: Lucene scoring: coord_q_d factor

2006-12-13 Thread Karl Koch
Do you know about any papers that discuss this? Karl Original Message Date: Wed, 13 Dec 2006 10:31:41 -0500 From: "Yonik Seeley" <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Subject: Re: Lucene scoring: coord_q_d factor > On 12/13/06, Karl Koc

Re: Lucene scoring: coord_q_d factor

2006-12-13 Thread Karl Koch
Subject: Re: Lucene scoring: coord_q_d factor > Karl Koch wrote: > > Is there any other paper that actually shows the benefit of doing > > this particular normalisation with coord_q_d? I am not suggesting > > here that it is not useful, I am just looking for evidence how the

Re: Lucene scoring: coord_q_d factor

2006-12-12 Thread Karl Koch
: java-user@lucene.apache.org Subject: Re: Lucene scoring: coord_q_d factor > Karl Koch wrote: > > The coord(q,d) normalisation is "a score factor based on how many of > > the query terms are found in the specified document." and described > > here: > > >

Lucene scoring: coord_q_d factor

2006-12-12 Thread Karl Koch
Hello group, The coord(q,d) normalisation is "a score factor based on how many of the query terms are found in the specified document." and described here: http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html#formula_coord Does this have a theoretical base? On what b
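The factor asked about here can be illustrated with a small standalone sketch. It assumes the behaviour of Lucene's default Similarity, where coord is the number of query terms found in the document divided by the number of terms in the query. The class name CoordSketch and the helper overlap are hypothetical and are not part of Lucene.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of the Lucene API.
final class CoordSketch {

    // Fraction of query terms that occur in the document
    // (assumed to mirror the default coord(overlap, maxOverlap)).
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Counts how many distinct query terms also appear in the document.
    static int overlap(Set<String> queryTerms, Set<String> docTerms) {
        int matched = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                matched++;
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("lucene", "scoring", "coord"));
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "scoring", "index"));
        int matched = overlap(query, doc);
        // Two of the three query terms match, so the factor is 2/3.
        System.out.println("coord = " + coord(matched, query.size()));
    }
}
```

A document matching every query term gets a factor of 1, so under this assumption coord only ever scales a score down for partial matches.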

Lucene scoring: Term frequency normalisation

2006-12-12 Thread Karl Koch
Hi, I have a question about the current Lucene scoring algorithm. In this scoring algorithm, the term frequency is calculated by using the square root of the number of occurring terms as described in http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html#formula_tf Havi
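The square-root damping described in this message can be shown with a few lines of standalone Java. This is only a sketch of the formula as stated in the post. The class name TfSketch is hypothetical and the code does not call Lucene.

```java
// Hypothetical illustration class; not part of the Lucene API.
final class TfSketch {

    // Term-frequency factor as described in the post: sqrt of the raw count.
    static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    public static void main(String[] args) {
        // The factor grows much more slowly than the raw count.
        for (int freq : new int[] {1, 4, 9, 100}) {
            System.out.println("freq=" + freq + " tf=" + tf(freq));
        }
    }
}
```

The effect is that a term occurring 100 times contributes ten times as much as a term occurring once, rather than a hundred times as much.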

Re: Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-12 Thread Karl Koch
Hello Doron (and all the others who read here):), thank you for your effort and your time. I really appreciate it. :) I understand why normalisation is done in general. Mainly, to normalise the bias of oversized documents. In the literature I have read so far, there is usually a high effort on

Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-11 Thread Karl Koch
Well, it doesn't, since there is no justification of why it is the way it is. It's like saying, here is that car with 5 wheels... enjoy driving. Karl Original Message Date: Sun, 10 Dec 2006 13:12:29 -0800 From: Doron Cohen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Be

RE: Vector Space Model <-> Probabilistic Model

2006-03-16 Thread Karl Koch
It was published by Norbert Fuhr in the IR Summer School Proceedings. I found it via Google by using the small extension ext:pdf :-) that time... http://www.is.informatik.uni-duisburg.de/bib/pdf/ir/Fuhr:00a.pdf In return, you can also do me a favour and email me (personally, if you like since thi

Re: Vector Space Model <-> Probabilistic Model

2006-02-17 Thread Karl Koch
D]> > To: java-user@lucene.apache.org > Subject: Re: Vector Space Model <-> Probabilistic Model > Date: Thu, 16 Feb 2006 14:19:02 -0500 > > You may find some useful reading at: > http://wiki.apache.org/jakarta-lucene/InformationRetrieval > > Karl Koch wrote: >

Vector Space Model <-> Probabilistic Model

2006-02-16 Thread Karl Koch
I am looking for a comparison between the theoretical Vector Space Model and the theoretical Probabilistic Model in Information Retrieval. I know that concrete implementations do differ from that. However, I am looking for papers that compare the performance of both in particular applications. Doe

Re-Opening IndexSearcher

2005-11-20 Thread Karl Koch
Hello, how do I close and open an IndexSearcher object in order to free resources that cause my system to throw an IOException saying "Too many open files" as well as trouble with an index lock file? I have the following code: synchronized public static Hits search(String queryString, String[]

Urgent - File Lock in Lucene 1.2

2005-11-20 Thread Karl Koch
Hello group, I am running Lucene 1.2 and I have the following error message. I got this message when performing a search: Failed to obtain file lock on /tmp/qcop-msg-qpe I am running Lucene 1.2 on a Sharp Zaurus PDA with embedded Linux. When I look through the exceptions I have before that I ca

Re: About searching in multiple fields with one query

2005-11-14 Thread Karl Koch
of the scores from each > field). > > I like the simplicity with Lucene 1.2, and am considering porting the > compound file format back to Lucene 1.2 so it will be more robust. > > Cheers, > > Jian > > On 11/13/05, Karl Koch <[EMAIL PROTECTED]> wrote:

About searching in multiple fields with one query

2005-11-13 Thread Karl Koch
Hello all, I have a question about searching within multiple fields. I have the following code for doing that (searchFields provides two fields in which I want to search): IndexSearcher searcher = new IndexSearcher(indexDirectory); // search over multiple index fields Query query = MultiFieldQuer

Re: About Combining Scores

2005-11-13 Thread Karl Koch
Marius Kirsch <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: Re: About Combining Scores > Date: Sun, 13 Nov 2005 10:10:22 +0100 > > On Sun, Nov 13, 2005 at 12:04:41AM +0100, Karl Koch wrote: > > My aim is to combine these two scores. The Lucenes score i

About Combining Scores

2005-11-12 Thread Karl Koch
Hello Lucene experts, I am working on a perhaps interesting problem. I am using Lucene as an IR engine that allows users to search for documents. Additionally, I use a user model that produces a second score. This second score represents a different aspect of document relevance based on data from a

Lucene 1.2 Score formula

2005-11-12 Thread Karl Koch
Hello experts, sorry for cross-posting but this is really important for me. For documentation purposes I need to know the exact scoring formula that is used by the Lucene 1.2 release. I have found a scoring formula in the Lucene book but this is likely based on the 1.4 release and might have ch

Re: Question about scoring normalisation

2005-11-06 Thread Karl Koch
che Nachricht --- > From: Ira Goldstein <[EMAIL PROTECTED]> > To: "Karl Koch" <[EMAIL PROTECTED]> > Subject: Re: Question about scoring normalisation > Date: Sun, 06 Nov 2005 08:08:59 -0500 > > Karl -- > Hi. I've been thinking about adding a pivoted norm

Re: Scoring formula

2005-11-05 Thread Karl Koch
be between 1 and 0 if the > highest score is greater than 1. > > -Yonik > > > On 11/5/05, Karl Koch <[EMAIL PROTECTED]> wrote: > > Yes, the Similarity class existed in version 1.2, but no description is
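The quoted remark about scores lying between 0 and 1 can be read as the following rule: raw scores are rescaled only when the top score exceeds 1. The sketch below assumes that reading. The class ScoreNormSketch and its method are hypothetical, and the code is not taken from Lucene.

```java
// Hypothetical illustration class; not part of the Lucene API.
final class ScoreNormSketch {

    // Returns a copy of the raw scores, divided by the highest score
    // only if that highest score is greater than 1 (assumed rule).
    static float[] normalise(float[] rawScores) {
        float max = 0.0f;
        for (float score : rawScores) {
            max = Math.max(max, score);
        }
        float norm = (max > 1.0f) ? 1.0f / max : 1.0f;
        float[] result = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            result[i] = rawScores[i] * norm;
        }
        return result;
    }

    public static void main(String[] args) {
        // Top score 4.0 exceeds 1, so every score is divided by 4.
        System.out.println(java.util.Arrays.toString(normalise(new float[] {4.0f, 2.0f, 1.0f})));
        // Top score 0.5 does not exceed 1, so the scores are left unchanged.
        System.out.println(java.util.Arrays.toString(normalise(new float[] {0.5f, 0.25f})));
    }
}
```

Under this rule the normalised values are comparable only within one result list, since the divisor depends on the best hit of that particular query.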

Question about scoring normalisation

2005-11-05 Thread Karl Koch
Hello all, I am wondering how many of you actually work with your own scoring mechanism (overriding Lucene's standard scoring) and how many of you work on how to normalise this score. I would like to add a second score on top of Lucene's TF/IDF score. The resulting score is most likely higher than

Re: Scoring formula

2005-11-05 Thread Karl Koch
> --- Original Message --- > From: Otis Gospodnetic <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: Re: Scoring formula > Date: Fri, 4 Nov 2005 12:12:52 -0800 (PST) > > The formula should also be in the javadoc for Similarity class, if it > was there in 1.2.

Scoring formula

2005-11-04 Thread Karl Koch
Hello group, the scoring formula for Lucene is well explained in "Lucene in Action". However, is this formula also valid for Lucene 1.2 (which I am using)? I need to know that for documentation purposes. If not, where can I find the correct formula since I do not want to interpret it from the code

Re: Blackberry

2005-09-14 Thread Karl Koch
I have to disagree. I run Lucene 1.2 on a Sharp Zaurus PDA with Java 1.1 successfully. It is not the latest version, but basic search is no problem like this. I am not sure if it compiles with Java 1.1 (maybe not) but it certainly runs with it... I am not completely sure what you mean by loading. I

BM25 with Lucene

2005-09-05 Thread Karl Koch
Hello all, has anybody here implemented and run the BM25 algorithm with Lucene (preferably Lucene 1.2, but any information or even code for any Lucene version would be very helpful)? Kind Regards, Karl

Re: Books about Lucene?

2005-08-30 Thread Karl Koch
Hello group, thank you for all your discussion, suggestions and help. I thought I would run some investigations on that source code with Lucene 1.2 and document them. With the help of chen I might be able to create a version that can do the job. Perhaps we can then create some small footprint solution

Lucene in IR Research

2005-08-26 Thread Karl Koch
Hello all, I would like to know about papers that were written using Lucene as the underlying search engine, e.g. Lucene as a baseline search engine and some modifications to compare it with the baseline Lucene system etc. Please provide links to published papers if possible. Kind regards, Karl --

Re: Books about Lucene?

2005-08-26 Thread Karl Koch
this excellent piece of Open Source - almost all of us would spend months to find out what he already knows. Kind Regards, Karl > --- Original Message --- > From: Otis Gospodnetic <[EMAIL PROTECTED]> > To: Erik Hatcher <[EMAIL PROTECTED]>, Karl Koch > <[EMAI

Lucene 1.3 on Java 1.2 ?

2005-08-18 Thread Karl Koch
Does Lucene 1.3 theoretically run on Java 1.2? I have tried and got JIT errors when trying to search an index on the hard disk: --- output from Eclipse Java IDE--- A nonfatal internal JIT (3.10.107(x)) error 'chgTarg: Conditional' has occurred in : 'org/apach

MySimilarity with Lucene 1.2 ?

2005-08-18 Thread Karl Koch
Hello Lucene experts, as you might have seen in my previous postings, I am restricted to Lucene 1.2 or earlier (due to hardware limitations I can only use Java 1.1 or 1.2). I would like to write my own Similarity implementation which, I think, would allow me to insert other algorithms in Lucene wh

Re: Books about Lucene?

2005-08-18 Thread Karl Koch
.org > Subject: Re: Books about Lucene? > Date: Wed, 17 Aug 2005 20:28:09 -0400 > > On Aug 17, 2005, at 2:49 PM, Karl Koch wrote: > > Are there any other books (besides "Lucene in Action") perhaps > > written from a > > different perspective (e.g. differen

Books about Lucene?

2005-08-17 Thread Karl Koch
Are there any other books (besides "Lucene in Action") perhaps written from a different perspective (e.g. different applications or problem areas)? Karl

Applied Lucene: Search functionality on PDAs

2005-08-17 Thread Karl Koch
Hello all, I am developing code for Lucene 1.2 on a Sharp Zaurus using Java 1.1/1.2. (Unfortunately I was not able to run version 1.3 on this setup.) Does anybody know of projects (possibly open source) also concerned with running Lucene on platforms that only allow small-footprint applications?

[Lucene 1.2] Change the Scoring?

2005-08-17 Thread Karl Koch
Hello Lucene experts, I would like to insert my own scoring algorithm in Lucene 1.2 (I need to use this old Lucene version due to hardware limitations (PDA and Java 1.2)). Has anybody done something like that (possibly in the past) and can suggest approaches and perhaps a code example? Workarounds

[Lucene 1.2] Boolean OR on all query terms

2005-08-17 Thread Karl Koch
Hello experts, I have the following code: Query query = QueryParser.parse(queryString, searchFields[0], analyser); Hits hits = searcher.search(query); and the following code for search across multiple fields: Query query = MultiFieldQueryParser.parse(queryString, searchFields, analyser); hits =
