RE: Indexing In Lucene

2006-10-03 Thread W.H. van Atteveldt
I don't know what you're doing, but the To: header is empty in your email, which is really annoying (since I rely on the To: header to sort my mail). > -Original Message- > From: Ajani, Akil (Cognizant) [mailto:[EMAIL PROTECTED] > Sent: Tuesday, 3 October 2006 10:47 > Subject: Indexing In Lucene

wildcard in phrase query: problem with idf / scoring; QueryParser; MultiPhraseQuery

2006-07-03 Thread W.H. van Atteveldt
Dear List, I am using Lucene to count the number of hits of queries in documents (i.e., taking raw frequencies as scores), which seems to work fairly well using a modified Similarity, returning freq for tf and 1.0 for everything else, and a HitCollector to collect the hits. I also want to allow 'pref
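
The scoring scheme described in this message (tf returns the raw frequency, every other factor returns 1.0) can be illustrated with a minimal sketch. It is plain Python, not Lucene's Java API: `RawCountSimilarity`, `score`, and `collect_hits` are hypothetical names chosen for illustration, and whitespace tokenization with lowercasing is an assumption.

```python
from collections import Counter


class RawCountSimilarity:
    """Hypothetical stand-in for a modified Similarity (not Lucene's class)."""

    def tf(self, freq):
        # Raw term frequency instead of a dampened value such as sqrt(freq).
        return float(freq)

    def idf(self, doc_freq, num_docs):
        # Neutralized: rare and common terms weigh the same.
        return 1.0

    def length_norm(self, num_tokens):
        # Neutralized: long and short documents weigh the same.
        return 1.0


def score(query_terms, doc_tokens, sim):
    """Sum the per-term contributions; with this similarity it is a hit count."""
    counts = Counter(doc_tokens)
    total = 0.0
    for term in query_terms:
        freq = counts[term]
        if freq:
            # idf arguments are placeholders because idf ignores them here.
            total += sim.tf(freq) * sim.idf(1, 1) * sim.length_norm(len(doc_tokens))
    return total


def collect_hits(query_terms, docs, sim):
    """Collect every matching document with its score, unranked and untruncated."""
    hits = {}
    for doc_id, text in docs.items():
        s = score(query_terms, text.lower().split(), sim)
        if s > 0:
            hits[doc_id] = s
    return hits


docs = {1: "the man saw the other man", 2: "a dog barked"}
hits = collect_hits(["man", "the"], docs, RawCountSimilarity())
print(hits)  # {1: 4.0}
```

Because every factor other than tf is fixed at 1.0, the score of document 1 is simply the number of occurrences of the query terms (two for "man" plus two for "the").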

RE: Scoring purely on term frequencies

2006-06-27 Thread W.H. van Atteveldt
ifferent terms > : (I'm > : > referring to a real multi-term query, not a phrase as you mentioned - > : > "the man" - which should work). > : > The problem I see is that you lose the ability to use boosts (I > : > assume this is fine by you). > : > &

RE: Scoring purely on term frequencies

2006-06-27 Thread W.H. van Atteveldt
ior of frequencies (tf implementation as sqrt). > The framework makes these smarter adjustments possible; it does not > mean you need them in your case. > > Ziv > > > > -Original Message- > From: W.H. van Atteveldt [mailto:[EMAIL PROTECTED] > Sent: Saturday,

Scoring purely on term frequencies

2006-05-19 Thread W.H. van Atteveldt
Dear list, I am interested in using Lucene for analyzing documents based on occurrence of certain keywords. As such, I am not interested in the 'top' or 'best' documents, but I do want to know exactly how many words in the query matched. Thus, instead of the complicated formula used by default, I