Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread markharw00d
>>So I'm afraid I can't use the technique you recommend. ah right - so the TermVector you use from the index will return mixed and lower case versions of the same text. One point to note - this would mean that of the 25 or so top terms selected by MoreLikeThis for querying there is a reasonable

RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread Jong Kim
comparison in MoreLikeThis class in Lucene's contrib/queries project >>the case matters only for those words that should be included. Jong, just want to check we're on the same page - you do know MoreLikeThis has a kind of automatic Stop-Wording built in, yes? MoreLikeThis looks at

Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread markharw00d
>>the case matters only for those words that should be included. Jong, just want to check we're on the same page - you do know MoreLikeThis has a kind of automatic Stop-Wording built in, yes? MoreLikeThis looks at the document frequency of all terms in the "this" text you provide and only sele

Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread Erick Erickson
----- Original Message ----- From: Jong Kim <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, 9 July, 2007 3:55:03 PM Subject: RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project >>Or are you saying that you have deliberately chosen t

RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread Jong Kim
the useful class even more useful. /Jong -----Original Message----- From: mark harwood [mailto:[EMAIL PROTECTED] Sent: Monday, July 09, 2007 11:54 AM To: java-user@lucene.apache.org Subject: Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project OK. I can see the

Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread mark harwood
ying different analyzer is no good for my case. /Jong -----Original Message----- From: mark harwood [mailto:[EMAIL PROTECTED] Sent: Monday, July 09, 2007 5:01 AM To: java-user@lucene.apache.org Subject: Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project >>

RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread Jong Kim
supply stop words in a case-insensitive fashion? ----- Original Message ----- From: Jong Kim <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, 9 July, 2007 3:00:05 PM Subject: RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project My applicat

Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread mark harwood
case-insensitive fashion? ----- Original Message ----- From: Jong Kim <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, 9 July, 2007 3:00:05 PM Subject: RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project My application stores term vecto

RE: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread Jong Kim
- From: mark harwood [mailto:[EMAIL PROTECTED] Sent: Monday, July 09, 2007 5:01 AM To: java-user@lucene.apache.org Subject: Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project >>I need this comparison to be case-insensitive The choice of case-sensi

Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-09 Thread mark harwood
>>I need this comparison to be case-insensitive The choice of case-sensitivity (and preservation of punctuation, numbers etc etc) is controlled by your choice of analyzer that you pass to MoreLikeThis. If you want to ensure your list of stop words adheres to the same logic - use the same analyz
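The point above (apply the same normalization to the stop-word list as to the terms being compared) can be sketched without any Lucene dependency. This is only an illustration: the class name CaseInsensitiveStopWords and its methods are hypothetical and not part of MoreLikeThis, and lower-casing here stands in for whatever normalization your chosen analyzer actually performs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a Lucene class: keeps stop words in one
// normalized form so that lookups ignore case.
public class CaseInsensitiveStopWords {
    private final Set<String> normalized = new HashSet<>();

    public CaseInsensitiveStopWords(Iterable<String> stopWords) {
        for (String word : stopWords) {
            normalized.add(normalize(word));
        }
    }

    // Assumption: lower-casing with a fixed locale is the only normalization
    // needed. A real analyzer may also strip punctuation, stem, and so on.
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // The candidate term is normalized the same way as the stored stop words.
    public boolean isStopWord(String term) {
        return normalized.contains(normalize(term));
    }

    public static void main(String[] args) {
        CaseInsensitiveStopWords stops =
                new CaseInsensitiveStopWords(Arrays.asList("The", "AND"));
        System.out.println(stops.isStopWord("the"));
        System.out.println(stops.isStopWord("And"));
        System.out.println(stops.isStopWord("Lucene"));
    }
}
```

Normalizing once at construction time keeps each lookup a plain hash-set probe, and mixed-case terms coming back from a term vector are handled by the same normalize() call.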

Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

2007-07-08 Thread Chris Hostetter
: I need this comparison to be case-insensitive, but I don't see any way of achieving it by extending this class. I would have created a subclass of MoreLikeThis and overridden the isNoiseWord() method. However, the problem is that neither the isNoiseWord() method nor the instance variables refer