Re: Need Help: Business Scenario to lucene implementation

2011-09-01 Thread Saurabh Gokhale

Hi Grant, Thanks for the reply. I will definitely look into the Solr Deduplication approach. But since I am using pure Lucene and not Solr, I am not sure how feasible it would be to find something equivalent in Lucene or to try duplicating it. But that looks to be the way forward. Also regarding the question a

Re: Need Help: Business Scenario to lucene implementation

2011-09-01 Thread Grant Ingersoll

I'd probably treat this as a deduplication problem and look to use a fuzzy matching approach, such as the TextProfileSignature in Solr/Nutch: http://wiki.apache.org/solr/Deduplication, which I believe is tunable as to its threshold of acceptance. I'd also likely give pushback on the notion of
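
As a rough illustration of threshold-based fuzzy matching in plain Java, here is a minimal sketch that reports closeness as a percentage. It is not TextProfileSignature and it uses no Lucene or Solr API: it swaps in Jaccard similarity over word shingles, which is a different and simpler technique. The class name ShingleSimilarity, its method names, the shingle size, and the threshold parameter are all hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: Jaccard similarity over word shingles.
// This is an assumption-laden sketch, not Solr's TextProfileSignature.
final class ShingleSimilarity {

    // Lowercase the text, split on runs of non-letter/non-digit characters,
    // and collect overlapping n-word sequences ("shingles") into a set.
    static Set<String> shingles(String text, int n) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        Set<String> result = new HashSet<>();
        if (words.isEmpty()) {
            return result;
        }
        if (words.size() < n) {
            // Text shorter than one shingle: treat all of it as a single shingle.
            result.add(String.join(" ", words));
            return result;
        }
        for (int i = 0; i + n <= words.size(); i++) {
            result.add(String.join(" ", words.subList(i, i + n)));
        }
        return result;
    }

    // Jaccard similarity |A intersect B| / |A union B|, scaled to 0..100.
    static double percentMatch(String a, String b, int n) {
        Set<String> sa = shingles(a, n);
        Set<String> sb = shingles(b, n);
        if (sa.isEmpty() && sb.isEmpty()) {
            return 100.0; // two empty texts are treated as identical here
        }
        Set<String> intersection = new HashSet<>(sa);
        intersection.retainAll(sb);
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);
        return 100.0 * intersection.size() / union.size();
    }

    // The acceptance threshold (for example 50.0) is the tunable part.
    static boolean isNearDuplicate(String a, String b, int n, double thresholdPercent) {
        return percentMatch(a, b, n) >= thresholdPercent;
    }
}
```

With two-word shingles, "the quick brown fox" and "the quick brown dog" share two of four distinct shingles, so percentMatch returns 50. Comparing every pair of documents this way is quadratic in the number of documents, so for a large index it would probably only be practical on a shortlist of candidates, for example the top hits of an ordinary Lucene query.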

Re: Need Help: Business Scenario to lucene implementation

2011-08-31 Thread Saurabh Gokhale

Can someone please help with the logic that can be applied to decide on the closeness requirement given below (like 50% matching)? This matching is pure text matching. Since the current Lucene score does not translate into a percentage of closeness, is there anything else that can give this info

3 matches
