Indeed - you bring up interesting questions. You may want to take a look at NUTCH first, however - I am not sure if they have done some of the Google-like ranking you mention.
However - collaborative relevance enhancement, based on user feedback, would be a nice Web-2.0-ish feature to bake into the Lucene core :-) Dejan _____ From: sachin [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 23, 2006 5:31 AM To: java-user@lucene.apache.org Subject: Scoring Technique based on Relevance Feeback & other Parameters Hello Great/smart guys This is my first question for this group as I started working on the Lucene last month. Lucene provide the scoring of documents based on TF-IDF vector analysis. Lucene also provides the Scorer and Weight inside the Search package. By implementing new type of tuple (Query,Weight,Scorer) I can easily implement new Scoring technique. Unfortunatly Lucene index shows that it stores only TF / Position vectors for each term within document. I am interested in investigating new scoring technique where I will use some other parameters relating to the Term to rank the documents. For an example web page ranking is assisted by parameters like number of links towards webpage and number of link from web - page. It indicates that we need to store relatively more information about terms within the index. But HoW ? . I need to investigate Another parameter is relevance feedback from the User. Ranking should get affected by relevance feedback from the user. Would someone interested in helping out or thinking about the same problem.