Hi all,

I have two questions about Lucene ranking.

1) Does anyone know how the posting lists (term -> doc1 doc2 doc3) in the index are sorted? Are the documents (doc1 doc2 doc3) ordered by a TFxIDF value, by the boost value, or by neither? Does Lucene compute the ranking score for every document in the posting lists, or only for part of them?

2) Does anyone know how to add more ranking features (e.g. PageRank, BM25) to Lucene's ranking function? Extending Lucene's DefaultSimilarity class is not enough to achieve this, since it is only designed for changing the TFxIDF function.

Thanks in advance.
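P.S. To make question 2 more concrete, here is a rough sketch of the kind of combined score I have in mind. It is plain Python over a toy in-memory index, not Lucene code. The names `postings`, `static_score`, `tf_idf` and `rank`, the collection size, and the linear interpolation with `weight` are all assumptions made for illustration, and the TFxIDF formula is a textbook variant rather than Lucene's exact one.

```python
import math

# Hypothetical toy index: term -> list of (doc_id, term_frequency).
postings = {
    "lucene": [(1, 3), (2, 1), (3, 2)],
}

# Hypothetical query-independent feature per document (e.g. a PageRank-like value).
static_score = {1: 0.10, 2: 0.70, 3: 0.20}

NUM_DOCS = 10  # assumed collection size


def tf_idf(tf, doc_freq, num_docs=NUM_DOCS):
    """Textbook TFxIDF variant; not Lucene's exact formula."""
    return math.sqrt(tf) * (1.0 + math.log(num_docs / (doc_freq + 1)))


def rank(term, weight=0.5):
    """Score every document in the term's posting list, then sort by combined score.

    weight = 0.0 uses only the text score, weight = 1.0 only the static feature.
    """
    plist = postings.get(term, [])
    doc_freq = len(plist)
    scored = []
    for doc_id, tf in plist:
        text = tf_idf(tf, doc_freq)
        combined = (1.0 - weight) * text + weight * static_score.get(doc_id, 0.0)
        scored.append((doc_id, combined))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With `weight=0.0` the documents come back in term-frequency order, and with `weight=1.0` in order of the static feature. What I would like to know is where a combination like this can be plugged into Lucene's own scoring.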
--
Miguel Costa
http://xldb.fc.ul.pt/~mcosta/
FCCN - Fundação para a Computação Científica Nacional
Av. do Brasil, n.º 101, 1700-066 Lisboa
Tel.: +351 21 8440190
Fax: +351 218472167
www.fccn.pt