On Fri, Apr 24, 2009 at 12:28 AM, Steven Bethard <beth...@stanford.edu> wrote:
> On 4/23/2009 2:08 PM, Marcus Herou wrote:
> > But perhaps one could use a FieldCache somehow ?
>
> Some code snippets that may help. I add the PageRank value as a field of
> the documents I index with Lucene like this:
>
>     Document document = new Document();
>     double pageRank = this.pageRanks.getCount(article.getId());
>     document.add(new Field(
>         PAGE_RANK_FIELD_NAME, Float.toString((float)pageRank),
>         Field.Store.YES, Field.Index.NOT_ANALYZED));

Note that there's no need to store this field - it is the indexed value
which is being used.

Also, note an additional approach: page-ranks could be maintained
externally, conceptually an array: float[] pageRank, where pageRank[docid]
is the PR of that doc. This has the challenge of matching with index docids
and so will not work well in a dynamic env where docs are deleted and hence
docids are changed. But, if your setting is static in terms of docids, this
would allow you to update the PRs without re-indexing the entire collection.

To take this path, extend ValueSource over this array, and construct a
ValueSourceQuery over that value source. This ValueSourceQuery will now be
your pageRankQuery, passed to CustomScoreQuery.

Doron
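
P.S. A minimal sketch of the external-array idea, in plain Java with no Lucene classes, just to show the shape of the data structure. The class name ExternalPageRanks and its methods are hypothetical; in a real setup the lookup would sit behind your ValueSource subclass. Combining the two scores by multiplication is an assumption made here for illustration.

```java
/** Plain-Java model of an external PageRank table keyed by docid. */
final class ExternalPageRanks {
    // pageRank[docid] is the PR of that doc; only valid while docids stay fixed.
    private final float[] pageRank;

    ExternalPageRanks(int maxDoc) {
        this.pageRank = new float[maxDoc];
    }

    /** Updates one PR value in place; the index itself is not touched. */
    void set(int docid, float value) {
        pageRank[docid] = value;
    }

    /** Per-document lookup, the role a value source would play. */
    float floatVal(int docid) {
        return pageRank[docid];
    }

    /** Assumed combination: text relevance score multiplied by PR. */
    static float customScore(float subQueryScore, float pr) {
        return subQueryScore * pr;
    }
}
```

Calling set(docid, newValue) changes the score of that doc on the next search without any re-indexing, which is the whole point of keeping the array outside the index.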