In Lucene 3.6 I had code that replicated a DisMax query, and in some cases the search used fuzzy queries to match values. However, the score attributed to matches on fuzzy searches was completely different from the score attributed to matches on exact searches, so the total score returned was not good. I improved this by extending TopTermsRewrite so that, if the query is a prefix query, we boost it as if it were an exact match. I don't fully understand why, but it improved things somewhat. In Lucene 4.1, however, the rewrite() and addClause() methods are final.

So how can I implement this in Lucene 4.1? Do I even need to, or is there a more intuitive way to improve the scoring?

This is what I currently have; it won't compile because of the final methods:

    //TODO FIXME WAS Overriding methods that are now final
    public static class MultiTermUseIdfOfSearchTerm<Q extends DisjunctionMaxQuery> extends TopTermsRewrite<Query> {

    //public static final class MultiTermUseIdfOfSearchTerm extends TopTermsRewrite<BooleanQuery> {
        private final TFIDFSimilarity similarity;

        public MultiTermUseIdfOfSearchTerm(int size) {
            super(size);
            this.similarity = new DefaultSimilarity();

        }

        @Override
        protected int getMaxSize() {
            return BooleanQuery.getMaxClauseCount();
        }

        @Override
        protected DisjunctionMaxQuery getTopLevelQuery() {
            return new DisjunctionMaxQuery(0.1f);
        }

        @Override
        protected void addClause(Query topLevel, Term term, float boost) {
            final Query tq = new ConstantScoreQuery(new TermQuery(term));
            tq.setBoost(boost);
            ((DisjunctionMaxQuery)topLevel).add(tq);
        }

        protected float getQueryBoost(final IndexReader reader, final MultiTermQuery query)
                throws IOException {
            float idf = 1f;
            float df;
            if (query instanceof PrefixQuery)
            {
                PrefixQuery fq = (PrefixQuery) query;
                df = reader.docFreq(fq.getPrefix());
                if(df>=1)
                {
                    //Same as idf value for search term, 0.5 acts as length norm
                    idf = (float)Math.pow(similarity.idf((int) df, reader.numDocs()),2) * 0.5f;
                }
            }
            return idf;
        }

        @Override
        public Query rewrite(final IndexReader reader, final MultiTermQuery query) throws IOException {
            DisjunctionMaxQuery bq = (DisjunctionMaxQuery)super.rewrite(reader, query);

            float idfBoost = getQueryBoost(reader, query);
            Iterator<Query> iterator = bq.iterator();
            while(iterator.hasNext())
            {
                Query next = iterator.next();
                next.setBoost(next.getBoost() * idfBoost);
            }
            return bq;
        }

    }
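
To make the boost in getQueryBoost() concrete, here is a minimal standalone sketch of the arithmetic. It assumes DefaultSimilarity uses the classic idf formula, 1 + ln(numDocs / (docFreq + 1)). The class name IdfBoostSketch, the method names, and the example counts are made up for illustration and are not part of Lucene.

```java
// Standalone sketch: reproduces the boost arithmetic without a Lucene dependency.
class IdfBoostSketch {

    // Assumed to mirror DefaultSimilarity.idf(): 1 + ln(numDocs / (docFreq + 1)).
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Same shape as getQueryBoost(): idf squared, with 0.5 standing in for the
    // length norm. Falls back to 1 when the prefix itself is not an indexed term.
    static float prefixBoost(long docFreq, long numDocs) {
        if (docFreq < 1) {
            return 1f;
        }
        return (float) Math.pow(idf(docFreq, numDocs), 2) * 0.5f;
    }

    public static void main(String[] args) {
        // Hypothetical index: 1000 documents, prefix term found in 9 of them.
        System.out.println("idf   = " + idf(9, 1000));
        System.out.println("boost = " + prefixBoost(9, 1000));
        // Prefix that does not exist as a term: the boost stays neutral.
        System.out.println("boost = " + prefixBoost(0, 1000));
    }
}
```

With these made-up counts the boost comes out at roughly 15.7, so every expanded term is scaled by about that factor when the prefix itself exists as a term in the index.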
