What replaces the computeNorm method in DefaultSimilarity in 4.1?

I've always subclassed DefaultSimilarity to resolve an issue whereby a document that has multiple values in a field (because of a one-to-many relationship) scores worse than a document that has just a single value. The computeNorm() method I used to override has gone in 4.1, and I tried to rewrite the method for 4.1 as follows:

public void computeNorm(FieldInvertState state, Norm norm) {
    if (state.getName().equals("alias")) {
        if (state.getLength() >= 3) {
            norm.setFloat(state.getBoost() * 0.578f);
        } else {
            super.computeNorm(state, norm);
        }
    } else {
        super.computeNorm(state, norm);
    }
}



but I found that computeNorm is final there, so what should I do instead?
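
One possible direction, sketched under assumptions: in the 4.x TFIDFSimilarity the final computeNorm appears to delegate to lengthNorm(FieldInvertState), which DefaultSimilarity leaves overridable. The classes below (FieldStateStub, BaseSimilaritySketch and AliasSimilaritySketch are hypothetical names, not Lucene classes) only model that delegation so the example compiles without Lucene on the classpath. The real computeNorm also encodes the value into a single byte, and the real lengthNorm can discount overlapping tokens; both are left out here.

```java
// Stand-in for org.apache.lucene.index.FieldInvertState (hypothetical, for illustration only).
class FieldStateStub {
    final String name;
    final int length;
    final float boost;

    FieldStateStub(String name, int length, float boost) {
        this.name = name;
        this.length = length;
        this.boost = boost;
    }
}

// Stand-in for DefaultSimilarity: a final computeNorm that delegates to an overridable hook.
class BaseSimilaritySketch {
    // Overridable hook, assumed to mirror DefaultSimilarity.lengthNorm in 4.x.
    public float lengthNorm(FieldStateStub state) {
        return state.boost * (float) (1.0 / Math.sqrt(state.length));
    }

    // Final, like the method that could not be overridden; byte encoding is omitted.
    public final float computeNorm(FieldStateStub state) {
        return lengthNorm(state);
    }
}

// The alias rule from the attempt above, moved into the overridable hook.
class AliasSimilaritySketch extends BaseSimilaritySketch {
    @Override
    public float lengthNorm(FieldStateStub state) {
        if (state.name.equals("alias") && state.length >= 3) {
            // Treat any alias field with three or more terms as if it had three terms.
            return state.boost * 0.578f;
        }
        return super.lengthNorm(state);
    }
}

class AliasSimilarityDemo {
    public static void main(String[] args) {
        AliasSimilaritySketch sim = new AliasSimilaritySketch();
        // An alias field with ten terms is capped; other fields use the default calculation.
        System.out.println(sim.computeNorm(new FieldStateStub("alias", 10, 1.0f)));
        System.out.println(sim.computeNorm(new FieldStateStub("name", 4, 1.0f)));
    }
}
```

With Lucene on the classpath, the same idea would presumably become an @Override of lengthNorm(FieldInvertState state) in the DefaultSimilarity subclass, using state.getName(), state.getLength() and state.getBoost() as in the attempt above.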


3.6 Code:

package org.musicbrainz.search.analysis;

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.DefaultSimilarity;

/**
 * Calculates a score for a match, overridden to deal with problems with alias fields in the artist and label indexes.
 */
public class MusicbrainzSimilarity extends DefaultSimilarity
{

    /**
     * Calculates a value which is inversely proportional to the number of terms in the field. When multiple
     * aliases are added to an artist (or label) they are seen as one field, so artists with many aliases can be
     * disadvantaged when the matching alias is radically different from the other aliases.
     *
     * @return score component
     */
    @Override
    public float computeNorm(String field, FieldInvertState state) {
        // This will match both artist and label aliases and is applicable to both; didn't use the constant
        // ArtistIndexField.ALIAS because that would be confusing.
        if (field.equals("alias")) {
            if (state.getLength() >= 3) {
                // Same result as the normal calculation if the field had three terms, the most common scenario
                return state.getBoost() * 0.578f;
            } else {
                return super.computeNorm(field, state);
            }
        } else {
            return super.computeNorm(field, state);
        }
    }


    /**
     * This method calculates a value based on how many times the search term was found in the field. Because
     * we have only short fields, the only real case (apart from rare exceptions like Duran Duran Duran) whereby
     * the term is found more than twice would be when a search term matches multiple aliases. To remove the
     * bias this gives towards artists/labels with many aliases, we limit the value to what would be returned
     * for a term that matched twice.
     *
     * Note: would prefer to do this just for the alias field, but the field is not passed as a parameter.
     *
     * @param freq the frequency of the term within the field
     * @return score component
     */
    @Override
    public float tf(float freq) {
        if (freq > 2.0f) {
            return 1.41f; // Same result as if the term matched twice
        } else {
            return super.tf(freq);
        }
    }
}
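
For reference, the two constants above can be checked with a small standalone sketch. It assumes the default formulas are boost / sqrt(numTerms) for the length norm and sqrt(freq) for tf; NormConstantsSketch and its method names are made up for this illustration and are not part of Lucene.

```java
class NormConstantsSketch {
    // Assumed default length norm: boost * 1 / sqrt(number of terms in the field).
    static float defaultLengthNorm(int numTerms, float boost) {
        return boost * (float) (1.0 / Math.sqrt(numTerms));
    }

    // Assumed default term-frequency factor: sqrt(freq).
    static float defaultTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // The cap used in the tf() override above: never reward more than two matches.
    static float cappedTf(float freq) {
        return freq > 2.0f ? 1.41f : defaultTf(freq);
    }

    public static void main(String[] args) {
        // 1 / sqrt(3) is roughly 0.577, close to the 0.578f constant used for alias fields.
        System.out.println(defaultLengthNorm(3, 1.0f));
        // sqrt(2) is roughly 1.414, close to the 1.41f constant used in tf().
        System.out.println(defaultTf(2.0f));
        // Five matches score the same as two once the cap applies.
        System.out.println(cappedTf(5.0f));
    }
}
```

So 0.578f and 1.41f are rounded stand-ins for 1/sqrt(3) and sqrt(2) respectively, which is what the comments in the class mean by "same result".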

