[
https://issues.apache.org/jira/browse/LUCENE-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222055#comment-13222055
]
Neil Hooey commented on LUCENE-3749:
------------------------------------
This change breaks per-field similarity configuration in Solr, specifically
as of this commit:
{code}
commit 5d371928263d8d78d0e52781340ae95506bd9bf6
Author: Robert Muir <[email protected]>
Date:   Mon Feb 6 12:48:01 2012 +0000

    LUCENE-3749: replace SimilarityProvider with PerFieldSimilarityWrapper

    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1241001 13f79535-47bb-0310-9956-ffa450edef68
{code}
I have the following configuration in my schema.xml:
{code}
<fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField">
  <analyzer>
    <tokenizer class="com.foo.lucene.analysis.core.PayloadTermTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <similarity class="com.foo.lucene.search.PayloadSimilarity"/>
</fieldtype>
{code}
But when I build against and run a version of Solr that includes the commit
mentioned above, my similarity class is no longer invoked. I confirmed this by
adding print statements to the scorePayload(), tf() and idf() methods: they
print with a build from before that commit and do not print with a build that
includes it.
It seems this is intentional, based on Robert Muir's comments, but how can you
get per-field similarity to work in Solr with this new code?
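
One possible direction, given only as an unverified sketch: Solr 4.x releases
provide a solr.SchemaSimilarityFactory that wraps the similarities declared on
individual field types in a PerFieldSimilarityWrapper. Whether that factory is
available in a build at this exact revision is an assumption, and the placement
shown (a top-level <similarity> element directly under <schema>, outside any
<fieldtype>) follows the released 4.x schema format rather than anything
confirmed in this issue.
{code}
<schema name="example" version="1.5">
  <types>
    <!-- field types, including "payloads" with its own <similarity>, go here -->
  </types>

  <!-- Global similarity: assumed to delegate to each field type's
       <similarity>, falling back to the default when none is declared. -->
  <similarity class="solr.SchemaSimilarityFactory"/>
</schema>
{code}
If this applies, the <similarity> element inside the "payloads" field type would
stay exactly as it is; only the global declaration is new.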
> Similarity.java javadocs and simplifications for 4.0
> ----------------------------------------------------
>
> Key: LUCENE-3749
> URL: https://issues.apache.org/jira/browse/LUCENE-3749
> Project: Lucene - Java
> Issue Type: Task
> Affects Versions: 4.0
> Reporter: Robert Muir
> Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3749.patch, LUCENE-3749_part2.patch
>
>
> As part of adding additional scoring systems to Lucene, we made a
> lower-level Similarity, and the existing stuff became e.g. TFIDFSimilarity,
> which extends it.
> However, I always feel bad about the complexity introduced here (though I do
> feel there are some "excuses", that it's a difficult challenge).
> In order to try to mitigate this, we also exposed an easier API
> (SimilarityBase) on top of it that makes some assumptions (and trades off
> some performance) to try to provide something consumable for e.g.
> experiments.
> Still, we can clean up a few things with the low-level API: fix outdated
> documentation and shoot for better/clearer naming etc.