[ https://issues.apache.org/jira/browse/SOLR-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012809#comment-13012809 ]
Robert Muir commented on SOLR-2338:
-----------------------------------
{quote}
i don't personally think it would be confusing, but i also don't think we need
to advertise it in the example.
we should definitely encourage using similarity per field type, but for people
who have used it in the past, having it continue to work as a "global default"
when fieldTypes don't define a similarity gives us nice back-compatibility.
{quote}
I agree here: this is a good compromise, and as long as we don't advertise it
in the example, I have no concerns about the example being confusing.
{quote}
More generally though, i'm thinking that the same <similarity/> tag can be used
for both the old style (global default) Similarity/SimilarityFactory and the
new SimilarityProviderFactory using instanceof checks...
{quote}
I have to disagree on this one. The new SimilarityProvider serves a totally
different purpose. It's not a global sim: it answers requests for sims for
specific fields. The only reason I provided a factory for it is so that users
can tune the parts of Lucene's relevance ranking system that are not per-field:
coord() and queryNorm(). But it's not a way to configure tf() or idf() or
anything like that. In the patch I added this to the example, marked "expert",
though we could remove it from the example entirely if it's too expert (it
might be).
So I think we should do as you suggest and allow a global <similarity/> that is
the default term weighting unless otherwise specified by a field, but we
shouldn't confuse this with the parts that aren't field-specific...
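To make the split concrete, here is a minimal sketch of the lookup I have in
mind. This is not code from the patch and it does not use Lucene's or Solr's
classes: FieldSimilarity and SimilarityResolver are hypothetical names invented
for illustration, and the coord() body is only the usual overlap ratio, used as
a placeholder for "something that is not per-field".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a per-field term-weighting model (not Lucene's Similarity). */
interface FieldSimilarity {
    String name();
}

/** Hypothetical resolver: a fieldType's own similarity wins, else the global default. */
final class SimilarityResolver {
    private final Map<String, FieldSimilarity> byFieldType = new HashMap<>();
    private final FieldSimilarity globalDefault;

    SimilarityResolver(FieldSimilarity globalDefault) {
        this.globalDefault = globalDefault;
    }

    /** Records the similarity declared on a fieldType. */
    void register(String fieldType, FieldSimilarity sim) {
        byFieldType.put(fieldType, sim);
    }

    /** Per-field part: falls back to the global default when the fieldType declares nothing. */
    FieldSimilarity get(String fieldType) {
        return byFieldType.getOrDefault(fieldType, globalDefault);
    }

    /** Not per-field: lives on the resolver itself (placeholder formula, an assumption). */
    float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }
}
```

With this shape, a fieldType such as "short_text" that registers its own
similarity gets it back from get(), every other fieldType gets the global
default, and coord() is the same regardless of which field is being scored.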
{quote}
The one other thing i just noticed is that you have
SimilarityProviderFactory.init(SolrParams)
{quote}
I configured it this way because this is how <similarity/> worked before (and
it was just enough XML to not scare me away). Is it possible to defer this
improvement to a later issue? I think we should give this a little more
thought. For example, if we do this, it's a break in the API for
<similarityFactory/>, which this patch does not actually do: it only MOVES it
to the fieldType.
> improved per-field similarity integration into schema.xml
> ---------------------------------------------------------
>
> Key: SOLR-2338
> URL: https://issues.apache.org/jira/browse/SOLR-2338
> Project: Solr
> Issue Type: Improvement
> Components: Schema and Analysis
> Affects Versions: 4.0
> Reporter: Robert Muir
> Assignee: Robert Muir
> Attachments: SOLR-2338.patch, SOLR-2338.patch
>
>
> Currently since LUCENE-2236, we can enable Similarity per-field, but in
> schema.xml there is only a 'global' factory
> for the SimilarityProvider.
> In my opinion this is too low-level because to customize Similarity on a
> per-field basis, you have to set your own
> CustomSimilarityProvider with <similarity class=.../> and manage the
> per-field mapping yourself in Java code.
> Instead I think it would be better if you just specify the Similarity in the
> FieldType, like after <analyzer>.
> As far as the example, one idea from LUCENE-1360 was to make a "short_text"
> or "metadata_text" used by the
> various metadata fields in the example that has better norm quantization for
> its shortness...