[
https://issues.apache.org/jira/browse/SOLR-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SOLR-17726:
----------------------------------
Labels: morelikethis pull-request-available (was: morelikethis)
> CloudMLTQParser fails to use copyFields due to RealTime Get
> -----------------------------------------------------------
>
> Key: SOLR-17726
> URL: https://issues.apache.org/jira/browse/SOLR-17726
> Project: Solr
> Issue Type: Bug
> Components: MoreLikeThis
> Affects Versions: 9.8.1
> Reporter: ilariapet
> Priority: Major
> Labels: morelikethis, pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When using CloudMLTQParser (the default MLT parser in SolrCloud), fields that
> are populated exclusively via copyField are not taken into account when
> constructing the MoreLikeThis query.
> This happens because CloudMLTQParser relies on a RealTime Get (`/get`)
> request to retrieve the source document by ID, and the document returned by
> RealTime Get does not include fields generated via copyField (i.e. fields that
> are not part of the original SolrInputDocument).
> As a result, even if the copyField target is stored and has proper
> termVectors configured, CloudMLTQParser skips the field silently, and the MLT
> query ends up empty.
>
> This behavior differs from SimpleMLTQParser (used in Solr standalone), which
> does not rely on RealTime Get but instead extracts the stored field content
> and re-applies the analysis chain dynamically.
>
> *STEPS TO REPRODUCE*
> 1. Define these fields in the schema.xml:
> {code:xml}
> <field name="description" type="text_general" indexed="true" stored="true"/>
> <field name="descriptionMLT" type="text_general_mlt" indexed="true"
> stored="true" termVectors="true"/>
> <copyField source="description" dest="descriptionMLT"/> {code}
> 2. Index a document that sets only the {{description}} field. The
> {{descriptionMLT}} field is expected to be populated automatically via the
> configured copyField directive.
> 3. Run an MLT query:
> {code:java}
> /select?q={!mlt qf=descriptionMLT}doc_id {code}
> 4. The resulting parsed query will be empty:
> {code:java}
> "parsedquery": "+() -documentId:32000"{code}
> If the same document is reindexed with {{descriptionMLT}} set explicitly,
> the MLT query works.
>
>
>
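The failure mode and the reindexing workaround described above can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions: plain maps stand in for Solr documents, and the names CopyFieldWorkaround, COPY_RULES, withCopyTargets and usableMltFields are hypothetical, not part of Solr or SolrJ. It is not the CloudMLTQParser implementation, only a simplified picture of "a qf field that is missing from the fetched document contributes nothing to the query".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: plain maps stand in for Solr documents; none of this is Solr API.
class CopyFieldWorkaround {

    // Assumed client-side mirror of the schema's copyField rule (source -> dest).
    static final Map<String, String> COPY_RULES = Map.of("description", "descriptionMLT");

    // Returns a copy of the document with every copyField destination set
    // explicitly, so the destination is part of the document that gets indexed.
    static Map<String, Object> withCopyTargets(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>(doc);
        for (Map.Entry<String, String> rule : COPY_RULES.entrySet()) {
            Object value = doc.get(rule.getKey());
            if (value != null) {
                out.putIfAbsent(rule.getValue(), value);
            }
        }
        return out;
    }

    // Simplified model of the reported behavior: a qf field contributes to the
    // MLT query only if the fetched document actually contains it.
    static List<String> usableMltFields(Map<String, Object> fetchedDoc, List<String> qf) {
        return qf.stream().filter(fetchedDoc::containsKey).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("documentId", "32000");
        doc.put("description", "a short product description");

        List<String> qf = List.of("descriptionMLT");
        // Document as submitted: the copyField target is absent, nothing to query on.
        System.out.println(usableMltFields(doc, qf));
        // Document with the target set explicitly: descriptionMLT becomes usable.
        System.out.println(usableMltFields(withCopyTargets(doc), qf));
    }
}
```

Note that sending the destination field explicitly while the copyField rule is still in the schema may produce duplicate values, or be rejected if the destination is single-valued, so this kind of workaround presumably goes together with removing or adjusting the rule.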
--
This message was sent by Atlassian Jira
(v8.20.10#820010)