[ 
https://issues.apache.org/jira/browse/SOLR-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ilariapet updated SOLR-17726:
-----------------------------
          Component/s: MoreLikeThis
    Affects Version/s: 9.8.1
          Description: 
When using CloudMLTQParser (the default MLT parser in SolrCloud), fields that 
are populated exclusively via copyField are not taken into account when 
constructing the MoreLikeThis query.

This happens because CloudMLTQParser relies on a RealTime Get ({{/get}}) request 
to retrieve the source document by ID, and the document returned by RealTime 
Get does not include fields generated via copyField (i.e. fields that are not 
part of the original SolrInputDocument).

As a result, even if the copyField target has termVectors enabled, 
CloudMLTQParser silently skips the field, and the MLT query ends up empty.
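
The mechanism described above can be illustrated with a small toy model. This is not Solr code: the function names ({{index_document}}, {{realtime_get}}, {{mlt_terms}}) and the sample text are hypothetical, and whitespace splitting stands in for the real analysis chain. It only shows why a document view limited to the originally submitted fields yields no terms for a copyField target.

```python
# Toy model only: this is NOT Solr code. It mimics the behaviour described
# in the report with plain dicts so the failure mode is easy to see.

COPY_FIELDS = {"description": ["descriptionMLT"]}  # source -> destinations


def index_document(input_doc):
    """Return the fields that end up in the index (copyField applied)."""
    indexed = dict(input_doc)
    for source, dests in COPY_FIELDS.items():
        if source in input_doc:
            for dest in dests:
                indexed[dest] = input_doc[source]
    return indexed


def realtime_get(input_doc):
    """Return the document as the report describes RealTime Get: input fields only."""
    return dict(input_doc)


def mlt_terms(doc, qf):
    """Collect candidate terms from the qf fields that are present in doc."""
    terms = []
    for field in qf:
        # Whitespace splitting is a stand-in for the field's analyzer.
        terms.extend(str(doc.get(field, "")).lower().split())
    return terms


input_doc = {"documentId": "32000", "description": "red running shoes"}

# View without copyField targets: no terms, so the MLT query is empty.
print(mlt_terms(realtime_get(input_doc), ["descriptionMLT"]))
# View with copyField applied: terms are available.
print(mlt_terms(index_document(input_doc), ["descriptionMLT"]))
```

In this sketch the first call finds nothing to build a query from, which corresponds to the empty {{+()}} clause shown in the reproduction steps.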
 
This behavior differs from SimpleMLTQParser (used in Solr standalone), which 
does not rely on RealTimeGet but instead extracts the stored field content and 
re-applies the analysis chain dynamically.
 
*STEPS TO REPRODUCE*
1. Define these fields in the schema.xml:
{code:xml}
<field name="description" type="text_general" indexed="true" stored="true"/>
<field name="descriptionMLT" type="text_general_mlt" indexed="true" stored="true" termVectors="true"/>

<copyField source="description" dest="descriptionMLT"/> {code}
2. Index a document that sets only the {{description}} field. The 
{{descriptionMLT}} field is expected to be populated automatically via the 
configured copyField directive.
3. Run an MLT query:
{code:java}
/select?q={!mlt qf=descriptionMLT}doc_id {code}
4. The resulting parsed query will be empty:
{code:java}
"parsedquery": "+() -documentId:32000"{code}
If the same document is reindexed with {{descriptionMLT}} set explicitly, 
the MLT query works.
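
For anyone scripting the reproduction, the requests from the steps above could be built roughly as follows. This is a sketch under assumptions that are not in the report: Solr listening on localhost:8983, a collection named {{mycollection}}, {{documentId}} as the uniqueKey field (inferred from the parsed query), and a made-up description text. The helper names are hypothetical. The code only builds the URL and JSON bodies; sending them is left to the HTTP client of your choice.

```python
import json
from urllib.parse import urlencode

# Assumptions (not from the report): host, port and collection name.
BASE_URL = "http://localhost:8983/solr/mycollection"


def mlt_select_url(doc_id, qf="descriptionMLT"):
    """Build the /select URL from step 3, asking Solr for query debug output."""
    params = {"q": "{!mlt qf=%s}%s" % (qf, doc_id), "debug": "query"}
    return BASE_URL + "/select?" + urlencode(params)


def update_payload(doc_id, description, set_mlt_field=False):
    """Build a JSON body for /update.

    With set_mlt_field=False only `description` is sent (step 2).
    With set_mlt_field=True `descriptionMLT` is also set explicitly,
    mirroring the last sentence of the report.
    """
    doc = {"documentId": doc_id, "description": description}
    if set_mlt_field:
        doc["descriptionMLT"] = description
    return json.dumps([doc])


print(mlt_select_url("32000"))
print(update_payload("32000", "red running shoes"))
print(update_payload("32000", "red running shoes", set_mlt_field=True))
```

Comparing the {{parsedquery}} value in the debug output after each of the two payloads should show the difference described above. The report does not say whether the copyField directive was left in place for the explicit reindex, so that part of the setup is left open here.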
 
 
 
               Labels: morelikethis  (was: )

> CloudMLTQParser fails to use copyFields due to RealTimeGet
> ----------------------------------------------------------
>
>                 Key: SOLR-17726
>                 URL: https://issues.apache.org/jira/browse/SOLR-17726
>             Project: Solr
>          Issue Type: Bug
>          Components: MoreLikeThis
>    Affects Versions: 9.8.1
>            Reporter: ilariapet
>            Priority: Major
>              Labels: morelikethis


