[ 
https://issues.apache.org/jira/browse/SOLR-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550488#comment-17550488
 ] 

Joel Bernstein edited comment on SOLR-16239 at 6/6/22 1:57 PM:
---------------------------------------------------------------

One of the first joins worth supporting is an aggregation that joins to a 
different collection only to retrieve the grouping key. This allows for query 
plans that aggregate first and then join to fetch the group key for each 
bucket. Here is an example:

{code:sql}
SELECT c.product_name, COUNT(*) AS cnt FROM signals s
LEFT JOIN catalog c ON s.product_id = c.product_id
GROUP BY c.product_name
ORDER BY cnt DESC
LIMIT 25
{code}

This could be rewritten as a very efficient Streaming Expression:

{code:java}
select(fetch(catalog,
             facet(signals,
                   q="*:*",
                   buckets="product_id",
                   bucketSorts="count(*) desc",
                   bucketSizeLimit=25,
                   count(*)),
             on="product_id=product_id",
             fl="product_name"),
       count(*) as cnt,
       product_name)
{code}
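
The equivalence of the two plans can be checked on a toy data set. The 
following Python sketch is illustrative only: the in-memory `signals` list and 
`catalog` dict are hypothetical stand-ins for the two collections, and the 
function names are not Solr APIs. It assumes that every `product_id` maps to 
exactly one distinct `product_name` and that every signal has a catalog entry; 
without that assumption, grouping by id and grouping by name can produce 
different buckets.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for the two collections.
signals = [
    {"product_id": "p1"}, {"product_id": "p2"}, {"product_id": "p1"},
    {"product_id": "p3"}, {"product_id": "p1"}, {"product_id": "p2"},
]
catalog = {"p1": "Laptop", "p2": "Phone", "p3": "Tablet"}


def join_then_aggregate(signals, catalog, limit):
    """Naive plan: look up the name for every signal row, then group by name."""
    counts = Counter(catalog.get(s["product_id"]) for s in signals)
    # Order by count descending; break ties by name for a stable result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return ranked[:limit]


def aggregate_then_join(signals, catalog, limit):
    """Rewritten plan: group by product_id, keep the top buckets, then fetch names."""
    counts = Counter(s["product_id"] for s in signals)
    # Order by count descending; break ties by id for a stable result.
    top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
    # Only the surviving buckets are looked up in the catalog.
    return [(catalog.get(pid), cnt) for pid, cnt in top]


if __name__ == "__main__":
    print(join_then_aggregate(signals, catalog, 25))
    print(aggregate_then_join(signals, catalog, 25))
```

In the second function the catalog is consulted at most `limit` times instead 
of once per signal row, which is where the rewritten plan would be expected to 
save work.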







> Add Join query plans to Solr SQL
> --------------------------------
>
>                 Key: SOLR-16239
>                 URL: https://issues.apache.org/jira/browse/SOLR-16239
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Parallel SQL
>            Reporter: Joel Bernstein
>            Priority: Major
>              Labels: RobustSQL
>
> This is an umbrella ticket for adding join query plans for Solr SQL.
> Solr 9 adds significant performance improvements to the export handler. These 
> improvements were made in part to support fast distributed joins in Solr SQL. 
> Streaming Expressions already supports hash joins and merge joins and has 
> limited support for nested loop joins (fetch). What remains is to add rules 
> to the Calcite planner that push the joins down to the SQL handler.
> Calcite can also execute joins itself, so part of this work will be to fall 
> back gracefully to Calcite's join engine. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

