[jira] [Updated] (CALCITE-2696) Make it easier to configure SqlToRelConverter.Config.getInSubQueryThreshold()

Julian Hyde (JIRA) Wed, 28 Nov 2018 09:46:20 -0800


     [ https://issues.apache.org/jira/browse/CALCITE-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Julian Hyde updated CALCITE-2696:
---------------------------------
    Description: 
A {{Filter}} containing an IN clause is not passed to {{Enumerable.scan}}.

I'm using the Calcite JDBC driver with a custom SchemaFactory (defined by a 
model property) that provides a schema containing a ProjectableFilterableTable:
{code:java}
String model = "inline:" //
+ "{" //
+ " version: '1.0', " //
+ " defaultSchema: 'test'," //
+ " schemas: [" //
+ " {" //
+ " name: 'test'," //
+ " type: 'custom'," //
+ " factory: '" + TestSchemaFactory.class.getName() + "'" //
+ " }"
+ " ]" //
+ "}";
Properties properties = new Properties();
properties.put(CalciteConnectionProperty.MODEL.camelName(), model);
connection = DriverManager.getConnection("jdbc:calcite:", properties);
{code}
 

 
{code:java}
class TestTable extends AbstractQueryableTable
    implements ProjectableFilterableTable {

  public Enumerable<Object[]> scan(DataContext root, List<RexNode> filters,
      int[] projects) {
...
  }

  ...
}{code}
 

It maps to a Java class and provides two Integer-typed columns, "value1" and 
"value2".

Executing the following statement leads to quite expensive behavior in the 
scan method:

 
{code:java}
SELECT "value" FROM "TEST_TABLE" WHERE "value1" = 1 AND "value2" in 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
{code}
The scan method is invoked with a filter that covers only the part "value1" = 
1; the IN clause is omitted entirely. The result on the JDBC side is still 
valid, but in my case this leads to a full scan of a large underlying data 
set (millions of rows).

Interestingly, the filter part reflecting the IN operator is provided if the 
number of elements in the list is below 20. It seems that this is controlled by 
org.apache.calcite.sql2rel.SqlToRelConverter.Config#getInSubQueryThreshold. It 
would be very helpful if this behavior could be configured at the JDBC 
property level.
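
Until such a property exists, one possible client-side workaround is to split a long IN list into several shorter lists joined by OR, so that no single list reaches the threshold. The following is only a sketch: InListChunker and chunkedInPredicate are hypothetical names, the limit of 19 is taken from the observation above that lists below 20 elements are passed on, and whether the planner then hands each shorter list to scan is an assumption that has not been verified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class InListChunker {

  /**
   * Builds a predicate such as ("value2" IN (1, 2) OR "value2" IN (3))
   * in which no single IN list holds more than maxPerList values.
   * The column name must already be quoted by the caller.
   */
  static String chunkedInPredicate(String quotedColumn, List<Integer> values, int maxPerList) {
    if (values.isEmpty() || maxPerList < 1) {
      throw new IllegalArgumentException("need at least one value and maxPerList >= 1");
    }
    List<String> parts = new ArrayList<>();
    for (int start = 0; start < values.size(); start += maxPerList) {
      int end = Math.min(start + maxPerList, values.size());
      String list = values.subList(start, end).stream()
          .map(String::valueOf)
          .collect(Collectors.joining(", "));
      parts.add(quotedColumn + " IN (" + list + ")");
    }
    return "(" + String.join(" OR ", parts) + ")";
  }

  public static void main(String[] args) {
    List<Integer> values = IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
    // 19 is an assumption derived from the report, not a documented Calcite constant.
    String predicate = chunkedInPredicate("\"value2\"", values, 19);
    System.out.println("WHERE \"value1\" = 1 AND " + predicate);
  }
}
```

The helper accepts only integers, so the generated text needs no escaping; string values would call for bind parameters instead.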

  was:
I'm using the Calcite JDBC driver with a custom SchemaFactory (defined by a 
model property) that provides a schema containing a ProjectableFilterableTable:
{code:java}
String model = "inline:" //
+ "{" //
+ " version: '1.0', " //
+ " defaultSchema: 'test'," //
+ " schemas: [" //
+ " {" //
+ " name: 'test'," //
+ " type: 'custom'," //
+ " factory: '" + TestSchemaFactory.class.getName() + "'" //
+ " }"
+ " ]" //
+ "}";
Properties properties = new Properties();
properties.put(CalciteConnectionProperty.MODEL.camelName(), model);
connection = DriverManager.getConnection("jdbc:calcite:", properties);
{code}
 

 
{code:java}
class TestTable extends AbstractQueryableTable
    implements ProjectableFilterableTable {

  public Enumerable<Object[]> scan(DataContext root, List<RexNode> filters,
      int[] projects) {
...
  }

  ...
}{code}
 

It maps to a Java class and provides two Integer-typed columns, "value1" and 
"value2".

Executing the following statement leads to quite expensive behavior in the 
scan method:

 
{code:java}
SELECT "value" FROM "TEST_TABLE" WHERE "value1" = 1 AND "value2" in 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
{code}
The scan method is invoked with a filter that covers only the part "value1" = 
1; the IN clause is omitted entirely. The result on the JDBC side is still 
valid, but in my case this leads to a full scan of a large underlying data 
set (millions of rows).

Interestingly, the filter part reflecting the IN operator is provided if the 
number of elements in the list is below 20. It seems that this is controlled by 
org.apache.calcite.sql2rel.SqlToRelConverter.Config#getInSubQueryThreshold. It 
would be very helpful if this behavior could be configured at the JDBC 
property level.


> Make it easier to configure SqlToRelConverter.Config.getInSubQueryThreshold()
> -----------------------------------------------------------------------------
>
>                 Key: CALCITE-2696
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2696
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: Dirk Mahler
>            Assignee: Julian Hyde
>            Priority: Major
>         Attachments: calcite-in-clause.zip



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
