Re: JDBC Table and parameters provider

2020-04-23 Thread Flavio Pompermaier
I've created 3 tickets related to this discussion; feel free to comment on them: 1. https://issues.apache.org/jira/browse/FLINK-17358 - JDBCTableSource support FilterableTableSource 2. https://issues.apache.org/jira/browse/FLINK-17360 - Support custom partitioners in JDBCReadOptions

Re: JDBC Table and parameters provider

2020-04-22 Thread Jingsong Li
> Specify "query" and "provider"
Yes, your proposal looks reasonable to me. The key can be "scan.***" like in [1].
> specify parameters
Maybe we need to add something like "scan.parametervalues.provider.type", which can be "bound, specify, custom": - when bound, using old partitionLowerBound and partitionUp

Re: JDBC Table and parameters provider

2020-04-22 Thread Jingsong Li
Hi, you can configure the table name for the JDBC source, and this table name can be a rich SQL expression: "(SELECT public.A.x, public.B.y FROM public.A JOIN public.B on public.A.pk = public.B.fk)" So the final scan query statement will be: "select ... from (SELECT public.

Re: JDBC Table and parameters provider

2020-04-22 Thread Flavio Pompermaier
Sorry Jingsong, but I didn't understand your reply. Could you explain the following sentences better, please? I'm probably missing some Table API background here (I have only used JDBCOutputFormat). "We cannot use a simple "scan.query.statement", because in JDBCTableSource, it also deals with project pushdown too

Re: JDBC Table and parameters provider

2020-04-22 Thread Jingsong Li
Thanks for the explanation. You can create a JIRA for this. For "SELECT public.A.x, public.B.y FROM public.A JOIN public.B on public.A.pk = public.B.fk", we cannot use a simple "scan.query.statement", because in JDBCTableSource, it also deals with project

Re: JDBC Table and parameters provider

2020-04-22 Thread Flavio Pompermaier
Because in my use case the parallelism was not based on a range of keys/numbers but on a range of dates, so I needed a custom Parameter Provider. As for pushdown, I don't know how Flink/Blink currently works. For example, let's say I have a Postgres catalog containing 2 tables (public.A an
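
A minimal sketch of the kind of date-based parameter provider described above: it splits a date range into consecutive sub-ranges, one pair of bind values per parallel split. The class name `DateRangeParameters`, the method `split`, and the `daysPerSplit` argument are hypothetical and not part of Flink; the `Serializable[][]` return shape is only an assumption about what a JDBC parameters provider would hand to a prepared statement with two `?` placeholders.

```java
import java.io.Serializable;
import java.sql.Date;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Flink class: builds one {from, to} pair per split.
public class DateRangeParameters {

    /** Splits the inclusive range [from, to] into chunks of at most daysPerSplit days. */
    public static Serializable[][] split(LocalDate from, LocalDate to, int daysPerSplit) {
        if (daysPerSplit <= 0 || from.isAfter(to)) {
            throw new IllegalArgumentException("invalid range or split size");
        }
        List<Serializable[]> params = new ArrayList<>();
        LocalDate start = from;
        while (!start.isAfter(to)) {
            // Inclusive end of this chunk, clamped to the overall upper bound.
            LocalDate end = start.plusDays(daysPerSplit - 1);
            if (end.isAfter(to)) {
                end = to;
            }
            params.add(new Serializable[] {Date.valueOf(start), Date.valueOf(end)});
            start = end.plusDays(1);
        }
        return params.toArray(new Serializable[0][]);
    }

    public static void main(String[] args) {
        // Each row would bind the two placeholders of "... WHERE some_date BETWEEN ? AND ?"
        // (the column name some_date is an assumption for illustration).
        for (Serializable[] row : split(LocalDate.of(2020, 1, 1), LocalDate.of(2020, 1, 10), 4)) {
            System.out.println(row[0] + " .. " + row[1]);
        }
    }
}
```

Each returned row corresponds to one parallel fetch, so the number of rows bounds the useful parallelism of the read.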

Re: JDBC Table and parameters provider

2020-04-22 Thread Jingsong Li
Hi, you are right about the lower and upper bounds: they are required to parallelize the fetch of the data. Filter pushdown, in turn, is used to filter more data on the JDBC server side. Yes, we can provide "scan.query.statement" and "scan.parameter.values.provider.class" for the JDBC connector. But maybe we need to be careful ab

Re: JDBC Table and parameters provider

2020-04-22 Thread Flavio Pompermaier
Maybe I am wrong, but supporting pushdown for JDBC is one thing (and probably useful), while parameters providers are required if you want to parallelize the fetch of the data. You are not required to use NumericBetweenParametersProvider; you can use whichever ParametersProvider you prefer, depending on t

Re: JDBC Table and parameters provider

2020-04-22 Thread Jingsong Li
Hi, currently JDBCTableSource.getInputFormat explicitly writes: WHERE XXX BETWEEN ? AND ?. So we must use `NumericBetweenParametersProvider`. I don't think this is a good long-term solution. I think we should support filter push-down for JDBCTableSource, so in this way, we can write the fi
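
To illustrate the `WHERE XXX BETWEEN ? AND ?` splitting mentioned above, here is a minimal sketch of how a numeric range can be cut into inclusive lower/upper pairs, one per parallel read. This is not Flink's `NumericBetweenParametersProvider`; the class `NumericRangeSplitter`, its `split` method, and the column name `id` are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Flink class: computes {lower, upper} pairs for BETWEEN ? AND ?.
public class NumericRangeSplitter {

    /** Splits the inclusive range [min, max] into ranges covering at most batchSize values each. */
    public static long[][] split(long min, long max, long batchSize) {
        if (batchSize <= 0 || min > max) {
            throw new IllegalArgumentException("invalid range or batch size");
        }
        List<long[]> ranges = new ArrayList<>();
        long start = min;
        while (true) {
            // Clamp the last range to max; comparing (max - start) avoids overflow in start + batchSize
            // as long as the range itself fits in a long.
            long end = (max - start < batchSize - 1) ? max : start + batchSize - 1;
            ranges.add(new long[] {start, end});
            if (end == max) {
                break;
            }
            start = end + 1;
        }
        return ranges.toArray(new long[0][]);
    }

    public static void main(String[] args) {
        // The column name "id" is an assumption; each line shows the bounds one split would bind.
        for (long[] r : split(0, 9, 4)) {
            System.out.println("WHERE id BETWEEN " + r[0] + " AND " + r[1]);
        }
    }
}
```

Because the predicate shape is fixed to a numeric BETWEEN, a split on dates or on arbitrary values does not fit this scheme, which is the limitation the thread is discussing.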

JDBC Table and parameters provider

2020-04-21 Thread Flavio Pompermaier
Hi all, we have a use case where we have a prepared statement that we parameterize using a custom parameters provider (similar to what happens in testJDBCInputFormatWithParallelismAndNumericColumnSplitting[1]). How can we handle this using the JDBC table API? What should we do to handle such a use