The plan shows the filter has been pushed down. But remember: although it is pushed down, the filesystem table source won't accept the filter, so it will still scan all the files.
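For illustration, here is a minimal sketch of the kind of table and query discussed in this thread, assuming a filesystem table that exposes the `file.path` metadata column; the schema, path and format are made up for the example. Even though the optimized plan lists the LIKE predicate under the TableSourceScan, the source still lists and reads every file under 'path', and the predicate only drops rows afterwards.

-- Hypothetical table definition, for illustration only.
CREATE TABLE MyUserTable (
  score INT,
  `file.path` STRING NOT NULL METADATA   -- metadata column exposed by the filesystem connector
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///data/my_user_table',  -- the whole directory is still listed and read
  'format' = 'csv'
);

-- The plan reports the filter as pushed into the scan, but the filesystem
-- source does not use it to skip files; rows are filtered after reading.
SELECT score, `file.path`
FROM MyUserTable
WHERE `file.path` LIKE '%prefix_%';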
Best regards,
Yuxia

From: "Maryam Moafimadani" <maryam.moafimad...@shopify.com>
To: "Hang Ruan" <ruanhang1...@gmail.com>
Cc: "yuxia" <luoyu...@alumni.sjtu.edu.cn>, "ravi suryavanshi" <ravi_suryavan...@yahoo.com>, "Yaroslav Tkachenko" <yaros...@goldsky.com>, "Shammon FY" <zjur...@gmail.com>, "User" <user@flink.apache.org>
Sent: Monday, March 13, 2023, 10:07:57 PM
Subject: Re: Are the Table API Connectors production ready?

Hi All,
It's exciting to see file filtering in the plan for development. I am curious whether the following query on a filesystem connector would actually push down the filter on the metadata column `file.path`:

Select score, `file.path` from MyUserTable WHERE `file.path` LIKE '%prefix_%'

== Optimized Execution Plan ==
Calc(select=[score, file.path], where=[LIKE(file.path, '%2022070611284%')])
+- TableSourceScan(table=[[default_catalog, default_database, MyUserTable, filter=[LIKE(file.path, _UTF-16LE'%2022070611284%')]]], fields=[score, file.path])

Thanks,
Maryam

On Mon, Mar 13, 2023 at 8:55 AM Hang Ruan <ruanhang1...@gmail.com> wrote:

Hi, yuxia,
I would like to help to complete this task.

Best,
Hang

yuxia <luoyu...@alumni.sjtu.edu.cn> wrote on Monday, March 13, 2023 at 09:32:

Yeah, you're right. We don't provide filtering of files with patterns, and we actually already have a JIRA [1] for it. I intended to do this in the past but didn't have much time. Anyone who is interested can take it over; we're happy to help review.

[1] https://issues.apache.org/jira/browse/FLINK-17398

Best regards,
Yuxia

From: "User" <user@flink.apache.org>
To: "Yaroslav Tkachenko" <yaros...@goldsky.com>, "Shammon FY" <zjur...@gmail.com>
Cc: "User" <user@flink.apache.org>
Sent: Monday, March 13, 2023, 12:36:46 AM
Subject: Re: Are the Table API Connectors production ready?

Thanks a lot, Yaroslav and Shammon. I want to use the Filesystem connector. I tried it and it works well while the job is running, but if the job is restarted it processes all the files again. I could not find an option to move or delete the files after they have been collected. Also, I could not find filtering using patterns; pattern matching is required because different files exist in the same folder.

Regards,
Ravi

On Friday, 10 March, 2023 at 05:47:27 am IST, Shammon FY <zjur...@gmail.com> wrote:

Hi Ravi,
Agree with Yaroslav, and if you find any problems in use, you can create an issue in JIRA: https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK . I have used kafka/jdbc/hive in production too; they work well.

Best,
Shammon

On Fri, Mar 10, 2023 at 1:42 AM Yaroslav Tkachenko <yaros...@goldsky.com> wrote:

Hi Ravi,
All of them should be production ready. I've personally used half of them in production. Do you have any specific concerns?

On Thu, Mar 9, 2023 at 9:39 AM ravi_suryavanshi.yahoo.com via user <user@flink.apache.org> wrote:

Hi,
Can anyone help me here?
Thanks and regards,
Ravi

On Monday, 27 February, 2023 at 09:33:18 am IST, ravi_suryavanshi.yahoo.com via user <user@flink.apache.org> wrote:

Hi Team,
In Flink 1.16.0, we would like to use some of the Table API connectors for production. Kindly let me know if the connectors below are production ready or only for testing purposes.

Name (Version) | Source | Sink
- Filesystem | Bounded and Unbounded Scan, Lookup | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/filesystem/
- Elasticsearch (6.x & 7.x) | Not supported | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/elasticsearch/
- Opensearch (1.x & 2.x) | Not supported | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/opensearch/
- Apache Kafka (0.10+) | Unbounded Scan | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/kafka/
- Amazon DynamoDB | Not supported | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/dynamodb/
- Amazon Kinesis Data Streams | Unbounded Scan | Streaming Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/kinesis/
- Amazon Kinesis Data Firehose | Not supported | Streaming Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/firehose/
- JDBC | Bounded Scan, Lookup | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/jdbc/
- Apache HBase (1.4.x & 2.2.x) | Bounded Scan, Lookup | Streaming Sink, Batch Sink
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/hbase/
- Apache Hive
  https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/hive/overview/

Thanks and regards

--
Maryam Moafimadani
Senior Data Developer @ Shopify (http://www.shopify.com/)
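For reference, a minimal sketch of how the Filesystem connector discussed above might be declared as a continuously monitored streaming source in Flink SQL. The table name, schema and path are invented for the example, and 'source.monitor-interval' is assumed to be the directory-watching option of the 1.16 filesystem connector; 'path' always points at a whole directory, and there is no file-pattern option in 1.16, which is exactly what FLINK-17398 asks for.

-- Hypothetical streaming source over a directory of CSV files; names and
-- options are illustrative, not a drop-in configuration.
CREATE TABLE incoming_files (
  id BIGINT,
  payload STRING,
  `file.path` STRING NOT NULL METADATA    -- handy for ad-hoc filtering on file names
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///data/incoming',        -- a directory; no glob/pattern support in 1.16
  'format' = 'csv',
  'source.monitor-interval' = '10s'        -- assumed option: keep watching the directory for new files
);

A query-level predicate on `file.path` can separate different kinds of files logically, but as noted at the top of the thread it does not prevent the source from listing and reading all files in the directory.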