Re: [ANNOUNCE] Release 1.13.0, release candidate #0

2021-04-01 Thread Xingbo Huang
Hi Dawid, Thanks a lot for the great work! Regarding to the issue of flink-python, I have provided a quick fix and will try to fix it ASAP. Best, Xingbo Dawid Wysakowicz 于2021年4月2日周五 上午4:04写道: > Hi everyone, > As promised I created a release candidate #0 for the version 1.13.0. I am > not star

Why is Hive dependency flink-sql-connector-hive not available on Maven Central?

2021-04-01 Thread Yik San Chan
The question is cross-posted in StackOverflow https://stackoverflow.com/questions/66914119/flink-why-is-hive-dependency-flink-sql-connector-hive-not-available-on-maven-ce According to [Flink SQL Hive: Using bundled hive jar]( https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connect

Re: Avro schema

2021-04-01 Thread Sumeet Malhotra
Just realized, my question was probably not clear enough. :-) I understand that the Avro (or JSON for that matter) format can be ingested as described here: https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connect.html#apache-avro-format, but this still requires the entire table sp

Avro schema

2021-04-01 Thread Sumeet Malhotra
Hi, Is it possible to directly import Avro schema while ingesting data into Flink? Or do we always have to specify the entire schema in either SQL DDL for Table API or using DataStream data types? From a code maintenance standpoint, it would be really helpful to keep one source of truth for the sc

Re: [DISCUSS] Feature freeze date for 1.13

2021-04-01 Thread Guowei Ma
Hi, Yuval Thanks for your contribution. I am not a SQL expert, but it seems to be beneficial to users, and the amount of code is not much and only left is the test. Therefore, I am open to this entry into rc1. But according to the rules, you still have to see if there are other PMC's objections wi

How to know if task-local recovery kicked in for some nodes?

2021-04-01 Thread Sonam Mandal
Hello, We are experimenting with task local recovery and I wanted to know whether there is a way to validate that some tasks of the job recovered from the local state rather than the remote state. We've currently set this up to have 2 Task Managers with 2 slots each, and we run a job with para

Re: Flink Taskmanager failure recovery and large state

2021-04-01 Thread Guowei Ma
Hi, Yaroslav AFAIK Flink does not retry if the download checkpoint from the storage fails. On the other hand the FileSystem already has this retry mechanism already. So I think there is no need for flink to retry. I am not very sure but from the log it seems that the gfs's retry is interrupted by

Re: s3 FileSystem Error "s3 file system implementation does not support recoverable writers"

2021-04-01 Thread Guowei Ma
Hi, Robert It seems that your AccessKeyId is not valid. I think you could find more detailed from [1] about how to configure the s3' access key. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/ Best, Guowei On Thu, Apr 1, 2021 at 9:19 PM Robert Cullen w

[ANNOUNCE] Release 1.13.0, release candidate #0

2021-04-01 Thread Dawid Wysakowicz
|Hi everyone,| |As promised I created a release candidate #0 for the version 1.13.0. I am not starting a vote for this release as I've created it mainly for verifying the release process. We are still aware of some improvements coming in shortly. However we will greatly appreciate any help testing

Re: [External] : Re: Need help with executing Flink CLI for native Kubernetes deployment

2021-04-01 Thread Fuyao Li
Hi Yang, Thanks for sharing the insights. For problem 1: I think I can’t do telnet in the container. I tried to use curl 144.25.13.78:8081 and I could see the HTML of Flink dashboard UI. This proves such public IP is reachable inside the cluster. Just as you mentioned, there might still be som

Re: DataStream from kafka topic

2021-04-01 Thread Arian Rohani
Thank you Arvid, I was going to suggest something like this also. We use TestContainers and the docker images provided by ververica to do exactly this in our team. I am currently working on a small project on github to start sharing for use cases like this. The project will contain some example so

Question about setting up Task-local recovery with a RocksDB state backend

2021-04-01 Thread Sonam Mandal
Hello, I've been going through the documentation for task-local recovery and came across this section which discusses that with incremental check

Re: Flink Taskmanager failure recovery and large state

2021-04-01 Thread Yaroslav Tkachenko
Hi Guowei, I thought Flink can support any HDFS-compatible object store like the majority of Big Data frameworks. So we just added "flink-shaded-hadoop-2-uber" and "gcs-connector-latest-hadoop2" dependencies to the classpath, after that using "gs" prefix seems to be possible: state.checkpoints.di

Re: [DISCUSS] Feature freeze date for 1.13

2021-04-01 Thread Yuval Itzchakov
Hi All, I would really love to merge https://github.com/apache/flink/pull/15307 prior to 1.13 release cutoff, it just needs some more tests which I can hopefully get to today / tomorrow morning. This is a critical fix as now predicate pushdown won't work for any stream which generates a watermark

Re: s3 FileSystem Error "s3 file system implementation does not support recoverable writers"

2021-04-01 Thread Robert Cullen
Guowei, I changed to “s3a://argo-artifacts/“ but now I get this error: BTW: I'm using the example playground from here: [1] https://docs.ververica.com/getting_started/installation.html org.apache.flink.util.SerializedThrowable: 4bcbea1c-254c-4a7f-9348-14bf5f3e1f1a/cmdaa-0-0.ext: initiate MultiP

Re: DataStream from kafka topic

2021-04-01 Thread Arvid Heise
Arian gave good pointers, but I'd go even further: you should have ITCases where you pretty much just execute a mini job with docker-based Kafka and run it automatically. I strongly recommend to check out testcontainers [1], it makes writing such a test a really smooth experience. [1] https://www.

Re: Re: How does Flink SQL read Avro union?

2021-04-01 Thread Arvid Heise
Hi Vincent, yes if you cannot influence the schema, then there is little you can do on SQL level and your workaround is probably the only way to go. However, I'd encourage you to speak with the other consumers/producers to find a way without unions. They are also ugly to use in all strongly typed

Re: Proper way to get DataStream

2021-04-01 Thread Arvid Heise
Hi, it seems as if the data is written with a confluent registry in mind, so you cannot use option 1: the kafka record is invalid avro as it contains a 5 byte prefix that identifies the schema. So the second way, is the way to go and it actually works well: it tells you that you have read with a

Re: Checkpoint timeouts at times of high load

2021-04-01 Thread Guowei Ma
Hi, I think there are many reasons that could lead to the checkpoint timeout. Would you like to share some detailed information of checkpoint? For example, the detailed checkpoint information from the web.[1] And which Flink version do you use? [1] https://ci.apache.org/projects/flink/flink-docs-

Re: Flink Taskmanager failure recovery and large state

2021-04-01 Thread Guowei Ma
Hi, Yaroslav AFAIK there is no official GCS FileSystem support in FLINK. Does the GCS is implemented by yourself? Would you like to share the whole log of jm? BTW: From the following log I think the implementation has already some retry mechanism. >>> Interrupted while sleeping before retry. Giv

Re: Restoring from Flink Savepoint in Kubernetes not working

2021-04-01 Thread Matthias Pohl
The logs would have helped to understand better what you were doing. The stacktrace you shared indicates that you either asked for the status of a savepoint creation that had already been completed and was, therefore, removed from the operations cache or you used some job ID/request ID pair that w

Checkpoint timeouts at times of high load

2021-04-01 Thread Geldenhuys, Morgan Karl
Hi Community, I have a number of flink jobs running inside my session cluster with varying checkpoint intervals plus a large amount of operator state and in times of high load, the jobs fail due to checkpoint timeouts (set to 6 minutes). I can only assume this is because the latencies for savi

Re: JDBC connector support for JSON

2021-04-01 Thread Matthias Pohl
Hi Fanbin, I'm not that familiar with the FlinkSQL features. But it looks like the JdbcConnector does not support Json as stated in the documentation [1]. You might work around it by implementing your own user-defined functions [2]. I hope this helps. Matthias [1] https://ci.apache.org/projects/f

Re: ARM support

2021-04-01 Thread Guowei Ma
Hi, Rex I think that Flink does not have an official release that supports the arm architecture. There are some efforts and discussion [1][2][3] about supporting the architecture. I think you could find some builds at openlabtesting. [4] But AFAIK there is no clear timeline about that.(correct me

Re: [DISCUSS] Feature freeze date for 1.13

2021-04-01 Thread Kurt Young
Thanks Dawid, I have merged FLINK-20320. Best, Kurt On Thu, Apr 1, 2021 at 2:49 PM Dawid Wysakowicz wrote: > Hi all, > > @Kurt @Arvid I think it's fine to merge those two, as they are pretty much > finished. We can wait for those two before creating the RC0. > > @Leonard Personally I'd be ok w