RE: Could not stop job with a savepoint

2022-03-09 Thread Schwalbe Matthias
Hi Vinicius, Your case, the taskmanager being actively killed by yarn was the other way this happened. You are using RocksDBStateBackend, right? Not being sure, I’ve got the strong suspicion that this has got to do with the glibc bug that is seemingly in the works. There is some documentation h

Re: Fwd: [DISCUSS] Flink's supported APIs and Hive query syntax

2022-03-09 Thread Jark Wu
Thanks Martijn for the reply and summary. I totally agree with your plan and thank Yuxia for volunteering the Hive tech debt issue. I think we can create an umbrella issue for this and target version 1.16. We can discuss details and create subtasks there. Regarding dropping old Hive versions, I'm

回复:Fwd: [DISCUSS] Flink's supported APIs and Hive query syntax

2022-03-09 Thread 罗宇侠(莫辞)
Thanks Martijn for your insights. About the tech debt/maintenance with regards to Hive query syntax, I would like to chip-in and expect it can be resolved for Flink 1.16. Best regards, Yuxia ​ --原始邮件 -- 发件人:Martijn Visser 发送时间:Thu Mar 10 04:03:34 2022 收件人:User

Fwd: [DISCUSS] Flink's supported APIs and Hive query syntax

2022-03-09 Thread Martijn Visser
(Forwarding this also to the User mailing list as I made a typo when replying to this email thread) -- Forwarded message - From: Martijn Visser Date: Wed, 9 Mar 2022 at 20:57 Subject: Re: [DISCUSS] Flink's supported APIs and Hive query syntax To: dev , Francesco Guardiani , Timo W

Re: Flink sql calculate principle question

2022-03-09 Thread Martijn Visser
Hi JianWen, I think this explained in the documentation on Dynamic Tables [1] Does that answer your question? Best regards, Martijn [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/dynamic_tables/#query-restrictions On Wed, 9 Mar 2022 at 18:32, JianWen Huang wr

Re: Left join query not clearing state after migrating from 1.9.0 to 1.14.3

2022-03-09 Thread Roman Khachatryan
Hi Prakhar, Thanks for sharing this > Also, can you give us an idea of exactly what details you are looking for? It would be helpful to know the sizes of different parts of the checkpoint and the timings (e.g. sync/async phases, alignment duration, etc.) [1] Could you please share: 1. "Checkpoin

Flink sql calculate principle question

2022-03-09 Thread JianWen Huang
As we all know. Flink sql will be translated into streaming api. Look at a sql : SELECT color, sum(id) FROM T GROUP BY color. In the actual calculation, Flink will store the whole t-stream into the state, and a piece of data in the stream will trigger a full stream calculation. Or does the stat

Re: Could not stop job with a savepoint

2022-03-09 Thread Vinicius Peracini
So apparently the YARN container for Task Manager is running out of memory during the savepoint execution. Never had any problems with checkpoints though. Task Manager configuration: "taskmanager.memory.process.size": "10240m", "taskmanager.memory.managed.fraction": "0.6", "taskmanager.memory.jvm-

Re: Could not stop job with a savepoint

2022-03-09 Thread Vinicius Peracini
Bom dia Schwalbe! Thanks for the reply. I'm using Flink 1.14.0. EMR is a managed cluster platform to run big data applications on AWS. This way Flink services are running on YARN. I tried to create another savepoint today and was able to retrieve the Job Manager log: 2022-03-09 15:42:10,294 INFO

Re: replay kinesis events

2022-03-09 Thread Guoqin Zheng
Hi Danny, Thanks for getting back to me. This is very helpful and makes a lot of sense to me. Thanks, -Guoqin On Wed, Mar 9, 2022 at 1:32 AM Danny Cranmer wrote: > Hey Guoqin, > > In order to achieve this you would need to either: > - Restart the job and resume from an old savepoint (taken be

Evolving Schemas with ParquetColumnarRowInputFormat

2022-03-09 Thread Kevin Lam
Hi all, We're interested in using ParquetColumnarRowInputFormat or similar with evolving Parquet schemas. Any advice or recommendations? Specifically, the situ

Re: Flink Statefun Kafka Ingress Record Key Deserializer

2022-03-09 Thread Igal Shilman
Hello Xin Li, Indeed the built in ingress that ships with StateFun requires that the key part will be a utf-8 string, This string then becomes the id part of the target address. StateFun is extensible via the StatefulFunctionModule[1] and customizing the Kafka ingress is also possible, take a loo

Re: [statefun] hadoop dependencies and StatefulFunctionsConfigValidator

2022-03-09 Thread Igal Shilman
Hi Fil, I've replied in the JIRA Cheers, Igal. On Tue, Mar 8, 2022 at 6:08 PM Filip Karnicki wrote: > Hi Roman, Igal (@ below) > > Thank you for your answer. I don't think I'll have access to flink's lib > folder given it's a shared Cloudera cluster. The only thing I could think > of is to not

Re: Flatmap operator in an Asynchronous call

2022-03-09 Thread Arvid Heise
You can use flatMap to flatten and have an asyncIO after it. On Wed, Mar 9, 2022 at 8:08 AM Diwakar Jha wrote: > Thanks Gen, I will look into customized Source and SpiltEnumerator. > > On Mon, Mar 7, 2022 at 10:20 PM Gen Luo wrote: > >> Hi Diwakar, >> >> An asynchronous flatmap function without

Re: Move savepoint to another s3 bucket

2022-03-09 Thread Lukáš Drbal
Hi Dawid, I just tried the same steps on flink builded from git branch release-1.13 and everything works as expected! Thank you all! L. On Wed, Mar 9, 2022 at 8:49 AM Dawid Wysakowicz wrote: > Hi Lukas, > > I am afraid you're hitting this bug: > https://issues.apache.org/jira/browse/FLINK-259

Re: replay kinesis events

2022-03-09 Thread Danny Cranmer
Hey Guoqin, In order to achieve this you would need to either: - Restart the job and resume from an old savepoint (taken before the events you want to replay), assuming the state is still compatible with your bugfix, or - Restart the job without any state and seed the consumer with the start posit