Re: Savepoint Location from Flink REST API

2020-03-20 Thread Ufuk Celebi
Hey Aaron, you can expect one of two responses for COMPLETED savepoints [1, 2]:
1. Success: { "status": { "id": "completed" }, "savepoint": { "location": "string" } }
2. Failure: { "status": { "id": "completed" }, "savepoint": { "failure-cause": { "class":
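The two response shapes above can be told apart programmatically. A minimal sketch in plain Java, using simple string checks instead of a real JSON parser; the sample bodies (savepoint location, exception class) are made up for illustration:

```java
// Sketch: distinguishing the two COMPLETED savepoint responses by shape.
// A real client would use a JSON library (e.g. Jackson) rather than string checks.
public class SavepointResponseCheck {

    static String classify(String body) {
        // Both variants report "status": {"id": "completed"}; only the failure
        // variant carries a "failure-cause" object under "savepoint".
        if (body.contains("\"failure-cause\"")) {
            return "FAILED";
        } else if (body.contains("\"location\"")) {
            return "SUCCEEDED";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        String success =
            "{\"status\":{\"id\":\"completed\"},"
          + "\"savepoint\":{\"location\":\"hdfs:///savepoints/sp-1\"}}";
        String failure =
            "{\"status\":{\"id\":\"completed\"},"
          + "\"savepoint\":{\"failure-cause\":{\"class\":\"java.lang.Exception\"}}}";
        System.out.println(classify(success)); // SUCCEEDED
        System.out.println(classify(failure)); // FAILED
    }
}
```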

Re: Savepoint Location from Flink REST API

2020-03-20 Thread Aaron Langford
Roman, Thanks for the info. That's super helpful. I'd be interested in picking that ticket up. One additional question: the states that can be returned from this API are only described as 'COMPLETED' or 'IN_PROGRESS'. How are failures represented for this endpoint? Aaron On Fri, Mar 20, 2020 at 2:29

Re: Very large _metadata file

2020-03-20 Thread Jacob Sevart
Thanks, makes sense. What about using the config mechanism? We're collecting and distributing some environment variables at startup, would it also work to include a timestamp with that? Also, would you be interested in a patch to note the caveat about union state metadata in the documentation? J

Re: Flink Release Security Workflow

2020-03-20 Thread Robert Metzger
Hey Mark, thanks a lot for reaching out. There is no dedicated security workflow for a Flink release. This is the guide for creating a Flink release (for Flink committers, not for just building Flink locally): https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release As part of the

Re: Some question about flink temp files

2020-03-20 Thread Khachatryan Roman
Hi Reo, Please find the answers to your questions below. > 1, what is the usage of this tmp files? These files are used by Flink internally for things like caching state locally, storing jars and so on. They are not intended for the end-user. > 2, Is there have any mechanism of flink to manage th

Re: Flink YARN app terminated before the client receives the result

2020-03-20 Thread Till Rohrmann
Yes you are right that `thenAcceptAsync` only breaks the control flow but it does not guarantee that the `RestServer` has actually sent the response to the client. Maybe we also need something similar to FLINK-10309 [1]. The problem I see with this approach is that it makes all RestHandlers statefu
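The distinction Till describes can be seen in a small stdlib example: a callback registered with `thenAccept` runs on the thread that completes the future, while `thenAcceptAsync` hands it to an executor. Neither variant waits for any external effect (such as an HTTP response actually reaching the client) to finish. The thread names here are arbitrary labels for the demo:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThenAcceptDemo {
    /** Returns the thread names that the thenAccept / thenAcceptAsync callbacks ran on. */
    static String[] callbackThreads() throws Exception {
        ExecutorService pool =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "callback-pool"));
        CompletableFuture<String> f1 = new CompletableFuture<>();
        CompletableFuture<String> f2 = new CompletableFuture<>();
        CompletableFuture<String> n1 = new CompletableFuture<>();
        CompletableFuture<String> n2 = new CompletableFuture<>();

        // Registered before completion: runs on whichever thread calls complete().
        f1.thenAccept(v -> n1.complete(Thread.currentThread().getName()));
        // Explicit executor: always runs on the pool, off the completing thread.
        f2.thenAcceptAsync(v -> n2.complete(Thread.currentThread().getName()), pool);

        Thread completer =
            new Thread(() -> { f1.complete("ok"); f2.complete("ok"); }, "completing-thread");
        completer.start();
        completer.join();

        String[] names = { n1.get(5, TimeUnit.SECONDS), n2.get(5, TimeUnit.SECONDS) };
        pool.shutdown();
        return names;
    }

    public static void main(String[] args) throws Exception {
        String[] names = callbackThreads();
        System.out.println("thenAccept ran on: " + names[0]);      // completing-thread
        System.out.println("thenAcceptAsync ran on: " + names[1]); // callback-pool
    }
}
```

This is why switching to `thenAcceptAsync` only moves work off the completing thread; it gives no guarantee about when the response bytes are flushed.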

Re: Hadoop user jar for flink 1.9 plus

2020-03-20 Thread Vishal Santoshi
Awesome, thanks! On Tue, Mar 17, 2020 at 11:14 AM Chesnay Schepler wrote: > You can download flink-shaded-hadoop from the downloads page: > https://flink.apache.org/downloads.html#additional-components > > On 17/03/2020 15:56, Vishal Santoshi wrote: > > We have been on flink 1.8.x on production

Re: Flink YARN app terminated before the client receives the result

2020-03-20 Thread DONG, Weike
Hi Tison & Till, Changing *thenAccept* into *thenAcceptAsync* in the MiniDispatcher#cancelJob does not help to solve the problem in my environment. However, I have found that adding a *Thread.sleep(2000)* before the return of JobCancellationHandler#handleRequest solved the problem (at least the sy

RE: Flink Conf "yarn.flink-dist-jar" Question

2020-03-20 Thread Hailu, Andreas
Hi Yang, This is good to know. As a stopgap measure until a solution between 13938 and 14964 arrives, we can automate the application staging directory cleanup from our client should the process fail. It’s not ideal, but will at least begin to manage our users’ quota. I’ll continue to watch the

Re: Flink long state TTL Concerns

2020-03-20 Thread Matthew Magsombol
Gotcha, ok. Thanks! I think this is everything I need to know for now! I can avoid using Thrift as a state data type by using a generic Flink data type, and upon read I can convert to the Thrift data type to pass to my sink. On Fri, Mar 20, 2020 at 1:15 AM Andrey Zagrebin wrote: > *Resources:* >

Some question about flink temp files

2020-03-20 Thread Reo Lei
Hi all, Recently I found that Flink temp files (localState, blobStore-*, flink-dist-cache-*, flink-io-*, flink-netty-shuffle-*, etc.) have grown to a total of about 6GB. I have no idea what these files are used for, or how big they will grow. So I have some questions about the temp files of flink a

Re: Savepoint Location from Flink REST API

2020-03-20 Thread Khachatryan Roman
Hey Aaron, You can use /jobs/:jobid/savepoints/:triggerid to get the location once the savepoint is completed. Please see https://ci.apache.org/projects/flink/flink-docs-release-1.10/api/java/index.html?org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.html Meanwhile, I've

Re: SQL Timestamp types incompatible after migration to 1.10

2020-03-20 Thread Paul Lam
Filed an issue to track this problem. [1] [1] https://issues.apache.org/jira/browse/FLINK-16693 Best, Paul Lam > On Mar 20, 2020, at 17:17, Paul Lam wrote: > > Hi Jark, > > Sorry for my late reply. > > Yes, I’m using the old planner. I’ve tried the

Re: SQL Timestamp types incompatible after migration to 1.10

2020-03-20 Thread Paul Lam
Hi Jark, Sorry for my late reply. Yes, I’m using the old planner. I’ve tried the blink planner, and it works well. We would like to switch to the blink planner, but we’ve developed some custom features on the old planner, so it would take some time to port the code. So I might give a try to

Re: Help with flink hdfs sink

2020-03-20 Thread Yang Wang
I think Jingsong is right. You miss a slash in your HDFS path. Usually an HDFS path looks like "hdfs://nameservice/path/of/your/file". And the nameservice can be omitted if you want to use the defaultFS configured in the core-site.xml. Best, Yang Jingsong Li wrote on Fri, Mar 20, 2020 at 10:09 AM: > Hi N
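The point about the missing slash can be checked with plain `java.net.URI` parsing: in a fully qualified HDFS URI, the authority (the nameservice) must be followed by a slash so the remainder parses as an absolute path. A small sketch using the example path from the reply above:

```java
import java.net.URI;

// Illustrates why "hdfs://nameservice/path/of/your/file" needs the slash
// after the authority: only then does the remainder parse as an absolute path.
public class HdfsPathCheck {
    public static void main(String[] args) {
        URI good = URI.create("hdfs://nameservice/path/of/your/file");
        System.out.println(good.getAuthority()); // nameservice
        System.out.println(good.getPath());      // /path/of/your/file

        // When the defaultFS from core-site.xml should be used, a plain
        // absolute path with no scheme or authority also works.
        URI plain = URI.create("/path/of/your/file");
        System.out.println(plain.getScheme() == null); // true
    }
}
```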

Re: state schema evolution for case classes

2020-03-20 Thread Apoorv Upadhyay
Thanks Gordon for the suggestion. I am going by this repo: https://github.com/mrooding/flink-avro-state-serialization So far I am able to alter the Scala case classes and restore from a savepoint using the memory state backend, but when I use RocksDB as the state backend and try to restore fro

Re: Flink long state TTL Concerns

2020-03-20 Thread Andrey Zagrebin
*Resources:* If you use the heap state backend, the cleanup happens while processing records in the same thread, so there is a direct connection with the number of cores. If you use the RocksDB state backend, extra CPUs can be used by async compaction and should speed up the background cleanup. *Incremental
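The heap-backend behavior described above (cleanup piggybacked on record processing in the same thread) can be modeled with a toy sketch. This is not Flink code; the class, method names, and the "2 entries per check" figure are illustrative only. Each state access additionally inspects a bounded number of entries and evicts the expired ones, which is how incremental cleanup spreads its cost across normal processing:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model (not Flink code) of incremental TTL cleanup on a heap backend:
// every state access also checks a small, fixed number of entries and removes
// the expired ones, so no separate cleanup thread is needed.
public class IncrementalTtlSketch {
    private final Map<String, Long> expiryByKey = new LinkedHashMap<>();
    private final int entriesPerCheck;

    IncrementalTtlSketch(int entriesPerCheck) { this.entriesPerCheck = entriesPerCheck; }

    void put(String key, long expiresAtMillis) { expiryByKey.put(key, expiresAtMillis); }

    /** Called on every state access; piggybacks a bounded cleanup pass. */
    boolean contains(String key, long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = expiryByKey.entrySet().iterator();
        for (int i = 0; i < entriesPerCheck && it.hasNext(); i++) {
            if (it.next().getValue() <= nowMillis) {
                it.remove(); // evict an expired entry as a side effect of the access
            }
        }
        Long expiry = expiryByKey.get(key);
        return expiry != null && expiry > nowMillis;
    }

    int size() { return expiryByKey.size(); }

    public static void main(String[] args) {
        IncrementalTtlSketch state = new IncrementalTtlSketch(2);
        state.put("a", 100);
        state.put("b", 100);
        state.put("c", 1_000);
        // At t=500, "a" and "b" are expired; one access cleans up to 2 entries.
        System.out.println(state.contains("c", 500)); // true
        System.out.println(state.size());             // 1 (only "c" remains)
    }
}
```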

Re: How can i set the value of taskmanager.network.numberOfBuffers ?

2020-03-20 Thread Xintong Song
Hi Forideal, Do you mean you have 700 slots per TM or in total? How many TMs do you have, and how many slots per TM? Also, when was the screenshot taken? Was it after the job was fully initiated? It seems you only need 1k+ network buffers. Thank you~ Xintong Song On Fri, Mar 20, 202
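For sizing this option, the Flink documentation gives a rough starting heuristic of roughly slots-per-TM squared, times the number of TMs, times 4. A back-of-the-envelope sketch, with cluster numbers that are purely hypothetical; the actual requirement depends on the shape of the job graph:

```java
// Rough sizing sketch for taskmanager.network.numberOfBuffers using the
// heuristic from the Flink docs: slotsPerTM^2 * numTMs * 4.
// Treat the result as a starting point, not a definitive value.
public class NetworkBuffersEstimate {
    static long estimate(int slotsPerTaskManager, int numTaskManagers) {
        return (long) slotsPerTaskManager * slotsPerTaskManager * numTaskManagers * 4;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 4 slots per TM, 10 TMs.
        System.out.println(estimate(4, 10)); // 640
    }
}
```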