Re: Flink Deserialisation JSON to Java;

2020-05-04 Thread Jark Wu
Hi Aissa, You can easily do this by using Flink SQL, you can define a kafka table using Flink DDL: CREATE TABLE sensor_logs ( `date` STRING, `main` ROW< `ph` DOUBLE, `whc` DOUBLE, `temperature` DOUBLE, `humidity` DOUBLE>, `id` BIGINT, `coord` ROW<

Re: Restore from savepoint with Iterations

2020-05-04 Thread Ken Krugler
Hi Ashish, The workaround we did was to throttle data flowing in the iteration (in code), though not sure if that’s possible for your situation. You could remove the iteration by writing to a Kafka topic at the end of the part of your workflow that is currently an iteration, and then consuming

Re: Restore from savepoint with Iterations

2020-05-04 Thread Ashish Pokharel
Hi Ken, Thanks for the quick response! I came across FLIP-15 on my next google search after I sent email :) It DEFINITELY looks that way. As I was watching logs and nature of how job gets stuck it does look like buffer is blocked. But FLIP-15 has not moved further though. So there are no worka

Re: Restore from savepoint with Iterations

2020-05-04 Thread Ken Krugler
Hi Ashish, Wondering if you’re running into the gridlock problem I mention on slide #25 here: https://www.slideshare.net/FlinkForward/flink-forward-san-francisco-2018-ken-krugler-building-a-scalable-focused-web-crawler-with-flink

Re: Restore from savepoint with Iterations

2020-05-04 Thread Ashish Pokharel
Could this be FLIP-15 related as well then? > On May 4, 2020, at 9:41 PM, Ashish Pokharel wrote: > > Hi all, > > Hope everyone is doing well! > > I am running into what seems like a deadlock (application stalled) situation > with a Flink streaming job upon restore from savepoint. Job has a sl

Restore from savepoint with Iterations

2020-05-04 Thread Ashish Pokharel
Hi all, Hope everyone is doing well! I am running into what seems like a deadlock (application stalled) situation with a Flink streaming job upon restore from savepoint. Job has a slowly moving stream (S1) that needs to be “stateful” and a continuous stream (S2) which is “joined” with slow mov

Flink Deserialisation JSON to Java;

2020-05-04 Thread Aissa Elaffani
Hello, Please can you share with me, some demos or examples of deserialization with flink. I need to consume some kafka message produced by sensors in JSON format. here is my JSON message : {"date": "2018-05-31 15:10", "main": {"ph": 5.0, "whc": 60.0, "temperature": 9.5, "humidity": 96}, "id": 2582

Re: multiple joins in one job

2020-05-04 Thread lec ssmi
I mean using pure sql statement to make it . Can it be possible? Fabian Hueske 于2020年5月4日周一 下午4:04写道: > Hi, > > If the interval join emits the time attributes of both its inputs, you can > use either of them as a time attribute in a following operator because the > join ensures that the watermar

Writing _SUCCESS Files (Streaming and Batch)

2020-05-04 Thread Peter Groesbeck
I am replacing an M/R job with a Streaming job using the StreamingFileSink and there is a requirement to generate an empty _SUCCESS file like the old Hadoop job. I have to implement a similar Batch job to read from backup files in case of outages or downtime. The Batch job question was answered he

Re: Support for Flink in EMR 6.0

2020-05-04 Thread KristoffSC
Actually it seems there is already ongoing discussion about installing Flink 1.10 on EMR http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/upgrade-flink-from-1-9-1-to-1-10-0-on-EMR-td34114.html -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: flink-s3-fs-hadoop retry configuration

2020-05-04 Thread Jeff Henrikson
> 2) How can I tell if flink-s3-fs-hadoop is actually managing to pick up > the hadoop configuration I have provided, as opposed to some separate > default configuration? I'm reading the docs and source of flink-fs-hadoop-shaded. I see that core-default-shaded.xml has fs.s3a.connection.maximum

Re: Benchmark for Stateful Functions

2020-05-04 Thread Omid Bakhshandeh
Hi Tzu-Li, Thanks for the answer. I am looking into the latency of a loopback given #number of functions in the system. More precisely the loopback time on *f1 *calls a random function *f_i* and *f_i* change its state and reply back. Also, it would be nice to see the latency on ingress and egres

Re: java.lang.IllegalStateException: The RPC connection is already closed

2020-05-04 Thread Manish G
https://issues.apache.org/jira/browse/FLINK-16373 On Mon, May 4, 2020 at 9:37 PM Manish G wrote: > I found another similar issue: > > > On Mon, May 4, 2020 at 9:28 PM Steven Wu wrote: > >> Manish, might be related to this bug, which is fixed in 1.10.1. >> >> >> https://issues.apache.org/jira/b

Re: java.lang.IllegalStateException: The RPC connection is already closed

2020-05-04 Thread Manish G
I found another similar issue: On Mon, May 4, 2020 at 9:28 PM Steven Wu wrote: > Manish, might be related to this bug, which is fixed in 1.10.1. > > > https://issues.apache.org/jira/browse/FLINK-14316?focusedCommentId=16946580&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpa

Re: java.lang.IllegalStateException: The RPC connection is already closed

2020-05-04 Thread Steven Wu
Manish, might be related to this bug, which is fixed in 1.10.1. https://issues.apache.org/jira/browse/FLINK-14316?focusedCommentId=16946580&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946580 On Mon, May 4, 2020 at 5:52 AM Manish G wrote: > Hi, > > I have se

Re: Support for Flink in EMR 6.0

2020-05-04 Thread Ori Popowski
Thanks. I also contacted AWS support and they should keep me updated about it. Hoping to see this soon :) On Mon, May 4, 2020 at 4:41 PM Robert Metzger wrote: > Hey Ori, > > thanks a lot for reaching out to the user@ mailing list. This is more a > question for the EMR Team at Amazon than the F

Re: Support for Flink in EMR 6.0

2020-05-04 Thread Robert Metzger
Hey Ori, thanks a lot for reaching out to the user@ mailing list. This is more a question for the EMR Team at Amazon than the Flink community. But it seems that the lack of Hadoop 3 support in Flink [1] seems to be the reason why EMR 6.0 doesn't have Flink support. Flink 1.11 will most likely have

Re: Wait for cancellation event with CEP

2020-05-04 Thread Maxim Parkachov
Hi Till, thank you for very detailed answer, now it is absolutely clear. Regards, Maxim. On Thu, Apr 30, 2020 at 7:19 PM Till Rohrmann wrote: > Hi Maxim, > > I think your problem should be solvable with the CEP library: > > So what we are doing here is to define a pattern forward followed by a

Support for Flink in EMR 6.0

2020-05-04 Thread Ori Popowski
Hi, EMR 6.0.0 has been released [1], and this release ignores Apache Flink (as well as other applications). Are there any plans to add support for Apache Flink for EMR 6.0.0 in the future? Thanks. [1] https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-emr-announces-emr-release-6-with-new

Re: How to list timers registered in timer service?

2020-05-04 Thread Seth Wiesman
Hi Lasse, In the state processor api, KeyedStateReaderFunction#readKey has a parameter called `Context` which you can use to read the registered event time and proc time timers for a key. Best, Seth On Fri, May 1, 2020 at 2:57 AM Lasse Nedergaard < lassenedergaardfl...@gmail.com> wrote: > Hi.

java.lang.IllegalStateException: The RPC connection is already closed

2020-05-04 Thread Manish G
Hi, I have set up flink and kafka locally. When I start my flink program(configured ot read messages from kafka topic), I get error as: 2020-05-04 18:17:58.035 INFO 23516 --- [lt-dispatcher-2] o.a.f.r.taskexecutor.JobLeaderService: Successful registration at job manager akka://flink/user/job

Re: InvalidTypesException: Input mismatch while consuming Kafka messages

2020-05-04 Thread Manish G
Thanks. It worked by introducing a custom DeserializationSchema. On Mon, May 4, 2020 at 3:04 PM Robert Metzger wrote: > Hi, > Can you provide the full stack trace of your exception? > Most likely, the error is caused by this setting: > > properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_C

Autoscaling flink application

2020-05-04 Thread Manish G
Hi, I understand task parallelism in flink, but is it possible to configure dynamic horizontal scaling also. Manish

Re: Any way to increase sort buffer size?

2020-05-04 Thread Stephan Ewen
Posting an update here, because it came up again: Have a look at https://issues.apache.org/jira/browse/FLINK-17192 specifically this comment: > There is a hidden/experimental feature in the sorter to offload large records, but it is not active by default. > > You can try and add "taskmanager.runt

Re: InvalidTypesException: Input mismatch while consuming Kafka messages

2020-05-04 Thread Robert Metzger
Hi, Can you provide the full stack trace of your exception? Most likely, the error is caused by this setting: properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, MyCustomClassDeserializer.class.getName()); You need to use Flink's DeserializationSchema. On Mon, May 4, 2020 at 1

InvalidTypesException: Input mismatch while consuming Kafka messages

2020-05-04 Thread Manish G
I have following code: // Properties properties = new Properties(); properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, MyCustomClassDeserializer.class.getName()); FlinkKafkaConsumer kafkaConsumer = new FlinkKafkaConsumer( "test-kafka=top

Re: multiple joins in one job

2020-05-04 Thread Fabian Hueske
Hi, If the interval join emits the time attributes of both its inputs, you can use either of them as a time attribute in a following operator because the join ensures that the watermark will be aligned with both of them. Best, Fabian Am Mo., 4. Mai 2020 um 00:48 Uhr schrieb lec ssmi : > Thanks