How to handle exceptions in a Kafka sink?

2018-09-13 Thread HarshithBolar
I have a Flink job that writes data into Kafka. The Kafka topic has its maximum message size set to 5 MB, so if I try to write any record larger than 5 MB, it throws the following exception and brings the job down.
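A minimal workaround sketch, assuming a DataStream of String records and the Flink 1.x Kafka 0.11 producer; the source, topic name, and broker address are placeholders. Records larger than the topic's limit are filtered out before they reach the sink, so the producer never trips the broker's size check:

import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;

public class FilterOversizedRecords {

    // Broker-side limit on the topic (5 MB).
    private static final int MAX_BYTES = 5 * 1024 * 1024;

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> input = env.socketTextStream("localhost", 9999); // placeholder source

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder brokers

        // Drop records whose serialized size exceeds the topic limit so the
        // producer never hits a RecordTooLargeException.
        input.filter(r -> r.getBytes(StandardCharsets.UTF_8).length <= MAX_BYTES)
             .addSink(new FlinkKafkaProducer011<>("my-topic", new SimpleStringSchema(), props));

        env.execute("filter-oversized-records");
    }
}

The filter drops oversized data silently; in production you would more likely route such records to a side output or a dead-letter topic for inspection.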

Re: WriteTimeoutException in Cassandra sink kills the job

2018-09-11 Thread HarshithBolar
Have you configured checkpointing in your job? If enabled, the job should revert to the last completed checkpoint after a failure and reprocess the failed record.
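For reference, a minimal sketch of enabling that inside the job's main method; the 10-second interval and the restart counts are arbitrary values, not recommendations:

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Snapshot operator state every 10 seconds with exactly-once guarantees.
env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);

// On failure, restart up to 3 times with a 10-second delay; each attempt
// resumes from the last completed checkpoint, so the failed record is
// reprocessed rather than lost.
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.seconds(10)));

Note that if the same record deterministically triggers the WriteTimeoutException on every attempt, the job will cycle through its restart budget, so the underlying timeout still needs addressing.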

Does Flink support keyed watermarks? If not, is there any plan of implementing it in future versions? What are my alternatives?

2018-09-05 Thread HarshithBolar
We collect driving data from thousands of users; each vehicle is associated with an IMEI (a unique device code). The device installed in these vehicles emits GPS points at 5-second intervals. My requirement is to assemble all the GPS points that belong to a single trip and construct a Trip object, for a given…
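Flink's watermarks are tracked per operator/partition rather than per key, so keyed watermarks are not supported. One common alternative for this trip-assembly use case is to key the stream by IMEI and use event-time session windows, closing a trip after a gap with no GPS points. A sketch, where GpsPoint, Trip, and TripAssembler are hypothetical types and the 15-minute gap is an arbitrary choice:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// Assumes gpsPoints is the input stream and that timestamps and (global)
// watermarks are assigned upstream, e.g. with a
// BoundedOutOfOrdernessTimestampExtractor over the GPS event time.
DataStream<Trip> trips = gpsPoints
        .keyBy(point -> point.getImei())
        .window(EventTimeSessionWindows.withGap(Time.minutes(15)))
        .apply(new TripAssembler()); // WindowFunction folding GPS points into a Trip

Because the watermark is still global, a device that lags far behind the others can have its points treated as late; allowedLateness(...) on the window can absorb some of that skew.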

What are the general reasons for a Flink Task Manager to crash? How to troubleshoot?

2018-08-30 Thread HarshithBolar
We're running a five-node Flink cluster with two Job Managers and three Task Managers. Of late, we're facing this issue where once every day or so, all three Task Managers get killed, dropping the number of available task slots to 0 and causing all the jobs running on that cluster to fail. The only r…

How to pass a dynamic path while writing to files using writeAsText(path)?

2018-08-20 Thread HarshithBolar
Let's say I have a stream with elements of type `String`. I want to write each element in the stream to a separate file in some folder. I'm using the following setup: > filteredStream.writeAsText(path).setParallelism(1); How do I make this path variable? I even tried adding `System.nanoTime()`…
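One option from this era of Flink, sketched below, is the BucketingSink from flink-connector-filesystem, which picks an output subdirectory per record via a Bucketer; the base path and the time-based bucket format are placeholders, and a custom Bucketer could derive the folder from the record itself instead:

import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

BucketingSink<String> sink = new BucketingSink<>("/base/output/path"); // placeholder base path

// Writes records into /base/output/path/yyyy-MM-dd--HH/part-... files,
// rolling over to a new bucket directory as time advances.
sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));

filteredStream.addSink(sink).setParallelism(1);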

Re: How to read from Cassandra using Apache Flink?

2018-06-06 Thread HarshithBolar
I figured out a way to solve this by writing my own code, but would love to know if there are better, more efficient solutions. Here's the answer: https://stackoverflow.com/questions/50697296/how-to-read-from-cassandra-using-apache-flink/50721953#50721953 @Chesnay I've been wondering about this.

How to read from Cassandra using Apache Flink?

2018-06-05 Thread HarshithBolar
My Flink program should do a Cassandra lookup for each input record and, based on the results, do some further processing. But I'm currently stuck at reading data from Cassandra. This is the code snippet I've come up with so far: > ClusterBuilder secureCassandraSinkClusterBuilder = new Clu…
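One workable pattern, sketched below, is a RichMapFunction that opens a Cassandra session once per parallel task in open(), queries it for every record, and closes it in close(); the contact point, keyspace, table, and column names are all placeholders:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class CassandraLookup extends RichMapFunction<String, String> {

    private transient Cluster cluster;
    private transient Session session;

    @Override
    public void open(Configuration parameters) {
        // One session per parallel task, reused across all records.
        cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // placeholder contact point
                .build();
        session = cluster.connect("my_keyspace");
    }

    @Override
    public String map(String id) {
        Row row = session.execute(
                "SELECT value FROM my_table WHERE id = ?", id).one();
        return row == null ? null : row.getString("value");
    }

    @Override
    public void close() {
        if (session != null) session.close();
        if (cluster != null) cluster.close();
    }
}

A synchronous lookup per record caps throughput at the Cassandra round-trip time; Flink's AsyncDataStream combined with the driver's executeAsync is the usual next step once that becomes a bottleneck.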

Re: Job execution fails when parallelism is increased beyond 1

2018-05-29 Thread HarshithBolar
Just a heads up: I haven't found the root cause of this issue yet, but restarting all the nodes seems to have resolved it.

Re: Job execution fails when parallelism is increased beyond 1

2018-05-28 Thread HarshithBolar
I'm using Flink 1.4.2.

Job execution fails when parallelism is increased beyond 1

2018-05-28 Thread HarshithBolar
I'm submitting a Flink job via the Flink dashboard to a cluster that has three Task Managers. When I set `Parallelism` to 1 (the default), everything runs as expected. But when I increase `Parallelism` to anything more than 1, the job fails with the exception java.io.FileNotFoundExcepti…
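A FileNotFoundException that appears only at parallelism > 1 often means a subtask was scheduled onto a Task Manager that doesn't have the file locally; with parallelism 1, the single subtask may happen to land on the node that does. If the job reads a local file, the usual fixes are to put it on a filesystem every node can reach, or to ship it with Flink's distributed cache. A sketch of the latter using the DataSet API, where the HDFS path and cache name are placeholders:

import java.io.File;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

public class DistributedCacheExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Ship the file to every Task Manager instead of assuming a local path.
        env.registerCachedFile("hdfs:///shared/lookup.txt", "lookupFile");

        env.fromElements("a", "b")
           .map(new RichMapFunction<String, String>() {
               private transient File lookup;

               @Override
               public void open(Configuration parameters) {
                   // Resolves to a local copy on whichever node runs this subtask.
                   lookup = getRuntimeContext().getDistributedCache().getFile("lookupFile");
               }

               @Override
               public String map(String value) {
                   return value + " -> " + lookup.getAbsolutePath();
               }
           })
           .print();
    }
}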