Re: How to advance Kafka offsets manually with checkpointing enabled on Flink (TableAPI)

2021-03-18 Thread Rex Fenley
Thanks for the info! On Thu, Mar 18, 2021 at 7:46 AM Dawid Wysakowicz wrote: > Hi Rex, > > The approach you described is definitely possible in the DataStream API. > You could replace the uid of your Kafka source and start your job with your > checkpoint with the allowNonRestoredState option ena

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-18 Thread Alexey Trenikhun
I Yun, I've changed configuration to use block blobs, however due to another issue [1], I can't make savepoint, I hope eventually job will able to process backlog, then I will take savepoint, re-test and let you know Thanks, Alexey [1] http://apache-flink-user-mailing-list-archive.2336050.n4.n

Re: PyFlink java.io.EOFException at java.io.DataInputStream.readInt

2021-03-18 Thread Yik San Chan
Hi Dian, I am able to reproduce this issue in a much simpler setup. Let me update with the simpler reproducible example shortly. Best, Yik San On Fri, Mar 19, 2021 at 11:28 AM Yik San Chan wrote: > Hi Dian, > > It is a good catch, though after changing to use > flink-sql-connector-kafka_2.11-1

Re: PyFlink java.io.EOFException at java.io.DataInputStream.readInt

2021-03-18 Thread Yik San Chan
Hi Dian, It is a good catch, though after changing to use flink-sql-connector-kafka_2.11-1.12.0.jar I still get exactly the same error. Best, Yik San On Fri, Mar 19, 2021 at 11:02 AM Dian Fu wrote: > > I noticed that you use "flink-sql-connector-kafka_2.12-1.12.0.jar”. Does > the jar files in

Re: PyFlink java.io.EOFException at java.io.DataInputStream.readInt

2021-03-18 Thread Dian Fu
I noticed that you use "flink-sql-connector-kafka_2.12-1.12.0.jar”. Does the jar files in the cluster nodes are also built with Scala 2.12? PyFlink package bundles jar files with Scala 2.11 by default. I’m still not sure if it’s related to this issue. However, I think this is problematic. Could

Re: PyFlink java.io.EOFException at java.io.DataInputStream.readInt

2021-03-18 Thread Yik San Chan
Hi Dian, The PyFlink version is 1.12.0 and the Flink version in the cluster nodes is also 1.12.0 $ which flink /data/apache/flink/flink-1.12.0/bin/flink Best, Yik San On Fri, Mar 19, 2021 at 10:26 AM Dian Fu wrote: > Hi, > > What’s the Flink version in the cluster nodes? It should matches the

Re: PyFlink java.io.EOFException at java.io.DataInputStream.readInt

2021-03-18 Thread Dian Fu
Hi, What’s the Flink version in the cluster nodes? It should matches the PyFlink version. Regards, Dian > 2021年3月18日 下午5:01,Yik San Chan 写道: > > This question is cross-posted on StackOverflow > https://stackoverflow.com/questions/66687797/pyflink-java-io-eofexception-at-java-io-datainputstre

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-18 Thread Dian Fu
Does the job runs in detached mode or attached mode? Could you share some code snippets and the job submission command if possible? Regards, Dian > 2021年3月18日 下午8:17,Robert Cullen 写道: > > Dian, > > Thanks for your reply. Yes, I would submit the same job in kubernetes > session mode. Someti

Re: Python API + Unit Testing

2021-03-18 Thread Dian Fu
Hi, Do you mean how to run Python unit test? If so, you could refer to [1] for more details. Regards, Dian [1] https://cwiki.apache.org/confluence/display/FLINK/Setting+up+a+Flink+development+environment > 2021年3月18日 下午10:46,Kevin Lam 写道: > > Hi all, > > I noticed there isn't much in the w

Eliminating Shuffling Under FlinkSQL

2021-03-18 Thread Aeden Jameson
It's my understanding that a group by is also a key by under the hood. As a result that will cause a shuffle operation to happen. Our source is a Kafka topic that is keyed so that any give partition contains all the data that is needed for any given consuming TM. Is there a way using FlinkSQL to el

Understanding Max Parallelism

2021-03-18 Thread Aeden Jameson
I'm trying to get my head around the impact of setting max parallelism. * Does max parallelism primarily serve as a reservation for future increases to parallelism? The reservation being the ability to restore from checkpoints and savepoints after increases to parallelism. * Does it serve as a ru

Flink SQL : Interval Outer/Left Join not working as expected

2021-03-18 Thread Aneesha Kaushal
Hi, I am doing a simple POC using Flink SQL and I am facing some issues with Interval Join. *Use Case*: I have two Kafka streams and using Flink SQL interval join I want to remove rows from* stream 1*(abandoned_user_visits) that are present in *stream 2*(orders) within some time interval. *Data:

Re: Production Readiness of File Source

2021-03-18 Thread Dawid Wysakowicz
Hi, As for the issue of production readiness of the File Source(and other components) I'd recommend having a look at the PR, which is close to being merged where we express our opinion how we see certain components: https://github.com/apache/flink-web/pull/426 I am also cc'ing Stephan who wrote t

Re: ClassCastException after upgrading Flink application to 1.11.2

2021-03-18 Thread Dawid Wysakowicz
Could you share a full stacktrace with us? Could you check the stack trace also in the task managers logs? As a side note, make sure you are using the same version of all Flink dependencies. Best, Dawid On 17/03/2021 06:26, soumoks wrote: > Hi, > > We have upgraded an application originally wri

Python API + Unit Testing

2021-03-18 Thread Kevin Lam
Hi all, I noticed there isn't much in the way of testing discussed in the Python API docs for Flink. Does the community have any best-practices or recommendations on how testing should be done with PyFlink? Thanks!

Re: How to advance Kafka offsets manually with checkpointing enabled on Flink (TableAPI)

2021-03-18 Thread Dawid Wysakowicz
Hi Rex, The approach you described is definitely possible in the DataStream API. You could replace the uid of your Kafka source and start your job with your checkpoint with the allowNonRestoredState option enabled[1]. I am afraid though it is not possible to change the uid in Table API/SQL Anothe

Re: custom metrics within a Trigger

2021-03-18 Thread Dawid Wysakowicz
Do you mind sharing the code how do you register your metrics with the TriggerContext? It could help us identify where does name collisions come from. As far as I am aware it should be fine to use the TriggerContext for registering metrics. Best, Dawid On 16/03/2021 17:35, Aleksander Sumowski wr

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-18 Thread Robert Cullen
Dian, Thanks for your reply. Yes, I would submit the same job in kubernetes session mode. Sometimes the job would succeed but successive tries would fail. No stack trace, the job would never return a job id: In this case I redeployed the cluster and the job completed ... and multiple tries were

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-18 Thread Yun Tang
Hi Alexey, Flink would only write once for checkpointed files. Could you try to write checkpointed files as block blob format and see whether the problem still existed? Best Yun Tang From: Alexey Trenikhun Sent: Thursday, March 18, 2021 13:54 To: Yun Tang ; Tzu

Approaches to customize the parallelism in SQL generated operators

2021-03-18 Thread eef hhj
Hi team, Currently the SQL generated operator has all the same parallelism by default, and we faced a issue that the in the case of multiple join, the operator at later stage faces larger computation so that the overall pipeline is back-presured and it causes checkpoint fail(expired) occasionaly.

Parameter to config read frequency in Kafka SQL connector

2021-03-18 Thread eef hhj
Hi team, We are in a situatoin that we want to reduce the read frequency of Kafka SQL connector. I did some investigation on the properties of Kafka client, while it seems it does not have such options. Athough I found the batch size config('properties.max.partition.fetch.bytes') among the config

Flink minimum resource recommendation on k8s cluster

2021-03-18 Thread Amit Bhatia
Hi, Is there any minimum resource ( CPU & Memory) recommendation to start flink jobmanager and taskmanager pods on k8s cluster. Regards, Amit Bhatia

PyFlink java.io.EOFException at java.io.DataInputStream.readInt

2021-03-18 Thread Yik San Chan
This question is cross-posted on StackOverflow https://stackoverflow.com/questions/66687797/pyflink-java-io-eofexception-at-java-io-datainputstream-readint I have a PyFlink job that reads from Kafka source, transform, and write to Kafka sink. This is a `tree` view of my working directory. ``` > t

Re: Application cluster - Best Practice

2021-03-18 Thread Till Rohrmann
Hi Tamir, one big consideration with making things public is that the community cannot change it easily because users expect that these interfaces are stable. This translates effectively to higher maintenance costs for the community. Cheers, Till On Wed, Mar 17, 2021 at 3:13 PM Tamir Sagi wrote

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-18 Thread Dian Fu
Hi Robert, 1) Do you mean that when submitting the same job multiple times and it succeed sometimes and hangs sometimes or it only hangs for some specific job? 2) Which deployment mode do you use? 3) Is it possible to dump the stack trace? It would help us understanding what’s happening. Thank

inputFloatingBuffersUsage=0?

2021-03-18 Thread Alexey Trenikhun
Hello, While trying to investigate back pressure using hints from [1], I've noticed that flink_taskmanager_job_task_buffers_inPoolUsage and flink_taskmanager_job_task_buffers_inputFloatingBuffersUsage are always 0, which looks suspicious, are these metrics still populated ? Thanks, Alexey [1]