Big Data Contract Roles ?

2022-09-14 Thread sri hari kali charan Tummala
Hi Flink Users/ Spark Users, Is anyone hiring contract corp to corp big Data spark scala or Flink scala roles ? Thanks Sri

Re: A classloading question about submitting Flink jobs on Yarn-Perjob Mode

2022-09-14 Thread yu'an huang
I guess there may be a class conflict between your user jar and Flink lib. For the error: java.lang.NoClassDefFoundError, it may caused Exception happening when Initializing a class. I suggest you set the log level to DEBUG and send the client log here. Let’s look whether there are any new findi

Re: DataStream and DataStreamSource

2022-09-14 Thread Noel OConnor
Awesome, thanks for the info! much appreciated. On Wed, Sep 14, 2022 at 5:04 PM Jing Ge wrote: > > Hi, > > Welcome to the Flink community! > > A DataStreamSource is a DataStream. It is normally used as the starting point > of a DataStream. All related methods in StreamExecutionEnvironment that

Re: DataStream and DataStreamSource

2022-09-14 Thread Jing Ge
Hi, Welcome to the Flink community! A DataStreamSource is a DataStream. It is normally used as the starting point of a DataStream. All related methods in StreamExecutionEnvironment that create a DataStream return actually a DataStreamSource, because it is where a DataStream starts. Commonly, yo

Re: ExecutionMode in ExecutionConfig

2022-09-14 Thread zhanghao.chen
It's added in Flink 1.14: https://nightlies.apache.org/flink/flink-docs-master/zh/release-notes/flink-1.14/#expose-a-consistent-globaldataexchangemode. Not sure if there's a way to change this in 1.13 Best, Zhanghao Chen From: Hailu, Andreas Sent: Wednesday, Sep

RE: ExecutionMode in ExecutionConfig

2022-09-14 Thread Hailu, Andreas
I can give this a try. Do you know which Flink version does this feature become available in? ah From: zhanghao.c...@outlook.com Sent: Wednesday, September 14, 2022 11:10 AM To: Hailu, Andreas [Engineering] ; user@flink.apache.org Subject: Re: ExecutionMode in ExecutionConfig Could you try se

Re: ExecutionMode in ExecutionConfig

2022-09-14 Thread zhanghao.chen
Could you try setting ”execution.batch-shuffle-mode‘=‘ALL_EXCHANGES_PIPELINED’? Looks like the ExecutionMode in ExecutionConfig does not work for DataStream APIs. The default shuffling behavior for a DataStream API in batch mode is 'ALL_EXCHANGES_BLOCKING' where upstream and downstream tasks ru

Re: Can flink dynamically detect high load and increase the job parallelism automatically?

2022-09-14 Thread zhanghao.chen
Flink will not try to help you do autoscaling and the parallelism is fixed unless you enable reactive mode/adaptive scheduler. Max parallelism just means the maximum parallelism with which you can rescale your job without losing states. The max parallelism limit is related to the Flink key group

RE: ExecutionMode in ExecutionConfig

2022-09-14 Thread Hailu, Andreas
Hi Zhanghao, That seems different than what I'm referencing and one of my points of confusion - the documents refer to ExecutionMode as BATCH and STREAMING which is different than what the code refers to it as Runtime Mode e.g. env.setRuntimeMode(RuntimeExecutionMode.BATCH); I'm referring to t

DataStream and DataStreamSource

2022-09-14 Thread Noel OConnor
Hi, I'm new to flink and I'm trying to integrate it with apache pulsar. I've gone through the demos and I get how they work but one aspect that I can't figure out is what's the difference between a DataStream and a DataStreamSource. When would you use one over the other? cheers Noel

RE: Can flink dynamically detect high load and increase the job parallelism automatically?

2022-09-14 Thread Erez Yaakov
Maybe 'automatically parallelism change' is a not accurate term for describing what I mean, so let me re-phrase it: Assuming I'm submitting my job with parallelism = 2 and max parallelism = 128 (default). My expectation is that any instance of the job will actually have several instances at ru

Extending RMQSource

2022-09-14 Thread Nadia Mostafa
Hello, I have extended the RMQSource class and overrode setupQueue method to declare a queue and bind it to an exchange. Now, when I stop the flink job the queue is not deleted. I tried to override cancel() and close() to delete the queue but I found they are not called on stopping the job. Is t