Aljoscha is right. Multiple consumers in the same consumer group cannot
read from the same partition.
You'll need to create a Kafka topic with more partitions to get higher
parallelism.
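
To illustrate why the partition count caps the read parallelism, here is a minimal, self-contained sketch. It does not use the Flink or Kafka APIs; the class name, the method names, and the round-robin assignment (partition p goes to source subtask p % parallelism) are assumptions made for illustration, and the actual connector's assignment logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not connector code: each partition is owned by exactly one subtask.
final class PartitionAssignmentSketch {

    // Assumed round-robin scheme: partition p is read by subtask p % parallelism.
    static List<List<Integer>> assign(int numPartitions, int parallelism) {
        List<List<Integer>> bySubtask = new ArrayList<>();
        for (int s = 0; s < parallelism; s++) {
            bySubtask.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            bySubtask.get(p % parallelism).add(p);
        }
        return bySubtask;
    }

    // Counts the source subtasks that own at least one partition.
    static long activeSubtasks(int numPartitions, int parallelism) {
        return assign(numPartitions, parallelism).stream()
                .filter(partitions -> !partitions.isEmpty())
                .count();
    }

    public static void main(String[] args) {
        // 2 partitions, source parallelism 56: only 2 subtasks have anything to read.
        System.out.println(activeSubtasks(2, 56));
        // 56 partitions, source parallelism 56: every subtask reads one partition.
        System.out.println(activeSubtasks(56, 56));
    }
}
```

Under this model, raising the source parallelism beyond the number of partitions only adds idle subtasks, which is why more partitions are needed.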
On Wed, Jul 6, 2016 at 10:45 AM, Aljoscha Krettek
wrote:
> Hi,
> unfortunately the reading of one Kafka part
Hi,
unfortunately the reading of one Kafka partition cannot be split among
several parallel instances of the Kafka source. So if you have only 2
partitions your reading parallelism is limited to that. You are right that
this can lead to bad performance and underutilization. The only solution I
see
Hi,
The rebalance does distribute the data to all the task managers, and now
all TMs are being utilized. You were right: I am seeing two
boxes (tasks) now.
I have one question regarding the task slots:
For the source the parallelism is set to 56, now when we see on the UI and
click on source
Thanks a lot, guys, this helps me understand it better.
Regards,
Vinay Patil
On Mon, Jul 4, 2016 at 8:43 PM, Stephan Ewen wrote:
> Just to be sure: Each *subtask* has one thread - so for each task, there
> are as many parallel threads (distributed across nodes) as your parallelism
> indicates.
>
> F
Just to be sure: Each *subtask* has one thread - so for each task, there
are as many parallel threads (distributed across nodes) as your parallelism
indicates.
For most cases, having long chains and then a higher parallelism is a good
choice.
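
As a rough illustration of the thread arithmetic, the following self-contained sketch counts task threads under the one-thread-per-subtask rule. The class and method names are hypothetical, the four-operator pipeline and the parallelism of 56 are example values, and auxiliary threads of the runtime are ignored.

```java
// Hypothetical arithmetic sketch: one thread per subtask, so a job's task
// threads add up to the sum of the parallelism of each task.
final class ChainingThreadCount {

    // Each array entry is the parallelism of one task (one chain of operators).
    static int totalThreads(int[] taskParallelisms) {
        int total = 0;
        for (int parallelism : taskParallelisms) {
            total += parallelism;
        }
        return total;
    }

    public static void main(String[] args) {
        // Four operators chained into a single task at parallelism 56.
        int[] chained = {56};
        // The same four operators with chaining disabled: four separate tasks.
        int[] unchained = {56, 56, 56, 56};
        System.out.println(totalThreads(chained));   // 56
        System.out.println(totalThreads(unchained)); // 224
    }
}
```

The chained job does the same work with a quarter of the threads and without handing records between threads inside the chain.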
Cases where individual functions (MapFunction, etc) do
Hi,
chaining is useful to minimize communication overhead. But in your case you
might benefit more from having good cluster utilization. There seems to be
a tradeoff. Maybe you can run some easy tests to see how it behaves for you.
Cheers,
Aljoscha
On Mon, 4 Jul 2016 at 16:28 Vinay Patil wrote:
Thanks,
so is operator chaining useful in terms of utilizing the resources, or
should we keep chaining to a minimum (say, 3-4 operators) and disable it
otherwise?
I am worried because I am seeing all the operators in one box on the Flink UI.
Regards,
Vinay Patil
On Mon, Jul 4, 2016 at 7:13 PM, Aljo
Hi,
this is true, yes. If the number of Kafka partitions is less than the
parallelism then some of the sources might not be utilized. If you insert a
rebalance after the sources you should be able to utilize all the
downstream operations equally.
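
To make the effect concrete, here is a small self-contained simulation. It does not use the Flink API (in a real job this corresponds to calling rebalance() on the stream after the source); the class name, method names, and the numbers (8 downstream subtasks, 2 active sources, 4000 records each) are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical model of how records reach downstream subtasks.
final class RebalanceSketch {

    // Forward connection: source subtask i feeds only downstream subtask i.
    static int[] forward(int[] recordsPerSource) {
        return recordsPerSource.clone();
    }

    // Round-robin redistribution: each source cycles through all downstream subtasks.
    static int[] roundRobin(int[] recordsPerSource, int downstreamParallelism) {
        int[] received = new int[downstreamParallelism];
        for (int source = 0; source < recordsPerSource.length; source++) {
            int next = source % downstreamParallelism; // stagger the starting channel
            for (int r = 0; r < recordsPerSource[source]; r++) {
                received[next]++;
                next = (next + 1) % downstreamParallelism;
            }
        }
        return received;
    }

    // Counts downstream subtasks that received at least one record.
    static long busySubtasks(int[] received) {
        return Arrays.stream(received).filter(n -> n > 0).count();
    }

    public static void main(String[] args) {
        int[] recordsPerSource = new int[8]; // 8 source subtasks, only 2 own a partition
        recordsPerSource[0] = 4000;
        recordsPerSource[1] = 4000;
        System.out.println(busySubtasks(forward(recordsPerSource)));        // 2
        System.out.println(busySubtasks(roundRobin(recordsPerSource, 8))); // 8
    }
}
```

The redistribution costs a network shuffle and breaks the chain after the source, so it trades some communication overhead for even utilization.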
Cheers,
Aljoscha
On Mon, 4 Jul 2016 at 11:13 Vinay
Just an update: the task will be executed by multiple threads. My bad, I
asked it the wrong way.
Can you please clarify the other points?
Out of 8 nodes, only 3 are being utilized, reading the data from
Kafka. Does that mean the number of Kafka partitions is set too low?
What if we use rescal
Hi,
According to the documentation:
"*Each task is executed by one thread.* Chaining operators together into
tasks is a useful optimization: it reduces the overhead of thread-to-thread
handover and buffering, and increases overall throughput while decreasing
latency"
So does it mean that the