Hello Fabian,

Thanks a lot for the explanation. When the operators (including source / sink) 
are chained, what is the method of communication between them?

We use Kafka as our data source, and I was interested in learning how records 
get from Kafka to the next operator, say flatMap. So I studied the Flink 
codebase to some extent and looked into our log file as well.

I see a while (true) loop in Task.java’s run() method (which seems to call 
StreamTask’s invoke(), which I suppose eventually executes the code for the 
operator), and I also see a while (running) loop in Kafka09Fetcher.java’s run() 
method. When Kafka09Fetcher’s run() receives a record from the lower layer, it 
calls the emitRecord() method, which I suppose writes into a ByteBuffer? How 
does this ByteBuffer get shared with the next operator (e.g., flatMap)? It is 
difficult for me to imagine the two loops running at the same time when the 
fetcher and flatMap are chained together.

If my understanding above is incorrect, please help me correct it. You could 
just point me to the chain of Java method calls.

Sorry for sounding confused. I have only very little knowledge so far from my 
study. I would appreciate your explanation in this regard.

Thanks,
Buvana

From: Fabian Hueske [mailto:fhue...@gmail.com]
Sent: Monday, September 26, 2016 2:41 PM
To: user@flink.apache.org
Subject: Re: TaskManager & task slots

Hi Buvana,
A TaskManager runs as a single JVM process. A TaskManager provides a certain 
number of processing slots. Slots do not guard CPU time, IO, or JVM memory. At 
the moment, they only isolate managed memory, which is only used for batch 
processing. For streaming applications, their only purpose is to limit the 
number of parallel threads that can be executed by a TaskManager.
In each processing slot, a full slice of a program can be executed, i.e., one 
parallel subtask of each operator of a program. Given a simple program (source 
-> map -> sink), a slot can process one subtask of the source, the mapper, and 
the sink (it is possible to split a program so that it is executed in more 
slots). Each operator can be executed as a separate thread. However, in many 
situations, operators are chained within the same thread to improve performance 
(again, it is possible to disallow chaining).
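
To make the chaining idea concrete, here is a minimal plain-Java sketch. It 
does not use Flink's actual classes: Collector, SplitWords, ListSink and 
ChainSketch are hypothetical names invented for illustration. It assumes that a 
chained operator hands each record to the next operator through a direct method 
call, so only the source has a loop, and the flatMap and sink code runs inside 
that loop's call stack on the same thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChainSketch {

    /** Hypothetical stand-in for the receiving end of an operator. */
    interface Collector<T> {
        void collect(T record);
    }

    /** Hypothetical flatMap: splits a line into words and forwards each word. */
    static final class SplitWords implements Collector<String> {
        private final Collector<String> downstream;

        SplitWords(Collector<String> downstream) {
            this.downstream = downstream;
        }

        @Override
        public void collect(String line) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    downstream.collect(word); // direct call, no buffer in between
                }
            }
        }
    }

    /** Hypothetical sink: keeps each record and the name of the delivering thread. */
    static final class ListSink implements Collector<String> {
        final List<String> records = new ArrayList<>();
        final List<String> threads = new ArrayList<>();

        @Override
        public void collect(String record) {
            records.add(record);
            threads.add(Thread.currentThread().getName());
        }
    }

    /** The source loop is the only loop; it drives the whole chain. */
    static ListSink run(List<String> input) {
        ListSink sink = new ListSink();
        Collector<String> chain = new SplitWords(sink);
        for (String line : input) { // stands in for the source's fetch loop
            chain.collect(line);    // returns after flatMap and sink have run
        }
        return sink;
    }

    public static void main(String[] args) {
        ListSink sink = run(Arrays.asList("to be", "or not"));
        System.out.println(sink.records);
        System.out.println(sink.threads);
    }
}
```

In this sketch every record reaches the sink on the thread that runs the source 
loop, which is the point of chaining: no hand-over between threads is needed. 
When operators are not chained, each would run its own loop in its own thread 
and records would have to be passed between them some other way.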
Let me know if you have more questions,
Fabian

2016-09-26 20:31 GMT+02:00 Ramanan, Buvana (Nokia - US) 
<buvana.rama...@nokia-bell-labs.com>:
Hello,

I would like to understand the following better:

https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/config.html#configuring-taskmanager-processing-slots

Fundamental question: what is the notion of a task slot? Does it correspond to 
one JVM, or does the TaskManager itself correspond to one JVM?
Example-1 shows a parallelism of 1 and has 3 operators: flatMap, Reduce & 
Sink. Here comes the question: are these 3 operators running as separate 
threads within a JVM?

Sorry for the naïve questions. I studied the following links and could not get 
a clear answer:
https://ci.apache.org/projects/flink/flink-docs-release-1.1/internals/general_arch.html
https://ci.apache.org/projects/flink/flink-docs-release-1.1/internals/job_scheduling.html

Are there more documents under Flink’s wiki site / elsewhere? Please point me 
to more info on the architecture.

thank you,
regards,
Buvana

