Hi All,
@Xitong thanks a lot for driving the discussion.
I also reviewed the FLIP and it looks quite good to me.
Here are some comments:
- One thing I wanted to discuss is the backwards-compatibility with the
previous user setups. We could list which options we plan to deprecate.
From
I also agree that all the configuration should be calculated outside of the
TaskManager.
So a full configuration should be generated before the TaskManager is started.
Overriding the calculated configurations through -D now seems better.
Best,
Yang
Xintong Song wrote on Mon, Sep 2, 2019 at 11:39 AM:
I just updated the FLIP wiki page [1], with the following changes:
- Network memory uses JVM direct memory, and is accounted for when setting
the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory
configs into TaskExecutors, instead of
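For illustration, a minimal sketch of how `-Dkey=value` dynamic properties could be merged over a base configuration (the class and helper names below are made up for this sketch, not Flink's actual API):

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: merge -Dkey=value dynamic properties over a base configuration. */
public class DynamicConfigSketch {

    /** Parses args of the form -Dkey=value into a map. */
    static Map<String, String> parseDynamicProperties(String[] args) {
        Map<String, String> props = new HashMap<>();
        for (String arg : args) {
            if (arg.startsWith("-D")) {
                int eq = arg.indexOf('=');
                if (eq > 2) {
                    props.put(arg.substring(2, eq), arg.substring(eq + 1));
                }
            }
        }
        return props;
    }

    /** Dynamic properties take precedence over the base configuration. */
    static Map<String, String> merge(Map<String, String> base, Map<String, String> dynamic) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(dynamic);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("taskmanager.memory.network.max", "64mb");

        Map<String, String> dynamic = parseDynamicProperties(
                new String[] {"-Dtaskmanager.memory.network.max=128mb"});

        // The calculated value passed via -D overrides the file-based config.
        System.out.println(merge(base, dynamic).get("taskmanager.memory.network.max"));
    }
}
```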
Yes I'll address the memory reservation functionality in a separate FLIP to
cooperate with FLIP-49 (sorry for being late for the discussion).
Best Regards,
Yu
On Mon, 2 Sep 2019 at 11:14, Xintong Song wrote:
Sorry for the late response.
- Regarding the `TaskExecutorSpecifics` naming, let's discuss the detail in
PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using
dynamic configuration at the moment, given that there are more questions to
be discussed to have a general framework for
What I forgot to add is that we could tackle fully specifying the
configuration in an incremental way, and that the full specification should
be the desired end state.
On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann wrote:
I think our goal should be that the configuration is fully specified when
the process is started. By considering the internal calculation step as one
that rather validates existing values and calculates missing ones, these two
proposals shouldn't even conflict (given determinism).
Since we don't want to ch
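The "validate existing values, calculate missing ones" idea could be sketched like this (illustrative names; it assumes the derivation is deterministic, as discussed above):

```java
/** Sketch: deterministic "validate or derive" of memory sizes (in MB). */
public class MemoryDerivationSketch {

    /**
     * If heapMb is already configured (>= 0), validate that it is consistent
     * with the total; otherwise derive it. Running this inside the TaskManager
     * on an already-full configuration is then a pure validation step.
     */
    static long heapSizeMb(long totalMb, long offHeapMb, long heapMb) {
        long derived = totalMb - offHeapMb;
        if (heapMb >= 0 && heapMb != derived) {
            throw new IllegalArgumentException(
                    "Inconsistent config: heap " + heapMb + "mb, expected " + derived + "mb");
        }
        return derived;
    }

    public static void main(String[] args) {
        // Missing value (-1): derived, e.g. by the external startup utility.
        System.out.println(heapSizeMb(1024, 224, -1));
        // Already configured: the same call merely validates it.
        System.out.println(heapSizeMb(1024, 224, 800));
    }
}
```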
I see. Under the assumption of strict determinism that should work.
The original proposal had this point "don't compute inside the TM, compute
outside and supply a full config", because that sounded more intuitive.
On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann wrote:
My understanding was that before starting the Flink process we call a
utility which calculates these values. I assume that this utility will do
the calculation based on a set of configured values (process memory, flink
memory, network memory etc.). Assuming that these values don't differ from
the v
When computing the values in the JVM process after it has started, how would
you deal with values like max direct memory size, Metaspace size, native
memory reservation (reduced heap size), etc.? All the values that are
parameters to the JVM process and that need to be supplied at process startup?
On Wed, Au
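To make that constraint concrete: flags like `-XX:MaxDirectMemorySize` and `-XX:MaxMetaspaceSize` can only be set on the JVM command line, so any external calculation utility would have to emit them before launch. A hypothetical sketch (helper name invented):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: JVM memory parameters must be fixed on the command line at startup. */
public class JvmArgsSketch {

    static List<String> jvmMemoryArgs(long heapMb, long directMb, long metaspaceMb) {
        List<String> jvmArgs = new ArrayList<>();
        jvmArgs.add("-Xmx" + heapMb + "m");                        // heap cap
        jvmArgs.add("-Xms" + heapMb + "m");                        // pre-allocate heap
        jvmArgs.add("-XX:MaxDirectMemorySize=" + directMb + "m");  // direct memory cap
        jvmArgs.add("-XX:MaxMetaspaceSize=" + metaspaceMb + "m");  // metaspace cap
        return jvmArgs;
    }

    public static void main(String[] args) {
        // These values would come from the external calculation utility.
        System.out.println(String.join(" ", jvmMemoryArgs(800, 200, 96)));
    }
}
```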
Thanks for the clarification. I have some more comments:
- I would actually split the logic for computing the process memory
requirements and for storing the values into two things. E.g., one could name
the former TaskExecutorProcessUtility and the latter
TaskExecutorProcessMemory. But we can discuss thi
Just add my 2 cents.
Using environment variables to override the configuration for different
TaskManagers is better.
We do not need to generate a dedicated flink-conf.yaml for every TaskManager;
a common flink-conf.yaml and different environment variables are enough.
By reducing the distributed cached
One note on the Environment Variables and Configuration discussion.
My understanding is that passed ENV variables are added to the
configuration in the "GlobalConfiguration.loadConfig()" method (or similar).
For all the code inside Flink, it looks like the data was in the config to
start with, jus
Thanks for the comments, Till.
I've also seen your comments on the wiki page, but let's keep the
discussion here.
- Regarding 'TaskExecutorSpecifics', what do you think about naming it
'TaskExecutorResourceSpecifics'?
- Regarding passing memory configurations into task executors, I'm in favor
of d
Hi Xintong,
thanks for addressing the comments and adding a more detailed
implementation plan. I have a couple of comments concerning the
implementation plan:
- The name `TaskExecutorSpecifics` is not really descriptive. Choosing a
different name could help here.
- I'm not sure whether I would pa
Hi everyone,
I just updated the FLIP document on wiki [1], with the following changes.
- Removed open question regarding MemorySegment allocation. As
discussed, we exclude this topic from the scope of this FLIP.
- Updated content about JVM direct memory parameter according to recent
d
@Xintong: Concerning "wait for memory users before task dispose and memory
release": I agree, that's how it should be. Let's try it out.
@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating
direct memory buffer": There seems to be pretty elaborate logic to free
buffers when al
Thanks for the inputs, Jingsong.
Let me try to summarize your points. Please correct me if I'm wrong.
- Memory consumers should always avoid returning memory segments to
memory manager while there are still un-cleaned structures / threads that
may use the memory. Otherwise, it would caus
Quick question for option 1.1, Stephan: Does this variant entail that we
distinguish between native and direct memory for off-heap managed memory? If
this is the case, then it won't be possible for users to run a streaming
job using RocksDB and a batch DataSet job on the same session cluster
unless they
Hi Stephan,
About option 2:
If additional threads are not cleanly shut down before we can exit the task:
in the current case of memory reuse, the task has freed up the memory it
uses. If this memory is used by other tasks while asynchronous threads
of the exited task may still be writing, there will be concurr
My main concern with option 2 (manually releasing memory) is that segfaults
in the JVM set off all sorts of alarms on the users' end. So we need to
guarantee that this never happens.
The trickiness is in tasks that use data structures / algorithms with
additional threads, like hash table spill/read and
Thanks for the comments, Stephan. Summarizing things this way really makes
them easier to understand.
I'm in favor of option 2, at least for the moment. I think it is not that
difficult to keep it segfault-safe for the memory manager, as long as we
always de-allocate the memory segment when it is release
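One common way to make manual release segfault-safe, in the spirit of this discussion, is reference counting: the segment's native memory is only freed once the task and all helper threads have released it. A hypothetical sketch (not Flink's actual MemorySegment API):

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: guard manual de-allocation with a reference count so a segment
 *  is only freed once every user (including helper threads) has released it. */
public class RefCountedSegmentSketch {

    private final AtomicInteger refCount = new AtomicInteger(1); // owner's ref
    private volatile boolean freed = false;

    /** Returns false if the segment was already freed and must not be used. */
    boolean retain() {
        int c;
        do {
            c = refCount.get();
            if (c == 0) {
                return false;
            }
        } while (!refCount.compareAndSet(c, c + 1));
        return true;
    }

    void release() {
        if (refCount.decrementAndGet() == 0) {
            freed = true; // the real implementation would free native memory here
        }
    }

    boolean isFreed() {
        return freed;
    }

    public static void main(String[] args) {
        RefCountedSegmentSketch seg = new RefCountedSegmentSketch();
        seg.retain();   // e.g. a spill thread still holds the segment
        seg.release();  // task exit releases its own reference
        System.out.println(seg.isFreed()); // spill thread still active: not freed
        seg.release();  // spill thread done
        System.out.println(seg.isFreed()); // now safe to free
    }
}
```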
About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a
bit differently:
We have the following two options:
(1) We let MemorySegments be de-allocated by the GC. That makes it segfault
safe. But then we need a way to trigger GC in case de-allocation and
re-allocation of a bunch
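For option (1), the "trigger GC on failed re-allocation" fallback could look roughly like the retry loop below (similar in spirit to what the JDK itself does when reserving direct memory; the helper here is illustrative, not a proposal for Flink's actual code):

```java
import java.nio.ByteBuffer;

/** Sketch: retry direct allocation after requesting a GC, the kind of
 *  fallback option (1) would rely on when segments await collection. */
public class GcRetryAllocSketch {

    static ByteBuffer allocateDirectWithRetry(int bytes, int retries) {
        for (int attempt = 0; ; attempt++) {
            try {
                return ByteBuffer.allocateDirect(bytes);
            } catch (OutOfMemoryError e) {
                if (attempt >= retries) {
                    throw e;
                }
                System.gc(); // hint the JVM to collect unreachable buffers
                try {
                    Thread.sleep(100L); // give reference processing a chance
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateDirectWithRetry(1024, 3);
        System.out.println(buf.capacity());
    }
}
```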
Thanks for sharing your opinion Till.
I'm also in favor of alternative 2. I was wondering whether we can avoid
using Unsafe.allocate() for off-heap managed memory and network memory with
alternative 3. But after giving it a second thought, I think even for
alternative 3 using direct memory for off
Thanks for the clarification Xintong. I understand the two alternatives now.
I would be in favour of option 2 because it makes things explicit. If we
don't limit the direct memory, I fear that we might end up in a similar
situation as we are currently in: The user might see that her process gets
k
Let me explain this with a concrete example Till.
Let's say we have the following scenario.
Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and
Network Memory): 800MB
For alternative 2
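The arithmetic of this example can be sketched as follows (the alternative-3 split of the remaining 800MB into managed/network memory is a hypothetical figure for illustration, and the semantics of the two alternatives are as described in this thread):

```java
/** Sketch of the 1 GB example: how -XX:MaxDirectMemorySize would differ
 *  between the two alternatives. Figures are illustrative. */
public class DirectMemoryExampleSketch {

    public static void main(String[] args) {
        long totalMb = 1000;
        long taskOffHeapPlusOverheadMb = 200; // "JVM Direct Memory" in the example
        long otherMb = totalMb - taskOffHeapPlusOverheadMb; // heap, metaspace, managed, network

        // Alternative 2: cap direct memory at exactly the direct pools.
        long alt2MaxDirectMb = taskOffHeapPlusOverheadMb;

        // Alternative 3 (assumed): managed + network memory also allocated as
        // direct memory, so they must be counted into the cap as well.
        long managedPlusNetworkMb = 400; // hypothetical split of the 800 MB
        long alt3MaxDirectMb = taskOffHeapPlusOverheadMb + managedPlusNetworkMb;

        System.out.println("alt2=" + alt2MaxDirectMb + "mb alt3=" + alt3MaxDirectMb
                + "mb other=" + otherMb + "mb");
    }
}
```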
I guess you have to help me understand the difference between alternatives 2
and 3 with regard to memory under-utilization, Xintong.
- Alternative 2: set XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM
Overhead. Then there is the risk that this size is too low resulting in a
lot of garbage collection
Hi Xintong, Till,
> Native and Direct Memory
My point is about setting a very large max direct memory size when we do not
differentiate direct and native memory. If the direct memory, including user
direct memory and framework direct memory, could be calculated correctly, then
I am in favor of setting dire
Thanks for replying, Till.
About MemorySegment, I think you are right that we should not include this
issue in the scope of this FLIP. This FLIP should concentrate on how to
configure memory pools for TaskExecutors, with minimum involvement on how
memory consumers use it.
About direct memory, I t
Thanks for proposing this FLIP Xintong.
All in all I think it already looks quite good. Concerning the first open
question about allocating memory segments, I was wondering whether this is
strictly necessary to do in the context of this FLIP or whether this could
be done as a follow up? Without kn
Thanks for the feedback, Yang.
Regarding your comments:
*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some
good sides. E.g., we do not need to worry about direct OOM, and we don't even
need to allocate managed / network memory with Unsafe.allocate().
Howev
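For context, allocating native memory via `sun.misc.Unsafe` looks roughly like the sketch below; such memory is not tracked by the GC and is not counted against `-XX:MaxDirectMemorySize`, which is why it must be freed manually:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Sketch: native allocation via sun.misc.Unsafe, which bypasses the
 *  -XX:MaxDirectMemorySize limit and requires manual freeing. */
public class UnsafeAllocSketch {

    static Unsafe getUnsafe() throws Exception {
        // Unsafe is not meant for application use; obtain it via reflection.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    public static void main(String[] args) throws Exception {
        Unsafe unsafe = getUnsafe();
        long address = unsafe.allocateMemory(64); // raw native memory, no GC tracking
        try {
            unsafe.putLong(address, 42L);
            System.out.println(unsafe.getLong(address));
        } finally {
            unsafe.freeMemory(address); // a leak here would never be collected
        }
    }
}
```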
Hi Xintong,
Thanks for your detailed proposal. After all the memory configurations are
introduced, it will be more powerful to control Flink's memory usage. I
just have a few questions about it.
- Native and Direct Memory
We do not differentiate user direct memory and native memory. They ar