Thanks Chesnay for this FLIP, and sorry for the delayed response on my side.

I also have some concerns similar to those Till already raised.

1. Consistent terminology across components. On the JM side, 
PartitionTracker#getPersistedBlockingPartitions is defined for getting global 
partitions, while on the RM side we define #registerGlobalPartitions for 
handing those partitions over from the JM. I think it is better to unify the 
term across components so the semantics are easier to understand. Concerning 
whether to use "global" or "persistent", I personally prefer "global": it 
clearly describes the scope of the partition, whereas "persistent" sounds more 
like how the partition is stored, i.e. an implementation detail. In other 
words, a global partition might also be cached in the TE's memory; the 
semantics do not require it to be persisted to files. Memory vs. persistent 
files is just an implementation choice.
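
Just to illustrate the naming I have in mind, a rough sketch in Java
(interface and method names here are only placeholders for this mail, not a
concrete API proposal):

import java.util.Collection;

import org.apache.flink.api.common.JobID;
import org.apache.flink.runtime.io.network.partition.ResultPartitionID;

// JM side: name the accessor after the partition scope, not the storage.
interface JobMasterPartitionTrackerSketch {
    // instead of getPersistedBlockingPartitions()
    Collection<ResultPartitionID> getGlobalPartitions();
}

// RM side: the handover keeps the same "global" term.
interface ResourceManagerPartitionSketch {
    void registerGlobalPartitions(
            JobID producingJobId, Collection<ResultPartitionID> partitions);
}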

2. On the TE side, we might rename the method #releasePartitions to 
#releaseOrPromotePartitions, which describes the function more precisely and 
stays consistent with PartitionTracker#stopTrackingAndReleaseOrPromotePartitionsFor().
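
In other words, something along these lines (same parameters as the FLIP's
#releasePartitions, only the name changes):

import java.util.Collection;

import org.apache.flink.api.common.JobID;
import org.apache.flink.runtime.io.network.partition.ResultPartitionID;

interface TaskExecutorPartitionSketch {
    // renamed from releasePartitions(); the name now spells out both actions
    void releaseOrPromotePartitions(
            JobID jobId,
            Collection<ResultPartitionID> partitionsToRelease,
            Collection<ResultPartitionID> partitionsToPromote);
}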

3. I very much agree with Till's suggestions of a dedicated global 
PartitionTable on the TE side and of sticking to the TE's heartbeat for 
reporting global partitions to the RM.
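
Roughly what that could look like (only a sketch, the class names are made up
for this mail): the TE keeps a dedicated table for global partitions and
piggy-backs its contents onto the heartbeat it already sends to the RM, so the
TE remains the ground of truth:

import java.util.Collection;
import java.util.Set;

import org.apache.flink.api.common.JobID;
import org.apache.flink.runtime.io.network.partition.ResultPartitionID;

// Separate from the existing job-scoped PartitionTable, as Till suggested.
interface GlobalPartitionTableSketch {
    void startTrackingPartition(JobID producingJobId, ResultPartitionID partitionId);
    void stopTrackingPartition(ResultPartitionID partitionId);
    Set<ResultPartitionID> getTrackedPartitions();
}

// Payload attached to the TE -> RM heartbeat.
final class GlobalPartitionReport {
    private final Collection<ResultPartitionID> hostedGlobalPartitions;

    GlobalPartitionReport(Collection<ResultPartitionID> hostedGlobalPartitions) {
        this.hostedGlobalPartitions = hostedGlobalPartitions;
    }

    Collection<ResultPartitionID> getHostedGlobalPartitions() {
        return hostedGlobalPartitions;
    }
}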

4. Regarding the ShuffleMaster: it was previously built inside the JM and was 
only expected to interact with the JM. Now the RM also needs to interact with 
the ShuffleMaster to release global partitions, so it might be better to move 
the ShuffleMaster out of the JM and align its lifecycle with that of the RM.
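
For example (just a sketch, and assuming the RM would call the existing
ShuffleMaster#releasePartitionExternally(ShuffleDescriptor) hook for this):

import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
import org.apache.flink.runtime.shuffle.ShuffleMaster;

// The RM, not the JM, owns the ShuffleMaster and forwards release calls for
// global partitions to it; the ShuffleMaster then lives and dies with the RM.
final class ResourceManagerShuffleServiceSketch implements AutoCloseable {

    private final ShuffleMaster<?> shuffleMaster;

    ResourceManagerShuffleServiceSketch(ShuffleMaster<?> shuffleMaster) {
        this.shuffleMaster = shuffleMaster;
    }

    void releaseGlobalPartition(ShuffleDescriptor descriptor) {
        shuffleMaster.releasePartitionExternally(descriptor);
    }

    @Override
    public void close() {
        // shut the ShuffleMaster down together with the RM
    }
}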

5. Nit: TM -> TE in the Proposed Changes section: "TMs retain global 
partitions for successful jobs"

Best,
Zhijiang


------------------------------------------------------------------
From:Till Rohrmann <trohrm...@apache.org>
Send Time: Tue, Sep 10, 2019 10:10
To:dev <dev@flink.apache.org>
Subject:Re: [DISCUSS] FLIP-67: Global partitions lifecycle

Thanks Chesnay for drafting the FLIP and starting this discussion.

I have a couple of comments:

* I know that I've also coined the terms global/local result partition but
maybe it is not the perfect name. Maybe we could rethink the terminology
and call them persistent result partitions?
* Nit: I would call the last parameter of void releasePartitions(JobID
jobId, Collection<ResultPartitionID> partitionsToRelease,
Collection<ResultPartitionID> partitionsToPromote) either
partitionsToRetain or partitionsToPersistent.
* I'm not sure whether partitionsToRelease should contain a
global/persistent result partition id. I always thought that the user will
be responsible for managing the lifecycle of a global/persistent
result partition.
* Instead of extending the PartitionTable to be able to store
global/persistent and local/transient result partitions, I would rather
introduce a global PartitionTable to store the global/persistent result
partitions explicitly. I think there is a benefit in making things as
explicit as possible.
* The handover logic between the JM and the RM for the global/persistent
result partitions seems a bit brittle to me. What will happen if the JM
cannot reach the RM? I think it would be better if the TM announces the
global/persistent result partitions to the RM via its heartbeats. That way
we don't rely on an established connection between the JM and RM and we
keep the TM as the ground of truth. Moreover, the RM should simply forward
the release calls to the TM without much internal logic.

Cheers,
Till

On Fri, Sep 6, 2019 at 3:16 PM Chesnay Schepler <ches...@apache.org> wrote:

> Hello,
>
> FLIP-36 (interactive programming)
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink>
>
> proposes a new programming paradigm where jobs are built incrementally
> by the user.
>
> To support this in an efficient manner I propose to extend partition
> life-cycle to support the notion of /global partitions/, which are
> partitions that can exist beyond the life-time of a job.
>
> These partitions could then be re-used by subsequent jobs in a fairly
> efficient manner, as they don't have to be persisted to external storage
> first and consuming tasks could be scheduled to exploit data-locality.
>
> The FLIP outlines the required changes on the JobMaster, TaskExecutor
> and ResourceManager to support this from a life-cycle perspective.
>
> This FLIP does /not/ concern itself with the /usage/ of global
> partitions, including client-side APIs, job-submission, scheduling and
> reading said partitions; these are all follow-ups that will either be
> part of FLIP-36 or spliced out into separate FLIPs.
>
>
