Thanks Chesnay for this FLIP and sorry for touching it a bit delay on my side.
I also have some similar concerns which Till already proposed before. 1. The consistent terminology in different components. On JM side, PartitionTracker#getPersistedBlockingPartitions is defined for getting global partitions. And on RM side, we define the method of #registerGlobalPartitions correspondingly for handover the partitions from JM. I think it is better to unify the term in different components for for better understanding the semantic. Concering whether to use global or persistent, I prefer the "global" term personally. Because it describes the scope of partition clearly, and the "persistent" is more like the partition storing way or implementation detail. In other words, the global partition might also be cached in memory of TE, not must persist into files from semantic requirements. Whether memory or persistent file is just the implementation choice. 2. On TE side, we might rename the method #releasePartitions to #releaseOrPromotePartitions which describes the function precisely and keeps consistent with PartitionTracker#stopTrackingAndReleaseOrPromotePartitionsFor(). 3. Very agree with Till's suggestions of global PartitionTable on TE side and sticking to TE's heartbeat report to RM for global partitions. 4. Considering ShuffleMaster, it was built inside JM and expected to interactive with JM before. Now the RM also needs to interactive with ShuffleMaster to release global partitions. Then it might be better to move ShuffleMaster outside of JM, and the lifecycle of ShuffleMaster should be consistent with RM. 5. Nit: TM->TE in the section of Proposed Changes: "TMs retain global partitions for successful jobs" Best, Zhijiang ------------------------------------------------------------------ From:Till Rohrmann <trohrm...@apache.org> Send Time:2019年9月10日(星期二) 10:10 To:dev <dev@flink.apache.org> Subject:Re: [DISCUSS] FLIP-67: Global partitions lifecycle Thanks Chesnay for drafting the FLIP and starting this discussion. I have a couple of comments: * I know that I've also coined the terms global/local result partition but maybe it is not the perfect name. Maybe we could rethink the terminology and call them persistent result partitions? * Nit: I would call the last parameter of void releasePartitions(JobID jobId, Collection<ResultPartitionID> partitionsToRelease, Collection<ResultPartitionID> partitionsToPromote) either partitionsToRetain or partitionsToPersistent. * I'm not sure whether partitionsToRelease should contain a global/persistent result partition id. I always thought that the user will be responsible for managing the lifecycle of a global/persistent result partition. * Instead of extending the PartitionTable to be able to store global/persistent and local/transient result partitions, I would rather introduce a global PartitionTable to store the global/persistent result partitions explicitly. I think there is a benefit in making things as explicit as possible. * The handover logic between the JM and the RM for the global/persistent result partitions seems a bit brittle to me. What will happen if the JM cannot reach the RM? I think it would be better if the TM announces the global/persistent result partitions to the RM via its heartbeats. That way we don't rely on an established connection between the JM and RM and we keep the TM as the ground of truth. Moreover, the RM should simply forward the release calls to the TM without much internal logic. Cheers, Till On Fri, Sep 6, 2019 at 3:16 PM Chesnay Schepler <ches...@apache.org> wrote: > Hello, > > FLIP-36 (interactive programming) > < > https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink> > > proposes a new programming paradigm where jobs are built incrementally > by the user. > > To support this in an efficient manner I propose to extend partition > life-cycle to support the notion of /global partitions/, which are > partitions that can exist beyond the life-time of a job. > > These partitions could then be re-used by subsequent jobs in a fairly > efficient manner, as they don't have to persisted to an external storage > first and consuming tasks could be scheduled to exploit data-locality. > > The FLIP outlines the required changes on the JobMaster, TaskExecutor > and ResourceManager to support this from a life-cycle perspective. > > This FLIP does /not/ concern itself with the /usage/ of global > partitions, including client-side APIs, job-submission, scheduling and > reading said partitions; these are all follow-ups that will either be > part of FLIP-36 or spliced out into separate FLIPs. > >