Hi Becket,

The intermediate result will indeed be automatically re-generated by resubmitting the original DAG. However, that job could fail as well. In that case, we need to decide whether to resubmit the original DAG again to re-generate the intermediate result, or to give up and throw an exception to the user. The config indicates how many resubmissions should happen before giving up.
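To make the intended behavior concrete, here is a minimal sketch in plain Java of a bounded resubmission loop. It is only an illustration of the idea, not the FLIP's implementation: the class name, the `regenerate` helper, and the `maxResubmits` parameter (standing in for the value of cache.retries.max) are hypothetical, and the "original DAG" is modeled as a simple Callable.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: none of these names come from the FLIP or the Flink API.
class CacheRetrySketch {

    // Resubmits the original DAG until it succeeds or the limit is reached.
    // maxResubmits plays the role of the retry config discussed in this thread.
    static <T> T regenerate(Callable<T> originalDag, int maxResubmits) {
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxResubmits; attempt++) {
            try {
                // The resubmitted job may itself fail, e.g. due to a TM failure.
                return originalDag.call();
            } catch (Exception e) {
                lastFailure = e;
            }
        }
        // Limit reached: give up and surface the failure to the user.
        throw new IllegalStateException(
                "Giving up after " + maxResubmits + " resubmission(s)", lastFailure);
    }

    public static void main(String[] args) {
        int[] submissions = {0};
        // Simulated DAG that fails twice and succeeds on the third submission.
        String result = regenerate(() -> {
            if (++submissions[0] < 3) {
                throw new RuntimeException("simulated TM failure");
            }
            return "intermediate result";
        }, 3);
        // Prints: intermediate result after 3 submissions
        System.out.println(result + " after " + submissions[0] + " submissions");
    }
}
```

With a limit of 2 instead of 3, the same simulated DAG would exhaust its resubmissions and the caller would see the exception.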
Thanks,
Xuannan

On Fri, Apr 24, 2020 at 4:19 PM Becket Qin <becket....@gmail.com> wrote:

> Hi Xuannan,
>
> > I am not entirely sure if I understand the cases you mentioned. The users
> > can use the cached table object returned by the .cache() method in other
> > jobs and it should read the intermediate result. The intermediate result
> > can be gone in the following three cases:
> > 1. the user explicitly calls the invalidateCache() method
> > 2. the TableEnvironment is closed
> > 3. a failure happens on the TM
> > When that happens, the intermediate result will not be available unless
> > it is re-generated.
>
> What confused me was why we need to have a *cache.retries.max* config.
> Shouldn't the missing intermediate result always be automatically
> re-generated if it is gone?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Fri, Apr 24, 2020 at 3:59 PM Xuannan Su <suxuanna...@gmail.com> wrote:
>
> > Hi Becket,
> >
> > Thanks for the comments.
> >
> > On Fri, Apr 24, 2020 at 9:12 AM Becket Qin <becket....@gmail.com> wrote:
> >
> > > Hi Xuannan,
> > >
> > > Thanks for picking up the FLIP. It looks good to me overall. Some quick
> > > comments / questions below:
> > >
> > > 1. Do we also need changes in the Java API?
> > >
> >
> > Yes, the public interface changes to Table and TableEnvironment should
> > be made in the Java API.
> >
> > > 2. What are the cases that users may want to retry reading the
> > > intermediate result? It seems that once the intermediate result has
> > > gone, it will not be available later without being generated again,
> > > right?
> > >
> >
> > I am not entirely sure if I understand the cases you mentioned. The users
> > can use the cached table object returned by the .cache() method in other
> > jobs and it should read the intermediate result. The intermediate result
> > can be gone in the following three cases:
> > 1. the user explicitly calls the invalidateCache() method
> > 2. the TableEnvironment is closed
> > 3. a failure happens on the TM
> > When that happens, the intermediate result will not be available unless
> > it is re-generated.
> >
> > > 3. In the "semantic of cache() method" section, the description "The
> > > semantic of the *cache()* method is a little different depending on
> > > whether auto caching is enabled or not." does not seem to be explained.
> > >
> >
> > This line is actually outdated and should be removed, as we are not
> > adding the auto caching functionality in this FLIP. Auto caching will be
> > added in the future, and the semantic of cache() when auto caching is
> > enabled will be discussed in detail in a new FLIP. I will remove the
> > description to avoid further confusion.
> >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > >
> > > On Wed, Apr 22, 2020 at 4:00 PM Xuannan Su <suxuanna...@gmail.com>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > I'd like to start the discussion about FLIP-36 Support Interactive
> > > > Programming in Flink Table API
> > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink
> > > >
> > > > The FLIP proposes to add support for interactive programming in the
> > > > Flink Table API. Specifically, it lets users cache the intermediate
> > > > results (tables) and use them in later jobs.
> > > >
> > > > Even though the FLIP has been discussed in the past[1], it hasn't
> > > > formally passed the vote yet. And some of the design and
> > > > implementation details have to change to incorporate the cluster
> > > > partitions proposed in FLIP-67[2].
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > Thanks,
> > > > Xuannan
> > > >
> > > > [1]
> > > > https://lists.apache.org/thread.html/b372fd7b962b9f37e4dace3bc8828f6e2a2b855e56984e58bc4a413f@%3Cdev.flink.apache.org%3E
> > > > [2]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-67%3A+Cluster+partitions+lifecycle