Hi Becket,

Thanks for bringing this up! The intermediate cache problem has long been a pain point of the Flink streaming model. As far as I know, it is a significant blocker for iterative operations in batch-related libraries such as Gelly and FlinkML.

Actually, there is an old JIRA issue [1] that aims to solve the cache problem more “thoroughly”. Compared with your proposal, it implements persistence at the DataSet level, which also lets internal operations built on the DataSet API benefit. I fully understand the importance of the Table API, but I wonder whether we should consider this problem from a broader perspective, i.e., adding a `PersistentService` rather than a `TablePersistentService` (as described in the "Flink Services" section).

Thanks,
Xingcan

[1] https://issues.apache.org/jira/browse/FLINK-1730

> On Nov 20, 2018, at 8:56 AM, Becket Qin <becket....@gmail.com> wrote:
>
> Hi all,
>
> As a few recent email threads have pointed out, it is a promising
> opportunity to enhance Flink Table API in various aspects, including
> functionality and ease of use among others. One of the scenarios where we
> feel Flink could improve is interactive programming. To explain the issues
> and facilitate the discussion on the solution, we put together the
> following document with our proposal.
>
> https://docs.google.com/document/d/1d4T2zTyfe7hdncEUAxrlNOYr4e5IMNEZLyqSuuswkA0/edit?usp=sharing
>
> Feedback and comments are very welcome!
>
> Thanks,
>
> Jiangjie (Becket) Qin