What's more I make a little change in WindowOperator for ListState in https://issues.apache.org/jira/browse/FLINK-5572
发件人:Stephan Ewen 收件人:dev 抄送:iuxinc...@huawei.com,Aljoscha Krettek,时金魁 时间:2017-01-20 18:35:46 主题:Re: States split over to external storage Hi! This is an interesting suggestion. Just to make sure I understand it correctly: Do you design this for cases where the state per machine is larger than that machines memory/disk? And in that case, you cannot solve the problem by scaling out (having more machines)? Stephan On Tue, Jan 17, 2017 at 6:29 AM, Chen Qin <c...@uber.com> wrote: > Hi there, > > I would like to discuss split over local states to external storage. The > use case is NOT another external state backend like HDFS, rather just to > expand beyond what local disk/ memory can hold when large key space exceeds > what task managers could handle. Realizing FLINK-4266 might be hard to > tacking all-in-one, I would live give a shot to split-over first. > > An intuitive approach would be treat HeapStatebackend as LRU cache and > split over to external key/value storage when threshold triggered. To make > this happen, we need minor refactor to runtime and adding set/get logic. > One nice thing of keeping HDFS to store snapshots would be avoid versioning > conflicts. Once checkpoint restore happens, partial write data will be > overwritten with previously checkpointed value. > > Comments? > > -- > -Chen Qin >