[jira] [Commented] (FLINK-4266) Remote Database Statebackend

Chen Qin (JIRA) Fri, 05 Aug 2016 21:54:02 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410474#comment-15410474
 ]


Chen Qin commented on FLINK-4266:
---------------------------------

https://docs.google.com/document/d/1diHQyOPZVxgmnmYfiTa6glLf-FlFjSHcL8J3YR2xLdk/edit?usp=sharing


> Remote Database Statebackend
> ----------------------------
>
>                 Key: FLINK-4266
>                 URL: https://issues.apache.org/jira/browse/FLINK-4266
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.0.3, 1.2.0
>            Reporter: Chen Qin
>            Priority: Minor
>
> Current FileSystem statebackend limits whole state size to disk space. 
> Dealing with scale out checkpoint states beyond local disk space such as long 
> running task that hold window content for long period of time. Pipelines 
> needs to split out states to durable remote storage even replicated to 
> different data centers.
> We draft a design that leverage checkpoint id as mono incremental logic 
> timestamp and perform range query to get evicited state k/v. we also 
> introduce checkpoint time commit and eviction threshold that reduce "hot 
> states" hitting remote db per every update between adjacent checkpoints by 
> tracking update logs and merge them, do batch updates only when checkpoint; 
> lastly, we are looking for eviction policy that can identify "hot keys" in 
> k/v state and lazy load those "cold keys" from remote storage(e.g Cassandra).
> For now, we don't have good story regarding to data retirement. We might have 
> to keep forever until manually run command and clean per job related state 
> data. Some of features might related to incremental checkpointing feature, we 
> hope to align with effort there also.
> Welcome comments, I will try to put a draft design doc after gathering some 
> feedback.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-4266) Remote Database Statebackend

Reply via email to