[
https://issues.apache.org/jira/browse/KUDU-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xu Yao updated KUDU-2673:
-------------------------
Labels: features roadmap-candidate (was: features)
> Event timestamp support with kudu.
> ----------------------------------
>
> Key: KUDU-2673
> URL: https://issues.apache.org/jira/browse/KUDU-2673
> Project: Kudu
> Issue Type: Improvement
> Components: java, spark, tserver
> Affects Versions: 1.8.0
> Reporter: yangz
> Priority: Major
> Labels: features, roadmap-candidate
> Fix For: 1.8.0
>
>
> Kudu can read historical data, but only based on the timestamps produced
> by its own transaction and MVCC system. Relying on these internal
> timestamps greatly limits usability.
> In our use case, we write data to Kudu from a data stream and use range
> partitioning by day.
> We want to read hourly versions of the data, so we need to read historical
> data from Kudu.
> Historical data is produced from the undo files. But when a user supplies a
> timestamp, they mean the time the event happened, which is associated with
> the data, not the timestamp Kudu produced.
> So we need a way to pass an event timestamp to Kudu.
> We eventually found a way to solve this problem,
> but our solution has two limitations:
> # We only update the table one row at a time, and each row carries a single
> timestamp.
> # To get the correct historical version of the data, the data stream must
> send data in event-time order.
> Despite these limitations, it satisfies our current business needs.
>
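
The two limitations above can be illustrated with a small in-memory model.
This is only a sketch, not Kudu code and not the reporter's patch: the class
EventTimeHistory and its methods are hypothetical names, and a sorted list
stands in for the per-row history that Kudu keeps in undo files. It assumes
one event timestamp per row update and rejects updates that arrive out of
event-time order, which is what makes the historical lookup a simple binary
search.

```python
import bisect


class EventTimeHistory:
    """Hypothetical per-row version history keyed by event timestamp."""

    def __init__(self):
        self._timestamps = []  # event timestamps, kept in ascending order
        self._values = []      # value written at the matching timestamp

    def upsert(self, event_ts, value):
        # Limitation 2: the stream must deliver updates in event-time order.
        if self._timestamps and event_ts < self._timestamps[-1]:
            raise ValueError("updates must arrive in event-time order")
        self._timestamps.append(event_ts)
        self._values.append(value)

    def read_at(self, snapshot_ts):
        # Return the latest version whose event time is <= snapshot_ts,
        # or None if the row did not exist yet at that event time.
        idx = bisect.bisect_right(self._timestamps, snapshot_ts)
        return self._values[idx - 1] if idx else None


history = EventTimeHistory()
history.upsert(10, "a")
history.upsert(20, "b")
value_at_15 = history.read_at(15)  # the version written at event time 10
```

If updates could arrive out of order, appending would no longer keep the
list sorted and read_at could return the wrong version, which is why the
ordering requirement matters in this model.
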
> Our implementation also partly solves the out-of-order event-time problem
> when you only need the newest data, which does not require reading undo
> files. For data sent to Kudu with t1 < t2:
> t1 upsert -> t2 upsert -> the newest value is t2's
> t2 upsert -> t1 upsert -> the current Kudu implementation returns t1's
> value; our implementation returns t2's.
>
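
The two upsert orderings above can be compared in a few lines. This is a
minimal sketch under stated assumptions, not Kudu's implementation or the
reporter's: a plain dict stands in for the table, and upsert_by_arrival and
upsert_by_event_time are hypothetical helper names. The first function
models "last arrival wins"; the second keeps whichever value has the larger
event timestamp.

```python
def upsert_by_arrival(store, key, event_ts, value):
    # Models arrival-order semantics: the most recent write always wins.
    store[key] = (event_ts, value)


def upsert_by_event_time(store, key, event_ts, value):
    # Models event-time semantics: a write only replaces the stored value
    # if its event timestamp is not older than the stored one.
    current = store.get(key)
    if current is None or event_ts >= current[0]:
        store[key] = (event_ts, value)


t1, t2 = 1, 2  # event timestamps with t1 < t2

arrival_store = {}
upsert_by_arrival(arrival_store, "row", t2, "t2 value")
upsert_by_arrival(arrival_store, "row", t1, "t1 value")

event_store = {}
upsert_by_event_time(event_store, "row", t2, "t2 value")
upsert_by_event_time(event_store, "row", t1, "t1 value")
```

After the out-of-order sequence (t2 then t1), arrival_store holds the t1
value while event_store still holds the t2 value. This only covers reads of
the newest data; it keeps no history for older event times.
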
> Our solution may not be the best one for this problem, but I think Kudu
> snapshot reads should support event time.
> It does not cover all use cases, but I hope it will be useful to the
> community in some of them.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)