Hi Chen Zhang,

One question about "and a delete bitmap (marks some rowid as deleted)": how can a bitmap carry transaction information? For example, if transaction_100 deletes a row, that row should still be visible to transaction_99 but not visible to transaction_101. How is this case handled?
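
To make the scenario concrete, here is a minimal sketch of what I mean. It is only an illustration under my own assumptions: `deleted_rowids`, `delete_row`, and `is_visible` are hypothetical names, not anything from Doris or the proposal. A plain bitmap records only that a rowid is deleted, so the reader's transaction id cannot change the answer.

```python
# Hypothetical illustration only; none of these names come from Doris.

# Stands in for a plain delete bitmap: it stores rowids and nothing else.
deleted_rowids = set()


def delete_row(rowid):
    """Mark a rowid as deleted. No transaction id is recorded."""
    deleted_rowids.add(rowid)


def is_visible(rowid, reader_txn):
    """Return whether the row is visible to the reading transaction.

    reader_txn is accepted but cannot affect the result, because the
    bitmap holds no version information to compare it against.
    """
    return rowid not in deleted_rowids


# transaction_100 deletes rowid 7.
delete_row(7)

# Both readers get the same answer, although transaction_99 started
# before the delete and would be expected to still see the row.
for txn in (99, 101):
    print(f"transaction_{txn} sees rowid 7: {is_visible(7, txn)}")
```

With a single unversioned bitmap, both transaction_99 and transaction_101 see the row as deleted, while I would expect transaction_99 to still see it. That is the gap I am asking about.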

Br,
Minghong

At 2022-06-23 19:14:58, "Zhu,Xiaoli" <zhuxiaol...@baidu.com> wrote:
>Hi Chen Zhang,
>
>I am very interested in this topic and would like to participate in the development.
>
>On 2022/6/23 at 2:44 PM, "Chen Zhang" <chzhang1...@gmail.com> wrote:
>
>    Hi devs,
>
>    The Unique-Key data model is widely used in scenarios like Flink-CDC, user
>    profiles, and e-commerce orders, but the query performance of the current
>    Merge-On-Read implementation is not good, for the following reasons:
>
>    1. Doris can't determine whether a row in a segment file is the latest or
>       outdated, so it has to do an extra merge sort before getting the
>       latest data, and key comparison is quite CPU-intensive.
>    2. Aggregate function predicate pushdown is not supported by the
>       Unique-Key data model, because of reason (1).
>
>    I'd like to propose supporting a Merge-On-Write implementation for the
>    Unique-Key data model, which leverages a new segment-file-level primary
>    key index (used for point lookups on write) and a delete bitmap (marks some
>    rowids as deleted), and which can improve read performance significantly.
>
>    At the beginning, we wanted to add another Primary-Key data model with a
>    Merge-On-Write implementation, but after a lot of discussion, we'd prefer
>    to improve the Unique-Key data model rather than add another one.
>
>    I'll add the detailed design and related research to the DSIP doc later.