Done!



--

Best Regards
陈明雨 Mingyu Chen

Email:
morning...@apache.org





At 2022-06-24 08:48:29, "Chen Zhang" <chzhang1...@gmail.com> wrote:
>@Mingyu, my username: zhannngchen. Thanks~
>
>Best
>Chen Zhang
>On Jun 24, 2022, 00:56 +0800, 陈明雨 <morning...@163.com>, wrote:
>> Hi Zhang Chen:
>> I have created DSIP-018 for this [1], but you need to create an account 
>> and tell me your username.
>>
>>
>> [1] 
>> https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model
>>
>>
>>
>>
>> --
>>
>> Best Regards
>> 陈明雨 Mingyu Chen
>>
>> Email:
>> morning...@apache.org
>>
>>
>>
>>
>>
>> At 2022-06-23 22:29:49, "Chen Zhang" <chzhang1...@gmail.com> wrote:
>> > @Minghong We'll use a multi-version delete bitmap that saves only the 
>> > delta for each version.
>> > For example, suppose we have a rowset with version [0-98], and 
>> > transactions 99, 100 and 101 each update some rows in that rowset. There 
>> > would then be 3 delete bitmaps on that rowset, corresponding to the rows 
>> > updated by versions 99, 100 and 101. A query at version x will only see 
>> > the bitmaps up to version x. There are more details about space saving 
>> > and cache acceleration; let's discuss them in the DSIP.
>> >
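The multi-version delete bitmap described above can be illustrated with a minimal sketch. This is not code from Doris: the class name `MultiVersionDeleteBitmap` and its methods are hypothetical, and plain Python sets stand in for whatever bitmap type a real implementation would use. It only shows the idea of storing one delta per version and letting a query at version x combine the deltas up to x.

```python
from bisect import bisect_right, insort


class MultiVersionDeleteBitmap:
    """Hypothetical per-rowset structure: one delta of deleted row ids per version."""

    def __init__(self):
        self._versions = []  # sorted versions that have a delta
        self._deltas = {}    # version -> set of row ids marked deleted by that version

    def mark_deleted(self, version, row_ids):
        """Record the rows of this rowset that `version` overwrote."""
        if version not in self._deltas:
            insort(self._versions, version)
            self._deltas[version] = set()
        self._deltas[version].update(row_ids)

    def deleted_as_of(self, version):
        """Union of all deltas whose version is <= the query version."""
        visible = set()
        for v in self._versions[:bisect_right(self._versions, version)]:
            visible |= self._deltas[v]
        return visible


# Rowset [0-98]; transactions 99, 100 and 101 each update one row in it.
bitmap = MultiVersionDeleteBitmap()
bitmap.mark_deleted(99, {3})
bitmap.mark_deleted(100, {7})
bitmap.mark_deleted(101, {12})
```

With this sketch, a query at version 99 sees only row 3 as deleted, while a query at version 101 sees rows 3, 7 and 12 as deleted. A row deleted by transaction 100 therefore stays visible to a reader at version 99.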
>> > @Xiaoli, our team has finished most of the development work for the 
>> > basic functionality in our private repository, but there's still a lot 
>> > of work to do. You're welcome to get involved.
>> >
>> > @Mingyu, could you help create a DSIP doc? I don't seem to have 
>> > permission.
>> >
>> > Best
>> > Chen Zhang
>> > On Jun 23, 2022, 21:41 +0800, Zhou Minghong <minghong.z...@163.com>, wrote:
>> > > Hi Chen Zhang
>> > > One question about "and a delete bitmap (marks some rowid as deleted)":
>> > > how is transaction information handled by a bitmap?
>> > > For example, if transaction_100 deletes a row, the row should still be 
>> > > visible to transaction_99 but not visible to transaction_101. How is 
>> > > this case handled?
>> > >
>> > >
>> > > Br/Minghong
>> > >
>> > > At 2022-06-23 19:14:58, "Zhu,Xiaoli" <zhuxiaol...@baidu.com> wrote:
>> > > > Hi Chen Zhang,
>> > > >
>> > > > I am very interested in this topic, and want to participate in the 
>> > > > development.
>> > > >
>> > > > On 2022/6/23 at 2:44 PM, "Chen Zhang" <chzhang1...@gmail.com> wrote:
>> > > >
>> > > > Hi devs,
>> > > >
>> > > > The Unique-Key data model is widely used in scenarios like Flink-CDC,
>> > > > user profiles, and e-commerce orders, but the query performance of the
>> > > > current Merge-On-Read implementation is poor, for the following reasons:
>> > > >
>> > > > 1. Doris can't determine whether a row in a segment file is the latest
>> > > > version or outdated, so it has to do an extra merge sort before
>> > > > getting the latest data, and key comparison is quite CPU-intensive.
>> > > > 2. Aggregate function predicate pushdown is not supported by the
>> > > > Unique-Key data model because of reason 1.
>> > > >
>> > > > I'd like to propose a Merge-On-Write implementation for the Unique-Key
>> > > > data model. It relies on a new segment-file-level primary key index
>> > > > (used for point lookups on write) and a delete bitmap (which marks
>> > > > some row IDs as deleted), which together can improve read performance
>> > > > significantly.
>> > > >
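As a rough illustration of the proposal above, the following sketch shows how a per-segment primary key index and a delete bitmap could work together. It is not the Doris implementation: `Segment`, `UniqueKeyTable`, `write` and `scan` are hypothetical names, a dict stands in for the primary key index, and a set stands in for the delete bitmap. Versions are ignored to keep the example short.

```python
class Segment:
    """Hypothetical immutable segment with a primary key index and a delete bitmap."""

    def __init__(self, rows):
        self.rows = list(rows)  # list of (key, value) pairs
        # Primary key index: key -> row id, used for point lookups on write.
        self.pk_index = {key: row_id for row_id, (key, _) in enumerate(self.rows)}
        self.deleted = set()    # stand-in for the delete bitmap


class UniqueKeyTable:
    def __init__(self):
        self.segments = []

    def write(self, rows):
        """Merge on write: mark older copies of each key as deleted, then append."""
        latest = dict(rows)  # within one batch, the last value for a key wins
        for segment in self.segments:
            for key in latest:
                row_id = segment.pk_index.get(key)
                if row_id is not None:
                    segment.deleted.add(row_id)
        self.segments.append(Segment(latest.items()))

    def scan(self):
        """Read path: skip rows marked deleted; no merge sort or key comparison."""
        for segment in self.segments:
            for row_id, row in enumerate(segment.rows):
                if row_id not in segment.deleted:
                    yield row


table = UniqueKeyTable()
table.write([("user_1", "a"), ("user_2", "b")])
table.write([("user_1", "c")])  # overwrites user_1 in the first segment
```

After the second write, row 0 of the first segment is marked deleted, so a scan returns one row per key without comparing keys across segments. The cost of finding outdated rows moves from every read to each write.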
>> > > > Initially, we wanted to add a separate Primary-Key data model with a
>> > > > Merge-On-Write implementation, but after a lot of discussion, we'd
>> > > > prefer to improve the Unique-Key data model rather than add another
>> > > > one.
>> > > >
>> > > > I'll add the detailed design and related research to the DSIP doc later.
>> > > >
>> > > >
>> > > > ---------------------------------------------------------------------
>> > > > To unsubscribe, e-mail: dev-unsubscr...@doris.apache.org
>> > > > For additional commands, e-mail: dev-h...@doris.apache.org
>> > > >
