Good catch!

PengHui Li <peng...@apache.org> wrote on Tue, Mar 8, 2022 at 13:37:

> > Once you enable this feature, we don't support rolling the data
> > back to the previous version.
>
> If you want to roll back to an old version, you need to disable the
> cursor compression first, then wait a while or restart the broker to
> make sure the cursor data is flushed to the cursor ledger. Only then
> roll back to the old version.
>
> Penghui
>
> On Tue, Mar 8, 2022 at 10:31 AM Zixuan Liu <node...@gmail.com> wrote:
>
> > Hi Xiaolong,
> >
> > It is disabled by default. Once you enable this feature:
> > When reading your data, we check the data header: if the data is
> > compressed, we parse it using the compression format; otherwise we
> > parse it the original way.
> > When updating your data, we compress it with the specified
> > compression type.
> >
> > Once you enable this feature, we don't support rolling the data
> > back to the previous version.
> >
> > Thanks,
> > Zixuan
> >
> > r...@apache.org <ranxiaolong...@gmail.com> wrote on Mon, Mar 7, 2022 at 16:16:
> >
> > > Hi Zixuan:
> > >
> > > Here I am more concerned about whether this feature will break
> > > backward compatibility. For historical data or old clusters, how
> > > do we use this feature?
> > >
> > > --
> > > Thanks
> > > Xiaolong Ran
> > >
> > > Zixuan Liu <node...@gmail.com> wrote on Mon, Mar 7, 2022 at 15:14:
> > >
> > > > Hi everyone,
> > > >
> > > > Good catch! I updated my proposal at
> > > > https://github.com/apache/pulsar/issues/14529, and a
> > > > compatibility part has been appended:
> > > >
> > > > 1. The compression is disabled by default.
> > > > 2. We need to consider how to migrate the old data when this
> > > >    compression has been enabled.
> > > >    If the cursor data header indicates the compressed format, we
> > > >    parse the bytes using the compressed format; otherwise we
> > > >    parse the cursor data directly the original way.
> > > >
> > > > Zixuan Liu <node...@gmail.com> wrote on Mon, Mar 7, 2022 at 15:11:
> > > >
> > > > > Hi PengHui,
> > > > >
> > > > > Sorry, the correct URL: https://github.com/apache/pulsar/issues/14529.
> > > > >
> > > > > :( Because of a subscription problem, the email here is very
> > > > > confusing.
> > > > >
> > > > > PengHui Li <peng...@apache.org> wrote on Mon, Mar 7, 2022 at 12:39:
> > > > >
> > > > > > Hi Zixuan,
> > > > > >
> > > > > > Looks like you have added the wrong link for the proposal?
> > > > > > https://github.com/apache/pulsar/issues/14395 is for PIP-44
> > > > > >
> > > > > > Penghui
> > > > > >
> > > > > > On Mon, Mar 7, 2022 at 12:37 PM PengHui Li <peng...@apache.org> wrote:
> > > > > >
> > > > > > > > This is a global setting now. But I wonder if we should
> > > > > > > > compress it only if the size is over a threshold?
> > > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > Penghui
> > > > > > >
> > > > > > > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli <eolive...@gmail.com> wrote:
> > > > > > >
> > > > > > > > On Sun, Mar 6, 2022, 05:04 Haiting Jiang <jianghait...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > This is a global setting now. But I wonder if we should
> > > > > > > > > compress it only if the size is over a threshold?
> > > > > > > >
> > > > > > > > Good idea
> > > > > > > >
> > > > > > > > Enrico
> > > > > > > >
> > > > > > > > > Because:
> > > > > > > > > 1. It's not easy for us to notice in advance that some
> > > > > > > > >    managed cursor info is too large; normally it is
> > > > > > > > >    found only once it has an actual impact.
> > > > > > > > >    But if we enable this compression in advance, it will
> > > > > > > > >    take some extra computing resources.
> > > > > > > > > 2. It seems that it won't be a common case that this
> > > > > > > > >    managed cursor info is too large (only if there are a
> > > > > > > > >    lot of individualDeletedMessages and
> > > > > > > > >    batchedEntryDeletionIndexInfo), so it is not really
> > > > > > > > >    necessary to compress all managed cursor info.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Haiting
> > > > > > > > >
> > > > > > > > > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> > > > > > > > >
> > > > > > > > > > Hi Pulsar Community,
> > > > > > > > > >
> > > > > > > > > > I created a proposal to support ManagedCursorInfo
> > > > > > > > > > compression.
> > > > > > > > > >
> > > > > > > > > > The proposal can be found at:
> > > > > > > > > > https://github.com/apache/pulsar/issues/14395
> > > > > > > > > >
> > > > > > > > > > Motivation
> > > > > > > > > >
> > > > > > > > > > The cursor data is managed by the ZooKeeper/etcd
> > > > > > > > > > metadata store. As cursor data accumulates, its size
> > > > > > > > > > increases and it takes a long time to pull. Therefore,
> > > > > > > > > > it is necessary to add compression for the cursor,
> > > > > > > > > > which reduces both the size of the data and the time
> > > > > > > > > > needed to pull it.
> > > > > > > > > >
> > > > > > > > > > Goal
> > > > > > > > > >
> > > > > > > > > > Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the
> > > > > > > > > > ManagedCursorInfo.
> > > > > > > > > >
> > > > > > > > > > Implementation
> > > > > > > > > >
> > > > > > > > > > - Cursor compression format
> > > > > > > > > >   [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> > > > > > > > > >   [MANAGED_CURSOR_INFO_PAYLOAD]
> > > > > > > > > >
> > > > > > > > > >   - MAGIC_NUMBER
> > > > > > > > > >     0x4779
> > > > > > > > > >   - METADATA
> > > > > > > > > >     Add a message named ManagedCursorInfoMetadata to
> > > > > > > > > >     MLDataFormats.proto:
> > > > > > > > > >     message ManagedCursorInfoMetadata {
> > > > > > > > > >         required CompressionType compressionType = 1;
> > > > > > > > > >         required int32 uncompressedSize = 2;
> > > > > > > > > >     }
> > > > > > > > > >
> > > > > > > > > > Currently, these compression types are already
> > > > > > > > > > supported, so we only need to deal with compression
> > > > > > > > > > and decompression of the ManagedCursorInfo data:
> > > > > > > > > >
> > > > > > > > > > - Get CursorInfo from the metadata store
> > > > > > > > > >   We check the cursor data header: if it is compressed,
> > > > > > > > > >   we parse the bytes using the compressed format;
> > > > > > > > > >   otherwise the original way.
> > > > > > > > > > - Add/Update CursorInfo to the metadata store
> > > > > > > > > >   Compression is used by default if the compression
> > > > > > > > > >   type is specified.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Zixuan
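
For readers following the thread, the framing described in the quoted proposal, together with the size threshold suggested by Haiting, might look roughly like the Java sketch below. This is only an illustration under stated assumptions, not the Pulsar implementation. The class name CursorInfoCodec, the 2-byte width of the magic number, the 4-byte metadata size field, the numeric id TYPE_ZLIB, and the thresholdBytes parameter are all hypothetical. The metadata is written as two plain ints rather than the protobuf ManagedCursorInfoMetadata message, and only ZLIB (via java.util.zip) is shown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustrative sketch only; not the Pulsar implementation. */
final class CursorInfoCodec {
    // Magic number from the proposal; the 2-byte width is an assumption.
    static final short MAGIC_NUMBER = 0x4779;
    // Hypothetical stand-in for ManagedCursorInfoMetadata: two 4-byte ints
    // (compression type id, uncompressed size) instead of protobuf.
    static final int METADATA_SIZE = 8;
    static final int TYPE_ZLIB = 1;       // hypothetical numeric id
    static final int HEADER_SIZE = 2 + 4; // magic + metadata size field

    /** Compresses and frames raw if it has at least thresholdBytes bytes; otherwise returns it unchanged. */
    static byte[] encode(byte[] raw, int thresholdBytes) {
        if (raw.length < thresholdBytes) {
            return raw; // small payloads keep the original, uncompressed layout
        }
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            compressed.write(chunk, 0, n);
        }
        deflater.end();
        byte[] payload = compressed.toByteArray();
        // [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] + [PAYLOAD]
        return ByteBuffer.allocate(HEADER_SIZE + METADATA_SIZE + payload.length)
                .putShort(MAGIC_NUMBER)
                .putInt(METADATA_SIZE)
                .putInt(TYPE_ZLIB)
                .putInt(raw.length)
                .put(payload)
                .array();
    }

    /** Returns the original bytes, whether stored is framed or legacy data. */
    static byte[] decode(byte[] stored) {
        ByteBuffer in = ByteBuffer.wrap(stored);
        if (stored.length < HEADER_SIZE + METADATA_SIZE || in.getShort() != MAGIC_NUMBER) {
            return stored; // no compression header: parse the original way
        }
        int metadataSize = in.getInt();
        int type = in.getInt();
        int uncompressedSize = in.getInt();
        if (metadataSize != METADATA_SIZE || type != TYPE_ZLIB || uncompressedSize < 0) {
            throw new IllegalStateException("unsupported cursor metadata");
        }
        Inflater inflater = new Inflater();
        inflater.setInput(stored, in.position(), in.remaining());
        byte[] raw = new byte[uncompressedSize];
        try {
            int filled = 0;
            while (filled < raw.length) {
                int n = inflater.inflate(raw, filled, raw.length - filled);
                if (n == 0) {
                    break; // input exhausted before the declared size was reached
                }
                filled += n;
            }
            if (filled != raw.length) {
                throw new IllegalStateException("uncompressed size mismatch");
            }
            return raw;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed cursor data", e);
        } finally {
            inflater.end();
        }
    }
}
```

Calling CursorInfoCodec.decode(CursorInfoCodec.encode(bytes, 1024)) returns the original bytes on both paths: payloads under the threshold are stored and read back unchanged, which mirrors the "otherwise by the original way" branch in the proposal. The sketch assumes that legacy, uncompressed data never begins with the magic bytes; a real implementation would need to make sure of that.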