Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread Enrico Olivelli
On Sun, 6 Mar 2022, 05:04 Haiting Jiang wrote:

> This is a global setting now. But I wonder if we should compress it only
> if the size
> is over a threshold?


Good idea

Enrico


> Because:
> 1. It's not easy for us to notice in advance that some managed cursor info
> is too large; normally it is found only once it has an actual impact. But if
> we enable this compression in advance, it will take some extra computing
> resources.
> 2. It seems uncommon for the managed cursor info to be too large (only
> when there are a lot of individualDeletedMessages and
> batchedEntryDeletionIndexInfo entries). So it is not really necessary to
> compress all managed cursor info.
>
> Regards,
> Haiting
>
>
> On 2022/03/02 04:41:16 Zixuan Liu wrote:
> > Hi Pulsar Community,
> >
> >
> > I created a proposal to support ManagedCursorInfo compression.
> >
> > The proposal can be found: https://github.com/apache/pulsar/issues/14395
> >
> >
> > Motivation
> >
> > The cursor data is managed by the ZooKeeper/etcd metadata store. As
> > the cursor data grows, pulling it takes more and more time. Therefore,
> > it is necessary to add compression for the cursor, which reduces both
> > the data size and the time needed to pull the data.
> >
> > Goal
> >
> > Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the ManagedCursorInfo.
> >
> > Implementation
> >
> > - Cursor compression format
> >   [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >   [MANAGED_CURSOR_INFO_PAYLOAD]
> >
> > - MAGIC_NUMBER
> >   0x4779
> >
> > - METADATA
> >   Add a message named ManagedCursorInfoMetadata to MLDataFormats.proto:
> >
> >   message ManagedCursorInfoMetadata {
> >      required CompressionType compressionType = 1;
> >      required int32 uncompressedSize = 2;
> >   }
> >
> > These compression types are already supported, so we only need to
> > handle compression and decompression of the ManagedCursorInfo data:
> >
> > - Get CursorInfo from the metadata store
> >   We will check the cursor data header; if it is compressed, we will
> >   parse the bytes in the compressed format, otherwise in the original
> >   way.
> >
> > - Add/Update CursorInfo to the metadata store
> >   Compression is used by default if a compression type is specified.
> >
> >
> > Thanks,
> > Zixuan
> >
>
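
The framing quoted above can be sketched roughly as follows. This is only an
illustration in Python, not the Pulsar implementation: the field widths
(2-byte magic number, 4-byte metadata size), the fixed-size stand-in for the
protobuf ManagedCursorInfoMetadata message, the numeric compression id, and
the name encode_cursor_info are all assumptions made for the example. Only
the magic number 0x4779 and the field order come from the proposal.

```python
import struct
import zlib

MAGIC_NUMBER = 0x4779   # value given in the proposal
COMPRESSION_ZLIB = 2    # hypothetical numeric id, for illustration only


def encode_cursor_info(payload: bytes) -> bytes:
    """Return MAGIC_NUMBER + METADATA_SIZE + METADATA + compressed payload."""
    # Stand-in for the protobuf ManagedCursorInfoMetadata message (assumed
    # layout): 1-byte compression type followed by 4-byte uncompressed size.
    metadata = struct.pack(">BI", COMPRESSION_ZLIB, len(payload))
    # Assumed widths: 2-byte magic number, 4-byte metadata size, big-endian.
    header = struct.pack(">HI", MAGIC_NUMBER, len(metadata))
    return header + metadata + zlib.compress(payload)


# A repetitive payload, standing in for a large serialized ManagedCursorInfo.
frame = encode_cursor_info(b"individualDeletedMessages" * 100)
```

Putting the uncompressed size in the metadata lets a reader allocate the
output buffer before decompressing, which is presumably why the proposed
message carries it.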


[GitHub] [pulsar-client-node] hassanzia32 opened a new issue #201: Support for Dead Letter Queues in Consumers?

2022-03-06 Thread GitBox


hassanzia32 opened a new issue #201:
URL: https://github.com/apache/pulsar-client-node/issues/201


   Does the Node.js client currently support [dead letter queues](https://pulsar.apache.org/docs/en/concepts-messaging/#dead-letter-topic)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [pulsar-manager] HartmutDIL commented on issue #283: Pulsar Manager keeps saying "This environment is error. Please check it"

2022-03-06 Thread GitBox


HartmutDIL commented on issue #283:
URL: https://github.com/apache/pulsar-manager/issues/283#issuecomment-1059977409


   I was able to solve the problem in the meantime. The reason was that JWT was
switched on, and therefore an unauthorized access attempt occurred, which was of
course rejected.
   The correct URL is therefore "http://pulsar-ci-broker:8080", for anyone who
wants to know it.






[GitHub] [pulsar-site] dave2wave opened a new pull request #12: Fixed Rewrites by adding flags [R=301,DPI,L]

2022-03-06 Thread GitBox


dave2wave opened a new pull request #12:
URL: https://github.com/apache/pulsar-site/pull/12


   Flags are described here: 
https://httpd.apache.org/docs/current/rewrite/flags.html
   
   I'm making a PR to better expose the fix.






[GitHub] [pulsar-site] dave2wave merged pull request #12: Fixed Rewrites by adding flags [R=301,DPI,L]

2022-03-06 Thread GitBox


dave2wave merged pull request #12:
URL: https://github.com/apache/pulsar-site/pull/12


   






[GitHub] [pulsar-site] dave2wave opened a new issue #13: Logo and navigation menu bar overlap

2022-03-06 Thread GitBox


dave2wave opened a new issue #13:
URL: https://github.com/apache/pulsar-site/issues/13


   See the screenshot:
   https://user-images.githubusercontent.com/29803617/156935251-71402bd0-b3b0-4e7e-81d1-6d64e63c1082.png






[GitHub] [pulsar-site] urfreespace opened a new pull request #14: fix: Logo and navigation menu bar overlap

2022-03-06 Thread GitBox


urfreespace opened a new pull request #14:
URL: https://github.com/apache/pulsar-site/pull/14


   fix: https://github.com/apache/pulsar-site/issues/13






[GitHub] [pulsar-site] urfreespace merged pull request #14: fix: Logo and navigation menu bar overlap

2022-03-06 Thread GitBox


urfreespace merged pull request #14:
URL: https://github.com/apache/pulsar-site/pull/14


   






[GitHub] [pulsar-site] urfreespace closed issue #13: Logo and navigation menu bar overlap

2022-03-06 Thread GitBox


urfreespace closed issue #13:
URL: https://github.com/apache/pulsar-site/issues/13


   






Re: [DISCUSS] Releasing pulsar-client-go 0.8.1

2022-03-06 Thread r...@apache.org
+1

--
Thanks
Xiaolong Ran

On Sat, Mar 5, 2022 at 18:10, PengHui Li wrote:

> +1
>
> Penghui
>
> On Sat, Mar 5, 2022 at 4:58 AM Matteo Merli wrote:
>
> > +1 Thanks Rui, we should eliminate the GPL dependency ASAP.
> >
> >
> >
> > --
> > Matteo Merli
> > 
> >
> > On Thu, Mar 3, 2022 at 2:08 AM Rui Fu  wrote:
> > >
> > > Hi everyone,
> > >
> > > > I would like to start a discussion about a new release of
> > > > pulsar-client-go, v0.8.1. Recently the community has submitted some
> > > > dependency-update PRs: [1] bumps `github.com/beefsack/go-rate` to the
> > > > latest version, which migrates the license from GPL to MIT, and [2]
> > > > bumps `github.com/prometheus/client_golang` to address
> > > > CVE-2022-21698. For more details, please check the links below.
> > > >
> > > > As v0.8.0 was released just weeks ago and the next release will start
> > > > about 2 months later, I think we should start the release of v0.8.1.
> > >
> > > [1]: https://github.com/apache/pulsar-client-go/pull/735
> > > [2]: https://github.com/apache/pulsar-client-go/pull/738
> > >
> > > --
> > >
> > > Best Regards,
> > >
> > > Rui Fu
> >
>


[DISCUSS] Improve unit test stability

2022-03-06 Thread 包子
Hi, I want to discuss how to improve the stability of our unit tests. I found
that most flaky unit tests can be reproduced locally simply by running them a
few more times.

Can we add the following mandatory constraints to ensure the stability of the
unit tests?
1. If a PR includes new or modified unit tests, they need to be run
repeatedly on CI more than n times.
2. We can write scripts that parse the git change records to find new or
modified unit tests.
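
As a rough illustration of point 2, a script could parse the output of
`git diff --name-status` and keep the added, modified, or renamed test
files. The sketch below is in Python and makes two assumptions that are not
stated in this thread: that unit tests live under `src/test/` and that their
file names end in `Test.java`. The function names are made up for the
example.

```python
import subprocess


def changed_test_files(name_status_output: str) -> list:
    """Pick new/modified/renamed test files out of `git diff --name-status`."""
    tests = []
    for line in name_status_output.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        # For renames (R100 old new) the last field is the new path.
        status, path = fields[0], fields[-1]
        if status[:1] not in ("A", "M", "R"):
            continue  # skip deletions and anything else
        # Assumed naming convention for unit tests.
        if "src/test/" in path and path.endswith("Test.java"):
            tests.append(path)
    return tests


def changed_tests_against(base: str) -> list:
    """Run git in the current repository and return the changed test files."""
    result = subprocess.run(
        ["git", "diff", "--name-status", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return changed_test_files(result.stdout)
```

This only finds changed test files, not tests affected by changes to
production code, so it would be a lower bound on what needs the repeated
runs.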



[GitHub] [pulsar-site] dave2wave commented on issue #13: Logo and navigation menu bar overlap

2022-03-06 Thread GitBox


dave2wave commented on issue #13:
URL: https://github.com/apache/pulsar-site/issues/13#issuecomment-1060181883


   I'm not sure that this is set up correctly: at one width it's OK, but in
a slightly wider window it's not. Can it always take the narrower approach?
   https://user-images.githubusercontent.com/29803617/156968065-82b8b845-cc6c-43b5-9f5b-1e5d5ed82b2b.png
   https://user-images.githubusercontent.com/29803617/156968082-23dda266-7a19-487d-a15e-52177fe6357c.png






Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread PengHui Li
> This is a global setting now. But I wonder if we should compress it only
> if the size is over a threshold?

+1

Penghui



Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread PengHui Li
Hi Zixuan,

Looks like you have added the wrong link for the proposal?
https://github.com/apache/pulsar/issues/14395 is for PIP-44

Penghui



Re: [DISCUSS] PIP-139 : Support Broker send command to real close producer/consumer.

2022-03-06 Thread guo jiwei
Hi Penghui,
   After testing, we can use #12136 to stop the replicator.


Regards
Jiwei Guo (Tboy)


On Sat, Mar 5, 2022 at 5:31 PM PengHui Li  wrote:

> > Great point. I was focused on deleting namespaces and missed the case
> > where the user wants to delete a topic from a replicated namespace. I
> > agree that we should make it possible to delete these topics without
> > removing the namespace replication.
>
> Oh, sorry. Thinking about it again: since
> https://github.com/apache/pulsar/pull/12136
> introduced topic-level geo-replication configuration, users can disable
> replication for a topic even if the namespace has geo-replication enabled.
>
> And topic-level policy was introduced in 2.6.0. I think it's time to make
> it one of the features turned on by default.
>
> So, we can follow these steps: stop the replication first, and then delete
> the topic from the clusters.
>
> > Note that it is already possible
> > to delete these topics by deleting with force or by configuring
> > inactive topic deletion.
>
> Hmm, I think neither of them works for now, because the producer of
> the replicator will always be active.
>
> > Considering users can delete replicated topics by deleting with force,
> > are you saying we need to provide a non-force way to delete these
> > topics?
>
> Yes, if the topic doesn't have users' producers/consumers, users don't need
> to force delete the topic; it's a normal deletion.
>
>
> --
> @mattison @jiwei could you please check if we can use
> https://github.com/apache/pulsar/pull/12136
> to stop the geo-replication first, and then delete the topic?
> If it works, we don't need to change the protocol.
>
> Thanks,
> Penghui
>
>
>
> On Thu, Mar 3, 2022 at 7:12 AM Michael Marshall wrote:
>
> > > The geo-replication's configuration can be centralized by using
> > > one configuration store. But that doesn't change anything, we should
> > > provide the same behavior for both centralized and decentralized
> > > configuration store.
> >
> > Perhaps I misused the word decentralized. I meant that the challenge
> > comes because geo-replication configurations for clusters are
> > independent from each other.
> >
> > > The current challenge is
> > > users usually set replicated clusters for a namespace, if remove
> > > the replication configuration, the entire namespace will be affected.
> > > We have supported setting the replicated configuration for a topic [1],
> > > only for 2.10.0 or later.
> >
> > Great point. I was focused on deleting namespaces and missed the case
> > where the user wants to delete a topic from a replicated namespace. I
> > agree that we should make it possible to delete these topics without
> > removing the namespace replication. Note that it is already possible
> > to delete these topics by deleting with force or by configuring
> > inactive topic deletion.
> >
> > > we should provide steps for how to delete an unused replicated
> > > topic and what the effects of not removing it properly are
> >
> > I agree, and I think documentation coupled with helpful error logs
> > will be very helpful here.
> >
> > Considering users can delete replicated topics by deleting with force,
> > are you saying we need to provide a non-force way to delete these
> > topics?
> >
> > If we want to provide a path to normal deletion, we already have an
> > `isRemote` flag on producers, and the inactive topic deletion code
> > uses this flag to determine if there are non-replication producers
> > connected to a topic. We could modify the deletion logic for a global
> > topic so that it can be deleted as long as the only producers
> > connected are remote producers. My main concern is that normal
> > deletion could allow users to miss the nuance that they must also
> > delete the topic in the other cluster. This might be an overly
> > cautious concern, though.
> >
> > Thanks,
> > Michael
> >
> >
> >
> > On Tue, Mar 1, 2022 at 1:55 AM PengHui Li  wrote:
> > >
> > > > > I agree with you that it'd be nice to provide the same deletion
> > > > > behavior. However, because geo-replication's configuration is
> > > > > decentralized, I think namespace or topic deletion is more complicated
> > > > > than unreplicated deletion. Note that users cannot currently delete
> > > > > namespaces that are configured with remote replication clusters.
> > > >
> > > > The geo-replication's configuration can be centralized by using
> > > > one configuration store. But that doesn't change anything, we should
> > > > provide the same behavior for both centralized and decentralized
> > > > configuration store.
> > > >
> > > > > Note that while a user cannot explicitly delete a replicated topic,
> > > > > they can remove the replication configuration in cluster A and cluster
> > > > > B, and then they are left with unreplicated namespaces and topics,
> > > > > which can be deleted.
> > >
> > > Yes, stopping the replication first is a more elegant approach,
> > > and it is also very easy to understand for users. The


Re: [DISCUSS] Improve unit test stability

2022-03-06 Thread Haiting Jiang
+1 for this great idea.
Although I am not sure if there is an easy and accurate way to "find new or 
modified unit tests".

Thanks,
Haiting

On 2022/03/07 03:11:47 包子 wrote:
> Hi, I want to discuss how to improve the stability of our unit tests. I
> found that most flaky unit tests can be reproduced locally simply by
> running them a few more times.
>
> Can we add the following mandatory constraints to ensure the stability of
> the unit tests?
> 1. If a PR includes new or modified unit tests, they need to be run
> repeatedly on CI more than n times.
> 2. We can write scripts that parse the git change records to find new or
> modified unit tests.
> 
> 


Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread Zixuan Liu
Hi Mattison,

Thanks for your feedback!  I think using two configurations is more
flexible, and users can set up different compression types.

Best,
Zixuan

On Sun, Mar 6, 2022 at 08:41, mattison chao wrote:

> Hi, Zi Xuan
>
> After thinking it over, I have another question:
>
> Why don't we combine ledger compression and cursor compression into one
> configuration switch?
>
> I'm not sure whether we have users who need to set the compression
> configuration for ledger and cursor separately. I think if both were
> set, they would be set to the same value.
>
>
> We could add a new configuration ``xxxCompressionType`` to control both
> ledger and cursor compression, and deprecate
> ``managedLedgerInfoCompressionType`` while keeping it for compatibility.
>
> After that, if users set ``managedLedgerInfoCompressionType``, we
> compress only the ledger; if they set the other configuration,
> ``xxxCompressionType``, we compress both the ledger and the cursor.
>
>
> It's just a question or suggestion. Feel free to go ahead.
>
> Best,
> Mattison
>
>
>
> > On Mar 6, 2022, at 8:22 AM, mattison chao wrote:
> >
> >
> > Great work!
> >
> > I have no comments other than the compatibility concern everybody
> > mentioned.
> >
> > Best,
> > Mattison
> >
> >> On Mar 6, 2022, at 4:55 AM, Enrico Olivelli wrote:
> >>
> >> Good proposal.
> >> It is important that this is disabled by default, otherwise we cannot
> >> easily support the rollback.
> >>
> >> Apart from that I don't have other comments
> >>
> >>
> >>
> >> Enrico
> >>
> >> On Sat, 5 Mar 2022, 11:22 PengHui Li wrote:
> >>
> >>> Hi Zixuan,
> >>>
> >>> We should add a compatibility part to the proposal.
> >>> And we should also provide steps to roll back to the old version
> >>> after compression has been enabled in the new version.
> >>>
> >>> I don't have objections to the proposal, and we have done
> >>> the same enhancement for topic metadata
> >>> https://github.com/apache/pulsar/pull/11490,
> >>> and this proposal also follows the same way.
> >>>
> >>> Thanks,
> >>> Penghui
> >>>
> >>> On Thu, Mar 3, 2022 at 10:26 AM Zixuan Liu  wrote:
> >>>
> >>>> Thank you for your feedback.
> >>>>
> >>>> Forward compatibility is required:
> >>>>
> >>>> 1. Get CursorInfo from the metadata store: We will check the cursor
> >>>> data header; if it is compressed, we will parse the bytes in the
> >>>> compressed format, otherwise we will parse the cursor data directly in
> >>>> the original way.
> >>>>
> >>>> 2. Add/Update CursorInfo to the metadata store: Compression is used by
> >>>> default if a compression type is specified, otherwise we will put
> >>>> this data into the metadata store directly.
> 
> 
> 
> > On Mar 2, 2022, at 11:48 PM, Zike Yang wrote:
> >
> > Hi, Zixuan
> > Thanks for creating this PIP.  Here are my thoughts.
> >
> >> CursorInfo compression format
> >>
> >> [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >> [MANAGED_CURSOR_INFO_PAYLOAD]
> >>
> >> MAGIC_NUMBER: 0x4779
> >
> > Since we change the ManagedCursorInfo data format here, how do we
> > handle the old data format in ZK? Could you explain the
> > compatibility for this PIP?
> >
> > Thanks,
> > Zike Yang
> >
> > On Wed, Mar 2, 2022 at 3:34 PM Zixuan Liu wrote:
> >>
> >> Hi Pulsar Community,
> >>
> >> I created a proposal for ManagedCursorInfo compression. The proposal
> >> can be found at: https://github.com/apache/pulsar/issues/14529
> >>
> >> Thanks,
> >> Zixuan
> >>
> >> --
> >>
> >> Motivation
> >>
> >> The cursor data is managed by the ZooKeeper/etcd metadata store. As the
> >> cursor data grows, pulling it takes more and more time. Therefore, it
> >> is necessary to add compression for the cursor, which reduces both the
> >> data size and the time needed to pull the data.
> >>
> >> Goal
> >>
> >> Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the ManagedCursorInfo.
> >>
> >> Implementation
> >>
> >> CursorInfo compression format
> >>
> >> [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >> [MANAGED_CURSOR_INFO_PAYLOAD]
> >>
> >> MAGIC_NUMBER: 0x4779
> >>
> >> METADATA
> >> Add a message named ManagedCursorInfoMetadata to MLDataFormats.proto:
> >>
> >> message ManagedCursorInfoMetadata {
> >>  required CompressionType compressionType = 1;
> >>  required int32 uncompressedSize = 2;
> >> }
> >>
> >> CursorInfo compression and decompression design
> >>
> >> Currently, these compression types are already defined and implemented
> >> by Pulsar, so we only need to deal with compression and decompression of
>

Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread Zixuan Liu
Hi Haiting,

Good catch! I can add a threshold to decide to compress or not.

Best,
Zixuan


On Sun, Mar 6, 2022 at 12:04, Haiting Jiang wrote:

> This is a global setting now. But I wonder if we should compress it only
> if the size is over a threshold?  Because:
> 1. It's not easy for us to notice in advance that some managed cursor info
> is too large; normally it is found only once it has an actual impact. But if
> we enable this compression in advance, it will take some extra computing
> resources.
> 2. It seems uncommon for the managed cursor info to be too large (only
> when there are a lot of individualDeletedMessages and
> batchedEntryDeletionIndexInfo entries). So it is not really necessary to
> compress all managed cursor info.
>
> Regards,
> Haiting
>
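
The threshold idea discussed in this message could look roughly like the
Python sketch below. The threshold value, the names store_value and
COMPRESSION_THRESHOLD_BYTES, and the use of zlib are assumptions for
illustration; the thread does not specify any of them.

```python
import zlib

# Hypothetical default; the thread does not propose a concrete value.
COMPRESSION_THRESHOLD_BYTES = 16 * 1024


def store_value(serialized: bytes, threshold: int = COMPRESSION_THRESHOLD_BYTES):
    """Compress only when the serialized cursor info exceeds the threshold."""
    if len(serialized) > threshold:
        return "zlib", zlib.compress(serialized)
    # Small cursor info is stored as-is, so the common case costs no CPU.
    return "none", serialized


small = store_value(b"a" * 10, threshold=100)
large = store_value(b"a" * 1000, threshold=100)
```

With this shape, only the rare oversized cursor info pays the compression
cost, which matches the two reasons given above.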


Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread Zixuan Liu
Hi PengHui,

Sorry, the correct URL is https://github.com/apache/pulsar/issues/14529.

:( Because of a subscription problem, the emails here are very confusing.


On Mon, Mar 7, 2022 at 12:39, PengHui Li wrote:

> Hi Zixuan,
>
> Looks like you have added the wrong link for the proposal?
> https://github.com/apache/pulsar/issues/14395 is for PIP-44
>
> Penghui
>
> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li  wrote:
>
> > > This is a global setting now. But I wonder if we should compress it
> only
> > if the size
> > is over a threshold?
> >
> > +1
> >
> > Penghui
> >
> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli 
> > wrote:
> >
> >> Il Dom 6 Mar 2022, 05:04 Haiting Jiang  ha
> >> scritto:
> >>
> >> > This is a global setting now. But I wonder if we should compress it
> only
> >> > if the size
> >> > is over a threshold?
> >>
> >>
> >> Good idea
> >>
> >> Enrico
> >>
> >>
> >>   Because:
> >> > 1. It's not easy for us to notice some managed cursor info is too
> large
> >> in
> >> > advance,  normally it would be found only if it have actual impact.
> But
> >> if
> >> > we enable this compression in advance, it will took some extra
> computing
> >> > resources.
> >> > 2. It seems that it won't be a common case that this managed cursor
> info
> >> > is too large (only if there are a lot individualDeletedMessages and
> >> > batchedEntryDeletionIndexInfo). So not quite necessary to compress all
> >> > managed cursor info.
> >> >
> >> > Regards,
> >> > Haiting
> >> >
> >> >
> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> >> > > Hi Pulsar Community,
> >> > >
> >> > >
> >> > > I create a proposal that support ManagedCursorInfo compression.
> >> > >
> >> > > The proposal can be found:
> >> https://github.com/apache/pulsar/issues/14395
> >> > >
> >> > >
> >> > > Motivation
> >> > >
> >> > > The cursor data is managed by ZooKeeper/etcd metadata store. When
> >> > > cursor data becomes more and more, the data size will increase and
> >> > > will take a lot of time to pull the data. Therefore, it is necessary
> >> > > to add compression for the cursor, which can reduce the size of data
> >> > > and reduce the time of pulling data.
> >> > > Goal
> >> > >
> >> > > Support use the LZ4/ZLIB/ZSTD/SNAPPY to compress the
> >> ManagedCursorInfo.
> >> > > Implementation
> >> > >
> >> > > - Cursor compression format
> >> > >   [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >> > >   [MANAGED_CURSOR_INFO_PAYLOAD]
> >> > >
> >> > > - MAGIC_NUMBER
> >> > >   0x4779
> >> > >
> >> > > - METADATA
> >> > >   Add a message named ManagedCursorInfoMetadata to
> >> > >   MLDataFormats.proto:
> >> > >   message ManagedCursorInfoMetadata {
> >> > >     required CompressionType compressionType = 1;
> >> > >     required int32 uncompressedSize = 2;
> >> > >   }
> >> > >
> >> > > These compression types are already supported, so we only need to
> >> > > handle compression and decompression of the ManagedCursorInfo data:
> >> > >
> >> > > - Get CursorInfo from the metadata store
> >> > >   We will check the cursor data header: if it is compressed, we will
> >> > >   parse the bytes using the compressed format, otherwise the
> >> > >   original way.
> >> > >
> >> > > - Add/Update CursorInfo to the metadata store
> >> > >   Compression is used by default if the compression type is
> >> > >   specified.
> >> > >
> >> > >
> >> > > Thanks,
> >> > > Zixuan
> >> > >
> >> >
> >>
> >
>


Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-06 Thread Zixuan Liu
Hi everyone,

Good catch! I have updated my proposal at
https://github.com/apache/pulsar/issues/14529 and appended a compatibility
section:

1. Compression is disabled by default.
2. We need to consider how to handle existing data once compression has
been enabled. If the cursor data starts with the compressed-format header,
we parse the bytes using the compressed format; otherwise we parse the
cursor data directly, the original way.
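
As a rough illustration of point 2 (a sketch only, not the actual Pulsar
code), the header check on the read path could look like the following.
The field widths (2-byte magic, 4-byte metadata size), the fixed 8-byte
metadata layout standing in for the protobuf ManagedCursorInfoMetadata, the
class name CursorInfoCodec, and the ZLIB type id are all assumptions made
for this example; only the magic number and the field order come from the
proposal.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not a Pulsar class.
final class CursorInfoCodec {
    // Magic number from the proposal; the 2-byte width is an assumption.
    static final short MAGIC = 0x4779;
    // [MAGIC_NUMBER] (2 bytes) + [METADATA_SIZE] (4 bytes), widths assumed.
    static final int HEADER_SIZE = 2 + 4;
    // Stand-in for the protobuf metadata: two ints
    // (compression type id, uncompressed size).
    static final int METADATA_SIZE = 8;
    // Hypothetical id for ZLIB; the real value comes from CompressionType.
    static final int TYPE_ZLIB = 1;

    // Write path: compress with ZLIB and prepend header and metadata.
    static byte[] encode(byte[] cursorInfo) {
        Deflater deflater = new Deflater();
        deflater.setInput(cursorInfo);
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            compressed.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();

        ByteBuffer out = ByteBuffer.allocate(
                HEADER_SIZE + METADATA_SIZE + compressed.size());
        out.putShort(MAGIC).putInt(METADATA_SIZE);
        out.putInt(TYPE_ZLIB).putInt(cursorInfo.length);
        out.put(compressed.toByteArray());
        return out.array();
    }

    // Read path: data without the magic number is treated as legacy,
    // uncompressed cursor info and returned unchanged.
    static byte[] decode(byte[] stored) {
        ByteBuffer in = ByteBuffer.wrap(stored);
        if (stored.length < HEADER_SIZE || in.getShort() != MAGIC) {
            return stored;
        }
        int metadataSize = in.getInt();
        int type = in.getInt();
        int uncompressedSize = in.getInt();
        if (type != TYPE_ZLIB) {
            throw new IllegalArgumentException("unsupported type: " + type);
        }
        int payloadStart = HEADER_SIZE + metadataSize;
        Inflater inflater = new Inflater();
        inflater.setInput(stored, payloadStart, stored.length - payloadStart);
        byte[] result = new byte[uncompressedSize];
        try {
            int off = 0;
            while (off < result.length) {
                int n = inflater.inflate(result, off, result.length - off);
                if (n == 0) {
                    throw new IllegalStateException("truncated payload");
                }
                off += n;
            }
            return result;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt cursor info", e);
        } finally {
            inflater.end();
        }
    }
}
```

With this shape, old entries written before compression was enabled keep
working, because decode() hands them back untouched and the caller parses
them as a plain ManagedCursorInfo.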

Zixuan Liu  wrote on Mon, Mar 7, 2022 at 15:11:

> Hi PengHui,
>
> Sorry, the correct URL: https://github.com/apache/pulsar/issues/14529.
>
> :( Because of a problem with my subscription, the emails here are very
> confusing.
>
>
> PengHui Li  wrote on Mon, Mar 7, 2022 at 12:39:
>
>> Hi Zixuan,
>>
>> Looks like you have added the wrong link for the proposal?
>> https://github.com/apache/pulsar/issues/14395 is for PIP-44
>>
>> Penghui
>>
>> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li  wrote:
>>
>> > > This is a global setting now. But I wonder if we should compress it
>> > > only if the size is over a threshold?
>> >
>> > +1
>> >
>> > Penghui
>> >
>> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli  wrote:
>> >
>> >> On Sun, Mar 6, 2022, 05:04 Haiting Jiang  wrote:
>> >>
>> >> > This is a global setting now. But I wonder if we should compress it
>> >> > only if the size is over a threshold?
>> >>
>> >>
>> >> Good idea
>> >>
>> >> Enrico
>> >>
>> >>
>> >> > Because:
>> >> > 1. It's not easy for us to notice in advance that some managed cursor
>> >> > info is too large; normally it would be found only once it has an
>> >> > actual impact. But if we enable this compression in advance, it will
>> >> > take some extra computing resources.
>> >> > 2. It seems that it won't be a common case that this managed cursor
>> >> > info is too large (only if there are a lot of
>> >> > individualDeletedMessages and batchedEntryDeletionIndexInfo). So it is
>> >> > not really necessary to compress all managed cursor info.
>> >> >
>> >> > Regards,
>> >> > Haiting
>> >> >
>> >> >
>> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
>> >> > > Hi Pulsar Community,
>> >> > >
>> >> > >
>> >> > > I created a proposal to support ManagedCursorInfo compression.
>> >> > >
>> >> > > The proposal can be found at:
>> >> > > https://github.com/apache/pulsar/issues/14395
>> >> > >
>> >> > >
>> >> > > Motivation
>> >> > >
>> >> > > The cursor data is managed by the ZooKeeper/etcd metadata store. As
>> >> > > the cursor data grows, its size increases and it takes a long time
>> >> > > to pull the data. Therefore, it is necessary to add compression for
>> >> > > the cursor, which reduces both the size of the data and the time
>> >> > > needed to pull it.
>> >> > >
>> >> > > Goal
>> >> > >
>> >> > > Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the ManagedCursorInfo.
>> >> > >
>> >> > > Implementation
>> >> > >
>> >> > > - Cursor compression format
>> >> > >   [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
>> >> > >   [MANAGED_CURSOR_INFO_PAYLOAD]
>> >> > >
>> >> > > - MAGIC_NUMBER
>> >> > >   0x4779
>> >> > >
>> >> > > - METADATA
>> >> > >   Add a message named ManagedCursorInfoMetadata to
>> >> > >   MLDataFormats.proto:
>> >> > >   message ManagedCursorInfoMetadata {
>> >> > >     required CompressionType compressionType = 1;
>> >> > >     required int32 uncompressedSize = 2;
>> >> > >   }
>> >> > >
>> >> > > These compression types are already supported, so we only need to
>> >> > > handle compression and decompression of the ManagedCursorInfo data:
>> >> > >
>> >> > > - Get CursorInfo from the metadata store
>> >> > >   We will check the cursor data header: if it is compressed, we will
>> >> > >   parse the bytes using the compressed format, otherwise the
>> >> > >   original way.
>> >> > >
>> >> > > - Add/Update CursorInfo to the metadata store
>> >> > >   Compression is used by default if the compression type is
>> >> > >   specified.
>> >> > >
>> >> > >
>> >> > > Thanks,
>> >> > > Zixuan
>> >> > >
>> >> >
>> >>
>> >
>>
>