Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-07 Thread r...@apache.org
Hi Zixuan:

My main concern is whether this feature breaks backward compatibility. How
would it work with historical data or with old clusters?

--
Thanks
Xiaolong Ran

On Mon, Mar 7, 2022 at 15:14, Zixuan Liu wrote:

> Hi everyone,
>
> Good catch! I have updated my proposal at
> https://github.com/apache/pulsar/issues/14529 and appended a compatibility
> section:
>
> 1. Compression is disabled by default.
> 2. We need to consider how to migrate old data once compression is
> enabled. If the cursor data header is in the compressed format, we parse
> the bytes as compressed data; otherwise we parse the cursor data directly
> in the original way.
>
> On Mon, Mar 7, 2022 at 15:11, Zixuan Liu wrote:
>
> > Hi PengHui,
> >
> > Sorry, the correct URL: https://github.com/apache/pulsar/issues/14529.
> >
> > :( Because of the problem of subscription, the email here is very
> > confusing.
> >
> >
> > On Mon, Mar 7, 2022 at 12:39, PengHui Li wrote:
> >
> >> Hi Zixuan,
> >>
> >> Looks like you have added the wrong link for the proposal?
> >> https://github.com/apache/pulsar/issues/14395 is for PIP-44
> >>
> >> Penghui
> >>
> >> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li  wrote:
> >>
> >> > > This is a global setting now. But I wonder if we should compress it
> >> > > only if the size is over a threshold?
> >> >
> >> > +1
> >> >
> >> > Penghui
> >> >
> >> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli 
> >> > wrote:
> >> >
> >> >> On Sun, Mar 6, 2022 at 05:04, Haiting Jiang wrote:
> >> >>
> >> >> > This is a global setting now. But I wonder if we should compress it
> >> >> > only if the size is over a threshold?
> >> >>
> >> >>
> >> >> Good idea
> >> >>
> >> >> Enrico
> >> >>
> >> >>
> >> >> > Because:
> >> >> > 1. It is not easy to notice in advance that some managed cursor info
> >> >> > is too large; normally it is found only once it has an actual
> >> >> > impact. But if we enable this compression in advance, it will take
> >> >> > some extra computing resources.
> >> >> > 2. It does not seem to be a common case that the managed cursor info
> >> >> > is too large (only if there are a lot of individualDeletedMessages
> >> >> > and batchedEntryDeletionIndexInfo), so it is not really necessary to
> >> >> > compress all managed cursor info.
> >> >> >
> >> >> > Regards,
> >> >> > Haiting
> >> >> >
> >> >> >
> >> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> >> >> > > Hi Pulsar Community,
> >> >> > >
> >> >> > >
> >> >> > > I have created a proposal to support ManagedCursorInfo compression.
> >> >> > >
> >> >> > > The proposal can be found at:
> >> >> > > https://github.com/apache/pulsar/issues/14395
> >> >> > >
> >> >> > >
> >> >> > > Motivation
> >> >> > >
> >> >> > > The cursor data is managed by the ZooKeeper/etcd metadata store. As
> >> >> > > cursor data accumulates, its size increases and pulling it takes a
> >> >> > > long time. Therefore, it is necessary to add compression for the
> >> >> > > cursor, which reduces the data size and the time needed to pull it.
> >> >> > >
> >> >> > > Goal
> >> >> > >
> >> >> > > Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the ManagedCursorInfo.
> >> >> > >
> >> >> > > Implementation
> >> >> > >
> >> >> > > - Cursor compression format
> >> >> > >   [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >> >> > >   [MANAGED_CURSOR_INFO_PAYLOAD]
> >> >> > >
> >> >> > > - MAGIC_NUMBER
> >> >> > >   0x4779
> >> >> > >
> >> >> > > - METADATA
> >> >> > >   Add a message named ManagedCursorInfoMetadata to MLDataFormats.proto:
> >> >> > >
> >> >> > >   message ManagedCursorInfoMetadata {
> >> >> > >       required CompressionType compressionType = 1;
> >> >> > >       required int32 uncompressedSize = 2;
> >> >> > >   }
> >> >> > >
> >> >> > > These compression types are already supported, so we only need to
> >> >> > > handle compression and decompression of the ManagedCursorInfo data:
> >> >> > >
> >> >> > > - Get CursorInfo from the metadata store
> >> >> > >   We check the cursor data header. If it is compressed, we parse the
> >> >> > >   bytes in the compressed format; otherwise we parse them in the
> >> >> > >   original way.
> >> >> > >
> >> >> > > - Add/Update CursorInfo to the metadata store
> >> >> > >   Compression is used if the compression type is specified.
> >> >> > >
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Zixuan
> >> >> > >
> >> >> >
> >> >>
> >> >
> >>
> >
>


RE: [VOTE] Pulsar Release 2.8.3 Candidate 4

2022-03-07 Thread Masahiro Sakamoto
+1 (binding)

- Checked checksums and signatures
- Checked license headers using Apache Rat
- Compiled the source
- Ran the standalone server
- Confirmed that producer and consumer work properly
- Validated functions, connectors, and stateful functions

Regards,

Masahiro Sakamoto
Yahoo Japan Corp.
E-mail: massa...@yahoo-corp.jp

-----Original Message-----
From: Michael Marshall  
Sent: Thursday, March 3, 2022 2:33 PM
To: Dev 
Subject: [VOTE] Pulsar Release 2.8.3 Candidate 4

This is the third release candidate for Apache Pulsar, version 2.8.3.

It fixes the following issues:
https://github.com/apache/pulsar/compare/v2.8.2...v2.8.3-candidate-4

*** Please download, test and vote on this release. This vote will stay open 
for at least 72 hours ***

Note that we are voting upon the source (tag), binaries are provided for 
convenience.

Source and binary files:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.3-candidate-4/

There are many checksums and signatures to validate, including 
apache-pulsar-2.8.3-bin.tar.gz, apache-pulsar-2.8.3-src.tar.gz, 
apache-pulsar-offloaders-2.8.3-bin.tar.gz, and all of the connectors.
All are located here:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.3-candidate-4/.

Unofficial Docker images:
michaelmarshall/pulsar:2.8.3-rc4
michaelmarshall/pulsar-all:2.8.3-rc4
michaelmarshall/pulsar-standalone:2.8.3-rc4
michaelmarshall/pulsar-grafana:2.8.3-rc4

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachepulsar-1145/

The tag to be voted upon:
v2.8.3-candidate-4 (ee87c7d6c20186ae59298a9a9ec1fdb2b09954c7)
https://github.com/apache/pulsar/releases/tag/v2.8.3-candidate-4

Pulsar's KEYS file containing PGP keys we use to sign the release:
https://dist.apache.org/repos/dist/dev/pulsar/KEYS

Please download the source package, and follow the README to build and run the 
Pulsar standalone service.


Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-07 Thread Enrico Olivelli
On Mon, Mar 7, 2022 at 09:16, r...@apache.org wrote:

> Hi Zixuan:
>
> Here I am more concerned about whether this feature will break backward
> compatibility, for historical data or old clusters, how do we use this
> feature.
>

It is disabled by default.
New code will be able to read uncompressed data, so you can safely update
Pulsar.
Once you enable the feature, you cannot roll back to previous versions.
Compressing only large values gives you a better chance of being able to
downgrade even after you have enabled the feature.
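
The read and write paths discussed in this thread could be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
proposed implementation: the magic number 0x4779 and the layout order come
from the proposal, but the 2-byte magic and 4-byte size widths, the
struct-packed stand-in for ManagedCursorInfoMetadata, the ZLIB type id, and
the THRESHOLD value are all hypothetical.

```python
import struct
import zlib

MAGIC = 0x4779      # magic number from the proposal
THRESHOLD = 1024    # hypothetical size threshold in bytes
ZLIB = 1            # hypothetical id for the compression type


def encode(raw: bytes, enabled: bool = True, threshold: int = THRESHOLD) -> bytes:
    """Compress only when the feature is enabled and the payload is large enough."""
    if not enabled or len(raw) < threshold:
        return raw  # legacy layout, still readable by older versions
    # Stand-in for ManagedCursorInfoMetadata: compression type + uncompressed size.
    metadata = struct.pack(">Bi", ZLIB, len(raw))
    # [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] + [PAYLOAD]
    header = struct.pack(">HI", MAGIC, len(metadata))
    return header + metadata + zlib.compress(raw)


def decode(stored: bytes) -> bytes:
    """Use the compressed layout if the magic number is present, else pass through."""
    if len(stored) < 6 or struct.unpack_from(">H", stored)[0] != MAGIC:
        return stored  # original, uncompressed format
    (meta_size,) = struct.unpack_from(">I", stored, 2)
    ctype, size = struct.unpack_from(">Bi", stored, 6)
    if ctype != ZLIB:
        raise ValueError(f"unsupported compression type {ctype}")
    raw = zlib.decompress(stored[6 + meta_size:])
    if len(raw) != size:
        raise ValueError("uncompressed size mismatch")
    return raw
```

With a threshold, small values keep the legacy layout, which is why a
downgrade is more likely to succeed. The sketch does not handle the case of a
legacy payload that happens to start with the magic bytes.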


Enrico




[VOTE] Pulsar Node.js Client Release 1.6.2 Candidate 1

2022-03-07 Thread Guangning E
Hi everyone,
Please review and vote on the release candidate #1 for the version 1.6.2,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This is the first release candidate for Apache Pulsar Node.js client,
version 1.6.2.

It fixes the following issues:
https://github.com/apache/pulsar-client-node/issues?q=label%3Arelease%2Fv1.6.2+

Please download the source files and review this release candidate:
- Review release notes
- Download the source package (verify shasum and asc) and follow the
README.md to build and run the Pulsar Node.js client.

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Source files:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-node/pulsar-client-node-1.6.2-candidate-1/

Pulsar's KEYS file containing PGP keys we use to sign the release:
https://dist.apache.org/repos/dist/dev/pulsar/KEYS

SHA-512 checksum:
9fa23565c247040c43ba363ae19e0c2b3e6f88dc3b26553b5e41262085b51d55ff357d6d7006ea1f3a714ab3b53107b42b8b8ac8ddf01f9f77f2bd1143aef93d
 pulsar-client-node-1.6.2.tar.gz
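
For voters who prefer to script the checksum step, a minimal sketch is shown
here. It assumes the tarball has already been downloaded locally; the helper
names sha512_of and matches are hypothetical, and only the standard-library
hashlib and pathlib modules are used.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published checksum string."""
    return sha512_of(path) == expected_hex.strip().lower()
```

Calling matches(Path("pulsar-client-node-1.6.2.tar.gz"), checksum) with the
checksum string published above should return True for an intact download.
The signature (.asc) still needs to be verified separately against the KEYS
file.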

The tag to be voted upon:
v1.6.2-rc.1
https://github.com/apache/pulsar-client-node/releases/tag/v1.6.2-rc.1


Re: [VOTE] Pulsar Release 2.8.3 Candidate 4

2022-03-07 Thread Enrico Olivelli
+1 (binding)
- verified checksums, signatures, RAT, spotbugs... (using JDK11 on Mac)
- built the docker image
- built a couple of applications and ran their tests
- ran the tests of a few applications using the Docker image (built from
source); those tests cover a good part of Transactions Java client
usages and Pulsar IO
- ran the Jakarta JMS 2.0 TCK (technology compatibility kit)

Thank you very much Michael

Enrico


On Mon, Mar 7, 2022 at 09:25, Masahiro Sakamoto wrote:
>
> +1 (binding)
>
> - Checked checksums and signatures
> - Checked license headers using Apache Rat
> - Compiled the source
> - Ran the standalone server
> - Confirmed that producer and consumer work properly
> - Validated functions, connectors, and stateful functions
>
> Regards,
>
> Masahiro Sakamoto
> Yahoo Japan Corp.
> E-mail: massa...@yahoo-corp.jp
>
> -Original Message-
> From: Michael Marshall 
> Sent: Thursday, March 3, 2022 2:33 PM
> To: Dev 
> Subject: [VOTE] Pulsar Release 2.8.3 Candidate 4
>
> This is the third release candidate for Apache Pulsar, version 2.8.3.
>
> It fixes the following issues:
> https://github.com/apache/pulsar/compare/v2.8.2...v2.8.3-candidate-4
>
> *** Please download, test and vote on this release. This vote will stay open 
> for at least 72 hours ***
>
> Note that we are voting upon the source (tag), binaries are provided for 
> convenience.
>
> Source and binary files:
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.3-candidate-4/
>
> There are many checksums and signatures to validate, including 
> apache-pulsar-2.8.3-bin.tar.gz, apache-pulsar-2.8.3-src.tar.gz, 
> apache-pulsar-offloaders-2.8.3-bin.tar.gz, and all of the connectors.
> All are located here:
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.3-candidate-4/.
>
> Unofficial Docker images:
> michaelmarshall/pulsar:2.8.3-rc4
> michaelmarshall/pulsar-all:2.8.3-rc4
> michaelmarshall/pulsar-standalone:2.8.3-rc4
> michaelmarshall/pulsar-grafana:2.8.3-rc4
>
> Maven staging repo:
> https://repository.apache.org/content/repositories/orgapachepulsar-1145/
>
> The tag to be voted upon:
> v2.8.3-candidate-4 (ee87c7d6c20186ae59298a9a9ec1fdb2b09954c7)
> https://github.com/apache/pulsar/releases/tag/v2.8.3-candidate-4
>
> Pulsar's KEYS file containing PGP keys we use to sign the release:
> https://dist.apache.org/repos/dist/dev/pulsar/KEYS
>
> Please download the source package, and follow the README to build and run 
> the Pulsar standalone service.


Re: [DISCUSS] Improve unit test stability

2022-03-07 Thread Enrico Olivelli
On Mon, Mar 7, 2022 at 07:46, Haiting Jiang wrote:

> +1 for this great idea.
> Although I am not sure if there is an easy and accurate way to "find new
> or modified unit tests".
>
> Thanks,
> Haiting
>
> On 2022/03/07 03:11:47 包子 wrote:
> > Hi, I want to discuss how to improve the stability of unit testing. I
> > found that most flaky unit tests can be reproduced locally and only need
> > to be executed a few more times.
> >
> > Can we add the following mandatory constraints to ensure the stability
> > of unit testing of code?
> > 1. If new or modified unit tests are included in a PR, they need to be
> > run repeatedly on CI more than n times.
> > 2. We can write scripts that parse git change records to find new or
> > modified unit tests.
>

I am not sure that this helps.
You can still be very lucky, and unfortunately the impact is often on
other tests, not the new tests.

I believe that the best weapons we have are:
- Good code reviews
- If there is an error on CI, don't default to 'rerun-failure-checks' but
look carefully into the errors (we could disable the ability to rerun
failed tests for non-committers, giving more control over CI to the
committer who is sponsoring a patch)


Enrico


>
> >
>


Re: [DISCUSS] Improve unit test stability

2022-03-07 Thread Baozi
Thanks for your reply. I have a somewhat different viewpoint; please have a look.

> I am not sure that this helps.
> You can still be very lucky and unfortunately many times the impact is on
> other tests, not the new tests.

Yes, this cannot prevent other unit tests from being affected by the PR. However,
it can ensure the stability of the unit tests that are new or modified in the
current PR. I think that is useful.

> I believe that the best weapons we have are:
> - Good code reviews

Yes, we need good code review. But when the unit tests pass, we usually assume
that the PR is OK. I'm not sure code review alone can guarantee stability.

> - if there is a error on CI, don't default to 'rerun-failure-checks' but
> look carefully into the errors (we could disable the ability to rerun
> failed tests to non committers and so giving more control on CI to the
> committer who is sponsoring a patch)

If this feature is disabled, there may be the following problems:
1. Merging each PR becomes less efficient.
2. The owner of the PR has to deal with additional work.


To sum up, I think the essential problem is that we need to ensure, as far as
possible, that the unit tests involved in each PR are stable before merging. Let
the owner of the PR handle the stability of the tests involved in the PR. A good
code review is one of the mechanisms, but we need more effective and enforceable
means of checking.

I hope to get more suggestions and actions. Otherwise, even after we have fixed
the existing flaky tests, new ones will keep appearing as more features are added.
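
As a rough illustration of the "parse git change records" idea, the sketch
here filters the output of `git diff --name-status` for added, modified, or
renamed test files. It is only a sketch under assumptions: the function name
changed_tests is hypothetical, and it assumes tests live under a src/test/
directory in files ending in Test.java, which may not hold for every module.

```python
from typing import List


def changed_tests(name_status: str) -> List[str]:
    """Return added, modified, or renamed test files from `git diff --name-status` output."""
    tests = []
    for line in name_status.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # For renames (e.g. R090) the last column is the new path.
        status, path = parts[0], parts[-1]
        # Assumed convention: tests are *Test.java files under src/test/.
        if status[:1] in ("A", "M", "R") and "src/test/" in path and path.endswith("Test.java"):
            tests.append(path)
    return tests
```

A CI job could feed the resulting list to the test runner and repeat those
tests n times. As noted earlier in the thread, this does not catch flakiness
that a change introduces in other, unmodified tests.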

> PS: Maybe I'm worried too much. Is it normal to have a lot of flaky tests?




Re: [DISCUSS] Releasing pulsar-client-go 0.8.1

2022-03-07 Thread Sijie Guo
+1

On Sun, Mar 6, 2022 at 6:46 PM r...@apache.org 
wrote:

> +1
>
> --
> Thanks
> Xiaolong Ran
>
> On Sat, Mar 5, 2022 at 18:10, PengHui Li wrote:
>
> > +1
> >
> > Penghui
> >
> > On Sat, Mar 5, 2022 at 4:58 AM Matteo Merli 
> > wrote:
> >
> > > +1 Thanks Rui, we should eliminate the GPL dependency ASAP.
> > >
> > >
> > >
> > > --
> > > Matteo Merli
> > > 
> > >
> > > On Thu, Mar 3, 2022 at 2:08 AM Rui Fu  wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > > I would like to start a discussion here about starting a new release
> > > > > of pulsar-client-go v0.8.1. Recently we have received some dependency
> > > > > update PRs from the community. [1] bumps `github.com/beefsack/go-rate`
> > > > > to the latest version, which migrates the license from GPL to MIT. [2]
> > > > > bumps `github.com/prometheus/client_golang` to address
> > > > > CVE-2022-21698. For more details, please check the links below.
> > > > >
> > > > > As v0.8.0 was released just weeks ago and the next release will start
> > > > > about 2 months later, I think we should start the release of v0.8.1.
> > > >
> > > > [1]: https://github.com/apache/pulsar-client-go/pull/735
> > > > [2]: https://github.com/apache/pulsar-client-go/pull/738
> > > >
> > > > --
> > > >
> > > > Best Regards,
> > > >
> > > > Rui Fu
> > >
> >
>


[ANNOUNCE] New Committer: Andrey Yegorov

2022-03-07 Thread Dave Fisher
The Apache Pulsar Project Management Committee (PMC) has invited Andrey Yegorov
https://github.com/dlg99 to become a committer and we are pleased to
announce that he has accepted.

Andrey has made great contributions to Pulsar, including Connector and Adaptor
work along with updating dependencies for CVEs.
Welcome and Congratulations, Andrey!

Please join us in congratulating and welcoming Andrey onboard!

Best Regards,
Dave Fisher on behalf of the Pulsar PMC

Re: [Discuss] Create Pulsar client release notes

2022-03-07 Thread Sijie Guo
Yu, thank you for driving this effort! Great to see a proposal focusing on
this! +1

On Fri, Mar 4, 2022 at 2:59 AM Yu  wrote:

> Hi Pulsarers,
>
> For Pulsar release notes [1], we had issues below for a long time.
>
> - All contents are on a single Pulsar release note page. No navigations.
> It's easy to get lost and hard to understand which changes belong to
> which Pulsar version.
> - Java, C++, Python, WebSocket clients do not have independent release
> notes, they belong to parts of Pulsar release notes. It is hard to find and
> use.
> - Go, Node.js, C# clients’ changelogs are hosted in their own GitHub repos
> and not shown on the Pulsar website. Users need more clicks to get the info.
> - ...
>
> Recently, we got some negative feedback from users. I've submitted some
> changes [2] but it's a stopgap.
>
> To completely solve these problems, I propose to make some changes in PIP
> 148 [3], including but not limited to:
>
> - Create a "Release notes" chapter to docs, which shows all releases
> changes and release timeline.
> - Add necessary explanations, such as time-based release plan, release
> frequency, semantic versioning, maintenance life cycle, etc.
> - Create independent release notes for all clients.
> - Automate the process of generating all release notes, which is
> relevant to PIP 112 [4].
> - Add navigations, etc.
> - ...
>
>
> Here are mockups [5]. You can check and comment without login.
>
> ACTION: Please provide your feedback within 72 hours. **If there is no
> discussion or objection, we’ll implement them as shown in the mockups.**
>
> We’d love your feedback! Thanks!
>
> [1] https://pulsar.apache.org/en/release-notes/
> [2] https://github.com/apache/pulsar/pull/14430
> [3] https://docs.google.com/document/d/1o6MWV3GvXQgKw1ZpL86y43xjnYvaUKrl6HxJKZnQyEU/edit#heading=h.35s8x8c7ja4c
> [4] https://docs.google.com/document/d/1Ul2qIChDe8QDlDwJBICq1VviYZhdk1djKJJC5wXAGsI/edit#
> [5] https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=0
>
>


Re: [ANNOUNCE] New Committer: Andrey Yegorov

2022-03-07 Thread Enrico Olivelli
Congratulations!

Enrico

On Mon, Mar 7, 2022 at 20:26, Dave Fisher wrote:

> The Apache Pulsar Project Management Committee (PMC) has invited Andrey
> Yegorov
> https://github.com/dlg99 to become a committer and we are pleased to
> announce that he has accepted.
>
> Andrey has made great contributions to Pulsar, including Connector and
> Adaptor work along with updating dependencies for CVEs.
> Welcome and Congratulations, Andrey!
>
> Please join us in congratulating and welcoming Andrey onboard!
>
> Best Regards,
> Dave Fisher on behalf of the Pulsar PMC


Re: [ANNOUNCE] New Committer: Andrey Yegorov

2022-03-07 Thread Andrey Yegorov
Thank you!

On Mon, Mar 7, 2022 at 12:23 PM Enrico Olivelli  wrote:

> Congratulations!
>
> Enrico
>
> On Mon, Mar 7, 2022 at 20:26, Dave Fisher wrote:
>
> > The Apache Pulsar Project Management Committee (PMC) has invited Andrey
> > Yegorov
> > https://github.com/dlg99 to become a committer and we are pleased to
> > announce that he has accepted.
> >
> > Andrey has made great contributions to Pulsar, including Connector and
> > Adaptor work along with updating dependencies for CVEs.
> > Welcome and Congratulations, Andrey!
> >
> > Please join us in congratulating and welcoming Andrey onboard!
> >
> > Best Regards,
> > Dave Fisher on behalf of the Pulsar PMC
>


-- 
Andrey Yegorov


Re: [Discuss] Create Pulsar client release notes

2022-03-07 Thread Huanli Meng
+1

BR//Huanli

> On Mar 8, 2022, at 3:52 AM, Sijie Guo  wrote:
> 
> Yu, thank you for driving this effort! Great to see a proposal focusing on 
> this! +1
> 
> On Fri, Mar 4, 2022 at 2:59 AM Yu wrote:
> Hi Pulsarers,
> 
> For Pulsar release notes [1], we had issues below for a long time. 
> 
> - All contents are on a single Pulsar release note page. No navigations. It's 
> easy to get lost and hard to understand which changes belong to which Pulsar 
> version.
> - Java, C++, Python, WebSocket clients do not have independent release notes, 
> they belong to parts of Pulsar release notes. It is hard to find and use.
> - Go, Node.js, C# clients’ changelogs are hosted in their own GitHub repos 
> and not shown on the Pulsar website. Users need more clicks to get the info.
> - ...
> 
> Recently, we got some negative feedback from users. I've submitted some 
> changes [2] but it's a stopgap.
> 
> To completely solve these problems, I propose to make some changes in PIP 148 
> [3], including but not limited to:
> 
> - Create a "Release notes" chapter to docs, which shows all releases changes 
> and release timeline.
> - Add necessary explanations, such as time-based release plan, release 
> frequency, semantic versioning, maintenance life cycle, etc.
> - Create independent release notes for all clients.
> - Automate the process of generating all release notes, which is relevant to 
> PIP 112 [4].
> - Add navigations, etc.
> - ...
> 
> 
> Here are mockups [5]. You can check and comment without login. 
> 
> ACTION: Please provide your feedback within 72 hours. **If there is no 
> discussion or objection, we’ll implement them as shown in the mockups.**
> 
> We’d love your feedback! Thanks!
> 
> [1] https://pulsar.apache.org/en/release-notes/
> [2] https://github.com/apache/pulsar/pull/14430
> [3] https://docs.google.com/document/d/1o6MWV3GvXQgKw1ZpL86y43xjnYvaUKrl6HxJKZnQyEU/edit#heading=h.35s8x8c7ja4c
> [4] https://docs.google.com/document/d/1Ul2qIChDe8QDlDwJBICq1VviYZhdk1djKJJC5wXAGsI/edit#
> [5] https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=0
> 



Re: [VOTE] Pulsar Release 2.8.3 Candidate 4

2022-03-07 Thread PengHui Li
+1 (binding)

- Checked signatures
- Run standalone, publish/consume messages
- Checked connector and stateful function

Penghui


On Mon, Mar 7, 2022 at 7:28 PM Enrico Olivelli  wrote:

> +1 (binding)
> - verified checksums, signatures, RAT, spotbugs...(using JDK11 on Mac)
> - built the docker image
> - built a couple of applications and run their tests
> - run tests of a few applications using the Docker image (built from
> source), those tests cover a good part of Transactions Java client
> usages and Pulsar IO
> - run Jakarta JMS 2.0 TCK (technology compatibility kit)
>
> Thank you very much Michael
>
> Enrico
>
>
> On Mon, Mar 7, 2022 at 09:25, Masahiro Sakamoto wrote:
> >
> > +1 (binding)
> >
> > - Checked checksums and signatures
> > - Checked license headers using Apache Rat
> > - Compiled the source
> > - Ran the standalone server
> > - Confirmed that producer and consumer work properly
> > - Validated functions, connectors, and stateful functions
> >
> > Regards,
> >
> > Masahiro Sakamoto
> > Yahoo Japan Corp.
> > E-mail: massa...@yahoo-corp.jp
> >
> > -Original Message-
> > From: Michael Marshall 
> > Sent: Thursday, March 3, 2022 2:33 PM
> > To: Dev 
> > Subject: [VOTE] Pulsar Release 2.8.3 Candidate 4
> >
> > This is the third release candidate for Apache Pulsar, version 2.8.3.
> >
> > It fixes the following issues:
> > https://github.com/apache/pulsar/compare/v2.8.2...v2.8.3-candidate-4
> >
> > *** Please download, test and vote on this release. This vote will stay
> open for at least 72 hours ***
> >
> > Note that we are voting upon the source (tag), binaries are provided for
> convenience.
> >
> > Source and binary files:
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.3-candidate-4/
> >
> > There are many checksums and signatures to validate, including
> apache-pulsar-2.8.3-bin.tar.gz, apache-pulsar-2.8.3-src.tar.gz,
> apache-pulsar-offloaders-2.8.3-bin.tar.gz, and all of the connectors.
> > All are located here:
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.3-candidate-4/.
> >
> > Unofficial Docker images:
> > michaelmarshall/pulsar:2.8.3-rc4
> > michaelmarshall/pulsar-all:2.8.3-rc4
> > michaelmarshall/pulsar-standalone:2.8.3-rc4
> > michaelmarshall/pulsar-grafana:2.8.3-rc4
> >
> > Maven staging repo:
> > https://repository.apache.org/content/repositories/orgapachepulsar-1145/
> >
> > The tag to be voted upon:
> > v2.8.3-candidate-4 (ee87c7d6c20186ae59298a9a9ec1fdb2b09954c7)
> > https://github.com/apache/pulsar/releases/tag/v2.8.3-candidate-4
> >
> > Pulsar's KEYS file containing PGP keys we use to sign the release:
> > https://dist.apache.org/repos/dist/dev/pulsar/KEYS
> >
> > Please download the source package, and follow the README to build and
> run the Pulsar standalone service.
>


Re: [Discuss] Create Pulsar client release notes

2022-03-07 Thread PengHui Li
+1

Penghui

On Tue, Mar 8, 2022 at 8:39 AM Huanli Meng 
wrote:

> +1
>
> BR//Huanli
>
> > On Mar 8, 2022, at 3:52 AM, Sijie Guo  wrote:
> >
> > Yu, thank you for driving this effort! Great to see a proposal focusing
> on this! +1
> >
> > On Fri, Mar 4, 2022 at 2:59 AM Yu wrote:
> > Hi Pulsarers,
> >
> > For Pulsar release notes [1], we had issues below for a long time.
> >
> > - All contents are on a single Pulsar release note page. No navigations.
> It's easy to get lost and hard to understand which changes belong to which
> Pulsar version.
> > - Java, C++, Python, WebSocket clients do not have independent release
> notes, they belong to parts of Pulsar release notes. It is hard to find and
> use.
> > - Go, Node.js, C# clients’ changelogs are hosted in their own GitHub
> repos and not shown on the Pulsar website. Users need more clicks to get
> the info.
> > - ...
> >
> > Recently, we got some negative feedback from users. I've submitted some
> changes [2] but it's a stopgap.
> >
> > To completely solve these problems, I propose to make some changes in
> PIP 148 [3], including but not limited to:
> >
> > - Create a "Release notes" chapter to docs, which shows all releases
> changes and release timeline.
> > - Add necessary explanations, such as time-based release plan, release
> frequency, semantic versioning, maintenance life cycle, etc.
> > - Create independent release notes for all clients.
> > - Automate the process of generating all release notes, which is
> relevant to PIP 112 [4].
> > - Add navigations, etc.
> > - ...
> >
> >
> > Here are mockups [5]. You can check and comment without login.
> >
> > ACTION: Please provide your feedback within 72 hours. **If there is no
> discussion or objection, we’ll implement them as shown in the mockups.**
> >
> > We’d love your feedback! Thanks!
> >
> > [1] https://pulsar.apache.org/en/release-notes/
> > [2] https://github.com/apache/pulsar/pull/14430
> > [3] https://docs.google.com/document/d/1o6MWV3GvXQgKw1ZpL86y43xjnYvaUKrl6HxJKZnQyEU/edit#heading=h.35s8x8c7ja4c
> > [4] https://docs.google.com/document/d/1Ul2qIChDe8QDlDwJBICq1VviYZhdk1djKJJC5wXAGsI/edit#
> > [5] https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=0
> >
>
>


Re: [DISCUSS] Releasing pulsar-client-go 0.8.1

2022-03-07 Thread Rui Fu
Thanks for all your votes. I will start working on the release.

On 2022/03/07 19:14:00 Sijie Guo wrote:
> +1
> 
> On Sun, Mar 6, 2022 at 6:46 PM r...@apache.org 
> wrote:
> 
> > +1
> >
> > --
> > Thanks
> > Xiaolong Ran
> >
> > PengHui Li  于2022年3月5日周六 18:10写道:
> >
> > > +1
> > >
> > > Penghui
> > >
> > > On Sat, Mar 5, 2022 at 4:58 AM Matteo Merli 
> > > wrote:
> > >
> > > > +1 Thanks Rui, we should eliminate the GPL dependency ASAP.
> > > >
> > > >
> > > >
> > > > --
> > > > Matteo Merli
> > > > 
> > > >
> > > > On Thu, Mar 3, 2022 at 2:08 AM Rui Fu  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > > I would like to start a discussion here about starting a new release
> > > > > > of pulsar-client-go v0.8.1. Recently we have had some dependency-update
> > > > > > PRs from the community: [1] bumps `github.com/beefsack/go-rate` to the
> > > > > > latest version, which migrates the license from GPL to MIT, and [2]
> > > > > > bumps `github.com/prometheus/client_golang` to address
> > > > > > CVE-2022-21698. For more details, please check the links below.
> > > > > >
> > > > > > As v0.8.0 was released just weeks ago and the next release will start
> > > > > > about 2 months later, I think we should start the release of v0.8.1.
> > > > >
> > > > > [1]: https://github.com/apache/pulsar-client-go/pull/735
> > > > > [2]: https://github.com/apache/pulsar-client-go/pull/738
> > > > >
> > > > > --
> > > > >
> > > > > Best Regards,
> > > > >
> > > > > Rui Fu
> > > >
> > >
> >
> 


[GitHub] [pulsar-site] Anonymitaet commented on a change in pull request #8: Integrating changes from community feedback

2022-03-07 Thread GitBox


Anonymitaet commented on a change in pull request #8:
URL: https://github.com/apache/pulsar-site/pull/8#discussion_r821257740



##
File path: site2/website-next/docusaurus.config.js
##
@@ -254,28 +205,10 @@ module.exports = {
   position: "right",
 },
 {
-  label: "Version",
-  to: "/docs",
-  position: "right",
-  items: [
-{
-  label: "2.9.0",
-  to: "/docs/2.9.0/",
-},
-{
-  label: "2.8.0",
-  to: "/docs/2.8.0/",
-},
-{
-  label: "2.7.0",
-  to: "/docs/2.7.0/",
-},
-{
-  label: "2.6.0",
-  to: "/docs/2.6.0/",
-},
-  ],
-},

Review comment:
   Talked w/ @ @D-2-Ed, "Version" should be replaced w/ "Download" (or 
other word). The "Download" page should contain [new "Release notes" 
page](https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=149553782).
 





-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Re: Creating Good Release notes

2022-03-07 Thread PengHui Li
For the 2.10.0 release, I'm using the `gh` command to generate the release
notes.

For example:

```
gh pr list -L 1000 \
  --search "is:pr milestone:2.10.0 is:merged -label:release/2.9.1 -label:release/2.9.2" \
  --json title,number,url \
  | jq -r '.[] | "\(.title) [\(.number)](\(.url))"' \
  | grep -v "website" | grep -v "doc" | grep -v "Documentation"
```

I noticed that we need to address a couple of issues:

1. The title of the PR should be refined during the PR review, before
merging it.
2. Add the milestone and the release/xxx tag (if needed) before merging the PR.
3. When merging the PR, the committer needs to check the merge information
carefully. GitHub uses the first commit message as the merge commit message,
and sometimes it is not a meaningful message.
4. Add the component/ tag before merging the PR.

Then we can generate the release notes automatically according to the
component tag.
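
As a rough sketch of that last point: assuming the `gh` query also asks for
the `labels` field (`--json title,number,url,labels`) and that each label is
an object with a `name` key, the grouping by component could look like this
in Python. The function names, the "misc" bucket, and the sample data are
made up for illustration; this is not an existing tool.

```python
from collections import defaultdict


def group_by_component(prs):
    """Group PR entries by their `component/...` labels.

    `prs` is a list of dicts shaped like the (assumed) output of
    `gh pr list --json title,number,url,labels`.
    PRs without a component label are collected under "misc".
    """
    sections = defaultdict(list)
    for pr in prs:
        components = [
            label["name"].split("/", 1)[1]
            for label in pr.get("labels", [])
            if label["name"].startswith("component/")
        ]
        line = f"- {pr['title']} [{pr['number']}]({pr['url']})"
        # A PR with several component labels is listed in each section.
        for component in components or ["misc"]:
            sections[component].append(line)
    return dict(sections)


def render(sections):
    """Render the grouped entries as one block of text per component."""
    parts = []
    for component in sorted(sections):
        parts.append(f"{component}:")
        parts.extend(sections[component])
        parts.append("")
    return "\n".join(parts).rstrip()


if __name__ == "__main__":
    # Hypothetical sample data, not real PRs.
    sample = [
        {"title": "Fix cursor reset", "number": 1, "url": "https://example.invalid/1",
         "labels": [{"name": "component/broker"}, {"name": "type/bug"}]},
        {"title": "Bump dependency", "number": 2, "url": "https://example.invalid/2",
         "labels": []},
    ]
    print(render(group_by_component(sample)))
```

This only works if points 1, 2 and 4 above are followed, since the output is
built directly from the PR title and labels.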

Thanks,
Penghui

On Wed, Dec 8, 2021 at 9:33 AM Li Li  wrote:

> +1
>
>
> > On Dec 3, 2021, at 7:49 PM, Anonymitaet _ 
> wrote:
> >
> > Hi Pulsarers,
> >
> > Thanks for your suggestions.
> >
> > I think we need to make consensus on the following issues:
> >
> > #1
> > What should be included in the RN (release note)?
> >
> > Only include major changes (important features/enhancements/bug fixes)
> in list form rather than a raw dump of PRs.
> >
> > Reason:
> >
> > - The target audience is Pulsar developers and users, who usually skim
> RN to look for features that matter to them, so it is necessary to have
> this place for easier search.
> >
> > - For each release, to help users know the highlights in a quicker way,
> usually we write tech blogs [1] to explain more details (e.g. What has
> changed?, Why has it changed?, How is the user impacted?, What does the
> user need to do now?), so RN can be a "simple list" of what we have
> achieved.
> >
> > - Users can get changelogs on GitHub if they want to know every change
> of a release.
> >
> > #2
> > How to create a quality RN efficiently?
> >
> > If RN is regarded as an afterthought and finished as a last-minute task,
> it is likely not written well. Instead of rushing, treating RN as a part of
> development reduces release manager's workload and makes communication more
> coordinated. Consequently, **the process of the current workflow can be
> improved**:
> >
> > 2.a.
> > Create guidelines/standards for writing a qualified PR title (and
> description) and git commit message.
> > →This really matters for release manager to know the changes and cut a
> release.
> >
> > 2.b.
> > Give each PR the corresponding labels,
> > e.g. `release/note-required` and labels related to the PR changes (such as
> `component/functions`, `component/java-client`).
> > →So that automatic tools know which PRs to include and can assign each PR
> to the correct chapter in RN.
> >
> > 2.c.
> > Create templates for RN.
> >
> > 2.d.
> > Find tools to generate RN automatically based on qualified PR title, PR
> labels, and RN template.
> >
> > In this case, if each PR information is provided in a consistent and
> clear way, RN can be automatically generated in a quick manner. Also it
> saves release manager's life.
> >
> > [1] Tech blog example:
> https://pulsar.apache.org/blog/2021/09/23/Apache-Pulsar-2-8-1/
> >
> >>>
> >
> > On 2021/12/3, 12:04, "Yunze Xu"  wrote:
> >
> >First I agree with Jonathan that we should perform some changes with
> >the original PR descriptions.
> >
> >Then, classifying these PRs is also necessary, otherwise the release
> notes
> >would be meaningless. There are a lot of PRs that should be classified
> in
> >Misc part of https://github.com/apache/pulsar/pull/12425 <
> https://github.com/apache/pulsar/pull/12425> and I also gave
> >some comments in the PR.
> >
> >IMO, it’s okay to ignore the PRs that only fix some typos or fix some
> flaky tests.
> >But I found many PRs in Misc part should also be noted.
> >
> >We should not sacrifice the release quality for a new release like
> 2.9.1.
> >
> >> 2021年12月2日 下午7:11,Enrico Olivelli  写道:
> >>
> >> Hello community,
> >>
> >> There is an open discussion on the Pulsar 2.9.0 release notes PR:
> >> https://github.com/apache/pulsar/pull/12425
> >>
> >> I have created the block of release notes by downloading the list of PR
> >> using some GitHub API.
> >> Then I have manually classified:
> >> - News and Noteworthy: cool things in the Release
> >> - Breaking Changes: things you MUST know when you upgrade
> >> - Java Client, C++ Client, Python Client, Functions/Pulsar IO
> >>
> >> The goal is to provide useful information for people who want to upgrade
> >> Pulsar.
> >>
> >> My problems are:
> >> - PR titles are often badly written, but I don't want to fix all of them
> >> (typos,  tenses of verbs, formatting)
> >> - There are more than 300 PRs, I don't want to classify them manually, I
> >> just highlighted the most important from my point of view
> >>
> >> If for 2.9.0 we still keep a list of PR, then I believe that the current
> >

Re: [DISUSS] Improve unit test stability

2022-03-07 Thread PengHui Li
I think we can try this idea. It looks like a way to ensure that
the tests added or changed in a PR do not introduce
new uncertainty.

Penghui

On Mon, Mar 7, 2022 at 11:27 PM Baozi 
wrote:

> Thanks for your reply. I have a somewhat different viewpoint; please take a look.
>
> > I am not sure that this helps.
> > You can still be very lucky and unfortunately many times the impact is on
> > other tests, not the new tests.
>
> Yes, this cannot prevent other unit tests from being affected by the PR.
> However, it can ensure the stability of the unit tests that are new or
> modified in the current PR. I think that is useful.
>
> > I believe that the best weapons we have are:
> > - Good code reviews
>
> Yes, we need good code reviews. But when the unit tests pass, we usually
> assume that the PR is OK. I'm not sure code review alone can guarantee that.
>
> > - if there is a error on CI, don't default to 'rerun-failure-checks' but
> > look carefully into the errors (we could disable the ability to rerun
> > failed tests to non committers and so giving more control on CI to the
> > committer who is sponsoring a patch)
>
> If this feature is disabled, there may be the following problems:
> 1. Each PR is merged less efficiently.
> 2. The owner of the PR needs to deal with additional work.
>
>
> To sum up, I think the essential problem is that we need to ensure, as far
> as possible, that the unit tests involved in each PR are stable before
> merging, and to let the owner of the PR handle the stability of those
> tests. A good code review is one of the mechanisms, but we need more
> effective and enforceable means of checking.
>
> I hope to get more suggestions and actions. Otherwise, even after we fix
> the existing flaky tests, new ones will continue to appear as more
> features are added.
>
> > PS: Maybe I'm worrying too much. Is it normal to have a lot of flaky
> tests?
>
>
>


Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-07 Thread Zixuan Liu
Hi Xiaolong,

It is disabled by default. Once you enable this feature:
- When reading your data, we check the data header. If the data is
compressed, we parse it using the compressed format; otherwise we parse it
the original way.
- When updating your data, we compress it with the configured compression
type.

Once you enable this feature, we don't support rolling the data back to the
previous version.
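
To make the read and update paths concrete, here is a minimal sketch in
Python. It is not the broker code. The magic number 0x4779 and the field
order come from the proposal; the 2-byte/4-byte field widths, the stand-in
metadata encoding (the proposal uses a protobuf `ManagedCursorInfoMetadata`
message), the compression type id, and the use of zlib only are my
simplifying assumptions.

```python
import struct
import zlib

MAGIC_NUMBER = 0x4779  # from the proposal

# Assumed layout for this sketch: 2-byte magic + 4-byte metadata size (big-endian).
_HEADER = struct.Struct(">HI")
# Stand-in for the protobuf metadata: 1-byte compression type + 4-byte uncompressed size.
_METADATA = struct.Struct(">BI")
ZLIB = 1  # hypothetical compression type id


def encode(cursor_info: bytes, compression_enabled: bool) -> bytes:
    """Update path: compress only when the feature is enabled."""
    if not compression_enabled:
        return cursor_info
    metadata = _METADATA.pack(ZLIB, len(cursor_info))
    header = _HEADER.pack(MAGIC_NUMBER, len(metadata))
    return header + metadata + zlib.compress(cursor_info)


def decode(data: bytes) -> bytes:
    """Read path: check the header, fall back to the original format."""
    if len(data) >= _HEADER.size:
        magic, metadata_size = _HEADER.unpack_from(data)
        if magic == MAGIC_NUMBER:
            start = _HEADER.size
            compression_type, uncompressed_size = _METADATA.unpack_from(data, start)
            if compression_type != ZLIB:
                raise ValueError(f"unsupported compression type {compression_type}")
            payload = zlib.decompress(data[start + metadata_size:])
            if len(payload) != uncompressed_size:
                raise ValueError("uncompressed size does not match metadata")
            return payload
    # No magic number: data written before the feature was enabled.
    return data


if __name__ == "__main__":
    old_data = b"cursor-info written by an old broker"
    print(decode(old_data) == old_data)                # old data still readable
    print(decode(encode(old_data, True)) == old_data)  # compressed round trip
```

An older broker only has the fallback branch of `decode`, which is why data
written in the compressed format cannot simply be read after a rollback.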

Thanks,
Zixuan


r...@apache.org  于2022年3月7日周一 16:16写道:

> Hi Zixuan:
>
> Here I am more concerned about whether this feature will break backward
> compatibility, for historical data or old clusters, how do we use this
> feature.
>
> --
> Thanks
> Xiaolong Ran
>
> Zixuan Liu  于2022年3月7日周一 15:14写道:
>
> > Hi everyone,
> >
> > Good catch! I update my proposal on
> > https://github.com/apache/pulsar/issues/14529, and the compatibility
> part
> > has been appended:
> >
> > 1. The compression is disabled by default
> > 2. We need to consider how to migrate the old data when this compression
> > has been enabled. If the cursor data header is compressed format, we will
> > parse the bytes data by compressed format, otherwise we will parse the
> > cursor data directly by the original way
> >
> > Zixuan Liu  于2022年3月7日周一 15:11写道:
> >
> > > Hi PengHui,
> > >
> > > Sorry, the correct URL: https://github.com/apache/pulsar/issues/14529.
> > >
> > > :( Because of the problem of subscription, the email here is very
> > > confusing.
> > >
> > >
> > > PengHui Li  于2022年3月7日周一 12:39写道:
> > >
> > >> Hi Zixuan,
> > >>
> > >> Looks like you have added the wrong link for the proposal?
> > >> https://github.com/apache/pulsar/issues/14395 is for PIP-44
> > >>
> > >> Penghui
> > >>
> > >> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li 
> wrote:
> > >>
> > >> > > This is a global setting now. But I wonder if we should compress
> it
> > >> only
> > >> > if the size
> > >> > is over a threshold?
> > >> >
> > >> > +1
> > >> >
> > >> > Penghui
> > >> >
> > >> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli  >
> > >> > wrote:
> > >> >
> > >> >> Il Dom 6 Mar 2022, 05:04 Haiting Jiang 
> ha
> > >> >> scritto:
> > >> >>
> > >> >> > This is a global setting now. But I wonder if we should compress
> it
> > >> only
> > >> >> > if the size
> > >> >> > is over a threshold?
> > >> >>
> > >> >>
> > >> >> Good idea
> > >> >>
> > >> >> Enrico
> > >> >>
> > >> >>
> > >> >>   Because:
> > >> >> > 1. It's not easy for us to notice some managed cursor info is too
> > >> large
> > >> >> in
> > >> >> > advance,  normally it would be found only if it have actual
> impact.
> > >> But
> > >> >> if
> > >> >> > we enable this compression in advance, it will took some extra
> > >> computing
> > >> >> > resources.
> > >> >> > 2. It seems that it won't be a common case that this managed
> cursor
> > >> info
> > >> >> > is too large (only if there are a lot individualDeletedMessages
> and
> > >> >> > batchedEntryDeletionIndexInfo). So not quite necessary to
> compress
> > >> all
> > >> >> > managed cursor info.
> > >> >> >
> > >> >> > Regards,
> > >> >> > Haiting
> > >> >> >
> > >> >> >
> > >> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> > >> >> > > Hi Pulsar Community,
> > >> >> > >
> > >> >> > >
> > >> >> > > I create a proposal that support ManagedCursorInfo compression.
> > >> >> > >
> > >> >> > > The proposal can be found:
> > >> >> https://github.com/apache/pulsar/issues/14395
> > >> >> > >
> > >> >> > >
> > >> >> > > Motivation
> > >> >> > >
> > >> >> > > The cursor data is managed by ZooKeeper/etcd metadata store.
> When
> > >> >> > > cursor data becomes more and more, the data size will increase
> > and
> > >> >> > > will take a lot of time to pull the data. Therefore, it is
> > >> necessary
> > >> >> > > to add compression for the cursor, which can reduce the size of
> > >> data
> > >> >> > > and reduce the time of pulling data.
> > >> >> > > Goal
> > >> >> > >
> > >> >> > > Support use the LZ4/ZLIB/ZSTD/SNAPPY to compress the
> > >> >> ManagedCursorInfo.
> > >> >> > > Implementation
> > >> >> > >
> > >> >> > >- Cursor compression format
> > >> >> > >[MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> > >> >> > > [MANAGED_CURSOR_INFO_PAYLOAD]
> > >> >> > >
> > >> >> > >
> > >> >> > >-
> > >> >> > >
> > >> >> > >MAGIC_NUMBER
> > >> >> > >Ox4779
> > >> >> > >-
> > >> >> > >
> > >> >> > >METADATA
> > >> >> > >Add a named ManagedCursorInfoMetadata message to
> > >> >> MLDataFormats.proto:
> > >> >> > >message ManagedCursorInfoMetadata {
> > >> >> > >   required CompressionType compressionType = 1;
> > >> >> > >   required int32 uncompressedSize = 2;
> > >> >> > >}
> > >> >> > >
> > >> >> > > Currently, these compressions have been supported, we only need
> > to
> > >> >> > > deal with compression and decompression of the
> ManagedCursorInfo
> > >> data:
> > >> >> > >
> > >> >> > >-
> > >> >> > >
> > >> >> > >Get CursorInfo from the metadata store
> > >> >> > >We will check the cursor data head

[DISCUSS] Byte schema compatibility issue

2022-03-07 Thread guo jiwei
Hi,
   I want to discuss a compatibility issue with the byte schema here.
   For now, the byte schema is compatible with all other schemas, which can
cause problems.
   Case 1:
      1. Consumer1 is initialized with a JSON schema for topic A.
      2. Producer1 is initialized without a schema and sends byte messages
directly to topic A.
      Consumer1 then fails to deserialize the messages. Producer1 may also
send unsafe byte data.

   Case 2:
      1. Consumer1 is initialized with the byte schema for topic A.
      2. Producer1 is initialized with an AVRO/JSON schema and sends messages
to topic A.
      Consumer1 then does not know how to deserialize the messages.

To avoid the above issues, the byte schema should also follow the schema
compatibility policy. I have opened #13701 to track this. If the idea is
accepted, I will start a PIP.

Please give some suggestions about this idea.
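
Case 1 can be illustrated without a broker at all. The sketch below is plain
Python, not the Pulsar client API; `json_consumer_decode` and `try_decode`
are made-up stand-ins for what a JSON-schema consumer does with the payload
it receives.

```python
import json


def json_consumer_decode(payload: bytes) -> dict:
    """Stand-in for a consumer that expects every message to be a JSON object."""
    value = json.loads(payload.decode("utf-8"))
    if not isinstance(value, dict):
        raise ValueError("payload is valid JSON but not an object")
    return value


def try_decode(payload: bytes) -> str:
    """Return "ok" if the payload decodes, otherwise the name of the error."""
    try:
        json_consumer_decode(payload)
        return "ok"
    except ValueError as error:
        # UnicodeDecodeError and json.JSONDecodeError are both ValueErrors.
        return type(error).__name__


if __name__ == "__main__":
    # A producer that respects the JSON schema.
    print(try_decode(json.dumps({"id": 1}).encode("utf-8")))
    # A schema-less producer sending arbitrary bytes to the same topic.
    print(try_decode(b"\xff\xfe\x00\x01"))
```

Nothing stops the second producer today, so the failure only shows up on the
consumer side.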


Regards
Jiwei Guo (Tboy)


Re: [Discuss] Create Pulsar client release notes

2022-03-07 Thread Li Li
+1, I will do this task.

Li Li

> On Mar 8, 2022, at 8:39 AM, Huanli Meng  
> wrote:
> 
> +1
> 
> BR//Huanli
> 
>> On Mar 8, 2022, at 3:52 AM, Sijie Guo  wrote:
>> 
>> Yu, thank you for driving this effort! Great to see a proposal focusing on 
>> this! +1
>> 
>> On Fri, Mar 4, 2022 at 2:59 AM Yu > > wrote:
>> Hi Pulsarers,
>> 
>> For Pulsar release notes [1], we had issues below for a long time. 
>> 
>> - All contents are on a single Pulsar release note page. No navigations. 
>> It's easy to get lost and hard to understand which changes belong to which 
>> Pulsar version.
>> - Java, C++, Python, WebSocket clients do not have independent release 
>> notes, they belong to parts of Pulsar release notes. It is hard to find and 
>> use.
>> - Go, Node.js, C# clients’ changelogs are hosted in their own GitHub repos 
>> and not shown on the Pulsar website. Users need more clicks to get the info.
>> - ...
>> 
>> Recently, we got some negative feedback from users. I've submitted some 
>> changes [2] but it's a stopgap.
>> 
>> To completely solve these problems, I propose to make some changes in PIP 
>> 148 [3], including but not limited to:
>> 
>> - Create a "Release notes" chapter to docs, which shows all releases changes 
>> and release timeline.
>> - Add necessary explanations, such as time-based release plan, release 
>> frequency, semantic versioning, maintenance life cycle, etc.
>> - Create independent release notes for all clients.
>> - Automate the process of generating all release notes, which is relevant to 
>> PIP 112 [4].
>> - Add navigations, etc.
>> - ...
>> 
>> 
>> Here are mockups [5]. You can check and comment without login. 
>> 
>> ACTION: Please provide your feedback within 72 hours. **If there is no 
>> discussion or objection, we’ll implement them as shown in the mockups.**
>> 
>> We’d love your feedback! Thanks!
>> 
>> [1] https://pulsar.apache.org/en/release-notes/ 
>>  
>> [2] https://github.com/apache/pulsar/pull/14430 
>> 
>> [3] 
>> https://docs.google.com/document/d/1o6MWV3GvXQgKw1ZpL86y43xjnYvaUKrl6HxJKZnQyEU/edit#heading=h.35s8x8c7ja4c
>>  
>> 
>>  
>> [4] 
>> https://docs.google.com/document/d/1Ul2qIChDe8QDlDwJBICq1VviYZhdk1djKJJC5wXAGsI/edit#
>>  
>> 
>> [5] 
>> https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=0
>>  
>> 
>> 
> 



Re: [Discuss] Create Pulsar client release notes

2022-03-07 Thread Yu
Thanks Li Li, for always helping in making breaking changes to Pulsar!

On Tue, Mar 8, 2022 at 11:01 AM Li Li  wrote:

> +1, I will do this task.
>
> Li Li
>
> > On Mar 8, 2022, at 8:39 AM, Huanli Meng 
> wrote:
> >
> > +1
> >
> > BR//Huanli
> >
> >> On Mar 8, 2022, at 3:52 AM, Sijie Guo  wrote:
> >>
> >> Yu, thank you for driving this effort! Great to see a proposal focusing
> on this! +1
> >>
> >> On Fri, Mar 4, 2022 at 2:59 AM Yu  li...@apache.org>> wrote:
> >> Hi Pulsarers,
> >>
> >> For Pulsar release notes [1], we had issues below for a long time.
> >>
> >> - All contents are on a single Pulsar release note page. No
> navigations. It's easy to get lost and hard to understand which changes
> belong to which Pulsar version.
> >> - Java, C++, Python, WebSocket clients do not have independent release
> notes, they belong to parts of Pulsar release notes. It is hard to find and
> use.
> >> - Go, Node.js, C# clients’ changelogs are hosted in their own GitHub
> repos and not shown on the Pulsar website. Users need more clicks to get
> the info.
> >> - ...
> >>
> >> Recently, we got some negative feedback from users. I've submitted some
> changes [2] but it's a stopgap.
> >>
> >> To completely solve these problems, I propose to make some changes in
> PIP 148 [3], including but not limited to:
> >>
> >> - Create a "Release notes" chapter to docs, which shows all releases
> changes and release timeline.
> >> - Add necessary explanations, such as time-based release plan, release
> frequency, semantic versioning, maintenance life cycle, etc.
> >> - Create independent release notes for all clients.
> >> - Automate the process of generating all release notes, which is
> relevant to PIP 112 [4].
> >> - Add navigations, etc.
> >> - ...
> >>
> >>
> >> Here are mockups [5]. You can check and comment without login.
> >>
> >> ACTION: Please provide your feedback within 72 hours. **If there is no
> discussion or objection, we’ll implement them as shown in the mockups.**
> >>
> >> We’d love your feedback! Thanks!
> >>
> >> [1] https://pulsar.apache.org/en/release-notes/ <
> https://pulsar.apache.org/en/release-notes/>
> >> [2] https://github.com/apache/pulsar/pull/14430 <
> https://github.com/apache/pulsar/pull/14430>
> >> [3]
> https://docs.google.com/document/d/1o6MWV3GvXQgKw1ZpL86y43xjnYvaUKrl6HxJKZnQyEU/edit#heading=h.35s8x8c7ja4c
> <
> https://docs.google.com/document/d/1o6MWV3GvXQgKw1ZpL86y43xjnYvaUKrl6HxJKZnQyEU/edit#heading=h.35s8x8c7ja4c>
>
> >> [4]
> https://docs.google.com/document/d/1Ul2qIChDe8QDlDwJBICq1VviYZhdk1djKJJC5wXAGsI/edit#
> <
> https://docs.google.com/document/d/1Ul2qIChDe8QDlDwJBICq1VviYZhdk1djKJJC5wXAGsI/edit#
> >
> >> [5]
> https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=0
> <
> https://docs.google.com/spreadsheets/d/1MT8vt0620Cy4tggKk3gDVPbnhH7GVakld-5aDYCGAko/edit#gid=0
> >
> >>
> >
>
>


Re: [PIP] #14395 Making SchemaRegistry implementation configurable

2022-03-07 Thread Aparajita Singh
Thanks Matteo. This comment was addressed on the PR.
Posting the same here as well:

IMO, the schema registry should be maintaining this info. Given the
understanding below, we should not rename the `deleteSchema(...)` method or
replace its usages, but we can rename `deleteSchemaStorage(...)` to
`deleteSchemaFromStorage(...)`. Is that fine?

As per the current code, when a user explicitly deletes a schema from a
topic, an empty schema is pushed to the schema storage so that the latest
schema is an empty schema. This preserves the older versions of the schema
in the storage and also maintains metadata about the delete operation. The
topic itself remains unchanged.

The schema is deleted from storage only when the topic itself is deleted.

Replacing the `deleteSchema(...)` call with `deleteSchemaStorage(...)` would
mean that we lose the following metadata:
* the user who deleted the schema
* the timestamp at which the schema was deleted

Please let me know if there are any more comments on this.
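
For anyone who wants the difference between the two operations spelled out,
here is a toy model in Python. It is only an illustration of the behaviour
described above, not the `SchemaRegistryService` code; the class, its
in-memory storage, and the snake_case method names are invented for the
sketch.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SchemaVersion:
    data: bytes       # empty bytes mark a delete
    user: str         # who wrote this version
    timestamp: float  # when it was written


class InMemorySchemaRegistry:
    """Toy model: versions are appended per topic, never edited in place."""

    def __init__(self) -> None:
        self._storage: Dict[str, List[SchemaVersion]] = {}

    def put_schema(self, topic: str, data: bytes, user: str) -> None:
        version = SchemaVersion(data, user, time.time())
        self._storage.setdefault(topic, []).append(version)

    def delete_schema(self, topic: str, user: str) -> None:
        """Push an empty schema: history and delete metadata are kept."""
        self.put_schema(topic, b"", user)

    def delete_schema_from_storage(self, topic: str) -> None:
        """Drop every version: only done when the topic itself is deleted."""
        self._storage.pop(topic, None)

    def versions(self, topic: str) -> List[SchemaVersion]:
        return list(self._storage.get(topic, []))

    def latest(self, topic: str) -> Optional[SchemaVersion]:
        versions = self._storage.get(topic)
        return versions[-1] if versions else None


if __name__ == "__main__":
    registry = InMemorySchemaRegistry()
    registry.put_schema("topic-a", b"schema-v1", "alice")
    registry.delete_schema("topic-a", "bob")
    print(len(registry.versions("topic-a")), registry.latest("topic-a").user)
    registry.delete_schema_from_storage("topic-a")
    print(registry.latest("topic-a"))
```

After `delete_schema` the user and timestamp of the delete are still
available; after `delete_schema_from_storage` nothing is left to inspect.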


On Wed, 23 Feb 2022 at 08:36, Matteo Merli  wrote:

> Hi Aparajita,
>
> Thanks for the proposal. Indeed the schema registry was meant to be
> pluggable since the beginning although we skipped the actual
> "plugging" part. It would be good to actually see multiple
> implementations there.
>
> I don't see any risk in this proposal and it's a good time to make
> breaking changes to the SchemaRegistry interface since there are not
> (yet) any implementation other than the default one.
>
> > Renaming a few methods in the SchemaRegistryService interface to reflect
> their behavior. The changes are:
> > Rename deleteSchema to putEmptySchema in SchemaRegistryService
>
> My only concern, based on current behavior, is that in some places in
> the code we're calling `deleteSchema()` while actually we should be
> calling `deleteSchemaStorage()` (using the current names). I guess
> that's probably due to the misleading nature of the method names. We
> should double-check these usages to make sure the expected operation
> is applied.
>
> Matteo
>
>
> --
> Matteo Merli
> 
>
> On Mon, Feb 21, 2022 at 7:53 AM Aparajita Singh
>  wrote:
> >
> > Hi,
> > Please review this proposal:
> https://github.com/apache/pulsar/issues/14395
> >
> > --
> > Thanks,
> > Aparajita
>


-- 
Thanks,
Aparajita


Re: [DISCUSS] Byte schema compatibility issue

2022-03-07 Thread PengHui Li
+1, the byte schema should also abide by the schema compatibility strategy.
If I remember correctly, the byte schema should always be compatible with
the string schema.

Penghui


On Tue, Mar 8, 2022 at 10:56 AM guo jiwei  wrote:

> Hi,
>I want to discuss the compatibility issue with the byte schema here.
>For now, the byte-schema is compatible with all other schemas. This may
> introduce more issues.
>Case 1:
>   1. Consumer1 init with JSON schema for topic A.
>   2. But producer1 init without schema and send byte messages
> directly to topic A.
>   This will cause consumer1 to deserialize msg error.  Also,
> producer1 may send unsafe byte data.
>
>  Case 2:
>1. Consumer1 init with byte schema for topic A.
>2. But producer1 init with AVRO/JSON schema and send messages to
> topic A.
>This will cause consumer1 don't know how to deserialize msg.
>
> To avoid the above issues, Byte schema should also follow the schema
> compatibility policy. I'm open #13701
>  to track this. If the idea
> is accepted, I will start a PIP.
>
>  Please give some suggestions about this idea.
>
>
> Regards
> Jiwei Guo (Tboy)
>


Re: [DISCUSSION][PIP-146] ManagedCursorInfo compression

2022-03-07 Thread PengHui Li
> We don't support rollback the data of the previous version Once you enable
> this feature.

If you want to roll back to an old version, you need to disable cursor
compression first, then wait a while or restart the broker to make sure the
cursor data is flushed to the cursor ledger, and only then roll back to the
old version.

Penghui

On Tue, Mar 8, 2022 at 10:31 AM Zixuan Liu  wrote:

> Hi Xiaolong,
>
> It is disabled by default. Once you enable this feature:
> When reading your data, we will check your data header, if it is compressed
> data, we will parse this data by compression format, otherwise parse it by
> the original way.
> When updating your data, we will compress your data by the compression
> type.
>
> We don't support rollback the data of the previous version Once you enable
> this feature.
>
> Thanks,
> Zixuan
>
>
> r...@apache.org  于2022年3月7日周一 16:16写道:
>
> > Hi Zixuan:
> >
> > Here I am more concerned about whether this feature will break backward
> > compatibility, for historical data or old clusters, how do we use this
> > feature.
> >
> > --
> > Thanks
> > Xiaolong Ran
> >
> > Zixuan Liu  于2022年3月7日周一 15:14写道:
> >
> > > Hi everyone,
> > >
> > > Good catch! I update my proposal on
> > > https://github.com/apache/pulsar/issues/14529, and the compatibility
> > part
> > > has been appended:
> > >
> > > 1. The compression is disabled by default
> > > 2. We need to consider how to migrate the old data when this
> compression
> > > has been enabled. If the cursor data header is compressed format, we
> will
> > > parse the bytes data by compressed format, otherwise we will parse the
> > > cursor data directly by the original way
> > >
> > > Zixuan Liu  于2022年3月7日周一 15:11写道:
> > >
> > > > Hi PengHui,
> > > >
> > > > Sorry, the correct URL:
> https://github.com/apache/pulsar/issues/14529.
> > > >
> > > > :( Because of the problem of subscription, the email here is very
> > > > confusing.
> > > >
> > > >
> > > > PengHui Li  于2022年3月7日周一 12:39写道:
> > > >
> > > >> Hi Zixuan,
> > > >>
> > > >> Looks like you have added the wrong link for the proposal?
> > > >> https://github.com/apache/pulsar/issues/14395 is for PIP-44
> > > >>
> > > >> Penghui
> > > >>
> > > >> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li 
> > wrote:
> > > >>
> > > >> > > This is a global setting now. But I wonder if we should compress
> > it
> > > >> only
> > > >> > if the size
> > > >> > is over a threshold?
> > > >> >
> > > >> > +1
> > > >> >
> > > >> > Penghui
> > > >> >
> > > >> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli <
> eolive...@gmail.com
> > >
> > > >> > wrote:
> > > >> >
> > > >> >> Il Dom 6 Mar 2022, 05:04 Haiting Jiang 
> > ha
> > > >> >> scritto:
> > > >> >>
> > > >> >> > This is a global setting now. But I wonder if we should
> compress
> > it
> > > >> only
> > > >> >> > if the size
> > > >> >> > is over a threshold?
> > > >> >>
> > > >> >>
> > > >> >> Good idea
> > > >> >>
> > > >> >> Enrico
> > > >> >>
> > > >> >>
> > > >> >>   Because:
> > > >> >> > 1. It's not easy for us to notice some managed cursor info is
> too
> > > >> large
> > > >> >> in
> > > >> >> > advance,  normally it would be found only if it have actual
> > impact.
> > > >> But
> > > >> >> if
> > > >> >> > we enable this compression in advance, it will took some extra
> > > >> computing
> > > >> >> > resources.
> > > >> >> > 2. It seems that it won't be a common case that this managed
> > cursor
> > > >> info
> > > >> >> > is too large (only if there are a lot individualDeletedMessages
> > and
> > > >> >> > batchedEntryDeletionIndexInfo). So not quite necessary to
> > compress
> > > >> all
> > > >> >> > managed cursor info.
> > > >> >> >
> > > >> >> > Regards,
> > > >> >> > Haiting
> > > >> >> >
> > > >> >> >
> > > >> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> > > >> >> > > Hi Pulsar Community,
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > I create a proposal that support ManagedCursorInfo
> compression.
> > > >> >> > >
> > > >> >> > > The proposal can be found:
> > > >> >> https://github.com/apache/pulsar/issues/14395
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > Motivation
> > > >> >> > >
> > > >> >> > > The cursor data is managed by ZooKeeper/etcd metadata store.
> > When
> > > >> >> > > cursor data becomes more and more, the data size will
> increase
> > > and
> > > >> >> > > will take a lot of time to pull the data. Therefore, it is
> > > >> necessary
> > > >> >> > > to add compression for the cursor, which can reduce the size
> of
> > > >> data
> > > >> >> > > and reduce the time of pulling data.
> > > >> >> > > Goal
> > > >> >> > >
> > > >> >> > > Support use the LZ4/ZLIB/ZSTD/SNAPPY to compress the
> > > >> >> ManagedCursorInfo.
> > > >> >> > > Implementation
> > > >> >> > >
> > > >> >> > >- Cursor compression format
> > > >> >> > >[MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> > > >> >> > > [MANAGED_CURSOR_INFO_PAYLOAD]
> > > >> >> > >
> > > >> >> > >
> > > >> >> > >-
> > > >> >> > >
> > > >> >> > >MAGIC_N

[GitHub] [pulsar-manager] thomasechen opened a new issue #449: Pulsar Manager sometimes could not create a topic if the cluster has multiple brokers in the cluster

2022-03-07 Thread GitBox


thomasechen opened a new issue #449:
URL: https://github.com/apache/pulsar-manager/issues/449


   Dear All,
   
   I am currently trying to adopt Pulsar in our environment, and I have an 
issue with Pulsar Manager.

   After deploying Pulsar Manager in our k8s environment, we can easily 
create topics in each tenant and namespace as long as there is only one 
broker in the cluster. If the cluster has multiple brokers, Pulsar Manager 
sometimes creates topics successfully and sometimes does not (it seems to 
depend on luck).
   
   Have you ever met this problem in your environment, and how did you 
solve it?
   Can you share your experience of how to configure Pulsar Manager so that 
it works reliably?
   
   Please give me your suggestions and advice for solving this issue.
   1. Topic created successfully
   
![image](https://user-images.githubusercontent.com/73772546/157181060-ba7f4e9a-9d30-44fc-891a-8b58616bd1a8.png)
   
   2. Failed to create a topic
   
![image](https://user-images.githubusercontent.com/73772546/157180711-07dd4452-5aea-47ce-8b7e-302c9b7f3da4.png)
   





[GitHub] [pulsar-manager] thomasechen commented on issue #449: Pulsar Manager sometimes could not create a topic if the cluster has multiple brokers in the cluster

2022-03-07 Thread GitBox


thomasechen commented on issue #449:
URL: https://github.com/apache/pulsar-manager/issues/449#issuecomment-1061484123


   Dear All,
   
   I found that the response code from Pulsar is 307, which means **"Current 
broker doesn't serve the namespace of this topic"**.
   This is the reason why the system sometimes creates topics successfully 
and sometimes does not.
   
   Is there any idea how to solve this?
   
   
![image](https://user-images.githubusercontent.com/73772546/157186778-0721e811-0133-4a7e-8700-8a8e23b125bc.png)
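
   One possible direction, as an untested sketch in Python rather than in 
Pulsar Manager's own code: when a broker answers 307, re-send the same 
request to the URL given in the `Location` header. The `send` callable, the 
fake two-broker cluster, and the example URL are made up so that the sketch 
runs without a real cluster.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, headers)


def request_following_redirects(
    send: Callable[[str, str], Response],
    method: str,
    url: str,
    max_redirects: int = 5,
) -> Tuple[str, Response]:
    """Send a request and re-send it, with the same method, on each 307."""
    for _ in range(max_redirects + 1):
        status, headers = send(method, url)
        if status != 307:
            return url, (status, headers)
        location = headers.get("Location")
        if not location:
            raise RuntimeError("307 response without a Location header")
        url = location
    raise RuntimeError(f"gave up after {max_redirects} redirects")


def fake_cluster(method: str, url: str) -> Response:
    """Hypothetical cluster: broker-1 does not own the namespace, broker-2 does."""
    if url.startswith("http://broker-1"):
        return 307, {"Location": url.replace("broker-1", "broker-2", 1)}
    return 204, {}


if __name__ == "__main__":
    final_url, (status, _) = request_following_redirects(
        fake_cluster, "PUT", "http://broker-1:8080/example/create-topic"
    )
    print(status, final_url)
```

   Whether the request hits the owning broker first would explain the 
"luck" part: only requests that land on the other broker get the 307.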





[GitHub] [pulsar-manager] thomasechen commented on issue #274: Handle broker 307 redirect

2022-03-07 Thread GitBox


thomasechen commented on issue #274:
URL: https://github.com/apache/pulsar-manager/issues/274#issuecomment-1061488406


   Dear All,
   
   This issue can cause Pulsar Manager to sometimes fail to create topics 
and subscriptions if the cluster has multiple brokers.
   
   [URL: https://github.com/apache/pulsar-manager/issues/449#issuecomment-1061484123](https://github.com/apache/pulsar-manager/issues/449#issue-1162280548)

