Re: Detect unused variables in CI

2021-12-11 Thread Yunze Xu
Looks good. Could you open an issue for it so we can add it later?

Thanks,
Yunze
> On Dec 10, 2021, at 8:27 PM, Yufei Zhang wrote:
> 
> From what I read it can be used in Maven projects. Basically it needs a
> SonarScanner (different versions for multiple languages and build tools)
> for Maven as in [1]. Then the scanner forwards the result to SonarQube
> website for reports. It can be used with code test coverage tools as well.



Note for pulsar-client checkstyle PR

2022-01-25 Thread Yunze Xu
Hi all (especially committers),

I just opened a PR to enable the checkstyle plugin for the pulsar-client module.
See https://github.com/apache/pulsar/pull/13940.

It’s a really huge PR, but I hope it can be included in the Pulsar 2.10 release
because pulsar-client is one of the largest modules of Pulsar, and its code
style has been out of control for a long time.

For Pulsar committers: after this PR is merged, please be careful when merging
PRs that modify code in the pulsar-client module, because they could break the
master branch.

Thanks,
Yunze

Re: [DISCUSS] The default value of maxPendingChunkedMessage

2022-01-30 Thread Yunze Xu
After thinking for a while, I’d prefer 10 as the default value and I changed
the default value to 10 in C++ client, see
https://github.com/apache/pulsar/pull/14070.

A chunked-message buffer that holds all chunks of a message could use a lot of
memory. For example, if a message is split into N chunks and each chunk is 5 MB
by default, then 100 buffers will use N*500 MB. That already reaches about 1 GB
at N = 2.
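
The arithmetic can be checked with a small sketch. It assumes, as stated above, 5 MB per chunk and one buffer per pending chunked message, and that every buffered message is split into the same number of chunks. `ChunkBufferEstimate` is a hypothetical helper, not part of the Pulsar client.

```java
// Back-of-the-envelope estimate of memory held by pending chunked messages.
// Assumptions: 5 MB per chunk, one buffer per pending chunked message, and
// every buffered message consists of the same number of chunks.
final class ChunkBufferEstimate {
    static final long CHUNK_SIZE_MB = 5;

    /** Worst-case memory in MB when every pending buffer holds a full message. */
    static long worstCaseMb(int maxPendingChunkedMessage, int chunksPerMessage) {
        return (long) maxPendingChunkedMessage * chunksPerMessage * CHUNK_SIZE_MB;
    }

    public static void main(String[] args) {
        // Compare the two candidate defaults discussed in this thread.
        for (int pending : new int[] {10, 100}) {
            for (int chunks = 1; chunks <= 3; chunks++) {
                System.out.printf("maxPending=%d, N=%d -> %d MB%n",
                        pending, chunks, worstCaseMb(pending, chunks));
            }
        }
    }
}
```

With a limit of 10 instead of 100, the same two-chunk messages would need roughly 100 MB in the worst case.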

In addition, configuring maxPendingChunkedMessage to 100 is normally meaningful
only if at least 100 producers send messages to a partition. IMO, it's rare to
see that many producers on a single partition in production.

Thanks,
Yunze Xu

> On Jan 30, 2022, at 6:32 PM, Zike Yang wrote:
> 
> Hi, Pulsar community,
> 
> We found that there are inconsistencies between the code and the
> documentation regarding the default value of maxPendingChunkedMessage.
> 
> In the java client code, we use 10 as the default value. [1] But in
> the java doc, we use 100 as the default value. [2]
> We need to fix this inconsistency. But what should we take as the
> default value? From the code or the doc? I would like to hear your
> discussions.
> 
> [1] 
> https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> [2] 
> https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> 
> Thanks,
> Zike Yang



Re: [DISCUSS] PIP-142 Increase default numHttpServerThreads value to 200 to prevent Admin API unavailability

2022-02-17 Thread Yunze Xu
My only concern is the same as Matteo's:

> Dedicating 200 threads to that would be
> a massive waste of resources (CPU & memory).

We should also make it clear whether numHttpServerThreads=200 would
solve existing problems like
https://github.com/apache/pulsar/pull/13666.

Thanks,
Yunze


> On Feb 17, 2022, at 1:32 AM, Lari Hotari wrote:
> 
> URL: https://github.com/apache/pulsar/issues/14329
> 
> Motivation
> 
> Since Pulsar Admin API uses the blocking servlet API, all Jetty threads
> might be occupied and this causes unavailability of the Pulsar Admin
> API. The default value for the maximum number of threads for Jetty is
> too low in Pulsar. That is the root cause of many problems where Pulsar
> Admin API is unavailable when all threads are in use.
> 
> Additional context
> 
> -   Examples of previous issues where Jetty threads have been occupied
>     and caused problems: #13666 #4756 #10619
> -   Mailing list thread about “make async” changes:
>     https://lists.apache.org/thread/tn7rt59cd1k724l4ytfcmzx1w2sbtw7l
> 
> Implementation
> 
> -   Jetty defaults to 200 maximum threads, to prevent thread pool
>     starvation. Make Pulsar use the same default value by setting
>     numHttpServerThreads=200.
> -   Update the documentation for numHttpServerThreads
>     -   The PR is already in place:
>         https://github.com/apache/pulsar/pull/14320
> -   Set Jetty selectors and acceptors parameters to -1 so that Jetty
>     automatically chooses optimal values based on available cores. The
>     rationale is explained in the Q&A below.
>     -   A separate PR will be made for this change.
> 
> Q&A
> 
> Q: What’s the reason for setting the default value to 200? If the node has
> just one core, what will happen?
> 
> These are threads. Jetty defaults to 200 maximum threads, to prevent
> thread pool starvation. This is recommended when using blocking Servlet
> API. The problem is that Pulsar uses the blocking servlet API and
> doesn’t have a sufficient number of threads which are needed and
> recommended.
> 
> The value 200 doesn’t mean that there will be 200 threads to start with.
> This is the maximum size for the thread pool. When the value is more
> than 8, Jetty will start with 8 initial threads and add more threads to
> the pool when all threads are occupied.
> 
> Q: Do we need to take the number of system cores into consideration for the 
> maximum threads of the thread pool?
> 
> No. Jetty is different from Netty in this aspect. In Netty, everything
> should be asynchronous and “thou shall never block”. In Jetty, the
> maximum number of threads for the thread pool should be set to 50-500
> threads and blocking operations are fine.
> 
> The recommendation for the thread pool is explained in the Jetty
> documentation
> https://www.eclipse.org/jetty/documentation/jetty-9/index.html#_thread_pool
> 
>> Thread Pool
>> Configure with goal of limiting memory usage maximum available.
>> Typically this is >50 and <500
> 
> However, there are separate settings which should take the number of
> available processors (cores) into account in Jetty.
> 
> http port acceptor and selector count:
> https://github.com/apache/pulsar/blob/b540523b474e4194e30c1acab65dfafdd11d3210/pulsar-broker/src/main/java/org/apache/pulsar/broker/web/WebService.java#L88
> 
> https port acceptor and selector count:
> https://github.com/apache/pulsar/blob/b540523b474e4194e30c1acab65dfafdd11d3210/pulsar-broker/src/main/java/org/apache/pulsar/broker/web/WebService.java#L125
> 
> Jetty documentation for acceptors:
> 
>> Acceptors
>> The standard rule of thumb for the number of Accepters to configure is
>> one per CPU on a given machine.
> 
> Jetty documentation for selectors:
> 
>> Selectors
>> The default number of selectors is equal to half of the number of
>> processors available to the JVM, which should allow optimal performance
>> even if all the connections used are performing significant non-blocking
>> work in the callback tasks.
> 
> The settings in jetty are the “acceptor” and “selector” thread count
> settings. These have been fixed to 1 in Pulsar. The acceptors and
> selectors settings should be both set to -1. Jetty would pick the
> recommended count based on cores in that case.
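
The point in the quoted Q&A that the maximum is a cap rather than a starting size can be illustrated with the JDK's own `ThreadPoolExecutor`. This is only an analogy under stated assumptions: it is not Jetty's `QueuedThreadPool`, the JDK pool creates its core threads on demand rather than up front, and `BoundedPoolGrowthSketch` is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Analogy only: a JDK pool with a small core size and a large maximum size.
// Threads are added when all existing threads are busy, up to the maximum.
final class BoundedPoolGrowthSketch {
    /**
     * Submits blockedTasks tasks that all block (like blocking servlet calls)
     * and returns the pool size at that moment.
     * blockedTasks must not exceed maxThreads, or the pool rejects the task.
     */
    static int poolSizeUnderLoad(int coreThreads, int maxThreads, int blockedTasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < blockedTasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a request stuck in a blocking call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let all blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 20 concurrent blocking requests against a pool capped at 200 threads.
        System.out.println(poolSizeUnderLoad(8, 200, 20));
    }
}
```

The pool holds one thread per blocked request, far fewer than the cap of 200, which is the behaviour the proposal relies on.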



[Discuss] Generate cert and key files automatically

2022-03-21 Thread Yunze Xu
Hi all,

Recently I found a documentation error about configuring the Pulsar client for
TLS encryption. See https://github.com/apache/pulsar/issues/14762. However, the
code example in the official documentation is more intuitive.

See https://pulsar.apache.org/docs/en/security-tls-transport/#java-client: the
example code doesn't configure `AuthenticationTls`, but it is required once TLS
encryption is enabled, even if TLS authentication is not enabled, because the
client side can only perform the SSL handshake via `AuthenticationTls`. This is
confusing.

Since the cert file and the key file are generated using a CA, whose path is
specified by `tlsTrustCertsFilePath` method, I think it would be possible to
generate a cert and a key file automatically. We only need to specify a common
name, which represents the role when authentication is enabled.

My initial design is, when the client configures `tlsTrustCertsFilePath`:
- If no authentication plugin is enabled, generate the cert and key files
  automatically using a default common name.
- Otherwise, use the cert and key files specified in `AuthenticationTls`.

The benefit is that you must configure `AuthenticationTls` on the client side
only when you want to pass TLS authentication, while you only need to configure
`tlsTrustCertsFilePath` if the broker side only enables TLS encryption.

What do you think? Is there a better solution?

Thanks,
Yunze






Re: [Discuss] Generate cert and key files automatically

2022-03-22 Thread Yunze Xu
Good point. It's because generating a certificate automatically is not safe,
right? If so, I think there is no need to add this feature since the motivation
is to make the code more intuitive.

Thanks,
Yunze




> On Mar 22, 2022, at 12:40 AM, Enrico Olivelli wrote:
> 
> On Mon, Mar 21, 2022 at 16:31, Yunze Xu wrote:
>> 
>> Hi all,
>> 
>> Recently I found a document error when configuring Pulsar client for TLS
>> encryption. See https://github.com/apache/pulsar/issues/14762. However, the 
>> code
>> example in the official documents is more intuitive.
>> 
>> See https://pulsar.apache.org/docs/en/security-tls-transport/#java-client, 
>> the
>> example code doesn't configure `AuthenticationTls`, but it is required once 
>> TLS
>> encryption is enabled, even if TLS authentication is not enabled. Because the
>> client side can only send a SSL handshake via `AuthenticationTls`. It would 
>> be
>> confused.
>> 
>> Since the cert file and the key file are generated using a CA, whose path is
>> specified by `tlsTrustCertsFilePath` method, I think it would be possible to
>> generate a cert and a key file automatically. We only need to specify a 
>> common
>> name, which represents the role when authentication is enabled.
> 
> Usually a service cannot generate a "valid" certificate automatically,
> it MUST be signed by a CA.
> 
> We may add an option to automatically generate a certificate (and a
> CA) but that will work only for
> DEV environments.
> 
> Enrico
> 
> 
>> 
>> My initial design is, when client configures the `tlsTrustCertsFilePath`:
>> - If no authentication plugin is enabled, generate the cert and key files
>>  automatically using a default common name.
>> - Otherwise, use the cert and key files specified in `AuthenticationTls`.
>> 
>> The benefit is, when you want to pass the TLS authentication, you must 
>> configure
>> `AuthenticationTls` at client side, while you only needs to configure
>> `tlsTrustCertsFilePath` if broker side only enables TLS encryption.
>> 
>> What do you think? Is there a better solution?
>> 
>> Thanks,
>> Yunze
>> 
>> 
>> 
>> 



Re: [Discuss] Generate cert and key files automatically

2022-03-22 Thread Yunze Xu
If `tlsCertFilePath` and `tlsKeyFilePath` were added to the client builder
options, I think they could be used for TLS authentication as well.

I prefer this solution now because, from what Enrico said, generating
certificates automatically is not a good idea.

The problem is: what if we configure both `AuthenticationTls` and those two
options? The underlying mechanism is the same: an `SslContext` is created from
the cert and key files, and then the `SslContext` object is used in the TCP or
HTTP transport.

I think `AuthenticationTls` must take priority. Using `AuthenticationTls`
should be encouraged when TLS authentication is enabled on the broker side.
Otherwise, these two options should be encouraged.
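
The precedence described above could be sketched roughly as follows. This is a minimal illustration, not Pulsar client code: `TlsMaterialResolver` and `CertAndKey` are hypothetical names, and the sketch only decides which cert/key pair would feed the `SslContext`.

```java
import java.util.Optional;

// Hypothetical sketch: choose the cert/key pair used to build the SslContext.
final class TlsMaterialResolver {
    /** A certificate/key file pair (hypothetical helper type). */
    static final class CertAndKey {
        final String certPath;
        final String keyPath;

        CertAndKey(String certPath, String keyPath) {
            this.certPath = certPath;
            this.keyPath = keyPath;
        }
    }

    /**
     * The pair from AuthenticationTls wins when present; otherwise fall back to
     * the pair from the builder options; otherwise there is no client certificate.
     */
    static Optional<CertAndKey> resolve(CertAndKey fromAuthenticationTls,
                                        CertAndKey fromBuilderOptions) {
        if (fromAuthenticationTls != null) {
            return Optional.of(fromAuthenticationTls);
        }
        return Optional.ofNullable(fromBuilderOptions);
    }

    public static void main(String[] args) {
        CertAndKey auth = new CertAndKey("auth-cert.pem", "auth-key.pem");
        CertAndKey builder = new CertAndKey("client-cert.pem", "client-key.pem");
        // Both configured: the AuthenticationTls pair is used.
        System.out.println(resolve(auth, builder).get().certPath);
        // Only the builder options configured: they are used.
        System.out.println(resolve(null, builder).get().certPath);
    }
}
```

An alternative design would reject the configuration when both sources are set, instead of silently preferring one.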

Thanks,
Yunze




> On Mar 22, 2022, at 11:03 AM, Zixuan Liu wrote:
> 
> Hi Yunze,
> 
> The current implementation is confusing; we should split the transport and
> auth for TLS.
> 
> For transport, the code could look like this:
> ```
> PulsarClient client = PulsarClient.builder()
>     .enableTls(true)
>     .tlsTrustCertsFilePath("ca.pem")
>     .tlsCertFilePath("client-ca.pem")
>     .tlsKeyFilePath("client-key.pem")
>     .build();
> ```
> 
> For auth, the code could look like this:
> ```
> Map<String, String> authParams = new HashMap<>();
> authParams.put("tlsCertFile", "client-ca.pem");
> authParams.put("tlsKeyFile", "client-key.pem");
> PulsarClient client = PulsarClient.builder()
>     .enableTls(true)
>     .tlsTrustCertsFilePath("ca.pem")
>     .authentication(AuthenticationTls.class.getName(), authParams)
>     .build();
> ```
> 
> When using TLS auth, we don't need to set
> tlsCertFilePath("client-ca.pem") and tlsKeyFilePath("client-key.pem");
> the authentication settings are used instead.
> 
> One important point: if we are using token authentication, we cannot set
> up the TLS transport.
> 
> 



Re: About adding TDengine Connector to Pulsar.

2022-03-29 Thread Yunze Xu
Hi JueShan,

AFAIK, the Pulsar main repository already contains some connectors, and
maintaining them adds complexity. So for new connectors, it's better to
maintain them in your own repositories.

Thanks,
Yunze




> On Mar 28, 2022, at 8:29 PM, 刘梓霖 wrote:
> 
> Hi everyone,
> I would like to contribute a TDengine connector to Pulsar.
> 
> Let me briefly introduce what [TDengine](https://tdengine.com/) is.
> TDengine is a high-performance, distributed time series database that
> supports SQL.
> With TDengine, the total cost of ownership of typical IoT, Internet of
> Vehicles, and Industrial Internet big data platforms can be greatly reduced.
> 
> About Pulsar IO TDengine Connector, it is composed of TDengine Source
> Connector and TDengine Sink Connector. It can read data from TDengine and
> store it in Pulsar. At the same time, it can also write data in Pulsar into
> TDengine to realize TDengine-based data pipeline.
> 
> Thanks,
> JueShan



Re: [DISCUSS] PIP-155: Drop support for Python2

2022-04-17 Thread Yunze Xu
+1

Thanks,
Yunze




> On Apr 16, 2022, at 00:06, Matteo Merli wrote:
> 
> https://github.com/apache/pulsar/issues/15185
> 
> -
> 
> ## Motivation
> 
> Python 2.x has been deprecated for many years now and it was
> officially end-of-lifed 2.5 years ago
> (https://www.python.org/doc/sunset-python-2/).
> 
> We have well reached the point by which we need to drop Python 2.7
> compatibility for Pulsar client and for Pulsar functions.
> 
> ## Goal
> 
> Support only Python 3.5+ for Pulsar client and for Pulsar functions.
> 
> ## API Changes
> 
> No changes at this time, though Pulsar Python client library will be
> now free to use Python3 specific syntaxes and libraries.
> 
> ## Changes
> 
> 1. Switch the CI build to run Python client lib tests with Python3
> 2. Switch integration tests to use Python3
> 3. Stop building and distributing wheel files for Python 2.7
> 
> 
> 
> 
> --
> Matteo Merli
> 



Re: [Discuss] Generate cert and key files automatically

2022-04-18 Thread Yunze Xu
I have another concern that since we have to use `AuthenticationTls` for TLS
transport encryption, how can we perform a non-TLS authentication? It looks
like there’s no way to do that.

Thanks,
Yunze




> On Mar 23, 2022, at 11:24, Zixuan Liu wrote:
> 
>> I think the priority of `AuthenticationTls` must be higher. Then it
>> should be encouraged to use `AuthenticationTls` when TLS authentication is
>> enabled at broker side. Otherwise, these two options should be encouraged
>> to use.
> 
> You are right. If both `AuthenticationTls` and the TLS transport options
> are set up, we should use the `AuthenticationTls` data, so
> `AuthenticationTls` must take priority. When the user sets up both configs,
> we need to throw an exception telling them to use only `AuthenticationTls`,
> or only `tlsCertFilePath` and `tlsKeyFilePath`.
> 
> 

Re: [DISCUSS] PIP-158: Split client TLS transport encryption from authentication

2022-05-08 Thread Yunze Xu
It totally LGTM. I have a suggestion that it might be better to configure a
class like `TlsConfiguration` instead of multiple TLS related configs added to
`ClientBuilder`.
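
To make the suggestion concrete, such a class might look roughly like the sketch below. Everything in it is hypothetical: `TlsConfigurationSketch` and its methods do not exist in the Pulsar client, and the field names simply mirror the builder options proposed in the PIP.

```java
// Hypothetical sketch of a value class grouping the TLS-related client options.
final class TlsConfigurationSketch {
    private final String trustCertsFilePath;
    private final String certificateFilePath;
    private final String keyFilePath;

    private TlsConfigurationSketch(Builder b) {
        this.trustCertsFilePath = b.trustCertsFilePath;
        this.certificateFilePath = b.certificateFilePath;
        this.keyFilePath = b.keyFilePath;
    }

    String trustCertsFilePath() { return trustCertsFilePath; }
    String certificateFilePath() { return certificateFilePath; }
    String keyFilePath() { return keyFilePath; }

    /** True when a client certificate and key were both provided. */
    boolean hasClientCertificate() {
        return certificateFilePath != null && keyFilePath != null;
    }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private String trustCertsFilePath;
        private String certificateFilePath;
        private String keyFilePath;

        Builder trustCertsFilePath(String path) { this.trustCertsFilePath = path; return this; }
        Builder certificateFilePath(String path) { this.certificateFilePath = path; return this; }
        Builder keyFilePath(String path) { this.keyFilePath = path; return this; }

        TlsConfigurationSketch build() {
            // A certificate without its key (or the reverse) is unusable.
            if ((certificateFilePath == null) != (keyFilePath == null)) {
                throw new IllegalStateException("certificate and key must be set together");
            }
            return new TlsConfigurationSketch(this);
        }
    }
}
```

Grouping the options this way would allow validating them in one place, at the cost of diverging from the existing per-option style of `ClientBuilder`.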

Thanks,
Yunze




> On Apr 24, 2022, at 14:15, Zixuan Liu wrote:
> 
> Hi Pulsar community,
> 
> I opened https://github.com/apache/pulsar/issues/15289 to split client TLS
> transport encryption from authentication.
> 
> Let me know what you think.
> 
> Thanks,
> Zixuan
> 
> --
> 
> Motivation
> 
> The client supports TLS transport encryption and TLS authentication. The
> code looks like this:
> 
> PulsarClient client = PulsarClient.builder()
>     .serviceUrl("pulsar+ssl://localhost:6651")
>     .tlsTrustCertsFilePath("/path/to/cacert.pem")
>     .authentication(AuthenticationTls.class.getName(), authParams)
>     .build();
> 
> As a result, no other authentication method can be used together with TLS
> transport encryption, and it is confusing that TLS transport encryption is
> configured by setting an authentication.
> 
> Goal
> 
> Split client TLS transport encryption from authentication, so that TLS
> transport encryption can be used with any authentication.
> 
> API Changes
> 
>   - Add new methods in org.apache.pulsar.client.api.ClientBuilder
> 
> public interface ClientBuilder extends Serializable, Cloneable {
>     /**
>      * Set the path to the TLS key file.
>      *
>      * @param tlsKeyFilePath
>      * @return the client builder instance
>      */
>     ClientBuilder tlsKeyFilePath(String tlsKeyFilePath);
> 
>     /**
>      * Set the path to the TLS certificate file.
>      *
>      * @param tlsCertificateFilePath
>      * @return the client builder instance
>      */
>     ClientBuilder tlsCertificateFilePath(String tlsCertificateFilePath);
> }
> 
> Implementation
> 
> TLS transport encryption
> 
> We can call tlsKeyFilePath(), tlsCertificateFilePath() and
> tlsTrustCertsFilePath() to configure TLS transport encryption. The code
> looks like this:
> 
> PulsarClient client = PulsarClient.builder()
>     .serviceUrl("pulsar+ssl://my-host:6650")
>     .tlsTrustCertsFilePath("/path/to/cacert.pem")
>     .tlsKeyFilePath("/path/to/client-key.pem")
>     .tlsCertificateFilePath("/path/to/client-cert.pem")
>     .build();
> 
> TLS transport encryption with any authentication
> 
> We can call tlsKeyFilePath(), tlsCertificateFilePath(),
> tlsTrustCertsFilePath() and authentication() to configure TLS transport
> encryption with any authentication. The code looks like this:
> 
> PulsarClient client = PulsarClient.builder()
>     .serviceUrl("pulsar+ssl://my-host:6650")
>     .tlsTrustCertsFilePath("/path/to/cacert.pem")
>     .tlsKeyFilePath("/path/to/client-key.pem")
>     .tlsCertificateFilePath("/path/to/client-cert.pem")
>     .authentication(AuthenticationTls.class.getName() /* AuthenticationToken.class.getName() */, authParams)
>     .build();
> 
> For AuthenticationTls, we need to check the authParams: when the
> authParams is empty, we read the TLS config from ClientBuilder;
> otherwise we read it from the authParams.
> 
> Compatibility
> 
> None.



Re: [DISCUSS] PIP-158: Split client TLS transport encryption from authentication

2022-05-08 Thread Yunze Xu
Thanks for your clarification. Let’s continue maintaining these configs in
`ClientBuilder`.

Thanks,
Yunze




> On May 9, 2022, at 13:54, Zixuan Liu wrote:
> 
> Hi Yunze,
> 
> Thanks for your suggestion. Your idea is great, but we already have
> `tlsProtocols()` and `tlsCiphers()` in `ClientBuilder`, so I followed the
> same style.
> 
> Thanks,
> Zixuan
> 



Re: [VOTE] [PIP-158] Split client TLS transport encryption from authentication

2022-05-10 Thread Yunze Xu
+1 (non-binding)

Thanks,
Yunze


Re: [DISCUSSION] PIP-156: Enable system topic by default

2022-05-10 Thread Yunze Xu
+1 (non-binding)

Thanks,
Yunze



Re: [VOTE] [PIP-158] Split client TLS transport encryption from authentication

2022-05-14 Thread Yunze Xu
+1 (non-binding)

Thanks,
Yunze



Re: [DISCUSS] Byte schema compatibility issue

2022-05-16 Thread Yunze Xu
For case 1, if you use the bytes schema to produce messages, it is the user's
responsibility to ensure schema compatibility. Then, on the consumer side,
`Message#getValue`, which decodes the bytes internally via the schema,
should throw a `SchemaSerializationException` if the bytes of the value cannot
be decoded.

Unfortunately, there is a bug that prevents the bytes from being decoded: it
always fails before decoding. I opened a PR to fix this issue:
https://github.com/apache/pulsar/pull/15622

If you don’t want to check schema compatibility on the consumer side, you can
set `isSchemaValidationEnforced` to true so that creating a producer without a
schema on a topic with a schema fails.

IMO, the bytes schema is treated as “without schema”. The issue is actually:
- Produce messages without schema
- Consume messages with schema

If `isSchemaValidationEnforced` is true, the producer cannot be created.
Otherwise, since we cannot guarantee the format of the message on the producer
side and we cannot try to decode it on the broker side, the only way is to
handle the error on the consumer side:
1. If the message is decoded successfully, return the decoded value.
2. Otherwise, throw a `SchemaSerializationException`.
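
The two consumer-side outcomes can be sketched without the Pulsar client on the classpath. In this illustration a strict UTF-8 decoder stands in for the consumer's schema, `getValue` plays the role of `Message#getValue`, and `DecodeException` is a local stand-in for `SchemaSerializationException`; all of these names are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for consumer-side decoding of bytes sent without a schema.
final class ConsumerSideDecodeSketch {
    /** Local stand-in for Pulsar's SchemaSerializationException. */
    static final class DecodeException extends RuntimeException {
        DecodeException(Throwable cause) {
            super("payload does not match the consumer schema", cause);
        }
    }

    /** Returns the decoded value, or throws if the bytes cannot be decoded. */
    static String getValue(byte[] payload) {
        // REPORT makes the decoder fail on bad input instead of replacing it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(payload)).toString();
        } catch (CharacterCodingException e) {
            throw new DecodeException(e);
        }
    }

    public static void main(String[] args) {
        // Outcome 1: the bytes match the expected format.
        System.out.println(getValue("hello".getBytes(StandardCharsets.UTF_8)));
        // Outcome 2: arbitrary bytes from a schemaless producer fail to decode.
        try {
            getValue(new byte[] {(byte) 0xC3, (byte) 0x28}); // invalid UTF-8 sequence
        } catch (DecodeException e) {
            System.out.println("decode failed: " + e.getMessage());
        }
    }
}
```

The application then decides what to do with an undecodable message, for example logging and acknowledging it or routing it elsewhere.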

There is no problem with the current implementation except what I tried to fix
in #15622.


Thanks,
Yunze




> On Mar 8, 2022, at 10:55, guo jiwei wrote:
> 
> Hi,
>   I want to discuss the compatibility issue with the byte schema here.
>   For now, the byte schema is compatible with all other schemas. This may
> introduce more issues.
>   Case 1:
>  1. Consumer1 is initialized with a JSON schema for topic A.
>  2. But producer1 is initialized without a schema and sends byte messages
> directly to topic A.
>  This will cause consumer1 to fail when deserializing messages. Also,
> producer1 may send unsafe byte data.
> 
> Case 2:
>   1. Consumer1 is initialized with the byte schema for topic A.
>   2. But producer1 is initialized with an AVRO/JSON schema and sends messages
> to topic A.
>   As a result, consumer1 doesn't know how to deserialize the messages.
> 
>    To avoid the above issues, the byte schema should also follow the schema
> compatibility policy. I opened #13701 to track this. If the idea
> is accepted, I will start a PIP.
> 
> Please give some suggestions about this idea.
> 
> 
> Regards
> Jiwei Guo (Tboy)



Re: [DISCUSS] Apache Pulsar 2.10.1 release

2022-05-23 Thread Yunze Xu
+1

Thanks,
Yunze




> On May 23, 2022, at 11:34, Hang Chen wrote:
> 
> +1
> 
> There are a lot of transaction fixes.
> 
> Thanks,
> Hang
> 
> On Sat, May 21, 2022 at 13:06, PengHui Li wrote:
>> 
>> Hello, Pulsar community:
>> 
>> I'd like to propose to release Apache Pulsar 2.10.1
>> 
>> Currently, we have 190 commits [0] and there are many transaction
>> fixes, security fixes.
>> 
>> And there are 22 open PRs [1], I will follow them to make sure that
>> the important fixes could be contained in 2.10.1
>> 
>> If you have any important fixes or any questions,
>> please reply to this email, we will evaluate whether to
>> include it in 2.10.1
>> 
>> [0]
>> https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.10.1+
>> [1]
>> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+label%3Arelease%2F2.10.1+
>> 
>> Best Regards
>> Penghui



Re: About 2.10.1, 2.9.2, 2.8.4 has this been cherry picked yet? [fix][java-client] Fix performance regression with message listener

2022-05-31 Thread Yunze Xu
We use `cherry-picked` labels to mark that a PR has been cherry-picked into
a specific branch. For example, the `cherry-picked/branch-2.10` label means
the PR has been cherry-picked to branch-2.10.

Thanks,
Yunze




> On Jun 1, 2022, at 09:57, Dave Fisher wrote:
> 
> This is a fix for a severe performance regression and I wanted to be sure 
> that it gets cherry picked.
> 
>> On May 31, 2022, at 6:48 PM, GitBox  wrote:
>> 
>> 
>> dave2wave commented on PR #15162:
>> URL: https://github.com/apache/pulsar/pull/15162#issuecomment-1143028123
>> 
>>  Has this been cherry picked to 2.10, 2.9, and 2.8 yet?
>> 
>> 
>> -- 
>> This is an automated message from the Apache Git Service.
>> To respond to the message, please log on to GitHub and use the
>> URL above to go to the specific comment.
>> 
>> To unsubscribe, e-mail: commits-unsubscr...@pulsar.apache.org
>> 
>> For queries about this service, please contact Infrastructure at:
>> us...@infra.apache.org
>> 
> 



New proposal for chunk messages with shared subscriptions

2022-06-07 Thread Yunze Xu
Hi folks,

Recently I've been working on the implementation of PIP-37, see
https://github.com/apache/pulsar/wiki/PIP-37%3A-Large-message-size-handling-in-Pulsar#usecase-3-multiple-producers-with-shared-consumers

As we can see, https://github.com/apache/pulsar/pull/4400 only
implements chunking messages with non-shared subscriptions. When I
followed the **Option 2** section, I found it works but there are many
details that need to be taken care of.

For example,
- Should we add a marker type to indicate the chunk marker?
- Normally, the markers like Transaction markers are not visible to
  the client, but we need to send the chunk marker to client.
- What's the format of the chunk marker?
- Which compatibility problems would be brought by this design?

I think we need a new proposal to explain it in detail and I'm
working on that, as well as the demo.

Feel free to ping me if you have any concerns.

Thanks,
Yunze






Re: New proposal for chunk messages with shared subscriptions

2022-06-14 Thread Yunze Xu
I opened an initial PR for it: https://github.com/apache/pulsar/pull/16061 

It doesn’t adopt any option of the original PIP 37. I think we need another
proposal for it, just like the chunked message ID in PIP 107.


Thanks,
Yunze




> On Jun 7, 2022, at 22:12, Yunze Xu wrote:
> 
> Hi folks,
> 
> Recently I'm working on the implementation of PIP-37, see
> https://github.com/apache/pulsar/wiki/PIP-37%3A-Large-message-size-handling-in-Pulsar#usecase-3-multiple-producers-with-shared-consumers
>  
> 
> As we can see, https://github.com/apache/pulsar/pull/4400 only
> implements chunking messages with non-shared subscriptions. When I
> followed the **Option 2** section, I found it works but there are many
> details that need to be taken care of.
> 
> For example,
> - Should we add a marker type to indicate the chunk marker?
> - Normally, the markers like Transaction markers are not visible to
>  the client, but we need to send the chunk marker to client.
> - What's the format of the chunk marker?
> - Which compatibility problems would be brought by this design?
> 
> I think we need a new proposal to explain it in details and I'm
> working on that, as well as the demo.
> 
> Feel free to ping me if you have any concern.
> 
> Thanks,
> Yunze
> 
> 
> 
> 



[DISCUSS] User-friendly acknowledgeCumulative API on a partitioned topic or multi-topics

2022-07-15 Thread Yunze Xu
Hi all,

A long time ago I opened a PR to support cumulative acknowledgement
for the C++ client, but whether a partitioned consumer should
acknowledge a message ID cumulatively turned out to be controversial.

See https://github.com/apache/pulsar/pull/6796 for discussion.

Currently, the Java client acknowledges the specific partition of the
message ID, while the C++ client just fails when calling
`acknowledgeCumulative` on a partitioned topic. However, even if the
Java client doesn't fail, it's not user friendly.

Assuming users call `acknowledgeCumulative` periodically, there is a
chance that no message of a specific partition is ever passed to the
method.

For example, a consumer received:

P0-M0, P1-M0, P0-M1, P1-M1, P0-M2, P1-M2...

And the user acknowledged every second message, i.e.

P0-M0, P0-M1, P0-M2

As a result, partition 1 is never acknowledged.

Users must maintain their own `Map<String, MessageId>` cache for a
partitioned topic or multi-topics consumer with the existing
`acknowledgeCumulative` API.
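
A minimal sketch of that bookkeeping, using plain strings in place of the
client's `MessageId` type so that it runs without the Pulsar client on the
classpath. The class name `LatestPerPartition` and its methods are
hypothetical and only illustrate the idea; they are not part of any Pulsar
API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: remembers the latest message received from each
// partition so that a cumulative acknowledgement can cover all of them.
class LatestPerPartition {
    private final Map<String, String> latest = new LinkedHashMap<>();

    // Record a received message; a later message from the same partition
    // overwrites the earlier one.
    void onReceive(String partition, String messageId) {
        latest.put(partition, messageId);
    }

    // Copy of the per-partition positions that should be acknowledged.
    Map<String, String> snapshot() {
        return new LinkedHashMap<>(latest);
    }

    public static void main(String[] args) {
        LatestPerPartition tracker = new LatestPerPartition();
        String[][] received = {
            {"P0", "M0"}, {"P1", "M0"}, {"P0", "M1"},
            {"P1", "M1"}, {"P0", "M2"}, {"P1", "M2"},
        };
        for (String[] message : received) {
            tracker.onReceive(message[0], message[1]);
        }
        // Both partitions are present, so neither is left unacknowledged.
        System.out.println(tracker.snapshot()); // prints {P0=M2, P1=M2}
    }
}
```

With a real consumer, the application would presumably acknowledge each
entry of the snapshot cumulatively, one call per partition.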

Should we make it more friendly for users? For example, we can make
`acknowledgeCumulative` accept the map to remind users to maintain
the map from topic name to message ID:

```java
// the key is the partitioned topic name like my-topic-partition-0
void acknowledgeCumulative(Map<String, MessageId> topicToMessageId);
```

For those who don't want to maintain the map by themselves, maybe we
can provide a simpler API like:

```java
// acknowledge all latest received messages
void acknowledgeCumulative();
```

and provide an option to enable this behavior.

Do you have any suggestion on this idea? I will prepare a proposal if
there is no disagreement.

Thanks,
Yunze






Re: [DISCUSS] Apache Pulsar 2.11.0 Release

2022-07-17 Thread Yunze Xu
In addition to #16202, there is a follow-up PR to support the correct
ACK implementation for chunked messages. It depends on #16202, but I
think I can submit an initial PR this week and change the tests after
#16202 is merged.

Thanks,
Yunze




> On Jul 18, 2022, at 11:22, PengHui Li wrote:
> 
> Hi all,
> 
> We released 2.10.0 three months ago. And there are many great changes in
> the master branch,
> including new features and performance improvements.
> 
> - PIP 74: apply client memory to consumer
> https://github.com/apache/pulsar/pull/15216
> - PIP 143: Support split bundles by specified boundaries
> https://github.com/apache/pulsar/pull/13796
> - PIP 145: regex subscription improvements
> https://github.com/apache/pulsar/pull/16062
> - PIP 160: transaction performance improvements (still in progress and
> merged some PRs)
> - PIP 161: new exclusive producer mode support
> https://github.com/apache/pulsar/pull/15488
> - PIP 182: Provide new load balance placement strategy implementation for
> ModularLoadManagerStrategy https://github.com/apache/pulsar/pull/16281
> Add Pulsar Auth support for the Pulsar SQL
> https://github.com/apache/pulsar/pull/15571
> 
> And some features are blocked in the review stage, but they are powerful
> improvements for Pulsar
> 
> PIP 37: Support chunking with Shared subscription
> https://github.com/apache/pulsar/pull/16202
> PIP-166: Function add MANUAL delivery semantics
> https://github.com/apache/pulsar/pull/16279
> 
> You can find the complete change list in 2.11.0 at
> https://github.com/apache/pulsar/pulls?q=is%3Apr+milestone%3A2.11.0+-label%3Arelease%2F2.10.1+-label%3Arelease%2F2.10.2
> 
> And maybe I missed some important in-progress PRs, please let me know if it
> should be a blocker of the 2.11.0 release.
> 
> It's a good time to discuss the target time of the 2.11.0 release.
> I think we can leave 2 weeks to complete the in-progress PRs and 2 weeks to
> accept bug fixes.
> And target the 2.11.0 release in mid-August.
> 
> Please let me know what you think.
> 
> Thanks,
> Penghui



[DISCUSS] Apache Pulsar 2.8.4 release

2022-07-21 Thread Yunze Xu
Hello Pulsar Community,

It has been several months since the 2.8.3 release and there have been
many important fixes since then. For example, there is a performance
regression for message listener that was introduced in 2.8.3 and
fixed in https://github.com/apache/pulsar/pull/15162.

I volunteer to be the release manager for 2.8.4.

Here [0] you can find the list of 149 commits to branch-2.8 since the
2.8.3 release. There are 54 closed PRs targeting 2.8.4 that have not
yet been cherry-picked [1] and I will cherry-pick them and solve the
conflicts if necessary. There are 18 open PRs labeled with
`release/2.8.4` [2]. I will check the status of them and see if they
can be pushed to 2.8.4.

Thanks,
Yunze

[0] - https://github.com/apache/pulsar/compare/v2.8.3...branch-2.8
[1] - 
https://github.com/apache/pulsar/pulls?q=is%3Apr+label%3Arelease%2F2.8.4+-label%3Acherry-picked%2Fbranch-2.8+is%3Aclosed
[2] - 
https://github.com/apache/pulsar/pulls?q=is%3Apr+label%3Arelease%2F2.8.4+-label%3Acherry-picked%2Fbranch-2.8+is%3Aopen
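
The search links [1] and [2] above are GitHub pull-request queries in
URL-encoded form. A small sketch of how such a link can be built from the
readable query with the JDK's `URLEncoder`; the class and method names are
only illustrative.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative helper that turns a readable GitHub search query into the
// encoded link form used in the references of this mail.
class PrSearchLink {
    private static final String BASE = "https://github.com/apache/pulsar/pulls?q=";

    static String forQuery(String query) {
        // URLEncoder applies form encoding: ':' -> %3A, '/' -> %2F, ' ' -> '+'.
        return BASE + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Closed PRs labeled release/2.8.4 without the cherry-picked label.
        System.out.println(forQuery(
                "is:pr label:release/2.8.4 -label:cherry-picked/branch-2.8 is:closed"));
    }
}
```

Replacing `is:closed` with `is:open` in the query yields the second link.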



Re: [DISCUSS] Apache Pulsar 2.8.4 release

2022-07-21 Thread Yunze Xu
Sure, I will be careful with the PRs that are labeled `release/2.8.4` but
not yet cherry-picked to branch-2.8.

Thanks,
Yunze




> On Jul 22, 2022, at 00:09, Dave Fisher wrote:
> 
> Thank you for volunteering!
> 
>> On Jul 21, 2022, at 12:57 AM, Yunze Xu  wrote:
>> 
>> Hello Pulsar Community,
>> 
>> It has been several months since the 2.8.3 release and there are many
>> important fixes after that. For example, there is a performance
>> regression for message listener that was introduced from 2.8.3 and
>> fixed in https://github.com/apache/pulsar/pull/15162.
>> 
>> I volunteer to be the release manager for 2.8.4.
>> 
>> Here [0] you can find the list of 149 commits to branch-2.8 since the
>> 2.8.3 release. There are 54 closed PRs targeting 2.8.4 that have not
>> yet been cherry-picked [1] and I will cherry-pick them and solve the
>> conflicts if necessary. There are 18 open PRs labeled with
>> `release/2.8.4` [2]. I will check the status of them and see if they
>> can be pushed to 2.8.4.
> 
> Please use discretion about selecting bug fixes only and avoid adding new 
> features or unexpected configuration changes. Thanks!
> 
> Ask for help with the cherry picks since being in a Release Manager is a big 
> job!
> 
> Regards,
> Dave
> 
> 
>> 
>> Thanks,
>> Yunze
>> 
>> [0] - https://github.com/apache/pulsar/compare/v2.8.3...branch-2.8
>> [1] - 
>> https://github.com/apache/pulsar/pulls?q=is%3Apr+label%3Arelease%2F2.8.4+-label%3Acherry-picked%2Fbranch-2.8+is%3Aclosed
>> [2] - 
>> https://github.com/apache/pulsar/pulls?q=is%3Apr+label%3Arelease%2F2.8.4+-label%3Acherry-picked%2Fbranch-2.8+is%3Aopen
>> 
> 



Re: [DISCUSS] Apache Pulsar 2.11.0 Release

2022-07-26 Thread Yunze Xu
I opened a PR https://github.com/apache/pulsar/pull/16803 that might be a
blocker for the 2.11.0 release, PTAL.

Thanks,
Yunze




> On Jul 22, 2022, at 18:21, Zixuan Liu wrote:
> 
> +1
> 
> Thanks,
> Zixuan
> 
> On Fri, Jul 22, 2022 at 18:06, Enrico Olivelli wrote:
> 
>> On Fri, Jul 22, 2022 at 12:05, guo jiwei wrote:
>>> 
>>> I will take the release
>> 
>> Thanks !
>> 
>> Enrico
>> 
>>> 
>>> Regards
>>> Jiwei Guo (Tboy)
>>> 
>>> 
>>> On Fri, Jul 22, 2022 at 12:37 AM Nicolò Boschi 
>> wrote:
>>> 
>>>> I understand the need for the Pulsar Summit.
>>>> 
>>>> In that case I have to step back because I will be offline for the
>> next few
>>>> weeks
>>>> Sorry
>>>> 
>>>> Nicolò Boschi
>>>> 
>>>> On Thu, Jul 21, 2022, 06:32 PengHui Li wrote:
>>>> 
>>>>> Thanks for volunteering Nicolò.
>>>>> 
>>>>>> So a plan could be to try to merge the work in progress targeted
>> for
>>>> 2.11
>>>>> by the mid of August and then start the code freezing as described
>> in the
>>>>> PIP.
>>>>> 
>>>>> So the target release date will be early September. One point is
>> Pulsar
>>>>> Summit
>>>>> San Francisco will start on August 18, 2022. I think maybe we can
>> start
>>>> to
>>>>> test
>>>>> the master branch for now and continue the in-progress tasks. If we
>> can
>>>>> have a
>>>>> major release before Pulsar Summit, it should be good news to the
>>>>> Community.
>>>>> 
>>>>> Thanks.
>>>>> Penghui
>>>>> 
>>>>> On Mon, Jul 18, 2022 at 4:06 PM Enrico Olivelli >> 
>>>>> wrote:
>>>>> 
>>>>>> Nicolò,
>>>>>> 
>>>>>> On Mon, Jul 18, 2022, 10:00 Nicolò Boschi wrote:
>>>>>> 
>>>>>>> Thanks Penghui for the reminder.
>>>>>>> I'd like to also include PIP: 181 Pulsar shell if the time
>> permits.
>>>>>>> 
>>>>>>> I believe that is a good idea to start testing the code freeze
>>>> proposed
>>>>>> by
>>>>>>> PIP-175 (https://github.com/apache/pulsar/issues/15966). Even
>> if not
>>>>>>> officially approved, we discussed it many times and agreed to the
>>>>>>> usefulness of the code freezing.
>>>>>>> 
>>>>>> 
>>>>>> Great idea!
>>>>>> 
>>>>>> We should really try it
>>>>>> 
>>>>>> So a plan could be to try to merge the work in progress targeted
>> for
>>>> 2.11
>>>>>>> by the mid of August and then start the code freezing as
>> described in
>>>>> the
>>>>>>> PIP.
>>>>>>> 
>>>>>>> Also, I volunteer for driving the release if nobody else is
>>>> interested
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks for volunteering
>>>>>> 
>>>>>> Enrico
>>>>>> 
>>>>>> 
>>>>>>> Thanks,
>>>>>>> Nicolò Boschi
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Jul 18, 2022 at 06:59, Yunze Xu wrote:
>>>>>>> 
>>>>>>>> In addition to #16202, there is a following PR to support the
>>>> correct
>>>>>>>> ACK implementation for chunked messages. It should depend on
>> #16202
>>>>>>>> But I think I can submit an initial PR this week and change the
>>>> tests
>>>>>>>> after #16202 is merged.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Yunze
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jul 18, 2022, at 11:22, PengHui Li wrote:
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>&

Re: [ANNOUNCE] Michael Marshall as a new PMC member in Pulsar

2022-07-26 Thread Yunze Xu
Congratulations Michael!

Thanks,
Yunze




> On Jul 26, 2022, at 23:21, Enrico Olivelli wrote:
> 
> I am glad to announce that the Apache Pulsar PMC invited Michael to
> join the PMC and he accepted.
> 
> Michael is doing a great job in stewarding our community.
> 
> Please join me and celebrate !
> 
> Enrico Olivelli



Re: [DISCUSS] Apache Pulsar 2.8.4 release

2022-08-05 Thread Yunze Xu
Hi, all

I have cherry-picked all necessary PRs into branch-2.8 now, see
https://github.com/apache/pulsar/pulls?q=is%3Apr+label%3Arelease%2F2.8.4+-label%3Acherry-picked%2Fbranch-2.8+

For the PRs that have the `release/2.8.4` label but are not cherry-picked
into branch-2.8, I changed the label to `release/2.8.5`. If any of you
want some PRs to be included in 2.8.4, please reply to this email. I
will begin the release next Tuesday (UTC+8 Time Zone).

Thanks,
Yunze




Re: [DISCUSS] Create a new Github Project to track the flaky tests

2022-08-10 Thread Yunze Xu
+1. Though the image url is broken :(

Thanks,
Yunze




> On Aug 10, 2022, at 16:35, PengHui Li wrote:
> 
> Hi all,
> 
> For better tracking flaky test fix, I have tried to create a Github Project 
> under the Pulsar repo https://github.com/apache/pulsar/projects/11 (It can be 
> removed if we don't like this way)
> This will help us to have a more intuitive view of all the flaky tests, how 
> many are in progress, in the review stage, and approved.
> 
> The project is public for all the contributors, so if you want to contribute 
> some flaky test fixes,
> you can go to the Github Project to pick up items in the Todo column.
> 
> And I also created a PR https://github.com/apache/pulsar/pull/17038 to add 
> the PRs and issues
> with `flaky-tests` label to this project automatically. 
> 
> BTW, I also have some questions about the Github Project automation. The 
> description of the `Review in progress` column says that a PR with requested 
> changes would go to this column automatically, but it doesn't work. I'm not 
> sure why.
> 
> 
> Best,
> Penghui
> 



Re: [DISCUSS] Skip unnecessary tests when there are only cpp/python related changes

2022-08-10 Thread Yunze Xu
LGTM

Thanks,
Yunze




> On Aug 10, 2022, at 15:36, Zike Yang wrote:
> 
> Hi, Pulsar community
> 
> Currently, Java tests consume significant CI resources. And it is not
> necessary to run all the tests for changes that are only on the C++ or
> python parts of the code. I have created a PR [0] to improve the CI by
> skipping unnecessary tests when there are only CPP/Python changes.
> This can significantly increase the efficiency of CI when testing the
> C++/Python part of the code.
> 
> After this PR gets merged, we will skip java unit tests, integration
> tests(the part only for java codes), and go function tests when there
> are only cpp/python changes. But the system test is not skipped
> because there are some python function codes in that test. Perhaps in
> the future, we can further optimize the system test to skip
> unnecessary matrix tests for PRs with only C++ changes.
> 
> I have created a test PR in a separate repo to verify this PR. [1]
> And more detail in [2].
> 
> Please take a look and feel free to comment on it.
> 
> Regarding the current Pulsar CI, I have a question. Why do we need to
> add doc_only check at each step when skipping code tests instead of
> just skipping the whole job for PR with only doc changes? [3] Is there
> any concern?
> 
> Please let me know what you think. Thanks!
> 
> 
> [0] https://github.com/apache/pulsar/pull/16988
> [1] https://github.com/RobertIndie/pulsar-ci-test/pull/1
> [2] https://github.com/RobertIndie/pulsar-ci-test/actions/runs/2829525510
> [3] 
> https://github.com/apache/pulsar/blob/master/.github/workflows/pulsar-ci.yaml#L380
> 
> Best,
> Zike Yang



Re: [DISCUSS] Apache Pulsar 2.11.0 Release

2022-08-10 Thread Yunze Xu
I found the scripts to build rpm and deb packages are broken, see
https://github.com/apache/pulsar/wiki/Release-process#31-build-rpm-and-deb-packages.

It's caused by https://github.com/apache/pulsar/pull/15376 and only
affects the 2.11.0 release and higher versions. It should be a blocker
for the 2.11.0 release. I'm working on this issue at the moment and am
going to push a fix soon, along with a CI check that protects the
rpm/deb packaging against regressions.

Thanks,
Yunze




> On Aug 4, 2022, at 23:44, guo jiwei wrote:
> 
> Hi all,
> 
> Put an update here, we have created branch-2.11[1].
> 
> [1] https://github.com/apache/pulsar/tree/branch-2.11
> 
> 
> Regards
> Jiwei Guo (Tboy)
> 
> 
> On Wed, Jul 27, 2022 at 10:59 AM Zixuan Liu  wrote:
> 
>> +1
>> 
>> Thanks,
>> Zixuan
>> 
>> On Tue, Jul 26, 2022 at 23:34, Yunze Xu wrote:
>> 
>>> I opened a PR https://github.com/apache/pulsar/pull/16803 that might be
>>> the blocker
>>> of the release of 2.11.0, PTAL.
>>> 
>>> Thanks,
>>> Yunze
>>> 
>>> 
>>> 
>>> 
>>>> On Jul 22, 2022, at 18:21, Zixuan Liu wrote:
>>>> 
>>>> +1
>>>> 
>>>> Thanks,
>>>> Zixuan
>>>> 
>>>> On Fri, Jul 22, 2022 at 18:06, Enrico Olivelli wrote:
>>>> 
>>>>> On Fri, Jul 22, 2022 at 12:05, guo jiwei wrote:
>>>>>> 
>>>>>> I will take the release
>>>>> 
>>>>> Thanks !
>>>>> 
>>>>> Enrico
>>>>> 
>>>>>> 
>>>>>> Regards
>>>>>> Jiwei Guo (Tboy)
>>>>>> 
>>>>>> 
>>>>>> On Fri, Jul 22, 2022 at 12:37 AM Nicolò Boschi >> 
>>>>> wrote:
>>>>>> 
>>>>>>> I understand the need for the Pulsar Summit.
>>>>>>> 
>>>>>>> In that case I have to step back because I will be offline for the
>>>>> next few
>>>>>>> weeks
>>>>>>> Sorry
>>>>>>> 
>>>>>>> Nicolò Boschi
>>>>>>> 
>>>>>>> On Thu, Jul 21, 2022, 06:32 PengHui Li wrote:
>>>>>>> 
>>>>>>>> Thanks for volunteering Nicolò.
>>>>>>>> 
>>>>>>>>> So a plan could be to try to merge the work in progress targeted
>>>>> for
>>>>>>> 2.11
>>>>>>>> by the mid of August and then start the code freezing as described
>>>>> in the
>>>>>>>> PIP.
>>>>>>>> 
>>>>>>>> So the target release date will be early September. One point is
>>>>> Pulsar
>>>>>>>> Summit
>>>>>>>> San Francisco will start on August 18, 2022. I think maybe we can
>>>>> start
>>>>>>> to
>>>>>>>> test
>>>>>>>> the master branch for now and continue the in-progress tasks. If we
>>>>> can
>>>>>>>> have a
>>>>>>>> major release before Pulsar Summit, it should be good news to the
>>>>>>>> Community.
>>>>>>>> 
>>>>>>>> Thanks.
>>>>>>>> Penghui
>>>>>>>> 
>>>>>>>> On Mon, Jul 18, 2022 at 4:06 PM Enrico Olivelli <
>> eolive...@gmail.com
>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Nicolò,
>>>>>>>>> 
>>>>>>>>> On Mon, Jul 18, 2022, 10:00 Nicolò Boschi wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Penghui for the reminder.
>>>>>>>>>> I'd like to also include PIP: 181 Pulsar shell if the time
>>>>> permits.
>>>>>>>>>> 
>>>>>>>>>> I believe that is a good idea to start testing the code freeze
>>>>>>> proposed
>>>>>>>>> by
>>>>>>>>>> PIP-175 (https://github.com/apache/pulsar/issues/15966). Even
>>>>> if not
>>>>>>>>>> officially approved, we discussed it many times and agreed to the
>>>>>>>>>> 

Questions about the release process

2022-08-11 Thread Yunze Xu
Hi all,

I have recently been working on the 2.8.4 release. The vote on the 1st
candidate is near, but I have some questions.

From the tutorial [1] we can see that the 8th step is "Run the vote".
However, the 7th step is "Write release notes". Should we execute this
step later? I see the 16th step is also "Write release notes", but the
16th step listed at the beginning of the "Release workflow" section is
"Update the site".

In addition, I found the previous candidate [2] includes the docker
images, which are not included in the template of the 8th step "Run the
vote". That seems to be the 10th step, "Publish Docker Images".

It seems that the documents are not maintained well, which really
confuses me. Therefore, before voting for the 1st candidate, I
want to get some clarifications from the mailing list.

[1] https://github.com/apache/pulsar/wiki/Release-process
[2] https://lists.apache.org/thread/q0g5ko617rb77b1wqpxy94ks5mq48d88


Thanks,
Yunze






Re: [Discuss] PIP 198: Standardize PR Naming Convention using GitHub Actions

2022-08-11 Thread Yunze Xu
+1 on the customized one

Thanks,
Yunze




> On Aug 12, 2022, at 00:25, Alexander Preuss wrote:
> 
> Hi all,
> 
> Thank you for driving this topic!
> I agree that our customized convention is better than the Angular one.
> 
> +1 on the customized one
> 
> Best,
> Alex
> 
> 
> 
> On Thu, Aug 11, 2022 at 11:02 AM Yu  wrote:
> 
>> Hi team,
>> 
>> For PIP 198: Standardize PR Naming Convention using GitHub Actions [1], we've
>> got many different suggestions on implementation details. Let's discuss
>> them one by one.
>> 
>> Discussion topic:
>> 
>> For PR titles, which convention should we follow?
>> - Angular convention [2]
>> - Our existing convention (it's customized based on Angular) [3]
>> 
>> The differences between Angular and ours are:
>> 
>> 1. Definition
>> 1.1 Property
>> - Angular: [type] is required, [scope] is optional
>> - Ours: [type] and [scope] are required
>> 1.2 Content
>> - Angular: ci, test, and docs belong to [type]
>> - Ours: ci, test, and docs belong to [scope] because I think [type]
>>   defines "what action do you make" (eg. add/delete/update/...),
>>   while [scope] defines "where do you make changes".
>> 2. Punctuation
>> - Angular: parenthesis and exclaim points are used
>> - Ours: brackets are used
>> 
>> Comparison examples
>> 
>> Taking existing Pulsar PR titles as examples:
>> 
>> Example 1
>> - Angular: fix: Filter out already deleted entries again before
>>   sending messages to consumers
>> - Ours: [fix][broker] Filter out already deleted entries again before
>>   sending messages to consumers
>> 
>> Example 2
>> - Angular: ci: add flaky test issues and PRs to flaky test project
>> - Ours: [feat][ci] Add flaky test issues and PRs to flaky test project
>> 
>> Example 3
>> - Angular: docs: Document configuration added by PIP-145 doc
>> - Ours: [improve][doc] Document configuration added by PIP-145 doc
>> 
>> I prefer our customized convention because:
>> - It makes PR titles more clear and self-explanatory.
>> - No need to change users' habits since many people in the community
>>   have followed and gotten used to it for several months [4].
>> 
>> Feel free to comment. Thank you!
>> [1]
>> https://docs.google.com/document/d/1sJlUNAHnYAbvu9UtEgCrn_oVTnVc1M5nHC19x1bFab4/edit?pli=1
>> 
>> [2]
>> https://github.com/angular/angular/blob/main/CONTRIBUTING.md#commit-message-header
>> [3]
>> https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit?pli=1#bookmark=id.y8943h392zno
>> [4] https://github.com/apache/pulsar/pulls
>> 
>> Yu, Max, mangoGoForward
>> 
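
A rough sketch of how a title could be checked against the customized
`[type][scope] Summary` shape quoted above. The class name, the method name
and the exact regular expression are assumptions made for illustration. In
particular, it assumes lowercase types and scopes, as in the quoted
examples, and it does not check them against any list of allowed values. It
is not the check proposed in PIP 198.

```java
import java.util.regex.Pattern;

// Illustrative check for titles of the form "[type][scope] Summary".
class PrTitleCheck {
    // Assumption: type and scope are lowercase words (scope may contain '-'),
    // followed by one space and a non-empty summary.
    private static final Pattern CUSTOM =
            Pattern.compile("^\\[[a-z]+\\]\\[[a-z-]+\\] \\S.*$");

    static boolean followsCustomConvention(String title) {
        return CUSTOM.matcher(title).matches();
    }

    public static void main(String[] args) {
        String[] titles = {
            "[feat][ci] Add flaky test issues and PRs to flaky test project",
            "ci: add flaky test issues and PRs to flaky test project",
        };
        for (String title : titles) {
            System.out.println(followsCustomConvention(title) + "  " + title);
        }
    }
}
```

The first title follows the customized convention and passes; the second is
in the Angular form and is rejected.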



Re: Questions about the release process

2022-08-12 Thread Yunze Xu
I'm going to push the Docker images for the 1st candidate soon.
Unfortunately, when I followed the 10th step "Publish Docker Images",
I found the tag doesn't have the "-rc" suffix.

```bash
cd docker
./build.sh
DOCKER_USER=bewaremypower DOCKER_PASSWORD="" \
  DOCKER_ORG=bewaremypower ./publish.sh
```

I didn't look deeply into `build.sh` and `publish.sh`, but I think we
need to make this clear for release managers in the documentation.
I'm strongly against asking previous release managers many questions
directly. Unfortunately, that is what I did in the past few days. The
documentation could, and should, be better.

Anyway, I will open a VOTE soon.


Thanks,
Yunze




> On Aug 11, 2022, at 21:48, Yunze Xu wrote:
> 
> Hi all,
> 
> Recently I'm working on the release of 2.8.4 and it's near the vote of
> the 1st candidate but I have some questions.
> 
> From the tutorial [1] we can see, the 8th step is "Run the vote".
> However, the 7th step is "Write release notes", should we execute this
> step later? I see the 16th step is also "Write release notes" but the
> 16th step at the beginning of "Release workflow" section is "Update
> the site".
> 
> In addition, I found the previous candidate [2] includes the docker
> images, which is not included in the template of the 8th step "Run the
> vote". It seems to be the 10th step "Publish Docker Images".
> 
> It seems that the documents are not maintained well, which really
> makes me confused. Therefore, before voting for the 1st candidate, I
> want to get some clarifications from the mail list.
> 
> [1] https://github.com/apache/pulsar/wiki/Release-process
> [2] https://lists.apache.org/thread/q0g5ko617rb77b1wqpxy94ks5mq48d88
> 
> 
> Thanks,
> Yunze
> 
> 
> 
> 



[VOTE] Pulsar Release 2.8.4 Candidate 1

2022-08-12 Thread Yunze Xu
This is the first release candidate for Apache Pulsar, version 2.8.4.

It fixes the following issues:
https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.8.4

*** Please download, test and vote on this release. This vote will stay open
for at least 72 hours ***

Note that we are voting upon the source (tag), binaries are provided for
convenience.

Source and binary files:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.4-candidate-1/

SHA-512 checksums:

c3d26704f2cfb3365c29d4110612ca7351084f8bee3c306d5e906b3f9b22c7557cc5baf12f74f8c222baccae3310691419eda5b47fdf9cd6c5281b70134ab5eb
  apache-pulsar-2.8.4-bin.tar.gz
28160ee94dccfb74dfb56e0e5d0e08870c6612659507333a52b5660ecbf060a75d1eed667cffd8596f9468de95055b78916b932db0e0d4c2745868d55429ee98
  apache-pulsar-2.8.4-src.tar.gz
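
For validators who prefer a small program over a command-line tool, the
checksum comparison can be sketched with the JDK's `MessageDigest`. The
class and method names below are illustrative, and the file is read fully
into memory for brevity; a streaming read would suit the large release
archives better.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative SHA-512 check for a downloaded release artifact.
class Sha512Check {
    // Hex-encoded SHA-512 digest of the given bytes.
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is expected to be available on standard JDKs.
            throw new IllegalStateException(e);
        }
    }

    // Usage: java Sha512Check.java <file> <expected-hex-checksum>
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: Sha512Check <file> <expected-hex-checksum>");
            return;
        }
        String actual = sha512Hex(Files.readAllBytes(Path.of(args[0])));
        System.out.println(actual.equalsIgnoreCase(args[1].trim()) ? "OK" : "MISMATCH");
    }
}
```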

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachepulsar-1170/

The tag to be voted upon:
v2.8.4-candidate-1 (02ee5616866d4eda8dd94f85d9d9b71c459f248d)
https://github.com/apache/pulsar/releases/tag/v2.8.4-candidate-1

Pulsar's KEYS file containing PGP keys we use to sign the release:
https://dist.apache.org/repos/dist/dev/pulsar/KEYS

Docker images:

https://hub.docker.com/layers/pulsar/bewaremypower/pulsar/2.8.4/images/sha256-fba51a75c0f2ca79fbff7b254f80f641fcda661fd702f8149bbfdd5994078e3a

https://hub.docker.com/layers/pulsar-all/bewaremypower/pulsar-all/2.8.4/images/sha256-42d4b41e5869edc6242bb49d6a1687bd6d191a6385637122edc5c7b2c44ee46f

Please download the source package, and follow the Release Candidate
Validation [1] to validate the release.

[1] https://github.com/apache/pulsar/wiki/Release-Candidate-Validation

Thanks,
Yunze






Re: Questions about the release process

2022-08-12 Thread Yunze Xu
Yeah, I agree. It’s better to move the release process to the codebase.

Regarding the automatic validation program, we can have that for some
common verifications like the GPG verification, which only requires a simple
command if you have downloaded the binary.

Thanks,
Yunze




> On Aug 12, 2022, at 18:12, PengHui Li wrote:
> 
> Thanks for raising this question.
> 
> Maybe we'd better move the release process doc and validation doc
> to the codebase, not the wiki pages.
> 
> - Only committers can update the wiki pages
> - The changes without review
> 
> After moving to the pulsar codebase
> 
> - Everyone can contribute to the validation doc
> - The release process doc update can get reviewers to review
> 
> I think there are multiple issues that need to be resolved for the release
> process
> 
> - Have the Python client(Linux, osx) at the RC stage, I think currently we
> only have the C++ client for RC, but push to pypi after the RC gets passed
> - Add validation process for the Python and C++ client
> - Add the Go function and Python function validation process
> - Add a script for building images for RC
> - Add images validation process
> 
> And another point is can we have an automatic validation program to reduce
> the burden on validators?
> I'm not sure if it is acceptable.
> 
> Thanks,
> Penghui
> 
> On Fri, Aug 12, 2022 at 4:50 PM Haiting Jiang 
> wrote:
> 
>>> the 7th step is "Write release notes", should we execute this
>>> step later?
>> 
>> From what I see, the release note can be postponed after the voting
>> process.
>> And it's not part of the voting content and does not affect whether we
>> should cut a new release candidate.
>> 
>>> In addition, I found the previous candidate [2] includes the docker
>>> images, which is not included in the template of the 8th step "Run the
>>> vote". It seems to be the 10th step "Publish Docker Images".
>> 
>> Confused +1, If we do add docker image as part of release vote, we should
>> also add validation method in [1]
>> 
>> [1] https://github.com/apache/pulsar/wiki/Release-Candidate-Validation
>> 
>> Thanks,
>> Haiting
>> 
>> On Thu, Aug 11, 2022 at 9:49 PM Yunze Xu 
>> wrote:
>> 
>>> Hi all,
>>> 
>>> Recently I'm working on the release of 2.8.4 and it's near the vote of
>>> the 1st candidate but I have some questions.
>>> 
>>> From the tutorial [1] we can see, the 8th step is "Run the vote".
>>> However, the 7th step is "Write release notes", should we execute this
>>> step later? I see the 16th step is also "Write release notes" but the
>>> 16th step at the beginning of "Release workflow" section is "Update
>>> the site".
>>> 
>>> In addition, I found the previous candidate [2] includes the docker
>>> images, which is not included in the template of the 8th step "Run the
>>> vote". It seems to be the 10th step "Publish Docker Images".
>>> 
>>> It seems that the documents are not maintained well, which really
>>> makes me confused. Therefore, before voting for the 1st candidate, I
>>> want to get some clarifications from the mail list.
>>> 
>>> [1] https://github.com/apache/pulsar/wiki/Release-process
>>> [2] https://lists.apache.org/thread/q0g5ko617rb77b1wqpxy94ks5mq48d88
>>> 
>>> 
>>> Thanks,
>>> Yunze
>>> 
>>> 
>>> 
>>> 
>>> 
>> 



Re: Questions about the release process

2022-08-12 Thread Yunze Xu
> The RM should ask a PMC member to help them add their KEY.
> Do not make the release docs part of versioned docs.
I agree.

Thanks,
Yunze




> On Aug 12, 2022, at 23:10, Dave Fisher wrote:
> 
> Hi -
> 
> One change that needs to be made is regarding the KEYS file.
> 
> We should drop the use of https://dist.apache.org/repos/dist/dev/pulsar/KEYS 
> instead we should carefully update 
> https://dist.apache.org/repos/dist/release/pulsar/KEYS
> 
> The two KEYS files are currently out of sync. The release file had to be hand 
> reconstructed at the beginning of the year and I’ve had to deal with new 
> Release Manager KEYS that were not copied during the Release Process. 
> (Recently Apache Infra has been scanning release and is informing PMCs when 
> their releases are broken.)
> 
> The RM should ask a PMC member to help them add their KEY. I’m willing to do 
> it and I’m certain other PMC members would do the same.
> 
> The VOTE threads can then always refer to a proper KEYS file that will always 
> be correct. RMs should also make sure that their KEY does not expire while 
> the release is active which could be for several years. If your KEY is 
> revoked at some point then please let the PMC know.
> 
> I like moving the Release Docs to the codebase, but we do need to assure that 
> the PMC fully reviews changes. The reviews that count before squash and merge 
> must be from PMC members. The reason is that the Pulsar PMC is responsible 
> for assuring that Pulsar releases comply with Apache Release Policies.
> 
> Do not make the release docs part of versioned docs. There should be only the 
> current policy. If other products of the Pulsar project require different 
> release docs it is fine to have separate docs.
> 
> All The Best,
> Dave
> 
>> On Aug 12, 2022, at 7:41 AM, tison  wrote:
>> 
>> Hi Penghui & Yunze,
>> 
>> I ever wrote developer guides for TiDB[1] and Apache Kvrocks (Incubating),
>> including the release guide for the latter[2].
>> 
>> Just for your information, I'm preparing the proposal to bring a developer
>> guide page (series docs). Perhaps start in the next month.
>> 
>> Although, it cannot help the current status, and I don't want to discuss
>> details on this topic here. Again, just for your information :)
>> 
>> Best,
>> tison.
>> 
>> [1] https://pingcap.github.io/tidb-dev-guide/
>> [2] https://kvrocks.apache.org/community/how-to-release
>> 
>> 
>> On Fri, Aug 12, 2022 at 21:57, Yunze Xu wrote:
>> 
>>> Yeah, I agree. It’s better to move the release process to the codebase.
>>> 
>>> Regarding the automatic validation program, we can have that for some
>>> common verifications like the GPG verification, which only requires a
>>> simple command if you have downloaded the binary.
>>> 
>>> Thanks,
>>> Yunze
>>> 
>>> 
>>> 
>>> 
>>>> On Aug 12, 2022, at 18:12, PengHui Li wrote:
>>>> 
>>>> Thanks for raising this question.
>>>> 
>>>> Maybe we'd better move the release process doc and validation doc
>>>> to the codebase, not the wiki pages.
>>>> 
>>>> - Only committers can update the wiki pages
>>>> - Changes are not reviewed
>>>> 
>>>> After moving to the pulsar codebase
>>>> 
>>>> - Everyone can contribute to the validation doc
>>>> - Updates to the release process doc can be reviewed
>>>> 
>>>> I think there are multiple issues that need to be resolved for the
>>>> release process:
>>>> 
>>>> - Have the Python client (Linux, macOS) at the RC stage. I think we currently
>>>> only have the C++ client for the RC, and push to PyPI after the RC passes
>>>> - Add a validation process for the Python and C++ clients
>>>> - Add a validation process for Go functions and Python functions
>>>> - Add a script for building images for the RC
>>>> - Add an image validation process
>>>> 
>>>> Another point: can we have an automatic validation program to reduce
>>>> the burden on validators?
>>>> I'm not sure if it is acceptable.
>>>> 
>>>> Thanks,
>>>> Penghui
>>>> 
>>>> On Fri, Aug 12, 2022 at 4:50 PM Haiting Jiang 
>>>> wrote:
>>>> 
>>>>>> the 7th step is "Write release notes", should we execute this
>>>>>> step later?
>>>>>

Re: [VOTE] Pulsar Release 2.8.4 Candidate 1

2022-08-14 Thread Yunze Xu
You can see
https://lists.apache.org/thread/rg1g083c06ozm5go6zo1jophg9y9zl2f
for more details about the LTS release.

Thanks,
Yunze




> On Aug 14, 2022, at 11:00, Qiang Huang wrote:
> 
> +1 (non-binding)
> Is 2.8.4 a long term support release?
> 
> On Fri, Aug 12, 2022 at 4:20 PM Yunze Xu wrote:
> 
>> This is the first release candidate for Apache Pulsar, version 2.8.4.
>> 
>> It fixes the following issues:
>> 
>> https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.8.4
>> 
>> *** Please download, test and vote on this release. This vote will stay
>> open
>> for at least 72 hours ***
>> 
>> Note that we are voting upon the source (tag), binaries are provided for
>> convenience.
>> 
>> Source and binary files:
>> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.4-candidate-1/
>> 
>> SHA-512 checksums:
>> 
>> c3d26704f2cfb3365c29d4110612ca7351084f8bee3c306d5e906b3f9b22c7557cc5baf12f74f8c222baccae3310691419eda5b47fdf9cd6c5281b70134ab5eb
>> apache-pulsar-2.8.4-bin.tar.gz
>> 28160ee94dccfb74dfb56e0e5d0e08870c6612659507333a52b5660ecbf060a75d1eed667cffd8596f9468de95055b78916b932db0e0d4c2745868d55429ee98
>> apache-pulsar-2.8.4-src.tar.gz
>> 
>> Maven staging repo:
>> https://repository.apache.org/content/repositories/orgapachepulsar-1170/
>> 
>> The tag to be voted upon:
>> v2.8.4-candidate-1 (02ee5616866d4eda8dd94f85d9d9b71c459f248d)
>> https://github.com/apache/pulsar/releases/tag/v2.8.4-candidate-1
>> 
>> Pulsar's KEYS file containing PGP keys we use to sign the release:
>> https://dist.apache.org/repos/dist/dev/pulsar/KEYS
>> 
>> Docker images:
>> 
>> 
>> https://hub.docker.com/layers/pulsar/bewaremypower/pulsar/2.8.4/images/sha256-fba51a75c0f2ca79fbff7b254f80f641fcda661fd702f8149bbfdd5994078e3a
>> 
>> 
>> https://hub.docker.com/layers/pulsar-all/bewaremypower/pulsar-all/2.8.4/images/sha256-42d4b41e5869edc6242bb49d6a1687bd6d191a6385637122edc5c7b2c44ee46f
>> 
>> Please download the source package, and follow the Release Candidate
>> Validation[1] to validate the release
>> 
>> [1] https://github.com/apache/pulsar/wiki/Release-Candidate-Validation
>> 
>> Thanks,
>> Yunze
>> 
>> 
>> 
>> 
>> 
> 
> -- 
> BR,
> Qiang Huang
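
For reference, the SHA-512 checksums listed in the vote email can also be
checked programmatically. Below is a minimal sketch that uses only the JDK's
`MessageDigest`. The class and method names (`Sha512Check`, `sha512Hex`,
`matches`) are made up for illustration and are not part of any Pulsar
release tooling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the Pulsar release scripts.
final class Sha512Check {

    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    // Lowercase hex encoding, the same form as the checksums in the vote email.
    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Digest of an in-memory byte array.
    static String sha512Hex(byte[] data) {
        return toHex(newDigest().digest(data));
    }

    // Digest of a file, streamed so that large tarballs are not loaded into memory.
    static String sha512Hex(Path file) {
        MessageDigest md = newDigest();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toHex(md.digest());
    }

    // Compare a downloaded file against the checksum published in the vote email.
    static boolean matches(Path file, String expectedHex) {
        return sha512Hex(file).equalsIgnoreCase(expectedHex.trim());
    }
}
```

A validator would call `Sha512Check.matches(...)` with the path of the
downloaded `apache-pulsar-2.8.4-src.tar.gz` and the checksum string from the
email. This only covers the checksum, the GPG signature still has to be
verified separately.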



Re: Questions about the release process

2022-08-15 Thread Yunze Xu
> One example is that `pool.sks-keyservers.net` in [1] seems not available
anymore, but I am not that confident enough to edit it directly.

Yeah. It’s not available in my env either. I made some changes recently,
but I’m also not sure about this point.

> I think we can consider using a BOT (like Github Action)
to make the release candidates.

+1. But maybe it’s not an easy job.

Thanks,
Yunze




> On Aug 15, 2022, at 18:30, Haiting Jiang wrote:
> 
> One example is that `pool.sks-keyservers.net` in [1] seems not available
> anymore, but I am not that confident enough to edit it directly.



Re: [Discussion] PIP 198 - How to define [type] and [scope]?

2022-08-17 Thread Yunze Xu
LGTM.

Thanks,
Yunze




> On Aug 17, 2022, at 11:15, Yu wrote:
> 
> Hi team,
> 
> For PIP 198: Standardize PR Naming Convention using GitHub Actions [1]
> 
> How to define [type] and [scope]? Do these abbreviations LGTY?
> 
> *[Guide] Pulsar Pull Request Naming Convention* [2] contains everything
> about the definition. Feel free to check and comment!
> 
> ~~
> 
> TL;DR
> 
> PR title format: [type][scope] Summary [3]
> 
> ~~
> 
> [type]
> 
> 1. Definition: what actions do you take?
> 
> 2. It must be one of the following:
> - feat (abbr for "feature")
> - improve
> - fix
> - cleanup
> - refactor
> - revert
> 
> ~~
> 
> [scope]
> 
> 1. Definition: where do you make changes?
> 
> 2. It must be one of the following:
> - admin (changes to pulsar-admin, REST API, Java admin API)
> - broker
> - io
> - deploy
> - dep (abbr for dependency)
> - fcn (abbr for function)
> - monitor
> - pkg (abbr for package)
> - proxy
> - schema
> - sec (abbr for security)
> - sql
> - ts (abbr for tiered storage)
> - tool
> - txn (abbr for transaction)
> 
> - java (changes to Java client)
> - cpp (changes to C++ client)
> - py (changes to Python client)
> - ws (changes to WebSocket)
> - rest (changes to REST)
> 
> - test
> - ci
> - workflow
> - build
> - misc (abbr for miscellaneous)
> 
> - doc
> - blog
> - site (abbr for website)
> 
> ~~
> 
> Besides, many developers have different opinions on the following aspects.
> What's your writing preference?
> 
> - Submit breaking changes
> [feat][broker]! Support xx
> 
> - Submit PIP changes
> [feat][broker] PIP-198: Support xx
> 
> - Cherry pick changes [4]
> Choice A: [fix][broker][branch-2.9] xxx
> Choice B: [fix][broker] xxx. And add "cherry pick xxx to branch-2.9" in the
> PR description.
> 
> ~~
> 
> Feel free to comment and make your voice heard. Go vote! Thank you!
> 
> [1]
> https://docs.google.com/document/d/1sJlUNAHnYAbvu9UtEgCrn_oVTnVc1M5nHC19x1bFab4/edit
> [2] https://lists.apache.org/thread/90rcjf1dv0fbkb5hm31kmgr65fj0nfnn
> [3]
> https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit?pli=1#bookmark=id.y8943h392zno
> [4]
> https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit?pli=1#bookmark=kix.849jztd92uk7
> 
> Yu
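
As a rough illustration of the `[type][scope] Summary` format quoted above, a
title check could look like the sketch below. It is plain Java with a regular
expression built from the type and scope lists as proposed in this email. The
class name `PrTitleCheck` is hypothetical and this is not the GitHub Action
from PIP 198. The cherry-pick form (`[fix][broker][branch-2.9] xxx`) is
deliberately not accepted because it was still being voted on.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical validator for the "[type][scope] Summary" PR title format.
final class PrTitleCheck {

    // Types and scopes copied from the proposal quoted in this thread.
    static final List<String> TYPES = List.of(
            "feat", "improve", "fix", "cleanup", "refactor", "revert");

    static final List<String> SCOPES = List.of(
            "admin", "broker", "io", "deploy", "dep", "fcn", "monitor", "pkg",
            "proxy", "schema", "sec", "sql", "ts", "tool", "txn",
            "java", "cpp", "py", "ws", "rest",
            "test", "ci", "workflow", "build", "misc",
            "doc", "blog", "site");

    // [type][scope], an optional "!" for breaking changes, a space, then a summary.
    static final Pattern TITLE = Pattern.compile(
            "^\\[(" + String.join("|", TYPES) + ")\\]"
            + "\\[(" + String.join("|", SCOPES) + ")\\]"
            + "!? \\S.*$");

    static boolean isValid(String title) {
        return TITLE.matcher(title).matches();
    }
}
```

With this sketch, `[feat][broker] PIP-198: Support xx` and
`[feat][broker]! Support xx` are accepted, while a title with an unknown type
such as `[feature][broker] Support xx` is rejected.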



Re: [ANNOUNCE] Jiwei Guo as a new PMC member in Pulsar

2022-08-18 Thread Yunze Xu
Congratulations!

Thanks,
Yunze




> On Aug 18, 2022, at 19:24, PengHui Li wrote:
> 
> Hi, all
> 
> I'm glad to announce that the Apache Pulsar PMC invited Jiwei Guo to join
> the
> PMC and he accepted.
> 
> Please join in celebrating!
> 
> Best,
> Penghui



Re: [DISCUSS] Enable non-mandatory updating PR branches

2022-08-18 Thread Yunze Xu
I’m glad to see the “update branch” option enabled for contributors.

Thanks,
Yunze




> On Aug 18, 2022, at 21:01, tison wrote:
> 
> Hello,
> 
> The short version
> =================
> 
> Vote if you agree on enabling the non-mandatory updating PR branches
> button, i.e., the "Always suggest updating pull request branches" GitHub
> settings.
> 
> The full version
> ================
> 
> Pulsar is under rapid development and numerous fixes are pushed to master
> all the time. Since we are still suffering from quite a few flaky tests,
> merging master and retesting is a common way to verify a patch once more.
> 
> However, to do so we have to pull the latest master locally, check out the PR
> branch, perform the merge and push to the remote. It's a bit awkward, especially
> when a developer works on multiple branches.
> 
> GitHub provides a button "Always suggest updating pull request branches"
> with the description "Whenever there are new changes available in the base
> branch, present an “update branch” option in the pull request."[1]
> 
> It can simplify the workflow with one button click.
> 
> To clarify, this is different from the branch protection rule "Require
> branches to be up to date before merging" - it's non-mandatory and just
> provides the "update branch" button. It means we don't force every PR to
> catch up with the latest master before being merged, which could cause extremely
> high unnecessary traffic. Since we already allow PR authors to retrigger
> tests with the pulsarbot (or even contributors can push an empty commit),
> providing such a button does no harm.
> 
> I post this thread here to collect feedback, especially from the PMC
> members. Previously I asked the INFRA team to turn on this option for
> Apache Kvrocks (Incubating)[2] and I believe the INFRA team would be happy
> to see an explicit community consensus.
> 
> Best,
> tison.
> 
> [1]
> https://github.blog/changelog/2022-02-03-more-ways-to-keep-your-pull-request-branch-up-to-date/
> [2] https://issues.apache.org/jira/browse/INFRA-23432



Re: [Discussion] PIP 198 - How to define [type] and [scope]?

2022-08-21 Thread Yunze Xu
A, A

Thanks,
Yunze




> On Aug 22, 2022, at 12:47, Yu wrote:
> 
> Hi developers,
> 
> Two quick questions need your vote!
> 
> Which do you prefer?
> 
> 
> 
> # 1. Use "branch" or "BP"?
> 
> Choice A: [fix][broker][branch-2.9] xxx
> Choice B: [fix][broker][BP-2.9] xxx
> 
> 
> 
> # 2. for the [scope], use "misc" or "chore"? [1]
> 
> Choice A: misc
> Choice B: chore
> 
> 
> 
> Thank you all!
> 
> [1]
> https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit?pli=1#bookmark=id.58q1qxhu7pio
> 
> Yu
> 
> On Mon, Aug 22, 2022 at 12:44 PM Yu  wrote:
> 
>> Hi tison,
>> 
>> Thanks for your suggestions! We have several questions on [build]:
>> 
>>> build - all things related to the build system, including tools,
>> deployment logic, maven changes, packaging logics, docker image, build
>> scripts.
>> 
>> 
>> 
>> # 1.  "tools"
>> 
>> What do you refer to? Plugins?
>> 
>> Besides, the existing scope, [tool], refers to Pulsar CLI tools [1].
>> We're considering renaming it to [cli] since:
>> a. "cli" is clearer and shorter
>> b. It saves the word "tool" for future use
>> 
>> Does it make sense?
>> 
>> 
>> 
>> # 2. "deployment logic"
>> 
>> Seems that it's an obsolete module and has not been updated for a long
>> while.
>> If so, can we ignore this?
>> 
>> 
>> 
>> # 3. Does "packaging logics" belong to [admin]? [2]
>> 
>> 
>> 
>> # 4. How about defining [build] to refer to the following?
>> 
>> - Dependency (Maven)
>> 
>> In this way, we do not have the scope [dependency] since the changes to
>> dependency belong to [build].
>> 
>> - Docker
>> 
>> - Build or release script
>> 
>> 
>> 
>> Thank you for your reply!
>> 
>> [1]
>> https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit?pli=1#bookmark=id.khz275ok35u5
>> [2]
>> https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit?pli=1#bookmark=id.nnekhkthmwlh
>> 
>> Yu and Zixuan
>> 
>> On Mon, Aug 22, 2022 at 12:40 PM Yu  wrote:
>> 
>>> Thank you tison and Zixuan!
>>> 
>>> Agree on the following aspects:
>>> 
>>> 
>>> 
>>> # 1. Remove 3 [scope]s
>>> 
>>> - Remove [workflow] since it can be replaced with other scopes
>>> eg.
>>> "[feat][workflow] Add instructions for previewing website changes"
>>> can be written as
>>> "[feat][doc] Add instructions for previewing website changes"
>>> 
>>> - Remove [deploy] since changes to deployment can be represented by other
>>> [scope]s.
>>> 
>>> - Remove [pkg] since it refers to the package API [1], which belongs to
>>> [admin].
>>> 
>>> 
>>> 
>>> # 2. Update 4 [scope]s
>>> 
>>> - Add [meta], which refers to changes to metadata.
>>> 
>>> - Add [storage], which refers to changes to managed ledger.
>>> 
>>> - Rename [ts] to [offloaded], which refers to changes to tiered storage.
>>> 
>>> - Rename [func] to [fn], which refers to changes to function.
>>> 
>>> 
>>> 
>>> # 3. Remain the same
>>> 
>>> These formats are fine to go:
>>> 
>>> - Submit breaking changes
>>> [feat][broker]! Support xx
>>> 
>>> - Submit PIP changes
>>> [feat][broker] PIP-198: Support xx
>>> 
>>> 
>>> 
>>> Feel free to comment, thank you!
>>> 
>>> 
>>> 
>>> [1] https://pulsar.apache.org/docs/next/admin-api-packages
>>> 
>>> Yu and Zixuan
>>> 
>>> On Fri, Aug 19, 2022 at 6:21 PM Zixuan Liu  wrote:
>>> 
>>>> +1 for fcn -> fn
>>>> +1 for ts -> offloader
>>>>
>>>> +1 * ci - CI workflow changes or debugging.
>>>> +1 * build - all things related to the build system, including tools,
>>>> deployment logic, maven changes, packaging logics, docker image,
>>>> build scripts.
>>>>
>>>> `pkg` should belong to the `admin` scope, so I suggest using `admin`
>>>> instead of `pkg`.
>>>>
>>>> `tool` covers the pulsar-admin, pulsar, pulsar-client, and other CLIs, so
>>>> keep using `tool`.
>>>>
>>>> `deploy` should belong to the `build` scope, so I suggest using `build`
>>>> instead of `deploy`.
>>>>
>>>>
>>>> On Fri, Aug 19, 2022 at 5:46 PM tison wrote:
>>>>
>>>>> BTW, how can I sort changes for the metadata store?
>>>>>
>>>>> Best,
>>>>> tison.
>>>>>
>>>>>
>>>>> On Fri, Aug 19, 2022 at 5:44 PM tison wrote:
>>>>>
>>>>>> To propose a workable solution, I suggest:
>>>>>>
>>>>>> replace
>>>>>>
>>>>>> * pkg
>>>>>> * tool
>>>>>> * deploy
>>>>>> * ci
>>>>>> * workflow
>>>>>> * build
>>>>>>
>>>>>> with
>>>>>>
>>>>>> * ci - CI workflow changes or debugging.
>>>>>> * build - all things related to the build system, including tools,
>>>>>> deployment logic, maven changes, packaging logics, docker image,
>>>>>> build scripts.
>>>>>>
>>>>>> Best,
>>>>>> tison.
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 19, 2022 at 5:41 PM tison wrote:
>>>>>>
 I int

Re: [DISCUSS] [PIP-201] Extensions mechanism for Pulsar Admin CLI tools

2022-08-21 Thread Yunze Xu
The motivation and goal LGTM, but the API changes look very simple and
hard to use. Do we have to implement all these interfaces for an admin
extension? If yes, could you show an example in the proposal as a
guidance?

For example, if I just want to implement a simple command:

```bash
./bin/pulsar-admin kafka create-topic  --partitions 
```

How should I implement these interfaces?

Thanks,
Yunze




> On Aug 18, 2022, at 16:23, Enrico Olivelli wrote:
> 
> Hello,
> I have drafted a PIP around this proposal.
> 
> PIP link: https://github.com/apache/pulsar/issues/17155
> 
> I am preparing an official PR, I already have a working prototype.
> 
> Copy of the contents of the GH issue is attached for discussion here
> on the Mailing list.
> 
> Motivation
> 
> There are many projects that are in the Pulsar ecosystem like Protocol
> Handlers (Kafka, MQTT, RabbitMQ) and libraries (JMS…) that need
> additional tools for operating Pulsar following specific conventions
> adopted by each project and to handle custom domain objects (like JMS
> queues, Kafka Consumer Groups...).
> 
> Some examples:
> 
> Kafka: tools for inspecting internal systems, SchemaRegistry,
> Transaction Manager, Consumers Groups
> JMS: tools to handling Subscriptions and Selectors
> RabbitMQ: tools to handle Pulsar topics and subscription following the
> convention
> 
> This is very important as it is hard to follow the conventions of each
> project using pulsar-admin and the administrator may inadvertently
> break the system.
> 
> This feature will enhance the UX of the Pulsar Admin CLI tools for the
> benefit of the whole ecosystem and users.
> 
> Goal
> 
> As we do for many other components in Pulsar, we need a way to enhance
> the CLI tools, pulsar-admin and pulsar-shell, with additional commands
> that are specific to the additional features.
> 
> The proposal here is to add an extension mechanism to the pulsar-admin
> (and so pulsar-shell) tool.
> Following the usual conventions for extensions the extension will be
> bundled in a .nar file that may contain additional third party
> libraries.
> 
> The extension will be able to provide new top level commands
> 
> pulsar-admin my-command-group my-command arguments…
> 
> The extension will be able to access the PulsarAdmin API provided by
> the environment.
> 
> The extension must not depend directly on the JCommander library but
> we will provide an API to declare the parameters and the other
> metadata necessary to document and execute the command.
> This is very important because during the lifecycle of Pulsar the
> project may decide to upgrade JCommander to an incompatible version or
> to drop the dependency at all.
> 
> API Changes
> 
> We will introduce a new Maven module pulsar-tools-api that contains
> the public API that can be used by implementations of the custom
> commands.
> 
> The implementation will be bundled in a .nar file with a descriptor
> with the following fields:
> 
> factoryClass: x.CommandFactory
> name: extension-name
> description: Description...
> 
> There are the new classes:
> 
> /**
>   Access to the environment
> */
> public interface CommandExecutionContext {
>PulsarAdmin getPulsarAdmin();
>Properties getConfiguration();
> }
> 
> 
> /**
> * Custom command implementation
> */
> public interface CustomCommand {
>String name();
>String description();
>    List<ParameterDescriptor> parameters();
>    boolean execute(Map<String, Object> parameters,
> CommandExecutionContext context) throws Exception;
> }
> 
> /**
> * A group of commands.
> */
> public interface CustomCommandGroup {
>String name();
>String description();
>    List<CustomCommand> commands(CommandExecutionContext context);
> }
> 
> /**
> * Main entry point of the extension
> */
> public interface CustomCommandFactory {
> 
>/**
> * Generate the available command groups.
> */
>    List<CustomCommandGroup> commandGroups(CommandExecutionContext context);
> }
> 
> @Builder
> @Getter
> public final class ParameterDescriptor {
>@Builder.Default
>private String name = "";
>@Builder.Default
>private String description = "";
>private ParameterType type = ParameterType.STRING;
>private  boolean required = false;
> }



Re: [DISCUSS] [PIP-201] Extensions mechanism for Pulsar Admin CLI tools

2022-08-22 Thread Yunze Xu
I will take a look. But I think we should also add a trivial example
(or a test) in the Apache repo, e.g. just print some messages for an
extended command. And the JavaDocs of these interfaces should be
complete and more clear.
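
To make the request concrete, here is a rough sketch of what such a trivial
example might look like. The interfaces are local stand-ins that mirror the
ones quoted from the PIP, so the snippet compiles without Pulsar on the
classpath. `getPulsarAdmin()` is left out for that reason, the
`Map<String, Object>` parameter type and the `--name` map key are assumptions,
and `HelloCommand` / `HelloCommandFactory` are made-up names. It should not be
read as the final API.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Local stand-ins for the proposed pulsar-tools-api types (simplified).
interface CommandExecutionContext {
    Properties getConfiguration(); // getPulsarAdmin() omitted in this sketch
}

final class ParameterDescriptor {
    final String name;
    final String description;
    final boolean required;

    ParameterDescriptor(String name, String description, boolean required) {
        this.name = name;
        this.description = description;
        this.required = required;
    }
}

interface CustomCommand {
    String name();
    String description();
    List<ParameterDescriptor> parameters();
    boolean execute(Map<String, Object> parameters,
                    CommandExecutionContext context) throws Exception;
}

interface CustomCommandGroup {
    String name();
    String description();
    List<CustomCommand> commands(CommandExecutionContext context);
}

interface CustomCommandFactory {
    List<CustomCommandGroup> commandGroups(CommandExecutionContext context);
}

// Hypothetical command, intended to be invoked as: pulsar-admin hello say --name dev
final class HelloCommand implements CustomCommand {
    public String name() { return "say"; }
    public String description() { return "Print a greeting"; }

    public List<ParameterDescriptor> parameters() {
        return List.of(new ParameterDescriptor("--name", "Who to greet", false));
    }

    // Only prints a message; returns true to signal success.
    public boolean execute(Map<String, Object> parameters, CommandExecutionContext context) {
        Object who = parameters.getOrDefault("--name", "Pulsar");
        System.out.println("Hello, " + who + "!");
        return true;
    }
}

// Entry point that would be referenced by factoryClass in the .nar descriptor.
final class HelloCommandFactory implements CustomCommandFactory {
    public List<CustomCommandGroup> commandGroups(CommandExecutionContext context) {
        CustomCommandGroup group = new CustomCommandGroup() {
            public String name() { return "hello"; }
            public String description() { return "Example command group"; }
            public List<CustomCommand> commands(CommandExecutionContext ctx) {
                return List.of(new HelloCommand());
            }
        };
        return List.of(group);
    }
}
```

Even for a command that only prints a message, an extension author has to
provide a factory, a group and a command, which is why a worked example and
complete JavaDocs would help.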

Thanks,
Yunze




> On Aug 22, 2022, at 18:37, Enrico Olivelli wrote:
> 
> Yunze,
> 
> On Mon, Aug 22, 2022 at 8:06 AM Yunze Xu
>  wrote:
>> 
>> The motivation and goal LGTM, but the API changes look very simple and
>> hard to use. Do we have to implement all these interfaces for an admin
>> extension? If yes, could you show an example in the proposal as a
>> guidance?
>> 
>> For example, if I just want to implement a simple command:
>> 
>> ```bash
>> ./bin/pulsar-admin kafka create-topic  --partitions 
>> 
>> ```
>> 
>> How should I implement these interfaces?
> 
> This is a example for the implementation that I am going to do for JMS
> https://github.com/datastax/pulsar-jms/pull/53/files#diff-9afaac9c7dc4b3d674e0623cd3d76348b01537c6095e9b5b8e804f59a481cceeR31
> 
> it is only a mock command at the moment, but it is good to showcase the 
> feature
> 
> Enrico
> 
> 
>> 
>> Thanks,
>> Yunze
>> 
>> 
>> 
>> 
>>> On Aug 18, 2022, at 16:23, Enrico Olivelli wrote:
>>> 
>>> Hello,
>>> I have drafted a PIP around this proposal.
>>> 
>>> PIP link: https://github.com/apache/pulsar/issues/17155
>>> 
>>> I am preparing an official PR, I already have a working prototype.
>>> 
>>> Copy of the contents of the GH issue is attached for discussion here
>>> on the Mailing list.
>>> 
>>> Motivation
>>> 
>>> There are many projects that are in the Pulsar ecosystem like Protocol
>>> Handlers (Kafka, MQTT, RabbitMQ) and libraries (JMS…) that need
>>> additional tools for operating Pulsar following specific conventions
>>> adopted by each project and to handle custom domain objects (like JMS
>>> queues, Kafka Consumer Groups...).
>>> 
>>> Some examples:
>>> 
>>> Kafka: tools for inspecting internal systems, SchemaRegistry,
>>> Transaction Manager, Consumers Groups
>>> JMS: tools to handling Subscriptions and Selectors
>>> RabbitMQ: tools to handle Pulsar topics and subscription following the
>>> convention
>>> 
>>> This is very important as it is hard to follow the conventions of each
>>> project using pulsar-admin and the administrator may inadvertently
>>> break the system.
>>> 
>>> This feature will enhance the UX of the Pulsar Admin CLI tools for the
>>> benefit of the whole ecosystem and users.
>>> 
>>> Goal
>>> 
>>> As we do for many other components in Pulsar, we need a way to enhance
>>> the CLI tools, pulsar-admin and pulsar-shell, with additional commands
>>> that are specific to the additional features.
>>> 
>>> The proposal here is to add an extension mechanism to the pulsar-admin
>>> (and so pulsar-shell) tool.
>>> Following the usual conventions for extensions the extension will be
>>> bundled in a .nar file that may contain additional third party
>>> libraries.
>>> 
>>> The extension will be able to provide new top level commands
>>> 
>>> pulsar-admin my-command-group my-command arguments…
>>> 
>>> The extension will be able to access the PulsarAdmin API provided by
>>> the environment.
>>> 
>>> The extension must not depend directly on the JCommander library but
>>> we will provide an API to declare the parameters and the other
>>> metadata necessary to document and execute the command.
>>> This is very important because during the lifecycle of Pulsar the
>>> project may decide to upgrade JCommander to an incompatible version or
>>> to drop the dependency at all.
>>> 
>>> API Changes
>>> 
>>> We will introduce a new Maven module pulsar-tools-api that contains
>>> the public API that can be used by implementations of the custom
>>> commands.
>>> 
>>> The implementation will be bundled in a .nar file with a descriptor
>>> with the following fields:
>>> 
>>> factoryClass: x.CommandFactory
>>> name: extension-name
>>> description: Description...
>>> 
>>> There are the new classes:
>>> 
>>> /**
>>> Access to the environment
>>> */
>>> public interface CommandExecutionContext {
>>> PulsarAdmin getPulsarAdmin();

Re: [VOTE] [PIP-201] Extensions mechanism for Pulsar Admin CLI tools

2022-08-24 Thread Yunze Xu
+1 (non binding)

Thanks,
Yunze




> On Aug 24, 2022, at 15:38, Nicolò Boschi wrote:
> 
> +1 (non binding)
> Nicolò Boschi
> 
> 
> On Wed, Aug 24, 2022 at 9:11 AM Enrico Olivelli <
> eolive...@gmail.com> wrote:
> 
>> Hello,
>> this is the official thread VOTE for PIP-201 Extensions mechanism for
>> Pulsar Admin CLI tools
>> 
>> This is the PIP link https://github.com/apache/pulsar/issues/17155
>> This is the discussion:
>> https://lists.apache.org/thread/287ft8twc11cp4s1y4qkcx5nmh451cyo
>> 
>> I am still working on the PR, that is the subject of the VOTE.
>> 
>> Best regards
>> Enrico Olivelli
>> 



[DISCUSS] Deprecate KeyValue schema factory methods with Class parameters

2022-08-24 Thread Yunze Xu
Hi folks,

Recently I've been looking into the KeyValue schema and found **FOUR**
static methods in `Schema` to create a `KeyValueSchema` object:

1. KeyValue(Class<K> key, Class<V> value);
2. KeyValue(Class<K> key, Class<V> value, SchemaType type);
3. KeyValue(Schema<K> key, Schema<V> value);
4. KeyValue(Schema<K> key, Schema<V> value, KeyValueEncodingType keyValueEncodingType);

IMO, having too many overloads is not user-friendly. I turned to the
official website and found overload 4 is used. The overload 3 is just
a simple version of overload 4 whose encoding type is `INLINE`, but
it's not clear. I opened a PR
https://github.com/apache/pulsar/pull/17256 to make it more clear.

However, for overload 1 and 2, I can only find two references in unit
tests `testAllowNullAvroSchemaCreate` and
`testAllowNullJsonSchemaCreate`. From the very simple JavaDocs, it
looks like they are only available for JSON and AVRO schemas.

Then I found the original PR to introduce these overloads:
https://github.com/apache/pulsar/pull/2885

The code has changed a lot since then. It looks like the rest of the code
no longer has much to do with these two overloads. IMO, we should not encourage
users to use them, so I suggest marking them as deprecated and possibly
removing them in a future release.
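
To illustrate the idea, here is a simplified model of how the overloads could
relate to each other after the change. `SimpleSchema`, `EncodingType` and
`KeyValueSchemas` are stand-ins invented for this sketch, not the real
`org.apache.pulsar.client.api` types, and the JSON default for the
`Class`-based overload is an assumption based on the JavaDocs mentioned above.

```java
// Simplified stand-ins, NOT the real Pulsar client classes.
enum EncodingType { INLINE, SEPARATED }

final class SimpleSchema<T> {
    final String name;

    SimpleSchema(String name) {
        this.name = name;
    }

    // Stand-in for deriving a JSON schema from a POJO class.
    static <T> SimpleSchema<T> json(Class<T> type) {
        return new SimpleSchema<>("JSON(" + type.getSimpleName() + ")");
    }
}

final class KeyValueSchemas {

    // Overload 4: the fully explicit form that users are encouraged to use.
    // Returns a textual description instead of a real schema object.
    static <K, V> String keyValue(SimpleSchema<K> key, SimpleSchema<V> value, EncodingType type) {
        return "KeyValue[" + key.name + ", " + value.name + ", " + type + "]";
    }

    // Overload 3: same as overload 4 with the INLINE encoding type.
    static <K, V> String keyValue(SimpleSchema<K> key, SimpleSchema<V> value) {
        return keyValue(key, value, EncodingType.INLINE);
    }

    /**
     * Overload 1, kept only for compatibility.
     *
     * @deprecated build the key and value schemas explicitly and use the
     *             schema-based overloads instead
     */
    @Deprecated
    static <K, V> String keyValue(Class<K> key, Class<V> value) {
        return keyValue(SimpleSchema.json(key), SimpleSchema.json(value));
    }
}
```

In this model the deprecated overload only delegates to the schema-based one,
so removing it later would not change what the remaining overloads do.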


Thanks,
Yunze






Re: [VOTE] Pulsar Release 2.8.4 Candidate 1

2022-08-29 Thread Yunze Xu
Hi everyone, there are already two binding +1s. Could anyone else help verify it?

Thanks,
Yunze




> On Aug 23, 2022, at 21:26, PengHui Li wrote:
> 
> +1 (binding)
> 
> - Start the standalone service
> - Publish and consume messages
> - Run the Cassandra connector
> - Validate the stateful function
> 
> Thanks,
> Penghui
> 
> On Tue, Aug 23, 2022 at 9:57 AM guo jiwei  wrote:
> 
>> +1 (binding)
>> 
>> - Checked checksums and signatures
>> - Checked license headers using Apache Rat
>> - Compiled the source by JDK11
>> - Ran the standalone server
>> - Confirmed that producer and consumer work properly
>> - Validated functions, connectors, and stateful functions
>> 
>> 
>> Regards
>> Jiwei Guo (Tboy)
>> 
>> 
>> On Mon, Aug 15, 2022 at 10:18 AM Qiang Huang 
>> wrote:
>> 
>>> Got it. Thx.
>>> 
>>> On Sun, Aug 14, 2022 at 11:22 PM Yunze Xu wrote:
>>> 
>>>> You can see
>>>> https://lists.apache.org/thread/rg1g083c06ozm5go6zo1jophg9y9zl2f
>>>> for more details about the LTS release.
>>>> 
>>>> Thanks,
>>>> Yunze
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Aug 14, 2022, at 11:00, Qiang Huang wrote:
>>>>> 
>>>>> +1 (non-binding)
>>>>> Is 2.8.4 a long term support release?
>>>>> 
>>>>> On Fri, Aug 12, 2022 at 4:20 PM Yunze Xu wrote:
>>>>> 
>>>>>> This is the first release candidate for Apache Pulsar, version
>> 2.8.4.
>>>>>> 
>>>>>> It fixes the following issues:
>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.8.4
>>>>>> 
>>>>>> *** Please download, test and vote on this release. This vote will
>>> stay
>>>>>> open
>>>>>> for at least 72 hours ***
>>>>>> 
>>>>>> Note that we are voting upon the source (tag), binaries are provided
>>> for
>>>>>> convenience.
>>>>>> 
>>>>>> Source and binary files:
>>>>>> 
>>> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.4-candidate-1/
>>>>>> 
>>>>>> SHA-512 checksums:
>>>>>> 
>>>>>> 
>>>> 
>>> 
>> c3d26704f2cfb3365c29d4110612ca7351084f8bee3c306d5e906b3f9b22c7557cc5baf12f74f8c222baccae3310691419eda5b47fdf9cd6c5281b70134ab5eb
>>>>>> apache-pulsar-2.8.4-bin.tar.gz
>>>>>> 
>>>> 
>>> 
>> 28160ee94dccfb74dfb56e0e5d0e08870c6612659507333a52b5660ecbf060a75d1eed667cffd8596f9468de95055b78916b932db0e0d4c2745868d55429ee98
>>>>>> apache-pulsar-2.8.4-src.tar.gz
>>>>>> 
>>>>>> Maven staging repo:
>>>>>> 
>>>> 
>> https://repository.apache.org/content/repositories/orgapachepulsar-1170/
>>>>>> 
>>>>>> The tag to be voted upon:
>>>>>> v2.8.4-candidate-1 (02ee5616866d4eda8dd94f85d9d9b71c459f248d)
>>>>>> https://github.com/apache/pulsar/releases/tag/v2.8.4-candidate-1
>>>>>> 
>>>>>> Pulsar's KEYS file containing PGP keys we use to sign the release:
>>>>>> https://dist.apache.org/repos/dist/dev/pulsar/KEYS
>>>>>> 
>>>>>> Docker images:
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://hub.docker.com/layers/pulsar/bewaremypower/pulsar/2.8.4/images/sha256-fba51a75c0f2ca79fbff7b254f80f641fcda661fd702f8149bbfdd5994078e3a
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://hub.docker.com/layers/pulsar-all/bewaremypower/pulsar-all/2.8.4/images/sha256-42d4b41e5869edc6242bb49d6a1687bd6d191a6385637122edc5c7b2c44ee46f
>>>>>> 
>>>>>> Please download the source package, and follow the Release Candidate
>>>>>> Validation[1] to validate the release
>>>>>> 
>>>>>> [1]
>>> https://github.com/apache/pulsar/wiki/Release-Candidate-Validation
>>>>>> 
>>>>>> Thanks,
>>>>>> Yunze
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> BR,
>>>>> Qiang Huang
>>>> 
>>>> 
>>> 
>>> --
>>> BR,
>>> Qiang Huang
>>> 
>> 



Re: [VOTE] Pulsar Release 2.8.4 Candidate 1

2022-09-02 Thread Yunze Xu
Thank you all,

The vote passes with 3 binding +1s and 2 non-binding +1s.

I am closing this vote and will continue the release process.

Thanks,
Yunze




> On Sep 1, 2022, at 11:33, Michael Marshall  wrote:
> 
> +1 (binding)
> 
> - Verified signatures for 45 artifacts
> - Verified the checksums for 45 artifacts
> - Compiled `apache-pulsar-2.8.4-src.tar.gz` running `mvn clean install
> -DskipTests` using JDK 11
> - Verified `mvn apache-rat:check` passes
> - Ran pulsar standalone, verified some pulsar-admin commands, verified
> perf produce/consume worked using JDK 11
> 
> Thanks for running the release, Yunze!
> 
> - Michael
> 
> On Mon, Aug 29, 2022 at 10:30 AM Nicolò Boschi  wrote:
>> 
>> +1 (non binding)
>> 
>> Checks:
>> 
>> - Checksum and signatures
>> 
>> - Apache Rat check passes
>> 
>> - Compile from source
>> 
>> - Run Pulsar standalone and produce-consume from CLI
>> 
>> - Tested K8S installation with Datastax Pulsar helm chart and verified TLS,
>> offloads and ElasticSearch sink
>> 
>> 
>> 
>> Nicolò Boschi
>> 
>> 
>> On Mon, Aug 29, 2022 at 5:20 PM Yunze Xu
>>  wrote:
>> 
>>> Hi everyone, there are already two binding +1. Could anyone else help
>>> verify it?
>>> 
>>> Thanks,
>>> Yunze
>>> 
>>> 
>>> 
>>> 
>>>> On Aug 23, 2022, at 21:26, PengHui Li wrote:
>>>> 
>>>> +1 (binding)
>>>> 
>>>> - Start the standalone service
>>>> - Publish and consume messages
>>>> - Run the Cassandra connector
>>>> - Validate the stateful function
>>>> 
>>>> Thanks,
>>>> Penghui
>>>> 
>>>> On Tue, Aug 23, 2022 at 9:57 AM guo jiwei  wrote:
>>>> 
>>>>> +1 (binding)
>>>>> 
>>>>> - Checked checksums and signatures
>>>>> - Checked license headers using Apache Rat
>>>>> - Compiled the source by JDK11
>>>>> - Ran the standalone server
>>>>> - Confirmed that producer and consumer work properly
>>>>> - Validated functions, connectors, and stateful functions
>>>>> 
>>>>> 
>>>>> Regards
>>>>> Jiwei Guo (Tboy)
>>>>> 
>>>>> 
>>>>> On Mon, Aug 15, 2022 at 10:18 AM Qiang Huang >>> 
>>>>> wrote:
>>>>> 
>>>>>> Got it. Thx.
>>>>>> 
>>>>>> On Sun, Aug 14, 2022 at 11:22 PM Yunze Xu wrote:
>>>>>> 
>>>>>>> You can see
>>>>>>> https://lists.apache.org/thread/rg1g083c06ozm5go6zo1jophg9y9zl2f
>>>>>>> for more details about the LTS release.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Yunze
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Aug 14, 2022, at 11:00, Qiang Huang wrote:
>>>>>>>> 
>>>>>>>> +1 (non-binding)
>>>>>>>> Is 2.8.4 a long term support release?
>>>>>>>> 
>>>>>>>> On Fri, Aug 12, 2022 at 4:20 PM Yunze Xu wrote:
>>>>>>>> 
>>>>>>>>> This is the first release candidate for Apache Pulsar, version
>>>>> 2.8.4.
>>>>>>>>> 
>>>>>>>>> It fixes the following issues:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>> https://github.com/apache/pulsar/pulls?q=is%3Amerged+is%3Apr+label%3Arelease%2F2.8.4
>>>>>>>>> 
>>>>>>>>> *** Please download, test and vote on this release. This vote will
>>>>>> stay
>>>>>>>>> open
>>>>>>>>> for at least 72 hours ***
>>>>>>>>> 
>>>>>>>>> Note that we are voting upon the source (tag), binaries are provided
>>>>>> for
>>>>>>>>> convenience.
>>>>>>>>> 
>>>>>>>>> Source and binary files:
>>>>>>>>> 
>>>>>> 
>>> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-2.8.4-candidate-1/
>>>>>>>>> 
>>>>>>>>> SHA-512 checksums:
>

Re: [DISCUSS] User-friendly acknowledgeCumulative API on a partitioned topic or multi-topics

2022-09-04 Thread Yunze Xu
I have been busy with other things recently, so there is no further update. But
I found there are already two methods to acknowledge multiple messages
in the Java client.

```java
void acknowledge(Messages<?> messages) throws PulsarClientException;

void acknowledge(List<MessageId> messageIdList) throws PulsarClientException;
```

And here is the issue to track the catch up:

https://github.com/apache/pulsar/issues/17428
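
For context, the per-partition bookkeeping from the quoted proposal (keep the
latest message ID of each partition and acknowledge each one cumulatively)
can be sketched in plain Java as below. `LatestPerPartition` is a
hypothetical helper, and a `long` sequence number stands in for a real
`MessageId` so that the snippet runs without the Pulsar client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: tracks the latest received position per partition.
final class LatestPerPartition {

    // Key: partition (topic) name, value: highest sequence received so far.
    private final Map<String, Long> latest = new LinkedHashMap<>();

    // Record a received message, keeping only the highest sequence per partition.
    void onReceive(String partition, long sequence) {
        latest.merge(partition, sequence, Long::max);
    }

    // Return one entry per partition to acknowledge cumulatively, then reset.
    Map<String, Long> drain() {
        Map<String, Long> snapshot = new LinkedHashMap<>(latest);
        latest.clear();
        return snapshot;
    }
}
```

With the sequence from the quoted example (P0-M0, P1-M0, P0-M1, P1-M1, P0-M2,
P1-M2), `drain()` yields an entry for both P0 and P1, so no partition is left
unacknowledged as in the "acknowledge every two messages" case.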

Yunze




> On Sep 4, 2022, at 22:37, Asaf Mesika  wrote:
> 
> What eventually happened with this idea?
> 
> On Fri, Jul 29, 2022 at 8:02 AM PengHui Li  wrote:
> 
>> +1
>> 
>> Penghui
>> On Jul 28, 2022, 20:14 +0800, lordcheng10 <1572139...@qq.com.invalid>,
>> wrote:
>>> Nice feature!
>>> 
>>> 
>>> 
>>> 
>>> -- Original --
>>> From: "Yunze Xu"
>>> Date: Friday, July 15, 2022, 6:04 PM
>>> To: "dev"
>>> Subject: [DISCUSS] User-friendly acknowledgeCumulative API on a
>>> partitioned topic or multi-topics
>>> 
>>> 
>>> 
>>> Hi all,
>>> 
>>> A long time ago I opened a PR to support cumulative acknowledgement
>>> for the C++ client, but it is controversial whether a partitioned
>>> consumer should acknowledge a message ID cumulatively.
>>> 
>>> See https://github.com/apache/pulsar/pull/6796 for discussion.
>>> 
>>> Currently, the Java client acknowledges the specific partition of the
>>> message ID, while the C++ client just fails when calling
>>> `acknowledgeCumulative` on a partitioned topic. However, even if the
>>> Java client doesn't fail, it's not user friendly.
>>> 
>>> Assuming users call `acknowledgeCumulative` periodically, there is a
>>> chance that some messages of a specific partition have never been
>>> passed to the method.
>>> 
>>> For example, a consumer received:
>>> 
>>> P0-M0, P1-M0, P0-M1, P1-M1, P0-M2, P1-M2...
>>> 
>>> And the user acknowledged every two messages, i.e.
>>> 
>>> P0-M0, P0-M1, P0-M2
>>> 
>>> Eventually, partition 1 has never been acknowledged.
>>> 
>>> Users must maintain their own `Map<String, MessageId>` for a
>>> partitioned topic or multi-topics consumer with the existing
>>> `acknowledgeCumulative` API.
>>> 
>>> Should we make it more friendly for users? For example, we can make
>>> `acknowledgeCumulative` accept the map to remind users to maintain
>>> the map from topic name to message ID:
>>> 
>>> ```java
>>> // the key is the partitioned topic name like my-topic-partition-0
>>> void acknowledgeCumulative(Map<String, MessageId> topicToMessageId);
>>> ```
>>> 
>>> For those who don't want to maintain the map by themselves, maybe we
>>> can provide a simpler API like:
>>> 
>>> ```java
>>> // acknowledge all latest received messages
>>> void acknowledgeCumulative();
>>> ```
>>> 
>>> and provide an option to enable this behavior.
>>> 
>>> Do you have any suggestion on this idea? I will prepare a proposal if
>>> there is no disagreement.
>>> 
>>> Thanks,
>>> Yunze
>> 
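
The per-partition bookkeeping described in the quoted proposal can be sketched
roughly as below. This is only a minimal illustration under assumptions:
`LatestPerPartitionTracker` is a hypothetical helper, not part of the Pulsar
client, and plain `long` sequence numbers stand in for `MessageId`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not a Pulsar client class.
class LatestPerPartitionTracker {
    // Key: partition name such as "my-topic-partition-0".
    // Value: highest sequence received so far (stand-in for a MessageId).
    private final Map<String, Long> latest = new LinkedHashMap<>();

    // Record a received message, keeping only the highest sequence per partition.
    void onReceive(String partition, long sequence) {
        latest.merge(partition, sequence, Math::max);
    }

    // Return a snapshot that a map-based cumulative acknowledgement could
    // consume, then reset the tracker for the next round.
    Map<String, Long> drain() {
        Map<String, Long> snapshot = new LinkedHashMap<>(latest);
        latest.clear();
        return snapshot;
    }
}
```

With the sequence from the example (P0-M0, P1-M0, P0-M1, P1-M1, P0-M2, P1-M2),
`drain()` returns the latest entry for both partitions, so partition 1 is not
left unacknowledged.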



[DISCUSS] Improvements on the release process

2022-09-05 Thread Yunze Xu
Hi all,

I have been working on the 2.8.4 release recently. While following the release
process, I found that many steps are outdated, so I frequently turned to the
previous release managers for help. Since the release process is now in the
codebase [1], I opened a PR with some improvements to it [2].

PTAL especially if you have been a release manager before.

[1] https://lists.apache.org/thread/mv1to079cznkxdldrpoq5518l2ozl5kr
[2] https://github.com/apache/pulsar/pull/17470

Thanks,
Yunze






Re: [DISCUSS] Improvements on the release process

2022-09-05 Thread Yunze Xu
Hi Yu,

Thanks for your reminder!

Thanks,
Yunze




> On Sep 6, 2022, at 11:56, Yu  wrote:
> 
> Hi Yunze,
> 
> Thanks for updating the workflow!
> 
> ~~
> 
> Hi all,
> 
> For the release process, we've updated the doc-related workflow [1] since
> we changed the doc maintenance strategy [2].
> 
> TL;DR
> Breaking change:
> For release managers: from 2.8.x, you do not need to generate
> an independent doc set for each release.
> 
> ~~
> 
> [1] Workflow changes:
> 
> - For doc contributors:
> https://docs.google.com/document/d/1-1uJyd1k9_h56xiiVRVOnrLcCnTmg9n7SrHhNVNEEi4/edit#bookmark=id.q5m2r5pimdi6
> 
> - For release managers:
> https://github.com/apache/pulsar/pull/17130/files#diff-f3115b8be648c3dc440594799619e7ce4408a34efab13b9d57a902030b62562c
> 
> [2] https://github.com/apache/pulsar/issues/16637
> 
> ~~~~~~
> 
> Feel free to comment, thank you!
> 
> Yu
> 
> On Tue, Sep 6, 2022 at 10:57 AM Yunze Xu 
> wrote:
> 
>> Hi all,
>> 
>> I have been working on the 2.8.4 release recently. While following the
>> release process, I found that many steps are outdated, so I frequently
>> turned to the previous release managers for help. Since the release
>> process is now in the codebase [1], I opened a PR with some
>> improvements to it [2].
>> 
>> PTAL especially if you have been a release manager before.
>> 
>> [1] https://lists.apache.org/thread/mv1to079cznkxdldrpoq5518l2ozl5kr
>> [2] https://github.com/apache/pulsar/pull/17470
>> 
>> Thanks,
>> Yunze
>> 
>> 
>> 
>> 
>> 



Re: [DISCUSS] User-friendly acknowledgeCumulative API on a partitioned topic or multi-topics

2022-09-07 Thread Yunze Xu
Sure. I’m glad to see that. Just a little confused: who is Tarun?

Thanks,
Yunze




> On Sep 6, 2022, at 17:40, Shivji Kumar Jha  wrote:
> 
> ++ Tarun
> 
> Hi Yunze,
> 
> We would love to have this.
> 
> ```java
> // the key is the partitioned topic name like my-topic-partition-0
> void acknowledgeCumulative(Map<String, MessageId> topicToMessageId);
> ```
> 
> If you are busy with other things, do you mind Tarun taking this up ? Happy
> to have you as a reviewer.
> 
> Regards,
> Shivji Kumar Jha
> http://www.shivjijha.com/
> +91 8884075512
> 
> 
> On Sun, 4 Sept 2022 at 21:25, Yunze Xu  wrote:
> 
>> I have been busy with other things recently, so there is no further update.
>> But I found that there are already two methods to acknowledge multiple
>> messages in the Java client.
>> 
>> ```java
>> void acknowledge(Messages<?> messages) throws PulsarClientException;
>> 
>> void acknowledge(List<MessageId> messageIdList) throws PulsarClientException;
>> ```
>> 
>> And here is the issue to track the catch up:
>> 
>> https://github.com/apache/pulsar/issues/17428
>> 
>> Yunze
>> 
>> 
>> 
>> 
>>> On Sep 4, 2022, at 22:37, Asaf Mesika  wrote:
>>> 
>>> What eventually happened with this idea?
>>> 
>>> On Fri, Jul 29, 2022 at 8:02 AM PengHui Li 
>> wrote:
>>> 
>>>> +1
>>>> 
>>>> Penghui
>>>> On Jul 28, 2022, 20:14 +0800, lordcheng10 <1572139...@qq.com.invalid>,
>>>> wrote:
>>>>> Nice feature!
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- Original --
>>>>> From: "Yunze Xu"
>>>>> Date: July 15, 2022 (Friday), 6:04 PM
>>>>> To: "dev"
>>>>> Subject: [DISCUSS] User-friendly acknowledgeCumulative API on a
>>>> partitioned topic or multi-topics
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> A long time ago I opened a PR to support cumulative acknowledgement
>>>>> for the C++ client, but it is controversial whether a partitioned
>>>>> consumer should acknowledge a message ID cumulatively.
>>>>> 
>>>>> See https://github.com/apache/pulsar/pull/6796 for discussion.
>>>>> 
>>>>> Currently, the Java client acknowledges the specific partition of the
>>>>> message ID, while the C++ client just fails when calling
>>>>> `acknowledgeCumulative` on a partitioned topic. However, even if the
>>>>> Java client doesn't fail, it's not user friendly.
>>>>> 
>>>>> Assuming users call `acknowledgeCumulative` periodically, there is a
>>>>> chance that some messages of a specific partition have never been
>>>>> passed to the method.
>>>>> 
>>>>> For example, a consumer received:
>>>>> 
>>>>> P0-M0, P1-M0, P0-M1, P1-M1, P0-M2, P1-M2...
>>>>> 
>>>>> And the user acknowledged every two messages, i.e.
>>>>> 
>>>>> P0-M0, P0-M1, P0-M2
>>>>> 
>>>>> Eventually, partition 1 has never been acknowledged.
>>>>> 
>>>>> Users must maintain their own `Map<String, MessageId>` for a
>>>>> partitioned topic or multi-topics consumer with the existing
>>>>> `acknowledgeCumulative` API.
>>>>> 
>>>>> Should we make it more friendly for users? For example, we can make
>>>>> `acknowledgeCumulative` accept the map to remind users to maintain
>>>>> the map from topic name to message ID:
>>>>> 
>>>>> ```java
>>>>> // the key is the partitioned topic name like my-topic-partition-0
>>>>> void acknowledgeCumulative(Map<String, MessageId> topicToMessageId);
>>>>> ```
>>>>> 
>>>>> For those who don't want to maintain the map by themselves, maybe we
>>>>> can provide a simpler API like:
>>>>> 
>>>>> ```java
>>>>> // acknowledge all latest received messages
>>>>> void acknowledgeCumulative();
>>>>> ```
>>>>> 
>>>>> and provide an option to enable this behavior.
>>>>> 
>>>>> Do you have any suggestion on this idea? I will prepare a proposal if
>>>>> there is no disagreement.
>>>>> 
>>>>> Thanks,
>>>>> Yunze
>>>> 
>> 
>> 



Re: [DISCUSS] Improvements on the release process

2022-09-07 Thread Yunze Xu
Good suggestion. I will update in the PR soon.

Thanks,
Yunze




> On Sep 6, 2022, at 15:46, Haiting Jiang  wrote:
> 
> There is a lot of work to do before the current release process starts;
> maybe we should also include these steps:
> 
> 1. Start a discussion on the mailing list about the release. We can provide
> a template that gives clearer information about open PRs and PRs to be
> cherry-picked to the release branch.
> 
> 2. Handle all the open PRs and PRs to be cherry-picked.
> 
> 2.1 Revisit whether each PR should be ported to the release branch. The
> release label may not be correct.
> 
> 2.2 Cherry-pick merged PRs to the release branch. If there are too many
> conflicts, we can ask the PR author to open a new PR against the release
> branch. If a PR can be cherry-picked directly, we should also check the
> CI status of the branch after pushing.
> 
> 2.3 It would be better to have a clear time window to wait before we
> postpone a PR to the next release.



Re: [DISCUSS] PIP-70: Introduce lightweight raw Message metadata

2020-11-15 Thread Yunze Xu
I think protobuf can already check whether a field is set, i.e.
RAW_METADATA_MAGIC_NUMBER and RAW_METADATA_SIZE are included in the protobuf-ed
struct. In Kafka, a magic number represents the version of the protocol, not
whether a feature is enabled. If we need a *real* magic number, we must make
its meaning clear.
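
To make the distinction concrete, here is a minimal sketch of a magic number
used as a presence marker rather than as a protocol version. Everything in it is
an assumption for illustration: the class `RawMetadataFrame`, the placeholder
magic value, and the layout `[RAW_MAGIC (2 bytes)] [RAW_METADATA_SIZE (4 bytes)]
[RAW_METADATA] [headersAndPayload]` are not taken from the PIP text quoted below.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; names, layout and the magic value are assumptions.
class RawMetadataFrame {
    // Placeholder value chosen only for this example.
    static final short RAW_METADATA_MAGIC_NUMBER = 0x0e02;

    // True when the readable bytes start with the raw-metadata magic number.
    static boolean hasRawMetadata(ByteBuffer buf) {
        return buf.remaining() >= Short.BYTES
                && buf.getShort(buf.position()) == RAW_METADATA_MAGIC_NUMBER;
    }

    // Returns a view of the bytes after the raw metadata section.
    // If no raw metadata is present, the view covers all readable bytes.
    static ByteBuffer skipRawMetadata(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate();   // leave the caller's position untouched
        if (hasRawMetadata(view)) {
            view.getShort();                 // consume the magic number
            int size = view.getInt();        // RAW_METADATA_SIZE
            view.position(view.position() + size);
        }
        return view.slice();
    }
}
```

Here the magic number only answers "is the section present?". A version-style
magic number, as in Kafka, would instead select how the following bytes are
parsed.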

On 2020/11/09 06:24:18, Aloys Zhang  wrote: 
> Hi all,
> 
> We have drafted a proposal for supporting lightweight raw Message metadata
> which can be found at
> https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata
> and
> https://docs.google.com/document/d/1IgnF9AJzL6JG6G4EL_xcoQxvOpd7bUXcgxFApBiPOFY
> 
> Also, I copy it to the email thread for easier viewing.
> 
> Any suggestions or ideas are welcome in the discussion.
> 
> 
> 
> ## PIP-70:  Introduce lightweight raw Message metadata
> 
> ### 1. Motivation
> 
> For messages in Pulsar, if we want to add a new property, we always change
> the `MessageMetadata` in the protocol (PulsarApi.proto). This kind of
> property can be understood by both the broker side and the client side by
> deserializing the `MessageMetadata`. But in some cases, the property needs
> to be added on the broker side, or needs to be understood by the broker side
> in a low-cost way. When the broker side gets the message produced by the
> client, we could add the property in a new area, which is not combined with
> `MessageMetadata` and does not require deserializing the original
> `MessageMetadata` to read it; and when the broker sends the message to the
> client, we could choose to filter out this part of the property (or not, as
> the client needs). We call this kind of property “raw Message metadata”. In
> this way, consuming the “raw Message metadata” is independent of the
> original `MessageMetadata`.
> 
> The benefit of this kind of “raw Message metadata” is that the broker does
> not need to serialize/deserialize the protobuf-ed `MessageMetadata`, which
> provides better performance. It could also enable a lot of features that
> are not supported yet.
> 
> Here are some of the use cases for raw Message metadata:
> 1) Provide ordered messages by time (broker side) sequence to make message
> seek by time more accurate.
> Currently, each message has a `publish_time`, which uses client-side time.
> But for different producers in different clients, the time may not be
> aligned between clients, so the message order and the message time
> (`publish_time`) order may differ. But each topic-partition has only one
> owner broker; if we append broker-side time in the “raw Message metadata”,
> we can make sure the message order is aligned with broker-side time. With
> this feature, we could handle message seek by time more accurately.
> 
> 2) Provide a continuous message sequence-Id for messages in one
> topic-partition.
> MessageId is a combination of ledgerId+entryId+batchIndex; for a partition
> that contains more than one ledger, the IDs inside are not continuous. With
> this solution, we could append a sequence-Id at the end of each Message.
> This will make message sequence management easier.
> 
> In this proposal, we will focus on the first feature, “provide ordered
> messages by time (broker side) sequence”, mentioned above; this will make
> the proposal easier to go through.
> 
> ### 2. Message and “raw Message metadata” structure changes.
> 
> As mentioned above, there are 2 main benefits in this proposal:
> 
> 1. Almost all of the changes happen on the Broker side.
> 2. Avoid serializing/deserializing the protobuf-ed `MessageMetadata`.
> 
> #### 2.1 Raw Message metadata structure in Protobuf
> 
> Protobuf is used a lot in Pulsar, so we could use Protobuf to
> serialize/deserialize the raw Message metadata.
> In this example, we will save the broker side timestamp when each message
> is sent from the broker to BookKeeper. So the definition is very simple.
> 
> ```protobuf
> message RawMessageMetadata {
>     optional uint64 broker_timestamp = 1;
> }
> ```
> 
> #### 2.2 Message and “raw Message metadata” structure details
> 
> Each message is sent from the producer client to the broker in this frame
> format:
> 
> ```
> [TOTAL_SIZE] [CMD_SIZE][CMD] [MAGIC_NUMBER] [CHECKSUM] [METADATA_SIZE]
> [METADATA] [PAYLOAD]
> ```
> 
> The first 3 fields “[TOTAL_SIZE] [CMD_SIZE ] [CMD]” will be read in
> `LengthFieldBasedFrameDecoder` and `PulsarDecoder`, and the rest is
> handled in method
> `org.apache.pulsar.broker.service.Producer.publishMessage`. The remaining
> part “[MAGIC_NUMBER] [CHECKSUM] [METADATA_SIZE] [METADATA] [PAYLOAD]” is
> usually treated as “headersAndPayload” in the code. As described above, we
> do not want this part to be changed at all, so we could take this part as a
> whole package.
> 
> ```
> [MAGIC_NUMBER] [CHECKSUM] [METADATA_SIZ

Re: Virtual Pulsar Community Meetings

2021-01-29 Thread Yunze Xu
+1

> On Jan 29, 2021, at 16:53, Sijie Guo wrote:
> 
> That's a super great idea! Thank you for bringing this up!
> 
> Given that a lot of committers/contributors are from North America and
> Asia, I think we should pick a better time that would be suitable for
> people from Asia (Japan and China).
> In the bookkeeper community, we used to run two events. One is to cover NA
> and Asia, and the other one is to cover NA and EU.
> 
> I would suggest running biweekly meetings.
> 
> - Tuesday 4 PM - 5 PM PST for NA and Asia
> - Thursday 8 AM - 9 AM PST for NA and EU
> 
> All the events can be recorded and uploaded to Youtube. So people are able
> to watch the recordings as well.
> 
> If people are good with this proposal, I am happy to set up and coordinate
> the meetings given I have run many meetings in the BookKeeper community
> before.
> 
> We can get started next Tuesday and formalize the process as we go. Please
> vote for your idea.
> 
> Thanks,
> Sijie
> 
> 
> On Fri, Jan 29, 2021 at 12:05 AM Enrico Olivelli 
> wrote:
> 
>> Hello everyone,
>> in the BookKeeper community we used to have "Community Meetings" in order
>> to meet each other, discuss current works on the project, share knowledge
>> about current problems.
>> 
>> What about having Community meetings for Pulsar ?
>> I would be happy to organize and to host the first meeting.
>> 
>> In Apache we keep the decisions and the discussions on mailing lists, so
>> these meetings would be only to share information and we are not going to
>> make decisions.
>> 
>> The Pulsar community is distributed all over the world, we have people from
>> China, the US, Europe, it will be hard to find a good time for everyone.
>> I suggest for the first meeting to meet at 8:30 PST
>> 
>> For reference, this is the link to the "minutes" of the Community Meetings
>> in BK,
>> Sijie and Matteo will remember those days
>> https://cwiki.apache.org/confluence/display/BOOKKEEPER/Community+Meetings
>> 
>> 
>> Enrico
>> 



Re: Pulsar Server on JDK11 - initial discussion

2021-02-09 Thread Yunze Xu
LGTM, if I understand correctly.

In short, there are two key points:

1. Keep the code compatible with Java 8.
2. Release the binaries that are built on JDK 11 to get the benefit.

So Java 8 users can still build Pulsar with JDK 8 but the default releases are 
built on JDK 11.

- Yunze



Re: Very flaky CPP tests - BasicEndToEndTest.testLookupThrottling

2021-04-05 Thread Yunze Xu
This flaky test has existed for a long time, even before I contributed my
first PR to Pulsar. I’ve looked into the problem several times but was always
blocked because I cannot reproduce it in my local environment.

From the code and logs, it looks like the cleanup phase encountered a
segmentation fault, which may be caused by a null pointer access in the
multi-threaded test environment with gtest-parallel
(https://github.com/google/gtest-parallel). However, I still cannot reproduce
it with the same command that CI uses.

I just want a CI environment to test my ideas in, but Pulsar’s workflows take
too much time, so even a tiny change may wait too long. Is there any way to run
a similar CI in my own repo and verify whether a change fixes the flaky test?

Regards
Yunze Xu

> On Apr 2, 2021, at 15:09, Enrico Olivelli wrote:
> 
> Hello,
> We have this BasicEndToEndTest.testLookupThrottling CPP test that is very 
> flaky
> 
> We already created two issues about it
> https://github.com/apache/pulsar/issues/6301
> https://github.com/apache/pulsar/issues/6267
> 
> Is there any expert of the CPP client that can help in fixing this?
> In my experience I see it failing very often on CI
> 
> Regards
> Enrico



Re: Very flaky CPP tests - BasicEndToEndTest.testLookupThrottling

2021-04-05 Thread Yunze Xu
Thanks for the instruction, I’ll take a look later.

Regards
Yunze Xu

> On Apr 6, 2021, at 01:38, Devin Bost wrote:
> 
> Check out the instructions by Lari Hotari here about how to setup your
> personal Github CI:
> https://markmail.org/message/xapp7aguh44osqhm
> 
> I've been working through similar issues as I've been facing a
> NullPointerException (in a PR of mine) that consistently occurs in Github
> but never occurs locally.
> 
> I tried using Act (https://github.com/nektos/act) to emulate the
> environment but ran into different issues, probably permission related.
> 
> --
> Devin G. Bost
> 
> On Mon, Apr 5, 2021, 10:51 AM Yunze Xu  wrote:
> 
>> This flaky test has existed for a long time, even before I contributed my
>> first PR to Pulsar. I’ve looked into the problem several times but was
>> always blocked because I cannot reproduce it in my local environment.
>> 
>> From the code and logs, it looks like the cleanup phase encountered a
>> segmentation fault, which may be caused by a null pointer access in the
>> multi-threaded test environment with gtest-parallel
>> (https://github.com/google/gtest-parallel). However, I still cannot
>> reproduce it with the same command that CI uses.
>> 
>> I just want a CI environment to test my ideas in, but Pulsar’s workflows
>> take too much time, so even a tiny change may wait too long. Is there any
>> way to run a similar CI in my own repo and verify whether a change fixes
>> the flaky test?
>> 
>> Regards
>> Yunze Xu
>> 
>>> On Apr 2, 2021, at 15:09, Enrico Olivelli wrote:
>>> 
>>> Hello,
>>> We have this BasicEndToEndTest.testLookupThrottling CPP test that is
>> very flaky
>>> 
>>> We already created two issues about it
>>> https://github.com/apache/pulsar/issues/6301
>>> https://github.com/apache/pulsar/issues/6267
>>> 
>>> Is there any expert of the CPP client that can help in fixing this?
>>> In my experience I see it failing very often on CI
>>> 
>>> Regards
>>> Enrico
>> 
>> 



Re: [DISCUSS] Apache Pulsar 2.8.0 Release

2021-05-11 Thread Yunze Xu
We need a PR to upgrade BK to 4.14.0 before the Pulsar 2.8.0 release, see
https://github.com/apache/pulsar/pull/10330

Thanks,
Yunze Xu



Re: Pulsar WebSite Builder is failing

2021-05-21 Thread Yunze Xu
Hi Enrico, could you try https://github.com/apache/pulsar/pull/10668
in your local env?

Thanks,
Yunze

> On May 18, 2021, at 17:24, Enrico Olivelli wrote:
> 
> Hello,
> we are still stuck.
> 
> the website builder does not work.
> This time is because the build image picks up Python 3.5.2 and so the
> job is not able to build the Python client that apparently cannot be
> compiled with that version.
> 
> I tried to tweak the build image in order to force Python 3.9 but
> without success.
> 
> I cannot announce 2.7.2 release, and also all the website/docs
> improvements are stuck
> 
> Any help is appreciated
> :-)
> 
> Enrico
> 
> On Fri, May 14, 2021 at 14:33, Enrico Olivelli wrote:
>> 
>> On Fri, May 14, 2021 at 13:43, Enrico Olivelli wrote:
>>> 
>>> Hello,
>>> the website builder is failing on CI, below you can find the error
>>> 
>>> Is there anyone who knows how it works and how to fix it ?
>>> Is there a way to update it manually ?
>>> 
>>> I cannot announce 2.7.2 release until the website is updated.
>>> 
>>> 
>>> this is the link:
>>> https://github.com/apache/pulsar/runs/2581534691?check_suite_focus=true
>>> 
>>> this is the error:
>>> Please, upgrade your dependencies to the actual version of core-js.
>>> warning jest > jest-cli > jest-config > babel-core >
>>> babel-register > core-js@2.6.12: core-js@<3.3 is no longer maintained
>>> and not recommended for usage due to the number of issues. Because of
>>> the V8 engine whims, feature detection in old core-js versions could
>>> cause a slowdown up to 100x even if nothing is polyfilled. Please,
>>> upgrade your dependencies to the actual version of core-js.
>>> warning jest > jest-cli > jest-environment-jsdom > jsdom >
>>> left-pad@1.3.0: use String.prototype.padStart()
>>> warning jest > jest-cli > jest-environment-jsdom > jsdom >
>>> request-promise-native@1.0.9: request-promise-native has been
>>> deprecated because it extends the now deprecated request package, see
>>> https://github.com/request/request/issues/3142
>>> warning jest > jest-cli > jest-haste-map > sane >
>>> fsevents@1.2.13: fsevents 1 will break on node v14+ and could be using
>>> insecure binaries. Upgrade to fsevents 2.
>>> warning highlight.js@9.18.5: Support has ended for 9.x series.
>>> Upgrade to @latest
>>> [2/4] Fetching packages...
>>> info fsevents@1.2.13: The platform "linux" is incompatible with
>>> this module.
>>> info "fsevents@1.2.13" is an optional dependency and failed
>>> compatibility check. Excluding it from installation.
>>> error @redocly/openapi-core@1.0.0-beta.45: The engine "node" is
>>> incompatible with this module. Expected version ">=12.0.0". Got
>>> "10.23.3"
>>> error Found incompatible module.
>>> info Visit https://yarnpkg.com/en/docs/cli/install for
>>> documentation about this command.
>>> Error: Process completed with exit code 1.
>> 
>> it looks like the problem is in the "pulsar build  image" that still
>> installs Node 10 instead of Node 12
>> 
>> I am trying to prepare a quick fix, but I will need help from someone
>> who can push the image to dockerhub
>> 
>> 
>> Enrico
>> 
>>> 
>>> Enrico



Correct TopicName#getPartitionIndex implementation

2021-06-08 Thread Yunze Xu
Hi all,

Currently the Java implementation that gets the partition index from a topic
name is not correct. See
https://github.com/apache/pulsar/pull/8341/files#diff-445b0cfa56ca0c784df78e73d9294f2a37f079ca3c15c345b03c09d56f81ebff
for the unit tests I added.

I also noticed the problem in https://github.com/apache/pulsar/pull/10850
because a transaction buffer snapshot topic name may be `xxx-partition-0-yyy`,
which should not be treated as a partitioned topic.

Since Pulsar is at 2.9.0-SNAPSHOT now, is it appropriate to correct the
implementation? What concerns me is compatibility, because we can’t assume
users have never used a topic name like `my-topic-partition-000` to reference
partition 0 of `my-topic`. If the behavior were corrected,
`my-topic-partition-000` would be treated as a non-partitioned topic.

I'm not sure whether this change could have a wide impact, so I want to begin a
discussion about it.
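
To illustrate the stricter behavior being discussed, here is a minimal sketch.
It is not the actual `TopicName#getPartitionIndex` code; the class
`PartitionIndexParser` is hypothetical, and the rule it applies (the name must
end with `-partition-` followed by a canonical non-negative integer) is my
reading of the cases above.

```java
// Hypothetical sketch of a strict partition index parser; not Pulsar code.
class PartitionIndexParser {
    static final String PARTITION_SUFFIX = "-partition-";

    // Returns the partition index, or -1 if the name does not denote a partition.
    static int getPartitionIndex(String topic) {
        int pos = topic.lastIndexOf(PARTITION_SUFFIX);
        if (pos < 0) {
            return -1;
        }
        String digits = topic.substring(pos + PARTITION_SUFFIX.length());
        // Reject empty suffixes and anything too long to fit in an int.
        if (digits.isEmpty() || digits.length() > 9) {
            return -1;
        }
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            if (c < '0' || c > '9') {
                return -1; // e.g. "xxx-partition-0-yyy"
            }
        }
        // Reject non-canonical forms such as "000" or "01".
        if (digits.length() > 1 && digits.charAt(0) == '0') {
            return -1;
        }
        return Integer.parseInt(digits);
    }
}
```

Under this rule `my-topic-partition-0` maps to partition 0, while both
`xxx-partition-0-yyy` and `my-topic-partition-000` are treated as
non-partitioned names, which is exactly the compatibility question raised above.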

Thanks,
Yunze

Re: Correct TopicName#getPartitionIndex implementation

2021-06-11 Thread Yunze Xu
Okay, I just had an offline discussion with @yangl about this problem.
Either he or I will fix the getPartitionIndex implementation later.

Thanks,
Yunze



Update for apachepulsar/pulsar-build images

2021-07-20 Thread Yunze Xu
Hello,
Currently Pulsar's CI workflows rely on some Docker images, but I found these
images are maintained manually by whoever has the permission. See
https://hub.docker.com/r/apachepulsar/pulsar-build/tags?page=1&ordering=last_updated
If a Dockerfile changes, the image is not built and pushed to Docker Hub
automatically.

In this email, I'm only asking someone who has the permission to update the
Python related images so that https://github.com/apache/pulsar/issues/11004
can be fixed.
Just run create-images.sh and push-images.sh under the pulsar-client-cpp/docker
directory.

Furthermore, it would be better to add a workflow that pushes these images when
necessary. Can anyone help with it?

Thanks,
Yunze Xu

Re: [DISCUSS] PIP-91: Separate lookup timeout from operation timeout

2021-08-09 Thread Yunze Xu
+1

It makes sense to me. I also encountered TooManyRequests in topic lookup when
there are a lot of topics.

It should be retriable instead of being returned to the client as a simple
error response.
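
As a rough sketch of what "retry with backoff and jitter" could look like, see
below. This is not the PIP-91 implementation; `LookupBackoff`, its enum, and the
10% jitter factor are assumptions made only for this example.

```java
import java.util.Random;

// Hypothetical sketch; class and enum names are illustrative, not Pulsar APIs.
class LookupBackoff {
    enum LookupError { TIMEOUT, TOO_MANY_REQUESTS, AUTHORIZATION_FAILED }

    private final long maxMs;
    private final Random random;
    private long nextMs;

    LookupBackoff(long initialMs, long maxMs, Random random) {
        this.nextMs = initialMs;
        this.maxMs = maxMs;
        this.random = random;
    }

    // Only the two conditions named in the PIP summary are retried here.
    static boolean isRetriable(LookupError error) {
        return error == LookupError.TIMEOUT || error == LookupError.TOO_MANY_REQUESTS;
    }

    // Returns the current base delay minus up to 10% random jitter,
    // then doubles the base delay for the next call, capped at maxMs.
    long nextDelayMs() {
        long base = nextMs;
        nextMs = Math.min(maxMs, nextMs * 2);
        long jitter = (long) (random.nextDouble() * base * 0.1);
        return base - jitter;
    }
}
```

The jitter spreads out retries from many producers so that they do not all hit
the broker at the same instant after a restart.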

Thanks,
Yunze

> On Aug 9, 2021, at 22:11, Ivan Kelly wrote:
> 
> Hi folks,
> 
> I've created a PIP to do some rework on lookup timeouts and retries.
> We've had major client incidents recently due to a client with many
> many producers, which triggered a herding effect on broker restarts.
> This PIP aims to alleviate some of the issues we saw.
> 
> In summary, we want to retry (with backoff and jitter) on timeout and
> on TooManyRequests. Kicking the error back to the client just results
> in the clients restarting and trying again.
> 
> Please take a look.
> 
> https://github.com/apache/pulsar/wiki/PIP-91:-Separate-lookup-timeout-from-operation-timeout
> 
> Cheers,
> Ivan



Re: [ANNOUNCE] New committer: Rui Fu

2021-08-10 Thread Yunze Xu
Congrats Rui!

Thanks,
Yunze


Re: PIP-93 Pulsar Proxy Protocol Handlers

2021-08-30 Thread Yunze Xu
+1. Great idea.

I’m not familiar with Pulsar Proxy and have a question. How can a proxy
protocol handler reuse the existing code of a protocol handler?

Thanks,
Yunze

> On Aug 30, 2021, at 16:47, Enrico Olivelli wrote:
> 
> Hello Pulsar fellows,
> 
> I have prepared a PIP about adding support for Protocol Handlers
> 
> This is the GDoc
> 
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
> 
> 
> This is the PR for the implementation
> https://github.com/apache/pulsar/pull/11838/files
> 
> I am pretty sure that this PIP will make life much nicer for developers of
> Protocol Handlers and for Administrators who deploy Protocol Handlers
> 
> We are still working on the formal PIP process, at the moment I am sharing
> with you the document.
> My understanding is that after the discussion, I will start a VOTE thread,
> and if the VOTE passes we can move forward with reviewing the PR, and
> hopefully merge this feature for Pulsar 2.9.0
> 
> Enrico



Re: PIP-93 Pulsar Proxy Protocol Handlers

2021-08-30 Thread Yunze Xu
If I understand correctly, we’re going to use both a broker version and a proxy
version of KoP:
- The proxy version is responsible for lookup/auth related requests like
  METADATA and SASL_XXX requests
- The broker version is responsible for other requests that require the broker
  to be the topic owner, like PRODUCE and FETCH requests
Right?
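
In code, the split I have in mind would look roughly like the sketch below. It
is purely illustrative: `KopRequestRouting` and its enums are hypothetical names,
and the list of request types is limited to the ones mentioned above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the proxy/broker split; not KoP or Pulsar code.
class KopRequestRouting {
    enum RequestType { METADATA, SASL_HANDSHAKE, SASL_AUTHENTICATE, PRODUCE, FETCH }

    enum Target { PROXY, OWNER_BROKER }

    // Lookup and authentication requests are answered by the proxy itself.
    private static final Set<RequestType> HANDLED_BY_PROXY = EnumSet.of(
            RequestType.METADATA,
            RequestType.SASL_HANDSHAKE,
            RequestType.SASL_AUTHENTICATE);

    // Everything else is forwarded to the broker that owns the topic.
    static Target route(RequestType type) {
        return HANDLED_BY_PROXY.contains(type) ? Target.PROXY : Target.OWNER_BROKER;
    }
}
```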

Thanks,
Yunze

> On Aug 30, 2021, at 23:56, Enrico Olivelli wrote:
> 
> On Mon, Aug 30, 2021 at 17:22, Yunze Xu wrote:
> 
>> +1. Great idea.
>> 
>> I’m not familiar with Pulsar Proxy and have a question. How can a proxy
>> protocol handler reuse the existing code of a protocol handler?
>> 
> 
> The code that runs on proxy will be much different from the code you have
> in the Broker Protocol Handler.
> 
> Basically the Proxy protocol handles do these things:
> - run the custom wire protocol (by starting custom Netty endpoints)
> - use the discovery service to proxy the requests to the Broker that is the
> owner of the topic
> - run authentication and forwards user identity (if needed) to the Broker
> - performs authorization
> 
> The Proxy protocol handler does not access the BrokerService and cannot
> access Pulsar broker internals
> 
> Enrico
> 
> 
> 
>> 
>> Thanks,
>> Yunze
>> 
>>> On Aug 30, 2021, at 16:47, Enrico Olivelli wrote:
>>> 
>>> Hello Pulsar fellows,
>>> 
>>> I have prepared a PIP about adding support for Protocol Handlers
>>> 
>>> This is the GDoc
>>> 
>>> 
>> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
>>> 
>>> 
>>> This is the PR for the implementation
>>> https://github.com/apache/pulsar/pull/11838/files
>>> 
>>> I am pretty sure that this PIP will make life much nicer for developers of
>>> Protocol Handlers and for Administrators who deploy Protocol Handlers
>>> 
>>> We are still working on the formal PIP process, at the moment I am
>> sharing
>>> with you the document.
>>> My understanding is that after the discussion, I will start a VOTE
>> thread,
>>> and if the VOTE passes we can move forward with reviewing the PR, and
>>> hopefully merge this feature for Pulsar 2.9.0
>>> 
>>> Enrico
>> 
>> 



Re: PIP-93 Pulsar Proxy Protocol Handlers

2021-08-30 Thread Yunze Xu
Thanks for your explanation. I’m looking forward to the prototype
implementation.

Thanks,
Yunze

> On Aug 31, 2021, at 04:17, Enrico Olivelli wrote:
> 
> Yunze,
> 
> On Mon, Aug 30, 2021, 18:48, Yunze Xu wrote:
> 
>> If I understand correctly, we’re going to use both a broker version and a
>> proxy version of KoP:
>> - The proxy version is responsible for lookup/auth related requests like
>>   METADATA and SASL_XXX requests
>> - The broker version is responsible for other requests that require the
>>   broker to be the topic owner, like PRODUCE and FETCH requests
>> Right?
>> 
> 
> You are on the right way.
> Probably it is better to discuss about KOP in a separate thread.
> 
> Enrico
> 
>> Thanks,
>> Yunze
>> 
>>> On Aug 30, 2021, at 23:56, Enrico Olivelli wrote:
>>> 
>>> On Mon, Aug 30, 2021 at 17:22, Yunze Xu wrote:
>>> 
>>>> +1. Great idea.
>>>> 
>>>> I’m not familiar with Pulsar Proxy and have a question. How can a proxy
>>>> protocol handler reuse the existing code of a protocol handler?
>>>> 
>>> 
>>> The code that runs on proxy will be much different from the code you have
>>> in the Broker Protocol Handler.
>>> 
>>> Basically the Proxy protocol handles do these things:
>>> - run the custom wire protocol (by starting custom Netty endpoints)
>>> - use the discovery service to proxy the requests to the Broker that is
>> the
>>> owner of the topic
>>> - run authentication and forwards user identity (if needed) to the Broker
>>> - performs authorization
>>> 
>>> The Proxy protocol handler does not access the BrokerService and cannot
>>> access Pulsar broker internals
>>> 
>>> Enrico
>>> 
>>> 
>>> 
>>>> 
>>>> Thanks,
>>>> Yunze
>>>> 
>>>>> On Aug 30, 2021, at 16:47, Enrico Olivelli wrote:
>>>>> 
>>>>> Hello Pulsar fellows,
>>>>> 
>>>>> I have prepared a PIP about adding support for Protocol Handlers
>>>>> 
>>>>> This is the GDoc
>>>>> 
>>>>> 
>>>> 
>> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
>>>>> 
>>>>> 
>>>>> This is the PR for the implementation
>>>>> https://github.com/apache/pulsar/pull/11838/files
>>>>> 
>>>>> I am pretty sure that this PIP will make life much nicer for developers
>>>>> of Protocol Handlers and for Administrators who deploy Protocol Handlers
>>>>> 
>>>>> We are still working on the formal PIP process, at the moment I am
>>>>> sharing with you the document.
>>>>> My understanding is that after the discussion, I will start a VOTE
>>>>> thread, and if the VOTE passes we can move forward with reviewing the PR,
>>>>> and hopefully merge this feature for Pulsar 2.9.0
>>>>> 
>>>>> Enrico
>>>> 
>>>> 
>> 
>> 



[PIP 94] Message converter at broker level

2021-09-07 Thread Yunze Xu
Hi, folks

I’ve created PIP 94; see https://github.com/apache/pulsar/issues/11962
for details.
PTAL and share any suggestions or concerns you have.

Thanks,
Yunze

[PIP 96] Payload converter for Pulsar client

2021-09-18 Thread Yunze Xu
Hi, folks

I created PIP 94 earlier for a message converter at the broker level. However,
after the discussion, I decided to withdraw that proposal. Now, with the
same motivation, I've created PIP 96 to add a payload converter for the Pulsar
client. Note that the converter works on the payload only, i.e. not
including the metadata/header parts.

See details at https://github.com/apache/pulsar/issues/12087. PTAL.

Thanks,
Yunze

Correct semantics of producer close

2021-09-27 Thread Yunze Xu
Hi all,

Recently I found a PR (https://github.com/apache/pulsar/pull/12195) that
modifies the existing semantics of producer close. There has already been some
discussion in this PR, but I think it's better to start a discussion here
so that more people are aware of it.

The existing implementation of producer close is:
1. Cancel all timers, including the send timer and the batch container
   (`batchMessageContainer`) timer.
2. Complete all pending messages (`pendingMessages`) with
   `AlreadyClosedException`.

See `ProducerImpl#closeAsync` for details.

But the JavaDoc of `Producer#closeAsync` is:

> No more writes will be accepted from this producer. Waits until all pending 
> write request are persisted.

So the documentation and the implementation are inconsistent. Specifically,
we need to define how `pendingMessages` and `batchMessageContainer` are
processed when the producer calls `closeAsync`.

1. batchMessageContainer: contains the buffered single messages (`Message`).
2. pendingMessages: all in-flight messages (`OpSendMsg`) on the network.

IMO, from the JavaDoc, only `pendingMessages` should be processed and the
messages in `batchMessageContainer` should be discarded.

Other clients might have already implemented semantics similar to the Java
client's. If we change the semantics now, the behavior of different clients
might become inconsistent.

Should we add a configuration to support graceful close to follow the docs? Or
just change the current behavior?

Thanks,
Yunze

Re: Correct semantics of producer close

2021-09-28 Thread Yunze Xu
I can’t agree more. As I said in PR 12195:

> In any case, when you choose `sendAsync`, you should always make use of the 
> returned future to confirm the result of all messages. In Kafka, it's the 
> send callback.

But I have found that many users are confused by the current behavior,
especially those who are used to Kafka’s close semantics. They might expect a
best-effort attempt to flush existing messages, which works in a simple test
environment even though there is no guarantee in failure cases.
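
To illustrate the point about always checking the future returned by
`sendAsync`, here is a minimal Java sketch. It does not use the Pulsar client:
`AsyncSender` and `inMemorySender` are hypothetical stand-ins built on
`CompletableFuture`, so only the pattern carries over (collect the futures,
wait for every result, handle failures, and only then close).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class SendThenClose {
    // Hypothetical stand-in for an async producer; NOT the Pulsar client API.
    interface AsyncSender {
        CompletableFuture<Long> sendAsync(String payload);
        CompletableFuture<Void> closeAsync();
    }

    // Sends every payload, waits for each result, and only then closes.
    // Returns the number of sends that completed exceptionally.
    static int sendAllThenClose(AsyncSender sender, List<String> payloads) {
        List<CompletableFuture<Long>> futures = new ArrayList<>();
        for (String payload : payloads) {
            futures.add(sender.sendAsync(payload));
        }
        int failed = 0;
        for (CompletableFuture<Long> future : futures) {
            try {
                future.join(); // blocks until this send succeeds or fails
            } catch (RuntimeException e) {
                failed++; // a real application would retry or report here
            }
        }
        sender.closeAsync().join(); // close only after every result is known
        return failed;
    }

    // Toy in-memory sender for the demo: fails any payload equal to "bad".
    static AsyncSender inMemorySender() {
        return new AsyncSender() {
            private long nextId = 0;

            @Override
            public CompletableFuture<Long> sendAsync(String payload) {
                if ("bad".equals(payload)) {
                    CompletableFuture<Long> failedFuture = new CompletableFuture<>();
                    failedFuture.completeExceptionally(
                            new IllegalStateException("send failed"));
                    return failedFuture;
                }
                return CompletableFuture.completedFuture(nextId++);
            }

            @Override
            public CompletableFuture<Void> closeAsync() {
                return CompletableFuture.completedFuture(null);
            }
        };
    }

    public static void main(String[] args) {
        int failed = sendAllThenClose(inMemorySender(), Arrays.asList("a", "bad", "b"));
        System.out.println("failed sends: " + failed);
    }
}
```

With this pattern the outcome of each message is known before close, so the
result does not depend on what close does with messages that are still pending.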



> On Sep 28, 2021, at 4:37 PM, Joe F wrote:
> 
> Clients should not depend on any of this behaviour, since the broker is at
> the other end of an unreliable  network connection. The
> semantic differences are kind of meaningless from a usability point, since
> flushing on close =/= published.  What exactly does "graceful" convey
> here?  Flush the  buffer on the client end and hope it makes it to the
> server.
> 
> Is there a  difference whether you flush(or process) pending messages  or
> not? There is no guarantee that either case will ensure the message is
> published.
> 
> The only way to ensure that messages are published is to wait for the ack.
> The correct model should be to wait for return on the blocking API, or wait
> for future completion of the async API, then handle any publish errors and
> then only close the producer.



Re: [VOTE] PIP-99 Pulsar Proxy Extensions

2021-09-28 Thread Yunze Xu
+1 (non binding)

Thanks,
Yunze


Re: [PIP 100] Add seekByIndex for consumer api

2021-09-28 Thread Yunze Xu
You need to create an issue first to start a discussion for your PIP.

Here’s the process of PIP:

1. The author(s) of the proposal will create a GitHub issue ticket choosing the
   template for PIP proposals.
2. The author(s) will send a note to the dev@pulsar.apache.org mailing list
   to start the discussion, using subject prefix `[PIP] xxx`.
3. Based on the discussion and feedback, some changes might be applied by
   authors to the text of the proposal.
4. Once some consensus is reached, there will be a vote to formally approve
   the proposal.
   The vote will be held on the dev@pulsar.apache.org mailing list. Everyone
   is welcome to vote on the proposal, though only the votes of PMC members
   will be considered binding.
   A lazy majority of at least 3 binding +1 votes is required.
   The vote should stay open for at least 48 hours.
5. When the vote is closed, if the outcome is positive, the state of the
   proposal is updated and the Pull Requests associated with this proposal can
   start to get merged into the master branch.

Thanks,
Yunze

> On Sep 28, 2021, at 10:04 PM, JiangHaiting wrote:
> 
> Hi Pulsar Community,
> 
> 
> I'm glad to have this opportunity to propose this PIP.
> 
> 
> Currently we can reset the read position of a cursor by message id or
> timestamp.
> Since an index has been introduced in the broker metadata since 2.9.0,
> resetting the cursor by index is very helpful for other protocol handlers
> (KoP or RoP).
> 
> 
> Also, as @BewareMyPower pointed out that 
> "users might want to seek to 1 messages before. Currently they cannot 
> achieve this goal.". 
> And this PIP will make it possible.
> 
> 
> I've already created a PR, see details at 
> https://github.com/apache/pulsar/pull/12032
> 
> 
> 
> Thanks,
> Haiting Jiang (Github:Jason918)



Re: [PIP 100] Add seekByIndex for consumer api

2021-09-28 Thread Yunze Xu
I have edited the Wiki page for PIP
 https://github.com/apache/pulsar/wiki/Pulsar-Improvement-Proposal-(PIP) 
 

Thanks,
Yunze

Re: Correct semantics of producer close

2021-09-28 Thread Yunze Xu
It’s a good point that `ProducerImpl#failPendingBatchMessages` also treats
messages in the batch container as pending messages.

I agree with your definition of "graceful close". It’s more like "at most once"
semantics, as the original JavaDoc says:

> pending writes will not be retried

Thanks,
Yunze

> On Sep 29, 2021, at 5:24 AM, Michael Marshall wrote:
> 
> Thanks for bringing this thread to the mailing list, Yunze.
> 
> I think the right change is to update the `closeAsync` method to first
> flush `batchMessageContainer` and to then asynchronously wait for the
> `pendingMessages` queue to drain. We could add a new timeout or rely
> on the already implemented `sendTimeout` config to put an upper time
> limit on `closeAsync`. My reasoning as well as responses to Joe and
> Yunze follow:
> 
>> we need to define the behavior for how to process `pendingMessages`
>> and `batchMessageContainer` when producer call `closeAsync`.
> 
> Yes, this is exactly the clarification required, and I agree that the
> Javadoc is ambiguous and that the implementation doesn't align with
> the Javadoc.
> 
> If we view the Javadoc as binding, then the fundamental question is
> what messages are "pending". The `pendingMessages` seem pretty easy to
> classify as "pending" given that they are already in flight on the
> network.
> 
> I also consider `batchMessageContainer` to be "pending" because a
> client application already has callbacks for the messages in this
> container. These callbacks are expected to complete when the batch
> message delivery completes. Since the client application already has a
> reference to a callback, it isn't a problem that the producer
> implementation initiates the flush logic. (Note that the current
> design fails the `pendingMessages` but does not fail the
> `batchMessageContainer` when `closeAsync` is called, so the callbacks
> for that container are currently left incomplete forever if the client
> is closed with an unsent batch. We will need to address this design in
> the work that comes from this discussion.)
> 
> Further, the `ProducerImpl#failPendingMessages` method includes logic
> to call `ProducerImpl#failPendingBatchMessages`, which implies that
> these batched, but not sent, messages have been historically
> considered "pending".
> 
> If we view the Javadoc as non-binding, I think my guiding influence
> for the new design would be that the `closeAsync` method should result
> in a "graceful" shutdown of the client.
> 
>> What exactly does "graceful" convey here?
> 
> This is a great question, and will likely drive the design here. I
> view graceful to mean that the producer attempts to avoid artificial
> failures. That means trying to drain the queue instead of
> automatically failing all of the queue's callbacks. The tradeoff is
> that closing the producer takes longer. This reasoning would justify
> my claim that we should first flush the `batchMessageContainer`
> instead of failing the batch without any effort at delivery, as that
> would be artificial.
> 
>> There is no guarantee that either case will ensure the message
>> is published.
> 
> I don't think that implementing `closeAsync` with graceful shutdown
> logic implies a guarantee of message publishing. Rather, it guarantees
> that failures will be the result of a real exception or a timeout.
> Since calling `closeAsync` prevents additional messages from
> delivering, users leveraging this functionality might be operating
> with "at most once" delivery semantics where they'd prefer to deliver
> the messages if possible, but they aren't going to delay application
> shutdown indefinitely to deliver its last messages. If users need
> stronger guarantees about whether their messages are delivered, they
> are probably already using the flush methods to ensure that the
> producer's queues are empty before calling `closeAsync`.
> 
> I also agree that in all of these cases, we're assuming that users are
> capturing references to the async callbacks and then making business
> logic decisions based on the results of those callbacks.
> 
> Thanks,
> Michael
> 
> On Tue, Sep 28, 2021 at 4:58 AM Yunze Xu  wrote:
>> 
>> I can’t agree more, just like what I’ve said in PR 12195:
>> 
>>> At any case, when you choose `sendAsync`, you should always make use of the 
>>> returned future to confirm the result of all messages. In Kafka, it's the 
>>> send callback.
>> 
>> But I found many users are confused about the current behavior, especially
>> those are used to Kafka’s close semantics. They might expect a simple try
>> to

[VOTE] PIP-96 Message payload processor for Pulsar client

2021-09-29 Thread Yunze Xu
Hi folks,

It has been about two weeks since I opened the PIP-96 issue and the design has
changed a lot. Thanks a lot for @eolivelli's suggestions. I think now it's time
to start a vote.

PIP-96 issue: https://github.com/apache/pulsar/issues/12087 


Thanks,
Yunze

Re: Correct semantics of producer close

2021-09-30 Thread Yunze Xu
n see both sides of the argument regarding whether to flush
>>>> > pending messages or not. But I think what is definitely in the contract
>>>> > is not to discard any callbacks causing user code to block forever. No
>>>> > matter what, we must always call the callbacks.
>>>> >
>>>> > Personally, I am in favour of a close operation not flushing pending
>>>> > messages (and I define pending here as any message that has a callback).
>>>> > The reason is that if we wait for all pending messages to be sent then we
>>>> > now face a number of edge cases that could cause the close operation to
>>>> > take a very long time to complete. What if the user code really just
>>>> > needs to close the producer right now? If we amend the documentation to
>>>> > make it clear that close does not flush pending messages then the user is
>>>> > now able to explicitly craft the behaviour they need. If they want all
>>>> > messages flushed first then chain flushAsync->closeAsync, else just
>>>> > closeAsync.
>>>> >
>>>> > Unfortunately I think user expectation, regardless of the current
>>>> > javadoc, is that close would flush everything and in an ideal world it
>>>> > would. We have the Principle of Least Surprise but we also have Safe By
>>>> > Default. Users might be surprised that when calling closeAsync, a load of
>>>> > their pending messages get ConnectionAlreadyClosed, but equally they
>>>> > might be surprised when closeAsync doesn't complete because the pending
>>>> > messages can't be cleared. Failing pending messages is the safer option.
>>>> > User code must handle failure responses and cannot claim data loss with a
>>>> > non-positive response. But if they can't close a producer, that could
>>>> > result in a wider impact on their system, not to mention more issues
>>>> > created in GitHub.
>>>> >
>>>> > Jack
>>>> >
>>>> > On Wed, Sep 29, 2021 at 7:05 AM Joe F wrote:
>>>> >
>>>> > > >I don't think that implementing `closeAsync` with graceful shutdown
>>>> > > logic implies a guarantee of message publishing. Rather, it guarantees
>>>> > > that failures will be the result of a real exception or a timeout.
>>>> > >
>>>> > > I think that's beside the point. There is no definition of "real"
>>>> > > exceptions. At that point the app is publishing on a best effort basis,
>>>> > > and there are no guarantees anywhere in client or server.
>>>> > >
>>>> > > There is no concept of "maybe published". OR
>>>> > > "published-if-no_real_errors". What does that even mean? That is only a
>>>> > > can of worms which is going to add to developer confusion and lead to
>>>> > > Pulsar users finding in the worst possible way that something got lost
>>>> > > because it never got published. It's a poor experience when you find it.
>>>> > > I have a real life experience where a user used async APIs (in a lambda),
>>>> > > which hummed along fine. One day much later, the cloud had a hitch, and
>>>> > > they discovered a message was not delivered.
>>>> > >
>>>> > > I am more concerned about developers discovering at the worst possible
>>>> > > time that "published-if-no_real_errors" is a concept.
>>>> > >
>>>> > > My suggestion is to make this simple for developers.
>>>> > >
>>>> > > The sync/async nature of the close() [ or any other API, for that
>>>> > > matter ] is completely orthogonal to the API semantics, and is just a
>>>> > > programmatic choice to deal with how resources 

Re: [VOTE] PIP-96 Message payload processor for Pulsar client

2021-10-03 Thread Yunze Xu
I see there are already three +1s now. Here’s the related PR:

https://github.com/apache/pulsar/pull/12088

> On Sep 30, 2021, at 12:08 PM, Yunze Xu wrote:
> 
> Hi folks,
> 
> It has been about two weeks since I opened the PIP-96 issue and the design has
> changed a lot. Thanks a lot for @eolivelli's suggestions. I think now it's
> time to start a vote.
> 
> PIP-96 issue: https://github.com/apache/pulsar/issues/12087
> 
> Thanks,
> Yunze



Re: Cutting 2.9.0 RC - final call

2021-10-04 Thread Yunze Xu
Just a reminder to review the PR for PIP 96:
https://github.com/apache/pulsar/pull/12088

I’ve also pinged the reviewers again just now.

Thanks,
Yunze

> On Oct 4, 2021, at 3:01 PM, Enrico Olivelli wrote:
> 
> There are a few approved PIPs that need review and merging.
> 
> Please let me know for any blocker.
> 
> Tomorrow, Tuesday, I will move every pending issue and PR scheduled for
> 2.9.0 to 2.10.0.
> 
> If the owner of the PR or issue really thinks that we must include that in
> 2.9.0 please let us know here on dev@
> 
> We must try to stick to the Schedule this time, and we are already late (1
> week according to GitHub)
> 
> CI is in quite good shape,
> but there are a few flaky tests, especially in the C client.
> I don't consider this a blocker, but if you have time to help in
> fixing this it will be great !
> 
> Best regards
> Enrico



Re: [VOTE] PIP-96 Message payload processor for Pulsar client

2021-10-04 Thread Yunze Xu
Done. Thanks for your reminder.

Thanks,
Yunze

> On Oct 4, 2021, at 2:16 PM, Enrico Olivelli wrote:
> 
> Yunze
> You can move the PIP to Approved and copy it to the Wiki.
> 
> Now that it is approved we can review and finally merge the PR
> 
> Enrico



Remove deprecated tlsEnabled config from broker and functions worker

2021-10-24 Thread Yunze Xu
Hi all,

I’ve opened a PR to remove `tlsEnabled` config:
https://github.com/apache/pulsar/pull/12473 
 

For the built-in admin and client used for replication or system topics, we
have the isBrokerClientTlsEnabled config to determine whether to connect to
the TLS endpoint.
`tlsEnabled` is used to check whether the broker exposes the TLS service or
web service, but it is meaningless: even if it is true, it reports an
incorrect state when `webServicePortTls` is not configured.

You can see more details in that PR (#12473).

I opened that PR because `tlsEnabled` confused me when I read the source
code. I then searched for its references and found that this config is
meaningless.

Please share your thoughts if there are any compatibility issues I've missed.

Thanks,
Yunze

Re: Creating Good Release notes

2021-12-02 Thread Yunze Xu
First, I agree with Jonathan that we should make some changes to
the original PR descriptions.

Second, classifying these PRs is also necessary, otherwise the release notes
would be meaningless. There are a lot of PRs in the Misc part of
https://github.com/apache/pulsar/pull/12425 that should be classified, and I
also left some comments in the PR.

IMO, it’s okay to ignore the PRs that only fix typos or flaky tests.
But I found that many PRs in the Misc part should also be noted.

We should not sacrifice release quality for the sake of a new release like 2.9.1.

> On Dec 2, 2021, at 7:11 PM, Enrico Olivelli wrote:
> 
> Hello community,
> 
> There is an open discussion on the Pulsar 2.9.0 release notes PR:
> https://github.com/apache/pulsar/pull/12425
> 
> I have created the block of release notes by downloading the list of PR
> using some GitHub API.
> Then I have manually classified:
> - News and Noteworthy: cool things in the Release
> - Breaking Changes: things you MUST know when you upgrade
> - Java Client, C++ Client, Python Client, Functions/Pulsar IO
> 
> The goal is to provide useful information for people who want to upgrade
> Pulsar.
> 
> My problems are:
> - PR titles are often badly written, but I don't want to fix all of them
> (typos,  tenses of verbs, formatting)
> - There are more than 300 PRs, I don't want to classify them manually, I
> just highlighted the most important from my point of view
> 
> If for 2.9.0 we still keep a list of PR, then I believe that the current
> status of the patch is good.
> 
> If we want to do it another way, then I am now asking if there is someone
> who can volunteer in fixing and classifying the list of 300 PRs, it is a
> huge task.
> 
> There is already much more work to do to get 2.9.0 completely released (and
> also PulsarAdapters) and we have to cut 2.9.1 as soon as possible due to a
> bad regression found in 2.9.0.
> 
> Thanks
> Enrico



Detect unused variables in CI

2021-12-10 Thread Yunze Xu
Hi, all

Recently I found a bug that could have been avoided if we had a CI check to
detect unused variables. See https://github.com/apache/pulsar/pull/13233. We
can see the private field `recycleHandle` was not used before this PR.

Generally, we should avoid all unused private fields except for some special
cases like `AtomicIntegerFieldUpdater`, where the warning should be suppressed
by `@SuppressWarnings("unused")`.
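
As an illustration of that special case, the following Java sketch shows a
private field that is only touched reflectively through an
`AtomicIntegerFieldUpdater`, so a static checker may flag it as unused. The
class and field names (`PendingCounter`, `pending`) are made up for this
example and are not taken from the Pulsar code base.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Illustrative class; the names are hypothetical, not from Pulsar.
final class PendingCounter {
    private static final AtomicIntegerFieldUpdater<PendingCounter> PENDING_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(PendingCounter.class, "pending");

    // Accessed only through PENDING_UPDATER (by field name), so tools may
    // report it as unused; the annotation documents that this is intended.
    @SuppressWarnings("unused")
    private volatile int pending = 0;

    int increment() {
        return PENDING_UPDATER.incrementAndGet(this);
    }

    int current() {
        return PENDING_UPDATER.get(this);
    }

    public static void main(String[] args) {
        PendingCounter counter = new PendingCounter();
        counter.increment();
        counter.increment();
        System.out.println(counter.current());
    }
}
```

The field must be `volatile int` for `newUpdater` to accept it, which is why
it cannot simply be removed even though no code reads it directly.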

I see checkstyle plugins are still not applied to all modules, so the code
quality is not well guaranteed. BTW, I found that the checkstyle plugin cannot
detect unused variables.

Does anyone know of a plugin that can do this? There is much work to do to
improve our code quality.

Thanks,
Yunze

Re: Detect unused variables in CI

2021-12-10 Thread Yunze Xu
Thanks for the suggestion. I just took a look at
https://github.com/SonarSource/sonarqube. It looks like SonarQube can only
be applied to Gradle projects?

Thanks,
Yunze

> On Dec 10, 2021, at 6:11 PM, Yufei Zhang wrote:
> 
> Hi,
> 
> My previous team used SonarQube for detecting such issues. I saw a free
> version can be used. Also there is sonarlint for local checks which i found
> useful.
> 
> Cheers
> Yufei



[Discuss] Release RPM packages with cxx11 ABI

2023-09-04 Thread Yunze Xu
Hi all,

Currently the official released pre-built binaries for Linux include:
- RPM packages for RedHat-based Linux distros
- DEB packages for Debian-based Linux distros
- APK packages for Alpine-based Linux distros

Though they are provided in different Linux distributions, they are
all compiled with GCC. However, only the RPM package is built with GCC
< 5. GCC 5 brings a breaking ABI change [1] to the std::string, which
is widely used in the Pulsar C++ client interfaces as the byte array.
In short, the impact is that if you're using GCC >= 5:
1. You have to add the -D_GLIBCXX_USE_CXX11_ABI=0 compile option to
use the pre-built RPM package.
2. If your application depends on other 3rd party libraries that are
built with GCC >= 5, the pre-built RPM package cannot be used
together with them.

So for users of GCC >= 5, the current RPM package is very
unfriendly, especially in the 2nd case. As a workaround, they have to
build the libraries themselves, while installing dependencies can be
complicated in the C++ world.

However, CentOS 7, whose EOL is 2024-06-30 [2], is still widely used.
In addition, if you're going to upgrade your GCC via devtoolset, the
installed GCC still does not have the cxx11 ABI [3]. So I don't think
it's good to drop the support for old ABIs for RedHat users.

In short, I suggest adding separate pre-built binaries with the new ABI
for RPM packages. The hierarchy will be:

rpm-arm64/aarch64/         RPMs with new ABIs
rpm-arm64/aarch64/legacy   RPMs with old ABIs
rpm-x86_64/x86_64/         RPMs with new ABIs
rpm-x86_64/x86_64/legacy   RPMs with old ABIs

See [4] for the current hierarchy.

After that, we should document the difference here [5].

[1] https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html
[2] https://wiki.centos.org/About/Product
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1546704
[4] https://archive.apache.org/dist/pulsar/pulsar-client-cpp-3.3.0/
[5] https://pulsar.apache.org/docs/3.1.x/client-libraries-cpp-setup/

Thanks,
Yunze


Re: [DISCUSS] Replace Customized Map with ConcurrentHashMap

2023-09-06 Thread Yunze Xu
I support replacing it with JDK's ConcurrentHashMap. Maintaining a
customized concurrent hash map whose algorithm is essentially the same
as the implementation in a very old version of the JDK is painful. The
listed pros, such as no boxing and linear probing, have not been shown to
be better by any benchmark. Instead, Penghui's benchmark shows the
performance is much worse.

Regarding remove or other operations in the forEach method, I believe
there is a good way to resolve it, and we should not do too many things in
the forEach callback.
Thanks,
Yunze

On Wed, Sep 6, 2023 at 2:36 PM the tumbled  wrote:
>
>
>
> On 2023/09/06 04:22:58 QQ wrote:
> > Hi Pulsar Community,
> >
> > I’d like to start a discussion about whether to replace customized util
> > classes like ConcurrentOpenHashMap with ConcurrentHashMap, as the
> > performance of ConcurrentHashMap is significantly better than that of the
> > customized utils.
> > Worse, these customized util classes cannot ensure consistency in the
> > forEach method, as PR https://github.com/apache/pulsar/pull/21110 shows,
> > which is disquieting, although it may not cause any problems.
> >
> > Thanks,
> > The Tumbled.
> >
>
> The benchmark results are provided in
> https://github.com/apache/pulsar/pull/20647#issuecomment-1607257960.
>
>
> > However, iterators are designed to be used by only one thread at a time. 
> > Bear in mind that the results of aggregate status methods including size, 
> > isEmpty, and containsValue are typically useful only when a map is not 
> > undergoing concurrent updates in other threads. Otherwise the results of 
> > these methods reflect transient states that may be adequate for monitoring 
> > or estimation purposes, but not for program control.
>
> And I notice that ConcurrentHashMap’s forEach method does not guarantee
> thread safety either.
> It seems that the only reason left for us to replace the customized utils is
> the superior performance of ConcurrentHashMap.
>
> Thanks,
> The Tumbled.


Re: [VOTE] Pulsar DotPulsar Release 3.0.0 Candidate 1

2023-09-07 Thread Yunze Xu
+1 (binding)

- Verified signature and checksums
- Build from source with dotnet 7.0.400 on Windows 11
- Run the example by adding the dotpulsar 3.0.0 dependency

But I think there are some points to improve in the documentation. I'm
new to the dotnet CLI, though I had some experience with C# a few years
ago. I created the project via Visual Studio. However, when I tried to
add the dependency via the `dotnet add` command, it failed with the
project template created by Visual Studio. Eventually I had to create
and run a project via the dotnet CLI:

```
cd project-dir
dotnet new console
dotnet add package DotPulsar --version 3.0.0
# Edit the Program.cs
dotnet build
dotnet run
```

It would be helpful to provide more guidance, or even links to existing
tutorials from MSDN.

Thanks,
Yunze

On Tue, Sep 5, 2023 at 3:51 PM Zike Yang  wrote:
>
> +1 (non-binding)
>
> - Verified signature and checksums
> - Install the nupkg file
> - Build the client from the source
> - Run examples
>
> But I find that the dotpulsar 3.0.0 has already been published to the
> nuget.org: https://www.nuget.org/packages/DotPulsar/3.0.0
>
> BR,
> Zike Yang
>
> On Sun, Sep 3, 2023 at 11:48 AM tison  wrote:
> >
> > Hi everyone,
> >
> > This is the first release candidate for Apache DotPulsar, version 3.0.0.
> >
> > It fixes the following issues:
> > https://github.com/apache/pulsar-dotpulsar/compare/2.11.1...3.0.0
> >
> > Please download the source files and review this release candidate:
> > - Download the source package, verify shasum and asc
> > - Follow the README.md to build and run the DotPulsar.
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Source files:
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-dotpulsar-3.0.0-candidate-1/
> >
> > Pulsar's KEYS file containing PGP keys we use to sign the release:
> > https://downloads.apache.org/pulsar/KEYS
> >
> > The tag to be voted upon:
> > v3.0.0
> > https://github.com/apache/pulsar-dotpulsar/releases/tag/3.0.0
> >
> > Please review and vote on the release candidate #1 for the version 3.0.0,
> > as follows:
> >
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > Best,
> > tison.


Re: [VOTE] PIP-302 Introduce refreshAsync API for TableView

2023-10-07 Thread Yunze Xu
Overall I'm +0 at the moment. I'm still wondering which issue you
really want to resolve. I left a comment:
https://github.com/apache/pulsar/pull/21271#issuecomment-1751899833.
Generally you can get the latest value unless the producer is far quicker
than the reader. However, even with the refreshAsync() method, the data
could still be updated by the time the future's callback runs.

From my perspective, this proposal only makes sense when you can
guarantee there is no more data written to the topic when you call
methods of TableView. But if so, a simple `boolean hasReachedLatest()`
method could do the trick.

Thanks,
Yunze

On Sun, Oct 8, 2023 at 2:12 PM 太上玄元道君  wrote:
>
> +1 (non-binding)
>
>
> On Wed, Sep 27, 2023 at 3:05 PM Xiangying Meng wrote:
>
> > Hi dev,
> >This thread is to start a vote for PIP-302 Add new API
> > refreshAsync for TableView.
> > Discuss thread:
> > https://lists.apache.org/thread/o085y2314o0fymvx0x8pojmgjwcwn59q
> > PIP: https://github.com/apache/pulsar/pull/21166
> >
> > BR,
> > Xiangying
> >


Re: [VOTE] PIP-307: Support subscribing multi-topics for WebSocket

2023-10-19 Thread Yunze Xu
+1 (binding)

Thanks,
Yunze

On Fri, Oct 20, 2023 at 10:59 AM mattison chao  wrote:
>
> +1(binding)
>
> Best,
> Mattison
>
> > On 19 Oct 2023, at 20:47, guo jiwei  wrote:
> >
> > Hi dev,
> >   Currently WebSocket only supports the consumption of a single topic,
> > which cannot satisfy users' consumption scenarios of multiple topics.  So
> > in order to support consumption of multiple topics or pattern topics, I
> > would like to start a vote for PIP-307.
> >
> >
> > Ref:
> > • Discuss Mail:
> > https://lists.apache.org/thread/co8396ywny161x91dffzvxlt993mo1ht
> > • PIP-307: https://github.com/apache/pulsar/pull/21390
> >
> >
> > Regards
> > Jiwei Guo (Tboy)
>


Re: [VOTE] Pulsar Client Go Release 0.11.1 Candidate 1

2023-10-20 Thread Yunze Xu
+1 (binding)

- Verified checksum and signatures
- Built the perf tool from source
- Ran the perf tool to produce and consume for some time

Thanks,
Yunze

On Mon, Sep 11, 2023 at 6:08 PM Zike Yang  wrote:
>
> Hi everyone,
> Please review and vote on the release candidate #1 for the version
> 0.11.1, as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> This is the first release candidate for Apache Pulsar Go client, version 
> 0.11.1.
>
> It fixes the following issues:
> https://github.com/apache/pulsar-client-go/compare/v0.11.0...v0.11.1-candidate-1
>
> Pulsar Client Go's KEYS file contains PGP keys we used to sign this release:
> https://dist.apache.org/repos/dist/dev/pulsar/KEYS
>
> Please download these packages and review this release candidate:
> - Review release notes: https://github.com/apache/pulsar-client-go/pull/1092
> - Download the source package (verify shasum, and asc) and follow the
> README.md to build and run the pulsar-client-go.
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Source file:
> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-go-0.11.1-candidate-1/
>
> The tag to be voted upon:
> v0.11.1-candidate-1
> https://github.com/apache/pulsar-client-go/tree/v0.11.1-candidate-1
>
> SHA-512 checksums:
> d2209c652918acee8d2c77d52a0a556af16ff7fc3e30ad96d05e01285b83a61d1a1f0d32bace184f830e2dd2e4dd20910e9ce5ae23aac4a40eb3d19885cb0182
>  apache-pulsar-client-go-0.11.1-src.tar.gz


[Discuss] Release Pulsar C++ Client 3.4.0

2023-10-22 Thread Yunze Xu
I would like to propose releasing the Pulsar C++ Client 3.4.0. It has
been about 3 months since the last release. There have been many new
features and bug fixes since then.

Besides, from my own perspective, it's better to let Python and
Node.js clients depend on this new version of the C++ client. In particular,
I have recently seen the topic name shown incorrectly many times, which
is fixed by
https://github.com/apache/pulsar-client-cpp/pull/331 and
https://github.com/apache/pulsar-client-cpp/pull/329.

So it's time to release a new version. Please let me know if you have
any PRs that need to be included in 3.4.0.

Thanks,
Yunze


Re: [VOTE] PIP-298 Consumer supports specifying consumption isolation level

2023-10-22 Thread Yunze Xu
+1 (binding)

Thanks,
Yunze

On Mon, Oct 23, 2023 at 1:08 PM PengHui Li  wrote:
>
> +1 (binding)
>
> Regards,
> Penghui
>
> On Mon, Oct 23, 2023 at 10:37 AM hzh0425  wrote:
>
> > Dave previously mentioned in the discussion thread that he will continue
> > to support this pip if the documentation is supplemented and improved. Due
> > to inability to contact him, we apply to cancel his -1 binding.
> >
> >
> >
> >  Replied Message 
> > | From | Dave Fisher |
> > | Date | 09/26/2023 04:03 |
> > | To | dev@pulsar.apache.org |
> > | Cc | |
> > | Subject | Re: [VOTE] PIP-298 Consumer supports specifying consumption
> > isolation level |
> > -1 (binding) I’m not convinced that breaking transaction isolation is the
> > proper course of action.
> >
> > Regards,
> > Dave
> >
> > > On Sep 25, 2023, at 6:46 AM, hzh0425  wrote:
> > >
> > > Hi dev,
> > > This thread is to start a vote for PIP-298 Consumer supports specifying
> > consumption isolation level
> > > Discuss thread:
> > > https://lists.apache.org/thread/8ny0qtp7m9qcdbvnfjdvpnkc4c5ssyld
> > >
> > > https://lists.apache.org/thread/2opqjof83425vry6gzszd5glqgryrv11
> > >
> > > PIP: https://github.com/apache/pulsar/pull/21114
> > >
> > > BR,
> > > hzh
> >


Re: [VOTE] Pulsar Release 3.1.1 Candidate 1

2023-10-23 Thread Yunze Xu
+1 (binding)

- Verified checksum and signatures
- Built from source with Java 17.0.7 and Maven 3.9.3 on macOS m1
- Started standalone and verified produce and consume
- Ran it with StreamNative KoP and verified produce and consume with
Kafka clients 3.5.0

Thanks,
Yunze

On Thu, Oct 19, 2023 at 5:53 PM mattison chao  wrote:
>
> +1 (binding)
>
>
> - Built from source code. (Java version: 17.0.8.1, Apache Maven 3.9.4, OS 
> name: "mac os x", version: "13.4.1", arch: "x86_64")
> - Checked binary license
> - Started Standalone
> - Ran a round of publish and consume
>
> Best,
> Mattison
>
> > On 7 Oct 2023, at 09:09, guo jiwei  wrote:
> >
> > This is the first release candidate for Apache Pulsar version 3.1.1.
> >
> > It fixes the following issues:
> > https://github.com/apache/pulsar/pulls?q=is%3Apr+is%3Amerged+label%3Arelease%2F3.1.1+label%3Acherry-picked%2Fbranch-3.1+
> >
> > *** Please download, test and vote on this release. This vote will
> > stay open for at least 72 hours ***
> >
> > Note that we are voting upon the source (tag), binaries are provided
> > for convenience.
> >
> > Source and binary files:
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-3.1.1-candidate-1/
> >
> > SHA-512 checksums:
> >
> > af79f970c8835320584faf58c85bfc5cd12261f5e366c2c16bce2f7628d769ef7374a3c0e383ff443519e484a35a23e86415e0156a0f35dd3bc1f606d2fa0421
> >
> > apache-pulsar-3.1.1-bin.tar.gz
> >
> > a43306b8a08a330c721ca96501c0c4285e5c47cfab4a037034148401afc4dbd65e505da591ef9ca76ca5acf39d6fd9c96537a08956ad6fb37f7c70d8ab747510
> >
> > apache-pulsar-3.1.1-src.tar.gz
> >
> > Maven staging repo:
> > https://repository.apache.org/content/repositories/orgapachepulsar-1243/
> >
> > The tag to verify:
> > v3.1.1-candidate-1 (80fb39085b4e49ff31f2df17b10addcca5abdccb)
> > https://github.com/apache/pulsar/releases/tag/v3.1.1-candidate-1
> >
> > Pulsar's KEYS file containing PGP keys you use to sign the release:
> > https://dist.apache.org/repos/dist/dev/pulsar/KEYS
> >
> > Docker images:
> >
> > pulsar-all images:
> > https://hub.docker.com/layers/mattison/pulsar-all/3.1.1-80fb390/images/sha256-1088b07fd2448733db1d165676b82c1278f2940cb0861c704450ef2be5c2fa1c?context=explore
> >
> > pulsar images:
> > https://hub.docker.com/layers/mattison/pulsar/3.1.1-80fb390/images/sha256-21e8bf1571e4ab559a51b3f6e524d725cffabe3c6836101f9d7ea7eb1e2bf62c?context=explore
> >
> > Please download the source package, and follow the README to build
> > and run the Pulsar standalone service.
> >
> >
> >
> > Regards
> > Jiwei Guo (Tboy)
>


Re: Question about Pulsar gRPC client(s)

2023-10-31 Thread Yunze Xu
It's a protocol handler like https://github.com/streamnative/kop that
supports the Kafka protocol. The protocol handler should not be
maintained in the core repo because only a few people maintain
the plugin.

And just like Christophe and Zike said, compared with the native
Pulsar protocol, there is actually some overhead so it cannot fully
replace the native Pulsar protocol.

Thanks,
Yunze

On Wed, Nov 1, 2023 at 10:11 AM Zike Yang  wrote:
>
> Another point is that there are many features implemented on the
> client side, including batching, chunking, DLQ, etc. This makes it
> hard to replace the existing pulsar clients completely.
>
> Zike Yang
>
>
> On Wed, Nov 1, 2023 at 4:43 AM Christophe Bornet wrote:
> >
> > Hi Kiryl,
> >
> > Thanks for mentioning pulsar-grpc.
> > Indeed, using gRPC simplifies the implementation of the networking logic
> > (keep-alive, reconnection, flow control,…). On the other hand, the Java
> > gRPC implementation makes a lot of buffer copies to cleanly separate the
> > network and app layers but that takes a toll on performance. Compared to
> > that, the broker Pulsar protocol impl is optimized to not do copies between
> > the consumer/producer endpoints and the bookkeeper client.
> > So I think we could not completely replace the Pulsar protocol with gRPC.
> > We could maybe support both but it’s a lot of work to maintain both
> > protocols. (I kind of gave up maintaining pulsar-grpc because of the amount
> > of work compared to the number of users, but if there’s interest I can
> > reconsider).
> > Another possibility would be to do a proxy instead of a low-level protocol
> > handler. A bit like the WebSocket proxy. This would be far less work to
> > maintain as it would use the Pulsar protocol to communicate with the
> > brokers. It could be done as a Proxy extension. Compared to the WS proxy,
> > this would provide type safety, discovery, and so on…
> > As for the Admin, it’s a bit the same. It would be a bunch of work to
> > support both gRPC and REST. You have some kind of type hinting with the
> > OpenAPI spec that you can use to generate client SDKs (e.g. with
> > openapi-generator).
> > I wonder what others have to say.
> >
> > Christophe
> >
> >
> > On Tue, Oct 31, 2023 at 19:57, Kiryl Valkovich wrote:
> >
> > > Hi! Am I understanding it right, that if this project
> > > https://github.com/cbornet/pulsar-grpc is merged to the apache/pulsar
> > > repo, we could easily cover non-mainstream platforms that are supported by
> > > gRPC, but don't have ready-to-use Pulsar clients?
> > >
> > > https://github.com/apache/pulsar/wiki/PIP-59:-gPRC-Protocol-Handler
> > >
> > > Similar to already supported language-agnostic client interfaces - REST
> > > and WebSocket.
> > >
> > > Actively maintained gRPC libraries I found (19, or 15 if considering JVM
> > > languages and web as duplicates):
> > > - [C# / .NET](https://grpc.io/docs/languages/csharp/)
> > > - [C++](https://grpc.io/docs/languages/cpp/)
> > > - [Dart](https://grpc.io/docs/languages/dart/)
> > > - [Go](https://grpc.io/docs/languages/go/)
> > > - [Java](https://grpc.io/docs/languages/java/)
> > > - [Kotlin](https://grpc.io/docs/languages/kotlin/)
> > > - [Node](https://grpc.io/docs/languages/node/)
> > > - [Objective-C](https://grpc.io/docs/languages/objective-c/)
> > > - [PHP](https://grpc.io/docs/languages/php/)
> > > - [Python](https://grpc.io/docs/languages/python/)
> > > - [Ruby](https://grpc.io/docs/languages/ruby/)
> > > - [OCaml](https://github.com/dialohq/ocaml-grpc)
> > > - [Haskell](https://github.com/awakesecurity/gRPC-haskell)
> > > - [Elixir](https://github.com/elixir-grpc/grpc)
> > > - [Rust](https://github.com/hyperium/tonic)
> > > - [Scala](https://scalapb.github.io/)
> > > - [Swift](https://github.com/grpc/grpc-swift)
> > > - Web client: https://github.com/grpc/grpc-web
> > > - Web client 2: https://connectrpc.com/docs/web/getting-started
> > >
> > > Actively maintained Pulsar libraries (8):
> > > - Java
> > > - C++
> > > - Python
> > > - Go
> > > - Node.js
> > > - C#
> > > - PHP
> > > - Rust
> > >
> > > Is there any reason for not merging it to the apache/pulsar?
> > >
> > > I would definitely prefer to work with a statically typed gRPC client
> > > instead of REST or WebSocket.
> > >
> > > By the way, the same can work for the Pulsar Admin API. Implement the gRPC
> > > server once in Java, and we have full-featured native statically typed
> > > (where applicable :)) Pulsar Admin clients for any platform.
> > >
> > > Best,
> > > Kiryl
> > >


[VOTE] Pulsar Client C++ Release 3.4.0 Candidate 1

2023-11-01 Thread Yunze Xu
This is the first release candidate for Apache Pulsar Client C++, version 3.4.0.

It fixes the following issues:
https://github.com/apache/pulsar-client-cpp/milestone/5?closed=1

*** Please download, test and vote on this release. This vote will stay open
for at least 72 hours ***

Note that we are voting upon the source (tag), binaries are provided for
convenience.

Source and binary files:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.4.0-candidate-1/

SHA-512 checksums:

85c2ae95bb3abb7c13326e1205c4dc9e339387d13efab242e51c044c53d322a65a28d32928ebe53202ea59f2df46a74480c4ef675e25adc64c14025dc3e314aa
 ./apache-pulsar-client-cpp-3.4.0.tar.gz

The tag to be voted upon:
v3.4.0-candidate-1 (272e1a1c78fd72d80758ac56ce400c67ce54d167)
https://github.com/apache/pulsar-client-cpp/releases/tag/v3.4.0-candidate-1

Pulsar's KEYS file containing PGP keys we use to sign the release:
https://downloads.apache.org/pulsar/KEYS

Please download the source package, and follow the README to compile and test.


Re: [VOTE] Pulsar Client C++ Release 3.4.0 Candidate 1

2023-11-05 Thread Yunze Xu
I want to include an important fix
(https://github.com/apache/pulsar/pull/21144) for the latest PR in
3.4.0 (https://github.com/apache/pulsar-client-cpp/pull/336). So I
will open another candidate after that.

Thanks,
Yunze

On Thu, Nov 2, 2023 at 9:20 PM Baodi Shi  wrote:
>
>  +1 (non-binding)
>
> -  Checked the signature and checksum
> -  Built the source
> -  Tested SampleProducer and SampleConsumer
>
> Thanks,
> Baodi Shi
>
>
> On Nov 1, 2023 at 18:18:21, Yunze Xu  wrote:
>
> > This is the first release candidate for Apache Pulsar Client C++, version
> > 3.4.0.
> >
> > It fixes the following issues:
> > https://github.com/apache/pulsar-client-cpp/milestone/5?closed=1
> >
> > *** Please download, test and vote on this release. This vote will stay
> > open
> > for at least 72 hours ***
> >
> > Note that we are voting upon the source (tag), binaries are provided for
> > convenience.
> >
> > Source and binary files:
> >
> > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.4.0-candidate-1/
> >
> > SHA-512 checksums:
> >
> >
> > 85c2ae95bb3abb7c13326e1205c4dc9e339387d13efab242e51c044c53d322a65a28d32928ebe53202ea59f2df46a74480c4ef675e25adc64c14025dc3e314aa
> > ./apache-pulsar-client-cpp-3.4.0.tar.gz
> >
> > The tag to be voted upon:
> > v3.4.0-candidate-1 (272e1a1c78fd72d80758ac56ce400c67ce54d167)
> > https://github.com/apache/pulsar-client-cpp/releases/tag/v3.4.0-candidate-1
> >
> > Pulsar's KEYS file containing PGP keys you use to sign the release:
> > https://downloads.apache.org/pulsar/KEYS
> >
> > Please download the source package, and follow the README to compile and
> > test.
> >


Re: [VOTE] PIP-300: Add custom dynamic configuration for plugins

2023-11-05 Thread Yunze Xu
+1 (binding)

Thanks,
Yunze

On Mon, Nov 6, 2023 at 10:47 AM guo jiwei  wrote:
>
> +1 (binding)
>
>
> Regards
> Jiwei Guo (Tboy)
>
>
> On Sun, Oct 8, 2023 at 2:13 PM 太上玄元道君  wrote:
>
> > +1 (non-binding)
> >
> >
> > On Tue, Sep 26, 2023 at 10:54, Zixuan Liu wrote:
> >
> > > Hi Pulsar Community,
> > >
> > > Voting for PIP-300: https://github.com/apache/pulsar/pull/21127
> > > Discussion thread:
> > > https://lists.apache.org/thread/ysnsnollgy1b6w1dsvmx1t1y2rz1tyd6
> > >
> > > Thanks,
> > > Zixuan
> > >
> >


[VOTE] Pulsar Client C++ Release 3.4.0 Candidate 2

2023-11-06 Thread Yunze Xu
This is the second release candidate for Apache Pulsar Client C++,
version 3.4.0.

It fixes the following issues:
https://github.com/apache/pulsar-client-cpp/milestone/5?closed=1

*** Please download, test and vote on this release. This vote will stay open
for at least 72 hours ***

Note that we are voting upon the source (tag), binaries are provided for
convenience.

Source and binary files:
https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.4.0-candidate-2/

SHA-512 checksums:

10517590a2e4296d6767a044e58dd32c79e404a5136cf41126f4cb2416ed0ef8fb1ad7aa7da54c37c12ed71b05a527ed08bfac9d50d3550fa5d475c1e8c00950
 apache-pulsar-client-cpp-3.4.0.tar.gz


The tag to be voted upon:
v3.4.0-candidate-2 (f337eff7caae93730ec1260810655cbb5a345e70)
https://github.com/apache/pulsar-client-cpp/releases/tag/v3.4.0-candidate-2

Pulsar's KEYS file containing PGP keys we use to sign the release:
https://downloads.apache.org/pulsar/KEYS

Please download the source package, and follow the README to compile and test.
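
For anyone checking the SHA-512 value listed above, the following is a minimal Python sketch using only the standard library's `hashlib`. The helper names `sha512_of` and `matches` and the local file path are assumptions for illustration. It does not cover the PGP signature check, which still needs `gpg` and the KEYS file.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with a published value, ignoring case and whitespace."""
    return sha512_of(path) == expected_hex.strip().lower()
```

After downloading the tarball into the current directory, `matches("apache-pulsar-client-cpp-3.4.0.tar.gz", "<checksum from this email>")` should return `True` if the download is intact.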


Re: [VOTE] Pulsar Client C++ Release 3.4.0 Candidate 2

2023-11-08 Thread Yunze Xu
> Regards,
> Penghui
>
> On Wed, Nov 8, 2023 at 10:45 PM Yubiao Feng
>  wrote:
>
> > Hi all
> >
> > Sorry, I'll send another email explaining what tests were done.
> >
> > Please ignore the previous email.
> >
> > Thanks
> > Yubiao Feng
> >
> >
> > On Wed, Nov 8, 2023 at 11:48 AM Yubiao Feng 
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > Thanks
> > > Yubiao Feng
> > >
> > > On Tue, Nov 7, 2023 at 3:03 PM Yunze Xu  wrote:
> > >
> > >> This is the second release candidate for Apache Pulsar Client C++,
> > >> version 3.4.0.
> > >>
> > >> It fixes the following issues:
> > >> https://github.com/apache/pulsar-client-cpp/milestone/5?closed=1
> > >>
> > >> *** Please download, test and vote on this release. This vote will stay
> > >> open
> > >> for at least 72 hours ***
> > >>
> > >> Note that we are voting upon the source (tag), binaries are provided for
> > >> convenience.
> > >>
> > >> Source and binary files:
> > >> https://dist.apache.org/repos/dist/dev/pulsar/pulsar-client-cpp/pulsar-client-cpp-3.4.0-candidate-2/
> > >>
> > >> SHA-512 checksums:
> > >>
> > >>
> > >> 10517590a2e4296d6767a044e58dd32c79e404a5136cf41126f4cb2416ed0ef8fb1ad7aa7da54c37c12ed71b05a527ed08bfac9d50d3550fa5d475c1e8c00950
> > >>  apache-pulsar-client-cpp-3.4.0.tar.gz
> > >>
> > >>
> > >> The tag to be voted upon:
> > >> v3.4.0-candidate-2 (f337eff7caae93730ec1260810655cbb5a345e70)
> > >> https://github.com/apache/pulsar-client-cpp/releases/tag/v3.4.0-candidate-2
> > >>
> > >> Pulsar's KEYS file containing PGP keys you use to sign the release:
> > >> https://downloads.apache.org/pulsar/KEYS
> > >>
> > >> Please download the source package, and follow the README to compile and
> > >> test.
> > >>
> > >
> >

