[GitHub] [pulsar-client-node] massakam commented on issue #221: Non-durable subscription

2022-06-23 Thread GitBox


massakam commented on issue #221:
URL: 
https://github.com/apache/pulsar-client-node/issues/221#issuecomment-1164048194

   Currently, the Node.js client does not support non-durable subscription. 
This is because there is no C API to specify the subscription mode.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [pulsar-client-node] BewareMyPower commented on a diff in pull request #219: Add null check for consumer in MessageListenerProxy

2022-06-23 Thread GitBox


BewareMyPower commented on code in PR #219:
URL: https://github.com/apache/pulsar-client-node/pull/219#discussion_r904726822


##
src/Consumer.cc:
##
@@ -63,7 +63,9 @@ void MessageListenerProxy(Napi::Env env, Napi::Function 
jsCallback, MessageListe
   Consumer *consumer = data->consumer;
   delete data;
 
-  jsCallback.Call({msg, consumer->Value()});
+  if (consumer) {

Review Comment:
   I think the controversial point is whether we should skip the null consumer.
   - Yes: might lose messages (not sure)
   - No: fail fast
   
   Adding logs doesn't solve anything. There is no logger in the Node.js client, 
and printing logs to the console is meaningless and unhelpful, as I said.
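
The two options listed above (skip the null consumer, or fail fast) can be sketched in a language-neutral way. The Python snippet below is only an illustration of the trade-off, not the addon's C++ code: `Policy`, `dispatch`, and the `dropped` list are hypothetical names, and a `None` consumer stands in for the null pointer.

```python
from enum import Enum
from typing import Callable, List, Optional


class Policy(Enum):
    SKIP = "skip"            # silently drop the callback when the consumer is gone
    FAIL_FAST = "fail_fast"  # raise immediately so the problem is visible


def dispatch(msg: str,
             consumer: Optional[object],
             callback: Callable[[str, object], None],
             policy: Policy,
             dropped: List[str]) -> bool:
    """Deliver msg to callback; return True only if the callback ran."""
    if consumer is None:
        if policy is Policy.FAIL_FAST:
            raise RuntimeError("listener invoked without a consumer")
        # SKIP: user code never sees this message, so keep a record of it.
        dropped.append(msg)
        return False
    callback(msg, consumer)
    return True
```

With `Policy.SKIP` the process keeps running but the message ends up only in `dropped`; with `Policy.FAIL_FAST` the caller gets an error at the point where the consumer was found missing.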






[GitHub] [pulsar-client-node] BewareMyPower commented on a diff in pull request #219: Add null check for consumer in MessageListenerProxy

2022-06-23 Thread GitBox


BewareMyPower commented on code in PR #219:
URL: https://github.com/apache/pulsar-client-node/pull/219#discussion_r904727898


##
src/Consumer.cc:
##
@@ -63,7 +63,9 @@ void MessageListenerProxy(Napi::Env env, Napi::Function 
jsCallback, MessageListe
   Consumer *consumer = data->consumer;
   delete data;
 
-  jsCallback.Call({msg, consumer->Value()});
+  if (consumer) {

Review Comment:
   We need more info to reproduce the segmentation fault. But for now, a safe 
solution is to avoid accessing a null consumer.






[GitHub] [pulsar-test-infra] dependabot[bot] opened a new pull request, #49: Bump got from 11.8.2 to 11.8.5 in /test-reporter

2022-06-23 Thread GitBox


dependabot[bot] opened a new pull request, #49:
URL: https://github.com/apache/pulsar-test-infra/pull/49

   Bumps [got](https://github.com/sindresorhus/got) from 11.8.2 to 11.8.5.
   
   Release notes
   Sourced from [got's releases](https://github.com/sindresorhus/got/releases).
   
   v11.8.5
   
   - Backport security fix https://github.com/sindresorhus/got/commit/861ccd9ac2237df762a9e2beed7edd88c60782dc
   - [CVE-2022-33987](https://nvd.nist.gov/vuln/detail/CVE-2022-33987)
   
   https://github.com/sindresorhus/got/compare/v11.8.4...v11.8.5
   
   v11.8.3
   
   - Bump cacheable-request dependency ([#1921](https://github-redirect.dependabot.com/sindresorhus/got/issues/1921)) 9463bb6
   - Fix HTTPError missing .code property ([#1739](https://github-redirect.dependabot.com/sindresorhus/got/issues/1739)) 0e167b8
   
   https://github.com/sindresorhus/got/compare/v11.8.2...v11.8.3
   
   Commits
   
   - [5e17bb7](https://github.com/sindresorhus/got/commit/5e17bb748c260b02e4cf716c2f4079a1c6a7481e) 11.8.5
   - [bce8ce7](https://github.com/sindresorhus/got/commit/bce8ce7d528a675bd5a8d996e110b73674e290d2) Backport 861ccd9ac2237df762a9e2beed7edd88c60782dc
   - [8ced192](https://github.com/sindresorhus/got/commit/8ced19215603f3eda25a9f5dce390f1b152fe9a3) Fix build
   - [670eb04](https://github.com/sindresorhus/got/commit/670eb04b5b01622f489277d6fb1dd04a41d3cb51) 11.8.4
   - [20f29fe](https://github.com/sindresorhus/got/commit/20f29fe3726a4ddd104f557456dbd5275685e879) Backport [#1543](https://github-redirect.dependabot.com/sindresorhus/got/issues/1543): Initialize globalResponse in case of ignored HTTPError ([#2017](https://github-redirect.dependabot.com/sindresorhus/got/issues/2017))
   - [0da732f](https://github.com/sindresorhus/got/commit/0da732f4650c398f3b2fea672f8916e6c7004c8f) 11.8.3
   - [9463bb6](https://github.com/sindresorhus/got/commit/9463bb696d4ee909970e3fc609ee40b7644e3f6c) Bump cacheable-request dependency ([#1921](https://github-redirect.dependabot.com/sindresorhus/got/issues/1921))
   - [0e167b8](https://github.com/sindresorhus/got/commit/0e167b8b9505a7e9e4a4bbe39e9baeb1f5c4a1fd) HTTPError code set to 'HTTPError' [#1711](https://github-redirect.dependabot.com/sindresorhus/got/issues/1711) ([#1739](https://github-redirect.dependabot.com/sindresorhus/got/issues/1739))
   - See full diff in [compare view](https://github.com/sindresorhus/got/compare/v11.8.2...v11.8.5)
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=got&package-manager=npm_and_yarn&previous-version=11.8.2&new-version=11.8.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   - `@dependabot use these labels` will set the current labels as the default 
for future PRs for this repo and language
   - `@dependabot use these reviewers` will set the current reviewers as the 
default for future PRs for this repo and language
   - `@dependabot use these assignees` will set the current assignees as the 
default for future PRs for this repo and language
   - `@dependabot use this milestone` will set the current milestone as the 
default for future PRs for this repo and language
 

[GitHub] [pulsar-test-infra] dependabot[bot] opened a new pull request, #50: Bump got from 11.8.3 to 11.8.5 in /http-cache-action

2022-06-23 Thread GitBox


dependabot[bot] opened a new pull request, #50:
URL: https://github.com/apache/pulsar-test-infra/pull/50

   Bumps [got](https://github.com/sindresorhus/got) from 11.8.3 to 11.8.5.
   
   Release notes
   Sourced from [got's releases](https://github.com/sindresorhus/got/releases).
   
   v11.8.5
   
   - Backport security fix https://github.com/sindresorhus/got/commit/861ccd9ac2237df762a9e2beed7edd88c60782dc
   - [CVE-2022-33987](https://nvd.nist.gov/vuln/detail/CVE-2022-33987)
   
   https://github.com/sindresorhus/got/compare/v11.8.4...v11.8.5
   
   Commits
   
   - [5e17bb7](https://github.com/sindresorhus/got/commit/5e17bb748c260b02e4cf716c2f4079a1c6a7481e) 11.8.5
   - [bce8ce7](https://github.com/sindresorhus/got/commit/bce8ce7d528a675bd5a8d996e110b73674e290d2) Backport 861ccd9ac2237df762a9e2beed7edd88c60782dc
   - [8ced192](https://github.com/sindresorhus/got/commit/8ced19215603f3eda25a9f5dce390f1b152fe9a3) Fix build
   - [670eb04](https://github.com/sindresorhus/got/commit/670eb04b5b01622f489277d6fb1dd04a41d3cb51) 11.8.4
   - [20f29fe](https://github.com/sindresorhus/got/commit/20f29fe3726a4ddd104f557456dbd5275685e879) Backport [#1543](https://github-redirect.dependabot.com/sindresorhus/got/issues/1543): Initialize globalResponse in case of ignored HTTPError ([#2017](https://github-redirect.dependabot.com/sindresorhus/got/issues/2017))
   - See full diff in [compare view](https://github.com/sindresorhus/got/compare/v11.8.3...v11.8.5)
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=got&package-manager=npm_and_yarn&previous-version=11.8.3&new-version=11.8.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   
   You can disable automated security fix PRs for this repo from the [Security 
Alerts page](https://github.com/apache/pulsar-test-infra/network/alerts).
   
   





Re: [VOTE] PIP-160 Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

2022-06-23 Thread mattison chao
+1 (non-binding)

Best,
Mattison

On Thu, 23 Jun 2022 at 09:49, Jia Zhai  wrote:

> +1
>
> On Mon, Jun 20, 2022 at 9:31 AM PengHui Li  wrote:
>
> > +1 (binding)
> >
> > Left a minor comment here about the field name
> > https://github.com/apache/pulsar/issues/15370#issuecomment-1159870170
> > Please check.
> >
> > Thanks,
> > Penghui
> >
> > On Mon, Jun 20, 2022 at 9:21 AM Yubiao Feng
> >  wrote:
> >
> > > Hi Pulsar Community
> > >
> > > I would like to start a VOTE on "Make transactions work more efficiently
> > > by aggregation operation for transaction log and pending ack store"
> > > (PIP-160).
> > >
> > > The proposal can be read at
> > > https://github.com/apache/pulsar/issues/15370
> > >
> > > and the discussion thread is available at
> > > https://lists.apache.org/thread/lsmn0hg9np97qrzzh2wovxq1yhxj9qhy
> > >
> > > Voting will stay open for at least 48h.
> > >
> > > Thanks
> > > Yubiao Feng
> > >
> >
>


[GitHub] [pulsar-client-node] nearzxide10 commented on issue #161: Looks like the pulsar-client-node can not run with the current pulsar master code

2022-06-23 Thread GitBox


nearzxide10 commented on issue #161:
URL: 
https://github.com/apache/pulsar-client-node/issues/161#issuecomment-1165063296

   > I have the same problem. Unable to install from linux docker. How can I solve this?
   > 
   > I got the same error whether I installed apache-pulsar-client-dev.deb or not.
   
   I have the same problem too, but with this environment:
   - docker image: node:16.15.1 (debian)
   - platform: linux/arm64
   
   My Dockerfile:
   ```Dockerfile
   FROM node:16.15.1
   WORKDIR /srv
   ADD . .
   RUN apt update && apt install -y zip wget curl gnupg g++ make
   ARG PULSAR_CPP_CLIENT_VERSION=2.9.1
   RUN wget --no-check-certificate --user-agent=Mozilla -O apache-pulsar-client.deb \
       "https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_CPP_CLIENT_VERSION}/DEB/apache-pulsar-client.deb" && \
       wget --no-check-certificate --user-agent=Mozilla -O apache-pulsar-client-dev.deb \
       "https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_CPP_CLIENT_VERSION}/DEB/apache-pulsar-client-dev.deb" && \
       dpkg --force-architecture -i apache-pulsar-client*.deb
   RUN npm install --save pulsar-client
   RUN npm install
   CMD [ "node", "index.js" ]
   ```
   error message 
   ```
   #11 15.28 npm notice
   #11 15.28 npm ERR! code 1
   #11 15.28 npm ERR! path /srv/node_modules/pulsar-client
   #11 15.28 npm ERR! command failed
   #11 15.28 npm ERR! command sh -c node-pre-gyp install --fallback-to-build
   #11 15.28 npm ERR! make: Entering directory 
'/srv/node_modules/pulsar-client/build'
   #11 15.28 npm ERR! CC(target) 
Release/obj.target/nothing/../node-addon-api/nothing.o
   #11 15.28 npm ERR! AR(target) Release/obj.target/../node-addon-api/nothing.a
   #11 15.28 npm ERR! COPY Release/nothing.a
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/addon.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/Message.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/MessageId.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/Authentication.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/Client.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/Producer.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/ProducerConfig.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/Consumer.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/ConsumerConfig.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/Reader.o
   #11 15.28 npm ERR! CXX(target) Release/obj.target/Pulsar/src/ReaderConfig.o
   #11 15.28 npm ERR! SOLINK_MODULE(target) Release/obj.target/Pulsar.node
   #11 15.28 npm ERR! make: Leaving directory 
'/srv/node_modules/pulsar-client/build'
   #11 15.28 npm ERR! Failed to execute '/usr/local/bin/node 
/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js build 
--fallback-to-build 
--module=/srv/node_modules/pulsar-client/build/Release/libpulsar.node 
--module_name=libpulsar 
--module_path=/srv/node_modules/pulsar-client/build/Release --napi_version=8 
--node_abi_napi=napi --napi_build_version=0 --node_napi_label=node-v93' (1)
   #11 15.28 npm ERR! node-pre-gyp info it worked if it ends with ok
   #11 15.28 npm ERR! node-pre-gyp info using node-pre-gyp@1.0.9
   #11 15.28 npm ERR! node-pre-gyp info using node@16.15.1 | linux | arm64
   #11 15.28 npm ERR! node-pre-gyp info check checked for 
"/srv/node_modules/pulsar-client/build/Release/libpulsar.node" (not found)
   #11 15.28 npm ERR! node-pre-gyp http GET 
https://pulsar.apache.org/docs/en/client-libraries-cpp/libpulsar-v1.6.2-node-v93-linux-arm64.tar.gz
   #11 15.28 npm ERR! node-pre-gyp ERR! install response status 404 Not Found 
on 
https://pulsar.apache.org/docs/en/client-libraries-cpp/libpulsar-v1.6.2-node-v93-linux-arm64.tar.gz
   #11 15.28 npm ERR! node-pre-gyp WARN Pre-built binaries not installable for 
pulsar-client@1.6.2 and node@16.15.1 (node-v93 ABI, glibc) (falling back to 
source compile with node-gyp)
   #11 15.28 npm ERR! node-pre-gyp WARN Hit error response status 404 Not Found 
on 
https://pulsar.apache.org/docs/en/client-libraries-cpp/libpulsar-v1.6.2-node-v93-linux-arm64.tar.gz
   #11 15.28 npm ERR! gyp info it worked if it ends with ok
   #11 15.28 npm ERR! gyp info using node-gyp@9.0.0
   #11 15.28 npm ERR! gyp info using node@16.15.1 | linux | arm64
   #11 15.28 npm ERR! gyp info ok
   #11 15.28 npm ERR! gyp info it worked if it ends with ok
   #11 15.28 npm ERR! gyp info using node-gyp@9.0.0
   #11 15.28 npm ERR! gyp info using node@16.15.1 | linux | arm64
   #11 15.28 npm ERR! gyp info find Python using Python version 3.7.3 found at 
"/usr/bin/python3"
   #11 15.28 npm ERR! gyp http GET 
https://nodejs.org/download/release/v16.15.1/node-v16.15.1-headers.tar.gz
   #11 15.28 npm ERR! gyp http 200 
https://nodejs.org/download/release/v16.15.1/node-v16.15.1-headers.tar.gz
   #11 15.28 npm ERR! gyp http GET 
https://nodejs.org/download/release/v16.15.1/S

[GitHub] [pulsar-manager] NiuBlibing opened a new issue, #467: Error when generate token

2022-06-23 Thread GitBox


NiuBlibing opened a new issue, #467:
URL: https://github.com/apache/pulsar-manager/issues/467

   
https://github.com/apache/pulsar-manager/blob/0f314e13279a1514d743a4f7062617f42f72e8a7/build.gradle#L155
   
![image](https://user-images.githubusercontent.com/88433283/175449499-dee8e99a-c3f6-48df-9d18-93357f3e9360.png)
   It fails with the error message shown above, and it works when I remove the `constraints`.





Re: [DISCUSS] PIP-172: Introduce the HEALTH_CHECK command in the binary protocol

2022-06-23 Thread Michael Marshall
Thanks for your replies Cong Zhao.

I think the current PIP might need some clarification on how errors
are handled. For example, if a single broker fails to respond because
it was being restarted, how would the client handle that kind of
failure with this feature?

> This is a good definition of cluster health, but we can't check all topics; 
> that would add a lot of load on the client and broker.

I wasn't suggesting that the client would need to ask the broker for
each of the producers/consumers, but rather that the client would
monitor producers/consumers locally and make decisions about cluster
health. For example, if a producer cannot connect to its target topic
after some amount of time or some number of retries, or if a producer
can connect but cannot publish a message successfully within some
amount of time, then the client could consider the cluster to be
unhealthy.
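
A minimal sketch of that local monitoring idea, under stated assumptions: the client reports each connect or publish outcome to a small tracker, and the cluster is treated as unhealthy once any producer or consumer has been failing for too long or too many times in a row. The class name, method names, and default thresholds are hypothetical and are not part of PIP-172 or of any Pulsar client API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FailureStreak:
    started_at: float  # time of the first failure in the current streak
    count: int = 1     # consecutive failures since the last success


class LocalHealthMonitor:
    """Tracks failure streaks per producer/consumer (hypothetical helper)."""

    def __init__(self, max_failures: int = 5, max_failing_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_failures = max_failures
        self._max_failing_s = max_failing_s
        self._clock = clock
        self._streaks: Dict[str, FailureStreak] = {}

    def record_success(self, name: str) -> None:
        # Any success ends the streak for that producer/consumer.
        self._streaks.pop(name, None)

    def record_failure(self, name: str) -> None:
        streak = self._streaks.get(name)
        if streak is None:
            self._streaks[name] = FailureStreak(started_at=self._clock())
        else:
            streak.count += 1

    def cluster_healthy(self) -> bool:
        # Healthy only if no tracked producer/consumer exceeded a threshold.
        now = self._clock()
        return not any(
            s.count >= self._max_failures
            or now - s.started_at >= self._max_failing_s
            for s in self._streaks.values()
        )
```

Because only failure streaks are tracked, an idle producer that is simply not sending anything does not count against the cluster.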

> This proposal mainly provides a means to check whether there is an available 
> topic in the cluster, and I think this is meaningful in most cases.

The client will discover if one of its targeted topics is unavailable,
so instead of monitoring the broker's health check topic, I think the
client should monitor/failover when a targeted topic is "unavailable"
for some configured length of time.

I support making the auto-failover logic more robust, but I don't
think the broker health check is the right signal to use for overall
cluster health. In my view, the broker's health check is meant to
signal to orchestrators (like Kubernetes) when a broker ought to be
restarted.

Thanks,
Michael


On Thu, Jun 23, 2022 at 12:35 AM Cong Zhao  wrote:
>
> Hi Michael,
>
> Thanks for your feedback.
>
> > I define a client's primary cluster as "healthy" when it is "healthy"
> for all of its producers and consumers. I define a healthy producer as
> one that can connect to a topic and publish messages within certain
> latency and throughput thresholds (configured by the user), and I
> define a healthy consumer as one that can connect to a topic and
> consume messages when there are messages to be consumed (possibly
> within a certain latency?).
>
> This is a good definition of cluster health, but we can't check all topics; 
> that would add a lot of load on the client and broker.
>
> > By the above definitions, I don't think the broker's health check will
> give us the right notion of "healthy" because that health check
> monitors producing/consuming to/from the health check topic, not the
> client's target topics. One primary difference is that a health check
> topic could have a different persistence policy, which means the
> client could incorrectly classify the broker as healthy when there
> aren't enough available bookies for a producer's target topic.
>
> This proposal mainly provides a means to check whether there is an available 
> topic in the cluster, and I think this is meaningful in most cases.
>
> I think if the client implementation doesn't meet the user's needs, they can 
> also override the `healthCheck` method based on the `HEALTH_CHECK` command.
>
> Thanks,
> Cong Zhao
>
> On 2022/06/22 19:06:25 Michael Marshall wrote:
> > I'd like to clarify the motivation for this PIP. My understanding is
> > that the primary motivation is to give clients a robust way to
> > classify a cluster as "healthy". The initial beneficiary of this
> > feature is the auto failover use case. I think the feature makes
> > sense, but before using the broker's concept of "healthy" as defined
> > in the broker health check, I think we should define what constitutes
> > a "healthy cluster" from the client's perspective.
> >
> > I define a client's primary cluster as "healthy" when it is "healthy"
> > for all of its producers and consumers. I define a healthy producer as
> > one that can connect to a topic and publish messages within certain
> > latency and throughput thresholds (configured by the user), and I
> > define a healthy consumer as one that can connect to a topic and
> > consume messages when there are messages to be consumed (possibly
> > within a certain latency?).
> >
> > By the above definitions, I don't think the broker's health check will
> > give us the right notion of "healthy" because that health check
> > monitors producing/consuming to/from the health check topic, not the
> > client's target topics. One primary difference is that a health check
> > topic could have a different persistence policy, which means the
> > client could incorrectly classify the broker as healthy when there
> > aren't enough available bookies for a producer's target topic.
> >
> > The broker health check also includes checks that we probably don't
> > want to use to classify whole clusters as "unhealthy". For example, if
> > the broker is deadlocked, it will be considered unhealthy. In
> > Kubernetes, that broker will be restarted "soon", and the topics will
> > be scheduled to another broker. I probably wouldn't consider a
> > whole cluster as "unhealthy" because a single broker was

Pulsar Community Meeting Notes 2022/06/23, (8:30 AM PST)

2022-06-23 Thread Michael Marshall
Hi Pulsar Community,

Here are the meeting notes from today's community meeting. Thanks to
all who participated!

Disclaimer: If something is misattributed or misrepresented, please
send a correction to this list.

Source google doc:
https://docs.google.com/document/d/19dXkVXeU2q_nHmkG8zURjKnYlvD96TbKf5KjYyASsOE

Thanks,
Michael

2022/06/23, (8:30 AM PST)
-   Attendees:
-   Matteo Merli
-   Christophe Bornet
-   Ayman Khalil
-   Andrey Yegorov
-   Heesung Sohn
-   Michael Marshall
-   Rajan Dhabalia

-   Discussions/PIPs/PRs (Generally discussed in order they appear)

-   Michael: PIP 172 https://github.com/apache/pulsar/issues/15859.
Looking for feedback on this PIP and to raise awareness of the
feature.
    -   Matteo: sees two modes for auto failover. The first is a
        human-controlled model that switches cluster URLs for the
        clients. This is not ideal since it is human driven and it
        requires an extra service. An advantage is that you don’t get
        spurious failovers where clients move over in response to
        certain failures. In the design of this PIP, there is a risk
        that clients hold different views of which cluster is healthy
        and which is not.
    -   Matteo: is okay with automatic failover, but the current
        design is very crude. You could add more comprehensive checks,
        such as not being able to connect or not being able to produce
        messages for some length of time. We can expand the logic in
        the auto failover case by adding checks for the number of
        errors over a window and configurable thresholds.
    -   (There was additional discussion about the broker health check
        and what it does.)
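
The "number of errors over a window" check mentioned above could look roughly like the following sketch. It is an illustration only: `ErrorWindow`, its parameters, and `should_fail_over` are hypothetical names, not part of the PIP or of any Pulsar client.

```python
import time
from collections import deque
from typing import Callable, Deque


class ErrorWindow:
    """Counts errors seen during the last window_s seconds (hypothetical helper)."""

    def __init__(self, window_s: float = 60.0, max_errors: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window_s = window_s
        self._max_errors = max_errors
        self._clock = clock
        self._errors: Deque[float] = deque()  # timestamps, oldest first

    def record_error(self) -> None:
        self._errors.append(self._clock())

    def should_fail_over(self) -> bool:
        # Drop timestamps that have aged out of the window, then compare
        # what is left against the configurable threshold.
        now = self._clock()
        while self._errors and now - self._errors[0] > self._window_s:
            self._errors.popleft()
        return len(self._errors) >= self._max_errors
```

A short burst of errors that ages out of the window no longer counts toward the threshold, which is one way to avoid the spurious failovers raised in the discussion.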

-   Rajan: https://github.com/apache/pulsar/pull/15223 - PIP for
syncing pulsar policies across multiple clouds - there is no way to
synchronize the metadata right now. Helps when there isn’t a global
zookeeper, and allows for geo replicated clusters to have unique
metadata stores. Rajan continued to describe the PIP, see
https://github.com/apache/pulsar/issues/13728 for more details.
    -   Matteo: your use case may be multiple cloud, but the actual
        feature is more general. It is having global configuration
        without global zookeeper. One issue is two writes: what if
        there is an error in publishing the write to zookeeper and to
        bookkeeper. Which broker does the publishing and how do we
        make sure we don’t skip any updates?
    -   Rajan: if it fails to publish the message, the solution is
        eventually consistent. There is a publish snapshot solution,
        that would ensure dropped messages would eventually get sent
        as snapshots. There is a producer to send updates to the other
        clusters and then a consumer to get the update. Whichever
        consumer wins will get the update.
    -   Matteo: you can have a race where two consumers believe they
        have the message. The only way to do this is with the
        exclusive producer because it adds the producer fencing.
    -   Rajan: correct, but the event is idempotent, so the consumer
        can handle duplicates. Another issue to cover is how to handle
        many clusters and the relationships for which clusters
        replicate which policies. This proposal allows for using local
        global zookeeper for each region.
    -   Matteo: the snapshot is going to be complicated because if you
        take the snapshot and apply it before other updates, you’ll
        get conflicts and could lose data. You are treating the
        replication channel and the writing store as two separate
        things, but correcting them afterwards will be very hard. If
        you switch the logic, and first write to the topic as your
        write ahead log, then that is your store. This could be a
        wrapper on the metadata store. Before writing to zookeeper,
        publish on the topic, then when it is persisted, it can get
        applied to zookeeper. This handles crashing and restarts,
        which also gives you a clear replay model. It doesn’t account
        for inconsistency between clusters.
    -   Rajan: you’re suggesting we handle failure more gracefully?
    -   Matteo: yes, use a WAL to handle failure.
    -   Rajan: if the topic is down, you cannot write then.
    -   Matteo: correct.
    -   Rajan: there is still a need for a synchronizer though.
    -   Matteo: you can just enable compaction on the topic. That is
        your snapshot.
    -   Rajan: compaction comes with its own cost, though.
    -   Matteo: the cost of compaction is the cost of the snapshot.
        The compaction can run in the broker or can run manually.
    -   Rajan: when running a big scale system, compaction has its own
        issues. If we have any issues with storage, you lose data. I
        have lost a ledger, but not a zookeeper snapshot.
    -   Matteo: if you are taking a snapshot and are publishing it on
        a topic, are not you still relying on a ledger?
    -   (Some back and forth about requirements and compaction).
    -   Matteo: compaction is run on a very large scale in tens of
        clusters that I know of. I agree that there were many issues
        in compaction, but most of them should be solved. Your
        durability guarantees are tied to the ledger replication.
    -   Rajan: correct. Let me think about this. My concerns with
        compaction are scalability on server side and durability.
    -   Matteo: compaction should be good because
there are limit