2020-02-26 09:13:03 UTC - Eugen: Another way to think about this would be to 
consider it a synchronous topic compaction with only one key ( 
<https://github.com/streamnative/pulsar/issues/650> )
----
2020-02-26 09:15:07 UTC - Eugen: But as it's only for one key, it could (I hope) be done without the overhead of a compaction topic.
----
2020-02-26 16:13:10 UTC - Alexandre DUVAL: Hi all,

About <https://github.com/apache/pulsar/pull/6428>: in AuthorizationProvider, some methods return CompletableFuture<Void> and others CompletableFuture<Boolean>; I personally prefer the latter (it makes more sense).
For example:
```public CompletableFuture<Void> grantPermissionAsync(TopicName topicName, Set<AuthAction> actions, String role, String authDataJson)```
can't be joined with:
```public CompletableFuture<Boolean> canManageTopic(TopicName topicName, TopicOperation operation, String role, AuthenticationDataSource authData)
// with TopicOperation.grantPermissions```
The main problem is breaking changes, so must I have both a CompletableFuture<Boolean> and a CompletableFuture<Void> impl?
WDYT?
----
2020-02-26 16:21:33 UTC - Alexandre DUVAL: I would like to use canManageTopic & canManageNamespace in the already existing methods in AuthorizationProvider, but maybe I should just keep both the old grantPermissionAsync and the new canManageTopic, use the new one where grantPermissionAsync is used, and deprecate the old one?
----
2020-02-26 16:43:00 UTC - Addison Higham: :disappointed: :disappointed: another case of one of my ZK nodes not being consistent with the other 2... just missing a few keys that the other 2 have
----
2020-02-26 16:55:35 UTC - chris: If you provide a default implementation in the 
interface is it still a breaking change? Also, the naming of function 
authorization method makes more sense to me. These names would make more sense 
to me allowTopicOperation allowNamespaceOperation. To start using it could have 
a default implementation of allowOperation map to the old method in the 
interface. Then you should be able to replaces instances of the old methods 
with the new ones.
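As an illustration of that suggestion (a hypothetical sketch, not the actual Pulsar interface: the method name allowTopicOperation, the operation-to-method mapping, and the stand-in TopicOperation enum are all assumptions):
```java
import java.util.concurrent.CompletableFuture;

import org.apache.pulsar.broker.authentication.AuthenticationDataSource;
import org.apache.pulsar.common.naming.TopicName;

// Fragment of a hypothetical operation-based AuthorizationProvider: the new
// check gets a default implementation that delegates to the existing per-action
// methods, so current provider implementations keep compiling unchanged.
public interface AuthorizationProvider {

    // stand-in for the TopicOperation enum proposed in the PR
    enum TopicOperation { PRODUCE, CONSUME, GRANT_PERMISSIONS }

    // existing methods (unchanged)
    CompletableFuture<Boolean> canProduceAsync(TopicName topicName, String role,
                                               AuthenticationDataSource authData);

    CompletableFuture<Boolean> canConsumeAsync(TopicName topicName, String role,
                                               AuthenticationDataSource authData,
                                               String subscription);

    // hypothetical new method with a default bridge to the old ones
    default CompletableFuture<Boolean> allowTopicOperation(TopicName topicName,
                                                           TopicOperation operation,
                                                           String role,
                                                           AuthenticationDataSource authData) {
        switch (operation) {
            case PRODUCE:
                return canProduceAsync(topicName, role, authData);
            case CONSUME:
                return canConsumeAsync(topicName, role, authData, null);
            // ... map the remaining operations onto the corresponding old methods
            default:
                return CompletableFuture.completedFuture(false);
        }
    }
}
```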
----
2020-02-26 16:56:13 UTC - chris: and this could keep working as before
----
2020-02-26 17:00:47 UTC - Alexandre DUVAL: good ideas!
----
2020-02-26 17:26:59 UTC - John Duffie: @John Duffie has joined the channel
----
2020-02-26 19:11:45 UTC - Joe Francis: The general solution to this would be a db with change notifications. You can simulate one. Use a state topic and an update topic. Run a function that keeps updating the state topic from the update topic. Any time the state gets updated, the function acks the previous message in the state topic, so the last message will be retained until the next state is published. And after updating the state, the function acks the update message. So the last message in the state topic plus the unacked messages in the update topic is the current state.
+1 : Devin G. Bost
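As a rough illustration of that pattern (not from the original message), a minimal sketch using the plain Java client; the service URL, topic names, schema, and subscription names are assumptions, and a Pulsar Function could play the same role as this loop:
```java
import org.apache.pulsar.client.api.*;

public class StateUpdater {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")    // assumed broker URL
                .build();

        // Consumer on the update topic: unacked messages here are "not yet folded into state".
        Consumer<String> updates = client.newConsumer(Schema.STRING)
                .topic("update-topic")                     // illustrative name
                .subscriptionName("state-updater")
                .subscribe();

        // Producer for the state topic.
        Producer<String> state = client.newProducer(Schema.STRING)
                .topic("state-topic")                      // illustrative name
                .create();

        // Subscription on the state topic, used only to ack (and thus release) old state messages.
        Consumer<String> stateSub = client.newConsumer(Schema.STRING)
                .topic("state-topic")
                .subscriptionName("state-retention")
                .subscribe();

        MessageId previousState = null;
        while (true) {
            Message<String> update = updates.receive();
            // 1. publish the new state derived from the update
            MessageId newState = state.send(applyUpdate(update.getValue()));
            // 2. ack the previous state message so only the latest one stays unacked/retained
            if (previousState != null) {
                stateSub.acknowledge(previousState);
            }
            previousState = newState;
            // 3. finally ack the update message
            updates.acknowledge(update);
        }
    }

    private static String applyUpdate(String update) {
        // placeholder for whatever fold/merge logic produces the new state
        return update;
    }
}
```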
----
2020-02-26 20:05:34 UTC - Eugen: Can't this be handled in the broker, rather than requiring the user to create another topic and a function? After all, we already have
1. `MessageId.earliest`: getting all existing messages + all new messages
2. `MessageId.latest`: getting only new messages
So what I'm trying to do is add this case:
3. `MessageId.latestInclusive`: getting only the latest existing message + all new messages
So why can't the broker do what the consumer can, namely do a `getLatestMessageId()` and have the reader receive all messages from that id? In other words, if 1. works by passing msgId.earliest, and there is a way to get the latest message id, why would it be so hard to make msgId.latestInclusive work in the same way?
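For what it's worth, a rough client-side approximation of that behaviour (a sketch, not the proposed broker feature; the service URL, topic, and throwaway subscription name are assumptions, and it relies on `Consumer.getLastMessageId()` plus the reader's `startMessageIdInclusive()` option):
```java
import org.apache.pulsar.client.api.*;

public class LatestInclusiveReader {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")          // assumed broker URL
                .build();

        String topic = "persistent://public/default/my-topic";  // illustrative topic

        // Look up the id of the last published message via a short-lived consumer.
        Consumer<byte[]> probe = client.newConsumer()
                .topic(topic)
                .subscriptionName("latest-id-probe")             // illustrative, throwaway subscription
                .subscribe();
        MessageId lastId = probe.getLastMessageId();
        probe.unsubscribe();                                     // don't leave a backlog behind

        // Start a reader at that id, inclusively, so the latest existing message
        // is delivered first and all newer messages follow.
        Reader<byte[]> reader = client.newReader()
                .topic(topic)
                .startMessageId(lastId)
                .startMessageIdInclusive()
                .create();

        while (true) {
            Message<byte[]> msg = reader.readNext();
            System.out.println(new String(msg.getData()));       // process msg
        }
    }
}
```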
----
2020-02-26 20:28:00 UTC - Ravi Shah: How do I pass the Pulsar backlog message metric to the HPA to scale Kubernetes pods?
----
2020-02-26 20:29:09 UTC - Ravi Shah: Is there any Pulsar Prometheus adapter for custom metrics which I can pass to the HPA?
----
2020-02-26 21:09:18 UTC - Luke Lu: @Luke Lu has joined the channel
----
2020-02-26 22:17:35 UTC - Joe Francis: That is a good question, and I can provide my opinion, which is a bit philosophical. It comes down to how systems ought to be designed. Databases and message queues are entirely different systems, designed to do very different work efficiently. The tradeoffs made when implementing such systems, on more or less the same physical resources, are totally different. One is designed for random writes/reads and search, the other is designed for sequential reads and writes, and WORM.

I run Pulsar at a very large scale (millions of topics). I also run a very large distributed db (petabytes). Day in, day out, I have users who want to use my db as a Q and the Q (Pulsar) as a db. Obviously both are possible. But both cannot be done efficiently at scale. Pulsar is the way it is because when it was designed, it was decided that it would not attempt to do certain things, like transactions, scheduled delivery, nack, compaction etc., which are all flow killers. And that was not without reason, because all the people involved in building Pulsar had attempted to build something similar with AMQ as the base, and understood the problems introduced by attempting to be everything to everyone. Pulsar scales and is super efficient at what it does, because of what it chose not to do.

Functions are a nice abstraction, similar to the dispatch vs storage abstraction. Although that abstraction is more observed in the breach now.
----
2020-02-26 22:27:55 UTC - Eugen: I see your general point about db vs queue. But you seem to be saying that what I'm requesting cannot be made efficient or scalable; would `latestInclusive` be any less efficient or scalable than the existing `earliest`?
----
2020-02-26 22:59:40 UTC - Joe Francis: That assumes that I am in favor of earliest :slightly_smiling_face: But assuming I am, earliest is free; latestInclusive requires work in the dispatch path.
----
2020-02-26 23:11:11 UTC - Eugen: Thanks joef for your opinion, I appreciate it! It's true, we are trying to do something that is different from the core functionality of Pulsar. From a user's perspective, our problem is imo much more a streaming (but not queue) one than a database one (databases return the latest value for something, but in general do not stream real-time events). One of these days I hope to find the time to look into this and see how much of a problem it is in terms of implementation and efficiency. But regardless of anyone being in favor of features like earliest, they are very handy and useful to some, and everyone else can just ignore them and won't be impacted...
----
2020-02-26 23:35:56 UTC - Eugen: Fwiw, one of Pulsar's core features is I/O separation, so that historical data can be read without impacting the throughput / latency of real-time streams
----
2020-02-27 00:56:30 UTC - Alexandre DUVAL: Why does the subscription argument exist on canConsume in authz?
----
2020-02-27 01:02:01 UTC - Eric Simon: Can someone explain to me why changes are being released into 2.5.0 after the release? This is incredibly frustrating.
----
2020-02-27 01:48:06 UTC - Devin G. Bost: FYI @Penghui Li
----
2020-02-27 04:20:22 UTC - Sijie Guo: @Addison Higham: it is probably that the zookeeper is lagging behind.
----
2020-02-27 04:21:02 UTC - Sijie Guo: 2.5.0 is a tag.
----
2020-02-27 04:21:15 UTC - Sijie Guo: I don’t think there are new changes released to 2.5.0
----
2020-02-27 04:21:25 UTC - Sijie Guo: there are changes released to branch-2.5
----
2020-02-27 04:21:37 UTC - Addison Higham: this was just the config store that 
only has a few thousand nodes :confused:
----
2020-02-27 04:21:38 UTC - Sijie Guo: the changes are used for cutting 2.5.1
----
2020-02-27 04:21:47 UTC - Addison Higham: but I have it on my list to go back through metrics
----
2020-02-27 04:21:54 UTC - Sijie Guo: Can you explain what problems you are seeing?
+1 : Devin G. Bost
----
2020-02-27 04:22:29 UTC - Sijie Guo: a few thousand nodes or a few thousand znodes?
----
2020-02-27 04:24:05 UTC - Addison Higham: znodes :slightly_smiling_face:
----
2020-02-27 04:26:28 UTC - Addison Higham: ~6k znodes, a whole snapshot 
(including ephemeral data), the data dir of the master shows it is only about 
~4mb of data (it is just the config store so relatively low update rate as 
well...). After I nuked the storage and did a resync, it took like 10 seconds
----
2020-02-27 04:26:33 UTC - Addison Higham: to resync
----
2020-02-27 04:27:35 UTC - Addison Higham: and the state persisted for a few hours. My best guess is that since I run this on k8s, there is perhaps something about IP addresses changing rapidly that can cause issues
----
2020-02-27 04:29:38 UTC - Addison Higham: sadly I didn't have time to 
investigate more :confused:
----
2020-02-27 04:31:24 UTC - Sijie Guo: @Addison Higham: there is no real guarantee about when an update is propagated to all participants / followers. So I would suggest not using the global configuration store; instead, just write a task that does a multi-write to create the namespace in the clusters that this namespace is assigned to.
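A rough sketch of that kind of multi-write task with the Java admin client (the admin URLs, tenant/namespace name, and cluster names are illustrative assumptions; error handling for already-existing namespaces is omitted):
```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.pulsar.client.admin.PulsarAdmin;

public class NamespaceMultiWrite {
    public static void main(String[] args) throws Exception {
        String namespace = "my-tenant/my-namespace";                       // illustrative
        Set<String> assignedClusters = new HashSet<>(
                Arrays.asList("us-east", "us-west"));                      // illustrative cluster names
        List<String> adminUrls = Arrays.asList(                            // illustrative per-cluster admin endpoints
                "http://pulsar-us-east:8080",
                "http://pulsar-us-west:8080");

        // Write the namespace metadata directly into each cluster instead of
        // relying on the global configuration store to propagate it.
        for (String url : adminUrls) {
            try (PulsarAdmin admin = PulsarAdmin.builder().serviceHttpUrl(url).build()) {
                admin.namespaces().createNamespace(namespace);
                admin.namespaces().setNamespaceReplicationClusters(namespace, assignedClusters);
            }
        }
    }
}
```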
----
2020-02-27 04:31:36 UTC - Sijie Guo: but it requires additional work.
----
2020-02-27 04:32:38 UTC - Addison Higham: in this case, the ZKs are actually 
all in the same region (we had run a proper global ZK in our beta env and for 
now just decided to have observers in all other regions)
----
2020-02-27 04:33:52 UTC - Sijie Guo: observers are usually the problem :slightly_smiling_face:
----
2020-02-27 04:34:08 UTC - Sijie Guo: because they are easily falling behind.
----
2020-02-27 04:34:30 UTC - Addison Higham: the region where we had the issue was 
the region where we have the 3 participating members
----
2020-02-27 04:34:59 UTC - Addison Higham: (ATM we have a fairly large 
imbalance, with 90% of traffic in one region, so we put the real quorum members 
there)
----
2020-02-27 04:36:29 UTC - Addison Higham: anyways, yes, it seems unlikely that 
it is solely a ZK bug, I am assuming it is likely something partially about how 
we are running things, but I am just surprised by it
----
2020-02-27 04:38:29 UTC - Addison Higham: I mostly just need to dig back through logs and metrics to figure it out; the interesting bit is that the metrics report a znode count that matches, but when I actually went to inspect, znodes were missing
----
2020-02-27 04:40:16 UTC - Sijie Guo: are your metrics collected on the same 
zookeeper node?
----
2020-02-27 04:40:39 UTC - Addison Higham: we collect from all 3 members, and actually I might be wrong, I had my query screwed up...
----
2020-02-27 04:43:01 UTC - Sijie Guo: I guess you are using the headless service?
----
2020-02-27 04:43:51 UTC - Addison Higham: not quite sure what you mean, we are just using the pulsar images that attach the prometheus exporter, pulling the metrics that way, and exporting to datadog
----
2020-02-27 04:45:04 UTC - Addison Higham: znode drop off
----
2020-02-27 04:46:11 UTC - Addison Higham: that is the count of znodes; the drop-off correlates with when I restarted all three zk nodes, but they all got the storage re-attached
----
2020-02-27 04:48:46 UTC - Addison Higham: maybe I am just dumb and really 
shouldn't be restarting all 3 nodes at once?
----
2020-02-27 04:52:13 UTC - Sijie Guo: Ah, I see. So the znode count drops at all 3 nodes?
----
2020-02-27 05:05:49 UTC - Addison Higham: no, just the one node
----
2020-02-27 05:07:04 UTC - Addison Higham: the line above the drop is the other 2 members, and it stays consistent with the count before the restart
----
2020-02-27 05:09:41 UTC - Joe Francis: what is the status of the low-znode ZK? 
Is it in the quorum?
----
2020-02-27 05:10:15 UTC - Addison Higham: yes and reported itself as a follower
----
2020-02-27 05:13:35 UTC - Joe Francis: That seems strange. Unless somehow its 
on-disk snapshot was mucked up. There has been a recent issue with ZK not 
shutting down when running out of disk space ...
----
2020-02-27 05:13:54 UTC - Addison Higham: 10GB disks but it is only about 4 MB 
of data :confused:
----
2020-02-27 05:17:06 UTC - Devin G. Bost: @Joe Francis I personally experienced the issue with ZK running out of disk space. That was a big mess, but the bigger problem was that it didn't recover correctly after the crash: <https://issues.apache.org/jira/plugins/servlet/mobile#issue/ZOOKEEPER-1621|https://issues.apache.org/jira/plugins/servlet/mobile#issue/ZOOKEEPER-1621>
There's another open ZK issue that I came across that I think is related, but I don't remember how I found it.
----
2020-02-27 05:17:07 UTC - Addison Higham: so, my best guess at the moment:
• we run on k8s, with the AWS CNI, which means the pod gets a unique IP on each boot
• we are using DNS -> IPs for global ZK, so there is lag on reboots to form quorum as the DNS gets swapped around
• perhaps somehow during this time, as it tries to re-form quorum against invalid IPs, it can reach just one node but not the other, and we end up with some sort of strange "network partition" where A->B and B->C but A can't connect to C
----
2020-02-27 05:17:15 UTC - Joe Francis: 
<https://issues.apache.org/jira/browse/ZOOKEEPER-3701|ZOOKEEPER-3701>
+1 : Devin G. Bost
----
2020-02-27 05:20:14 UTC - Devin G. Bost: @Addison Higham when the ZK disk 
filled up and crashed, it happened really fast, so we needed to watch it 
carefully to spot it. Good metrics would probably have made it easier to detect 
for us.
----
2020-02-27 05:20:53 UTC - Devin G. Bost: 4 MB of data makes me wonder if it 
crashed and restarted already.
----
2020-02-27 05:24:29 UTC - Addison Higham: this was during a lull in traffic and 
happened as a result of maintenance where we manually restarted the global zk 
nodes
----
2020-02-27 05:26:30 UTC - Sijie Guo: @Addison Higham did you restart the node with the small znode count?
----
2020-02-27 05:30:34 UTC - Devin G. Bost: @Addison Higham I think we also ran 
into odd problems in the past when we've restarted all ZK nodes simultaneously.
----
2020-02-27 05:31:30 UTC - Addison Higham: need to check against some logs to 
correlate the exact timing, one sec
----
2020-02-27 05:32:51 UTC - Joe Francis: A ZK cluster will survive a full reboot, 
so shutting down all should not matter. I would consult the log of the lower 
count ZK and the leader and see what sort of sync happened when it joined the 
quorum
----
2020-02-27 05:40:41 UTC - Devin G. Bost: @Joe Francis One time, we had a ZK 
cluster where some of our broker information was missing, but I don't think I 
checked all the ZK nodes for it.
If I remember correctly, that may have happened after restarting all the ZK 
nodes, but I don't remember for certain.
----
2020-02-27 05:45:42 UTC - Devin G. Bost: I thought there might have been an 
association with what @Penghui Li's recent PIP was about.
----
2020-02-27 05:55:27 UTC - Joe Francis: My cluster is on BM, and normally we operate at about 10M znodes. Our main issues are with the global ZK, but nothing wild
----
2020-02-27 05:58:37 UTC - Joe Francis: Having reliable, fast storage for ZK 
disks will make life easier.
----
2020-02-27 06:02:47 UTC - Devin G. Bost: We definitely experienced far fewer 
issues after we upgraded our ZK disks to fast SSD SAN.
----
2020-02-27 06:05:02 UTC - Devin G. Bost: How do I get the znode count?
----
2020-02-27 06:09:25 UTC - Joe Francis: mntr should show it
----
