I put up a fix - https://github.com/apache/incubator-pulsar/pull/1994 - and verified that it fixes the problem. However, I would like to get feedback before I add unit tests / integration tests.

- Sijie

On Tue, Jun 19, 2018 at 12:22 PM Sijie Guo <guosi...@gmail.com> wrote:

> Actually there is already an issue for that -
> https://github.com/apache/incubator-pulsar/issues/1967 - I will reuse that.
>
> The fix can be simple, but I would like to get some clarifications from
> the yahoo folks. As I understand it, all the messages in a non-persistent
> topic have the same message id. That makes the behavior different from a
> persistent topic; I think acknowledgements for non-persistent topics are
> ignored. If that is the expected semantic, one straightforward fix is to
> disable acknowledgement grouping for non-persistent topics. Can someone
> from yahoo clarify it before I start the fix?
>
> - Sijie
>
> On Tue, Jun 19, 2018 at 12:15 PM Sijie Guo <guosi...@gmail.com> wrote:
>
>> I found the issue. In 2.0 there is an optimization change on grouping
>> acknowledgements. The optimization change here
>> <https://github.com/apache/incubator-pulsar/commit/19dd2c502725e3b45e56541576508626c4213091#diff-debb36270f152d2533b03902f214fa9aR674>
>> avoids delivering messages that are already "delivered".
>> However, I think that in a non-persistent topic, since we don't store
>> entries in bookkeeper storage, we end up using 0 as the ledger id and
>> entry id. So the optimization change treats those messages as already
>> "delivered" and never pops them to the client, even though the client
>> has already received them from the wire.
>>
>> Filing a github issue for it.
>>
>> - Sijie
>>
>> On Tue, Jun 19, 2018 at 11:27 AM Sijie Guo <guosi...@gmail.com> wrote:
>>
>>> Hi Geoffroy,
>>>
>>> Sorry for the late response. I tried your sequence and observed the same
>>> behavior, so there might be something wrong there. I am looking into it
>>> now.
>>>
>>> - Sijie
>>>
>>> On Tue, Jun 19, 2018 at 11:13 AM <geoffroy.fouqu...@exensa.com> wrote:
>>>
>>>> On 2018-06-19 19:08, Ali Ahmed wrote:
>>>> > Is the server version in both cases 2.0?
>>>>
>>>> No, I always use the same version for pulsar-client and the server.
>>>> For a given pulsar version, my test script downloads an archive and
>>>> runs all tests using this archive.
>>>>
>>>> > -Ali
>>>> >
>>>> >
>>>> > On Tue, Jun 19, 2018 at 7:47 AM, Geoffroy Fouquier <
>>>> > geoffroy.fouqu...@exensa.com> wrote:
>>>> >
>>>> >>
>>>> >> I recently described on pulsar-users my issue concerning
>>>> >> non-persistent topics. This time, I reproduced the faulty behaviour
>>>> >> using only pulsar-client and tested the new pulsar 2.0.1 release,
>>>> >> but my problem remains.
>>>> >>
>>>> >> It's quite easy to reproduce this behaviour:
>>>> >>
>>>> >> - I made tests with fresh installations only and without any
>>>> >> configuration.
>>>> >>
>>>> >> - I start a standalone server, then after a few seconds a
>>>> >> pulsar-client which consumes a topic.
>>>> >>
>>>> >> - Then I use pulsar-client to produce messages on the same topic
>>>> >> (and repeat with a few seconds of delay between each batch).
>>>> >>
>>>> >>
>>>> >> If I send 10 times 1000 short messages ("foo bar baz")
>>>> >>
>>>> >> with pulsar 1.22, I receive:
>>>> >>
>>>> >> 100% of messages on a persistent topic
>>>> >>
>>>> >> 100% of messages on a non-persistent topic
>>>> >>
>>>> >> with pulsar 2.0.1, I receive:
>>>> >>
>>>> >> 100% of messages on a persistent topic
>>>> >>
>>>> >> 0.09% of messages on a non-persistent topic (9/10000)
>>>> >>
>>>> >>
>>>> >> In fact, the nine messages are received with the first batch, and
>>>> >> nothing is received after the first one. I understand that if I send
>>>> >> too many messages the broker might start to drop messages. But 1000
>>>> >> messages aren't such a burden, and pulsar 1.22 doesn't have any
>>>> >> problem handling them. But OK, maybe I send too many messages at the
>>>> >> same time, so I tried sending smaller batches (10 and 2
>>>> >> respectively):
>>>> >>
>>>> >> - If I send 10 times 10 messages, I receive 12 / 100 messages.
>>>> >>
>>>> >> - If I send 10 times 2 messages, I receive 11 / 20 messages.
>>>> >>
>>>> >>
>>>> >> So I think there is a bug with pulsar 2 and non-persistent topics,
>>>> >> but maybe I am wrong.
>>>> >>
>>>> >>
>>>> >> Some logs from my experiments:
>>>> >>
>>>> >> % NBITE=10 NB_MESSAGE=1000 PERS=non-persistent PULSAR_VERSION=1.22.0-incubating ./test-pulsar.sh
>>>> >> Pulsar version: 1.22.0-incubating
>>>> >> Starting standalone pulsar with pid 32128
>>>> >> Starting pulsar consumer (pid 32541) on non-persistent://tenant/standalone/ns/topic
>>>> >>
>>>> >> Writing 1000 messages (foo bar baz) on non-persistent://tenant/standalone/ns/topic
>>>> >> (1/10) Nb received messages: 1000 (should be: 1000)
>>>> >> (2/10) Nb received messages: 2000 (should be: 2000)
>>>> >> (3/10) Nb received messages: 3000 (should be: 3000)
>>>> >> (4/10) Nb received messages: 4000 (should be: 4000)
>>>> >> (5/10) Nb received messages: 5000 (should be: 5000)
>>>> >> (6/10) Nb received messages: 6000 (should be: 6000)
>>>> >> (7/10) Nb received messages: 7000 (should be: 7000)
>>>> >> (8/10) Nb received messages: 8000 (should be: 8000)
>>>> >> (9/10) Nb received messages: 9000 (should be: 9000)
>>>> >> (10/10) Nb received messages: 10000 (should be: 10000)
>>>> >>
>>>> >> % NBITE=10 NB_MESSAGE=1000 PERS=non-persistent PULSAR_VERSION=2.0.1-incubating ./test-pulsar.sh
>>>> >> Pulsar version: 2.0.1-incubating
>>>> >> Starting standalone pulsar with pid 2382
>>>> >> Starting pulsar consumer (pid 3080) on non-persistent://tenant/ns/topic
>>>> >>
>>>> >> Writing 1000 messages (foo bar baz) on non-persistent://tenant/ns/topic
>>>> >> [...16:07:36.727 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (1/10) Nb received messages: 9 (should be: 1000)
>>>> >> [...16:07:43.792 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (2/10) Nb received messages: 9 (should be: 2000)
>>>> >> [...16:07:50.963 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (3/10) Nb received messages: 9 (should be: 3000)
>>>> >> [...16:07:58.158 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (4/10) Nb received messages: 9 (should be: 4000)
>>>> >> [...16:08:05.419 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (5/10) Nb received messages: 9 (should be: 5000)
>>>> >> [...16:08:12.650 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (6/10) Nb received messages: 9 (should be: 6000)
>>>> >> [...16:08:19.780 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (7/10) Nb received messages: 9 (should be: 7000)
>>>> >> [...16:08:26.857 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (8/10) Nb received messages: 9 (should be: 8000)
>>>> >> [...16:08:33.929 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (9/10) Nb received messages: 9 (should be: 9000)
>>>> >> [...16:08:40.931 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (10/10) Nb received messages: 9 (should be: 10000)
>>>> >>
>>>> >> % NBITE=10 NB_MESSAGE=10 PERS=non-persistent PULSAR_VERSION=2.0.1-incubating ./test-pulsar.sh
>>>> >> Pulsar version: 2.0.1-incubating
>>>> >> Starting standalone pulsar with pid 4336
>>>> >> Starting pulsar consumer (pid 5020) on non-persistent://tenant/ns/topic
>>>> >>
>>>> >> Writing 10 messages (foo bar baz) on non-persistent://tenant/ns/topic
>>>> >> [...16:10:01.506 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (1/10) Nb received messages: 2 (should be: 10)
>>>> >> [...16:10:08.197 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (2/10) Nb received messages: 3 (should be: 20)
>>>> >> [...16:10:14.995 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (3/10) Nb received messages: 4 (should be: 30)
>>>> >> [...16:10:21.707 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (4/10) Nb received messages: 5 (should be: 40)
>>>> >> [...16:10:28.516 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (5/10) Nb received messages: 6 (should be: 50)
>>>> >> [...16:10:35.398 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (6/10) Nb received messages: 7 (should be: 60)
>>>> >> [...16:10:42.248 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (7/10) Nb received messages: 8 (should be: 70)
>>>> >> [...16:10:49.218 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (8/10) Nb received messages: 9 (should be: 80)
>>>> >> [...16:10:55.964 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (9/10) Nb received messages: 10 (should be: 90)
>>>> >> [...16:11:02.649 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced]
>>>> >> (10/10) Nb received messages: 12 (should be: 100)
>>>> >>
>>>> >> % NBITE=10 NB_MESSAGE=2 PERS=non-persistent PULSAR_VERSION=2.0.1-incubating ./test-pulsar.sh
>>>> >> Pulsar version: 2.0.1-incubating
>>>> >> Starting standalone pulsar with pid 6095
>>>> >> Starting pulsar consumer (pid 6782) on non-persistent://tenant/ns/topic
>>>> >>
>>>> >> Writing 2 messages (foo bar baz) on non-persistent://tenant/ns/topic
>>>> >> [...16:11:50.609 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (1/10) Nb received messages: 2 (should be: 2)
>>>> >> [...16:11:57.779 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (2/10) Nb received messages: 3 (should be: 4)
>>>> >> [...16:12:04.637 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (3/10) Nb received messages: 4 (should be: 6)
>>>> >> [...16:12:11.405 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (4/10) Nb received messages: 5 (should be: 8)
>>>> >> [...16:12:18.502 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (5/10) Nb received messages: 6 (should be: 10)
>>>> >> [...16:12:25.459 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (6/10) Nb received messages: 7 (should be: 12)
>>>> >> [...16:12:32.425 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (7/10) Nb received messages: 8 (should be: 14)
>>>> >> [...16:12:39.296 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (8/10) Nb received messages: 9 (should be: 16)
>>>> >> [...16:12:46.080 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (9/10) Nb received messages: 10 (should be: 18)
>>>> >> [...16:12:52.940 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 2 messages successfully produced]
>>>> >> (10/10) Nb received messages: 11 (should be: 20)
>>>> >>
>>>> >> % NBITE=10 NB_MESSAGE=1000 PERS=persistent PULSAR_VERSION=2.0.1-incubating ./test-pulsar.sh
>>>> >> Pulsar version: 2.0.1-incubating
>>>> >> Starting standalone pulsar with pid 9848
>>>> >> Starting pulsar consumer (pid 10531) on persistent://tenant/ns/topic
>>>> >>
>>>> >> Writing 1000 messages (foo bar baz) on persistent://tenant/ns/topic
>>>> >> [...16:23:24.986 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (1/10) Nb received messages: 1000 (should be: 1000)
>>>> >> [...16:23:42.458 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (2/10) Nb received messages: 2000 (should be: 2000)
>>>> >> [...16:24:02.250 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (3/10) Nb received messages: 3000 (should be: 3000)
>>>> >> [...16:24:19.675 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (4/10) Nb received messages: 4000 (should be: 4000)
>>>> >> [...16:24:37.809 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (5/10) Nb received messages: 5000 (should be: 5000)
>>>> >> [...16:24:56.121 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (6/10) Nb received messages: 6000 (should be: 6000)
>>>> >> [...16:25:13.922 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (7/10) Nb received messages: 7000 (should be: 7000)
>>>> >> [...16:25:32.399 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (8/10) Nb received messages: 8000 (should be: 8000)
>>>> >> [...16:25:50.666 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (9/10) Nb received messages: 9000 (should be: 9000)
>>>> >> [...16:26:09.348 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 1000 messages successfully produced]
>>>> >> (10/10) Nb received messages: 10000 (should be: 10000)
>>>> >>
>>>> >> --
>>>> >>
>>>> >> Geoffroy Fouquier
>>>> >>
>>>> >> http://eXenSa.com
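
For readers following the thread, the failure mode described in the quoted messages (every message on a non-persistent topic carrying 0 as ledger id and entry id, and the acknowledgement grouping treating a repeated id as already delivered) can be illustrated with a small Python toy model. This is only a sketch under stated assumptions, not Pulsar's actual client code: ToyAckTracker, deliver and flush_every are hypothetical names, the ledger id 7 used for the persistent case is arbitrary, and the idea that pending acknowledgements are forgotten on each flush is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MessageId:
    ledger_id: int
    entry_id: int


@dataclass
class ToyAckTracker:
    """Hypothetical stand-in for a client-side acknowledgement grouper."""
    pending: set = field(default_factory=set)

    def is_duplicate(self, msg_id):
        # A message whose id is still waiting to be acked is assumed to be
        # a redelivery and is filtered out before reaching the application.
        return msg_id in self.pending

    def add_ack(self, msg_id):
        self.pending.add(msg_id)

    def flush(self):
        # Assumption: grouped acks are sent to the broker and then forgotten.
        self.pending.clear()


def deliver(ids, tracker, flush_every=None):
    """Return how many messages reach the application."""
    delivered = 0
    for n, msg_id in enumerate(ids, start=1):
        if not tracker.is_duplicate(msg_id):
            delivered += 1
            tracker.add_ack(msg_id)  # the application acks immediately
        if flush_every and n % flush_every == 0:
            tracker.flush()
    return delivered


# Persistent topic: every message has a distinct (ledger id, entry id).
persistent = [MessageId(7, i) for i in range(1000)]
# Non-persistent topic: every message carries ledger id 0 and entry id 0.
non_persistent = [MessageId(0, 0) for _ in range(1000)]

print(deliver(persistent, ToyAckTracker()))                       # 1000
print(deliver(non_persistent, ToyAckTracker()))                   # 1
print(deliver(non_persistent, ToyAckTracker(), flush_every=100))  # 10
```

In this model every persistent message is delivered, while non-persistent messages get through only once per flush. That is the same kind of loss as in the logs above, although the model makes no attempt to reproduce the exact counts.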