Re: [ANNOUNCE] New committer: Yong Zhang

2021-01-22 Thread Yong Zhang
Thank you all!

Happy to be a Pulsar committer!

Yong

On Fri, 22 Jan 2021 at 14:20, Enrico Olivelli  wrote:

> Congratulations Yong!
>
>
> Enrico
>
> On Fri, 22 Jan 2021, 04:28 Huanli Meng wrote:
>
> >
> > Congratulations, Yong
> >
> > BR//Huanli
> >
> > > On Jan 22, 2021, at 10:41 AM, Guangning E wrote:
> > >
> > > Congratulations!
> > >
> > > Thanks,
> > > Guangning
> > >
> > > On Fri, Jan 22, 2021 at 10:33 AM, Dianjin Wang wrote:
> > >
> > >> Congratulations!
> > >>
> > >> Best,
> > >> Dianjin Wang
> > >>
> > >> On Fri, Jan 22, 2021 at 10:24 AM Jinfeng Huang wrote:
> > >>>
> > >>> Dear all,
> > >>> The Apache Pulsar PMC recently extended committer karma to Yong Zhang
> > >>> and he has accepted.
> > >>>
> > >>> Yong Zhang has made many contributions to Pulsar core features such
> > >>> as Transactions and Package API, along with many bug fixes. It is
> > >>> great to have Yong on board as a Pulsar committer.
> > >>>
> > >>> Congratulations and welcome on board, Yong Zhang! Please join me in
> > >>> welcoming Yong Zhang.
> > >>>
> > >>> @yong, you can share with us a little more about yourself.
> > >>>
> > >>> Best Regards,
> > >>> Jennifer on behalf of the Pulsar PMC
> > >>
> >
> >
>


[GitHub] [pulsar-helm-chart] SakaSun opened a new pull request #98: The name of the resource for accessing ConfigMap object is 'configmaps'

2021-01-22 Thread GitBox


SakaSun opened a new pull request #98:
URL: https://github.com/apache/pulsar-helm-chart/pull/98


   ### Motivation
   
   An exception is thrown while trying to get the ConfigMap, with the following error log:
   ERROR org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory - Error while trying to fetch configmap pulsar-functions-worker-config at namespace pulsar
   io.kubernetes.client.openapi.ApiException: Forbidden
       at io.kubernetes.client.openapi.ApiClient.handleResponse(ApiClient.java:971) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
       at io.kubernetes.client.openapi.ApiClient.execute(ApiClient.java:883) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
       at io.kubernetes.client.openapi.apis.CoreV1Api.readNamespacedConfigMapWithHttpInfo(CoreV1Api.java:44821) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
       at io.kubernetes.client.openapi.apis.CoreV1Api.readNamespacedConfigMap(CoreV1Api.java:44791) ~[io.kubernetes-client-java-api-9.0.2.jar:?]
       at org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory.fetchConfigMap(KubernetesRuntimeFactory.java:369) [org.apache.pulsar-pulsar-functions-runtime-2.7.0.jar:2.7.0]
       at org.apache.pulsar.functions.runtime.kubernetes.KubernetesRuntimeFactory$1.run(KubernetesRuntimeFactory.java:358) [org.apache.pulsar-pulsar-functions-runtime-2.7.0.jar:2.7.0]
       at java.util.TimerThread.mainLoop(Timer.java:555) [?:1.8.0_275]
       at java.util.TimerThread.run(Timer.java:505) [?:1.8.0_275]
   
   ### Modifications
   
   Fixed resource name for ConfigMap object
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [pulsar-helm-chart] SakaSun closed pull request #98: The name of the resource for accessing ConfigMap object is 'configmaps'

2021-01-22 Thread GitBox


SakaSun closed pull request #98:
URL: https://github.com/apache/pulsar-helm-chart/pull/98


   







[GitHub] [pulsar-helm-chart] SakaSun commented on pull request #98: The name of the resource for accessing ConfigMap object is 'configmaps'

2021-01-22 Thread GitBox


SakaSun commented on pull request #98:
URL: https://github.com/apache/pulsar-helm-chart/pull/98#issuecomment-765525914


   Duplicate of #95







Re: [E] Re: [PIP-78] Split the individual acknowledgments into multiple entries

2021-01-22 Thread Joe Francis
Let me take a step back and explain how I am looking at this from a
high-level design viewpoint.


BookKeeper (BK) is like an LSM implementation of a KV store. Writes to all
keys are appended to a single file; deletes are logical. Compaction
reclaims space. An index is used to locate entries, track logical deletes,
and reclaim space.
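
To make that structure concrete, here is a minimal Python sketch of a log-structured KV store with appended writes, logical deletes, an index, and compaction. It is a toy under stated assumptions, not BookKeeper's actual implementation: the class `AppendOnlyStore` and all of its method names are hypothetical, and the "single file" is modeled as an in-memory list.

```python
class AppendOnlyStore:
    """Toy log-structured KV store (hypothetical, for illustration only)."""

    def __init__(self):
        self.log = []    # append-only records; stands in for the single file
        self.index = {}  # key -> position of the latest live record in the log

    def put(self, key, value):
        # Writes are always appended; the index is repointed to the new record.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def delete(self, key):
        # Logical delete: only the index entry goes away, the record stays
        # in the log until compaction.
        self.index.pop(key, None)

    def compact(self):
        # Rewrite only the records the index still points at, in log order,
        # which reclaims the space held by overwritten and deleted records.
        new_log, new_index = [], {}
        for key, pos in sorted(self.index.items(), key=lambda kv: kv[1]):
            new_index[key] = len(new_log)
            new_log.append(self.log[pos])
        self.log, self.index = new_log, new_index


store = AppendOnlyStore()
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)   # overwrite: old record for "a" becomes garbage
store.delete("b")   # logical delete: record for "b" becomes garbage
size_before = len(store.log)
store.compact()
size_after = len(store.log)
```

In this sketch the log only shrinks when `compact()` runs; until then, overwritten and deleted records still occupy space.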


The index in BK is another LSM. Again, writes are appended, deletes are
logical, an index is used to locate entries and account for deletes, and
compaction reclaims space (the implementation within RocksDB is far more
complex, with bloom filters and memtables, but you get the idea). BK just
uses a sophisticated index (RocksDB) which is tiny and cacheable, and
RocksDB has within it a sophisticated index which is small and cacheable.

So when I look at this proposal, what I see is the same: another attempt
to build an LSM with a sophisticated index/cache mechanism using log
structured storage. So I am quite skeptical that this needs to be solved
this way, within Pulsar.



Joe

On Wed, Jan 20, 2021 at 12:30 AM linlin  wrote:

> We can look at ManagedCursorImpl.buildIndividualDeletedMessageRanges
>
> What is saved in the entry is not a BitSet but a list of messageRanges,
> each of which contains information such as ledgerId and entryId. The BitSet
> only exists in memory and is used to quickly determine whether a position
> already exists.
> In addition, the position of each ack is stored in the
> individualDeletedMessages queue. When persisted to the entry, the queue
> is traversed, and the position information of each ack generates a
> messageRange.
> A messageRange contains lowerEndpoint (ledgerId+entryId) and upperEndpoint
> (ledgerId+entryId): 4 longs, about 256 bits.
>
> Assume a more extreme scenario: 300K messages where every other message is
> unacknowledged, so 150K positions are stored in
> individualDeletedMessages. 150K * 256/8/1024/1024 ≈ 4.6MB
> Of course, there are also scenarios where the customer's acks span several
> ledgers.
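
The estimate above can be reproduced with a short Python sketch. It assumes, as the quoted message does, that each messageRange costs 4 raw 8-byte longs and it ignores any serialization overhead or savings; the function and constant names are hypothetical, and the 5MB figure is the ledger entry size mentioned elsewhere in this thread, not a value read from Pulsar's configuration.

```python
BYTES_PER_LONG = 8
LONGS_PER_RANGE = 4  # lowerEndpoint (ledgerId, entryId) + upperEndpoint (ledgerId, entryId)
MIB = 1024 * 1024


def ranges_size_bytes(num_ranges: int) -> int:
    """Raw size of num_ranges message ranges, assuming 4 longs per range."""
    return num_ranges * LONGS_PER_RANGE * BYTES_PER_LONG


def max_ranges_in_entry(entry_size_bytes: int) -> int:
    """How many such ranges fit in one entry of the given size."""
    return entry_size_bytes // (LONGS_PER_RANGE * BYTES_PER_LONG)


size = ranges_size_bytes(150_000)      # 150K ranges from the scenario above
size_mib = round(size / MIB, 2)        # about 4.58, matching the ~4.6MB estimate
limit = max_ranges_in_entry(5 * MIB)   # ranges that fit in an assumed 5MB entry
```

Under these assumptions, 150K ranges come close to filling a single 5MB entry, and roughly 164K ranges would exceed it.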
>
>
> On 2021/01/20 00:38:47, Joe F wrote:
> > I have a simpler question. Just storing the message-ids raw will fit ~300K
> > entries in one ledger entry. With the bitmap changes, we can store a
> > couple of million within one 5MB ledger entry. So can you tell us what
> > numbers of unacked messages are creating a problem? What exactly are the
> > issues you face, and at what numbers of unacked messages/memory use etc?
> >
> > I have my own concerns about this proposal, but I would like to understand
> > the problem first
> >
> > Joe
> >
> > On Sun, Jan 17, 2021 at 10:16 PM Sijie Guo wrote:
> >
> > > Hi Lin,
> > >
> > > Thank you and Penghui for drafting this! We have seen a lot of pain points
> > > of `managedLedgerMaxUnackedRangesToPersist` when enabling delayed messages.
> > > Glad that you and Penghui are spending time on resolving this!
> > >
> > > Overall the proposal looks good. But I have a couple of questions about the
> > > proposal.
> > >
> > > 1. What happens if the broker fails to write the entry marker? For example,
> > > at t0, the broker flushes dirty pages and successfully writes an entry
> > > marker. At t1, the broker tries to flush dirty pages but fails to write
> > > the new entry marker. How can you recover the entry marker?
> > >
> > > 2. When a broker crashes and recovers the managed ledger, the cursor
> > > ledger is not writable anymore. Are you going to create a new cursor ledger
> > > and copy all the entries from the old cursor ledger to the new one?
> > >
> > > It would be good if you can clarify these two questions.
> > >
> > > - Sijie
> > >
> > > On Sun, Jan 17, 2021 at 9:48 PM linlin wrote:
> > >
> > > > Hi, community:
> > > > Recently we encountered some problems when using individual
> > > > acknowledgments, such as:
> > > > when the amount of acknowledgment is large, entry writing fails; a large
> > > > amount of cache causes OOM, etc.
> > > > So I drafted a PIP in
> > > > `https://docs.google.com/document/d/1uQtyb8t6X04v2vrSrdGWLFkuCkBcGYZbqK8XsVJ4qkU/edit?usp=sharing`,
> > > > any feedback is welcome.
>


Re: [E] Re: [PIP-78] Split the individual acknowledgments into multiple entries

2021-01-22 Thread Sijie Guo
Joe - Delayed messages or certain user logic can introduce a lot of message
holes. We have seen this issue in quite a few customers' production
environments. Hence we need to find a solution to these problems.
If you are skeptical of an implementation like that, how about we make the
cursor implementation pluggable? We can implement this proposal as
one plugin, so it will not impact any existing logic while allowing people
to use a plugin to solve this problem.

Thanks,
Sijie

On Fri, Jan 22, 2021 at 5:00 PM Joe Francis wrote:

> Let me take a step back and explain how I am looking at this from a
> high-level design viewpoint.
>
>
> BookKeeper (BK) is like an LSM implementation of a KV store. Writes to all
> keys are appended to a single file; deletes are logical. Compaction
> reclaims space. An index is used to locate entries, track logical deletes,
> and reclaim space.
>
>
> The index in BK is another LSM. Again, writes are appended, deletes are
> logical, an index is used to locate entries and account for deletes, and
> compaction reclaims space (the implementation within RocksDB is far more
> complex, with bloom filters and memtables, but you get the idea). BK just
> uses a sophisticated index (RocksDB) which is tiny and cacheable, and
> RocksDB has within it a sophisticated index which is small and cacheable.
>
>
> So when I look at this proposal, what I see is the same: another attempt
> to build an LSM with a sophisticated index/cache mechanism using log
> structured storage. So I am quite skeptical that this needs to be solved
> this way, within Pulsar.
>
>
>
> Joe