Re: [DISCUSS] PIP-211: Introduce offload throttling

Jiuming Tao Fri, 11 Nov 2022 06:09:45 -0800

Hi Michael,

> The main trade off with throttling is that we might
> leave performance on the table if the cluster is not heavily utilized.


It is indeed possible. When a broker isn’t publishing/consuming many messages 
and its offloaders are being throttled, 
the broker’s spare capacity goes unused.
But offload throttling exists as a safeguard, so I think the trade-off is 
reasonable.
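
To illustrate what rate-based throttling on the offload read path could look 
like, here is a minimal token-bucket sketch. It is only an illustration under 
my own assumptions: the class name OffloadReadThrottle, the reserve() method, 
and the bytes-per-second parameter are hypothetical, and this is neither the 
PIP-211 implementation nor an existing Pulsar API.

```java
/**
 * Hypothetical token-bucket limiter for offload reads (illustration only,
 * not Pulsar code). The bucket holds at most one second worth of bytes.
 */
final class OffloadReadThrottle {
    private final long bytesPerSecond; // refill rate, also the bucket capacity
    private double availableBytes;     // current tokens; negative means "in debt"
    private long lastRefillNanos;      // timestamp of the last refill

    OffloadReadThrottle(long bytesPerSecond, long startNanos) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.availableBytes = bytesPerSecond; // start with a full bucket
        this.lastRefillNanos = startNanos;
    }

    /**
     * Accounts for a read of {@code bytes} at time {@code nowNanos} and returns
     * how many nanoseconds the caller should pause before the next read.
     */
    synchronized long reserve(long bytes, long nowNanos) {
        refill(nowNanos);
        availableBytes -= bytes;
        if (availableBytes >= 0) {
            return 0L; // enough budget, no pause needed
        }
        // Time needed for the refill rate to pay back the debt.
        return (long) Math.ceil(-availableBytes * 1e9 / bytesPerSecond);
    }

    // Adds the tokens earned since the last call, capped at the bucket capacity.
    private void refill(long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastRefillNanos);
        double earned = elapsed * (double) bytesPerSecond / 1e9;
        availableBytes = Math.min(bytesPerSecond, availableBytes + earned);
        lastRefillNanos = nowNanos;
    }
}
```

A caller would invoke reserve(entrySize, System.nanoTime()) after each read 
and delay the next read by the returned value. When the configured rate is 
set low, the offloader is held back even if the broker is otherwise idle, 
which is exactly the trade-off discussed above.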


> It would not be a broker. It would be an offloader, and its sole task
> would be offloading data.


It is surely a good solution: the broker and the offloader wouldn’t affect 
each other.
But, as you said, a new service would complicate the Pulsar deployment.

 
> I made my tangent because I think the current design leads to
> unnecessary load on the broker

Could you please explain why the design would lead to unnecessary load on 
the broker?
IMO, it should not add much overhead.

>  which could be misinterpreted by the
> load manager as a reason to load balance, which could interrupt
> offloading

It is indeed possible: because offloading takes longer, the chance of being 
interrupted increases.
But I don't think there is a better solution for this.

Thanks,
Tao Jiuming


> On Nov 11, 2022, at 4:04 AM, Michael Marshall <mmarsh...@apache.org> wrote:
> 
>> Yes, the PIP’s key point is to protect the broker by preventing offloading 
>> from taking too many broker resources.
> 
> Throttling also protects the bookkeeper. Reads that are used to
> offload data are the lowest priority reads since they are not serving
> an actual client. Since we don't have a way to tell bookkeeper the
> requested quality of service for a read operation, throttling is a
> natural solution. The main trade off with throttling is that we might
> leave performance on the table if the cluster is not heavily utilized.
> 
>> Do you mean adding a new broker type, one that only handles 
>> offload processing?
> 
> It would not be a broker. It would be an offloader, and its sole task
> would be offloading data. The broker would still be the component that
> serves reads from tiered storage.
> 
>> I think introducing a new broker type is relatively heavyweight 
>> just to implement offload throttling.
> 
> The "offloader" component is independent of the throttling feature. We
> can implement this PIP without addressing my tangent. That being said,
> I made my tangent because I think the current design leads to
> unnecessary load on the broker, which could be misinterpreted by the
> load manager as a reason to load balance, which could interrupt
> offloading. I would guess that interruption could force the offloader
> to need to restart the task of offloading a ledger, which is very
> inefficient.
> 
> Thanks,
> Michael
> 
> On Thu, Nov 10, 2022 at 12:50 PM Jiuming Tao
> <jm...@streamnative.io.invalid> wrote:
>> 
>> Hi Michael,
>> 
>> 
>>> This PIP is similar to autorecovery throttling. I think the feature
>>> makes sense for the same reasons that throttling autorecovery makes
>>> sense.
>> 
>> Yes, the PIP’s key point is to protect the broker by preventing offloading 
>> from taking too many broker resources.
>> 
>>> Tangentially, can we decouple writes to tiered storage from the broker
>>> hosting the topic being offloaded?
>> 
>> Do you mean adding a new broker type, one that only handles 
>> offload processing?
>> I think introducing a new broker type is relatively heavyweight 
>> just to implement offload throttling.
>> We can do it in a simpler way.
>> 
>> Thanks,
>> Tao Jiuming
>> 
>> 
>> 
>> 
>>> On Nov 8, 2022, at 8:08 AM, Michael Marshall <mmarsh...@apache.org> wrote:
>>> 
>>> This PIP is similar to autorecovery throttling. I think the feature
>>> makes sense for the same reasons that throttling autorecovery makes
>>> sense.
>>> 
>>> Tangentially, can we decouple writes to tiered storage from the broker
>>> hosting the topic being offloaded? An independent service could write
>>> to tiered storage without impacting the broker and could easily scale
>>> with the work. The primary complication for the service would be
>>> figuring out which ledgers to offload. Perhaps the managed ledger
>>> could "offer" ledgers up that need to be offloaded, and the new
>>> service would only need to consume those events.
>>> 
>>> Although, a new service would complicate the pulsar deployment.
>>> 
>>> Thanks,
>>> Michael
>>> 
>>> On Mon, Nov 7, 2022 at 10:30 AM Jiuming Tao
>>> <jm...@streamnative.io.invalid> wrote:
>>>> 
>>>>> One alternative would be to throttle offload in the write path instead of 
>>>>> adding additional logic to the read path in managed ledgers.
>>>> 
>>>> This is indeed a feasible method.
>>>> But we would need to make changes in the FileSystem and BlobStore 
>>>> offloaders, and even in custom offloaders, so I think it is not a 
>>>> universal solution.
>>>> 
>>>>> One simple way to do this is to limit how many threads can write 
>>>>> offloaded ledgers. This is the same way that reading of offloaded ledgers 
>>>>> is already “throttled” by that thread count defaulting to 2.
>>>> 
>>>> Yes, the offloader thread count defaults to 2, but it does not 
>>>> effectively limit traffic. If the read rate from BK is very high, it still 
>>>> leads to high CPU/memory/network usage.
>>>> 
>>>> Thanks,
>>>> Tao Jiuming
>>>> 
>>>>> On Nov 2, 2022, at 1:43 AM, Dave Fisher <w...@apache.org> wrote:
>>>>> 
>>>>> One alternative would be to throttle offload in the write path instead of 
>>>>> adding additional logic to the read path in managed ledgers.
>>>>> 
>>>>> One simple way to do this is to limit how many threads can write 
>>>>> offloaded ledgers. This is the same way that reading of offloaded ledgers 
>>>>> is already “throttled” by that thread count defaulting to 2.
>>>>> 
>>>>> Regards,
>>>>> Dave
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On Nov 1, 2022, at 10:27 AM, Jiuming Tao <jm...@streamnative.io.invalid> 
>>>>>> wrote:
>>>>>> 
>>>>>> Hi pulsar community,
>>>>>> 
>>>>>> I opened a PIP to discuss: PIP-211: Introduce offload throttling
>>>>>> 
>>>>>> PIP link: https://github.com/apache/pulsar/issues/18004
>>>>>> 
>>>>>> Thanks,
>>>>>> Tao Jiuming
>>>>> 
>>>> 
>>
