> Yes, the PIP’s key point is to protect the broker, to prevent offloading from
> taking too many broker resources.

Throttling also protects the bookkeeper. Reads that are used to offload
data are the lowest priority reads since they are not serving an
actual client. Since we don't have a way to tell bookkeeper the
requested quality of service for a read operation, throttling is a
natural solution. The main trade-off with throttling is that we might
leave performance on the table if the cluster is not heavily utilized.

> Do you mean adding a new broker type? And this type of broker would only
> handle offload processing?

It would not be a broker. It would be an offloader, and its sole task
would be offloading data. The broker would still be the component that
serves reads from tiered storage.

> I think that the introduction of a new broker type is relatively heavyweight
> in order to implement offload throttling.

The "offloader" component is independent of the throttling feature. We
can implement this PIP without addressing my tangent.

That being said, I made my tangent because I think the current design
leads to unnecessary load on the broker, which could be misinterpreted
by the load manager as a reason to load balance, which could interrupt
offloading. I would guess that an interruption could force the offloader
to restart the task of offloading a ledger, which is very inefficient.

Thanks,
Michael

On Thu, Nov 10, 2022 at 12:50 PM Jiuming Tao <jm...@streamnative.io.invalid> wrote:
>
> Hi Michael,
>
> > This PIP is similar to autorecovery throttling. I think the feature
> > makes sense for the same reasons that throttling autorecovery makes
> > sense.
>
> Yes, the PIP’s key point is to protect the broker, to prevent offloading from
> taking too many broker resources.
>
> > Tangentially, can we decouple writes to tiered storage from the broker
> > hosting the topic being offloaded?
>
> Do you mean adding a new broker type? And this type of broker would only
> handle offload processing?
> I think that the introduction of a new broker type is relatively heavyweight
> in order to implement offload throttling.
> We can do it in a simpler way.
>
> Thanks,
> Tao Jiuming
>
> > On Nov 8, 2022, at 8:08 AM, Michael Marshall <mmarsh...@apache.org> wrote:
> >
> > This PIP is similar to autorecovery throttling. I think the feature
> > makes sense for the same reasons that throttling autorecovery makes
> > sense.
> >
> > Tangentially, can we decouple writes to tiered storage from the broker
> > hosting the topic being offloaded? An independent service could write
> > to tiered storage without impacting the broker and could easily scale
> > with the work. The primary complication for the service would be
> > figuring out which ledgers to offload. Perhaps the managed ledger
> > could "offer" ledgers up that need to be offloaded, and the new
> > service would only need to consume those events.
> >
> > Although, a new service would complicate the pulsar deployment.
> >
> > Thanks,
> > Michael
> >
> > On Mon, Nov 7, 2022 at 10:30 AM Jiuming Tao
> > <jm...@streamnative.io.invalid> wrote:
> >>
> >>> One alternative would be to throttle offload in the write path instead of
> >>> adding additional logic to the read path in managed ledgers.
> >>
> >> This is really a feasible method.
> >> But we need to make changes in FileSystem and BlobStore offloaders, even
> >> custom offloaders. I think this is not universal.
> >>
> >>> One simple way to do this is to limit how many threads can write
> >>> offloaded ledgers. This is the same way that reading of offloaded ledgers
> >>> is already “throttled” by that thread count defaulting to 2.
> >>
> >> Yes, the offloader thread count defaults to 2, but it does not
> >> effectively limit traffic. If the reading rate of BK is very fast, it also
> >> leads to high CPU/Memory/Network usage.
> >>
> >> Thanks,
> >> Tao Jiuming
> >>
> >>> On Nov 2, 2022, at 1:43 AM, Dave Fisher <w...@apache.org> wrote:
> >>>
> >>> One alternative would be to throttle offload in the write path instead of
> >>> adding additional logic to the read path in managed ledgers.
> >>>
> >>> One simple way to do this is to limit how many threads can write
> >>> offloaded ledgers. This is the same way that reading of offloaded ledgers
> >>> is already “throttled” by that thread count defaulting to 2.
> >>>
> >>> Regards,
> >>> Dave
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Nov 1, 2022, at 10:27 AM, Jiuming Tao <jm...@streamnative.io.invalid>
> >>>> wrote:
> >>>>
> >>>> Hi pulsar community,
> >>>>
> >>>> I opened a PIP to discuss: PIP-211: Introduce offload throttling
> >>>>
> >>>> PIP link: https://github.com/apache/pulsar/issues/18004
> >>>>
> >>>> Thanks,
> >>>> Tao Jiuming
> >>>
> >>
> >