Hi Michael,

> The main trade off with throttling is that we might
> leave performance on the table if the cluster is not heavily utilized.

That is indeed possible. When a broker is not publishing or consuming many
messages and the offloader is being throttled, the broker's spare capacity
goes unused. But offload throttling exists as a safeguard of last resort, so
I think the trade-off is reasonable.

> It would not be a broker. It would be an offloader, and its sole task
> would be offloading data.

It is surely a good solution: the broker and the offloader would not affect
each other. But as you said, a new service would complicate the Pulsar
deployment.

> I made my tangent because I think the current design leads to
> unnecessary load on the broker

Could you please explain why the design would lead to unnecessary load on
the broker? IMO, it will not add much overhead.

> which could be misinterpreted by the
> load manager as a reason to load balance, which could interrupt
> offloading

That is indeed possible: because offloading takes longer, the chance of
being interrupted increases. But I don't think there is a better solution
for this right now.

Thanks,
Tao Jiuming

> On Nov 11, 2022, at 4:04 AM, Michael Marshall <mmarsh...@apache.org> wrote:
>
>> Yes, the PIP’s key point is to protect the broker, to prevent offloading
>> from taking too much broker resources.
>
> Throttling also protects the bookkeeper. Reads that are used to
> offload data are the lowest priority reads since they are not serving
> an actual client. Since we don't have a way to tell bookkeeper the
> requested quality of service for a read operation, throttling is a
> natural solution. The main trade off with throttling is that we might
> leave performance on the table if the cluster is not heavily utilized.
>
>> Do you mean adding a new broker type, one that is used only for
>> offload processing?
>
> It would not be a broker. It would be an offloader, and its sole task
> would be offloading data. The broker would still be the component that
> serves reads from tiered storage.
>
>> I think that the introduction of a new broker type is relatively heavyweight
>> in order to implement offload throttling.
>
> The "offloader" component is independent of the throttling feature. We
> can implement this PIP without addressing my tangent. That being said,
> I made my tangent because I think the current design leads to
> unnecessary load on the broker, which could be misinterpreted by the
> load manager as a reason to load balance, which could interrupt
> offloading. I would guess that interruption could force the offloader
> to need to restart the task of offloading a ledger, which is very
> inefficient.
>
> Thanks,
> Michael
>
> On Thu, Nov 10, 2022 at 12:50 PM Jiuming Tao
> <jm...@streamnative.io.invalid> wrote:
>>
>> Hi Michael,
>>
>>> This PIP is similar to autorecovery throttling. I think the feature
>>> makes sense for the same reasons that throttling autorecovery makes
>>> sense.
>>
>> Yes, the PIP’s key point is to protect the broker, to prevent offloading
>> from taking too much broker resources.
>>
>>> Tangentially, can we decouple writes to tiered storage from the broker
>>> hosting the topic being offloaded?
>>
>> Do you mean adding a new broker type, one that is used only for
>> offload processing?
>> I think that the introduction of a new broker type is relatively heavyweight
>> in order to implement offload throttling.
>> We can do it in a simpler way.
>>
>> Thanks,
>> Tao Jiuming
>>
>>> On Nov 8, 2022, at 8:08 AM, Michael Marshall <mmarsh...@apache.org> wrote:
>>>
>>> This PIP is similar to autorecovery throttling. I think the feature
>>> makes sense for the same reasons that throttling autorecovery makes
>>> sense.
>>>
>>> Tangentially, can we decouple writes to tiered storage from the broker
>>> hosting the topic being offloaded? An independent service could write
>>> to tiered storage without impacting the broker and could easily scale
>>> with the work.
>>> The primary complication for the service would be
>>> figuring out which ledgers to offload. Perhaps the managed ledger
>>> could "offer" ledgers up that need to be offloaded, and the new
>>> service would only need to consume those events.
>>>
>>> Although, a new service would complicate the pulsar deployment.
>>>
>>> Thanks,
>>> Michael
>>>
>>> On Mon, Nov 7, 2022 at 10:30 AM Jiuming Tao
>>> <jm...@streamnative.io.invalid> wrote:
>>>>
>>>>> One alternative would be to throttle offload in the write path instead of
>>>>> adding additional logic to the read path in managed ledgers.
>>>>
>>>> This is really a feasible method.
>>>> But we need to make changes in FileSystem and BlobStore offloaders, even
>>>> custom offloaders. I think this is not universal.
>>>>
>>>>> One simple way to do this is to limit how many threads can write
>>>>> offloaded ledgers. This is the same way that reading of offloaded ledgers
>>>>> are already “throttled” by that thread count defaulting to 2.
>>>>
>>>> Yes, the offloader thread count defaults to 2, but it does not
>>>> effectively limit traffic. If the reading rate of BK is very fast, it also
>>>> leads to high CPU/Memory/Network usage.
>>>>
>>>> Thanks,
>>>> Tao Jiuming
>>>>
>>>>> On Nov 2, 2022, at 1:43 AM, Dave Fisher <w...@apache.org> wrote:
>>>>>
>>>>> One alternative would be to throttle offload in the write path instead of
>>>>> adding additional logic to the read path in managed ledgers.
>>>>>
>>>>> One simple way to do this is to limit how many threads can write
>>>>> offloaded ledgers. This is the same way that reading of offloaded ledgers
>>>>> are already “throttled” by that thread count defaulting to 2.
>>>>>
>>>>> Regards,
>>>>> Dave
>>>>>
>>>>>> On Nov 1, 2022, at 10:27 AM, Jiuming Tao <jm...@streamnative.io.invalid>
>>>>>> wrote:
>>>>>>
>>>>>> Hi pulsar community,
>>>>>>
>>>>>> I opened a PIP to discuss: PIP-211: Introduce offload throttling
>>>>>>
>>>>>> PIP link: https://github.com/apache/pulsar/issues/18004
>>>>>>
>>>>>> Thanks,
>>>>>> Tao Jiuming
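
A note on the thread-count point raised in the thread: capping the number of
offloader threads bounds concurrency, not bytes per second, so a fast
BookKeeper read can still drive high CPU, memory, and network usage. A
byte-rate limit on the read path addresses that directly. The following is a
minimal Java sketch of such a limit as a token bucket. The class name
`OffloadThrottle`, the method `tryAcquire`, and the injected clock are
hypothetical names chosen for illustration; this is not the PIP-211
implementation and not a Pulsar API.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical token-bucket limiter for offload reads (illustration only).
 * The bucket holds at most one second of budget, so very large rates
 * (above roughly 9e9 bytes/s) would overflow the refill arithmetic.
 */
final class OffloadThrottle {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private final long bytesPerSecond;
    private final LongSupplier nanoClock; // injected so the sketch is testable
    private long availableBytes;
    private long lastRefillNanos;

    OffloadThrottle(long bytesPerSecond, LongSupplier nanoClock) {
        this.bytesPerSecond = bytesPerSecond;
        this.nanoClock = nanoClock;
        this.availableBytes = bytesPerSecond; // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Returns true if a read of the given size may proceed now. */
    synchronized boolean tryAcquire(long bytes) {
        refill();
        if (bytes > availableBytes) {
            return false; // caller should retry later instead of reading
        }
        availableBytes -= bytes;
        return true;
    }

    private void refill() {
        long now = nanoClock.getAsLong();
        // Cap elapsed time at one second: the bucket cannot hold more anyway.
        long elapsed = Math.min(now - lastRefillNanos, NANOS_PER_SECOND);
        long refill = elapsed * bytesPerSecond / NANOS_PER_SECOND;
        if (refill > 0) {
            availableBytes = Math.min(bytesPerSecond, availableBytes + refill);
            lastRefillNanos = now;
        }
    }
}
```

In production the clock would be `System::nanoTime`. A caller that gets
`false` would delay the next ledger read rather than drop it, which is also
where the trade-off discussed above shows up: on an idle cluster the limiter
still holds offloading to the configured rate.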