First, I am +1 on the design proposed in this PIP.

> Could you please explain why the design will lead to unnecessary load
> on the broker?
> IMO, I don't think it will bring huge expenses.

I did not mean the throttling design proposed by this PIP was expensive.
I meant that running the offloader within the broker puts load on the
broker, and because that load could be decoupled, I called it
unnecessary.

Thanks,
Michael

On Fri, Nov 11, 2022 at 8:10 AM Jiuming Tao
<jm...@streamnative.io.invalid> wrote:
>
> Hi Michael,
>
> > The main trade off with throttling is that we might
> > leave performance on the table if the cluster is not heavily utilized.
>
> It is indeed possible. When a broker doesn’t publish/consume many messages
> and the offloaders are being throttled,
> the broker's spare capacity cannot be fully used.
> But offloader throttling exists as a bottom-line solution, so it's reasonable.
>
> > It would not be a broker. It would be an offloader, and its sole task
> > would be offloading data.
>
> It is surely a good solution; the broker and offloader wouldn’t affect each other.
> But like you said, a new service would complicate the pulsar deployment.
>
> > I made my tangent because I think the current design leads to
> > unnecessary load on the broker
>
> Could you please explain why the design will lead to unnecessary load
> on the broker?
> IMO, I don't think it will bring huge expenses.
>
> > which could be misinterpreted by the
> > load manager as a reason to load balance, which could interrupt
> > offloading
>
> It is indeed possible: because offloading takes longer, the chance of being
> interrupted increases.
> But I think maybe there is no better solution for this.
>
> Thanks,
> Tao Jiuming
>
> > On Nov 11, 2022, at 4:04 AM, Michael Marshall <mmarsh...@apache.org> wrote:
> >
> >> Yes, the PIP’s key point is to protect the broker, to prevent offloading
> >> from taking too many broker resources.
> >
> > Throttling also protects the bookkeeper. Reads that are used to
> > offload data are the lowest priority reads since they are not serving
> > an actual client.
> > Since we don't have a way to tell bookkeeper the
> > requested quality of service for a read operation, throttling is a
> > natural solution. The main trade off with throttling is that we might
> > leave performance on the table if the cluster is not heavily utilized.
> >
> >> Do you mean adding a new broker type? And this type of broker is only for
> >> offload processing?
> >
> > It would not be a broker. It would be an offloader, and its sole task
> > would be offloading data. The broker would still be the component that
> > serves reads from tiered storage.
> >
> >> I think that the introduction of a new broker type is relatively
> >> heavyweight in order to implement offload throttling.
> >
> > The "offloader" component is independent of the throttling feature. We
> > can implement this PIP without addressing my tangent. That being said,
> > I made my tangent because I think the current design leads to
> > unnecessary load on the broker, which could be misinterpreted by the
> > load manager as a reason to load balance, which could interrupt
> > offloading. I would guess that interruption could force the offloader
> > to need to restart the task of offloading a ledger, which is very
> > inefficient.
> >
> > Thanks,
> > Michael
> >
> > On Thu, Nov 10, 2022 at 12:50 PM Jiuming Tao
> > <jm...@streamnative.io.invalid> wrote:
> >>
> >> Hi Michael,
> >>
> >>> This PIP is similar to autorecovery throttling. I think the feature
> >>> makes sense for the same reasons that throttling autorecovery makes
> >>> sense.
> >>
> >> Yes, the PIP’s key point is to protect the broker, to prevent offloading
> >> from taking too many broker resources.
> >>
> >>> Tangentially, can we decouple writes to tiered storage from the broker
> >>> hosting the topic being offloaded?
> >>
> >> Do you mean adding a new broker type? And this type of broker is only for
> >> offload processing?
> >> I think that the introduction of a new broker type is relatively
> >> heavyweight in order to implement offload throttling.
> >> We can do it in a simpler way
> >>
> >> Thanks,
> >> Tao Jiuming
> >>
> >>> On Nov 8, 2022, at 8:08 AM, Michael Marshall <mmarsh...@apache.org> wrote:
> >>>
> >>> This PIP is similar to autorecovery throttling. I think the feature
> >>> makes sense for the same reasons that throttling autorecovery makes
> >>> sense.
> >>>
> >>> Tangentially, can we decouple writes to tiered storage from the broker
> >>> hosting the topic being offloaded? An independent service could write
> >>> to tiered storage without impacting the broker and could easily scale
> >>> with the work. The primary complication for the service would be
> >>> figuring out which ledgers to offload. Perhaps the managed ledger
> >>> could "offer" ledgers up that need to be offloaded, and the new
> >>> service would only need to consume those events.
> >>>
> >>> Although, a new service would complicate the pulsar deployment.
> >>>
> >>> Thanks,
> >>> Michael
> >>>
> >>> On Mon, Nov 7, 2022 at 10:30 AM Jiuming Tao
> >>> <jm...@streamnative.io.invalid> wrote:
> >>>>
> >>>>> One alternative would be to throttle offload in the write path instead
> >>>>> of adding additional logic to the read path in managed ledgers.
> >>>>
> >>>> This is really a feasible method.
> >>>> But we need to make changes in FileSystem and BlobStore offloaders,
> >>>> even custom offloaders. I think this is not universal.
> >>>>
> >>>>> One simple way to do this is to limit how many threads can write
> >>>>> offloaded ledgers. This is the same way that reading of offloaded
> >>>>> ledgers is already “throttled” by that thread count defaulting to 2.
> >>>>
> >>>> Yes, the offloader thread count defaults to 2, but it does not
> >>>> effectively limit traffic.
> >>>> If the reading rate of BK is very fast, it
> >>>> also leads to high CPU/Memory/Network usage
> >>>>
> >>>> Thanks,
> >>>> Tao Jiuming
> >>>>
> >>>>> On Nov 2, 2022, at 1:43 AM, Dave Fisher <w...@apache.org> wrote:
> >>>>>
> >>>>> One alternative would be to throttle offload in the write path instead
> >>>>> of adding additional logic to the read path in managed ledgers.
> >>>>>
> >>>>> One simple way to do this is to limit how many threads can write
> >>>>> offloaded ledgers. This is the same way that reading of offloaded
> >>>>> ledgers is already “throttled” by that thread count defaulting to 2.
> >>>>>
> >>>>> Regards,
> >>>>> Dave
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>>> On Nov 1, 2022, at 10:27 AM, Jiuming Tao
> >>>>>> <jm...@streamnative.io.invalid> wrote:
> >>>>>>
> >>>>>> Hi pulsar community,
> >>>>>>
> >>>>>> I opened a PIP to discuss: PIP-211: Introduce offload throttling
> >>>>>>
> >>>>>> PIP link: https://github.com/apache/pulsar/issues/18004
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Tao Jiuming
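
For readers following the thread: the point that a thread count of 2 does not bound traffic, while a rate limit on the read path does, can be illustrated with a small token-bucket sketch. This is not the PIP-211 implementation and not Pulsar or BookKeeper code; it is written in Python for brevity, and every name in it (OffloadReadLimiter, reserve, read_entries_throttled) is hypothetical. It assumes a single offload thread and that the limit is expressed in bytes per second.

```python
import time


class OffloadReadLimiter:
    """Hypothetical token bucket capping offload reads in bytes per second.

    Not thread-safe: the sketch assumes one offload thread per limiter.
    """

    def __init__(self, bytes_per_second, clock=time.monotonic):
        if bytes_per_second <= 0:
            raise ValueError("bytes_per_second must be positive")
        self.rate = float(bytes_per_second)
        self.capacity = float(bytes_per_second)  # at most one second of burst
        self.tokens = self.capacity
        self.clock = clock
        self.last_refill = clock()

    def _refill(self):
        # Add permits for the time elapsed since the last call, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now

    def reserve(self, num_bytes):
        """Spend permits for num_bytes; return seconds to wait before reading.

        A read is admitted once the balance is non-negative, and the read
        itself may push the balance negative. An entry larger than the
        bucket is therefore delayed rather than starved forever.
        """
        self._refill()
        wait = max(0.0, -self.tokens / self.rate)
        self.tokens -= num_bytes
        return wait


def read_entries_throttled(entry_sizes, limiter, sleep=time.sleep):
    """Simulate reading ledger entries of the given sizes under the limiter."""
    total = 0
    for size in entry_sizes:
        delay = limiter.reserve(size)
        if delay > 0:
            sleep(delay)
        total += size  # stand-in for the actual read from the bookie
    return total
```

With a thread cap alone, each of the two threads still reads as fast as the bookies answer; with a limiter like this in front of the reads, the long-run byte rate is bounded no matter how many threads share the work (a shared limiter would need locking, which the sketch omits).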