With your GPU example, that seems like a bit of a stretch. Can you flesh out the example a little more and get into the details of how it would actually work (I understand it's made up anyway)?
Sounds like something other than this task is spinning up a GPU resource? And this is just a "wait then run" operator? If something else is controlling the resource spin-up, then why does this task need a pool at all? It's not controlling the increase in load.

No example is needed for use_pool_when_deferred=True, because I think that's how it should behave universally.

On Sun, Oct 20, 2024 at 10:05 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Yeah... Off-by-one... Good eye - I lied too :) (noticed it after I sent the
> message. I wish email had the ability to correct typos.)
>
> 2) -> yes, we agree, but to clarify a bit - we need it at OPERATOR level,
> not TASK level. The difference comes from who defines it: it should be the
> Operator's Author, not the DAG Author. I.e. we should be able to define
> "use_pool_when_deferred" (working name) when you define the operator, not
> when you create the operator as a task in a DAG. So basically IMHO the
> operator should be able to internally set this property of the
> BaseOperator, but it's not necessary to expose it via the `__init__` method
> of the actual CustomDeferrableOperator(BaseOperator). We still CAN expose
> it via __init__, but I'd not say it's desired.
>
> Examples:
>
> 1) RunMyGPUFineTuningOperator. Pool = num shared GPUs. The operator does:
> a) wait in deferrable mode for a MODEL to appear, b) upload the model and
> fine-tune it (non-deferrable, uses GPU). "use_pool_when_deferred" = False
> 2) UpdateNewSalesforceUsersOperator. Pool = num Salesforce connections.
> (Protect the Salesforce API from being overloaded - our licence allows
> only 10 parallel connections.) The operator does: a) check if new users
> are defined (by polling the Salesforce API) - deferred, b) update the
> users with new fields via the Salesforce API.
> "use_pool_when_deferred" = True
>
> Enough.
>
> J.
>
> On Sun, Oct 20, 2024 at 4:45 PM Daniel Standish
> <daniel.stand...@astronomer.io.invalid> wrote:
>
> > So yeah, hopefully we all agree that if we keep it, we should move it to
> > task.
> >
> > I guess we can think of this thread as two items:
> >
> >    1. if we keep the ability to have tasks not occupy a pool slot,
> >    shouldn't it be configured at task level? I think so.
> >    2. but should we keep the ability to have tasks not be counted in the
> >    pool at all?
> >    3. if tasks are to stay in pools when deferred, ideally they should
> >    do so continuously (e.g. including when in between worker and
> >    triggerer)
> >
> > Ok, I lied, three items. But 3 is more like a reminder that there is
> > this bad behavior :)
> >
> > Anyway, let's move on and focus on number 2, whether we should provide
> > users a configuration option to make tasks "drop out" of the pool when
> > deferred.
> >
> > After reading your message, Jarek, I did not come away with an
> > understanding of a practical use case for having the task vacate the
> > pool slot when deferred. Can you offer an example or two?
> >
> > On Sun, Oct 20, 2024 at 7:29 AM Daniel Standish <
> > daniel.stand...@astronomer.io> wrote:
> >
> > > Totally agree that if we *must* make it configurable, then at task is
> > > the right place. Will read the rest of it later :)
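To make the operator-level idea quoted above concrete, here is a minimal sketch of how Jarek's two examples might look if the flag were pinned by the operator author rather than passed per task. It is only an illustration: "use_pool_when_deferred" is the working name from this thread, not a real BaseOperator field, and `model_uri` is a made-up parameter.

```python
# A minimal sketch, not real Airflow behavior: "use_pool_when_deferred" is the
# working name from this thread and does not exist on BaseOperator today;
# model_uri is an illustrative parameter. The point is that the operator
# AUTHOR fixes the value on the class itself, rather than exposing it through
# __init__ for the DAG author to set per task.
from airflow.models.baseoperator import BaseOperator


class RunMyGPUFineTuningOperator(BaseOperator):
    """a) defer until a model appears, b) upload it and fine-tune on a shared GPU."""

    # Hypothetical operator-level flag: while deferred (just waiting for the
    # model) the task is not using a GPU, so it could release its slot in the
    # GPU-sized pool.
    use_pool_when_deferred = False

    def __init__(self, *, model_uri: str, **kwargs):
        super().__init__(**kwargs)
        self.model_uri = model_uri

    def execute(self, context):
        # a) deferred wait for the model to land (no GPU in use)
        # b) on resume: upload the model and run fine-tuning (GPU in use,
        #    pool slot must be occupied)
        ...


class UpdateNewSalesforceUsersOperator(BaseOperator):
    """Polls Salesforce for new users (deferred), then updates them via the API."""

    # Here both the deferred polling and the follow-up update consume one of
    # the ten licensed Salesforce connections, so the slot should be held for
    # the whole run.
    use_pool_when_deferred = True
```

In this shape the DAG author only chooses the pool; whether the slot is held during deferral would be baked into the operator by its author.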