Hey Dmitri,

Both William's original email and the linked doc have a section on the proposal you linked. See 'Relationship to the "Asynchronous & Reliable Tasks" Proposal'.
As for moving away from a REST API in favor of driving things directly from persistence, there's a lot to discuss here. Bear in mind that the design goes into detail about only one proposed "TaskExecutor" implementation; another TaskExecutor could perhaps work exactly as you describe. This implementation proposes to be driven by a REST API because a lot of interesting future work can be built on top of that API, in particular table maintenance actions like compaction. See the "Future Work" section of the doc for some examples.

--EM

On Mon, Jun 23, 2025 at 2:31 PM Dmitri Bourlatchkov <di...@apache.org> wrote:

> Hi All,
>
> A previous proposal by Robert [1] from May 9 appears to be related. I think
> we should consider both at the same time, possibly as alternatives, but
> perhaps also sharing / reusing their respective ideas.
>
> A few notes after a quick review:
>
> * Separate scaling for task executors seems reasonable at first glance, but
> it adds deployment complexity. If we go with this approach, I believe it
> would be worth making this deployment strategy optional. In other words let
> admin users decide whether they want to have extra nodes dedicated to
> specific tasks or whether they are ok with having uniform nodes.
>
> * I'm not sure a separate rich REST API for submitting tasks is really
> necessary. Proper synchronization among multiple nodes will
> probably require roundtrips to Persistence anyway, so task submission could
> probably be done via Persistence.
>
> [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
>
> Thanks,
> Dmitri.
>
>
> On Mon, Jun 23, 2025 at 3:12 PM William Hyun <will...@apache.org> wrote:
>
> > Hello Polaris Community,
> >
> > I would like to share my proposal for a new service, the Polaris
> > Delegation Service, and to share the design document for discussion
> > and feedback. The Delegation Service is intended to optionally be
> > deployed alongside Polaris to handle the execution of certain
> > long-running tasks.
> >
> > 1. Motivation
> > The Polaris Catalog is optimized for low-latency metadata operations.
> > However, certain tasks such as purging data files for dropped tables
> > are resource-intensive and can impact its core performance. The
> > motivation for this new service is to decouple these I/O-heavy
> > background tasks from the main catalog, ensuring it remains highly
> > responsive while allowing the task execution workload to be managed
> > and scaled independently.
> >
> > 2. Proposal
> > We propose an optional, independent Delegation Service responsible for
> > executing these offloaded operations.
> > The MVP will focus on synchronously handling the data file deletion
> > process for DROP TABLE WITH PURGE commands.
> >
> > 3. Relationship to the "Asynchronous & Reliable Tasks" Proposal
> > This proposal is designed to be highly synergistic with the existing
> > "Asynchronous & Reliable Tasks" proposal.
> >
> > The Asynchronous Task proposal describes a general internal framework
> > for reliably scheduling and managing the lifecycle of any task within
> > Polaris. On the other hand, this proposal defines a specific, external
> > worker service optimized for executing a particular class of I/O-heavy
> > tasks.
> >
> > The Delegation Service does not alter the core Polaris task schema.
> > This allows it to seamlessly act as a specialized "backend" worker
> > that can execute tasks scheduled and managed by the more advanced
> > Asynchronous Task Framework, which would serve as the reliable
> > "frontend." This relationship is explored further in section 10.2 of
> > the document.
> >
> > Please find the detailed design document here for review:
> > - https://docs.google.com/document/d/1AhR-cZ6WW6M-z8v53txOfcWvkDXvS-0xcMe3zjLMLj8/edit?usp=sharing
> >
> > Best Regards,
> > William