pepijnve commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152976180


##########
content/blog/2025-06-15-cancellation.md:
##########
@@ -0,0 +1,353 @@
+---
+layout: post
+title: Query Cancellation
+date: 2025-06-27
+author: Pepijn Van Eeckhoudt
+categories: [features]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+
+## The Challenge of Cancelling Long-Running Queries
+
+Have you ever tried to cancel a query that just wouldn't stop?
+In this post, we'll take a look at why that can happen in DataFusion and what the community did to resolve the problem in depth.
+
+### Understanding Rust's Async Model
+
+To really understand the cancellation problem you need to be somewhat familiar with Rust's asynchronous programming model.
+This is a bit different than what you might be used to from other ecosystems.
+Let's go over the basics again as a refresher.
+If you're familiar with the ins and outs of `Future` and `async` you can skip this section.
+
+#### Futures Are Inert
+
+Rust's asynchronous programming model is built around the `Future<T>` trait.
+In contrast to, for instance, Javascript's `Promise` or Java's `Future` a Rust `Future` does not necessarily represent an actively running asynchronous job.
+Instead, a `Future<T>` represents a lazy calculation that only makes progress when explicitly polled.
+If nothing tells a `Future` to try and make progress explicitly, it is [an inert object](https://doc.rust-lang.org/std/future/trait.Future.html#runtime-characteristics).
+
+You ask a `Future<T>`to advance its calculation as much as possible by calling the `poll` method.
+The `Future` responds with either:
+- `Poll::Pending` if it needs to wait for something (like I/O) before it can continue
+- `Poll::Ready<T>` when it has completed and produced a value
+
+When a `Future` returns `Pending`, it saves its internal state so it can pick up where it left off the next time you poll it.
+This state management is what makes Rust's `Future`s memory-efficient and composable.
+It also needs to set up the necessary signaling so that the caller gets notified when it should try to call `poll` again.
+This avoids having to call `poll` in a busy-waiting loop.
+
+Rust's `async` keyword provides syntactic sugar over this model.
+When you write an `async` function or block, the compiler transforms it into an implementation of the `Future` trait for you.
+Since all the state management is compiler generated and hidden from sight, async code tends to be more readable while maintaining the same underlying mechanics.
+
+The `await` keyword complements this by letting you pause execution until a `Future` completes.
+When you `.await` a `Future`, you're essentially telling the compiler to poll that `Future` until it's ready before program execution continues with the statement after the await.


Review Comment:
`.await` is sort of equivalent to `ready!(future.poll)` along with a state transition on ready, right?


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
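
On the review question above: the following is a rough sketch, not what the compiler actually emits, of how an `.await` can be read as `ready!` applied to the inner future's `poll`, plus a state transition once the inner future is ready. The names `YieldOnce`, `AddOne`, `NoopWake`, `leaf`, and `poll_to_completion` are hypothetical and exist only for this illustration; only the `std` items are real. The sketch also assumes every type involved is `Unpin`, which sidesteps the pinning details a real `async` block has to deal with.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{ready, Context, Poll, Wake, Waker};

/// Hypothetical leaf future: returns `Pending` once, then `Ready(value)`.
struct YieldOnce {
    yielded: bool,
    value: u32,
}

impl Future for YieldOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.yielded {
            Poll::Ready(self.value)
        } else {
            self.yielded = true;
            // Signal that polling again right away will make progress.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hand-written stand-in for `async { leaf().await + 1 }`.
enum AddOne {
    Awaiting(YieldOnce),
    Done,
}

impl Future for AddOne {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // fine here because `AddOne` is `Unpin`
        match this {
            AddOne::Awaiting(inner) => {
                // `ready!` returns `Poll::Pending` from *this* function
                // whenever the inner future is still pending.
                let value = ready!(Pin::new(inner).poll(cx));
                // The state transition that happens once the inner future is ready.
                *this = AddOne::Done;
                Poll::Ready(value + 1)
            }
            AddOne::Done => panic!("`AddOne` polled after completion"),
        }
    }
}

/// Waker that does nothing; enough for a loop that simply polls again.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` in a loop and reports the output and the number of polls.
/// This busy loop is for demonstration only; a real executor waits for a wake-up.
fn poll_to_completion<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(output) = Pin::new(&mut fut).poll(&mut cx) {
            return (output, polls);
        }
    }
}

fn leaf() -> YieldOnce {
    YieldOnce { yielded: false, value: 41 }
}

fn main() {
    let manual = poll_to_completion(AddOne::Awaiting(leaf()));
    let sugared = poll_to_completion(Box::pin(async { leaf().await + 1 }));
    // Both versions need the same number of polls to reach the same value.
    assert_eq!(manual, sugared);
    println!("{:?}", manual);
}
```

Both versions return `Pending` on the first poll and complete on the second, which matches the reading in the comment: the `.await` point propagates `Pending` to the caller and only moves on to the next state when the awaited future reports `Ready`.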