On Tue, Apr 15, 2025 at 5:44 AM Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Apr 10, 2025 at 11:15 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> > The new streaming BHS isn't just issuing probabilistic hints about
> > future access obtained from a second iterator.  It has just one shared
> > iterator connected up to the workers' ReadStreams.  Each worker pulls
> > a disjoint set of blocks out of its stream, possibly running a bunch
> > of IOs in the background as required.
>
> It feels to me like the problem here is that the shared iterator is
> connected to unshared read-streams.  If you make a shared read-stream
> object and connect the shared iterator to that instead, does that
> solve this whole problem, or is there more to it?
More or less, yeah: just put the whole ReadStream object in shared memory, pin an LWLock on it, and call it a parallel-aware or shared ReadStream.  But how do you make the locking not terrible?  My "work stealing" brain dump was imagining a way to achieve the same net effect, except NOT have to acquire an exclusive lock for every buffer you pull out of the stream.  I was speculating that we could achieve zero locking for most of the stream without any cache line ping pong, but that a cunning read barrier scheme could detect when you've been flipped into a slower coordination mode by another backend and need to turn on some locking and fight over the last handful of buffers.

And I was also observing that if you can figure out how to make it general and reusable enough, we have more unsolved problems like this in unrelated parallel query code not even involving streams.  It's a slightly more approachable subset of the old "data buffered in other workers" problem, as I think you called it once.