On Tue, Apr 15, 2025 at 5:44 AM Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Apr 10, 2025 at 11:15 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> > The new streaming BHS isn't just issuing probabilistic hints about
> > future access obtained from a second iterator.  It has just one shared
> > iterator connected up to the workers' ReadStreams.  Each worker pulls
> > a disjoint set of blocks out of its stream, possibly running a bunch
> > of IOs in the background as required.
>
> It feels to me like the problem here is that the shared iterator is
> connected to unshared read-streams. If you make a shared read-stream
> object and connect the shared iterator to that instead, does that
> solve this whole problem, or is there more to it?

More or less, yeah, just put the whole ReadStream object in shared
memory, pin an LWLock on it and call it a parallel-aware or shared
ReadStream.  But how do you make the locking not terrible?

My "work stealing" brain dump was imagining a way to achieve the same
net effect, except without having to acquire an exclusive lock for
every buffer you pull out of the stream.  I was speculating that we
could achieve zero locking for most of the stream without any cache
line ping-pong, with a cunning read barrier scheme to detect when
you've been flipped into a slower coordination mode by another backend
and need to turn on some locking and fight over the last handful of
buffers.  And I was also observing that if you can figure out how to
make it general and reusable enough, we have more unsolved problems
like this in unrelated parallel query code not even involving streams.
It's a smaller, more approachable subset of the old "data buffered in
other workers" problem, as I think you called it once.


Reply via email to