Ok, so I have begun splitting ARROW-7001 into smaller tasks that
eventually create an AsyncScanner. The plan...
ARROW-12286 and ARROW-12287 are minor utilities that could have been
split out anyway.
ARROW-12288 creates a `Scanner` interface and cleans up the existing
implementation somewhat. This
I would also lean in the direction of progress to get user feedback
sooner: if our test suite passes stably, then it is probably okay to
merge, and if it's possible (without great hardship) to have a
fallback to the non-async version (so there's a workaround if there
end up being show-stopping bugs
1) Most of the committed changes have been off the main path. The
only exception is the streaming CSV reader. Assuming ARROW-12208 is
merged (it is close), a stable path would be to revert most of
ARROW-12161 and change the legacy scanner to simply wrap each call to
the streaming CSV reader with R
Three thoughts:
1. Given that lots of prerequisite patches have already merged, and we've
seen some instability as a result of those, I don't think it's obviously
true that holding ARROW-7001 out of 4.0 is lower risk. It could be that the
intermediate state we're in now is higher risk. What do you
Hi Weston,
Objective note:
I'm just a user, but I want to add that so far the Arrow releases have
been of pretty good quality, which means you are making good calls.
Personal opinion:
There were several annoying bugs, where one would have to change a
parameter between Parquet V1/V2, threaded / non-threaded
Hey Weston,
First, thanks for all your work in getting these changes this far. I think it's
also been a valuable experience in working with async code, and hopefully the
problems we've run into so far will help inform further work, including with
the query engine.
If you're not comfortable mergi
I have been working for the last few months on ARROW-7001 [0], which
enables nested parallelism by making dataset scanning asynchronous
(previously announced here [1] and discussed here [2]). In
addition to enabling nested parallelism, this also allows for parallel
readahead, which gives signifi