Hi, I've been exploring BlockManager and the stores for a while now, and I'm tempted to say that a memory-only Spark setup would be possible (except for shuffle blocks). Is this correct?
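To make the question concrete, here's a minimal sketch of what I mean by "memory-only" at the caching layer: the `StorageLevel` a caller picks decides whether BlockManager is allowed to fall back to DiskStore at all (the `StorageLevel` names are the real Spark API; the session setup is just illustrative):

```scala
// Sketch: how a caller chooses between MemoryStore-only and disk fallback.
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("storage-levels-sketch")
  .getOrCreate()

val ds = spark.range(0, 1000000)

// MEMORY_ONLY never touches DiskStore; evicted blocks are recomputed, not spilled.
ds.persist(StorageLevel.MEMORY_ONLY)

// By contrast, Dataset.cache() defaults to MEMORY_AND_DISK,
// which lets BlockManager write evicted blocks to DiskStore.
// ds.persist(StorageLevel.MEMORY_AND_DISK)

ds.count()
spark.stop()
```

So for cached blocks the answer seems to be in the caller's hands; my question is really about the blocks where there is no such choice.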
What about shuffle blocks? Do they have to be stored on disk (in DiskStore)?

I think broadcast variables are kept in memory first, so unless an on-disk storage level is explicitly requested (by Spark devs), there seems to be no reason Spark couldn't run in-memory only. (I was told that one of the differences between Trino/Presto and Spark SQL is that Trino keeps all processing in memory and will blow up under memory pressure, while Spark uses disk to avoid OOMEs.)

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski