It's a bit different for users leveraging LevelDB: since it requires
opt-in, anyone still using it has chosen to do so deliberately, so they are
likely to retain the config during the upgrade.
From the initial post, there is a claim that we deprecated LevelDB in
Apache Spark 4.0.0. Shall I ask what
I would like to provide some new information:
1. Since Spark 3.4.0 [SPARK-42277], RocksDB has been the default option
for `spark.history.store.hybridStore.diskBackend`.
- Since Spark 3.4, Spark will use RocksDB store if
`spark.history.store.hybridStore.enabled` is true. To restore the beha
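
To make the two settings quoted above concrete, here is a minimal Python sketch that checks whether a history server config still opts into LevelDB. It is only an illustration: the helper names (`parse_conf`, `pins_leveldb`) and the sample text are hypothetical, the parser handles just simple `key value` and `key=value` lines, and both the `LEVELDB` value and treating an unset `enabled` flag as false are assumptions based on my understanding of these settings, not something stated in this thread.

```python
import re

# Keys quoted in this thread.
BACKEND_KEY = "spark.history.store.hybridStore.diskBackend"
ENABLED_KEY = "spark.history.store.hybridStore.enabled"


def parse_conf(text):
    """Parse spark-defaults.conf style text into a dict (simplified)."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split key from value on the first run of whitespace and/or '='.
        parts = re.split(r"[\s=]+", line, maxsplit=1)
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def pins_leveldb(conf):
    """True when the hybrid store is enabled and explicitly set to LevelDB."""
    # Assumption: an unset 'enabled' flag means the hybrid store is off.
    enabled = conf.get(ENABLED_KEY, "false").lower() == "true"
    # Assumption: 'LEVELDB' is the value that selects the LevelDB backend.
    backend = conf.get(BACKEND_KEY, "").upper()
    return enabled and backend == "LEVELDB"


# Hypothetical history server settings, for illustration only.
sample = """
# opted in to LevelDB explicitly
spark.history.store.hybridStore.enabled      true
spark.history.store.hybridStore.diskBackend  LEVELDB
"""

if __name__ == "__main__":
    print(pins_leveldb(parse_conf(sample)))
```

A deployment that only sets the `enabled` flag and leaves the backend unset is not flagged by this check, which matches the point that LevelDB use requires an explicit opt-in.
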
Thanks for the valuable input.
I think it's more about the case where upgrading would surprise the end
users. If we simply remove LevelDB from the next release, we will be
removing this intermediate data as well and forcing them to rebuild
everything. 15 mins is probably not super long from the
I think SHS only uses LevelDB/RocksDB to store intermediate data, so supporting
re-parsing to rebuild the cache should be good enough.
I'd also like to share my experience using LevelDB/RocksDB for SHS: LevelDB
seems to have native memory leak issues, at least for the SHS use case. I need
to reboot the S
IMHO, it probably depends on how long rebuilding the store from the event
log will take. If loading the state from LevelDB and rewriting it to
RocksDB is much faster, then we may want to support this for a couple of
minor releases so that users are not forced to lose their cache. If there is no such
diff
This is indeed an issue at the moment. Personally, I haven't found a
proper way to migrate data from LevelDB to RocksDB, as their storage
structures are different. Should we wait until a reasonable migration
solution becomes available before moving forward with this?
Jungtaek Lim wrote on Wed, May 28, 2025, 15
Thanks for initiating this.
I wonder whether any component has a compatibility issue. The SS
area does not, but I don't quite remember if the history
server would be OK with this. What is the migration story for users who
had been using LevelDB? I guess it could be probably
The project "org.fusesource.leveldbjni:leveldbjni" released its last version 12
years ago, and its code repository was last updated 8 years ago. Consequently,
I believe it's challenging for us to receive ongoing maintenance and support
from this project.
On the flip side, when developers implem
Hi all,
I'd like to start a discussion about removing LevelDB support from Apache Spark.
As noted in SPARK-44223 (https://issues.apache.org/jira/browse/SPARK-44223),
LevelDB support was deprecated in Spark 4.0. It’s no longer actively
maintained or widely used, and continuing to support it brings