[ https://issues.apache.org/jira/browse/KAFKA-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax resolved KAFKA-9148.
------------------------------------
    Resolution: Won't Fix

Don't think it's a good idea to fork RocksDB to begin with... Also, this ticket is very old. Closing for now.

> Consider forking RocksDB for Streams
> -------------------------------------
>
>                 Key: KAFKA-9148
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9148
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> We recently upgraded our RocksDB dependency to 5.18 for its memory-management
> abilities (namely the WriteBufferManager, see KAFKA-8215). Unfortunately,
> someone from Flink recently discovered a ~8% [performance
> regression|https://github.com/facebook/rocksdb/issues/5774] that exists in
> all versions 5.18+ (up through the current newest version, 6.2.2). Flink was
> able to react to this by downgrading to 5.17 and [picking the
> WriteBufferManager|https://github.com/dataArtisans/frocksdb/pull/4] to their
> fork (fRocksDB).
> Due to this and other reasons enumerated below, we should consider also
> forking our own RocksDB for Streams.
> Pros:
> * We can avoid passing sudden breaking changes on to our users, such as the
> removal of methods with no deprecation period (see discussion on KAFKA-8897)
> * We can pick whichever version has the best performance for our needs, and
> pick over any new features, metrics, etc. that we need to use rather than
> being forced to upgrade (and breaking user code, introducing regressions, etc.)
> * Support for some architectures does not exist in all RocksDB versions,
> making Streams completely unusable for some users until we can upgrade the
> RocksDB dependency to one that supports their specific case. It's worth
> noting that we've only had [one
> user|https://issues.apache.org/jira/browse/KAFKA-9225] hit this so far (that
> we know of), and some workarounds have been discussed on the ticket.
> * The Java API seems to be a very low priority to the RocksDB folks.
> ** They leave out critical functionality, features, and configuration
> options that have been in the C++ API for a very long time
> ** Those that do make it over often have random gaps in the API, such as
> setters but no getters (see [rocksdb PR
> #5186|https://github.com/facebook/rocksdb/pull/5186])
> ** Others are poorly designed and require too many trips across the JNI,
> making otherwise incredibly useful features prohibitively expensive.
> *** [Custom
> Comparator|https://github.com/facebook/rocksdb/issues/538#issuecomment-83145980]:
> a custom comparator could significantly improve the performance of session
> windows. This is trivial to do, but given the high performance cost of
> crossing the JNI, it is currently only practical to use a C++ comparator
> *** [Prefix Seek|https://github.com/facebook/rocksdb/issues/6004]: not
> currently used by Streams but a commonly requested feature, and may also
> allow improved range queries
> ** Even when an external contributor develops a solution for poorly
> performing Java functionality and helpfully tries to contribute their patch
> back to RocksDB, it gets ignored by the RocksDB people ([rocksdb PR
> #2283|https://github.com/facebook/rocksdb/pull/2283])
> Cons:
> * More work (not to be trivialized; the truth is we don't and can't know how
> much extra work this will ultimately be)
> Given that we rarely upgrade the RocksDB dependency, use only some fraction of
> its features, and would need or want to make only minimal changes ourselves,
> it seems like we could actually get away with very little extra work by
> forking RocksDB. Note that as of this writing the frocksdb repo has only
> needed to open 5 PRs on top of the actual RocksDB (two of them trivial). Of
> course, the LOE to maintain this will only grow over time, so we should think
> carefully about whether and when to start taking on this potential burden.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)