Perhaps a basic question: in the context of a remote disk, wouldn't it make even more sense to use commitlog_sync: periodic rather than batch? With periodic, writes are acknowledged without waiting for the commitlog fsync, which decouples disk I/O from client write latency and seems better suited to absorbing the extra latency of remote storage.
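For reference, a minimal sketch of the two cassandra.yaml settings being compared (option names as in Cassandra 4.1+; older releases use the _in_ms suffixed names, and the 10000ms period shown is the default, not a recommendation):

```yaml
# periodic: acknowledge writes immediately; the commitlog is fsynced in the
# background every commitlog_sync_period. A crash can lose up to one period
# of acknowledged-but-unsynced writes on that node (replication mitigates this).
commitlog_sync: periodic
commitlog_sync_period: 10000ms   # pre-4.1: commitlog_sync_period_in_ms: 10000

# batch: group concurrent writes and fsync the commitlog before acknowledging,
# adding the disk's sync latency to every client write.
# commitlog_sync: batch
```

The trade-off is durability on a single node versus write latency, which is why periodic is often discussed for higher-latency remote volumes.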
On Fri, Feb 21, 2025 at 8:25 PM Long Pan <panlong...@gmail.com> wrote:

> Thank you all very much, Guo, Patrick and Jon! I will take a close look
> at the resources you are sharing.
>
> On Thu, Feb 20, 2025 at 10:06 AM Patrick McFadin <pmcfa...@gmail.com> wrote:
>
>> I'll give you the general guidance around any type of storage you pick.
>> This even applies to local disks, but it will directly apply to your
>> question.
>>
>> The key to success with storage and Cassandra is sequential, concurrent
>> IO. Most of the large IO operations are either writing or reading a
>> large file from disk; sometimes, in the harder case to manage, both at
>> the same time. Storage systems that bias toward reads or writes will
>> create an imbalance that can lead to issues. And worth emphasizing:
>> these are sequential reads and writes; IOPS are mostly irrelevant. The
>> second aspect to manage is latency. Latency from disk directly
>> correlates to query performance.
>>
>> With respect to these requirements, remote storage tends to have more
>> issues. NFS, for example, has far too much latency and too little
>> concurrency. Just don't use it. The best thing you can do when looking
>> at choices is run some simple tests. Another Jon Haddad resource, but a
>> great one: https://www.youtube.com/watch?v=dPpEORxoMRU You don't even
>> need to run Cassandra in the test. Just do some IO testing and verify
>> that it can read and write in a balanced manner, observe the latency,
>> and watch for any IOWait that creeps up.
>>
>> If you have a specific technology combination, just ask here.
>> Collectively we have probably seen it all.
>>
>> Patrick
>>
>> On Wed, Feb 19, 2025 at 10:27 PM Long Pan <panlong...@gmail.com> wrote:
>> >
>> > Hi Cassandra Community,
>> >
>> > I'm exploring the feasibility of running Cassandra with remote
>> > storage, primarily block storage (e.g., AWS EBS, OCI Block Volume,
>> > Google Persistent Disk) and possibly even file storage (e.g., NFS,
>> > EFS, FSx). While local SSDs are the typical recommendation for
>> > optimal performance, I'd like to understand if anyone has experience
>> > or insights on using remote disks in production.
>> >
>> > Specifically, I'm looking for guidance on:
>> >
>> > Feasibility – Has anyone successfully run Cassandra with remote
>> > storage? If so, what use cases worked well?
>> > Major Downsides & Caveats – Are there any known performance
>> > bottlenecks or consistency issues?
>> > Configuration Tuning – Are there any special settings (e.g.,
>> > compaction, memtable flush thresholds, disk I/O tuning) that can
>> > help mitigate potential drawbacks?
>> > Monitoring & Alerting – What are the key metrics and failure
>> > scenarios to watch out for when using remote storage?
>> >
>> > I'd appreciate any insights, war stories, or best practices from
>> > those who have experimented with or deployed Cassandra on remote
>> > storage.
>> >
>> > Thanks,
>> > Long Pan
>>
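On Patrick's point about testing the disk before running Cassandra: in practice that test is usually done with a tool like fio (direct IO, mixed read/write, real concurrency), but as a rough illustration of what "sequential write throughput plus per-write latency" means, here is a toy Python probe. The path and sizes are placeholders; buffered Python writes mostly hit the page cache, so treat this as a sketch of the idea, not a substitute for fio:

```python
import os
import time

def sequential_write_probe(path, total_mb=64, block_kb=1024):
    """Sequentially write total_mb of zeros in block_kb chunks, then fsync.

    Returns (throughput in MB/s, worst single-write latency in ms).
    """
    block = b"\0" * (block_kb * 1024)
    n_blocks = (total_mb * 1024) // block_kb
    worst = 0.0
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            t0 = time.perf_counter()
            f.write(block)
            # Track the worst per-block latency: spikes here are the kind of
            # stalls that show up as IOWait and slow queries.
            worst = max(worst, time.perf_counter() - t0)
        f.flush()
        os.fsync(f.fileno())  # include the sync cost, as the commitlog would
    elapsed = time.perf_counter() - start
    return total_mb / elapsed, worst * 1000.0

if __name__ == "__main__":
    # Point this at a file on the remote volume under test.
    probe_path = "/tmp/io_probe.bin"
    mbps, worst_ms = sequential_write_probe(probe_path, total_mb=16)
    print(f"sequential write: {mbps:.1f} MB/s, worst block latency {worst_ms:.2f} ms")
    os.remove(probe_path)
```

A real evaluation should also exercise concurrent sequential reads against the writes (fio's rw=rw with rwmixread covers this), since that mixed pattern is exactly the compaction-plus-read-path case Patrick calls the harder one to manage.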