Re: Seeking Advice on Running Cassandra with Remote Disk

2025-02-26 Thread Patrick McFadin
It may help with some, but it's compaction and memtable flushes that generate the most IO. You also run the risk of not having data fully committed to disk if something bad were to happen. It might be ok for time series data that you can afford to lose, but not for mission critical things.

Re: Seeking Advice on Running Cassandra with Remote Disk

2025-02-21 Thread Long Pan
Perhaps a basic question: In the context of a remote disk, wouldn't it make even more sense to use commitlog_sync = periodic rather than batch? Since periodic decouples disk I/O from client write latency, it seems better suited for mitigating the additional overhead of remote storage.

Re: Seeking Advice on Running Cassandra with Remote Disk

2025-02-21 Thread Long Pan
Thank you all very much, Guo, Patrick and Jon! I will take a close look at the resources you are sharing.

Re: Seeking Advice on Running Cassandra with Remote Disk

2025-02-20 Thread Patrick McFadin
I'll give you the general guidance around any type of storage you pick. This even applies to local disks but it will directly apply to your question. The key to success with storage and Cassandra is sequential, concurrent IO. Most of the large IO operations are either writing and reading a large f

Re: Seeking Advice on Running Cassandra with Remote Disk

2025-02-19 Thread guo Maxwell
See the DISCUSS thread "Merging compaction improvements to 5.0". Jon said he has worked with AWS and the EBS team directly and wrote the Best Practices for C* on EBS