I feel like that calls for an anti-pattern -> success blog post Luca 🤣

On Tue, Jul 20, 2021 at 9:17 AM Luca Rondanini <luca.rondan...@gmail.com> wrote:

> Thanks Sean,
>
> I'm switching to G1 in order to gain some time while refactoring. I should
> be able to go down to 4 tables! Yes, the original design was that poor.
>
> Thanks again
>
> On Tue, Jul 20, 2021 at 6:41 AM Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
>
>> Each table in the cluster will have a memtable. This is why you do not
>> want to fracture the memory into 900+ slices. The rule of thumb I have
>> followed is to stay in the low hundreds (maybe 200) tables for the whole
>> cluster. I would be requiring the hard refactoring (or moving tables to
>> different clusters) immediately, since you really need to reduce by at
>> least 700 tables. You are seeing the memory impacts.
>>
>> In addition, in my experience, CMS is much harder to tune. G1GC works
>> well in my use cases without much tuning (or Java-guru level knowledge).
>> However, I don't think that you will be able to engineer around the 900+
>> tables, no matter which GC you use.
>>
>> Sean Durity – Staff Systems Engineer, Cassandra
>>
>> *From:* Luca Rondanini <luca.rondan...@gmail.com>
>> *Sent:* Monday, July 19, 2021 11:34 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* [EXTERNAL] R/W timeouts VS number of tables in keyspace
>>
>> Hi all,
>>
>> I have a keyspace with almost 900 tables.
>>
>> Lately I started receiving lots of w/r timeouts (eg
>> com.datastax.driver.core.exceptions.Read/WriteTimeoutException: Cassandra
>> timeout during write query at consistency LOCAL_ONE (1 replica were
>> required but only 0 acknowledged the write)).
>>
>> *I'm even experiencing nodes crashing.*
>>
>> In the logs I get many warnings like:
>>
>> WARN [Service Thread]....GCInspector.java:282 - ConcurrentMarkSweep GC in 4025ms. CMS Old Gen: 2141569800 -> 2116170568; Par Eden Space: 167772160 -> 0; Par Survivor Space: 20971520 -> 0
>>
>> WARN [GossipTasks:1].....FailureDetector.java:288 - Not marking nodes down due to local pause of 5038005208 > 5000000000
>>
>> I know 900 tables is a design error for C* but before a super painful
>> refactoring I'd like to rule out any configuration problem. Any suggestion?
>>
>> Thanks a lot,
>>
>> Luca

--
Scott Hirleman
scott.hirle...@gmail.com
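
For anyone reading the two warnings quoted in this thread, here is a small sketch of how the numbers can be pulled apart. It is an illustration only and was not posted by anyone in the thread: the helper name parse_gc_line is hypothetical, and treating the FailureDetector figures as nanoseconds is an assumption (the 5000000000 threshold would then correspond to a 5-second limit).

```python
import re

# Values copied from the warnings quoted in the thread.
GC_LINE = (
    "ConcurrentMarkSweep GC in 4025ms. CMS Old Gen: 2141569800 -> 2116170568; "
    "Par Eden Space: 167772160 -> 0; Par Survivor Space: 20971520 -> 0"
)
LOCAL_PAUSE = 5038005208       # assumed to be nanoseconds
PAUSE_THRESHOLD = 5000000000   # assumed to be nanoseconds


def parse_gc_line(line):
    """Hypothetical helper: return (pause_ms, {pool: (before_bytes, after_bytes)})."""
    pause_ms = int(re.search(r"GC in (\d+)ms", line).group(1))
    pools = {
        name.strip(): (int(before), int(after))
        for name, before, after in re.findall(r"([A-Za-z ]+): (\d+) -> (\d+)", line)
    }
    return pause_ms, pools


pause_ms, pools = parse_gc_line(GC_LINE)
old_before, old_after = pools["CMS Old Gen"]
freed_bytes = old_before - old_after

print(f"CMS pause: {pause_ms} ms")
print(f"Old gen before: {old_before / 2**20:.0f} MiB, freed: {freed_bytes / 2**20:.1f} MiB")
print(f"Share of old gen reclaimed: {freed_bytes / old_before:.2%}")
print(f"Local pause: {LOCAL_PAUSE / 1e9:.3f} s (threshold {PAUSE_THRESHOLD / 1e9:.0f} s)")
```

Read this way, a pause of about 4 seconds reclaimed only around 24 MiB out of roughly 2 GiB of old generation, which fits Sean's remark that the memory impact of 900+ tables is already visible.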