Sorry, there was a mistake in my previous post, which I would like to correct. In Q3, I mentioned that there are a lot of invalidating messages in debug.log. That is true, but the Cassandra configuration I reported was wrong. In that case, the cassandra.yaml settings were as follows:
- cassandra.yaml
    - compaction_throughput_mb_per_sec: 0 (not 8 or default)
    - concurrent_compactors: 1
    - sstable_preemptive_open_interval_in_mb: 0 (not 8 or default)
    - memtable_flush_writers: 1

More precisely, in that case Cassandra kept outputting invalidating messages for a while (a few hours). However, CPU usage was almost 0.0% according to the top command, as shown below.

$ top -bu cassandra -n 1
...
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35 java

I want to know what was actually happening at that time.

Regards,
Satoshi

On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sahik...@gmail.com> wrote:

> Thank you for your very useful advice!
>
> Indeed, I'm using Cassandra 2.2.5, not 3.x. I now basically understand
> what these logs mean, but I have a few more questions. I would very much
> appreciate some explanation of them.
>
> * Q1.
> As I understand it, when an SSTable is opened, a lot of
> RandomAccessReaders (RARs) are created. The number of RARs is equal to
> the number of segments of the SSTable. Is the number of segments (= RARs)
> given by the following?
>
> number of segments = size of SSTable / size of a segment
>
> * Q2.
> What happens if Cassandra opens an SSTable file that is bigger than the
> JVM heap (or memory)?
>
> * Q3.
> In my case, there are a lot of invalidating messages for the same SSTable
> file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in my
> previous post). In some cases, there are more than 600 invalidating
> messages for the same file, and these messages are logged for a few
> hours. Could closing a big SSTable be the cause?
>
> * Q4.
> I saw "tmplink-xxx" and "tmp-xxx" files in the logs and also in the data
> directories. Are these temporary files created by the compaction process?
>
>
> Here is my experimental configuration.
>
> - Cassandra node: an AWS EC2 instance (t2.medium, 4 GB RAM, 2 vCPU)
> - Cassandra version: 2.2.5
> - inserted data size: about 100GB
> - cassandra-env.sh: default
> - cassandra.yaml
>     - compaction_throughput_mb_per_sec: 8 (or default)
>     - concurrent_compactors: 1
>     - sstable_preemptive_open_interval_in_mb: 25 (or default)
>     - memtable_flush_writers: 1
>
>
> Regards,
> Satoshi
>
>
> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
> stefania.alborghe...@datastax.com> wrote:
>
>> Each sstable has one or more random access readers (one per segment for
>> example) and FileCacheService is a cache for such readers. When an sstable
>> is closed, the cache is invalidated. If no single reader of an sstable is
>> used for at least 512 milliseconds, all readers are evicted. If the sstable
>> is opened again, new reader(s) will be created and added to the cache again.
>>
>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>> page-aligned buffers, and sharing the NIO file channels amongst the readers
>> of an sstable, refer to CASSANDRA-8897
>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and CASSANDRA-8893
>> <https://issues.apache.org/jira/browse/CASSANDRA-8893> for more details.
>>
>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sahik...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have been running some experiments with Cassandra and found the
>>> following log messages in debug.log.
>>> I am not sure exactly what they mean, so I would appreciate it if
>>> someone could explain them.
>>>
>>> In my test, a Cassandra node runs as a stand-alone server on an
>>> Amazon EC2 instance (t2.medium). I insert 1 billion records (about 100GB
>>> of data) into a table from a client application (which runs on another
>>> instance, separate from the Cassandra node). After the insertion,
>>> Cassandra continues its I/O activity (probably for compaction) and keeps
>>> logging messages like the following:
>>>
>>> ---
>>> ...
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170 FileCacheService.java:102 - Evicting cold readers for /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780 FileCacheService.java:177 - Invalidating cache for /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899 FileCacheService.java:177 - Invalidating cache for /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187 FileCacheService.java:177 - Invalidating cache for /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308 FileCacheService.java:177 - Invalidating cache for /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> ...
>>> ---
>>>
>>> I guess these messages are related to the compaction process and that
>>> FileCacheService was invalidating the cache associated with an SSTable
>>> file, but I'm not sure what this actually means. When is the cache
>>> invalidated? And what happens after cache invalidation?
>>>
>>>
>>> Regards,
>>> Satoshi
>>>
>>
>> --
>>
>> Stefania Alborghetti
>>
>> Apache Cassandra Software Engineer
>>
>> |+852 6114 9265| stefania.alborghe...@datastax.com
>>
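
For anyone who wants to check how often each SSTable file is hit by these messages (as in Q3), here is a minimal Python sketch. It assumes each debug.log entry is on a single line and contains the text "Invalidating cache for <path>", as in the excerpt quoted in this thread. The function name count_invalidations and the script name are hypothetical, made up for this illustration, and are not part of Cassandra.

```python
import re
import sys
from collections import Counter

# Matches the message text quoted in this thread. The path is assumed to be
# the run of non-whitespace characters that follows the message.
INVALIDATE_RE = re.compile(r"Invalidating cache for (\S+)")


def count_invalidations(lines):
    """Return a Counter mapping SSTable path -> number of invalidation messages."""
    counts = Counter()
    for line in lines:
        match = INVALIDATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Only runs when a log path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        # Print the ten files with the most invalidation messages.
        for path, n in count_invalidations(log_file).most_common(10):
            print(n, path)
```

It could be run as "python count_invalidations.py <path to debug.log>". The output only shows which files are invalidated repeatedly; it does not explain why.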