Hi, Stefania Thank you for your advice, again!!
Could I ask you for another question? > Each sstable has one or more random access readers (one per segment for example) and FileCacheService is a cache for such readers Does this mean RAR(s) must be created and added to the cache (FileCacheService) whenever a SSTable is opened even if in case of compaction? I think random access read doesn't need to read the data from a SSTable in case of compaction because the SSTable had sorted by their keys, then only need sequential read to merge. If I have something wrong, I'm glad if you could correct. Regards, Satoshi On Thu, Mar 17, 2016 at 5:19 PM, Stefania Alborghetti < stefania.alborghe...@datastax.com> wrote: > Q1. Readers are created as needed, there is no fixed number. For example, > we may have 2 threads scanning sstables at the same time due to 2 different > CQL SELECT statements. > > Q2. There is no correlation between sstable size and JVM HEAP size. We > don't load entire sstables in memory. > > Q3. It's difficult to say what caused the invalidation messages, basically > anything that removed sstables from memory, such as dropping the table, > snapshots, compactions, streaming, there may me other operations I'm not > familiar with. > > Q4. Correct, these are temporary files. Once again, in 3.0 things are > different and the temporary files have been replaced by transaction logs > (CASSANDRA-7066). > > > On Thu, Mar 17, 2016 at 3:40 PM, Satoshi Hikida <sato...@imagine-orb.com> > wrote: > >> Sorry there is a mistake in my previous post. I would correct it. >> >> In Q3, I mentioned there are a lot of invalidating messages in the >> debug.log. It is true but cassandra configurations were wrong. In that >> case, the cassandra.yaml configurations are as follows: >> >> - cassandra.yaml >> - compaction_throughput_mb_per_sec: 0 (not 8 or default) >> - concurrent_compactors: 1 >> - sstable_preemptive_open_interval_in_mb: 0 (not 8 or default) >> - memtable_flush_writers: 1 >> >> And More precisely, in that case, Cassandra keep on outputting >> invalidating messages for a while(a few hours). However CPU usage is almost >> 0.0% in top command like below. >> >> $ top -bu cassandra -n 1 >> ... >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ >> COMMAND >> 2631 cassand+ 20 0 0.250t 1.969g 703916 S 0.0 57.0 8459:35 >> java >> >> I want to know what was actually happening at that time. >> >> >> Regards, >> Satoshi >> >> >> On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sahik...@gmail.com> >> wrote: >> >>> Thank you for your very useful advice! >>> >>> >>> Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've >>> understood what does these logs mean. But I have more a few questions. So I >>> would very much appreciate If I get some explanations about these questions. >>> >>> * Q1. >>> In my understand, when open a SSTable, a lot of >>> RandomAccessReaders(RARs) are created. A number of RARs is equal to a >>> number of segments of SSTable. Is a number of segments(=RARs) equal to >>> follows? >>> >>> a number of segments = size of SSTable / size of segments >>> >>> * Q2. >>> What is happen if the Cassandra open a SSTable file which bigger than >>> JVM heap (or memory)? >>> >>> * Q3. >>> In my case, there are a lot of invalidating messages for the same >>> SSTable file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in >>> my previous post). In some cases, there are more than 600 invalidating >>> messages for the same file and these messages logged for a few hours. Would >>> that closing a big SSTable is the cause? >>> >>> * Q4. >>> I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data >>> directories. Are these files temporary in compaction process? >>> >>> >>> Here is my experimental configurations. >>> >>> - Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU) >>> - Cassandra version: 2.2.5 >>> - inserted data size: about 100GB >>> - cassandra-env.sh: default >>> - cassandra.yaml >>> - compaction_throughput_mb_per_sec: 8 (or default) >>> - concurrent_compactors: 1 >>> - sstable_preemptive_open_interval_in_mb: 25 (or default) >>> - memtable_flush_writers: 1 >>> >>> >>> Regards, >>> Satoshi >>> >>> >>> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti < >>> stefania.alborghe...@datastax.com> wrote: >>> >>>> Each sstable has one or more random access readers (one per segment for >>>> example) and FileCacheService is a cache for such readers. When an sstable >>>> is closed, the cache is invalidated. If no single reader of an sstable is >>>> used for at least 512 milliseconds, all readers are evicted. If the sstable >>>> is opened again, new reader(s) will be created and added to the cache >>>> again. >>>> >>>> FileCacheService was removed in cassandra 3.0 in favour of a pool of >>>> page-aligned buffers, and sharing the NIO file channels amongst the readers >>>> of an sstable, refer to CASSANDRA-8897 >>>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and >>>> CASSANDRA-8893 <https://issues.apache.org/jira/browse/CASSANDRA-8893> >>>> for more details. >>>> >>>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sahik...@gmail.com> >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I have been working on some experiments for Cassandra and found some >>>>> log messages as follows in debug.log. >>>>> I am not sure what it exactly is, so I would appreciate if someone >>>>> gives me some explanations about it. >>>>> >>>>> In my verification, a Cassandra node runs as a stand-alone server on >>>>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about >>>>> 100GB >>>>> data size) to a table from a client application (which runs on another >>>>> instance separated from Cassandra node). After insertion, Cassandra >>>>> continues it's I/O activities for (probably) compaction and keep logging >>>>> the messages as follows: >>>>> >>>>> --- >>>>> ... >>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170 >>>>> FileCacheService.java:102 - Evicting cold readers for >>>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db >>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780 >>>>> FileCacheService.java:177 - Invalidating cache for >>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db >>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899 >>>>> FileCacheService.java:177 - Invalidating cache for >>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db >>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187 >>>>> FileCacheService.java:177 - Invalidating cache for >>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db >>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308 >>>>> FileCacheService.java:177 - Invalidating cache for >>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db >>>>> ... >>>>> --- >>>>> >>>>> I guess these messages are related to the compaction process and >>>>> FileCacheService was invalidating cache which associated with a SSTable >>>>> file. But I'm not sure what it does actually mean. When the cache is >>>>> invalidated? And What happens is after cache invalidation? >>>>> >>>>> >>>>> Regards, >>>>> Satoshi >>>>> >>>> >>>> >>>> >>>> -- >>>> >>>> >>>> [image: datastax_logo.png] <http://www.datastax.com/> >>>> >>>> Stefania Alborghetti >>>> >>>> Apache Cassandra Software Engineer >>>> >>>> |+852 6114 9265| stefania.alborghe...@datastax.com >>>> >>>> >>>> >>>> >>> >> > > > -- > > > [image: datastax_logo.png] <http://www.datastax.com/> > > Stefania Alborghetti > > Apache Cassandra Software Engineer > > |+852 6114 9265| stefania.alborghe...@datastax.com > > > >