Hi, Stefania

Thank you again for your advice!

Could I ask you another question?

> Each sstable has one or more random access readers (one per segment for
> example) and FileCacheService is a cache for such readers
Does this mean that RARs must be created and added to the cache
(FileCacheService) whenever an SSTable is opened, even for compaction? I
think compaction does not need random access reads, because the rows in each
SSTable are already sorted by key, so a sequential read of each input should
be enough to merge them.
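
To illustrate what I mean by "sequential read to merge", here is a minimal
sketch in plain Python. It is not Cassandra code: the function name
merge_sorted_runs, the (key, timestamp, value) row layout, and the
"newest timestamp wins" rule are all my own assumptions for illustration.
Each input is only read front to back, never by random access.

```python
import heapq


def merge_sorted_runs(*runs):
    """Merge key-sorted runs of (key, timestamp, value) rows.

    Hypothetical illustration only: every run is consumed sequentially,
    and for duplicate keys the row with the newest timestamp is kept.
    """
    merged = []
    # heapq.merge streams the inputs in key order without loading them fully.
    for key, ts, value in heapq.merge(*runs, key=lambda row: row[0]):
        if merged and merged[-1][0] == key:
            # Same key seen again: keep the newer version.
            if ts > merged[-1][1]:
                merged[-1] = (key, ts, value)
        else:
            merged.append((key, ts, value))
    return merged


# Two small "sstables", each already sorted by key (assumed sample data).
sstable_a = [("k1", 1, "a"), ("k3", 1, "c")]
sstable_b = [("k1", 2, "A"), ("k2", 1, "b")]
result = merge_sorted_runs(sstable_a, sstable_b)
print(result)
```

In this sketch the output list is built in memory only to keep the example
short; a real merge would write the result out sequentially as well.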

If I have misunderstood something, I would be glad if you could correct me.


Regards,
Satoshi


On Thu, Mar 17, 2016 at 5:19 PM, Stefania Alborghetti <
stefania.alborghe...@datastax.com> wrote:

> Q1. Readers are created as needed, there is no fixed number. For example,
> we may have 2 threads scanning sstables at the same time due to 2 different
> CQL SELECT statements.
>
> Q2. There is no correlation between sstable size and JVM HEAP size. We
> don't load entire sstables in memory.
>
> Q3. It's difficult to say what caused the invalidation messages. Basically,
> anything that removes sstables from memory can cause them, such as dropping
> the table, snapshots, compactions, or streaming; there may be other
> operations I'm not familiar with.
>
> Q4. Correct, these are temporary files. Once again, in 3.0 things are
> different and the temporary files have been replaced by transaction logs
> (CASSANDRA-7066).
>
>
> On Thu, Mar 17, 2016 at 3:40 PM, Satoshi Hikida <sato...@imagine-orb.com>
> wrote:
>
>> Sorry, there was a mistake in my previous post. Let me correct it.
>>
>> In Q3, I mentioned that there were a lot of invalidation messages in
>> debug.log. That is true, but the Cassandra configuration I listed was wrong.
>> In that case, the cassandra.yaml settings were as follows:
>>
>> - cassandra.yaml
>>   - compaction_throughput_mb_per_sec: 0 (not 8 or default)
>>   - concurrent_compactors: 1
>>   - sstable_preemptive_open_interval_in_mb: 0 (not 8 or default)
>>   - memtable_flush_writers: 1
>>
>> More precisely, in that case Cassandra kept logging invalidation
>> messages for a while (a few hours). However, CPU usage was almost 0.0% in
>> the top command, as shown below.
>>
>>     $ top -bu cassandra -n 1
>>     ...
>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>>     2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35 java
>>
>> I want to know what was actually happening at that time.
>>
>>
>> Regards,
>> Satoshi
>>
>>
>> On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sahik...@gmail.com>
>> wrote:
>>
>>> Thank you for your very useful advice!
>>>
>>>
>>> Indeed, I'm using Cassandra 2.2.5, not 3.x. I now basically understand
>>> what these logs mean, but I have a few more questions. I would very much
>>> appreciate some explanation of them.
>>>
>>> * Q1.
>>> As I understand it, when an SSTable is opened, a lot of
>>> RandomAccessReaders (RARs) are created, and the number of RARs equals the
>>> number of segments of the SSTable. Is the number of segments (= RARs)
>>> given by the following?
>>>
>>> number of segments = size of SSTable / size of segment
>>>
>>> * Q2.
>>> What happens if Cassandra opens an SSTable file that is bigger than the
>>> JVM heap (or memory)?
>>>
>>> * Q3.
>>> In my case, there are a lot of invalidation messages for the same
>>> SSTable file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in
>>> my previous post). In some cases, there are more than 600 invalidation
>>> messages for the same file, logged over a few hours. Could closing a big
>>> SSTable be the cause?
>>>
>>> * Q4.
>>> I saw "tmplink-xxx" and "tmp-xxx" files in the logs and also in the data
>>> directories. Are these temporary files used by the compaction process?
>>>
>>>
>>> Here is my experimental configuration.
>>>
>>> - Cassandra node: an AWS EC2 instance (t2.medium, 4 GB RAM, 2 vCPUs)
>>> - Cassandra version: 2.2.5
>>> - inserted data size: about 100GB
>>> - cassandra-env.sh: default
>>> - cassandra.yaml
>>>   - compaction_throughput_mb_per_sec: 8 (or default)
>>>   - concurrent_compactors: 1
>>>   - sstable_preemptive_open_interval_in_mb: 25 (or default)
>>>   - memtable_flush_writers: 1
>>>
>>>
>>> Regards,
>>> Satoshi
>>>
>>>
>>> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
>>> stefania.alborghe...@datastax.com> wrote:
>>>
>>>> Each sstable has one or more random access readers (one per segment for
>>>> example) and FileCacheService is a cache for such readers. When an sstable
>>>> is closed, the cache is invalidated. If no single reader of an sstable is
>>>> used for at least 512 milliseconds, all readers are evicted. If the sstable
>>>> is opened again, new reader(s) will be created and added to the cache
>>>> again.
>>>>
>>>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>>>> page-aligned buffers, and sharing the NIO file channels amongst the readers
>>>> of an sstable, refer to CASSANDRA-8897
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and
>>>> CASSANDRA-8893 <https://issues.apache.org/jira/browse/CASSANDRA-8893>
>>>> for more details.
>>>>
>>>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sahik...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have been running some experiments with Cassandra and found the
>>>>> following log messages in debug.log.
>>>>> I am not sure exactly what they mean, so I would appreciate it if
>>>>> someone could explain them.
>>>>>
>>>>> In my test, a Cassandra node runs as a stand-alone server on an
>>>>> Amazon EC2 instance (t2.medium). I inserted 1 billion records (about
>>>>> 100 GB of data) into a table from a client application (which runs on
>>>>> another instance, separate from the Cassandra node). After the insertion,
>>>>> Cassandra continues its I/O activity (probably for compaction) and keeps
>>>>> logging messages like the following:
>>>>>
>>>>> ---
>>>>> ...
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>>>>> FileCacheService.java:102 - Evicting cold readers for
>>>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> ...
>>>>> ---
>>>>>
>>>>> I guess these messages are related to the compaction process and that
>>>>> FileCacheService was invalidating the cache associated with an SSTable
>>>>> file, but I'm not sure what that actually means. When is the cache
>>>>> invalidated? And what happens after cache invalidation?
>>>>>
>>>>>
>>>>> Regards,
>>>>> Satoshi
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> [image: datastax_logo.png] <http://www.datastax.com/>
>>>>
>>>> Stefania Alborghetti
>>>>
>>>> Apache Cassandra Software Engineer
>>>>
>>>> |+852 6114 9265| stefania.alborghe...@datastax.com
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
> --
>
>
> [image: datastax_logo.png] <http://www.datastax.com/>
>
> Stefania Alborghetti
>
> Apache Cassandra Software Engineer
>
> |+852 6114 9265| stefania.alborghe...@datastax.com
>
>
>
>
