Re: [problem with OOM in nodes]

Denis Gabaydulin Fri, 21 Sep 2012 10:08:45 -0700

And here is some output from the log:


/var/log/cassandra$ grep "Compacting large" system.log \
    | grep -oE "[0-9]+ bytes" | cut -d " " -f 1 \
    | awk '{ foo = $1 / 1024 / 1024; print foo "MB" }' \
    | sort -nr | head -n 50
3821.55MB
3337.85MB
1221.64MB
1128.67MB
930.666MB
916.4MB
861.114MB
843.325MB
711.813MB
706.992MB
674.282MB
673.861MB
658.305MB
557.756MB
531.577MB
493.112MB
492.513MB
492.291MB
484.484MB
479.908MB
465.742MB
464.015MB
459.95MB
454.472MB
441.248MB
428.763MB
424.028MB
416.663MB
416.191MB
409.341MB
406.895MB
397.314MB
388.27MB
376.714MB
371.298MB
368.819MB
366.92MB
361.371MB
360.509MB
356.168MB
355.012MB
354.897MB
354.759MB
347.986MB
344.109MB
335.546MB
329.529MB
326.857MB
326.252MB
326.237MB

Is this a bad sign?
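
For anyone who wants to repeat this check on another node, the pipeline above can be wrapped in a small shell function. This is only a sketch: the name large_rows_mb and its two arguments are made up for illustration, and it assumes the log lines carry the size as "<number> bytes", which is what the grep above relies on.

```shell
# Hypothetical helper (not part of Cassandra): prints the N largest
# "Compacting large" sizes found in a log file, converted to MB.
# Assumption: each matching line contains the size as "<number> bytes".
large_rows_mb() {
  log_file="$1"      # path to system.log
  top_n="${2:-50}"   # how many entries to print, 50 by default
  grep "Compacting large" "$log_file" \
    | grep -oE "[0-9]+ bytes" \
    | awk '{ printf "%.2fMB\n", $1 / 1024 / 1024 }' \
    | sort -nr \
    | head -n "$top_n"
}
```

Usage would be something like `large_rows_mb /var/log/cassandra/system.log 50`. Unlike the one-liner above, it always prints two decimal places.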

On Fri, Sep 21, 2012 at 8:22 PM, Denis Gabaydulin <gaba...@gmail.com> wrote:
> Found one more interesting fact.
> cfstats shows a compacted row maximum size of 386857368 bytes!
>
> On Fri, Sep 21, 2012 at 12:50 PM, Denis Gabaydulin <gaba...@gmail.com> wrote:
>> Reports is a SuperColumnFamily.
>>
>> Each report has a unique identifier (report_id), which is the row key
>> of the SuperColumnFamily, so each report is stored in its own row.
>>
>> A report consists of report rows (anywhere between 1 and 500000,
>> but most reports are small).
>>
>> Each report row is stored in a separate super column. Hector-based code:
>>
>> superCfMutator.addInsertion(
>>   report_id,
>>   "Reports",
>>   HFactory.createSuperColumn(
>>     report_row_id,
>>     mapper.convertObject(object),
>>     columnDefinition.getTopSerializer(),
>>     columnDefinition.getSubSerializer(),
>>     inferringSerializer
>>   )
>> );
>>
>> We have two frequent operations:
>>
>> 1. count report rows by report_id (calculate number of super columns
>> in the row).
>> 2. get report rows by report_id and range predicate (get super columns
>> from the row with range predicate).
>>
>> I can't see any big super columns here :-(
>>
>> On Fri, Sep 21, 2012 at 3:10 AM, Tyler Hobbs <ty...@datastax.com> wrote:
>>> I'm not 100% sure that I understand your data model and read patterns correctly,
>>> but it sounds like you have large supercolumns and are requesting some of
>>> the subcolumns from individual super columns.  If that's the case, the issue
>>> is that Cassandra must deserialize the entire supercolumn in memory whenever
>>> you read *any* of the subcolumns.  This is one of the reasons why composite
>>> columns are recommended over supercolumns.
>>>
>>>
>>> On Thu, Sep 20, 2012 at 6:45 AM, Denis Gabaydulin <gaba...@gmail.com> wrote:
>>>>
>>>> p.s. Cassandra 1.1.4
>>>>
>>>> On Thu, Sep 20, 2012 at 3:27 PM, Denis Gabaydulin <gaba...@gmail.com>
>>>> wrote:
>>>> > Hi, all!
>>>> >
>>>> > We have a cluster of 7 virtual nodes (disk storage is connected to
>>>> > the nodes via iSCSI). The storage schema is:
>>>> >
>>>> > Reports:{
>>>> >     1:{
>>>> >         1:{"value1":"some val", "value2":"some val"},
>>>> >         2:{"value1":"some val", "value2":"some val"}
>>>> >         ...
>>>> >     },
>>>> >     2:{
>>>> >         1:{"value1":"some val", "value2":"some val"},
>>>> >         2:{"value1":"some val", "value2":"some val"}
>>>> >         ...
>>>> >     }
>>>> >     ...
>>>> > }
>>>> >
>>>> > create keyspace osmp_reports
>>>> >   with placement_strategy = 'SimpleStrategy'
>>>> >   and strategy_options = {replication_factor : 4}
>>>> >   and durable_writes = true;
>>>> >
>>>> > use osmp_reports;
>>>> >
>>>> > create column family QueryReportResult
>>>> >   with column_type = 'Super'
>>>> >   and comparator = 'BytesType'
>>>> >   and subcomparator = 'BytesType'
>>>> >   and default_validation_class = 'BytesType'
>>>> >   and key_validation_class = 'BytesType'
>>>> >   and read_repair_chance = 1.0
>>>> >   and dclocal_read_repair_chance = 0.0
>>>> >   and gc_grace = 432000
>>>> >   and min_compaction_threshold = 4
>>>> >   and max_compaction_threshold = 32
>>>> >   and replicate_on_write = true
>>>> >   and compaction_strategy =
>>>> > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>>>> >   and caching = 'KEYS_ONLY';
>>>> >
>>>> > =============================================
>>>> >
>>>> > Read/Write CL: 2
>>>> >
>>>> > Most of the reports are small, but some of them can have half a
>>>> > million rows (xml). Typical operations on this dataset are:
>>>> >
>>>> > count report rows by report_id (top level id of super column);
>>>> > get columns (report_rows) by range predicate and limit for given
>>>> > report_id.
>>>> >
>>>> > The data is written once and is never updated.
>>>> >
>>>> > From time to time a couple of nodes crash with an OOM exception. The
>>>> > heap dump shows that we have a lot of super columns in memory.
>>>> > For example, I see that one of the reports is in memory in its
>>>> > entirety. How is that possible? If we don't load the whole report,
>>>> > could Cassandra be doing this for some internal reason?
>>>> >
>>>> > What should we do to avoid OOMs?
>>>
>>>
>>>
>>>
>>> --
>>> Tyler Hobbs
>>> DataStax
>>>
