"Splitting one report into multiple rows is uncomfortable."
WHY? Reading from N disks is way faster than reading from 1 disk.
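To make that concrete, here is a minimal sketch (plain Java; the names `bucketKey`, `allBucketKeys`, and `rowsPerBucket` are illustrative, not from the thread) of splitting one report across several bucket rows: each bucket gets its own composite row key, so different buckets land on different nodes/disks and can be read in parallel.

```java
import java.util.ArrayList;
import java.util.List;

public class ReportBuckets {
    // Deterministically assign a report row to a bucket, and build the
    // composite row key "reportId:bucket". The bucket size caps how large
    // any single Cassandra row can grow.
    static String bucketKey(String reportId, long rowIndex, long rowsPerBucket) {
        long bucket = rowIndex / rowsPerBucket;
        return reportId + ":" + bucket;
    }

    // All bucket row keys a reader must fetch for a report of totalRows rows.
    static List<String> allBucketKeys(String reportId, long totalRows, long rowsPerBucket) {
        List<String> keys = new ArrayList<>();
        long buckets = (totalRows + rowsPerBucket - 1) / rowsPerBucket; // ceil division
        for (long b = 0; b < buckets; b++) {
            keys.add(reportId + ":" + b);
        }
        return keys;
    }
}
```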
I think in terms of PlayOrm, so let me explain the model you can use by
thinking in objects first:

Report {
    String uniqueId;
    String reportName; // may be indexable and query
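The snippet above is cut off; as a hedged, plain-Java sketch of the object-first model it describes (only `uniqueId` and `reportName` come from the message itself; the `ReportRow` class and its fields are assumptions based on the rest of the thread):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ReportModel {
    // One Report aggregates many ReportRows; only uniqueId and reportName
    // appear in the original message, everything else is illustrative.
    static class Report {
        String uniqueId;
        String reportName;          // may be indexable and queryable
        List<ReportRow> rows = new ArrayList<>();
    }

    static class ReportRow {
        long index;                 // position of this row within the report
        Map<String, String> values; // e.g. {"value1": "...", "value2": "..."}
    }
}
```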
Thanks a lot for helping. We came to the same decision: cluster one
report across multiple Cassandra rows (sorted buckets of report rows) and
manage the clusters on the client side.
On Tue, Sep 25, 2012 at 5:28 AM, aaron morton wrote:
> What exactly is the problem with big rows?
During compaction the row will be passed through a slower two-pass process,
which adds to IO pressure.
Counting big rows requires that the entire row be read.
Repairing big rows requires that the entire row be repaired.
I generally avoid rows abo
On Sun, Sep 23, 2012 at 10:41 PM, aaron morton wrote:
> /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E
> "[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ;
> print foo "MB" }' | sort -nr | head -n 50
> Is it bad signal?
Sorry, I do not know what this is outputting.
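For what it's worth, the pipeline quoted above pulls the byte counts out of "Compacting large ..." log lines, converts them to MB, and prints the 50 largest. A rough Java equivalent (the log format is assumed to contain "N bytes", as Cassandra's large-row compaction messages do):

```java
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LargeRowSizes {
    private static final Pattern BYTES = Pattern.compile("(\\d+) bytes");

    // Extract byte counts from "Compacting large ..." log lines and return
    // them in MB, largest first -- the same thing the grep/awk/sort pipeline
    // above prints.
    static List<Double> largestRowsMb(List<String> logLines) {
        return logLines.stream()
                .filter(l -> l.contains("Compacting large"))
                .map(BYTES::matcher)
                .filter(Matcher::find)          // keep lines with a byte count
                .map(m -> Long.parseLong(m.group(1)) / 1024.0 / 1024.0)
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
    }
}
```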
And some stuff from log:
/var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E
"[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ;
print foo "MB" }' | sort -nr | head -n 50
3821.55MB
3337.85MB
1221.64MB
1128.67MB
930.666MB
916.4MB
861.114MB
843.325MB
711.813MB
Found one more interesting fact.
As I can see in cfstats, compacted row maximum size: 386857368!
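For scale, that cfstats maximum (in bytes) converts to roughly 369 MB, squarely in the "big row" range discussed above:

```java
public class MaxRowSize {
    // Convert the cfstats "compacted row maximum size" (bytes) to MB.
    static double bytesToMb(long bytes) {
        return bytes / 1024.0 / 1024.0;
    }

    public static void main(String[] args) {
        System.out.printf("%.1f MB%n", bytesToMb(386857368L)); // about 368.9 MB
    }
}
```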
On Fri, Sep 21, 2012 at 12:50 PM, Denis Gabaydulin wrote:
Reports is a SuperColumnFamily.
Each report has a unique identifier (report_id), which is the key of the
SuperColumnFamily, and each report is saved in a separate row.
A report consists of report rows (the count may vary between 1 and 50,
but most are small).
Each report row is saved in a separate super column. Hec
I'm not 100% sure that I understand your data model and read patterns
correctly, but it sounds like you have large supercolumns and are requesting
some of the subcolumns from individual supercolumns. If that's the case, the
issue is that Cassandra must deserialize the entire supercolumn in memory
when
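A toy model of the difference being described (this is not the real Cassandra API, just an illustration of the cost): with supercolumns, asking for a few subcolumns still forces the whole supercolumn to be deserialized, while a slice over plain sorted columns touches only the requested range.

```java
import java.util.Map;
import java.util.NavigableMap;

public class SliceVsSuperColumn {
    // Supercolumn read: everything inside the supercolumn is deserialized,
    // no matter how few subcolumns were actually requested.
    static int columnsTouchedSuperColumn(Map<String, String> superColumn) {
        return superColumn.size();
    }

    // Plain sorted columns: a slice query only touches the requested range.
    static int columnsTouchedSlice(NavigableMap<String, String> row,
                                   String from, String to) {
        return row.subMap(from, true, to, true).size();
    }
}
```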
p.s. Cassandra 1.1.4
On Thu, Sep 20, 2012 at 3:27 PM, Denis Gabaydulin wrote:
Hi, all!
We have a cluster with virtual 7 nodes (disk storage is connected to
nodes with iSCSI). The storage schema is:
Reports:{
1:{
1:{"value1":"some val", "value2":"some val"},
2:{"value1":"some val", "value2":"some val"}
...
},
2:{
1:{"value1":"some