Re: Why "select count("*) from .." hangs ?

2014-03-26 Thread shahab
Thanks for the hints. I got a better picture of how to deal with "count"
queries.


On Tue, Mar 25, 2014 at 7:01 PM, Robert Coli  wrote:

> On Tue, Mar 25, 2014 at 8:36 AM, shahab  wrote:
>
>> But after iteration 8 (i.e. inserting 150 sensor data records), the
>> "select count(*) ..." throws a time-out exception and doesn't work anymore.
>> I even tried to execute "select count(*)..." using the DataStax DevCenter GUI,
>> but I got the same result.
>>
>
> All operations in Cassandra are subject to (various) timeouts, which by
> default are on the scale of single-digit seconds.
>
> If you attempt to do an operation (such as aggregates across large numbers
> of large objects) which cannot complete in this time, this is a strong
> indication that either your overall approach is inappropriate or, at the very
> least, that your buckets are too large.
>
> =Rob
>
>
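For reference, the timeouts Rob mentions are configured per operation type in
cassandra.yaml. A sketch of the 1.2-era knobs follows; the option names are
real, but the values shown are only illustrative and defaults vary by version:

read_request_timeout_in_ms: 10000     # single-row reads
range_request_timeout_in_ms: 10000    # range scans, which back "select count(*)"
write_request_timeout_in_ms: 10000    # writes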


Thrift -> CQL

2014-03-26 Thread rubbish me
Hi all,
 
We have been using Cassandra for more than 3 years and now we have a cluster in
production, still running on 1.1.x, that contains dynamic-columned column
families, with Hector as the client.

We are trying to upgrade to the latest 1.2.x and are considering the DataStax
client in order to utilise some of its round robin / failover goodness.
 
We bumped into a few walls, however, when converting our Thrift-based client
code to CQL.  We read through the docs + DataStax dev blog entries like: this
and this.  However, they mostly focus on reading from an existing dynamic
cf, running some ALTER TABLE statements, and reading it again.
Very little about how to insert / update.
 
So here are my questions:
-  Is there any way to do insert / update at all on a good old wide cf using
CQL?   Based on what we read back out, we have tried:

INSERT INTO cf_name(key, column1, value) VALUES ('key1',
'columnName1','columnValue2')

But we ended up with "Unknown identifier column1"
 
-  About read -  One of our cfs is defined with a secondary index.  So the
schema looks something like:
 
create column family cf_with_index
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and column_metadata = [
{column_name : 'indexed_column',
validation_class : UTF8Type,
index_name : 'column_idx',
index_type : 0}];
 
When reading from the cli, we see all columns and data as expected:
---
RowKey: rowkey1
=> (name=c1, value=v1, timestamp=xxx, ttl=604800)
=> (name=c2, value=v2, timestamp=xxx, ttl=604800)
=> (name=c3, value=v3, timestamp=xxx, ttl=604800)
=> (name=indexed_column, value=value1, timestamp=xxx, ttl=604800)
---
 
However, when we query via CQL, we only get the indexed column:
SELECT * FROM cf_with_index WHERE key = 'rowkey1';
 
key   | indexed_column
---+
rowkey1   | value1
 
Any way to get the rest?
 
-  Obtaining TTL and writetime on these wide rows  - we tried:
SELECT key, column1, value, writetime(value), ttl(value) FROM cf LIMIT 1;
It works, but it's a bit clumsy.  Is there a better way?
 
-  We can live with Thrift.  Is there any way / plan to let us execute
Thrift calls with the DataStax driver?  Hector seems inactive these days.
 
Many thanks in advance,
 
A



Re: Thrift -> CQL

2014-03-26 Thread Peter Lin
Hector has round robin and failover. Is there a particular kind of failover
you're looking for?

By default, Hector will try another node if the first node it connects to is
down. It's been that way since the 1.x client, if I'm not mistaken.


On Wed, Mar 26, 2014 at 9:41 AM, rubbish me wrote:

> Hi all,
>
>
>
> We have been using Cassandra for more than 3 years and now we have a
> cluster in production still running on 1.1.x contains dynamic-columned
> column-families - with hector as client.
>
>
> We are trying to update to the latest 1.2.x and considering to use
> datastax client in order to utilise some of its round robin / failover
> goodness.
>
>
>
> We bumped on to a few walls however when converting our thrift based
> client code to CQL.  We read through the docs + datastax dev blog entires
> like: this  and 
> this.
> However they are mostly focus on reading from an existing dynamic cf, run
> some alter table statements, and reading it again.
>
> Very little about how to insert / update.
>
>
>
> So there comes my questions:
>
> -  *Is there any way to do insert / update at all on a good old wide cf
> using CQL?   Based on what we read back out, we have tried:*
>
>
> INSERT INTO cf_name(key, column1, value) VALUES ('key1',
> 'columnName1','columnValue2')
>
>
> But we ended up with "Unknown identifier column1"
>
>
>
> -  *About read -  One of our cf is defined with a secondary index.  So
> the schema looks something like:*
>
>
>
> create column family cf_with_index
>
>   with column_type = 'Standard'
>
>   and comparator = 'UTF8Type'
>
>   and default_validation_class = 'UTF8Type'
>
>   and key_validation_class = 'UTF8Type'
>
>   and column_metadata = [
>
> {column_name : 'indexed_column',
>
> validation_class : UTF8Type,
>
> index_name : 'column_idx',
>
> index_type : 0}];
>
>
>
> When reading from cli, we will see all columns, data as you expected:
>
> --
>
> ---
>
> RowKey: rowkey1
>
> => (name=c1, value=v1, timestamp=xxx, ttl=604800)
>
> => (name=c2, value=v2, timestamp=xxx, ttl=604800)
>
> => (name=c3, value=v3, timestamp=xxx, ttl=604800)
>
> => (name=indexed_column, value=value1, timestamp=xxx, ttl=604800)
>
> ---
>
>
>
> However when we Query via CQL, we only get the indexed column:
>
> SELECT * FROM cf_with_index WHERE key = 'rowkey1';
>
>
>
> key   | indexed_column
>
> ---+
>
> rowkey1   | value1
>
>
>
> Any way to get the rest?
>
>
>
> -  *Obtaining TTL and writetime on these wide rows  - we tried:*
>
> *SELECT key, column1, value, writetime(value), ttl(value) FROM cf LIMIT 1;*
>
> *It works, but a bit clumsy.  Is there a better way?*
>
>
>
> -  *We can live with thrift.  Is there any way / plan to let us to
> execute thrift with datastax driver?  Hector seems not active anymore.*
>
>
>
> Many thanks in advanced,
>
>
>
> A
>
>


Re: Thrift -> CQL

2014-03-26 Thread Sylvain Lebresne
>
> -  *Is there any way to do insert / update at all on a good old wide cf
> using CQL?   Based on what we read back out, we have tried:*
>
>
> INSERT INTO cf_name(key, column1, value) VALUES ('key1',
> 'columnName1','columnValue2')
>
>
> But we ended up with "Unknown identifier column1"
>

What does cqlsh give you if you try to do 'DESC cf_name'?
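For what it's worth, on 1.2 a purely dynamic cf (no column metadata) typically
comes back from DESC as a COMPACT STORAGE table along these lines; this is
only a sketch reusing the names from the thread, and the exact output varies
by version:

CREATE TABLE cf_name (
  key text,
  column1 text,
  value text,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;

Against a table of that shape the INSERT above works as written, so an
"Unknown identifier column1" error usually means the cf has column metadata
declared, which changes the set of columns CQL3 exposes.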



>
>
> -  *About read -  One of our cf is defined with a secondary index.  So
> the schema looks something like:*
>
>
>
> create column family cf_with_index
>
>   with column_type = 'Standard'
>
>   and comparator = 'UTF8Type'
>
>   and default_validation_class = 'UTF8Type'
>
>   and key_validation_class = 'UTF8Type'
>
>   and column_metadata = [
>
> {column_name : 'indexed_column',
>
> validation_class : UTF8Type,
>
> index_name : 'column_idx',
>
> index_type : 0}];
>
>
>
> When reading from cli, we will see all columns, data as you expected:
>
> --
>
> ---
>
> RowKey: rowkey1
>
> => (name=c1, value=v1, timestamp=xxx, ttl=604800)
>
> => (name=c2, value=v2, timestamp=xxx, ttl=604800)
>
> => (name=c3, value=v3, timestamp=xxx, ttl=604800)
>
> => (name=indexed_column, value=value1, timestamp=xxx, ttl=604800)
>
> ---
>
>
>
> However when we Query via CQL, we only get the indexed column:
>
> SELECT * FROM cf_with_index WHERE key = 'rowkey1';
>
>
>
> key   | indexed_column
>
> ---+
>
> rowkey1   | value1
>
>
>
> Any way to get the rest?
>

You would have to declare the other columns (c1, c2 and c3) in the metadata
(you don't have to index them though).
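In Thrift terms that means extending column_metadata, for example via
cassandra-cli. A sketch only; note that this update replaces the whole
metadata list, so the indexed column must be repeated:

update column family cf_with_index
  with column_metadata = [
    {column_name : 'c1', validation_class : UTF8Type},
    {column_name : 'c2', validation_class : UTF8Type},
    {column_name : 'c3', validation_class : UTF8Type},
    {column_name : 'indexed_column',
     validation_class : UTF8Type,
     index_name : 'column_idx',
     index_type : 0}];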



>
>
> -  *Obtaining TTL and writetime on these wide rows  - we tried:*
>
> *SELECT key, column1, value, writetime(value), ttl(value) FROM cf LIMIT 1;*
>
> *It works, but a bit clumsy.  Is there a better way?*
>

No, it's the CQL way (not that I particularly agree with the "clumsy"
qualification, but I suppose we all have different opinions on what is
clumsy and what is not).


>
>
> -  *We can live with thrift.  Is there any way / plan to let us to
> execute thrift with datastax driver?*
>

No (and it's not like it's a minor change to allow that: the DataStax Java
driver uses the native protocol, which is CQL-only by nature).

--
Sylvain


unstable write performance

2014-03-26 Thread Jiaan Zeng
Hi,

I am doing some performance benchmarks on a *single* node Cassandra
1.2.4. BTW, the machine is dedicated to running one Cassandra instance.
The workload is 100% write. The throughput varies dramatically and
sometimes even drops to 0. I have tried the several things below and still
get the same observation. There are no errors in the log file. One
thing I spotted in the log is GCInspector reporting that GC takes more than
200 ms. I think that is because of the memtable size setting. If I
lower the memtable size, that kind of report goes away. Any clues
about what is happening in this case, and suggestions about how to
achieve a stable write throughput? Thanks a lot.

1) Increase heap size from 4 G to 8 G. The total memory is 16 G.
2) Increase "memtable_total_space_in_mb" and
"commitlog_total_space_in_mb" to decrease the number of memtable
flush.
3) Disable the compaction to eliminate the impact of compaction on disk.
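For reference, item 2) corresponds to real cassandra.yaml knobs; a sketch
with illustrative values only:

memtable_total_space_in_mb: 4096      # larger memtables, fewer flushes
commitlog_total_space_in_mb: 8192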

Below is an example of throughput.
280 sec: 865658 operations; 2661.5 current ops/sec; [INSERT
AverageLatency(us)=3640.16]
 290 sec: 865658 operations; 0 current ops/sec;
 300 sec: 903204 operations; 3754.22 current ops/sec; [INSERT
AverageLatency(us)=12341.77]


-- 
Regards,
Jiaan


Re: Kernel keeps killing cassandra process - OOM

2014-03-26 Thread prem yadav
Thanks Robert. That seems to be the issue. However, the fix mentioned there
doesn't work. I downgraded Java to jdk6_37 and that seems to have done the
trick. Thanks for pointing me to that Jira ticket.


On Mon, Mar 24, 2014 at 6:48 PM, Robert Coli  wrote:

> On Mon, Mar 24, 2014 at 4:11 AM, prem yadav  wrote:
>
>> the nodes die *without * being under any load. Completely idle.
>>
>
> https://issues.apache.org/jira/browse/CASSANDRA-6541
>
> ?
>
> =Rob
>
>


nodetool scrub throws exception FileAlreadyExistsException

2014-03-26 Thread Donald Smith

% time nodetool scrub -s as_reports data_report_info_2011
xss =  -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar 
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8192M -Xmx8192M 
-Xmn2048M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
Exception in thread "main" FSWriteError in 
/mnt/cassandra-storage/data/as_reports/data_report_info_2011/snapshots/pre-scrub-1395848747073/as_reports-data_report_info_2011-jb-3-Data.db
at 
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:84)
at 
org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
at 
org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1817)
at 
org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1123)
at 
org.apache.cassandra.service.StorageService.scrub(StorageService.java:2197)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.reflect.misc.Trampoline.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.reflect.misc.MethodUtil.invoke(Unknown Source)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown 
Source)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(Unknown 
Source)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source)
at com.sun.jmx.mbeanserver.PerInterface.invoke(Unknown Source)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(Unknown Source)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(Unknown 
Source)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown 
Source)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(Unknown 
Source)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown 
Source)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown 
Source)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor42.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
at sun.rmi.transport.Transport$1.run(Unknown Source)
at sun.rmi.transport.Transport$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown 
Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.file.FileAlreadyExistsException: 
/mnt/cassandra-storage/data/as_reports/data_report_info_2011/snapshots/pre-scrub-1395848747073/as_reports-data_report_info_2011-jb-3-Data.db
 -> 
/mnt/cassandra-storage/data/as_reports/data_report_info_2011/as_reports-data_report_info_2011-jb-3-Data.db
at sun.nio.fs.UnixException.translateToIOException(Unknown Source)
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.UnixFileSystemProvider.createLink(Unknown Source)
at java.nio.file.Files.createLink(Unknown Source)
at 
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:80)
... 39 more
1.112u 0.122s 3:38.36 0.5%  0+0k 0+328io 0pf+0w


That table is new and very unlikely to be corrupted.  I retried the command 
without "-s" and it succeeded right away. I tried again WITH "-s" and it 
succeeded again too.

Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
dona...@audiencescience.com


Re: unstable write performance

2014-03-26 Thread Marcin Cabaj
ParNew GC (used by default in Cassandra) uses a 'stop-the-world' algorithm,
which means your application has to be paused to do GC.
You can run the jstat command to monitor GC activity and check whether your
write performance is related to GC, e.g.:
$ jstat -gc <pid> 1s
But it shouldn't drop throughput to 0 ops/s.

Very often (almost always) the first bottleneck is storage.
12341.77us avg latency is quite high. How big are your inserts? Is your
disk saturated? What kind of storage do you use?
Run iostat to take a look at what your disk is doing.
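For example, with the standard sysstat flags (watch %util and await):
$ iostat -x 5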




On Wed, Mar 26, 2014 at 2:54 PM, Jiaan Zeng  wrote:

> Hi,
>
> I am doing some performance benchmarks in a *single* node cassandra
> 1.2.4. BTW, the machine is dedicated to run one cassandra instance.
> The workload is 100% write. The throughput varies dramatically and
> sometimes even drops to 0. I have tried several things below and still
> got the same observation. There is no errors in the log file. One
> thing I spotted in the log is GCInspector reports GC takes more than
> 200 ms. I think that is because the size of the memtable setting. If I
> lower the memtable size, that kind of report can go away. Any clues
> about what is happening in this case and suggestions about how to
> achieve a stable write throughput? Thanks a lot.
>
> 1) Increase heap size from 4 G to 8 G. The total memory is 16 G.
> 2) Increase "memtable_total_space_in_mb" and
> "commitlog_total_space_in_mb" to decrease the number of memtable
> flush.
> 3) Disable the compaction to eliminate the impact of compaction on disk.
>
> Below is an example of throughput.
> 280 sec: 865658 operations; 2661.5 current ops/sec; [INSERT
> AverageLatency(us)=3640.16]
>  290 sec: 865658 operations; 0 current ops/sec;
>  300 sec: 903204 operations; 3754.22 current ops/sec; [INSERT
> AverageLatency(us)=12341.77]
>
>
> --
> Regards,
> Jiaan
>


memory usage spikes

2014-03-26 Thread prem yadav
Hi,
in another thread, I had mentioned that we had an issue with Cassandra getting
killed by the kernel due to OOM. Downgrading to jdk6_37 seems to have fixed it.

However, even now, every couple of hours the nodes show a
spike in memory usage.
For example: on an 8 GB RAM machine, the usage once reached 7.5 GB
and then slowly came down to normal.

Cassandra version in use is 1.1.9.10.

Any idea why this could be happening? There is no load on the cluster.

Thanks.


Question about how compaction and partition keys interact

2014-03-26 Thread Donald Smith
In CQL we need to decide between using ((customer_id,type),date) as the CQL 
primary key for a reporting table, versus ((customer_id,date),type).

We store reports for every day.  If we use (customer_id,type) as the partition 
key (physical key), then we have  a WIDE ROW where each date's data is stored 
in a different column. Over time, as new reports are added for different dates, 
the row will get wider and wider, and I thought that might cause more work for 
compaction.

So, would a partition key of (customer_id,date) yield better compaction 
behavior?

Again, if we use (customer_id,type) as the partition key, then over time, as 
new columns are added to that row for different dates, I'd think that 
compaction would have to merge new data for a given physical row from multiple 
sstables. That would make compaction expensive.  But if we use 
(customer_id,date) as the partition key, then new data will be added to new 
physical rows, and so compaction would have less work to do.

My question is really about how compaction interacts with partition keys.  
Someone on the Cassandra irc channel, 
http://webchat.freenode.net/?channels=#cassandra, said that when partition keys 
overlap between sstables, there's only "slightly" more work to do than when 
they don't, for merging sstables in compaction.  So he thought the first form,  
((customer_id,type),date),  would be better.

One advantage of the first form, ((customer_id,type),date) ,  is that we can 
get all report data for all dates for a given customer and type in a single 
wide row  -- and we do have a (uncommon) use case for such reports.

If we used a primary key of ((customer_id,type,date)), then the rows would be 
un-wide; that wouldn't take advantage of clustering columns and (like the 
second form) wouldn't support the (uncommon) use case mentioned in the previous 
paragraph.

Thanks, Don

Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
dona...@audiencescience.com


Re: memory usage spikes

2014-03-26 Thread Marcin Cabaj
Hi,

RSS or VIRT?

Could you paste output of:
$ ps -p `jps | awk '/CassandraDaemon/ {print $1}'` uww
please?


On Wed, Mar 26, 2014 at 5:20 PM, prem yadav  wrote:

> Hi,
> in another thread, I has mentioned that we had issue with Cassandra
> getting killed by kernel due to OOM. Downgrading to jdk6_37 seems to have
> fixed it.
>
> However, even now, after every couple of hours, the nodes are showing a
> spike in memory usage.
> For ex: on a 8GB ram machine, once the usage reached to 7.5 GB.
> and then slowly it comes down to normal.
>
> Cassandra version in use is 1.1.9.10.
>
> Any idea why this could be happening? There is no load on the cluster.
>
> Thanks.
>
>
>
>


Re: memory usage spikes

2014-03-26 Thread prem yadav
here:

ps -p `/usr/java/jdk1.6.0_37/bin/jps | awk '/Dse/ {print $1}'` uww

USER   PID %CPU %MEM    VSZ   RSS TTY  STAT START   TIME COMMAND
497  20450  0.9 31.0 4727620 2502644 ? SLl  06:55   3:28
/usr/java/jdk1.6.0_37//bin/java -ea
-javaagent:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1968M -Xmx1968M
-Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss190k -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
-Dcom.sun.management.jmxremote.port=7199
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dlog4j.configuration=log4j-server.properties
-Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/dse.pid -cp
:/usr/share/dse/dse.jar:/usr/share/dse/common/commons-codec-1.6.jar:/usr/share/dse/common/commons-io-2.4.jar:/usr/share/dse/common/guava-13.0.jar:/usr/share/dse/common/jbcrypt-0.3m.jar:/usr/share/dse/common/log4j-1.2.16.jar:/usr/share/dse/common/slf4j-api-1.6.1.jar:/usr/share/dse/common/slf4j-log4j12-1.6.1.jar:/etc/dse:/usr/share/java/jna.jar:/etc/dse/cassandra:/usr/share/dse/cassandra/tools/lib/stress.jar:/usr/share/dse/cassandra/lib/antlr-2.7.7.jar:/usr/share/dse/cassandra/lib/antlr-3.2.jar:/usr/share/dse/cassandra/lib/antlr-runtime-3.2.jar:/usr/share/dse/cassandra/lib/avro-1.4.0-cassandra-1.jar:/usr/share/dse/cassandra/lib/cassandra-all-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-clientutil-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-thrift-1.1.9.10.jar:/usr/share/dse/cassandra/lib/commons-cli-1.1.jar:/usr/share/dse/cassandra/lib/commons-codec-1.6.jar:/usr/share/dse/cassandra/lib/commons-lang-2.4.jar:/usr/share/dse/cassandra/lib/commons-logging-1.1.1.jar:/usr/share/dse/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/dse/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/dse/cassandra/lib/guava-13.0.jar:/usr/share/dse/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/dse/cassandra/lib/httpclient-4.0.1.jar:/usr/share/dse/cassandra/lib/httpcore-4.0.1.jar:/usr/share/dse/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar:/usr/share/dse/cassandra/lib/jline-0.9.94.jar:/usr/share/dse/cassandra/lib/joda-time-1.6.2.jar:/usr/share/dse/cassandra/lib/json-simple-1.1.jar:/usr/share/dse/cassandra/lib/libthrift-0.7.0.jar:/usr/share/dse/cassandra/lib/log4j-1.2.16.jar:/usr/share/dse/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/dse/cassandra/lib/servlet-api-2.5.jar:/usr/share/dse/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/dse/cassandra/lib/snakeyaml-1.6.jar:/usr/share/dse/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/dse/cassandra/lib/snaptree-0.1.jar:/usr/share/dse/cassandra/lib/stringtemplate-3.2.jar::/usr/share/dse/solr/lib/solr-4.0.2.4-SNAPSHOT-uber.jar:/usr/share/dse/solr/lib/solr-web-4.0.2.4-SNAPSHOT.jar:/usr/share/dse/solr/conf::/usr/share/dse/tomcat/lib/annotations-api-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-ha-6.0.32.jar:/usr/share/dse/tomcat/lib/coyote-6.0.32.jar:/usr/share/dse/tomcat/lib/el-api-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-el-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-jdt-6.0.29.jar:/usr/share/dse/tomcat/lib/jsp-api-6.0.29.jar:/usr/share/dse/tomcat/lib/juli-6.0.32.jar:/usr/share/dse/tomcat/lib/servlet-api-6.0.29.jar:/usr/share/dse/tomcat/lib/tribes-6.0.32.jar:/usr/share/dse/tomcat/conf::/usr/share/dse/hadoop:/etc/dse/hadoop:/usr/share/dse/hadoop/lib/ant-1.6.5.jar:/usr/share/dse/hadoop/lib/automaton-1.11-8.jar:/usr/share/dse/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/share/dse/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/share/dse/hadoop/lib/commons-cli-1.2.jar:/usr/share/dse/hadoop/lib/commons-codec-1.4.jar:/usr/share/dse/hadoop/lib/commons-collections-3.2.1.jar:/usr/share/dse/hadoop/lib/commons-configuration-1.6.jar:/usr/share/dse/hadoop/lib/commons-digester-1.8.jar:/usr/share/dse/hadoop/lib/commons-el-1.0.jar:/usr/share/dse/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/share/dse/hadoop/lib/commons-lang-2.4.jar:/u

It's the spike in RAM usage. Now it is normal, but it keeps showing the spikes.


On Wed, Mar 26, 2014 at 5:31 PM, Marcin Cabaj wrote:

> Hi,
>
> RSS or VIRT?
>
> Could you paste output of:
> $ ps -p `jps | awk '/CassandraDaemon/ {print $1}'` uww
> please?
>
>
> On Wed, Mar 26, 2014 at 5:20 PM, prem yadav  wrote:
>
>> Hi,
>> in another thread, I has mentioned that we had issue with Cassandra
>> getting killed by kernel due to OOM. Downgrading to jdk6_37 seems to have
>> fixed it.
>>
>> However, even now, after every couple of hours, the nodes are showing a
>> spike in memory usage.
>> For ex: on a 8GB ram machine, once the usage reached to 7.5 GB.
>> and then slowly it comes down to no

RE: memory usage spikes

2014-03-26 Thread Donald Smith
Prem,

Did you follow the instructions at 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k

And did you install jna-3.2.7.jar into /usr/share/java, as per 
http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installJnaRHEL.html
 ?

Don

From: prem yadav [mailto:ipremya...@gmail.com]
Sent: Wednesday, March 26, 2014 10:36 AM
To: user@cassandra.apache.org
Subject: Re: memory usage spikes

here:

ps -p `/usr/java/jdk1.6.0_37/bin/jps | awk '/Dse/ {print $1}'` uww

SER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
497  20450  0.9 31.0 4727620 2502644 ? SLl  06:55   3:28 
/usr/java/jdk1.6.0_37//bin/java -ea 
-javaagent:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities 
-XX:ThreadPriorityPolicy=42 -Xms1968M -Xmx1968M -Xmn400M 
-XX:+HeapDumpOnOutOfMemoryError -Xss190k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 
-XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true 
-Dcom.sun.management.jmxremote.port=7199 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true 
-Dcassandra-pidfile=/var/run/dse.pid -cp 
:/usr/share/dse/dse.jar:/usr/share/dse/common/commons-codec-1.6.jar:/usr/share/dse/common/commons-io-2.4.jar:/usr/share/dse/common/guava-13.0.jar:/usr/share/dse/common/jbcrypt-0.3m.jar:/usr/share/dse/common/log4j-1.2.16.jar:/usr/share/dse/common/slf4j-api-1.6.1.jar:/usr/share/dse/common/slf4j-log4j12-1.6.1.jar:/etc/dse:/usr/share/java/jna.jar:/etc/dse/cassandra:/usr/share/dse/cassandra/tools/lib/stress.jar:/usr/share/dse/cassandra/lib/antlr-2.7.7.jar:/usr/share/dse/cassandra/lib/antlr-3.2.jar:/usr/share/dse/cassandra/lib/antlr-runtime-3.2.jar:/usr/share/dse/cassandra/lib/avro-1.4.0-cassandra-1.jar:/usr/share/dse/cassandra/lib/cassandra-all-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-clientutil-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-thrift-1.1.9.10.jar:/usr/share/dse/cassandra/lib/commons-cli-1.1.jar:/usr/share/dse/cassandra/lib/commons-codec-1.6.jar:/usr/share/dse/cassandra/lib/commons-lang-2.4.jar:/usr/share/dse/cassandra/lib/commons-logging-1.1.1.jar:/usr/share/dse/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/dse/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/dse/cassandra/lib/guava-13.0.jar:/usr/share/dse/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/dse/cassandra/lib/httpclient-4.0.1.jar:/usr/share/dse/cassandra/lib/httpcore-4.0.1.jar:/usr/share/dse/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar:/usr/share/dse/cassandra/lib/jline-0.9.94.jar:/usr/share/dse/cassandra/lib/joda-time-1.6.2.jar:/usr/share/dse/cassandra/lib/json-simple-1.1.jar:/usr/share/dse/cassandra/lib/libthrift-0.7.0.jar:/usr/share/dse/cassandra/lib/log4j-1.2.16.jar:/usr/share/dse/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/dse/cassandra/lib/servlet-api-2.5.jar:/usr/share/dse/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/dse/cassandra/lib/snakeyaml-1.6.jar:/usr/share/dse/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/dse/cassandra/lib/snaptree-0.1.jar:/usr/share/dse/cassandra/lib/stringtemplate-3.2.jar::/usr/share/dse/solr/lib/solr-4.0.2.4-SNAPSHOT-uber.jar:/usr/share/dse/solr/lib/solr-web-4.0.2.4-SNAPSHOT.jar:/usr/share/dse/solr/conf::/usr/share/dse/tomcat/lib/annotations-api-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-ha-6.0.32.jar:/usr/share/dse/tomcat/lib/coyote-6.0.32.jar:/usr/share/dse/tomcat/lib/el-api-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-el-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-jdt-6.0.29.jar:/usr/share/dse/tomcat/lib/jsp-api-6.0.29.jar:/usr/share/dse/tomcat/lib/juli-6.0.32.jar:/usr/share/dse/tomcat/lib/servlet-api-6.0.29.jar:/usr/share/dse/tomcat/lib/tribes-6.0.32.jar:/usr/share/dse/tomcat/conf::/usr/share/dse/hadoop:/etc/dse/hadoop:/usr/share/dse/hadoop/lib/ant-1.6.5.jar:/usr/share/dse/hadoop/lib/automaton-1.11-8.jar:/usr/share/dse/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/share/dse/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/share/dse/hadoop/lib/commons-cli-1.2.jar:/usr/share/dse/hadoop/lib/commons-codec-1.4.jar:/usr/share/dse/hadoop/lib/commons-collections-3.2.1.jar:/usr/share/dse/hadoop/lib/commons-configuration-1.6.jar:/usr/share/dse/hadoop/lib/commons-digester-1.8.jar:/usr/share/dse/hadoop/lib/commons-el-1.0.jar:/usr/share/dse/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/share/dse/hadoop/lib/commons-lang-2.4.jar:/u

Its the spike in RAM usage. Now it is normal but keeps showing the spikes.

On Wed, Mar 26, 2014 at 5:31 PM, Marcin Cabaj 
mailto:marcin.ca...@datasift.com>> wrote:
Hi,

RSS or VI

Re: memory usage spikes

2014-03-26 Thread prem yadav
Thanks Don,
Yes, I have followed those steps, except for JNA: the version I am using is
3.2.4. The link you shared is for Cassandra 2.0; I am using 1.1. Let me
install JNA 3.2.7 and see if that helps.

Thanks


On Wed, Mar 26, 2014 at 5:38 PM, Donald Smith <
donald.sm...@audiencescience.com> wrote:

>  Prem,
>
>
>
> Did you follow the instructions at
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k
>
>
>
> And did you install jna-3.2.7.jar into /usr/share/java, as per
> http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installJnaRHEL.html?
>
>
>
> Don
>
>
>
> *From:* prem yadav [mailto:ipremya...@gmail.com]
> *Sent:* Wednesday, March 26, 2014 10:36 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: memory usage spikes
>
>
>
> here:
>
>
>
> ps -p `/usr/java/jdk1.6.0_37/bin/jps | awk '/Dse/ {print $1}'` uww
>
>
>
> SER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
>
> 497  20450  0.9 31.0 4727620 2502644 ? SLl  06:55   3:28
> /usr/java/jdk1.6.0_37//bin/java -ea
> -javaagent:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar
> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1968M -Xmx1968M
> -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss190k -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
> -Dcom.sun.management.jmxremote.port=7199
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dlog4j.configuration=log4j-server.properties
> -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/dse.pid -cp
> :/usr/share/dse/dse.jar:/usr/share/dse/common/commons-codec-1.6.jar:/usr/share/dse/common/commons-io-2.4.jar:/usr/share/dse/common/guava-13.0.jar:/usr/share/dse/common/jbcrypt-0.3m.jar:/usr/share/dse/common/log4j-1.2.16.jar:/usr/share/dse/common/slf4j-api-1.6.1.jar:/usr/share/dse/common/slf4j-log4j12-1.6.1.jar:/etc/dse:/usr/share/java/jna.jar:/etc/dse/cassandra:/usr/share/dse/cassandra/tools/lib/stress.jar:/usr/share/dse/cassandra/lib/antlr-2.7.7.jar:/usr/share/dse/cassandra/lib/antlr-3.2.jar:/usr/share/dse/cassandra/lib/antlr-runtime-3.2.jar:/usr/share/dse/cassandra/lib/avro-1.4.0-cassandra-1.jar:/usr/share/dse/cassandra/lib/cassandra-all-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-clientutil-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-thrift-1.1.9.10.jar:/usr/share/dse/cassandra/lib/commons-cli-1.1.jar:/usr/share/dse/cassandra/lib/commons-codec-1.6.jar:/usr/share/dse/cassandra/lib/commons-lang-2.4.jar:/usr/share/dse/cassandra/lib/commons-logging-1.1.1.jar:/usr/share/dse/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/dse/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/dse/cassandra/lib/guava-13.0.jar:/usr/share/dse/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/dse/cassandra/lib/httpclient-4.0.1.jar:/usr/share/dse/cassandra/lib/httpcore-4.0.1.jar:/usr/share/dse/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar:/usr/share/dse/cassandra/lib/jline-0.9.94.jar:/usr/share/dse/cassandra/lib/joda-time-1.6.2.jar:/usr/share/dse/cassandra/lib/json-simple-1.1.jar:/usr/share/dse/cassandra/lib/libthrift-0.7.0.jar:/usr/share/dse/cassandra/lib/log4j-1.2.16.jar:/usr/share/dse/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/dse/cassandra/lib/servlet-api-2.5.jar:/usr/share/dse/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/dse/cassandra/lib/snakeyaml-1.6.jar:/usr/share/dse/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/dse/cassandra/lib/snaptree-0.1.jar:/usr/share/dse/cassandra/lib/stringtemplate-3.2.jar::/usr/share/dse/solr/lib/solr-4.0.2.4-SNAPSHOT-uber.jar:/usr/share/dse/solr/lib/solr-web-4.0.2.4-SNAPSHOT.jar:/usr/share/dse/solr/conf::/usr/share/dse/tomcat/lib/annotations-api-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-ha-6.0.32.jar:/usr/share/dse/tomcat/lib/coyote-6.0.32.jar:/usr/share/dse/tomcat/lib/el-api-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-el-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-jdt-6.0.29.jar:/usr/share/dse/tomcat/lib/jsp-api-6.0.29.jar:/usr/share/dse/tomcat/lib/juli-6.0.32.jar:/usr/share/dse/tomcat/lib/servlet-api-6.0.29.jar:/usr/share/dse/tomcat/lib/tribes-6.0.32.jar:/usr/share/dse/tomcat/conf::/usr/share/dse/hadoop:/etc/dse/hadoop:/usr/share/dse/hadoop/lib/ant-1.6.5.jar:/usr/share/dse/hadoop/lib/automaton-1.11-8.jar:/usr/share/dse/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/share/dse/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/share/dse/hadoop/lib/commons-cli-1.2.jar:/usr/share/dse/hadoop/lib/commons-codec-1.4.jar:/usr/share/dse/hadoop/lib/commons-collections-3.2.1.jar:/usr/share/dse/hadoop/lib/commons-configuration-1.6.jar

Re: memory usage spikes

2014-03-26 Thread Marcin Cabaj
You can try to dump the memory mapping of the Cassandra process during a spike
using pmap, e.g.:
$ pmap -x <pid>
and paste here.


On Wed, Mar 26, 2014 at 5:47 PM, prem yadav  wrote:

> Thanks Don,
> Yes have followed those steps. Except jna. The version I am using is
> 3.2.4. The link you have shared is for Cassandra 2.0. I am using 1.1. Let
> me install jna 3.2.7 and see if that helps.
>
> Thanks
>
>
> On Wed, Mar 26, 2014 at 5:38 PM, Donald Smith <
> donald.sm...@audiencescience.com> wrote:
>
>>  Prem,
>>
>>
>>
>> Did you follow the instructions at
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k
>>
>>
>>
>> And did you install jna-3.2.7.jar into /usr/share/java, as per
>> http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installJnaRHEL.html?
>>
>>
>>
>> Don
>>
>>
>>
>> *From:* prem yadav [mailto:ipremya...@gmail.com]
>> *Sent:* Wednesday, March 26, 2014 10:36 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: memory usage spikes
>>
>>
>>
>> here:
>>
>>
>>
>> ps -p `/usr/java/jdk1.6.0_37/bin/jps | awk '/Dse/ {print $1}'` uww
>>
>>
>>
>> SER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
>>
>> 497  20450  0.9 31.0 4727620 2502644 ? SLl  06:55   3:28
>> /usr/java/jdk1.6.0_37//bin/java -ea
>> -javaagent:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar
>> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1968M -Xmx1968M
>> -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss190k -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
>> -Dcom.sun.management.jmxremote.port=7199
>> -Dcom.sun.management.jmxremote.ssl=false
>> -Dcom.sun.management.jmxremote.authenticate=false
>> -Dlog4j.configuration=log4j-server.properties
>> -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/dse.pid -cp
>> :/usr/share/dse/dse.jar:/usr/share/dse/common/commons-codec-1.6.jar:/usr/share/dse/common/commons-io-2.4.jar:/usr/share/dse/common/guava-13.0.jar:/usr/share/dse/common/jbcrypt-0.3m.jar:/usr/share/dse/common/log4j-1.2.16.jar:/usr/share/dse/common/slf4j-api-1.6.1.jar:/usr/share/dse/common/slf4j-log4j12-1.6.1.jar:/etc/dse:/usr/share/java/jna.jar:/etc/dse/cassandra:/usr/share/dse/cassandra/tools/lib/stress.jar:/usr/share/dse/cassandra/lib/antlr-2.7.7.jar:/usr/share/dse/cassandra/lib/antlr-3.2.jar:/usr/share/dse/cassandra/lib/antlr-runtime-3.2.jar:/usr/share/dse/cassandra/lib/avro-1.4.0-cassandra-1.jar:/usr/share/dse/cassandra/lib/cassandra-all-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-clientutil-1.1.9.10.jar:/usr/share/dse/cassandra/lib/cassandra-thrift-1.1.9.10.jar:/usr/share/dse/cassandra/lib/commons-cli-1.1.jar:/usr/share/dse/cassandra/lib/commons-codec-1.6.jar:/usr/share/dse/cassandra/lib/commons-lang-2.4.jar:/usr/share/dse/cassandra/lib/commons-logging-1.1.1.jar:/usr/share/dse/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/dse/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/dse/cassandra/lib/guava-13.0.jar:/usr/share/dse/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/dse/cassandra/lib/httpclient-4.0.1.jar:/usr/share/dse/cassandra/lib/httpcore-4.0.1.jar:/usr/share/dse/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/dse/cassandra/lib/jamm-0.2.5.jar:/usr/share/dse/cassandra/lib/jline-0.9.94.jar:/usr/share/dse/cassandra/lib/joda-time-1.6.2.jar:/usr/share/dse/cassandra/lib/json-simple-1.1.jar:/usr/share/dse/cassandra/lib/libthrift-0.7.0.jar:/usr/share/dse/cassandra/lib/log4j-1.2.16.jar:/usr/share/dse/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/dse/cassandra/lib/servlet-api-2.5.jar:/usr/share/dse/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/dse/cassandra/lib/snakeyaml-1.6.jar:/usr/share/dse/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/dse/cassandra/lib/snaptree-0.1.jar:/usr/share/dse/cassandra/lib/stringtemplate-3.2.jar::/usr/share/dse/solr/lib/solr-4.0.2.4-SNAPSHOT-uber.jar:/usr/share/dse/solr/lib/solr-web-4.0.2.4-SNAPSHOT.jar:/usr/share/dse/solr/conf::/usr/share/dse/tomcat/lib/annotations-api-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-6.0.32.jar:/usr/share/dse/tomcat/lib/catalina-ha-6.0.32.jar:/usr/share/dse/tomcat/lib/coyote-6.0.32.jar:/usr/share/dse/tomcat/lib/el-api-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-el-6.0.29.jar:/usr/share/dse/tomcat/lib/jasper-jdt-6.0.29.jar:/usr/share/dse/tomcat/lib/jsp-api-6.0.29.jar:/usr/share/dse/tomcat/lib/juli-6.0.32.jar:/usr/share/dse/tomcat/lib/servlet-api-6.0.29.jar:/usr/share/dse/tomcat/lib/tribes-6.0.32.jar:/usr/share/dse/tomcat/conf::/usr/share/dse/hadoop:/etc/dse/hadoop:/usr/share/dse/hadoop/lib/ant-1.6.5.jar:/usr/share/dse/hadoop/lib/automaton-1.11-8.jar:/usr/share/dse/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/share/dse/hadoop/lib/

Re: Question about how compaction and partition keys interact

2014-03-26 Thread Jonathan Lacefield
Don,

  What is the underlying question?  Are you trying to figure out what's going
to be faster for reads, or are you really concerned about storage?

  The recommendation typically provided is to suggest that tables are
modeled based on query access, to enable the fastest read performance.

  In your example, will your app's queries look for
  1)  customer interactions by type by day, with the ability to
   - sort by day within a type
   - grab ranges of dates for a type quickly
   - or pull all dates (and cell data) for a type
   or
 2)  customer interactions by date by type, with the ability to
   - sort by type within a date
   - grab ranges of types for a date quickly
   - or pull all types data for a date

  We also typically recommend that partitions stay within ~100k columns
or ~100MB per partition.  With your first scenario, the wide row, you wouldn't
hit that column count for ~273 years :)

  What's interesting in your modeling scenario is that, with the current
options, you don't have the ability to easily pull all dates for a customer
without specifying the type, specific dates, or using ALLOW FILTERING.  Did
you ever consider partitioning simply on customer and using date and type
as clustering keys?
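In CQL terms the options look like the sketches below (table and value-column
names are hypothetical; the first two are the candidate keys from this thread,
the third is the customer-only partition suggested above):

CREATE TABLE reports_by_type (
  customer_id text, type text, date timestamp, report text,
  PRIMARY KEY ((customer_id, type), date));

CREATE TABLE reports_by_date (
  customer_id text, type text, date timestamp, report text,
  PRIMARY KEY ((customer_id, date), type));

CREATE TABLE reports_by_customer (
  customer_id text, type text, date timestamp, report text,
  PRIMARY KEY (customer_id, date, type));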

  Hope that helps.

Jonathan




Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487






On Wed, Mar 26, 2014 at 1:22 PM, Donald Smith <
donald.sm...@audiencescience.com> wrote:

>  In CQL we need to decide between using *((customer_id,type),date) *as
> the CQL primary key for a reporting table, versus
> *((customer_id,date),type)*.
>
>
>
> We store reports for every day.  If we use *(customer_id,type)* as the
> partition key (physical key), then we have  a WIDE ROW where each date's
> data is stored in a different column. Over time, as new reports are added
> for different dates, the row will get wider and wider, and I thought that
> might cause more work for compaction.
>
>
>
> So, would a partition key of *(customer_id,date)* yield better compaction
> behavior?
>
>
>
> Again, if we use *(customer_id,type)* as the partition key, then over
> time, as new columns are added to that row for different dates, I'd think
> that compaction would have to merge new data for a given physical row from
> multiple sstables. That would make compaction expensive.  But if we use
> *(customer_id,date)* as the partition key, then new data will be added to *new
> physical rows*, and so compaction would have less work to do
>
>
>
> My question is really about how compaction interacts with partition keys.
>  Someone on the Cassandra irc channel,
> http://webchat.freenode.net/?channels=#cassandra, said that when
> partition keys overlap between sstables, there's only "slightly" more work
> to do than when they don't, for merging sstables in compaction.  So he
> thought the first form, * ((customer_id,type),date), * would be better.
>
>
>
> One advantage of the first form,* ((customer_id,type),date) , * is that
> we can get all report data for all dates for a given customer and type in a
> single wide row  -- and we do have a (uncommon) use case for such reports.
>
>
>
> If we used a primary key of *((customer_id,type,date))*, then the rows
> would be un-wide; that wouldn't take advantage of clustering columns and
> (like the second form) wouldn't support the (uncommon) use case mentioned
> in the previous paragraph.
>
>
>
> Thanks, Don
>
>
>
> *Donald A. Smith* | Senior Software Engineer
> P: 425.201.3900 x 3866
> C: (206) 819-5965
> F: (646) 443-2333
> dona...@audiencescience.com
>
>
>
>
>

RE: Question about how compaction and partition keys interact

2014-03-26 Thread Donald Smith
My underlying question is about the effects of the partitioning key on 
compaction.   Specifically, would having date as part of the partitioning key 
make compaction easier (because compaction wouldn't have to merge wide rows 
over multiple days)?   According to the person on irc, it wouldn't make much 
difference.

We care mostly about read times. If read times were all we cared about, we'd 
use a CQL primary key of ((customer_id,type),date), especially since it lets
us efficiently iterate over all dates for a given customer and type.  I also 
care about compaction time, and if the other primary key form decreased 
compaction time, I might go for it. We have terabytes of data.

I don't think we ever have to query all types for a given customer or date.  
That is, we are always given a specific customer and type, plus usually but not 
always a date.

Thanks, Don

From: Jonathan Lacefield [mailto:jlacefi...@datastax.com]
Sent: Wednesday, March 26, 2014 11:20 AM
To: user@cassandra.apache.org
Subject: Re: Question about how compaction and partition keys interact

Don,

  What is the underlying question?  Are trying to figure out what's going to be 
faster for reads or are you really concerned about storage?

  The recommendation typically provided is to suggest that tables are modeled 
based on query access, to enable the fastest read performance.

  In your example, will your app's queries look for
  1)  customer interactions by type by day, with the ability to
   - sort by day within a type
   - grab ranges of dates for at type quickly
   - or pull all dates (and cell data) for a type
   or
 2)  customer interactions by date by type, with the ability to
   - sort by type within a date
   - grab ranges of types for a date quickly
   - or pull all types data for a date

  We also typically recommend that partitions stay within ~100k of columns or 
~100MB per partition.  With your first scenario, wide row, you wouldn't hit the 
number of columns for ~273 years :)

  What's interesting in your modeling scenario is that, with the current 
options, you don't have the ability to easily pull all dates for a customer 
without specifying the type, specific dates, or using ALLOW FILTERING.  Did you 
ever consider partitioning simply on customer and using date and type as 
clustering keys?

  Hope that helps.

Jonathan




Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487



On Wed, Mar 26, 2014 at 1:22 PM, Donald Smith 
mailto:donald.sm...@audiencescience.com>> 
wrote:
In CQL we need to decide between using ((customer_id,type),date) as the CQL 
primary key for a reporting table, versus ((customer_id,date),type).

We store reports for every day.  If we use (customer_id,type) as the partition 
key (physical key), then we have  a WIDE ROW where each date's data is stored 
in a different column. Over time, as new reports are added for different dates, 
the row will get wider and wider, and I thought that might cause more work for 
compaction.

So, would a partition key of (customer_id,date) yield better compaction 
behavior?

Again, if we use (customer_id,type) as the partition key, then over time, as 
new columns are added to that row for different dates, I'd think that 
compaction would have to merge new data for a given physical row from multiple 
sstables. That would make compaction expensive.  But if we use 
(customer_id,date) as the partition key, then new data will be added to new 
physical rows, and so compaction would have less work to do

My question is really about how compaction interacts with partition keys.  
Someone on the Cassandra irc channel, 
http://webchat.freenode.net/?channels=#cassandra, said that when partition keys 
overlap between sstables, there's only "slightly" more work to do than when 
they don't, for merging sstables in compaction.  So he thought the first form,  
((customer_id,type),date),  would be better.

One advantage of the first form, ((customer_id,type),date) ,  is that we can 
get all report data for all dates for a given customer and type in a single 
wide row  -- and we do have a (uncommon) use case for such reports.

If we used a primary key of ((customer_id,type,date)), then the rows would be 
un-wide; that wouldn't take advantage of clustering columns and (like the 
second form) wouldn't support the (uncommon) use case mentioned in the previous 
paragraph.

Thanks, Don

Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
dona...@audiencescience.com




Re: Kernel keeps killing cassandra process - OOM

2014-03-26 Thread Robert Coli
On Wed, Mar 26, 2014 at 8:35 AM, prem yadav  wrote:

> Thanks Robert. That seems to be the issue. however the fix mentioned there
> doesn't work. I downgraded Java to jdk6_37 and that seems to have done the
> trick. Thanks for pointing me to that Jira ticket.
>

If the workaround on that ticket doesn't work with some versions, I'm sure
the community would appreciate it if you registered for the Apache JIRA and
detailed your findings there. :)

=Rob


Rearranging commitlog and saved_cache directories on a live cluster.

2014-03-26 Thread Redmumba
I currently have a group of about 51 hosts on Cassandra 1.2.15, 17 in each
EC2 AZ (us-east-1a, 1d, 1e).  These are m2.4xlarge machines, so they have
basically a 10G partition on /, and then two ~800G partitions on /dev/sdb
and /dev/sdc.

When I first started, I was expecting the commitlog to take up
significantly more space than it does, so I mounted one of the two drives
(/dev/sdb) for commitlog, saved_caches, etc., and the second drive
(/dev/sdc) for data.  However, now that I have had the cluster running for
a week or so, I'm realizing that the space on sdb is much more necessary
for my data (it will basically allow me to double the space).

My question is twofold.

1. If I change data_file_directories and add a second data directory, will
this affect my data?  What will I need to do once I change it--just run a
repair?  Upgrade sstables?
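Concretely, I would change data_file_directories in cassandra.yaml from one
entry to two, a sketch like this (paths hypothetical):

data_file_directories:
    - /mnt/sdc/cassandra/data
    - /mnt/sdb/cassandra/data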

2. Right now, the average size of my commitlog seems to hover around 1G
with the defaults.  My config settings are as follows:

commitlog_sync: periodic
commitlog_sync_period_in_ms: 1
commitlog_segment_size_in_mb: 32
commitlog_total_space_in_mb: 4096

Am I correct in assuming that commitlog_total_space_in_mb will restrict the
MAXIMUM commitlog directory size to 4GB?  I.e., it should never grow more
than that?  I'm concerned that, if I move the commitlog directory to the
root partition, it will fill it up and cause system instability.

Furthermore, if I move the commitlog (and saved_caches) directory to
another partition, would I just need to drain the node before shutting down
Cassandra and moving it?
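For concreteness, the sequence I have in mind is sketched below (paths and
service name hypothetical; my understanding is that drain flushes memtables
and stops the node accepting writes, so the commitlog is then safe to move):

$ nodetool drain
$ sudo service cassandra stop
$ mv /mnt/sdb/commitlog /newdisk/commitlog      # likewise saved_caches
# point commitlog_directory and saved_caches_directory in cassandra.yaml
# at the new locations, then:
$ sudo service cassandra start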

Thanks for the help, folks!

Andrew


Re: unstable write performance

2014-03-26 Thread ssiv...@gmail.com

280 sec: 865658 operations; 2661.5 current ops/sec; [INSERT
AverageLatency(us)=3640.16]
 290 sec: 865658 operations; 0 current ops/sec;

It also may indicate that C* is trying to finish active tasks and your write
requests have been in the queue for the whole 10 sec. Try to monitor C* with
$ watch nodetool tpstats
and
$ watch nodetool compactionstats
Any value > 0 in the pending column isn't good.

Enable GC logging in cassandra-env.sh. How much memory is free when running C*?
Increasing the heap size may cause long GC delays, since GC needs to collect
and copy memory, and it also may depend on your CPU resources.

Try to run C* on default settings and monitor it to find out the bottleneck.



On 03/26/2014 05:54 PM, Jiaan Zeng wrote:

Hi,

I am doing some performance benchmarks in a *single* node cassandra
1.2.4. BTW, the machine is dedicated to run one cassandra instance.
The workload is 100% write. The throughput varies dramatically and
sometimes even drops to 0. I have tried several things below and still
got the same observation. There is no errors in the log file. One
thing I spotted in the log is GCInspector reports GC takes more than
200 ms. I think that is because the size of the memtable setting. If I
lower the memtable size, that kind of report can go away. Any clues
about what is happening in this case and suggestions about how to
achieve a stable write throughput? Thanks a lot.

1) Increase heap size from 4 G to 8 G. The total memory is 16 G.
2) Increase "memtable_total_space_in_mb" and
"commitlog_total_space_in_mb" to decrease the number of memtable
flush.
3) Disable the compaction to eliminate the impact of compaction on disk.

Below is an example of throughput.
280 sec: 865658 operations; 2661.5 current ops/sec; [INSERT
AverageLatency(us)=3640.16]
  290 sec: 865658 operations; 0 current ops/sec;
  300 sec: 903204 operations; 3754.22 current ops/sec; [INSERT
AverageLatency(us)=12341.77]






Re: Why "select count("*) from .." hangs ?

2014-03-26 Thread Arthur Zubarev
I faced the same nuance in my early days with C*; specifically, I got RPC
timeouts on selecting data from CFs larger than 300 GB.

The typical remedy is to implement paging. So instead of using the CLI, resort
to a custom-built client app.
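A sketch of token-based paging in CQL, counting rows client-side page by page
(table and column names hypothetical; '<last key>' stands for the last key
returned by the previous page):

-- first page
SELECT key FROM sensor_data LIMIT 1000;
-- each subsequent page restarts after the last key seen
SELECT key FROM sensor_data WHERE token(key) > token('<last key>') LIMIT 1000;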

Regards,

Arthur

From: shahab 
Sent: Wednesday, March 26, 2014 4:34 AM
To: user@cassandra.apache.org 
Subject: Re: Why "select count("*) from .." hangs ?

Thanks for the hints. I got a better picture of how to deal with "count" 
queries.



On Tue, Mar 25, 2014 at 7:01 PM, Robert Coli  wrote:

  On Tue, Mar 25, 2014 at 8:36 AM, shahab  wrote:

But after iteration 8 (i.e. inserting 150 sensor data records), the "select
count(*) ..." throws a time-out exception and doesn't work anymore. I even
tried to execute "select count(*)..." using the DataStax DevCenter GUI, but I
got the same result.

  All operations in Cassandra are subject to (various) timeouts, which by
default are on the scale of single-digit seconds.

  If you attempt to do an operation (such as aggregates across large numbers of
large objects) which cannot complete in this time, this is a strong indication
that either your overall approach is inappropriate or, at the very least, that
your buckets are too large.

  =Rob