Re: cassandra read latency help

2012-05-18 Thread Gurpreet Singh
Hi Viktor,

As I mentioned, my goal is to eventually achieve a throughput of 100 reads
per second; it's not a batch size of 100.

The writes are already done, and I am not doing them anymore. I loaded the
system with about 130 million keys.

I am just running a read workload in my experiment for now.

1. No invalidation of the cache is happening, because no writes are happening.
GC is not an issue; it's an off-heap native cache by default in 1.0.9.
2. Reads are done with batch size 1.
3. Read qps is 25.

I am keeping the max read qps constant and just varying the number of threads
doing the reads.

row cache hit ratio = 0.66

Observations:

1. With 20 threads doing reads, avg latency is 50 ms
2. With 6 threads doing reads, avg latency is 30 ms
3. With 2 threads doing reads, avg latency is 15 ms
4. With 3 threads, latency is 20 ms

It looks like the number of disks (2) is limiting the concurrency of the
system here. Any other explanations?
/G


On Thu, May 17, 2012 at 10:49 PM, Viktor Jevdokimov <
viktor.jevdoki...@adform.com> wrote:

>  Row cache is fine as long as keys are not heavily updated; otherwise it
> frequently invalidates and pressures GC.
>
>
> The high latency is from your batch of 100 keys. Review your data model to
> avoid such reads, if you need low latency.
>
>
> 500M rows on one node, or on the cluster? Reading 100 random rows at a total
> of 40KB of data from a data set of 180GB uncompressed in under 30ms is not an
> easy task.
>
>
>
> Best regards / Pagarbiai
> *Viktor Jevdokimov*
> Senior Developer
>
> Email: viktor.jevdoki...@adform.com
> Phone: +370 5 212 3063, Fax +370 5 261 0453
> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
> Follow us on Twitter: @adforminsider 
> What is Adform: watch this short video 
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
>   *From:* Gurpreet Singh [mailto:gurpreet.si...@gmail.com]
> *Sent:* Thursday, May 17, 2012 20:24
> *To:* user@cassandra.apache.org
> *Subject:* Re: cassandra read latency help
>
>
> Thanks Viktor for the advice.
>
> Right now, i just have 1 node that i am testing against and i am using CL
> one.
>
> Are you suggesting that the page cache might be doing better than the row
> cache?
> I am getting row cache hit of 0.66 right now.
>
>
> /G
>
>
> On Thu, May 17, 2012 at 12:26 AM, Viktor Jevdokimov <
> viktor.jevdoki...@adform.com> wrote:
>
> > Gurpreet Singh wrote:
> > Any ideas on what could help here bring down the read latency even more ?
> 
>
> Avoid Cassandra forwarding request to other nodes:
> - Use consistency level ONE;
> - Create data model to do single request with single key, since different
> keys may belong to different nodes and requires forwarding requests to them;
> - Use smart client to calculate token for key and select appropriate node
> (primary or replica) by token range;
> - Turn off Dynamic Snitch (it may forward the request to another replica even
> if it has the data);
> - Have all or hot data in page cache (no HDD disk IO) or use SSD;
> - If you do regular updates to a key, do not use row cache; otherwise you
> may try it.
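
A minimal sketch of the token calculation the smart-client point above refers
to, for RandomPartitioner: the partitioner takes the MD5 hash of the raw key
bytes as an absolute BigInteger. The class name below is made up for
illustration.

import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: mirrors RandomPartitioner's MD5-based token so a
// client can map a key onto the ring and pick the owning node itself.
public class TokenSketch
{
    public static BigInteger token( byte[] rawKey ) throws NoSuchAlgorithmException
    {
        byte[] digest = MessageDigest.getInstance( "MD5" ).digest( rawKey );
        // RandomPartitioner takes the absolute value of the 128-bit MD5 hash.
        return new BigInteger( digest ).abs();
    }
}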
>

RE: sstableloader 1.1 won't stream

2012-05-18 Thread Pieter Callewaert
Hi,

Sorry to say I didn't look further into this. I'm now using CentOS 6.2 for the
loader without any problems.

Kind regards,
Pieter Callewaert

-Original Message-
From: sj.climber [mailto:sj.clim...@gmail.com] 
Sent: vrijdag 18 mei 2012 3:56
To: cassandra-u...@incubator.apache.org
Subject: Re: sstableloader 1.1 won't stream

Pieter, Aaron,

Any further progress on this?  I'm running into the same issue, although in my 
case I'm trying to stream from Ubuntu 10.10 to a 2-node cluster (also Cassandra 
1.1.0, and running on separate Ubuntu 10.10 hosts).

Thanks in advance!

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/sstableloader-1-1-won-t-stream-tp7535517p7564811.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.




Re: Zurich / Swiss / Alps meetup

2012-05-18 Thread Benoit Perroud
+1 !



2012/5/17 Sasha Dolgy :
> All,
>
> A year ago I made a simple query to see if there were any users based in and
> around Zurich, Switzerland or the Alps region, interested in participating
> in some form of Cassandra User Group / Meetup.  At the time, 1-2 replies
> happened.  I didn't do much with that.
>
> Let's try this again.  Who all is interested?  I often am jealous about all
> the fun I miss out on with the regular meetups that happen stateside ...
>
> Regards,
> -sd
>
> --
> Sasha Dolgy
> sasha.do...@gmail.com



-- 
sent from my Nokia 3210


Re: cassandra read latency help

2012-05-18 Thread Radim Kolar
To get 100 random reads per second on a large dataset (100 GB) you need
more disks in RAID 0 than 2.
It is better to add more nodes than to stick too many disks into one node. You
also need to adjust the IO scheduler in the OS.


Re: Cassandra 1.0.6 multi data center read question

2012-05-18 Thread Tom Duffield (Mailing Lists)
Hey Roshan, 
Read requests accepted by the coordinator node in your PROD environment will
only be sent to your DR data center if you use a consistency level that
allows it. The easiest way to ensure you are only reading from production
is to use LOCAL_QUORUM or ONE on all reads in your PROD system. Unless you
manage your Cassandra ring closely, other consistency levels could result in
data being read from DR.
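
For illustration, a raw Thrift read pinned to LOCAL_QUORUM. This is a sketch:
the connection setup is omitted, and the column family, key, and column names
are assumptions.

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class LocalRead
{
    // Read a single column at LOCAL_QUORUM so the request is satisfied
    // entirely inside the local (PROD) data center.
    public static void read( Cassandra.Client client ) throws Exception
    {
        ColumnPath path = new ColumnPath( "MyColumnFamily" ); // assumed CF name
        path.setColumn( ByteBuffer.wrap( "mycolumn".getBytes( "UTF-8" ) ) );
        client.get( ByteBuffer.wrap( "mykey".getBytes( "UTF-8" ) ),
                    path, ConsistencyLevel.LOCAL_QUORUM );
    }
}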

Hope this helps!

Tom 

-- 
Tom Duffield (Mailing Lists)
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Friday, May 18, 2012 at 12:51 AM, Roshan wrote:

> Hi 
> 
> I have set up a Cassandra cluster in production and a separate cluster in
> our DR environment. The setup is basically a 2 data center setup.
> 
> I want to create a separate keyspace on production (production has some
> other keyspaces) and only that keyspace will sync the data with DR.
> 
> If I do a read operation on production, will that read operation go to
> DR as well? If so, can I disable that call?
> 
> My primary purpose is to keep the DR up to date but not have production
> communicate with DR.
> 
> Thanks.
> 
> /Roshan 
> 
> --
> View this message in context: 
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-1-0-6-multi-data-center-read-question-tp7564940.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
> Nabble.com.
> 
> 




unable to nodetool to remote EC2

2012-05-18 Thread ramesh

I updated cassandra-env.sh:
JMX_HOST="10.20.30.40"
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=$JMX_HOST"

netstat -ltn shows port 7199 is listening.

I tried both public and private IP for connecting but neither helps.

However, I am able to connect locally from within the server.

I get this error when I connect remotely:

Error connection to remote JMX agent! java.rmi.ConnectException: 
Connection refused to host: 10.20.30.40; nested exception is: 
java.net.ConnectException: Connection timed out at 
sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601) at 
sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198) 
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184) 
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110) at 
javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source) 
at 
javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2329) 
at 
javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:279) 
at 
javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:248) 
at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:144) at 
org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:114) at 
org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:623) Caused by: 
java.net.ConnectException: Connection timed out at 
java.net.PlainSocketImpl.socketConnect(Native Method) at 
java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) at 
java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) at 
java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) at 
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at 
java.net.Socket.connect(Socket.java:529) at 
java.net.Socket.connect(Socket.java:478) at 
java.net.Socket.<init>(Socket.java:375) at java.net.Socket.<init>(Socket.java:189) 
at 
sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22) 
at 
sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128) 
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595) ... 
10 more


Any help appreciated.
Regards
Ramesh


Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ

2012-05-18 Thread Piavlo

 Hi,

I had a schema disagreement problem in a cassandra 1.0.9 cluster, where 
one node had a different schema version.
So I followed the FAQ at
http://wiki.apache.org/cassandra/FAQ#schema_disagreement
(disabled gossip, disabled thrift, drained, and finally stopped the
cassandra process), and on startup I noticed
INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
Couldn't detect any schema definitions in local storage.

in the log, and after
INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
Bootstrap/Replace/Move completed! Now serving reads.
it started throwing Fatal exceptions for all read/write operations 
endlessly.


I had to stop the cassandra process again (no draining was done).

On the second start it came up OK, immediately loading the correct
cluster schema version:
INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) 
Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7


But now this node appears to have started with no data from the keyspace
that had the schema disagreement.

The original keyspace sstables now appear under snapshots dir.

# nodetool -h localhost ring
Address         DC      Rack  Status  State   Load        Owns    Token
                                                                  141784319550391026443072753096570088106
10.49.127.4     eu-west 1a    Up      Normal  8.19 GB     16.67%  0
10.241.29.65    eu-west 1b    Up      Normal  8.18 GB     16.67%  28356863910078205288614550619314017621
10.59.46.236    eu-west 1c    Up      Normal  8.22 GB     16.67%  56713727820156410577229101238628035242
10.50.33.232    eu-west 1a    Up      Normal  8.2 GB      16.67%  85070591730234615865843651857942052864
10.234.71.33    eu-west 1b    Up      Normal  8.15 GB     16.67%  113427455640312821154458202477256070485
10.58.249.118   eu-west 1c    Up      Normal  660.98 MB   16.67%  141784319550391026443072753096570088106

#

The node in question is the one with 660.98 MB of data (the opscenter keyspace
data, which was not invalidated).


So I have some questions:

1) What did I do wrong? Why was cassandra throwing exceptions on the first
startup?

2) Why was the keyspace data invalidated? Is that expected?
3) If the answer to #2 is "yes, it's expected", then what's the point in
following http://wiki.apache.org/cassandra/FAQ#schema_disagreement
if all the keyspace data is lost anyway? It makes more sense to just do
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
4) AFAIU I could also stop cassandra again, move the old sstables from the
snapshot back to the keyspace data dir, and run repair for all the keyspace
CFs, so that it finishes faster and creates less load than running a repair
that has no previous keyspace data at all?


The first startup log is below:

 INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 
105) Logging initialized
 INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 
126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
127) Heap size: 2600468480/2600468480
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
128) Classpath: 
/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/java/mx4j-tools.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.0.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra//lib/jamm-0.2.5.jar
 INFO [main] 2012-05-18 16:23:10,661 CLibrary.java (line 109) JNA 
mlockall successful
 INFO [main] 2012-05-18 16:23:10,692 DatabaseDescriptor.java (line 114) 
Loading settings from file:/etc/cassandra/ssa/cassandra.yaml
 INFO [main] 2012-05-18 16:23:10,868 DatabaseDescriptor.java (line 168) 
DiskAccessMode 'auto' determined to 

Re: unable to nodetool to remote EC2

2012-05-18 Thread Tyler Hobbs
Your firewall rules need to allow TCP traffic on any port >= 1024 for JMX
to work.  It initially connects on port 7199, but then the client is asked
to reconnect on a randomly chosen port.

You can open the firewall, SSH to the node first, or set up something like
this: http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
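
For reference, the two-step connection nodetool makes looks roughly like the
sketch below (the IP is the one from the original post). The JNDI lookup goes
to the fixed port 7199, but the stub it hands back reconnects on the randomly
chosen high port, which is what the firewall must also allow.

import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxConnect
{
    public static void main( String[] args ) throws Exception
    {
        // Step 1: RMI registry lookup on the fixed port 7199.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://10.20.30.40:7199/jmxrmi" );
        // Step 2: the returned stub reconnects on a random port >= 1024,
        // so that port must be reachable too.
        JMXConnector connector = JMXConnectorFactory.connect( url, null );
        System.out.println( connector.getConnectionId() );
        connector.close();
    }
}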




-- 
Tyler Hobbs
DataStax 


Re: unable to nodetool to remote EC2

2012-05-18 Thread ramesh

On 05/18/2012 01:35 PM, Tyler Hobbs wrote:
Your firewall rules need to allow TCP traffic on any port >= 1024 for 
JMX to work.  It initially connects on port 7199, but then the client 
is asked to reconnect on a randomly chosen port.


You can open the firewall, SSH to the node first, or set up something 
like this: 
http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html


It helped.
Thanks Tyler for the info and the link to the post.

Regards
Ramesh


Re: Migrating a column family from one cluster to another

2012-05-18 Thread Rob Coli
On Thu, May 17, 2012 at 9:37 AM, Bryan Fernandez  wrote:
> What would be the recommended
> approach to migrating a few column families from a six node cluster to a
> three node cluster?

The easiest way (if you are not using counters) is:

1) make sure all filenames of the sstables are unique [1] (one approach is
sketched after this list)
2) copy all sstable files from the 6 nodes to all 3 nodes
3) run a "cleanup" compaction on the 3 nodes
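
For step 1, one possible approach is sketched below. It assumes the 1.x
on-disk name pattern <prefix>-<generation>-<Component> and renames all
components of each sstable together, assigning fresh generation numbers.
Illustrative only: run it against a copy first, and start the new generation
above anything already present on the target node.

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class RenumberSSTables
{
    public static void main( String[] args )
    {
        File dir = new File( args[0] );            // a directory of sstable files
        int nextGen = Integer.parseInt( args[1] ); // above any existing generation
        File[] files = dir.listFiles();
        if ( files == null )
            return;
        // Group component files by their stem, i.e. everything up to the
        // component suffix: "ks-cf-hc-3-Data.db" -> stem "ks-cf-hc-3".
        TreeMap<String, List<File>> byStem = new TreeMap<String, List<File>>();
        for ( File f : files )
        {
            String name = f.getName();
            int lastDash = name.lastIndexOf( '-' );
            if ( !f.isFile() || lastDash < 0 )
                continue;
            String stem = name.substring( 0, lastDash );
            if ( !byStem.containsKey( stem ) )
                byStem.put( stem, new ArrayList<File>() );
            byStem.get( stem ).add( f );
        }
        // Rename every component of each sstable to a fresh generation.
        for ( String stem : byStem.keySet() )
        {
            String prefix = stem.substring( 0, stem.lastIndexOf( '-' ) );
            int gen = nextGen++;
            for ( File f : byStem.get( stem ) )
            {
                String component = f.getName().substring( stem.length() + 1 );
                File renamed = new File( dir, prefix + "-" + gen + "-" + component );
                if ( !f.renameTo( renamed ) )
                    throw new RuntimeException( "rename failed for " + f );
            }
        }
    }
}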

=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-1983

-- 
=Robert Coli
AIM>ALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: Migrating a column family from one cluster to another

2012-05-18 Thread Poziombka, Wade L
How do counters affect this?  Why would it be different?

Sent from my iPhone

On May 18, 2012, at 15:40, "Rob Coli"  wrote:

> On Thu, May 17, 2012 at 9:37 AM, Bryan Fernandez  
> wrote:
>> What would be the recommended
>> approach to migrating a few column families from a six node cluster to a
>> three node cluster?
> 
> The easiest way (if you are not using counters) is:
> 
> 1) make sure all filenames of sstables are unique [1]
> 2) copy all sstablefiles from the 6 nodes to all 3 nodes
> 3) run a "cleanup" compaction on the 3 nodes
> 
> =Rob
> [1] https://issues.apache.org/jira/browse/CASSANDRA-1983
> 
> -- 
> =Robert Coli
> AIM>ALK - rc...@palominodb.com
> YAHOO - rcoli.palominob
> SKYPE - rcoli_palominodb


Re: Migrating a column family from one cluster to another

2012-05-18 Thread Rob Coli
On Fri, May 18, 2012 at 1:41 PM, Poziombka, Wade L
 wrote:
> How does counters affect this?  Why would be different?

Oh, actually this is an obsolete caution as of Cassandra 0.8beta1:

https://issues.apache.org/jira/browse/CASSANDRA-1938

Sorry! :)

=Rob
PS - for historical reference, before this ticket the counters were
based on the IP address of the nodes, and things would be hosed if you
did the copy-all-the-sstables operation. It is easy for me to forget
that almost no one was using cassandra counters before 0.8, heh.

-- 
=Robert Coli
AIM>ALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: cassandra read latency help

2012-05-18 Thread Gurpreet Singh
Thanks, Radim.
Actually, 100 reads per second is achievable even with 2 disks.
But achieving that with a really low avg latency per key is the issue.

I am wondering if anyone has played with index_interval, and how much of a
difference reducing index_interval would make to reads. I am thinking of
devoting a 32 GB RAM machine to this node and decreasing index_interval
from 128 to 8.

For 500 million keys, this would mean 500/8 ~ 64 million keys in memory.

index overhead = 64 million * (32 + avg key size) (
http://www.datastax.com/docs/1.0/cluster_architecture/cluster_planning)
My avg key size is 8, hence
overhead = 64 million * 40 = 2.56 GB (is this number the same as the size in
memory?).
If yes, then it's not too bad, and it eliminates the index disk read for a
large majority of the keys.
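
A quick check of that arithmetic (the 32 bytes of fixed per-entry overhead is
the figure from the DataStax planning page cited above; the post's 64 million
and 2.56 GB are the same numbers rounded up):

public class IndexOverhead
{
    public static void main( String[] args )
    {
        long totalKeys = 500000000L;     // 500 million keys
        int indexInterval = 8;           // down from the default 128
        int avgKeySize = 8;              // bytes, per the post
        long sampledKeys = totalKeys / indexInterval;            // 62.5 million
        long overheadBytes = sampledKeys * ( 32 + avgKeySize );  // ~2.5 GB
        System.out.printf( "sampled=%d, overhead=%.2f GB%n",
                           sampledKeys, overheadBytes / 1e9 );
    }
}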

Also, my rows uniformly have 2 columns. Will sstable compression help my
reads in any way?
Thanks
Gurpreet





On Fri, May 18, 2012 at 6:19 AM, Radim Kolar  wrote:

> To get 100 random reads per second on a large dataset (100 GB) you need more
> disks in RAID 0 than 2.
> It is better to add more nodes than to stick too many disks into one node. You
> also need to adjust the IO scheduler in the OS.
>


Cassandra 1.1.0 NullCompressor and DecoratedKey errors

2012-05-18 Thread Ron Siemens

We have some production Solaris boxes, so I can't use SnappyCompressor (no 
library is included for Solaris); I set it to JavaDeflate instead. I've also 
noticed higher load with 1.1.0 versus 1.0.6: could this be JavaDeflate, or is 
that what the old default was? Anyway, I thought I would try no compression, 
since I found code like the following in one of the issue discussions of 
SnappyCompression.

// Note: Cassandra resolves the compressor by class name; see the follow-up
// below about placing this class in the package Cassandra expects.
import java.io.IOException;
import java.util.Map;

public class NullCompressor implements ICompressor
{
    public static final NullCompressor instance = new NullCompressor();

    public static NullCompressor create( Map<String, String> compressionOptions ) {
        return instance;
    }

    public int initialCompressedBufferLength( int chunkLength ) {
        // The "compressed" output is exactly the size of the input.
        return chunkLength;
    }

    public int compress( byte[] input, int inputOffset, int inputLength, 
ICompressor.WrappedArray output, int outputOffset ) throws IOException {
        // Pass-through: copy the chunk unchanged.
        System.arraycopy( input, inputOffset, output.buffer, outputOffset, 
inputLength );
        return inputLength;
    }

    public int uncompress( byte[] input, int inputOffset, int inputLength, 
byte[] output, int outputOffset ) throws IOException {
        System.arraycopy( input, inputOffset, output, outputOffset, inputLength 
);
        return inputLength;
    }

}

But now I get some curious errors in the Cassandra log that I haven't seen 
previously:

ERROR [ReadStage:294] 2012-05-18 15:33:40,039 AbstractCassandraDaemon.java 
(line 134) Exception in thread Thread[ReadStage:294,5,main]
java.lang.AssertionError: DecoratedKey(105946799083363489728328364782061531811, 
57161d05b50004b3130008007e04c057161d05b60004ae380008007f04c057161d05b700048d610008008004c057161d05b80004c1040008008104c057161d05b900048ac10008008204c057161d05ba0004ae8b0008008304c057161d05bb000474950008008404c057161d05bc0004bb240008008504c057161d05bd0004ba320008008604c057161d05be0004be9a0008008704c057161d05bf0004b9fa0008008804c057161d05c48e7f0008008904c057161d05c10004ba590008008a04c057161d05c20004b64d0008008b04c057161d05c30004bae30008008c04c057161d05c40004bee50008008d04c057161d05c5000487590008008e04c057161d05c60004bad8008f04c057161d05c70004badb0008009004c057161d05c80004bf140008009104c057161d05c90004b7ec0008009204c057161d05ca0004bace0008009304c057161d05cb0004ba170008009404c057161d05cc000484a10008009504c057161d05cd000495670008009604c057161d05ce0004ab98009704c057161d05cf0004b6110008009804c057161d05d4af550008009904c057161d05d10004abfc0008009a04c057161d05d20004bf350008009b04c057161d05d30004bacd0008009c04c057161d05d40004bd0a0008009d04c057161d05d50004bac10008009e04c057161d05d60004af530008009f04c057161d05d70004b97a000800a004c057161d05d80004af13000800a104c057161d05d90004a25600085452535f32373138004300130001020008004fb6cd4b0004c05716072b78000100080001000104c05716072b790004b87900074348535f3435360309001800030002be188961212f0cd18f5ddb69e0a336ed4fb6cd4f0004c057164b73f0001b00080001000204c057164b73f1000479ba00080002000104c057164b73f20004b4380003000104c057164b73f30004b462)
 != DecoratedKey(53124083656910387079795798648228312597, 5448535f323739) in 
/home/apollo/cassandra/data/ggstores3/ndx_items_category/ggstores3-ndx_items_category-hc-1-Data.db
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:58)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:66)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:78)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:233)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:61)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1273)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1155)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1090)
at or

Re: Cassandra 1.1.0 NullCompressor and DecoratedKey errors

2012-05-18 Thread Ron Siemens

I decided to wipe cassandra clean and try again. I haven't seen the error again 
yet, but I will report back if I do. This may have been a symptom of having some 
previous data around, since my steps were:

1. shutdown and wipe data
2. run with NullCompressor
3. notice Cassandra complaining that the compressor is not in the package 
org.apache.cassandra.io
4. shutdown
5. move compressor to expected package
6. run with NullCompressor

I can't remember if I did another wipe after step 4, so there may have been some 
data in a bad state. It seems the client side didn't care what package the 
compressor was in, but the server side did.

Unless I see the error again, I'm guessing there was some data left over 
between trials.

Ron



Re: unable to nodetool to remote EC2

2012-05-18 Thread ramesh

On 05/18/2012 01:35 PM, Tyler Hobbs wrote:
Your firewall rules need to allow TCP traffic on any port >= 1024 for 
JMX to work.  It initially connects on port 7199, but then the client 
is asked to reconnect on a randomly chosen port.


You can open the firewall, SSH to the node first, or set up something 
like this: 
http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html


Got JConsole to work this way, but I am unable to get a similar script for 
nodetool to work. Are there any guides or pointers for performing nodetool 
operations remotely, with or without authentication?


Also, is DataStax OpsCenter a replacement for nodetool?

regards,
Ramesh
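
I am not aware of a packaged guide, but nodetool is a thin wrapper around
NodeProbe (visible in the stack trace above), so once the SSH tunnel is up you
can also drive it programmatically against the tunneled endpoint. A sketch,
with the constructor as it appears in the stack trace; the method shown is
recalled from the 1.0.x source, so verify it against your version.

import org.apache.cassandra.tools.NodeProbe;

public class RemoteNodetool
{
    public static void main( String[] args ) throws Exception
    {
        // Point at the local end of the SSH tunnel.
        NodeProbe probe = new NodeProbe( "127.0.0.1", 7199 );
        System.out.println( "Live nodes: " + probe.getLiveNodes() );
    }
}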