Getting stats of keyspaces

2012-07-16 Thread Thierry Templier

Hello,

I wonder if it's possible to get statistics for a keyspace, such as its 
size and the size of each column family it contains. It's something I'd 
like to obtain from a request...


Thanks very much for your help.
Thierry


Cassandra occupy over 80% CPU when take a compaction

2012-07-16 Thread 黄荣桢
Hello,

I find that the compaction of my secondary index takes a long time and occupies a
lot of CPU.

 INFO [CompactionExecutor:8] 2012-07-16 12:03:16,408 CompactionTask.java
(line 213) Compacted to [XXX].  71,018,346 to 9,020 (~0% of original) bytes
for 3 keys at 0.22MB/s.  Time: 397,602ms.

The stack trace of this overloaded thread is:
"CompactionReducer:5" - Thread t@1073
   java.lang.Thread.State: RUNNABLE
    at java.util.AbstractList$Itr.remove(AbstractList.java:360)
    at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedStandard(ColumnFamilyStore.java:851)
    at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:835)
    at org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:826)
    at org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:77)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer$MergeTask.call(ParallelCompactionIterable.java:224)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer$MergeTask.call(ParallelCompactionIterable.java:198)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
    - locked <4be5863d> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

I guess this problem is due to the huge number of columns in my index. The
column that is indexed has only 3 distinct values, and one possible value
has several million records, so this index has several million columns.
Compacting these columns takes a long time.

I find a similar issue on the jira:
https://issues.apache.org/jira/browse/CASSANDRA-3592

Is there any way to work around this issue?  Is there any way to improve
the efficiency of compacting this index?
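The hot frame in the stack trace above is java.util.AbstractList$Itr.remove, which on an array-backed list shifts every later element left by one slot, so deleting millions of columns one at a time from a single wide index row degenerates into quadratic work. A sketch of the effect (illustrative Python, not Cassandra code; a single-pass rewrite of the same filtering is linear):

```python
# Illustration of why per-element iterator removal is slow on a wide row:
# deleting via index on an array-backed list costs O(n) per removal
# (every later element shifts left), so a sweep that drops most of a
# multi-million-column row is O(n^2) overall.
def remove_deleted_quadratic(columns, is_deleted):
    i = 0
    while i < len(columns):
        if is_deleted(columns[i]):
            del columns[i]  # O(n) shift, like AbstractList$Itr.remove
        else:
            i += 1
    return columns

# Equivalent single-pass filter: O(n) total, same surviving columns.
def remove_deleted_linear(columns, is_deleted):
    columns[:] = [c for c in columns if not is_deleted(c)]
    return columns
```

Both functions keep exactly the same columns; only the cost differs, which matches the symptom of a tiny output taking hundreds of seconds.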


Re: Enable CQL3 from Astyanax

2012-07-16 Thread Thierry Templier

Hello Aaron,

I'm trying to simulate a composition relationship within a single column 
family / table (for example, an entity and its fields). I dynamically 
add columns for the contained elements.


Let's take an example. Here is my table definition with CQL 3:

CREATE TABLE "Entity" (
"id" varchar,
"name" varchar,
PRIMARY KEY ("id")
);

If I want to store an entity with its two fields, I'll have the 
following columns:


id: "myentityid"
name: "myentityname"
fields.0.id: "myfield1id"
fields.0.name: "myfield1name"
fields.1.id: "myfield2id"
fields.1.name: "myfield2name"

When accessing Cassandra data through Astyanax, I get all fields on a 
"load" operation but not from a CQL3 request.


Thanks very much for your help.
Thierry


Can you provide an example?

select * should return all the columns from the CF.

Cheers


Re: Concerns about Cassandra upgrade from 1.0.6 to 1.1.X

2012-07-16 Thread aaron morton
The advice Tyler gave is correct. Do a rolling upgrade, and snapshot if you 
want to have a rollback. 

My personal approach is to upgrade a node or two and let them run for a few hours, 
just to avoid the situation where you upgrade every node and then discover some 
problem that causes wailing and gnashing of teeth. 

In general, node by node:
* drain
* snapshot
* shutdown 
* upgrade
* turn on. 

When they are all up I snapshot again if there is space. Then run 
upgradesstables. 
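A rough scripting of that node-by-node loop (a sketch only: host names and the ssh/service/package commands are assumptions about your environment; the nodetool subcommands are the 1.x-era ones):

```python
import subprocess

# Hypothetical automation of the drain / snapshot / shutdown / upgrade /
# start loop described above. The ssh, service, and package-manager calls
# are placeholders for whatever your environment uses; `run` is injectable
# so the sequence can be exercised without a live cluster.
def upgrade_node(host, run=subprocess.check_call):
    run(["nodetool", "-h", host, "drain"])                    # flush, stop accepting writes
    run(["nodetool", "-h", host, "snapshot", "pre-upgrade"])  # rollback point
    run(["ssh", host, "sudo", "service", "cassandra", "stop"])
    run(["ssh", host, "sudo", "apt-get", "install", "-y", "cassandra"])  # assumed packaging
    run(["ssh", host, "sudo", "service", "cassandra", "start"])

def rolling_upgrade(hosts, run=subprocess.check_call):
    for host in hosts:
        upgrade_node(host, run)
    # once every node is back up (and optionally snapshotted again):
    for host in hosts:
        run(["nodetool", "-h", host, "upgradesstables"])
```

Injecting a recording function for `run` lets you dry-run the whole sequence before pointing it at production.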

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 11:06 AM, Roshan wrote:

> Thanks Aaron. My major concern is upgrading node by node, because currently we
> are using 1.0.6 in production and the plan is to upgrade a single node to 1.1.2 at
> a time.
> 
> Any comments?
> 
> Thanks.
> 
> --
> View this message in context: 
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Concerns-about-Cassandra-upgrade-from-1-0-6-to-1-1-X-tp7581197p7581221.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
> Nabble.com.



Re: High RecentWriteLatencyMicro

2012-07-16 Thread aaron morton
The write path for counters is different from that of non-counter columns; for 
background see 
http://www.datastax.com/wp-content/uploads/2011/07/cassandra_sf_counters.pdf
 
The write is applied on the leader *and then* replicated to the other replicas. 
This was controlled by a config setting called replicate_on_write which IIRC 
has been removed because you always want to do this. You can see this traffic 
in the REPLICATE_ON_WRITE thread pool. 

Have a look at the ROW stage and see if it is backing up. 
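One way to act on that: poll nodetool tpstats and watch the Pending column for the replicate-on-write pool. A sketch (the pool name "ReplicateOnWriteStage" and the column order are assumptions based on 1.0-era output; verify against your own tpstats first):

```python
# Sketch: pull the Pending count for the replicate-on-write pool out of
# `nodetool tpstats` output. Assumes the 1.0-era layout of
#   Pool Name    Active    Pending    Completed
# and a pool name starting with "ReplicateOnWrite".
def replicate_on_write_pending(tpstats_text):
    for line in tpstats_text.splitlines():
        if line.startswith("ReplicateOnWrite"):
            return int(line.split()[2])  # name, active, pending, completed
    return None
```

Feeding it the output of `nodetool -h <host> tpstats` on a timer shows whether Pending keeps growing while the cluster is at peak throughput.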

> 1) Is the whole of 7-8ms being spent in thrift overheads and
> Scheduling delays ? (there is insignificant .1ms ping time between
> machines)
The storage proxy / jmx latency is the total latency for the coordinator after 
the thrift deserialisation (and before serialising the response).  7 to 8 ms 
sounds a little high considering the low local node latency. But it would make 
sense if the nodes were at peak throughput. At max throughput request latency 
is wait time + processing time. 

What happens to node local latency and cluster latency when the throughput goes 
down?

Also this will be responsible for some of that latency…
> (GC
> stops threads for 100ms every 1-2 seconds, effectively pausing
> cassandra 5-10% of its time, but this doesn't seem to be the reason)


> 2) Do keeping a large number of CF(17 in our case) adversely affect
> write performance? (except from the extreme flushing scenario)
Should be fine with 17

> 3) I see a lot of threads(4,000-10,000) with names like
> "pool-2-thread-*" 
These are connection threads. Use connection pooling or try the thread-pooled 
connection manager; see the yaml for details. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 3:48 PM, rohit bhatia wrote:

> Hi
> 
> As I understand that writes in cassandra are directly pushed to memory
> and using counters with CL.ONE shouldn't take the read latency for
> counters in account. So Writes for incrementing counters with CL.ONE
> should basically be really fast.
> 
> But in my 8 node cluster(16 core/32G ram/cassandra1.0.5/java7 each)
> with RF=2, At a traffic of 55k qps = 14k increments per node/7k write
> requests per node, the write latency(from jmx) increases to around 7-8
> ms from the low traffic value of 0.5ms.  The Nodes aren't even pushed
> with absent I/O, lots of free RAM and 30% CPU idle time/OS Load 20.
> The write latency by cfstats (supposedly the latency for 1 node to
> increment its counter) is a small amount (< 0.05ms).
> 
> 1) Is the whole of 7-8ms being spent in thrift overheads and
> Scheduling delays ? (there is insignificant .1ms ping time between
> machines)
> 
> 2) Do keeping a large number of CF(17 in our case) adversely affect
> write performance? (except from the extreme flushing scenario)
> 
> 3) I see a lot of threads(4,000-10,000) with names like
> "pool-2-thread-*" (pointed out as client-connection-threads on the
> mailing list before) periodically forming up. but with idle cpu time
> and zero pending tasks in tpstats, why do requests keep piling up (GC
> stops threads for 100ms every 1-2 seconds, effectively pausing
> cassandra 5-10% of its time, but this doesn't seem to be the reason)
> 
> Thanks
> Rohit



Re: How to come up with a predefined topology

2012-07-16 Thread aaron morton
> Is the above understanding correct ?
yes, sorry.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
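The n-versus-r arithmetic debated below can be sketched directly. This encodes Prasenjit's proposed rule (every rack gets int(r/n) replicas and the first r % n racks get one extra); it is illustrative only, not the actual NetworkTopologyStrategy code:

```python
def replicas_per_rack(r, n):
    """Proposed distribution of r replicas over n racks:
    every rack gets r // n, and the first r % n racks get one more."""
    base, extra = divmod(r, n)
    return [base + (1 if i < extra else 0) for i in range(n)]

# r = 8 replicas over n = 3 racks gives [3, 3, 2], matching the sample
# use case in the thread; with n > r the first r racks each get a single
# replica and the rest get none.
```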

On 13/07/2012, at 4:24 PM, prasenjit mukherjee wrote:

> On Fri, Jul 13, 2012 at 4:04 AM, aaron morton  wrote:
>> The logic is here
>> https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L78
> 
> Thanks Aaron for pointing to the code.
> 
>> 
>> a. n>r : I am assuming, have 1 replica in each rack.
>> 
>> You have 1 replica in the first n racks.
>> 
>> b. n<r : how many replicas in each rack ?
>> 
>> int(n/r) racks will have the same number of replicas. n % r will have more.
> 
> Did you mean  r%n ( since r>n)  ?
> 
> Shouldn't the logic be : all racks will have at least int(r/n) and r%n
> will have 1 additional replica ?
> 
> Sample use case ( r = 8, n = 3 )
> n1 : 3 ( 2+1 )
> n2:  3 ( 2+1 )
> n3:  2
> 
> Is the above understanding correct ?
> 
> -Thanks,
> Prasenjit
> 
>> 
>> This is why multi rack replication can be tricky.
>> 
>> Hope that helps.
>> 
>> 
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 12/07/2012, at 8:05 PM, prasenjit mukherjee wrote:
>> 
>> Thanks. Some follow up questions :
>> 
>> 1.  How do the reads use strategy/snitch information ? I am assuming
>> the reads can go to any of the replicas. Will it also use the
>> snitch/strategy info to find next 'R' replicas 'closest' to
>> coordinator-node ?
>> 
>> 2. In a single DC ( with n racks and r replicas ) what algorithm
>> cassandra uses to write its replicas in following scenarios :
>> a. n>r : I am assuming, have 1 replica in each rack.
>> b. n<r : how many replicas in each rack ?
>> 
>> -Thanks,
>> Prasenjit
>> 
>> On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs  wrote:
>> I highly recommend specifying the same rack for all nodes (using
>> cassandra-topology.properties) unless you really have a good reason not to
>> (and you probably don't).  The way that replicas are chosen when multiple
>> racks are in play can be fairly confusing and lead to a data imbalance if
>> you don't catch it.
>> 
>> On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee 
>> wrote:
>> 
>> As far as I know there isn't any way to use the rack name in the
>> strategy_options for a keyspace. You
>> might want to look at the code to dig into that, perhaps.
>> 
>> Aha, I was wondering if I could do that as well ( specify rack options )
>> :)
>> 
>> Thanks for the pointer, I will dig into the code.
>> 
>> -Thanks,
>> Prasenjit
>> 
>> On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe 
>> wrote:
>> If you then specify the parameters for the keyspace to use these, you
>> can control exactly which set of nodes replicas end up on.
>> 
>> For example, in cassandra-cli:
>> 
>> create keyspace ks1 with placement_strategy =
>> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options
>> = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
>> 
>> As far as I know there isn't any way to use the rack name in the
>> strategy_options for a keyspace. You might want to look at the code to dig
>> into that, perhaps.
>> 
>> Whichever snitch you use, the nodes are sorted in order of proximity to
>> the client node. How this is determined depends on the snitch that's used
>> but most (the ones that ship with Cassandra) will use the default ordering
>> of same-node < same-rack < same-datacenter < different-datacenter. Each
>> snitch has methods to tell Cassandra which rack and DC a node is in, so it
>> always knows which node is closest. Used with the Bloom filters this can
>> tell us where the nearest replica is.
>> 
>> -Original Message-
>> From: prasenjit mukherjee [mailto:prasen@gmail.com]
>> Sent: 11 July 2012 06:33
>> To: user
>> Subject: How to come up with a predefined topology
>> 
>> Quoting from
>> http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
>> :
>> 
>> "Asymmetrical replication groupings are also possible depending on your
>> use case. For example, you may want to have three replicas per data center
>> to serve real-time application requests, and then have a single replica in a
>> separate data center designated to running analytics."
>> 
>> Have 2 questions :
>> 1. Any example how to configure a topology with 3 replicas in one DC (
>> with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?
>> The default networktopologystrategy with rackinferringsnitch will only
>> give me equal distribution ( 2+2 )
>> 
>> 2. I am assuming the reads can go to any of the replicas. Is there a
>> client which will send query to a node ( in cassandra ring ) which is
>> closest to the client ?
>> 
>> -Thanks,
>> Prasenjit

Re: Never ending manual repair after adding second DC

2012-07-16 Thread aaron morton
> Now, pretty much every single scenario points towards connectivity
> problem, however we also have few PostgreSQL replication streams
In the before time someone had problems with a switch/router that was dropping 
persistent but idle connections. Doubt this applies, and it would probably 
result in an error, just throwing it out there.

Have you combed through the logs for errors or warnings?

I would repair a single small CF with -pr and watch closely. Consider setting 
DEBUG logging (you can do it via JMX) for:

org.apache.cassandra.service.AntiEntropyService  <- the class that manages repair
org.apache.cassandra.streaming                   <- the package that handles streaming

There was a fix to repair in 1.0.11 but that has to do with streaming 
https://github.com/apache/cassandra/blob/cassandra-1.0/CHANGES.txt#L5

Good luck. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 10:16 PM, Bart Swedrowski wrote:

> Hello everyone,
> 
> I'm facing quite a weird problem with Cassandra since we've added a
> secondary DC to our cluster and have totally run out of ideas; this
> email is a call for help/advice!
> 
> History looks like:
> - we used to have 4 nodes in a single DC
> - running Cassandra 0.8.7
> - RF:3
> - around 50GB of data on each node
> - randomPartitioner and SimpleSnitch
> 
> All was working fine for over 9 months.  Few weeks ago we decided we
> want to add another 4 nodes in a second DC and join them to the
> cluster.  Prior doing that, we upgraded Cassandra to 1.0.9 to push it
> out of the doors before the multi-DC work.  After upgrade, we left it
> working for over a week and it was all good; no issues.
> 
> Then, we added 4 additional nodes in another DC bringing the cluster
> to 8 nodes in total, spreading across two DCs, so now we've:
> - 8 nodes across 2 DCs, 4 in each DC
> - 100Mbps low-latency connection (sub 5ms) running over Cisco ASA
> Site-to-Site VPN (which is ikev1 based)
> - DC1:3,DC2:3 RFs
> - randomPartitioner and using PropertyFileSnitch now
> 
> nodetool ring looks as follows:
> $ nodetool -h localhost ring
> Address DC  RackStatus State   Load
> OwnsToken
> 
>148873535527910577765226390751398592512
> 192.168.81.2DC1 RC1 Up Normal  37.9 GB
> 12.50%  0
> 192.168.81.3DC1 RC1 Up Normal  35.32 GB
> 12.50%  21267647932558653966460912964485513216
> 192.168.81.4DC1 RC1 Up Normal  39.51 GB
> 12.50%  42535295865117307932921825928971026432
> 192.168.81.5DC1 RC1 Up Normal  19.42 GB
> 12.50%  63802943797675961899382738893456539648
> 192.168.94.178  DC2 RC1 Up Normal  40.72 GB
> 12.50%  85070591730234615865843651857942052864
> 192.168.94.179  DC2 RC1 Up Normal  30.42 GB
> 12.50%  106338239662793269832304564822427566080
> 192.168.94.180  DC2 RC1 Up Normal  30.94 GB
> 12.50%  127605887595351923798765477786913079296
> 192.168.94.181  DC2 RC1 Up Normal  12.75 GB
> 12.50%  148873535527910577765226390751398592512
> 
> (please ignore the fact that nodes are not interleaved; they should be
> however there's been hiccup during the implementation phase.  Unless
> *this* is the problem!)
> 
> Now, the problem: over 7 out of 10 manual repairs are not being
> finished.  They usually get stuck and show 3 different symptoms:
> 
>  1). Say node 192.168.81.2 runs manual repair, it requests merkle
> trees from 192.168.81.2, 192.168.81.3, 192.168.81.5, 192.168.94.178,
> 192.168.94.179, 192.168.94.181.  It receives them from 192.168.81.2,
> 192.168.81.3, 192.168.81.5, 192.168.94.178, 192.168.94.179 but not
> from 192.168.94.181.  192.168.94.181 logs are saying that it has sent
> the merkle tree back but it's never received by 192.168.81.2.
>  2). Say node 192.168.81.2 runs manual repair, it requests merkle
> trees from 192.168.81.2, 192.168.81.3, 192.168.81.5, 192.168.94.178,
> 192.168.94.179, 192.168.94.181.  It receives them from 192.168.81.2,
> 192.168.81.3, 192.168.81.5, 192.168.94.178, 192.168.94.179 but not
> from 192.168.94.181.  192.168.94.181 logs are not saying *anything*
> about merkle tree being sent.  Also compactionstats are not even
> saying anything about them being validated (generated)
>  3). Merkle trees are being delivered, and nodes are sending data
> across to sync themselves.  On certain occasions, they'll get
> "stuck" streaming files between each other at 100% and won't move
> forward.  Now the interesting bit is, the ones that are getting stuck
> are always placed in different DCs!
> 
> Now, pretty much every single scenario points towards a connectivity
> problem, however we also have a few PostgreSQL replication streams
> happening over this connection, some other traffic going over and
> quite a lot of monitoring happening, and none of those are being
> affected.

Re: bootstrapping problem. 1.1.2 version

2012-07-16 Thread aaron morton
Check net stats a few times to look for progress, if there is none take a look 
at the logs on both sides for errors.

Hope that helps. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/07/2012, at 10:53 PM, Michael Cherkasov wrote:

> Hi all,
> 
> I have only one node and trying to add new DC with one node too.
> 
> So I did all the steps according to this instruction: 
> http://www.datastax.com/docs/1.0/operations/cluster_management#adding-nodes-to-a-cluster
> 
> But it looks like nothing happens:
> 
> D:\db\apache-cassandra-1.1.2\bin>nodetool netstats
> Starting NodeTool
> Mode: JOINING
> Not sending any streams.
> Streaming from: /192.168.33.118
>DevKS: \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-12-Data.db 
> sections=1 progress=0/369 - 0%
>DevKS: 
> \home\user\cassandra-data\data\DevKS\TestCase\DevKS-TestCase-hd-11-Data.db 
> sections=1 progress=0/7255721 - 0%
>DevKS: 
> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-5-Data.db 
> sections=1 progress=0/113 - 0%
>DevKS: 
> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-6-Data.db 
> sections=1 progress=0/601578 - 0%
>DevKS: \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-13-Data.db 
> sections=1 progress=0/5138 - 0%
>DevKS: 
> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-4-Data.db 
> sections=1 progress=0/4049601 - 0%
>DevKS: \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-14-Data.db 
> sections=1 progress=0/4481977 - 0%
> Pool NameActive   Pending  Completed
> Commandsn/a 0  4
> Responses   n/a 0   3030
> 
> 
> there's a really good connection between the DCs; I'm pretty sure that there's no 
> problem with the connection.
> So what can be wrong there?
> 
> Also, there was one more DC before, which was removed by the 'removetoken' 
> command. For the new DC I reused the same DC name.



Re: bootstrapping problem. 1.1.2 version

2012-07-16 Thread Michael Cherkasov
I found this error:

ERROR [Streaming to /192.168.36.25:10] 2012-07-16 16:26:25,206
AbstractCassandraDaemon.java (line 134) Exception in thread
Thread[Streaming to /192.168.36.25:10,5,main]
java.lang.IllegalStateException: target reports current file is
\home\user\cassandra-data\data\DevKS\TestCase\DevKS-TestCase-hd-12-Data.db
but is
/home/user/cassandra-data/data/DevKS/TestCase/DevKS-TestCase-hd-12-Data.db
    at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:174)
    at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:59)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:208)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)

But I have no idea how to fix this. Also, as you can notice, there's a problem
with the path separator: the DCs are located in different environments, one on
Win7 and the other on Linux.
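The exception above reduces to a string comparison between two spellings of the same path, one with Windows separators and one with Unix separators. Illustrative only (this is not a patch for Cassandra, just the comparison the error message describes):

```python
# The IllegalStateException compares the path the target reports against
# the path the sender is streaming; with mixed Win7/Linux nodes they
# differ only in the separator character.
def normalize(path):
    return path.replace("\\", "/")

win = r"\home\user\cassandra-data\data\DevKS\TestCase\DevKS-TestCase-hd-12-Data.db"
unix = "/home/user/cassandra-data/data/DevKS/TestCase/DevKS-TestCase-hd-12-Data.db"

assert win != unix                        # the comparison that fails
assert normalize(win) == normalize(unix)  # same file once separators agree
```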

2012/7/16 aaron morton 

> Check net stats a few times to look for progress, if there is none take a
> look at the logs on both sides for errors.
>
> Hope that helps.
>
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 14/07/2012, at 10:53 PM, Michael Cherkasov wrote:
>
> Hi all,
>
> I have only one node and trying to add new DC with one node too.
>
> So I do all steps according this instruction
> http://www.datastax.com/docs/1.0/operations/cluster_management#adding-nodes-to-a-cluster
>
> But looks like nothing happens:
>
> D:\db\apache-cassandra-1.1.2\bin>nodetool netstats
> Starting NodeTool
> Mode: JOINING
> Not sending any streams.
> Streaming from: /192.168.33.118
>DevKS:
> \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-12-Data.db
> sections=1 progress=0/369 - 0%
>DevKS:
> \home\user\cassandra-data\data\DevKS\TestCase\DevKS-TestCase-hd-11-Data.db
> sections=1 progress=0/7255721 - 0%
>DevKS:
> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-5-Data.db
> sections=1 progress=0/113 - 0%
>DevKS:
> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-6-Data.db
> sections=1 progress=0/601578 - 0%
>DevKS:
> \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-13-Data.db
> sections=1 progress=0/5138 - 0%
>DevKS:
> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-4-Data.db
> sections=1 progress=0/4049601 - 0%
>DevKS:
> \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-14-Data.db
> sections=1 progress=0/4481977 - 0%
> Pool NameActive   Pending  Completed
> Commandsn/a 0  4
> Responses   n/a 0   3030
>
>
> there's really good connection between DCs, I'm pretty sure that there's
> no problem with connection.
> So what can be wrong there?
>
> Also there was one more DC before, which was removed by 'removetoken'
> command. for new DC I reused the same DC name.
>
>
>


Bulk Loading with Composite Column Slow?

2012-07-16 Thread Brian Reynolds
Hi

I'm using Cassandra 1.1.0 from Datastax and attempting to load a
ColumnFamily with a single column with a Composite name.  It seems to load
ok, but is much slower than similar code without the composite column.

I tried building the Composite outside the while loop and just copying it
before adding components in each iteration but that didn't seem to help.

thanks
bri

Code based on bulk loading tutorial:

List<AbstractType<?>> compositeList = new ArrayList<AbstractType<?>>();
compositeList.add(UTF8Type.instance);
compositeList.add(UTF8Type.instance);

SSTableSimpleUnsortedWriter usersWriter = new SSTableSimpleUnsortedWriter(
        directory,
        new RandomPartitioner(),
        keyspace,
        columnFamily,
        CompositeType.getInstance(compositeList),
        null,
        32);

String line;
int lineNumber = 1;
CsvEntry entry = new CsvEntry();
// There is no reason not to use the same timestamp for every column in that example.
long timestamp = System.currentTimeMillis() * 1000;
while ((line = reader.readLine()) != null)
{
    if (entry.parse(line, lineNumber))
    {
        if ((lineNumber % 1) == 0) {
            System.out.println("Line " + lineNumber);
        }
        CompositeType.Builder builder = new CompositeType.Builder(
                CompositeType.getInstance(compositeList));
        String siteKey = entry.key;
        usersWriter.newRow(bytes(siteKey));
        builder.add(bytes(entry.part1));
        builder.add(bytes(entry.part2));
        usersWriter.addColumn(builder.build(), bytes(entry.date), timestamp);
    }
    lineNumber++;
}


Re: Getting stats of keyspaces

2012-07-16 Thread Manoj Mainali
You can get the statistics using JMX.

See here : http://www.datastax.com/docs/1.1/operations/monitoring

Best regards,
Manoj
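If shelling out is more convenient than raw JMX, the same per-keyspace numbers can be scraped from nodetool cfstats. A sketch; the "Keyspace:" / "Column Family:" / "Space used (live):" labels match 1.x-era output but should be treated as assumptions and checked locally:

```python
# Sum "Space used (live)" per column family and per keyspace from the
# text that `nodetool cfstats` prints. Field labels assumed from 1.x.
def parse_cfstats(text):
    sizes = {}  # keyspace -> {column_family: bytes}
    keyspace = cf = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Keyspace:"):
            keyspace = line.split(":", 1)[1].strip()
            sizes[keyspace] = {}
        elif line.startswith("Column Family:"):
            cf = line.split(":", 1)[1].strip()
        elif line.startswith("Space used (live):") and keyspace and cf:
            sizes[keyspace][cf] = int(line.split(":", 1)[1])
    return sizes

def keyspace_totals(sizes):
    return dict((ks, sum(cfs.values())) for ks, cfs in sizes.items())

# Usage (hypothetical):
#   import subprocess
#   text = subprocess.check_output(["nodetool", "-h", "localhost", "cfstats"]).decode()
#   print(keyspace_totals(parse_cfstats(text)))
```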


On Monday, July 16, 2012, Thierry Templier wrote:

> Hello,
>
> I wonder if it's possible to get statistics for a keyspace like its size,
> size of each column family it contains. It's something I'd like from a
> request...
>
> Thanks very much for your help.
> Thierry
>


Re: Getting stats of keyspaces

2012-07-16 Thread Thierry Templier

Thanks very much, Manoj. It's exactly what I was looking for.

Thierry

You can get the statistics using jmx.

See here : http://www.datastax.com/docs/1.1/operations/monitoring

Best regards,
Manoj


Re: Never ending manual repair after adding second DC

2012-07-16 Thread Bart Swedrowski
On 16 July 2012 11:25, aaron morton  wrote:

> In the before time someone had problems with a switch/router that was
> dropping persistent but idle connections. Doubt this applies, and it would
> probably result in an error, just throwing it out there.
>

Yes, I've been through them a few times.  There are literally no errors or warnings
at all.  And sometimes, as aforementioned, there's actually an INFO message that the
merkle tree has been sent while the other side never receives it.

Just now, I kicked off a manual repair on the node with IP 192.168.94.178 and
it just got stuck streaming files again.

Node 192.168.94.179:

Streaming from: /192.168.81.5
>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1127-Data.db
> sections=46 progress=0/5096 - 0%
>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1128-Data.db
> sections=244 progress=0/1548510 - 0%
>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1119-Data.db
> sections=228 progress=0/82859 - 0%


Node 192.168.81.5:

Streaming to: /192.168.94.179
>/var/lib/cassandra/data/Medals/dataa-hd-1129-Data.db sections=2
> progress=168/168 - 100%
>/var/lib/cassandra/data/Medals/dataa-hd-1128-Data.db sections=244
> progress=0/1548510 - 0%
>/var/lib/cassandra/data/Medals/dataa-hd-1127-Data.db sections=46
> progress=0/5096 - 0%
>/var/lib/cassandra/data/Medals/dataa-hd-1119-Data.db sections=228
> progress=0/82859 - 0%


Looks like streaming this specific SSTable hasn't finished (or been ACKed
on the other side)

   /var/lib/cassandra/data/Medals/dataa-hd-1129-Data.db sections=2
> progress=168/168 - 100%


This morning I tightened monitoring, so now we have each node monitoring
every other with ICMP packets (20 every minute) and monitoring is silent; no
issues reported since the morning, not a single packet lost.

I got some help from the Acunu guys; at first we believed we had fixed the problem
by disabling bonding on the servers and blamed it for messing up stuff with
interrupts, however this morning the problem resurfaced.

I can see (and Acunu says) everything is pointing to a network-related
problem (although I'd expect the IP stack to correct simple packet loss) but
there's no way to back this up (unless only Cassandra-related traffic is getting
lost, but *how* would one monitor for that???).

Honestly, running out of ideas - further advice highly appreciated.


Snapshot issue in Cassandra 0.8.1

2012-07-16 Thread Adeel Akbar

Hi,

I have created a snapshot with the following command:

#./nodetool -h localhost snapshot cassandra_01_bkup

but the problem is that the snapshot is created in the snapshots folder with a 
different name (like 1342269988711), and I have no idea, if I use this command 
in a script, how to gzip the snapshot from the script. Please help 
me to resolve this issue.

--


Thanks & Regards

*Adeel Akbar*
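One way to script around the generated directory name (a sketch; the <data_dir>/<keyspace>/snapshots/<name> layout is an assumption based on 0.8-era defaults, so adjust the paths to your installation): take the snapshot, then locate the most recently created snapshot directory and archive that, rather than hard-coding its name.

```python
import os
import tarfile

# After `nodetool snapshot`, find the newest directory under
# <keyspace_dir>/snapshots (whatever its generated name is) and gzip it.
def newest_snapshot(keyspace_dir):
    snap_root = os.path.join(keyspace_dir, "snapshots")
    dirs = [os.path.join(snap_root, d) for d in os.listdir(snap_root)]
    dirs = [d for d in dirs if os.path.isdir(d)]
    return max(dirs, key=os.path.getmtime)  # newest by modification time

def archive_snapshot(keyspace_dir, out_path):
    snap_dir = newest_snapshot(keyspace_dir)
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(snap_dir, arcname=os.path.basename(snap_dir))
    return snap_dir
```

Run immediately after the snapshot command so the newest directory is the one just created.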



Re: Never ending manual repair after adding second DC

2012-07-16 Thread Bill Au
I ran into the same problem before:

http://comments.gmane.org/gmane.comp.db.cassandra.user/25334

I have not found any solutions yet.

Bill

On Mon, Jul 16, 2012 at 11:10 AM, Bart Swedrowski  wrote:

>
>
> On 16 July 2012 11:25, aaron morton  wrote:
>
>> In the before time someone had problems with a switch/router that was
>> dropping persistent but idle connections. Doubt this applies, and it would
>> probably result in an error, just throwing it out there.
>>
>
> Yes, been through them few times.  There's literally no errors or warning
> at all.  And sometimes, as aforementioned, there's actually INFO that
> merkle tree has been sent where the other side is not receiving it.
>
> Just now, I kicked off manual repair on node with IP 192.168.94.178 and
> just got stuck on streaming files again.
>
> Node 192.168.94.179:
>
> Streaming from: /192.168.81.5
>>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1127-Data.db
>> sections=46 progress=0/5096 - 0%
>>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1128-Data.db
>> sections=244 progress=0/1548510 - 0%
>>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1119-Data.db
>> sections=228 progress=0/82859 - 0%
>
>
> Node 192.168.81.5:
>
> Streaming to: /192.168.94.179
>>/var/lib/cassandra/data/Medals/dataa-hd-1129-Data.db sections=2
>> progress=168/168 - 100%
>>/var/lib/cassandra/data/Medals/dataa-hd-1128-Data.db sections=244
>> progress=0/1548510 - 0%
>>/var/lib/cassandra/data/Medals/dataa-hd-1127-Data.db sections=46
>> progress=0/5096 - 0%
>>/var/lib/cassandra/data/Medals/dataa-hd-1119-Data.db sections=228
>> progress=0/82859 - 0%
>
>
> Looks like streaming this specific SSTable hasn't finished (or been ACKed
> on the other side)
>
>/var/lib/cassandra/data/Medals/dataa-hd-1129-Data.db sections=2
>> progress=168/168 - 100%
>
>
> This morning I've tightend monitoring so now we've each node monitoring
> each other with ICMP packets (20 every minute) and monitoring is silent; no
> issues reported since the morning, not a single packet lost.
>
> I got some help from Acunu guys, first we believed we fixed the problem by
> disabling bonding on the servers and blamed it for messing up stuff with
> interrupts however this morning problem resurfaced.
>
> I can see (and Acunu says) everything is pointing to network related
> problem (although I'd expect IP stack to correct simple PL) but there's no
> way to back this up (unless only Cassandra related traffic is getting lost
> but *how* to monitor for it???).
>
> Honestly, running out of ideas - further advice highly appreciated.
>


high i/o usage on one node

2012-07-16 Thread feedly team
I am having an issue where one node of a 2-node cluster seems to be using
much more I/O than the other node. The Cassandra read/write requests seem
to be balanced, but iostat shows the data disk to be maxed at 100%
utilization for one machine and <50% for the other, and r/s to be about 3x
greater on the high-I/O node. I am using a RF of 2 and consistency mode of
ALL for reads and ONE for writes (current requests are very read heavy).
User CPU seems to be fairly low and the same on both machines, but the
high-I/O machine shows an OS load of 34 (!) while the other machine reports
7. I ran nodetool compactionstats and there are no tasks pending, which I
assume means there is no compaction going on, and the logs seem to be ok as
well. The only difference is that on the high-I/O node I am doing full GC
logging, but that's on a separate disk from the data.

Another oddity is that the high-I/O node shows a data size of 86GB while
the other shows 71GB. I understand there could be differences, but with a
RF of 2 I would think they would be roughly equal?

I am using version 1.0.10.


Truncate failing with 1.0 client against 0.7 cluster

2012-07-16 Thread Guy Incognito
I'm doing an upgrade of Cassandra 0.7 to 1.0 at the moment, and as part 
of the preparation I'm upgrading to the 1.0 client libraries (we use Hector 
1.0-5) prior to upgrading the cluster itself.  I'm seeing some of our 
integration tests against the dev 0.7 cluster fail as they get 
UnavailableExceptions when trying to truncate the test column families.  
This is new behaviour with the 1.0 client libraries; it doesn't happen 
with the 0.7 libraries.


It seems to fail immediately, it doesn't e.g. wait for the 10 second 
RPC timeout, it fails straight away.  Anyone have any ideas as to what 
may be happening?  Interestingly I seem to be able to get around it if i 
only tell Hector about one of the nodes (we have 4). If I give it all 
four then it throws the UnavailableException.


Re: Cassandra occupy over 80% CPU when take a compaction

2012-07-16 Thread aaron morton
Are you able to put together a test case, maybe using the stress testing tool, 
that models your data layout?

If so can you add it to https://issues.apache.org/jira/browse/CASSANDRA-3592

Thanks

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/07/2012, at 8:17 PM, 黄荣桢 wrote:

> Hello,
> 
> I find the compaction of my secondary index takes a long time and occupies a 
> lot of CPU.
> 
>  INFO [CompactionExecutor:8] 2012-07-16 12:03:16,408 CompactionTask.java 
> (line 213) Compacted to [XXX].  71,018,346 to 9,020 (~0% of original) bytes 
> for 3 keys at 0.22MB/s.  Time: 397,602ms.
> 
> The stack of this over load Thread is:
> "CompactionReducer:5" - Thread t@1073
>java.lang.Thread.State: RUNNABLE
>   at java.util.AbstractList$Itr.remove(AbstractList.java:360)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.removeDeletedStandard(ColumnFamilyStore.java:851)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:835)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:826)
>   at 
> org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:77)
>   at 
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer$MergeTask.call(ParallelCompactionIterable.java:224)
>   at 
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer$MergeTask.call(ParallelCompactionIterable.java:198)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> 
>Locked ownable synchronizers:
>   - locked <4be5863d> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> 
> I guess this problem is due to the huge number of columns in my index. The column 
> which is indexed only has 3 kinds of values, and one possible value has 
> several million records, so this index has several million columns. 
> Compacting these columns takes a long time. 
> 
> I find a similar issue on the jira:
> https://issues.apache.org/jira/browse/CASSANDRA-3592
> 
> Is there any way to work around this issue?  Is there any way to improve the 
> efficiency of compacting this index?
> 



Re: Enable CQL3 from Astyanax

2012-07-16 Thread aaron morton
Can you provide an example where you add data, run a CQL statement in cqlsh 
that does not work and maybe list the data in the CLI. 

cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/07/2012, at 8:25 PM, Thierry Templier wrote:

> Hello Aaron,
> 
> I am trying to simulate a composition relationship within a single column family / 
> table (for example, an entity and its fields). I dynamically add columns for the 
> contained elements.
> 
> Let's take an example. Here is my table definition with CQL 3:
> 
> CREATE TABLE "Entity" (
>"id" varchar,
>"name" varchar,
>PRIMARY KEY ("id")
> );
> 
> If I want to store an entity with its two fields, I'll have the following 
> columns:
> 
> id: "myentityid"
> name: "myentityname"
> fields.0.id: "myfield1id"
> fields.0.name: "myfield1name"
> fields.1.id: "myfield2id"
> fields.1.name: "myfield2name"
> 
> When accessing Cassandra data through Astyanax, I get all fields on a "load" 
> operation but not from a CQL3 request.
> 
> Thanks very much for your help.
> Thierry
> 
>> Can you provide an example ?
>> 
>> select * should return all the columns from the CF.
>> 
>> Cheers



Re: bootstrapping problem. 1.1.2 version

2012-07-16 Thread aaron morton
>  DC located in different environments one on Win7 other on Linux.
Running a cluster across different operating systems is not supported. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 12:30 AM, Michael Cherkasov wrote:

> I found this error:
> 
> ERROR [Streaming to /192.168.36.25:10] 2012-07-16 16:26:25,206 
> AbstractCassandraDaemon.java (line 134) Exception in thread Thread[Streaming 
> to /192.168.36.25:10,5,main]
> java.lang.IllegalStateException: target reports current file is 
> \home\user\cassandra-data\data\DevKS\TestCase\DevKS-TestCase-hd-12-Data.db 
> but is 
> /home/user/cassandra-data/data/DevKS/TestCase/DevKS-TestCase-hd-12-Data.db
> at 
> org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:174)
> at 
> org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:59)
> at 
> org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:208)
> at 
> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
> at 
> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:636)
> 
> But I have no idea how to fix this. Also, as you can see, there's a problem with 
> the folder separator: the DCs are in different environments, one on Win7, the other 
> on Linux.
> 
> 2012/7/16 aaron morton 
> Check net stats a few times to look for progress, if there is none take a 
> look at the logs on both sides for errors.
> 
> Hope that helps. 
> 
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 14/07/2012, at 10:53 PM, Michael Cherkasov wrote:
> 
>> Hi all,
>> 
>> I have only one node and trying to add new DC with one node too.
>> 
>> So I do all steps according this instruction 
>> http://www.datastax.com/docs/1.0/operations/cluster_management#adding-nodes-to-a-cluster
>> 
>> But looks like nothing happens:
>> 
>> D:\db\apache-cassandra-1.1.2\bin>nodetool netstats
>> Starting NodeTool
>> Mode: JOINING
>> Not sending any streams.
>> Streaming from: /192.168.33.118
>>DevKS: \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-12-Data.db 
>> sections=1 progress=0/369 - 0%
>>DevKS: 
>> \home\user\cassandra-data\data\DevKS\TestCase\DevKS-TestCase-hd-11-Data.db 
>> sections=1 progress=0/7255721 - 0%
>>DevKS: 
>> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-5-Data.db 
>> sections=1 progress=0/113 - 0%
>>DevKS: 
>> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-6-Data.db 
>> sections=1 progress=0/601578 - 0%
>>DevKS: \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-13-Data.db 
>> sections=1 progress=0/5138 - 0%
>>DevKS: 
>> \home\user\cassandra-data\data\DevKS\Parameter\DevKS-Parameter-hd-4-Data.db 
>> sections=1 progress=0/4049601 - 0%
>>DevKS: \home\user\cassandra-data\data\DevKS\Test\DevKS-Test-hd-14-Data.db 
>> sections=1 progress=0/4481977 - 0%
>> Pool NameActive   Pending  Completed
>> Commandsn/a 0  4
>> Responses   n/a 0   3030
>> 
>> 
>> There's a really good connection between the DCs; I'm pretty sure there's no 
>> problem with the connection.
>> So what can be wrong here?
>> 
>> Also there was one more DC before, which was removed by the 'removetoken' 
>> command. For the new DC I reused the same DC name.
> 
> 



Re: Snapshot issue in Cassandra 0.8.1

2012-07-16 Thread aaron morton
> #./nodetool -h localhost snapshot cassandra_01_bkup
tells Cassandra to snapshot the keyspace called cassandra_01_bkup.

To specify a name for the snapshot, use the -t option:

  snapshot [keyspaces...] -t [snapshotName] - Take a snapshot of the specified 
keyspaces using optional name snapshotName
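A minimal backup-script sketch building on that option (keyspace name is the one from this thread; the data path and snapshot directory layout are assumptions for this version, so check your own data dir). The commands are echoed rather than executed, since they need a live node:

```shell
#!/bin/sh
# Sketch: name the snapshot with -t so its directory is predictable, then
# archive it. The /var/lib/cassandra path below is an assumption.
KEYSPACE="cassandra_01_bkup"
SNAP_NAME="bkup_$(date +%Y%m%d)"
SNAP_CMD="nodetool -h localhost snapshot $KEYSPACE -t $SNAP_NAME"
TAR_CMD="tar czf ${SNAP_NAME}.tar.gz -C /var/lib/cassandra/data/$KEYSPACE/snapshots $SNAP_NAME"
echo "$SNAP_CMD"
echo "$TAR_CMD"
```

With a fixed name the script no longer has to guess the timestamp Cassandra would otherwise pick.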

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 4:00 AM, Adeel Akbar wrote:

> Hi,
> 
> I have created snapshot with following command;
> 
> #./nodetool -h localhost snapshot cassandra_01_bkup
> 
> but the problem is that the snapshot is created in the snapshots folder with a 
> different name (like 1342269988711), so if I use this command in a script I have no 
> way to know which directory to gzip. Please help me to resolve this 
> issue.
> -- 
> 
> Thanks & Regards
> 
> Adeel Akbar
> 



Re: Never ending manual repair after adding second DC

2012-07-16 Thread aaron morton
Even if it is a network error it would be good to detect it. 

If you can run a small repair with those log settings I can take a look at 
the logs if you want. Cannot promise anything but another set of eyes may help. 

Ping me off list if you want to send me the logs. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 4:32 AM, Bill Au wrote:

> I ran into the same problem before:
> 
> http://comments.gmane.org/gmane.comp.db.cassandra.user/25334
> 
> I have not found any solutions yet.
> 
> Bill
> 
> On Mon, Jul 16, 2012 at 11:10 AM, Bart Swedrowski  wrote:
> 
> 
> On 16 July 2012 11:25, aaron morton  wrote:
> In the before time someone had problems with a switch/router that was 
> dropping persistent but idle connections. Doubt this applies, and it would 
> probably result in an error, just throwing it out there.
> 
> Yes, been through them a few times.  There are literally no errors or warnings at 
> all.  And sometimes, as mentioned, there's actually an INFO message that the merkle 
> tree has been sent when the other side has not received it.
> 
> Just now I kicked off a manual repair on the node with IP 192.168.94.178 and it just 
> got stuck on streaming files again.
> 
> Node 192.168.94.179:
> 
> Streaming from: /192.168.81.5
>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1127-Data.db sections=46 
> progress=0/5096 - 0%
>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1128-Data.db sections=244 
> progress=0/1548510 - 0%
>Medals: /var/lib/cassandra/data/Medals/dataa-hd-1119-Data.db sections=228 
> progress=0/82859 - 0%
> 
> Node 192.168.81.5:
> 
> Streaming to: /192.168.94.179
>/var/lib/cassandra/data/Medals/dataa-hd-1129-Data.db sections=2 
> progress=168/168 - 100%
>/var/lib/cassandra/data/Medals/dataa-hd-1128-Data.db sections=244 
> progress=0/1548510 - 0%
>/var/lib/cassandra/data/Medals/dataa-hd-1127-Data.db sections=46 
> progress=0/5096 - 0%
>/var/lib/cassandra/data/Medals/dataa-hd-1119-Data.db sections=228 
> progress=0/82859 - 0%
> 
> Looks like streaming this specific SSTable hasn't finished (or been ACKed on 
> the other side)
> 
>/var/lib/cassandra/data/Medals/dataa-hd-1129-Data.db sections=2 
> progress=168/168 - 100%
> 
> This morning I've tightened monitoring, so now each node is monitoring every 
> other node with ICMP packets (20 every minute), and monitoring is silent; no issues 
> reported since the morning, not a single packet lost.
> 
> I got some help from the Acunu guys; first we believed we had fixed the problem by 
> disabling bonding on the servers and blamed it for messing up stuff with 
> interrupts, however this morning the problem resurfaced.
> 
> I can see (and Acunu says) everything is pointing to a network related problem 
> (although I'd expect the IP stack to correct simple packet loss) but there's no way 
> to back this up (unless only Cassandra related traffic is getting lost, but *how* 
> to monitor for it?).
> 
> Honestly, running out of ideas - further advice highly appreciated.
> 



Re: high i/o usage on one node

2012-07-16 Thread aaron morton
Is your client balancing between the two nodes? Heavy writes at CL ONE could 
result in nodes dropping messages and having an unbalanced load.

Are you sure there is nothing else running on the machines? 

Just for fun, have you turned off GC logging to see the impact?

Is there swapping going on?
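To make the swap question concrete, here is a quick Linux-only sketch that reads straight from /proc; run it on both nodes and compare (thresholds and interpretation are up to you):

```shell
#!/bin/sh
# Report swap in use and the 1-minute load average on this box.
# Steadily growing swap_used_kb on the busy node would suggest the JVM heap
# is being paged out, which shows up as extra disk reads in iostat.
swap_total=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
swap_free=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
swap_used=$((swap_total - swap_free))
read load1 _ < /proc/loadavg
echo "swap_used_kb=$swap_used load1=$load1"
```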

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 5:12 AM, feedly team wrote:

> I am having an issue where one node of a 2 node cluster seems to be using 
> much more I/O than the other node. the cassandra read/write requests seem to 
> be balanced, but iostat shows the data disk to be maxed at 100% utilization 
> for one machine and <50% for the other. r/s is about 3x greater on the 
> high i/o node. I am using a RF of 2 and consistency mode of ALL for reads and 
> ONE for writes (current requests are very read heavy). user CPU seems to be 
> fairly low and the same on both machines, but the high i/o machine shows an 
> os load of 34 (!) while the other machine reports 7. I ran a nodetool 
> compactionstats and there are no tasks pending which i assume means there is 
> no compaction going on, and the logs seem to be ok as well. the only 
> difference is that on the high i/o node, i am doing full gc logging, but 
> that's on a separate disk than the data.
> 
> Another oddity is that the high i/o node shows a data size of 86GB while the 
> other shows 71GB. I understand there could be differences, but with a RF of 2 
> I would think they would be roughly equal?
> 
> I am using version 1.0.10.
> 



Re: Truncate failing with 1.0 client against 0.7 cluster

2012-07-16 Thread aaron morton
UnavailableException is a server side error, what's the full error message?


Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 5:31 AM, Guy Incognito wrote:

> i'm doing an upgrade of Cassandra 0.7 to 1.0 at the moment, and as part of 
> the preparation i'm upgrading to 1.0 client libraries (we use Hector 1.0-5) 
> prior to upgrading the cluster itself.  I'm seeing some of our integration 
> tests against the dev 0.7 cluster fail as they get UnavailableExceptions when 
> trying to truncate the test column families.  This is new behaviour with the 
> 1.0 client libraries, it doesn't happen with the 0.7 libraries.
> 
> It seems to fail immediately, it doesn't e.g. wait for the 10 second RPC 
> timeout, it fails straight away.  Anyone have any ideas as to what may be 
> happening?  Interestingly I seem to be able to get around it if i only tell 
> Hector about one of the nodes (we have 4). If I give it all four then it 
> throws the UnavailableException.



2 nodes throwing exceptions trying to compact after upgrade to 1.1.2 from 1.1.0

2012-07-16 Thread Bryce Godfrey
This may not be directly related to the upgrade to 1.1.2, but I was running on 
1.1.0 for a while with no issues, and I did the upgrade to 1.1.2 a few days ago.

2 of my nodes started throwing lots of promote exceptions, and then a lot of 
the beforeAppend exceptions from then on every few minutes.  This is on the 
high update CF that's using leveled compaction and compression.  The other 3 
nodes are not experiencing this.  I can send entire log files if desired.
These 2 nodes now have much higher load #'s than the other 3, and I'm assuming 
that's because they are failing with the compaction errors?

$
INFO [CompactionExecutor:1783] 2012-07-13 07:35:23,268 CompactionTask.java 
(line 109) Compacting 
[SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-392322-Data$
ERROR [CompactionExecutor:1783] 2012-07-13 07:35:29,696 
AbstractCassandraDaemon.java (line 134) Exception in thread 
Thread[CompactionExecutor:1783,1,main]
java.lang.AssertionError
at 
org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:214)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:158)
at 
org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:531)
at 
org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:254)
at 
org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:978)
at 
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
at 
org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
at 
org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

INFO [CompactionExecutor:3310] 2012-07-16 11:14:02,481 CompactionTask.java 
(line 109) Compacting 
[SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-369173-Data$
ERROR [CompactionExecutor:3310] 2012-07-16 11:14:04,031 
AbstractCassandraDaemon.java (line 134) Exception in thread 
Thread[CompactionExecutor:3310,1,main]
java.lang.RuntimeException: Last written key 
DecoratedKey(150919285004100953907590722809541628889, 
5b30363334353237652d383966382d653031312d623131632d3030313535643031373530325d5b436f6d70757465725b4d5350422d$
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
at 
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
at 
org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
at 
org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)


Re: 2 nodes throwing exceptions trying to compact after upgrade to 1.1.2 from 1.1.0

2012-07-16 Thread Rudolf van der Leeden
See  https://issues.apache.org/jira/browse/CASSANDRA-4411
The bug is related to LCS (leveled compaction) and has been fixed.


On 16.07.2012, at 20:32, Bryce Godfrey wrote:

> This may not be directly related to the upgrade to 1.1.2, but I was running 
> on 1.1.0 for a while with no issues, and I did the upgrade to 1.1.2 a few 
> days ago.
>  
> 2 of my nodes started throwing lots of promote exceptions, and then a lot of 
> the beforeAppend exceptions from then on every few minutes.  This is on the 
> high update CF that’s using leveled compaction and compression.  The other 3 
> nodes are not experiencing this.  I can send entire log files if desired.
> These 2 nodes now have much higher load #'s than the other 3, and I'm 
> assuming that’s because they are failing with the compaction errors?
>  
> $
> INFO [CompactionExecutor:1783] 2012-07-13 07:35:23,268 CompactionTask.java 
> (line 109) Compacting 
> [SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-392322-Data$
> ERROR [CompactionExecutor:1783] 2012-07-13 07:35:29,696 
> AbstractCassandraDaemon.java (line 134) Exception in thread 
> Thread[CompactionExecutor:1783,1,main]
> java.lang.AssertionError
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:214)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:158)
> at 
> org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:531)
> at 
> org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:254)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:978)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
>  
> INFO [CompactionExecutor:3310] 2012-07-16 11:14:02,481 CompactionTask.java 
> (line 109) Compacting 
> [SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-369173-Data$
> ERROR [CompactionExecutor:3310] 2012-07-16 11:14:04,031 
> AbstractCassandraDaemon.java (line 134) Exception in thread 
> Thread[CompactionExecutor:3310,1,main]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(150919285004100953907590722809541628889, 
> 5b30363334353237652d383966382d653031312d623131632d3030313535643031373530325d5b436f6d70757465725b4d5350422d$
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)



RE: 2 nodes throwing exceptions trying to compact after upgrade to 1.1.2 from 1.1.0

2012-07-16 Thread Bryce Godfrey
Thanks, is there a way around this for now or should I fall back to 1.1.0?


From: Rudolf van der Leeden [mailto:rudolf.vanderlee...@scoreloop.com]
Sent: Monday, July 16, 2012 12:55 PM
To: user@cassandra.apache.org
Cc: Rudolf van der Leeden
Subject: Re: 2 nodes throwing exceptions trying to compact after upgrade to 
1.1.2 from 1.1.0

See  
https://issues.apache.org/jira/browse/CASSANDRA-4411
The bug is related to LCS (leveled compaction) and has been fixed.


On 16.07.2012, at 20:32, Bryce Godfrey wrote:


This may not be directly related to the upgrade to 1.1.2, but I was running on 
1.1.0 for a while with no issues, and I did the upgrade to 1.1.2 a few days ago.

2 of my nodes started throwing lots of promote exceptions, and then a lot of 
the beforeAppend exceptions from then on every few minutes.  This is on the 
high update CF that's using leveled compaction and compression.  The other 3 
nodes are not experiencing this.  I can send entire log files if desired.
These 2 nodes now have much higher load #'s than the other 3, and I'm assuming 
that's because they are failing with the compaction errors?

$
INFO [CompactionExecutor:1783] 2012-07-13 07:35:23,268 CompactionTask.java 
(line 109) Compacting 
[SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-392322-Data$
ERROR [CompactionExecutor:1783] 2012-07-13 07:35:29,696 
AbstractCassandraDaemon.java (line 134) Exception in thread 
Thread[CompactionExecutor:1783,1,main]
java.lang.AssertionError
at 
org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:214)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:158)
at 
org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:531)
at 
org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:254)
at 
org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:978)
at 
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
at 
org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
at 
org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

INFO [CompactionExecutor:3310] 2012-07-16 11:14:02,481 CompactionTask.java 
(line 109) Compacting 
[SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-369173-Data$
ERROR [CompactionExecutor:3310] 2012-07-16 11:14:04,031 
AbstractCassandraDaemon.java (line 134) Exception in thread 
Thread[CompactionExecutor:3310,1,main]
java.lang.RuntimeException: Last written key 
DecoratedKey(150919285004100953907590722809541628889, 
5b30363334353237652d383966382d653031312d623131632d3030313535643031373530325d5b436f6d70757465725b4d5350422d$
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
at 
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
at 
org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
at 
org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)



Re: Truncate failing with 1.0 client against 0.7 cluster

2012-07-16 Thread Guy Incognito
sorry i don't have the exact text right now but it's along the lines of 
'not enough replicas available to handle the requested consistency 
level'.  i'm requesting quorum but i've tried with ONE and ANY and it 
made no difference.


On 16/07/2012 19:30, aaron morton wrote:
UnavailableException is a server side error, what's the full error 
message?



Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/07/2012, at 5:31 AM, Guy Incognito wrote:

i'm doing an upgrade of Cassandra 0.7 to 1.0 at the moment, and as 
part of the preparation i'm upgrading to 1.0 client libraries (we use 
Hector 1.0-5) prior to upgrading the cluster itself.  I'm seeing some 
of our integration tests against the dev 0.7 cluster fail as they get 
UnavailableExceptions when trying to truncate the test column 
families.  This is new behaviour with the 1.0 client libraries, it 
doesn't happen with the 0.7 libraries.


It seems to fail immediately, it doesn't e.g. wait for the 10 second 
RPC timeout, it fails straight away.  Anyone have any ideas as to 
what may be happening?  Interestingly I seem to be able to get around 
it if i only tell Hector about one of the nodes (we have 4). If I 
give it all four then it throws the UnavailableException.







Cassandra Evaluation/ Benchmarking: Throughput not scaling as expected neither latency showing good numbers

2012-07-16 Thread Code Box
I am doing Cassandra benchmarking using YCSB to evaluate the best
performance for my application, which will be both read and write intensive.
I have set up a three-node cluster on EC2, and I am running YCSB as a client
in the same availability region. I have tried various combinations of
Cassandra tuning parameters: fsync (set to batch and periodic),
increasing the number of rpc_threads, increasing the number of concurrent
reads and concurrent writes, and write consistency ONE and QUORUM. I am not
getting very good results, and I also do not see linear scalability: if I
increase the number of clients, I do not see an increase in throughput.

Here are some sample numbers that i got :-

*Test 1:-  Write Consistency = Quorum, Write Proportion = 100%, FSync =
Batch, Window = 0 ms*

Threads | Throughput (writes/sec) | Avg Latency (ms) | TP95 (ms) | TP99 (ms) | Min (ms) | Max (ms)

10   21493.198451.499291
100  407023.828702.2260
200  415145.96571301.71242
300  419764.681154222.09216


If you look at the numbers, increasing the number of threads does not
increase the throughput. Also the latency values are not that great. I am
using fsync set to batch with a 0 ms window.

*Test 2:-  Write Consistency = Quorum, Write Proportion = 100%. FSync
= Periodic, Window = 1000 ms*

18031.237121.012312.9Q
100159445.3439251.21579.1Q
200196309.04719701.171851Q
Are these numbers expected, or does Cassandra perform better? Am I
missing something?
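One sanity check worth applying to numbers like these: with a closed-loop benchmark such as YCSB, Little's law bounds throughput by threads / latency, so if average latency grows in step with the thread count, throughput flattens exactly as observed. A small sketch with made-up figures:

```shell
#!/bin/sh
# Little's law for a closed-loop client: concurrency = throughput * latency.
# With 100 threads each waiting ~25 ms per request (illustrative numbers),
# the client can push at most 100 / 0.025 = 4000 ops/s; adding threads only
# raises throughput if latency stays flat.
threads=100
avg_latency_ms=25
max_throughput=$(( threads * 1000 / avg_latency_ms ))
echo "max_throughput_ops_per_sec=$max_throughput"
```

If measured throughput is already close to this bound at every thread count, server-side latency, not the client, is the limiter.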


Re: 2 nodes throwing exceptions trying to compact after upgrade to 1.1.2 from 1.1.0

2012-07-16 Thread Rudolf van der Leeden
Stay with 1.1.2 and create your CF with  
compaction_strategy_class='SizeTieredCompactionStrategy' 
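For reference, a sketch of what that change might look like from cassandra-cli (the column family name is illustrative, and the exact option syntax should be checked against your version's help):

```shell
#!/bin/sh
# Echoed rather than piped into cassandra-cli, since it needs a live cluster
# and the right keyspace selected first ('use <keyspace>;').
CF="Properties"
STMT="update column family $CF with compaction_strategy='SizeTieredCompactionStrategy';"
echo "$STMT"
```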


On 16.07.2012, at 22:17, Bryce Godfrey wrote:

> Thanks, is there a way around this for now or should I fall back to 1.1.0?
>  
>  
> From: Rudolf van der Leeden [mailto:rudolf.vanderlee...@scoreloop.com] 
> Sent: Monday, July 16, 2012 12:55 PM
> To: user@cassandra.apache.org
> Cc: Rudolf van der Leeden
> Subject: Re: 2 nodes throwing exceptions trying to compact after upgrade to 
> 1.1.2 from 1.1.0
>  
> See  https://issues.apache.org/jira/browse/CASSANDRA-4411
> The bug is related to LCS (leveled compaction) and has been fixed.
>  
>  
> On 16.07.2012, at 20:32, Bryce Godfrey wrote:
> 
> 
> This may not be directly related to the upgrade to 1.1.2, but I was running 
> on 1.1.0 for a while with no issues, and I did the upgrade to 1.1.2 a few 
> days ago.
>  
> 2 of my nodes started throwing lots of promote exceptions, and then a lot of 
> the beforeAppend exceptions from then on every few minutes.  This is on the 
> high update CF that’s using leveled compaction and compression.  The other 3 
> nodes are not experiencing this.  I can send entire log files if desired.
> These 2 nodes now have much higher load #'s than the other 3, and I'm 
> assuming that’s because they are failing with the compaction errors?
>  
> $
> INFO [CompactionExecutor:1783] 2012-07-13 07:35:23,268 CompactionTask.java 
> (line 109) Compacting 
> [SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-392322-Data$
> ERROR [CompactionExecutor:1783] 2012-07-13 07:35:29,696 
> AbstractCassandraDaemon.java (line 134) Exception in thread 
> Thread[CompactionExecutor:1783,1,main]
> java.lang.AssertionError
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:214)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:158)
> at 
> org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:531)
> at 
> org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:254)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:978)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
>  
> INFO [CompactionExecutor:3310] 2012-07-16 11:14:02,481 CompactionTask.java 
> (line 109) Compacting 
> [SSTableReader(path='/opt/cassandra/data/MonitoringData/Properties/MonitoringData-Properties-hd-369173-Data$
> ERROR [CompactionExecutor:3310] 2012-07-16 11:14:04,031 
> AbstractCassandraDaemon.java (line 134) Exception in thread 
> Thread[CompactionExecutor:3310,1,main]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(150919285004100953907590722809541628889, 
> 5b30363334353237652d383966382d653031312d623131632d3030313535643031373530325d5b436f6d70757465725b4d5350422d$
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
>