Re: Lots of deletions results in death by GC

2014-02-06 Thread Ben Hood
On Wed, Feb 5, 2014 at 2:52 AM, srmore  wrote:
> Dropped messages are the sign that Cassandra is taking heavy load; that's
> the load shedding mechanism. I would love to see some sort of back-pressure
> implemented.

+1 for back pressure in general with Cassandra


Re: No deletes - is periodic repair needed? I think not...

2014-02-06 Thread Alain RODRIGUEZ
Hi,

In a distributed system such as Cassandra, things can happen (a node going
down, a stop-the-world GC, a hardware issue, ...) that desynchronize
replicas. Isn't repair also a needed operation to keep replicas up to date,
at least once a week or once a month? It is a strong and reliable process
to keep things in sync, isn't it?

I know that read repairs and hinted handoff are also there to handle these
kinds of issues, but they might fail (I saw a lot of errors in the logs
around hints not being delivered - some people even disable them - and read
repairs are often configured to trigger on only 10% of the reads).


2014-01-28 14:53 GMT+01:00 Sylvain Lebresne :

>
>> I have actually set up one of our application streams such that the same
>> key is only overwritten with a monotonically increasing ttl.
>>
>> For example, a breaking news item might have an initial ttl of 60
>> seconds, followed in 45 seconds by an update with a ttl of 3000 seconds,
>> followed by an 'ignore me' update in 600 seconds with a ttl of 30 days (our
>> maximum ttl) when the article is published.
>>
>> My understanding is that this case fits the criteria and no 'periodic
>> repair' is needed.
>>
>
> That's correct. The real criterion for not needing repair if you do no
> deletes but only TTLs is "update only with monotonically increasing (not
> necessarily strictly) TTL". Always setting the same TTL is just a special
> case of that, but it's the most commonly used one I think, so I tend to
> simplify it to that case.
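A minimal CQL sketch of that pattern, assuming a hypothetical news table
(the schema is illustrative; the TTLs are the ones from the example above,
with 30 days = 2592000 seconds):

    CREATE TABLE news (id text PRIMARY KEY, body text);

    -- initial breaking item, short TTL
    INSERT INTO news (id, body) VALUES ('item1', 'breaking') USING TTL 60;
    -- ~45 seconds later: overwrite with a larger TTL
    INSERT INTO news (id, body) VALUES ('item1', 'updated') USING TTL 3000;
    -- on publication: overwrite with the 30-day maximum TTL
    INSERT INTO news (id, body) VALUES ('item1', 'done') USING TTL 2592000;

A replica that misses one of these overwrites is left holding a column that
expires no later than the one it missed, so nothing long-lived can be
resurrected.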
>
>
>>
>> I guess another thing I would point out, one that is easy to miss or
>> forget (if you are a newish user like me), is that TTLs are fine-grained,
>> by column. So we are talking 'fixed' or 'variable' by individual column,
>> not by table. Which means, in my case, that TTLs can vary widely across a
>> table, but as long as I constrain them by key value to be fixed or
>> monotonically increasing, it fits the criteria.
>>
>
> We're talking monotonically increasing TTL "for a given primary key" if
> we're talking the CQL language, and "for a given column" if we're talking
> the thrift one. Not "by table".
>
> --
> Sylvain
>
>
>
>>
>> Cheers,
>>
>> Michael
>>
>>
>> On Tue, Jan 28, 2014 at 4:18 AM, Sylvain Lebresne 
>> wrote:
>>
>>> On Tue, Jan 28, 2014 at 1:05 AM, Edward Capriolo 
>>> wrote:
>>>
 If you have only ttl columns, and you never update the column I would
 not think you need a repair.

>>>
>>> Right, no deletes and no updates is case 1. of Michael's, on which I
>>> think we all agree 'periodic repair to avoid resurrected columns' is not
>>> required.
>>>
>>>

 Repair cures lost deletes. If all your writes have a ttl, a lost write
 should not matter since the column was never written to the node and thus
 could never be resurrected on said node.

>>>
>>>  I'm sure we're all in agreement here, but for the record, this is only
>>> true if you have no updates (overwrites) and/or if all writes have the
>>> *same* ttl. Because in the general case, a column with a relatively short
>>> TTL is basically very close to a delete, while a column with a long TTL is
>>> very close to one that has no TTL. If the former column (with short TTL)
>>> overwrites the latter one (with long TTL), and if one node misses the
>>> overwrite, that node could resurrect the column with the longer TTL (until
>>> that column expires, that is). Hence the separation of case 2. (fixed
>>> ttl, no repair needed) and 2.a. (variable ttl, repair may be needed).
>>>
>>> --
>>> Sylvain
>>>
>>>

 Unless i am missing something.

 On Monday, January 27, 2014, Laing, Michael 
 wrote:
 > Thanks Sylvain,
 > Your assumption is correct!
 > So I think I actually have 4 classes:
 > 1. Regular values, no deletes, no overwrites, write heavy,
 variable ttl's to manage size
 > 2. Regular values, no deletes, some overwrites, read heavy (10 to
 1), fixed ttl's to manage size
 > 2.a. Regular values, no deletes, some overwrites, read heavy (10 to
 1), variable ttl's to manage size
 > 3. Counter values, no deletes, update heavy, rotation/truncation
 to manage size
 > Only 2.a. above requires me to do 'periodic repair'.
 > What I will actually do is change my schema and applications slightly
 to eliminate the need for overwrites on the only table I have in that
 category.
 > And I will set gc_grace_seconds to 0 for the tables in the updated
 schema and drop 'periodic repair' from the schedule.
 > Cheers,
 > Michael
 >
 >
 > On Mon, Jan 27, 2014 at 4:22 AM, Sylvain Lebresne <
 sylv...@datastax.com> wrote:
 >>
 >> By periodic repair, I'll assume you mean "having to run repair every
 gc_grace period to make sure no deleted entries resurrect". With that
 assumption:
 >>
 >>>
 >>> 1. Regular values, no deletes, no overwrites, write heavy, ttl's to
 manage size
 >>
 >> Since 'repair within gc_grace'

Re: Adding datacenter for move to vnodes

2014-02-06 Thread Alain RODRIGUEZ
Hi, we did this exact same operation here too, with no issue.

Contrary to Paulo we did not modify our snitch.

We simply added a "dc_suffix" property in the cassandra-rackdc.properties
conf file for the nodes in the new cluster:

# Add a suffix to a datacenter name. Used by the Ec2Snitch and
Ec2MultiRegionSnitch

# to append a string to the EC2 region name.

dc_suffix=-xl

So our new cluster DC is basically: eu-west-xl

I think this is less risky, at least it is easier to do.

Hope this helps.


2014-02-02 11:42 GMT+01:00 Paulo Ricardo Motta Gomes <
paulo.mo...@chaordicsystems.com>:

> We had a similar situation and what we did was first migrate the 1.1
> cluster to GossipingPropertyFileSnitch, making sure that for each node we
> specified the correct availability zone as the rack in
> the cassandra-rackdc.properties. In this way,
> the GossipingPropertyFileSnitch is equivalent to the EC2MultiRegionSnitch,
> so the data location does not change and no repair is needed afterwards.
> So, if your nodes are located in the us-east-1e AZ, your 
> cassandra-rackdc.properties
> should look like:
>
> dc=us-east
> rack=1e
>
> After this step is complete on all nodes, then you can add a new
> datacenter specifying different dc and rack on the
> cassandra-rackdc.properties of the new DC. Make sure you upgrade your
> initial datacenter to 1.2 before adding a new datacenter with vnodes
> enabled (of course).
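For the new datacenter's nodes, the cassandra-rackdc.properties would then
carry a distinct dc name; a sketch with hypothetical names:

    # cassandra-rackdc.properties on the new, vnode-enabled DC
    dc=us-east-new
    rack=1e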
>
> Cheers
>
>
> On Sun, Feb 2, 2014 at 6:37 AM, Katriel Traum  wrote:
>
>> Hello list.
>>
>> I'm upgrading a 1.1 cassandra cluster to 1.2(.13).
>> I've read here and in other places that the best way to migrate to vnodes
>> is to add a new DC, with the same amount of nodes, and run rebuild on each
>> of them.
>> However, I'm faced with the fact that I'm using EC2MultiRegion snitch,
>> which automagically creates the DC and RACK.
>>
>> Any ideas how I can go about adding a new DC with this kind of setup? I
>> need these new machines to be in the same EC2 Region as the current ones,
>> so adding to a new Region is not an option.
>>
>> TIA,
>> Katriel
>>
>
>
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br *
> +55 48 3232.3200
> +55 83 9690-1314
>


Sporadic gossip exception on add node

2014-02-06 Thread Desimpel, Ignace
Environment: linux, cassandra 2.0.4, 3 node, embedded, byte ordered, LCS

When I add a node to the existing 3 node cluster, I sometimes get the
exception 'Unable to gossip with any seeds' listed below. If I just restart
it without any change, it mostly works. It must be some timing issue.

The Cassandra node at that time is configured using the cassandra.yaml file
with auto_bootstrap set to true
and initial_token set to something like: 00f35256, 041e692a, 0562d8b2, 
0930274a, 0b16ce96, 0c5b3e1e, 10cac47a, 12b16bc6, 13f5db4e, 186561aa, 1907996e, 
1c32b042, 1e19578e ..

The two seeds configured in this yaml are 10.164.8.250 and 10.164.8.249 and 
these are up and running.
The new node to add has ip 10.164.8.93
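The relevant excerpt of that cassandra.yaml would look roughly like this
(assuming the standard SimpleSeedProvider layout; the values are the ones
reported above, with the token list abbreviated):

    auto_bootstrap: true
    initial_token: 00f35256,041e692a,0562d8b2,...
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.164.8.250,10.164.8.249"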

At the time of the exception, I do not get the gossip message 'Handshaking 
version with /10.164.8.93' on the seeds.
If the exception does not occur, then I do get that gossip message 
'Handshaking version with /10.164.8.93' on the seeds.

2014-01-31 13:40:36.380 Loading persisted ring state
2014-01-31 13:40:36.386 Starting Messaging Service on port 9804
2014-01-31 13:40:36.408 Handshaking version with /10.164.8.250
2014-01-31 13:40:36.408 Handshaking version with /10.164.8.249
2014-01-31 13:41:07.415 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at 
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:426)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:618)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:485) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.startUpCassandra(CassandraStore.java:469)
 [landc-services-search-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT-87937]
at 
be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.run(CassandraStore.java:460)
 [landc-services-search-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT-87937]
java.lang.RuntimeException: Unable to gossip with any seeds
at 
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160)
at 
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:426)
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:618)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:485)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
at 
be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.startUpCassandra(CassandraStore.java:469)
at 
be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.run(CassandraStore.java:460)
Exception encountered during startup: Unable to gossip with any seeds
2014-01-31 13:41:07.419 Exception in thread 
Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException: null
at 
org.apache.cassandra.service.StorageService.stopNativeTransport(StorageService.java:349)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:364)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService.access$3(StorageService.java:361) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:551)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_40]
2014-01-31 1

Re: GC taking a long time

2014-02-06 Thread Alain RODRIGUEZ
Hi Robert,

The heap and GC are a bit tricky to tune.

I recently read a post about the heap, explaining how it works and how to
tune it:
http://tech.shift.com/post/74311817513/cassandra-tuning-the-jvm-for-read-heavy-workloads

There are plenty of blogs and articles of this kind on the web.

Be careful: tuning depends highly on your workload and hardware. You
shouldn't use configurations found in these posts as-is. You need to test
incrementally and monitor how the node behaves.
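For illustration only, the usual knobs live in cassandra-env.sh; a sketch
with placeholder values (starting points to experiment from, not
recommendations):

    # cassandra-env.sh excerpt -- placeholder values, tune incrementally
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"
    JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"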

If you are not able to find a solution, there are also a lot of
professionals and consultants (like DataStax or Aaron Morton) whose job is
to help with Cassandra integrations, including heap and GC tuning.

Hope this will help somehow.


2014-01-29 16:51 GMT+01:00 Robert Wille :

> Forget about what I said about there not being any load during the night.
> I forgot about my unit tests. They would have been running at this time and
> they run against this cluster.
>
> I also forgot to provide JVM information:
>
> java version "1.7.0_17"
> Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
>
> Thanks
>
> Robert
>
> From: Robert Wille 
> Reply-To: 
> Date: Wednesday, January 29, 2014 at 4:06 AM
> To: "user@cassandra.apache.org" 
> Subject: GC taking a long time
>
> I read through the recent thread "Cassandra mad GC", which seemed very
> similar to my situation, but didn't really help.
>
> Here is what I get from my logs when I grep for GCInspector. Note that
> this is the middle of the night on a dev server, so there should have been
> almost no load.
>
>  INFO [ScheduledTasks:1] 2014-01-29 02:41:16,579 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 341 ms for 1 collections, 8001582816 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:41:29,135 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 350 ms for 1 collections, 802776 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:41:41,646 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 364 ms for 1 collections, 8075851136 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:41:54,223 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 375 ms for 1 collections, 8124762400 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:42:24,258 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 22995 ms for 2 collections, 7385470288
> used; max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:45:21,328 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 218 ms for 1 collections, 7582480104 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:45:33,418 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 222 ms for 1 collections, 7584743872 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:45:45,527 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 217 ms for 1 collections, 7588514264 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:45:57,594 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 223 ms for 1 collections, 7590223632 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:46:09,686 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 226 ms for 1 collections, 7592826720 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:46:21,867 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 229 ms for 1 collections, 7595464520 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:46:33,869 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 227 ms for 1 collections, 7597109672 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:46:45,962 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 230 ms for 1 collections, 7599909296 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:46:57,964 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 230 ms for 1 collections, 7601584048 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:47:10,018 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 229 ms for 1 collections, 7604217952 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:47:22,136 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 236 ms for 1 collections, 7605867784 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:47:34,277 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 239 ms for 1 collections, 7607521456 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:47:46,292 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 235 ms for 1 collections, 7610667376 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:47:58,537 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 261 ms for 1 collections, 7650345088 used;
> max is 8126464000
>  INFO [ScheduledTasks:1] 2014-01-29 02:48:10,783 GCInspector.java (line
> 116) GC for ConcurrentMarkSweep: 2

Re: GC taking a long time

2014-02-06 Thread Laing, Michael
for the restart issue see CASSANDRA-6008 and CASSANDRA-6086


On Thu, Feb 6, 2014 at 12:19 PM, Alain RODRIGUEZ  wrote:

> Hi Robert,
>
> The heap and GC are a bit tricky to tune.
>
> I recently read a post about the heap, explaining how it works and how to
> tune it:
> http://tech.shift.com/post/74311817513/cassandra-tuning-the-jvm-for-read-heavy-workloads
>
> There are plenty of blogs and articles of this kind on the web.
>
> Be careful: tuning depends highly on your workload and hardware. You
> shouldn't use configurations found in these posts as-is. You need to test
> incrementally and monitor how the node behaves.
>
> If you are not able to find a solution, there are also a lot of
> professionals and consultants (like DataStax or Aaron Morton) whose job is
> to help with Cassandra integrations, including heap and GC tuning.
>
> Hope this will help somehow.
>
>
> 2014-01-29 16:51 GMT+01:00 Robert Wille :
>
>  Forget about what I said about there not being any load during the
>> night. I forgot about my unit tests. They would have been running at this
>> time and they run against this cluster.
>>
>> I also forgot to provide JVM information:
>>
>> java version "1.7.0_17"
>> Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
>> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
>>
>> Thanks
>>
>> Robert
>>
>> From: Robert Wille 
>> Reply-To: 
>> Date: Wednesday, January 29, 2014 at 4:06 AM
>> To: "user@cassandra.apache.org" 
>> Subject: GC taking a long time
>>
>> I read through the recent thread "Cassandra mad GC", which seemed very
>> similar to my situation, but didn’t really help.
>>
>> Here is what I get from my logs when I grep for GCInspector. Note that
>> this is the middle of the night on a dev server, so there should have been
>> almost no load.
>>
>>  INFO [ScheduledTasks:1] 2014-01-29 02:41:16,579 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 341 ms for 1 collections, 8001582816 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:41:29,135 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 350 ms for 1 collections, 802776 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:41:41,646 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 364 ms for 1 collections, 8075851136 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:41:54,223 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 375 ms for 1 collections, 8124762400 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:42:24,258 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 22995 ms for 2 collections, 7385470288
>> used; max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:45:21,328 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 218 ms for 1 collections, 7582480104 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:45:33,418 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 222 ms for 1 collections, 7584743872 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:45:45,527 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 217 ms for 1 collections, 7588514264 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:45:57,594 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 223 ms for 1 collections, 7590223632 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:46:09,686 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 226 ms for 1 collections, 7592826720 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:46:21,867 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 229 ms for 1 collections, 7595464520 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:46:33,869 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 227 ms for 1 collections, 7597109672 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:46:45,962 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 230 ms for 1 collections, 7599909296 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:46:57,964 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 230 ms for 1 collections, 7601584048 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:47:10,018 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 229 ms for 1 collections, 7604217952 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:47:22,136 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 236 ms for 1 collections, 7605867784 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:47:34,277 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 239 ms for 1 collections, 7607521456 used;
>> max is 8126464000
>>  INFO [ScheduledTasks:1] 2014-01-29 02:47:46,292 GCInspector.java (line
>> 116) GC for ConcurrentMarkSweep: 235 ms for 1 collections, 7610667376 used;
>> max is

FW: exception during add node due to test beforeAppend on SSTableWriter

2014-02-06 Thread Desimpel, Ignace
Also, these nodes and data were entirely created by 2.0.4 code, so this 
should not really be a 1.1.x-related bug.
Also, I restarted the whole test, thus a completely new database, and I get 
similar problems.

From: Desimpel, Ignace
Sent: vrijdag 31 januari 2014 18:02
To: user@cassandra.apache.org
Subject: exception during add node due to test beforeAppend on SSTableWriter

The join with auto bootstrap itself had finished, so I restarted the added 
node. During the restart I saw a message indicating that something is wrong 
with this row and sstable.
Of course, in my case I did not drop sstables from another node. But I did 
decommission and re-add the node, so that is still a kind of 
'data-from-another-node'.

At level 2, 
SSTableReader(path='../../../../data/cdi.cassandra.cdi/dbdatafile/Ks100K/ForwardStringFunction/Ks100K-ForwardStringFunction-jb-67-Data.db')
 [DecoratedKey(065864ce01024e4e505300, 065864ce01024e4e505300), 
DecoratedKey(14c9d35e0102646973706f736974696f6e7300, 
14c9d35e0102646973706f736974696f6e7300)] overlaps 
SSTableReader(path='../../../../data/cdi.cassandra.cdi/dbdatafile/Ks100K/ForwardStringFunction/Ks100K-ForwardStringFunction-jb-64-Data.db')
 [DecoratedKey(068c2e4101024d6f64616c207665726200, 
068c2e4101024d6f64616c207665726200), 
DecoratedKey(06c566b4010244657465726d696e657200, 
06c566b4010244657465726d696e657200)].  This could be caused by a bug in 
Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from 
another node into the data directory. Sending back to L0.  If you didn't drop 
in sstables, and have not yet run scrub, you should do so since you may also 
have rows out-of-order within an sstable
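For reference, the per-table scrub that message suggests can be run with
nodetool, e.g. for the table named above:

    nodetool scrub Ks100K ForwardStringFunction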



From: Desimpel, Ignace
Sent: vrijdag 31 januari 2014 17:43
To: user@cassandra.apache.org
Subject: exception during add node due to test beforeAppend on SSTableWriter

4 node, byte ordered, LCS, 3 Compaction Executors, replication factor 1
Code is the 2.0.4 version but with the patch for 
CASSANDRA-6638. However, 
no cleanup is run, so the patch should not play a role.

The 4 node cluster is started and inserts/queries are done up to only about 
10 GB of data on each node.
Then decommission one node, and delete local files.
Then add node again.
Exception : see below.

Any idea?

Regards,
Ignace Desimpel


  *   2014-01-31 17:12:02.600 ==>> Bootstrap is streaming data from other 
nodes... Please wait ...
  *   2014-01-31 17:12:02.600 ==>> Bootstrap stream state : rx= 29.00 tx= 
100.00 Please wait ...
  *   2014-01-31 17:12:18.908 Enqueuing flush of 
Memtable-compactions_in_progress@350895652(0/0 serialized/live bytes, 1 ops)
  *   2014-01-31 17:12:18.908 Writing 
Memtable-compactions_in_progress@350895652(0/0 serialized/live bytes, 1 ops)
  *   2014-01-31 17:12:19.009 Completed flushing 
../../../../data/cdi.cassandra.cdi/dbdatafile/system/compactions_in_progress/system-compactions_in_progress-jb-74-Data.db
 (42 bytes) for commitlog position ReplayPosition(segmentId=1391184546183, 
position=561494)
  *   2014-01-31 17:12:19.018 Exception in thread 
Thread[CompactionExecutor:1,1,main]
  *   java.lang.RuntimeException: Last written key 
DecoratedKey(8afc9237010380178575, 8afc9237010380178575) >= 
current key DecoratedKey(6e0bb955010383dfdd1d, 
6e0bb955010383dfdd1d) writing into 
/media/datadrive1/cdi.cassandra.cdi/dbdatafile/Ks100K/ForwardLongFunction/Ks100K-ForwardLongFunction-tmp-jb-159-Data.db
  *   at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT]
  *   at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_40]
  *   at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_40]
  *   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_40]
  *  

Re: Periodic rpc_timeout errors on select query

2014-02-06 Thread Chap Lovejoy

Hi Steve,

It looks like it will be pretty easy for us to do some testing with the 
new client version. I'm going to give it a shot and keep my fingers 
crossed.


Thanks again,
Chap

On 5 Feb 2014, at 18:10, Steven A Robenalt wrote:


Hi Chap,

If you have the ability to test the 2.0.0rc2 driver, I would recommend
doing so, even from a dedicated test client or a JUnit test case. 
There are
other benefits to the change, such as being able to use BatchStatements,
aside from possible impact on your read timeouts.

Steve
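For what it's worth, a minimal sketch of a BatchStatement with the 2.0 Java
driver (contact point, keyspace, table and values are all hypothetical):

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class BatchExample {
        public static void main(String[] args) {
            Cluster cluster =
                    Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("demo");
            PreparedStatement ps = session.prepare(
                    "INSERT INTO users (id, name) VALUES (?, ?)");
            // a logged batch: the grouped statements are applied atomically
            BatchStatement batch = new BatchStatement();
            batch.add(ps.bind(1, "alice"));
            batch.add(ps.bind(2, "bob"));
            session.execute(batch);
            cluster.close();
        }
    }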


Re: Question: ConsistencyLevel.ONE with multiple datacenters

2014-02-06 Thread Chris Burroughs
I think the scenario you outlined is correct.  The DES (dynamic snitch) 
handles multiple DCs poorly, and the LOCAL_ONE hammer is the best bet.


On 01/31/2014 12:40 PM, Paulo Ricardo Motta Gomes wrote:

Hey,

When adding a new data center to our production C* datacenter using the
procedure described in [1], some of our application requests were returning
null/empty values. Rebuild was not complete in the new datacenter, so my
guess is that some requests were being directed to the brand new datacenter
which still didn't have the data.

Our Hector client was connected only to the original nodes, with
autoDiscoverHosts=false and we use ConsistencyLevel.ONE for reads. The
keyspace schema was already configured to use both data centers.

My question is: is it possible that the dynamic snitch is choosing the
nodes in the new (empty) datacenter when CL=ONE? In this case, it's
mandatory to use CL=LOCAL_ONE during bootstrap/rebuild of a new datacenter,
otherwise empty data might be returned, correct?

Cheers,

[1]
http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/operations/ops_add_dc_to_cluster_t.html





Re: Question about local reads with multiple data centers

2014-02-06 Thread Chris Burroughs

On 01/29/2014 08:07 PM, Donald Smith wrote:

My question: will the read process try to read first locally from the 
datacenter DC2 I specified in its connection string? I presume so.  (I 
doubt that it uses the client's IP address to decide which datacenter is 
closer. And I am unaware of another way to tell it to read locally.)



From the rest of this thread it looks like you were asking about how 
the client selects a Cassandra node to act as a coordinator.  Note 
however that if you are using a DC-oblivious CL (ONE, QUORUM) then that 
Cassandra coordinator may send requests to the remote data center.




Also, will read repair happen between datacenters automatically 
("read_repair_chance=0.10")?  Or does that only happen within a single data 
center?


Yes, read_repair_chance is global.  There is a separate 
dclocal_read_repair_chance setting if you want to make DC-local read 
repairs more common.
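Both knobs are per-table CQL properties; a hypothetical example (table name
and values are illustrative):

    ALTER TABLE ks.t
      WITH read_repair_chance = 0.1
      AND dclocal_read_repair_chance = 0.5;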


exceptions all around in clean cluster

2014-02-06 Thread Ondřej Černoš
Hi,

I am running a small 2 DC cluster of 3 nodes (each DC). I use 3 replicas in
both DCs (all 6 nodes have everything) on Cassandra 1.2.11. I populated the
cluster via cqlsh pipelined with a series of inserts. I use the cluster for
tests, the dataset is pretty small (hundreds of thousands of records max).

The cluster was completely up during inserts. Inserts were done serially on
one of the nodes.

The resulting load is uneven:

Datacenter: xxx
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load     Tokens  Owns (effective)  Host ID                               Rack
UN  ip       1.63 GB  256     100.0%            83ecd32a-3f2b-4cf6-b3c7-b316cb1986cc  default-rack
UN  ip       1.5 GB   256     100.0%            091ca530-2e95-4954-92c4-76f51fab0b66  default-rack
UN  ip       1.44 GB  256     100.0%            d94d335e-08bf-4a30-ad58-4c5acdc2ef45  default-rack
Datacenter: yyy
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load     Tokens  Owns (effective)  Host ID                               Rack
UN  ip       2.27 GB  256     100.0%            e2584981-71f7-45b0-82f4-e08942c47585  1c
UN  ip       2.27 GB  256     100.0%            e5c6de9a-819e-4757-a420-55ec3ffaf131  1c
UN  ip       2.27 GB  256     100.0%            fa53f391-2dd3-4ec8-885d-8db6d453a708  1c


And 4 out of 6 nodes report corrupted sstables:


java.lang.RuntimeException:
org.apache.cassandra.io.sstable.CorruptSSTableException:
java.io.IOException: mmap segment underflow; remaining is 239882945
but 1349280116 requested
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1618)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException:
java.io.IOException: mmap segment underflow; remaining is 239882945
but 1349280116 requested
at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:119)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:68)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:44)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:104)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:272)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
at org.apache.cassandra.db.Table.getRow(Table.java:347)
at 
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
at 
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1062)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1614)
... 3 more
Caused by: java.io.IOException: mmap segment underflow; remaining is
239882945 but 1349280116 requested
at 
org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:135)
at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
at 
org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
at 
org.apache.cassandra.db.columniterator.IndexedSliceReader$SimpleBlockFetcher.<init>(IndexedSliceReader.java:477)
at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:94)


repair -pr hangs, rebuild from the less corrupted dc hangs.
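For reference, those two operations, with a hypothetical keyspace name and
the source DC as reported above:

    nodetool repair -pr mykeyspace
    nodetool rebuild xxx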

The only interesting exception (besides the java.io.EOFException during
repair) is the following:

org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 52f2665b
at org.apache.cassandra.db.marshal.UTF8Type.getString(UTF8Type.java:54)
at 
org.apache.cassandra.db.index.AbstractSimplePerColumnSecondaryIndex.insert(AbstractSimplePerColumnSecondaryIndex.java:102)
at 
org.apache.cassandra.db.index.SecondaryIndexManager.indexRow(SecondaryIndexManager.java:448)
at org.apache.cassandra.db.Table.indexRow(Table.java:431)   at
org.apache.cassandra

Re: what tool will create noncql columnfamilies in cassandra 3a

2014-02-06 Thread Chris Burroughs

On 02/05/2014 04:57 AM, Sylvain Lebresne wrote:

>How will users adjust the meta data of non cql column families

The rationale for removing cassandra-cli is mainly that maintaining 2 fully
featured command line interfaces is a waste of the project's resources in
the long run. It's just a tool using the thrift interface, however, and
you'll still be able to adjust metadata through the thrift interface as
before. As Patricia mentioned, there are even some existing interactive
options like pycassaShell in the community.


It's also wasteful for the community to maintain multiple post-3.0 forks 
of cassandra-cli so they can continue using Cassandra.  It would be more 
efficient if they could pool their resources in a central place, 
like a code repo at Apache.


Re: First SSTable file is not being compacted

2014-02-06 Thread Chris Burroughs

On 02/06/2014 01:17 AM, Sameer Farooqui wrote:

I'm running C* 2.0.4 and when I have a handful of SSTable files and trigger
a manual compaction with 'nodetool compact' the first SSTable file doesn't
get compacted away.

Is there something special about the first SSTable that it remains even
after a SizeTiered compaction?



No, this is not expected behavior.  Does the number of live SSTables 
reported match what is on disk?  Do you have a procedure that can repeat 
this?




Re: exceptions all around in clean cluster

2014-02-06 Thread Ondřej Černoš
I ran nodetool scrub on nodes in the less corrupted datacenter and tried
nodetool rebuild from this datacenter.

This is the result:

2014-02-06 15:04:24.645+0100 [Thread-83] [ERROR] CassandraDaemon.java(191)
org.apache.cassandra.service.CassandraDaemon: Exception in thread
Thread[Thread-83,5,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.IllegalArgumentException
at
org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:152)
at
org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:187)
at
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:138)
at
org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:243)
at
org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:183)
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:79)
Caused by: java.util.concurrent.ExecutionException:
java.lang.IllegalArgumentException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at
org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:144)
... 5 more
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:267)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
at
org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:132)
at
org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:115)
at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:165)
at
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:45)
at
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:61)
at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
at
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
at
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
at
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
at
org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
at
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
at
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
at
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:291)
at
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
at org.apache.cassandra.db.SliceQueryPager.next(SliceQueryPager.java:57)
at org.apache.cassandra.db.Table.indexRow(Table.java:424)
at
org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
at
org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:803)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-02-06 15:04:24.646+0100 [CompactionExecutor:10] [ERROR]
CassandraDaemon.java(191) org.apache.cassandra.service.CassandraDaemon:
component=c4 Exception in thread Thread[CompactionExecutor:10,1,main]
java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:267)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
at
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
at
org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:132)
at
org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:115)
at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:165)
at
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:45)
at
org.apache.cassandra.db

Re: Question: ConsistencyLevel.ONE with multiple datacenters

2014-02-06 Thread Paulo Ricardo Motta Gomes
Cool. I actually changed the consistency level to LOCAL_ONE and things
worked as expected.

Cheers!


On Thu, Feb 6, 2014 at 11:31 AM, Chris Burroughs
wrote:

> I think the scenario you outlined is correct.  The DES handles multiple
> DCs poorly and the LOCAL_ONE hammer is the best bet.
>
>
> On 01/31/2014 12:40 PM, Paulo Ricardo Motta Gomes wrote:
>
>> Hey,
>>
>> When adding a new data center to our production C* datacenter using the
>> procedure described in [1], some of our application requests were
>> returning
>> null/empty values. Rebuild was not complete in the new datacenter, so my
>> guess is that some requests were being directed to the brand new
>> datacenter
>> which still didn't have the data.
>>
>> Our Hector client was connected only to the original nodes, with
>> autoDiscoverHosts=false and we use ConsistencyLevel.ONE for reads. The
>> keyspace schema was already configured to use both data centers.
>>
>> My question is: is it possible that the dynamic snitch is choosing the
>> nodes in the new (empty) datacenter when CL=ONE? In this case, it's
>> mandatory to use CL=LOCAL_ONE during bootstrap/rebuild of a new
>> datacenter,
>> otherwise empty data might be returned, correct?
>>
>> Cheers,
>>
>> [1]
>> http://www.datastax.com/documentation/cassandra/1.2/
>> webhelp/cassandra/operations/ops_add_dc_to_cluster_t.html
>>
>>
>


-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br *
+55 48 3232.3200
+55 83 9690-1314


Bootstrap failure on C* 1.2.13

2014-02-06 Thread Paulo Ricardo Motta Gomes
Hello,

One of the nodes of our cluster failed and I performed the "replace a dead
node procedure" described in [1]. After about 5 or 6 hours of streaming
during bootstrap, the node fails with the exception "Unable to fetch range
for keyspace foobar from any hosts." [2].
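For context, the replacement procedure boils down to starting the new node
with a flag naming the dead node's address; a sketch, assuming the
replace_address option available in the 1.2.x line:

    # on the replacement node, e.g. appended in cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<dead_node_ip>"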

I haven't found any thread or forum with the same error message yet. Could
this be related to CASSANDRA-6648, or is this a 1.2.14 issue?

* Additional info:

0) Not using vnodes.

1) The replacement node does not show up in "nodetool status" but shows up
in "nodetool gossipinfo".

2) I am in the middle of a rebuild operation to bootstrap a new datacenter,
but I'm not rebuilding any nodes on the dead node range.

3) The replacement node was using exactly the same cassandra.yaml
configuration as the original node (apart from initial_token, obviously),
including the same seeds' hostnames, and after talking to the nodes in the
cluster, it was getting a strange "Unable to contact any seeds!" [3]
message and crashing. I solved this by using actual IP addresses instead of
hostnames for the seeds, but what is weird is that all nodes work well with
seeds hostnames. This is probably an unrelated issue that was already
solved, but just in case it's relevant.

[1]:
http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node

[2]:

 INFO [Thread-8048] 2014-02-06 15:24:05,385 StreamInSession.java (line 199)
Finished streaming session a3d0e841-8f11-11e3-b3d4-438819ab6fdb from /
23.23.48.71
ERROR [main] 2014-02-06 15:24:05,390 CassandraDaemon.java (line 464)
Exception encountered during startup
java.lang.RuntimeException: Unable to fetch range
[(25136549843694323996529816280365324662,30453461826833987488145094511486702966],
(42535295865117307932921825918971026432,46404197776252977962990779214850837877],
(35770373809973650979760322762601081269,41087285793113314471375551001729459574],
(30453461826833987488145094521416702966,31901471898837980949691369441728269824],
(41087285793113314471375551003729159574,42535295865117307932921825128971026432]]
for keyspace foobar from any hosts
at
org.apache.cassandra.dht.RangeStreamer.fetch(RangeStreamer.java:260)
at
org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:84)
at
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
at
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
ERROR [StorageServiceShutdownHook] 2014-02-06 15:24:05,404
CassandraDaemon.java (line 191) Exception in thread
Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
at
org.apache.cassandra.service.StorageService.stopNativeTransport(StorageService.java:358)
at
org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:373)
at
org.apache.cassandra.service.StorageService.access$000(StorageService.java:89)
at
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:551)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.lang.Thread.run(Thread.java:662)


[3]:

ERROR [main] 2014-02-06 01:04:35,715 CassandraDaemon.java (line 464)
Exception encountered during startup
java.lang.IllegalStateException: Unable to contact any seeds!
at
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:977)
 at
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
 at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)


-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br *
+55 48 3232.3200
+55 83 9690-1314


Re: First SSTable file is not being compacted

2014-02-06 Thread Sameer Farooqui
Yeah, it's definitely repeatable. I have a lab environment set up where the
issue is occurring and I've recreated the lab environment 4 - 5 times and
it's occurred each time.

In my demodb.users CF I currently have 2 data SSTables on disk
(demodb-users-jb-1-Data.db and demodb-users-jb-6-Data.db). However, in
OpsCenter the CF: SSTable Count (demodb.users) graph shows only one SSTable.

The nodetool cfstats command also shows "SSTable count: 1" for this CF.
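For anyone reproducing this, the check itself is straightforward, assuming
the default data directory layout:

    nodetool compact demodb users
    ls /var/lib/cassandra/data/demodb/users/*-Data.db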


- SF


On Thu, Feb 6, 2014 at 8:54 AM, Chris Burroughs
wrote:

> On 02/06/2014 01:17 AM, Sameer Farooqui wrote:
>
>> I'm running C* 2.0.4 and when I have a handful of SSTable files and
>> trigger
>> a manual compaction with 'nodetool compact' the first SSTable file doesn't
>> get compacted away.
>>
>> Is there something special about the first SSTable that it remains even
>> after a SizeTiered compaction?
>>
>
>
> No, this is not expected behavior.  Does the number of live SSTables
> reported match what is on disk?  Do you have a procedure that can repeat
> this?
>
>


Re: Adding datacenter for move to vnodes

2014-02-06 Thread Katriel Traum
Thank you Alain! That was exactly what I was looking for. I was worried I'd
have to do a rolling restart to change the snitch.

Katriel



On Thu, Feb 6, 2014 at 1:10 PM, Alain RODRIGUEZ  wrote:

> Hi, we did this exact same operation here too, with no issue.
>
> Contrary to Paulo we did not modify our snitch.
>
> We simply added a "dc_suffix" property in the cassandra-rackdc.properties
> conf file for the nodes in the new cluster:
>
> # Add a suffix to a datacenter name. Used by the Ec2Snitch and
> Ec2MultiRegionSnitch
>
> # to append a string to the EC2 region name.
>
> dc_suffix=-xl
>
> So our new cluster DC is basically: eu-west-xl
>
> I think this is less risky, at least it is easier to do.
>
> Hope this helps.
>
>
> 2014-02-02 11:42 GMT+01:00 Paulo Ricardo Motta Gomes <
> paulo.mo...@chaordicsystems.com>:
>
> We had a similar situation and what we did was first migrate the 1.1
>> cluster to GossipingPropertyFileSnitch, making sure that for each node we
>> specified the correct availability zone as the rack in
>> the cassandra-rackdc.properties. In this way,
>> the GossipingPropertyFileSnitch is equivalent to the EC2MultiRegionSnitch,
>> so the data location does not change and no repair is needed afterwards.
>> So, if your nodes are located in the us-east-1e AZ, your 
>> cassandra-rackdc.properties
>> should look like:
>>
>> dc=us-east
>> rack=1e
>>
>> After this step is complete on all nodes, then you can add a new
>> datacenter specifying different dc and rack on the
>> cassandra-rackdc.properties of the new DC. Make sure you upgrade your
>> initial datacenter to 1.2 before adding a new datacenter with vnodes
>> enabled (of course).
>>
>> Cheers
>>
>>
>> On Sun, Feb 2, 2014 at 6:37 AM, Katriel Traum  wrote:
>>
>>> Hello list.
>>>
>>> I'm upgrading a 1.1 cassandra cluster to 1.2(.13).
>>> I've read here and in other places that the best way to migrate to
>>> vnodes is to add a new DC, with the same amount of nodes, and run rebuild
>>> on each of them.
>>> However, I'm faced with the fact that I'm using EC2MultiRegion snitch,
>>> which automagically creates the DC and RACK.
>>>
>>> Any ideas how I can go about adding a new DC with this kind of setup? I
>>> need these new machines to be in the same EC2 Region as the current ones,
>>> so adding to a new Region is not an option.
>>>
>>> TIA,
>>> Katriel
>>>
>>
>>
>>
>> --
>> *Paulo Motta*
>>
>> Chaordic | *Platform*
>> *www.chaordic.com.br *
>> +55 48 3232.3200
>> +55 83 9690-1314
>>
>
>


Re: Adding datacenter for move to vnodes

2014-02-06 Thread Alain RODRIGUEZ
Glad it helps.

Good luck with this.

Cheers,

Alain


2014-02-06 17:30 GMT+01:00 Katriel Traum :

> Thank you Alain! That was exactly what I was looking for. I was worried
> I'd have to do a rolling restart to change the snitch.
>
> Katriel
>
>
>
> On Thu, Feb 6, 2014 at 1:10 PM, Alain RODRIGUEZ wrote:
>
>> Hi, we did this exact same operation here too, with no issue.
>>
>> Contrary to Paulo we did not modify our snitch.
>>
>> We simply added a "dc_suffix" property in the cassandra-rackdc.properties
>> conf file for the nodes in the new cluster:
>>
>> # Add a suffix to a datacenter name. Used by the Ec2Snitch and
>> Ec2MultiRegionSnitch
>>
>> # to append a string to the EC2 region name.
>>
>> dc_suffix=-xl
>>
>> So our new cluster DC is basically: eu-west-xl
>>
>> I think this is less risky, at least it is easier to do.
>>
>> Hope this helps.
>>
>>
>> 2014-02-02 11:42 GMT+01:00 Paulo Ricardo Motta Gomes <
>> paulo.mo...@chaordicsystems.com>:
>>
>> We had a similar situation and what we did was first migrate the 1.1
>>> cluster to GossipingPropertyFileSnitch, making sure that for each node we
>>> specified the correct availability zone as the rack in
>>> the cassandra-rackdc.properties. In this way,
>>> the GossipingPropertyFileSnitch is equivalent to the EC2MultiRegionSnitch,
>>> so the data location does not change and no repair is needed afterwards.
>>> So, if your nodes are located in the us-east-1e AZ, your 
>>> cassandra-rackdc.properties
>>> should look like:
>>>
>>> dc=us-east
>>> rack=1e
>>>
>>> After this step is complete on all nodes, then you can add a new
>>> datacenter specifying different dc and rack on the
>>> cassandra-rackdc.properties of the new DC. Make sure you upgrade your
>>> initial datacenter to 1.2 before adding a new datacenter with vnodes
>>> enabled (of course).
>>>
>>> Cheers
>>>
>>>
>>> On Sun, Feb 2, 2014 at 6:37 AM, Katriel Traum wrote:
>>>
 Hello list.

 I'm upgrading a 1.1 cassandra cluster to 1.2(.13).
 I've read here and in other places that the best way to migrate to
 vnodes is to add a new DC, with the same amount of nodes, and run rebuild
 on each of them.
 However, I'm faced with the fact that I'm using EC2MultiRegion snitch,
 which automagically creates the DC and RACK.

 Any ideas how I can go about adding a new DC with this kind of setup? I
 need these new machines to be in the same EC2 Region as the current ones,
 so adding to a new Region is not an option.

 TIA,
 Katriel

>>>
>>>
>>>
>>> --
>>> *Paulo Motta*
>>>
>>> Chaordic | *Platform*
>>> *www.chaordic.com.br *
>>> +55 48 3232.3200
>>> +55 83 9690-1314
>>>
>>
>>
>


Re: exceptions all around in clean cluster

2014-02-06 Thread Ondřej Černoš
Update: I dropped the keyspace and the system keyspace, deleted all the data
and started from a fresh state. Now it behaves correctly. The previously
reported state was therefore the result of the keyspace being dropped
beforehand and recreated with no compression on sstables - maybe some
sstables were left live in the system keyspace even though the keyspace was
completely dropped?

ondrej cernos


On Thu, Feb 6, 2014 at 3:11 PM, Ondřej Černoš  wrote:

> I ran nodetool scrub on nodes in the less corrupted datacenter and tried
> nodetool rebuild from this datacenter.
>
> This is the result:
>
> 2014-02-06 15:04:24.645+0100 [Thread-83] [ERROR] CassandraDaemon.java(191)
> org.apache.cassandra.service.CassandraDaemon: Exception in thread
> Thread[Thread-83,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> java.lang.IllegalArgumentException
> at
> org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:152)
>  at
> org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:187)
> at
> org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:138)
>  at
> org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:243)
> at
> org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:183)
>  at
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:79)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.IllegalArgumentException
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>  at
> org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:144)
> ... 5 more
> Caused by: java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at
> org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
>  at
> org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
> at
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
>  at
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
> at
> org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:132)
>  at
> org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:115)
> at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:165)
>  at
> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:45)
> at
> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:61)
>  at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
> at
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
>  at
> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
> at
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>  at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
> at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
>  at
> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
> at
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>  at
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:291)
> at
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>  at
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
> at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
>  at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
> at org.apache.cassandra.db.SliceQueryPager.next(SliceQueryPager.java:57)
>  at org.apache.cassandra.db.Table.indexRow(Table.java:424)
> at
> org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
>  at
> org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:803)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 2014-02-06 15:04:24.646+0100 [CompactionExecutor:10] [ERROR]
> CassandraDaemon.java(191) org.apache.cassandra.service.CassandraDaemon:
> component=c4 Exception in thread Thread[CompactionExecutor:10,1,main]
> java.lang.IllegalArgumentException
>  at java.nio.Buffer.limit(Buffer.java:267)
> at
> org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
>  at
> org.

Re: First SSTable file is not being compacted

2014-02-06 Thread Chris Burroughs

Sounds like you have done some solid test work.

I suggest reading https://issues.apache.org/jira/browse/CASSANDRA-6568 
and if you think your issue is the same adding your reproduction case 
there, otherwise create your own ticket.


On 02/06/2014 10:53 AM, Sameer Farooqui wrote:

Yeah, it's definitely repeatable. I have a lab environment set up where the
issue is occurring and I've recreated the lab environment 4 - 5 times and
it's occurred each time.

In my demodb.users CF I currently have 2 data SSTables on disk
(demodb-users-jb-1-Data.db and demodb-users-jb-6-Data.db). However, in
OpsCenter the CF: SSTable Count (demodb.users) graph shows only one SSTable.

The nodetool cfstats command also shows "SSTable count: 1" for this CF.


- SF


On Thu, Feb 6, 2014 at 8:54 AM, Chris Burroughs
wrote:


On 02/06/2014 01:17 AM, Sameer Farooqui wrote:


I'm running C* 2.0.4, and when I have a handful of SSTable files and trigger
a manual compaction with 'nodetool compact', the first SSTable file doesn't
get compacted away.

Is there something special about the first SSTable such that it remains even
after a size-tiered compaction?

No, this is not expected behavior.  Does the number of live SSTables
reported match what is on disk?  Do you have a procedure that can repeat
this?


Re: One of my nodes is in the wrong datacenter - help!

2014-02-06 Thread Sholes, Joshua
Thanks for the advice.   I did use “removenode” as I was aware of the 
replace_token problems.
I haven’t run into the issue in CASSANDRA-6615 yet, and I don’t believe I’m at 
risk for it.

I’m actually running into a different problem.   Having done a removenode on 
the node with the incorrect datacenter name, I am still getting “one or more 
nodes were unavailable” messages when doing queries with consistency=all.   I’m 
doing a full repair pass on the column family in question just to be safe 
(which is taking forever!) before I do anything else.   So to reiterate:  my 
cluster now shows 7 nodes up when looking with gossipinfo or status, but will 
still not do consistency=all queries.   Are there any best practices for 
finding out other issues with the cluster, or should I anticipate the repair 
pass will fix the problem?
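
For what it’s worth, the failure can also be observed from client code. The
sketch below is illustrative only - it assumes the DataStax Java driver and
placeholder keyspace/table names, not necessarily the client actually in use.
A consistency=ALL read throws UnavailableException whenever the coordinator
believes any replica for the queried token range is down:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.exceptions.UnavailableException;

public class ConsistencyAllProbe {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace"); // placeholder names
        SimpleStatement stmt =
            new SimpleStatement("SELECT * FROM my_table WHERE id = 1");
        stmt.setConsistencyLevel(ConsistencyLevel.ALL); // every replica must answer
        try {
            session.execute(stmt);
            System.out.println("All replicas responded.");
        } catch (UnavailableException e) {
            // Thrown by the coordinator before the read is attempted when it
            // thinks one or more replicas are down.
            System.out.println("Unavailable: " + e.getMessage());
        } finally {
            cluster.close();
        }
    }
}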
--
Josh Sholes

From: Robert Coli <rc...@eventbrite.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 3, 2014 at 7:30 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: One of my nodes is in the wrong datacenter - help!
Subject: Re: One of my nodes is in the wrong datacenter - help!

On Sun, Feb 2, 2014 at 10:48 AM, Sholes, Joshua
<joshua_sho...@cable.comcast.com> wrote:
I had a node in my 8-node production 1.2.8 cluster have a serious problem and 
need to be removed and rebuilt.   However, after doing nodetool removenode and 
then bootstrapping a new node on the same IP address, the new node somehow 
ended up with a different datacenter name (the rest of the nodes are in dc 
$NAME, and the new one is in dc $NAME6934724 — as in, a string of seemingly 
random numbers appended to the correct name).   How can I force it to change DC 
names back to what it should be?

You could change the entry in the system.local columnfamily on the affected 
node...

cqlsh> UPDATE system.local SET data_center = '$NAME' WHERE key = 'local';

... but that is Not Supported and may have side effects of which I am not aware.

I’m working with 500+GB per node here so bootstrapping it again is not a huge 
issue, but I’d prefer to avoid it anyway.  I am NOT able to change the node’s 
IP address at this time so I’m stuck with bootstrapping a new node in the same 
place, which my gut feeling tells me might be part of the problem.

Note that replace_node/replace_token are broken in 1.2.8, did you attempt to 
use either of these? I presume not because you said you did removenode...

 If I were you, I would probably removenode and re-bootstrap, as the safest 
alternative.

As an aside, while trying to deal with this issue you should be aware of this 
ticket, so you do not do the sequence of actions it describes.

https://issues.apache.org/jira/browse/CASSANDRA-6615

=Rob


Re: exception during add node due to test beforeAppend on SSTableWriter

2014-02-06 Thread ravi prasad
I'm seeing the same with cassandra-2.0.4 during compaction, after a lot of
sstable files are streamed during bootstrap/repair. The strange thing is that
the 'Last written key >= current key' exception during compaction of L0/L1
sstables goes away after restarting cassandra. But then I see those warnings
about overlapping sstables.

I think the change in https://issues.apache.org/jira/browse/CASSANDRA-5921 is
causing overlapping of sstables in L1. I didn't see this with cassandra-1.2.9,
which had https://issues.apache.org/jira/browse/CASSANDRA-5907
fixed.  Can you open a jira reporting this issue?

On Thursday, February 6, 2014 4:31 AM, "Desimpel, Ignace" 
 wrote:

Also, these nodes and data were entirely created by 2.0.4 code, so this should
not really be a 1.1.x-related bug.
Also, I restarted the whole test, thus a completely new database, and I get
similar problems.
 
From: Desimpel, Ignace
Sent: Friday, 31 January 2014 18:02
To: user@cassandra.apache.org
Subject: exception during add node due to test beforeAppend on SSTableWriter
 
The join with auto bootstrap itself was finished. So I restarted the added
node. During restart I saw a message indicating that something was wrong with
this row and sstable.
Of course, in my case I did not drop sstables from another node. But I did
decommission and add the node, so that is still a kind of
‘data-from-another-node’.
 
At level 2, 
SSTableReader(path='../../../../data/cdi.cassandra.cdi/dbdatafile/Ks100K/ForwardStringFunction/Ks100K-ForwardStringFunction-jb-67-Data.db')
 [DecoratedKey(065864ce01024e4e505300, 065864ce01024e4e505300), 
DecoratedKey(14c9d35e0102646973706f736974696f6e7300, 
14c9d35e0102646973706f736974696f6e7300)] overlaps 
SSTableReader(path='../../../../data/cdi.cassandra.cdi/dbdatafile/Ks100K/ForwardStringFunction/Ks100K-ForwardStringFunction-jb-64-Data.db')
 [DecoratedKey(068c2e4101024d6f64616c207665726200, 
068c2e4101024d6f64616c207665726200), 
DecoratedKey(06c566b4010244657465726d696e657200, 
06c566b4010244657465726d696e657200)].  This could be caused by a bug in 
Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from 
another node into the data directory. Sending back to L0.  If you didn't drop 
in sstables, and have not yet run scrub, you should do so since you may also 
have rows out-of-order within an sstable
 
 
 
From: Desimpel, Ignace
Sent: Friday, 31 January 2014 17:43
To: user@cassandra.apache.org
Subject: exception during add node due to test beforeAppend on SSTableWriter
 
4 nodes, byte ordered, LCS, 3 compaction executors, replication factor 1.
Code is the 2.0.4 version but with the patch for CASSANDRA-6638. However, no
cleanup is run, so the patch should not play a role.
 
A 4-node cluster is started and inserts/queries are done up to only about 10 GB
of data on each node.
Then one node is decommissioned, and its local files are deleted.
Then the node is added again.
Exception: see below.
 
Any idea?
 
Regards,
Ignace Desimpel
 
* 2014-01-31 17:12:02.600 ==>> Bootstrap is streaming data from other 
nodes... Please wait ... 
* 2014-01-31 17:12:02.600 ==>> Bootstrap stream state : rx= 29.00 tx= 
100.00 Please wait ... 
* 2014-01-31 17:12:18.908 Enqueuing flush of 
Memtable-compactions_in_progress@350895652(0/0 serialized/live bytes, 1 ops) 
* 2014-01-31 17:12:18.908 Writing 
Memtable-compactions_in_progress@350895652(0/0 serialized/live bytes, 1 ops) 
* 2014-01-31 17:12:19.009 Completed flushing 
../../../../data/cdi.cassandra.cdi/dbdatafile/system/compactions_in_progress/system-compactions_in_progress-jb-74-Data.db
 (42 bytes) for commitlog position ReplayPosition(segmentId=1391184546183, 
position=561494) 
* 2014-01-31 17:12:19.018 Exception in thread 
Thread[CompactionExecutor:1,1,main] 
* java.lang.RuntimeException: Last written key 
DecoratedKey(8afc9237010380178575, 8afc9237010380178575) >= 
current key DecoratedKey(6e0bb955010383dfdd1d, 
6e0bb955010383dfdd1d) writing into 
/media/datadrive1/cdi.cassandra.cdi/dbdatafile/Ks100K/ForwardLongFunction/Ks100K-ForwardLongFunction-tmp-jb-159-Data.db
 
* at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT] 
* at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT] 
* at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT] 
* at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 ~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT] 
* at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.0.4-SNAPSHOT.jar:2.0.4-SNAPSHOT] 
* at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
 ~[apache-cas

ring describe returns only public ips

2014-02-06 Thread Ted Pearson
We are using Cassandra 1.2.13 in a multi-datacenter setup. We are using 
Astyanax as the client, and we’d like to enable its token aware connection pool 
type and ring describe node discovery type. Unfortunately, I’ve found that both 
thrift’s describe_ring and `nodetool ring` only report the public IPs of the 
cassandra nodes. This means that Astyanax tries to reconnect to the public IPs 
of each node, which doesn’t work and just results in no hosts being available 
for queries according to Astyanax.
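
For reference, the two Astyanax settings in question are configured roughly as
below. This is a minimal sketch under assumptions - the cluster, keyspace,
pool and seed values are placeholders, not our real configuration:

import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

public class TokenAwareSetup {
    public static void main(String[] args) {
        AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
            .forCluster("MyCluster")
            .forKeyspace("my_keyspace")
            .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                // RING_DESCRIBE drives host discovery from thrift's
                // describe_ring, which is where the public IPs come from
                .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
            .withConnectionPoolConfiguration(
                new ConnectionPoolConfigurationImpl("my_pool")
                    .setPort(9160)
                    .setSeeds("10.0.0.1:9160"))
            .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
            .buildKeyspace(ThriftFamilyFactory.getInstance());
        context.start();
        Keyspace keyspace = context.getClient(); // reconnects use discovered hosts
    }
}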

I know from `nodetool gossipinfo` (and the fact that the clusters work) that 
it's sharing the LOCAL_IP via gossip, but have no idea how or if it’s possible 
to get describe_ring to return local IPs, or if there is some alternative.

Thanks,

-Ted

CQL list command

2014-02-06 Thread Andrew Cobley
TL;DR

Is there a CQL equivalent of the CLI LIST command?  Yes or no?

Long version

I often use the CLI command LIST for debugging, or when teaching students to
show them what’s going on under the hood of CQL.  I see that the CLI will be
removed in Cassandra 3 and we will lose this ability.  It would be nice if CQL
retained it, or something like it, for debugging and teaching purposes.

Any ideas ?
Andy



The University of Dundee is a registered Scottish Charity, No: SC015096


Re: CQL list command

2014-02-06 Thread Russell Bradberry
try SELECT * FROM my_table LIMIT 100;



On February 6, 2014 at 4:02:26 PM, Andrew Cobley (a.e.cob...@dundee.ac.uk) 
wrote:

TL;DR  

Is there a CQL equivalent of the CLI LIST command? Yes or no?  

Long version  

I often use the CLI command LIST for debugging, or when teaching students to
show them what's going on under the hood of CQL. I see that the CLI will be
removed in Cassandra 3 and we will lose this ability. It would be nice if CQL
retained it, or something like it, for debugging and teaching purposes.  

Any ideas ?  
Andy  



The University of Dundee is a registered Scottish Charity, No: SC015096  


Auto-Bootstrap not Auto-Bootstrapping?

2014-02-06 Thread Thunder Stumpges
Hi all,

We recently needed/wanted to reconfigure the disks for our 3-node C*2.0.4
Cassandra setup and rebuild the server at the same time. Upon adding the
newly rebuilt server into the cluster, it immediately started serving read
requests with no data! Then because the latency is so "good" the vast
majority of requests were pushed onto that server. We are using 3 nodes
with RF=3. Why wouldn't the node stream in the needed data before serving?
My impression was that the auto_bootstrap setting was true by default (we
have not set it anywhere) and that a new node entering the cluster would
stream in data for its tokens (virtual nodes) prior to serving requests.

Does this have to do with re-using the same name/ip as the old server which
also happens to be in the seed list on our clients and in cassandra.yaml ?

Our admin did the following steps during this process:

- Stop one of the 3 servers. It then appeared as DOWN to the rest of the
cluster.
- Rebuild the system, reconfigure disks (name and ip are same as the server
that came down)
  - NOTE: there was NO data left from before on this machine, it is a new
bare-metal install
- nodetool removenode  (from one of the other remaining nodes)
  - wait for completion ~15 min
- Start cassandra on new node, wait for it to come up
- nodetool repair (on new node)

Immediately when it came up it was as if we'd lost 1/3 of our data because
so many read requests were hitting this new empty node. There does appear
to be streaming data coming into the new node, but it is still serving many
empty responses.

Another curious thing is that I set all of our reads to Quorum ahead of
time hoping if this did happen again (after the first time caught us out),
that the quorum reads would prevent the bad consistency. This does not
appear to have helped.
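
(For clarity, "reads at Quorum" here means per-request consistency as in the
sketch below - the DataStax Java driver and the table name are assumptions for
illustration, not our actual client. With RF=3, QUORUM requires 2 of 3
replicas to respond, and it only guarantees fresh data when the corresponding
writes also reached a quorum.)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class QuorumReadSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace"); // placeholder names
        SimpleStatement read =
            new SimpleStatement("SELECT * FROM my_table WHERE id = 42");
        // QUORUM = floor(RF/2) + 1 replicas; with RF=3 that is 2 of 3.
        read.setConsistencyLevel(ConsistencyLevel.QUORUM);
        Row row = session.execute(read).one();
        System.out.println(row);
        cluster.close();
    }
}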

Any insight as to what the heck went wrong here would be greatly
appreciated.

Thanks,
Thunder


Re: Auto-Bootstrap not Auto-Bootstrapping?

2014-02-06 Thread Keith Wright
Is it a seed node?  My understanding is that they do not bootstrap

On Feb 6, 2014 4:23 PM, Thunder Stumpges  wrote:
Hi all,

We recently needed/wanted to reconfigure the disks for our 3-node C*2.0.4 
Cassandra setup and rebuild the server at the same time. Upon adding the newly 
rebuilt server into the cluster, it immediately started serving read requests 
with no data! Then because the latency is so "good" the vast majority of 
requests were pushed onto that server. We are using 3 nodes with RF=3. Why 
wouldn't the node stream in the needed data before serving? My impression was 
that the auto_bootstrap setting was true by default (we have not set it 
anywhere) and that a new node entering the cluster would stream in data for its 
tokens (virtual nodes) prior to serving requests.

Does this have to do with re-using the same name/ip as the old server which 
also happens to be in the seed list on our clients and in cassandra.yaml ?

Our admin did the following steps during this process:

- Stop one of the 3 servers. It then appeared as DOWN to the rest of the 
cluster.
- Rebuild the system, reconfigure disks (name and ip are same as the server 
that came down)
  - NOTE: there was NO data left from before on this machine, it is a new 
bare-metal install
- nodetool removenode  (from one of the other remaining nodes)
  - wait for completion ~15 min
- Start cassandra on new node, wait for it to come up
- nodetool repair (on new node)

Immediately when it came up it was as if we'd lost 1/3 of our data because so 
many read requests were hitting this new empty node. There does appear to be 
streaming data coming into the new node, but it is still serving many empty 
responses.

Another curious thing is that I set all of our reads to Quorum ahead of time 
hoping if this did happen again (after the first time caught us out), that the 
quorum reads would prevent the bad consistency. This does not appear to have 
helped.

Any insight as to what the heck went wrong here would be greatly appreciated.

Thanks,
Thunder



Re: Auto-Bootstrap not Auto-Bootstrapping?

2014-02-06 Thread Thunder Stumpges
I guess so, it likely was listed in the seeds in cassandra.yaml as all
three of the existing servers were, and the rebuilt one used the same name
and IP.


On Thu, Feb 6, 2014 at 1:42 PM, Keith Wright  wrote:

>  Is it a seed node?  My understanding is that they do not bootstrap
> On Feb 6, 2014 4:23 PM, Thunder Stumpges 
> wrote:
>  Hi all,
>
>  We recently needed/wanted to reconfigure the disks for our 3-node
> C*2.0.4 Cassandra setup and rebuild the server at the same time. Upon
> adding the newly rebuilt server into the cluster, it immediately started
> serving read requests with no data! Then because the latency is so "good"
> the vast majority of requests were pushed onto that server. We are using 3
> nodes with RF=3. Why wouldn't the node stream in the needed data before
> serving? My impression was that the auto_bootstrap setting was true by
> default (we have not set it anywhere) and that a new node entering the
> cluster would stream in data for its tokens (virtual nodes) prior to
> serving requests.
>
>  Does this have to do with re-using the same name/ip as the old server
> which also happens to be in the seed list on our clients and in
> cassandra.yaml ?
>
>  Our admin did the following steps during this process:
>
>  - Stop one of the 3 servers. It then appeared as DOWN to the rest of the
> cluster.
> - Rebuild the system, reconfigure disks (name and ip are same as the
> server that came down)
>   - NOTE: there was NO data left from before on this machine, it is a new
> bare-metal install
> - nodetool removenode  (from one of the other remaining nodes)
>   - wait for completion ~15 min
> - Start cassandra on new node, wait for it to come up
> - nodetool repair (on new node)
>
>  Immediately when it came up it was as if we'd lost 1/3 of our data
> because so many read requests were hitting this new empty node. There does
> appear to be streaming data coming into the new node, but it is still
> serving many empty responses.
>
>  Another curious thing is that I set all of our reads to Quorum ahead of
> time hoping if this did happen again (after the first time caught us out),
> that the quorum reads would prevent the bad consistency. This does not
> appear to have helped.
>
>  Any insight as to what the heck went wrong here would be greatly
> appreciated.
>
>  Thanks,
> Thunder
>
>


Re: Adding datacenter for move to vnodes

2014-02-06 Thread Vasileios Vlachos
Hello,

My question is: why would you need another DC to migrate to vnodes? How
about decommissioning each node in turn, changing the cassandra.yaml
accordingly, deleting the data, and bringing the node back into the cluster to
let it bootstrap from the others?

We did that recently with our demo cluster. Is that wrong in any way? The
only thing to take into consideration, I think, is disk space. We are not
using Amazon, but I am not sure how that would be different for this
particular issue.

Thanks,

Bill
On 6 Feb 2014 16:34, "Alain RODRIGUEZ"  wrote:

> Glad it helps.
>
> Good luck with this.
>
> Cheers,
>
> Alain
>
>
> 2014-02-06 17:30 GMT+01:00 Katriel Traum :
>
>> Thank you Alain! That was exactly what I was looking for. I was worried
>> I'd have to do a rolling restart to change the snitch.
>>
>> Katriel
>>
>>
>>
>> On Thu, Feb 6, 2014 at 1:10 PM, Alain RODRIGUEZ wrote:
>>
>>> Hi, we did this exact same operation here too, with no issue.
>>>
>>> Contrary to Paulo we did not modify our snitch.
>>>
>>> We simply added a "dc_suffix" property in the
>>> cassandra-rackdc.properties conf file for nodes in the new cluster:
>>>
>>> # Add a suffix to a datacenter name. Used by the Ec2Snitch and
>>> Ec2MultiRegionSnitch
>>>
>>> # to append a string to the EC2 region name.
>>>
>>> dc_suffix=-xl
>>>
>>> So our new cluster DC is basically : eu-west-xl
>>>
>>> I think this is less risky, at least it is easier to do.
>>>
>>> Hope this helps.
>>>
>>>
>>> 2014-02-02 11:42 GMT+01:00 Paulo Ricardo Motta Gomes <
>>> paulo.mo...@chaordicsystems.com>:
>>>
>>> We had a similar situation and what we did was first migrate the 1.1
 cluster to GossipingPropertyFileSnitch, making sure that for each node we
 specified the correct availability zone as the rack in
 the cassandra-rackdc.properties. In this way,
 the GossipingPropertyFileSnitch is equivalent to the EC2MultiRegionSnitch,
 so the data location does not change and no repair is needed afterwards.
 So, if your nodes are located in the us-east-1e AZ, your 
 cassandra-rackdc.properties
 should look like:

 dc=us-east
 rack=1e

 After this step is complete on all nodes, then you can add a new
 datacenter specifying different dc and rack on the
 cassandra-rackdc.properties of the new DC. Make sure you upgrade your
 initial datacenter to 1.2 before adding a new datacenter with vnodes
 enabled (of course).

 Cheers


 On Sun, Feb 2, 2014 at 6:37 AM, Katriel Traum wrote:

> Hello list.
>
> I'm upgrading a 1.1 cassandra cluster to 1.2(.13).
> I've read here and in other places that the best way to migrate to
> vnodes is to add a new DC, with the same amount of nodes, and run rebuild
> on each of them.
> However, I'm faced with the fact that I'm using EC2MultiRegion snitch,
> which automagically creates the DC and RACK.
>
> Any ideas how I can go about adding a new DC with this kind of setup?
> I need these new machines to be in the same EC2 Region as the current 
> ones,
> so adding to a new Region is not an option.
>
> TIA,
> Katriel
>



 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br *
 +55 48 3232.3200
 +55 83 9690-1314

>>>
>>>
>>
>


Re: Adding datacenter for move to vnodes

2014-02-06 Thread Andrey Ilinykh
My understanding is that you can't mix vnodes and regular nodes in the same DC.
Is that correct?


On Thu, Feb 6, 2014 at 2:16 PM, Vasileios Vlachos <
vasileiosvlac...@gmail.com> wrote:

> Hello,
>
> My question is: why would you need another DC to migrate to vnodes? How
> about decommissioning each node in turn, changing the cassandra.yaml
> accordingly, deleting the data, and bringing the node back into the cluster to
> let it bootstrap from the others?
>
> We did that recently with our demo cluster. Is that wrong in any way? The
> only thing to take into consideration, I think, is disk space. We are not
> using Amazon, but I am not sure how that would be different for this
> particular issue.
>
> Thanks,
>
> Bill
> On 6 Feb 2014 16:34, "Alain RODRIGUEZ"  wrote:
>
>> Glad it helps.
>>
>> Good luck with this.
>>
>> Cheers,
>>
>> Alain
>>
>>
>> 2014-02-06 17:30 GMT+01:00 Katriel Traum :
>>
>>> Thank you Alain! That was exactly what I was looking for. I was worried
>>> I'd have to do a rolling restart to change the snitch.
>>>
>>> Katriel
>>>
>>>
>>>
>>> On Thu, Feb 6, 2014 at 1:10 PM, Alain RODRIGUEZ wrote:
>>>
 Hi, we did this exact same operation here too, with no issue.

 Contrary to Paulo we did not modify our snitch.

 We simply added a "dc_suffix" property in the
 cassandra-rackdc.properties conf file for nodes in the new cluster:

 # Add a suffix to a datacenter name. Used by the Ec2Snitch and
 Ec2MultiRegionSnitch

 # to append a string to the EC2 region name.

 dc_suffix=-xl

 So our new cluster DC is basically : eu-west-xl

 I think this is less risky, at least it is easier to do.

 Hope this helps.


 2014-02-02 11:42 GMT+01:00 Paulo Ricardo Motta Gomes <
 paulo.mo...@chaordicsystems.com>:

 We had a similar situation and what we did was first migrate the 1.1
> cluster to GossipingPropertyFileSnitch, making sure that for each node we
> specified the correct availability zone as the rack in
> the cassandra-rackdc.properties. In this way,
> the GossipingPropertyFileSnitch is equivalent to the EC2MultiRegionSnitch,
> so the data location does not change and no repair is needed afterwards.
> So, if your nodes are located in the us-east-1e AZ, your 
> cassandra-rackdc.properties
> should look like:
>
> dc=us-east
> rack=1e
>
> After this step is complete on all nodes, then you can add a new
> datacenter specifying different dc and rack on the
> cassandra-rackdc.properties of the new DC. Make sure you upgrade your
> initial datacenter to 1.2 before adding a new datacenter with vnodes
> enabled (of course).
>
> Cheers
>
>
> On Sun, Feb 2, 2014 at 6:37 AM, Katriel Traum wrote:
>
>> Hello list.
>>
>> I'm upgrading a 1.1 cassandra cluster to 1.2(.13).
>> I've read here and in other places that the best way to migrate to
>> vnodes is to add a new DC, with the same amount of nodes, and run rebuild
>> on each of them.
>> However, I'm faced with the fact that I'm using EC2MultiRegion
>> snitch, which automagically creates the DC and RACK.
>>
>> Any ideas how I can go about adding a new DC with this kind of setup?
>> I need these new machines to be in the same EC2 Region as the current 
>> ones,
>> so adding to a new Region is not an option.
>>
>> TIA,
>> Katriel
>>
>
>
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br *
> +55 48 3232.3200
> +55 83 9690-1314
>


>>>
>>


Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Clint Kelly
Thunder, Cyril, thanks for your responses.

FWIW I started working on writing a set of InputFormat and
OutputFormat classes that use the DataStax Java driver instead of the
thrift client.  That seems to be going pretty well.  I am not sure if
I can eliminate all of the thrift code, but at least the RecordReader
was pretty easy to implement using only the DataStax driver.  I can
post a link if anyone is curious (once it is done).  The DataStax
driver has automatic paging and a few other nice features that made
the code more concise.
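
As a flavor of why the driver helps, here is a minimal, self-contained sketch
(not the actual InputFormat code - the keyspace, table, and fetch size are
placeholder assumptions) of the automatic paging that keeps a RecordReader
simple:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class PagingSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace"); // placeholder names
        SimpleStatement stmt = new SimpleStatement("SELECT * FROM my_table");
        stmt.setFetchSize(1000); // rows per page; further pages are fetched
                                 // transparently as the iterator advances
        ResultSet rs = session.execute(stmt);
        for (Row row : rs) {
            // a RecordReader's nextKeyValue() would pull from this iterator
            System.out.println(row);
        }
        cluster.close();
    }
}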

Best regards,
Clint



On Tue, Feb 4, 2014 at 3:36 PM, Thunder Stumpges
 wrote:
> Hello Clint,
>
> Yes I was able to get it working after a bit of work. I have pushed the
> branch with the fix (which is currently quite a ways behind latest). You can
> compare to yours I suppose. Let me know if you have any questions.
>
> https://github.com/VerticalSearchWorks/cassandra/tree/Cassandra2-CDH4
>
> regards,
> Thunder
>
>
>
>
> On Tue, Feb 4, 2014 at 1:40 PM, Cyril Scetbon  wrote:
>>
>> Hi,
>>
>> Look for posts from Thunder Stumpges in this mailing list. I know he has
>> succeeded in making Hadoop 2.x work with Cassandra 2.x.
>>
>> For those who are interested in using it with Cassandra 1.2.13 you can use
>> the patch
>> https://github.com/cscetbon/cassandra/commit/88d694362d8d6bc09b3eeceb6baad7b3cc068ad3.patch
>>
>> It uses Cloudera CDH4 repository for Hadoop Classes but you can use
>> others.
>>
>> Regards
>> --
>> Cyril SCETBON
>>
>> On 03 Feb 2014, at 19:10, Clint Kelly  wrote:
>>
>> > Folks,
>> >
>> > Has anyone out there used Cassandra 2.0 with Hadoop 2.x?  I saw this
>> > discussion on the Cassandra JIRA:
>> >
>> >https://issues.apache.org/jira/browse/CASSANDRA-5201
>> >
>> > but the fix referenced
>> > (https://github.com/michaelsembwever/cassandra-hadoop) is for
>> > Cassandra 1.2.
>> >
>> > I put together a similar patch for Cassandra 2.0 for anyone who is
>> > interested:
>> >
>> >https://github.com/wibiclint/cassandra2-hadoop2
>> >
>> > but I'm wondering if there is a more official solution to this
>> > problem.  Any help would be appreciated.  Thanks!
>> >
>> > Best regards,
>> > Clint
>>
>


Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Alex Popescu
On Thu, Feb 6, 2014 at 3:50 PM, Clint Kelly  wrote:

> I can
> post a link if anyone is curious (once it is done).
>

I'm curious... thanks


-- 

:- a)


@al3xandru


Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Steven A Robenalt
I am as well.

Thanks,
Steve



On Thu, Feb 6, 2014 at 4:13 PM, Alex Popescu  wrote:

>
> On Thu, Feb 6, 2014 at 3:50 PM, Clint Kelly  wrote:
>
>> I can
>> post a link if anyone is curious (once it is done).
>>
>
> I'm curious... thanks
>
>
> --
>
> :- a)
>
>
> @al3xandru
>



-- 
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063

srobe...@stanford.edu
http://highwire.stanford.edu


Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Clint Kelly
Okay neat, hopefully it will look reasonable by the end of the month or so!  :)

On Thu, Feb 6, 2014 at 4:15 PM, Steven A Robenalt  wrote:
> I am as well.
>
> Thanks,
> Steve
>
>
>
> On Thu, Feb 6, 2014 at 4:13 PM, Alex Popescu  wrote:
>>
>>
>> On Thu, Feb 6, 2014 at 3:50 PM, Clint Kelly  wrote:
>>>
>>> I can
>>> post a link if anyone is curious (once it is done).
>>
>>
>> I'm curious... thanks
>>
>>
>> --
>>
>> :- a)
>>
>>
>> @al3xandru
>
>
>
>
> --
> Steve Robenalt
> Software Architect
> HighWire | Stanford University
> 425 Broadway St, Redwood City, CA 94063
>
> srobe...@stanford.edu
> http://highwire.stanford.edu
>
>
>
>
>


Re: exceptions all around in clean cluster

2014-02-06 Thread Robert Coli
On Thu, Feb 6, 2014 at 8:39 AM, Ondřej Černoš  wrote:

> Update: I dropped the keyspace, the system keyspace, deleted all the data
> and started from a fresh state. Now it behaves correctly. The previously
> reported state was therefore the result of the keyspace having been dropped
> beforehand and recreated with no compression on sstables - maybe some
> sstables were still tracked as live in the system keyspace even though the
> keyspace was completely dropped?
>

If you have a reproduction path, I recommend filing a JIRA in the Apache
Cassandra JIRA.

It's possible the response will be that dropping and recreating things
(CFs, Keyspaces) is currently problematic and will be fixed soon, but your
case seems particularly unusual/severe...

=Rob


Re: exceptions all around in clean cluster

2014-02-06 Thread Tupshin Harper
This is a known issue until Cassandra 2.1

https://issues.apache.org/jira/browse/CASSANDRA-5202

-Tupshin
On Feb 6, 2014 10:05 PM, "Robert Coli"  wrote:

> On Thu, Feb 6, 2014 at 8:39 AM, Ondřej Černoš  wrote:
>
>> Update: I dropped the keyspace, the system keyspace, deleted all the data
>> and started from a fresh state. Now it behaves correctly. The previously
>> reported state was therefore the result of the keyspace having been dropped
>> beforehand and recreated with no compression on sstables - maybe some
>> sstables were still tracked as live in the system keyspace even though the
>> keyspace was completely dropped?
>>
>
> If you have a reproduction path, I recommend filing a JIRA in the Apache
> Cassandra JIRA.
>
> It's possible the response will be that dropping and recreating things
> (CFs, Keyspaces) is currently problematic and will be fixed soon, but your
> case seems particularly unusual/severe...
>
> =Rob
>
>


Re: CQL list command

2014-02-06 Thread Ben Hood
On Thu, Feb 6, 2014 at 9:01 PM, Andrew Cobley  wrote:
> I often use the CLI command LIST for debugging, or when teaching students to
> show them what's going on under the hood of CQL.  I see that the CLI will be
> removed in Cassandra 3 and we will lose this ability.  It would be nice if
> CQL retained it, or something like it, for debugging and teaching purposes.

I agree. I use LIST every now and then to verify the storage layout of
partitioning and cluster columns. What would be cool is to do
something like:

cqlsh:y> CREATE TABLE x (
  ... a int,
  ... b int,
  ... c int,
  ... PRIMARY KEY (a,b)
  ... );
cqlsh:y> insert into x (a,b,c) values (1,1,1);
cqlsh:y> insert into x (a,b,c) values (2,1,1);
cqlsh:y> insert into x (a,b,c) values (2,2,1);
cqlsh:y> select * from x;
 a | b | c
---+---+---
 1 | 1 | 1
 2 | 1 | 1
 2 | 2 | 1

(3 rows)

cqlsh:y> select * from x show storage; // requires monospace font

   +---+
+---+  |b:1|
|a:1| +--> |---|
+---+  |c:1|
   +---+

   +---+---+
+---+  |b:1|b:2|
|a:2| +--> |---|---|
+---+  |c:1|c:2|
   +---+---+

(2 rows)