Removed node jumps back into the cluster

2012-09-11 Thread Fredrik
I've tested a scenario where I wanted to reuse a removed node in a new 
cluster with the same IP. Maybe not very common, but anyway, I found some 
strange behaviour in the Gossiper.


Here is what I think/see happening:
- Cassandra 1.1. Three-node cluster: A, B and C.
- Shut down node C and remove the token for node C.
- Everything looks OK in the logs, which report that node C is removed, etc.
- Nodes A and B still send gossip digests about the removed node, but I 
guess that's OK since they know about it (Gossiper.endpointStateMap).

- Node C has status "removed" when checking in the JMX console.
- Checked in LocationInfo that the ring only contains the token/IP for nodes A and B.
- Removed the system/data tables for C.
- Changed the seed on C to point to itself.
- Started node C. Node C only gossips with itself, and nodes A and B don't 
recognize that node C is running, which is correct.
- Restart e.g. node A. Node A now loses all gossip information 
(Gossiper.endpointStateMap) about node C. It reloads information from 
LocationInfo and asks node B about endpoint states. Node A receives 
information about node C from node B, which triggers 
Gossiper.handleMajorStateChange; node C is first marked as unreachable 
because it is in a dead state (removed). Node A then gossips to 
unreachable endpoints, including node C, which replies that it is up, and 
node C becomes incorporated into the "old" cluster again.
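The resurrection path above can be modeled with a small, hypothetical simulation. This is not Cassandra's Gossiper code; the function names and state values are invented purely to illustrate the flow of endpoint state between A, B and C:

```python
# Hypothetical sketch of the gossip-state resurrection described above.
# Not Cassandra code: endpoints map to a status string, nothing more.

def restart_and_sync(local_state, peer_state):
    """On restart, the local in-memory gossip state is empty; the node
    rebuilds it from a peer's full endpoint-state map."""
    merged = dict(local_state)
    for endpoint, status in peer_state.items():
        if endpoint not in merged:
            # analogue of handleMajorStateChange: an unknown endpoint appears
            merged[endpoint] = status
    return merged

def probe_unreachable(state, live_endpoints):
    """Dead/removed endpoints are still probed; if one answers,
    it is marked up and effectively rejoins the old cluster."""
    for endpoint, status in state.items():
        if status == "removed" and endpoint in live_endpoints:
            state[endpoint] = "up"
    return state

# Node A restarts with an empty map; B still remembers the removed C.
a = restart_and_sync({}, {"B": "up", "C": "removed"})
# C was restarted with the same IP, so it responds to the probe.
a = probe_unreachable(a, live_endpoints={"B", "C"})
print(a["C"])  # "up" -- the removed node is live again
```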


Is this a bug, or is it a requirement that if you take a node out of 
the cluster you must change the IP of the removed node before using it 
in another cluster?

Please enlighten me.

Regards
/Fredrik





Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Omid Aladini
Which version of Cassandra was your data initially created with?

A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables
and inter-level overlaps in CFs with Leveled Compaction. Your sstables
generated with 1.1.3 and later should not have this issue [1] [2].

In case you have old leveled-compacted sstables (generated with 1.1.2
or earlier, including 1.0.x), you need to run an offline scrub using
Cassandra 1.1.4 or later via the /bin/sstablescrub command; it will fix
out-of-order sstables and inter-level overlaps caused by previous
versions of LCS. You need to take nodes down in order to run the
offline scrub.
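The two invariants that offline scrub restores can be illustrated with a toy check. This is not sstablescrub itself; modeling each sstable as a (first_key, last_key) range is a deliberate simplification:

```python
# Toy check of the two LCS invariants that offline scrub repairs.
# Not Cassandra code: sstables are modeled as (first_key, last_key) pairs.

def sstable_in_order(first_key, last_key):
    """Within one sstable, keys must appear in sorted order."""
    return first_key <= last_key

def level_has_overlap(sstables):
    """In levels >= 1, sstables must cover disjoint key ranges."""
    ranges = sorted(sstables)
    return any(prev_last >= cur_first
               for (_, prev_last), (cur_first, _) in zip(ranges, ranges[1:]))

# A healthy L1: disjoint, ordered ranges.
print(level_has_overlap([("a", "f"), ("g", "m"), ("n", "z")]))  # False
# The pre-1.1.3 bug could leave overlapping ranges like these:
print(level_has_overlap([("a", "h"), ("g", "m")]))  # True
```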

> After 3 hours the job is done and there are 11390 compaction tasks pending.
> My question: Can these assertions be ignored? Or do I need to worry about
> it?

They can't be ignored, since pending compactions raise the upper bound
on the number of disk seeks needed to read a row, and you don't get the
nice guarantees of leveled compaction.
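A rough way to see why the backlog matters (a simplified model, not Cassandra's actual read path; it ignores bloom filters and caches, which cut the real number of seeks):

```python
# Simplified worst-case read cost for a leveled layout. With LCS in good
# shape, a read touches at most one sstable per level L1..Ln, plus every
# L0 sstable (which may all overlap the key). A compaction backlog keeps
# extra sstables in L0 and inflates this bound.

def worst_case_sstables_per_read(levels, l0_sstables):
    return levels + l0_sstables

print(worst_case_sstables_per_read(levels=4, l0_sstables=0))   # 4
print(worst_case_sstables_per_read(levels=4, l0_sstables=50))  # 54: the backlog hurts
```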

Cheers,
Omid

[1] https://issues.apache.org/jira/browse/CASSANDRA-4411
[2] https://issues.apache.org/jira/browse/CASSANDRA-4321

On Mon, Sep 10, 2012 at 6:37 PM, Rudolf van der Leeden
 wrote:
> Hi,
>
> I'm getting 5 identical assertions while running 'nodetool cleanup' on a
> Cassandra 1.1.4 node with Load=104G and 80m keys.
> From  system.log :
>
> ERROR [CompactionExecutor:576] 2012-09-10 11:25:50,265
> AbstractCassandraDaemon.java (line 134) Exception in thread
> Thread[CompactionExecutor:576,1,main]
> java.lang.AssertionError
> at
> org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:214)
> at
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:158)
> at
> org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:531)
> at
> org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:254)
> at
> org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:992)
> at
> org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
> at
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
> at
> org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
>
> After 3 hours the job is done and there are 11390 compaction tasks pending.
> My question: Can these assertions be ignored? Or do I need to worry about
> it?
>
> Thanks for your help and best regards,
> -Rudolf.
>


Re: [RELEASE] Apache Cassandra 1.1.5 released

2012-09-11 Thread André Cruz
I'm also having "AssertionError"s.

ERROR [ReadStage:51687] 2012-09-10 14:33:54,211 AbstractCassandraDaemon.java 
(line 134) Exception in thread Thread[ReadStage:51687,5,main]
java.io.IOError: java.io.EOFException
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:64)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:66)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:78)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:63)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
at org.apache.cassandra.db.Table.getRow(Table.java:378)
at 
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
at 
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:816)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1250)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
at 
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:324)
at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:398)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:380)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:54)
... 14 more
ERROR [ReadStage:51801] 2012-09-10 14:44:38,852 AbstractCassandraDaemon.java 
(line 134) Exception in thread Thread[ReadStage:51801,5,main]
java.lang.AssertionError: DecoratedKey(12064825934064381804725403203980154559, 
0bc7e1c580001170726573656e746174696f6e5f707074200017696d6167652f782d706f727461626c652d7069786d617000420013000102487fff8001000469636f6e04c90f4f6527560007554e4b4e4f574e000a746578742f782d74657800420013000102487fff8001000469636f6e04c90f8a0b80ac0007554e4b4e4f574e00466170706c69636174696f6e2f766e642e6f70656e786d6c666f726d6174732d6f696365646f63756d656e742e70726573656e746174696f6e6d6c2e736c69646573686f77004c0013000102487fff8001000469636f6e04c90bc7e19a88001170726573656e746174696f6e5f7070732a746578742f782d632b2b00420013000102487fff8001000469636f6e04c90f4e902aaa0007554e4b4e4f574e000c696d6167652f782d78706d6900420013000102487fff8001000469636f6e04c90f4b8360f20007554e4b4e4f574e0013696d6167652f782d77696e646f77732d626d7000440013000102487fff8001000469636f6e04c90bc7de8969696d6167655f626d7000156170706c69636174696f6e2f782d646f736578656300490013000102487fff8001000469636f6e04c90bc7dd973e61706c69636174696f6e5f6578650009766964656f2f64766400430013000102487fff8001000469636f6e04c90bc7e07598746578745f766f620008746578742f63737300430013000102487fff8001000469636f6e04c90bc7e07d68746578745f637373001d6170706c69636174696f6e2f782d73686f636b776176652d666c61736800440013000102487fff8001000469636f6e04c90bc7deb079766964656f5f737766000a746578742f782d61776b00420013000102487fff8001000469636f6e04c9117d73ced50007554e4b4e4f574e00186170706c69636174696f6e2f766e642e6d732d657863656c00430013000102487fff8001000469636f6e04c90bc7df19e80008746578745f786c73000f766964656f2f717569636b74696d6500)
 != DecoratedKey(121031529647353036275964125031804748412, 
6170706c69636174696f6e2f7a6970) in 
/var/lib/cass

Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Rudolf van der Leeden
>
> Which version of Cassandra has your data been created initially with?
> A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables
> and inter-level overlaps in CFs with Leveled Compaction. Your sstables
> generated with 1.1.3 and later should not have this issue [1] [2].
> In case you have old Leveled-compacted sstables (generated with 1.1.2
> or earlier. including 1.0.x) you need to run offline scrub using
> Cassandra 1.1.4 or later via /bin/sstablescrub command so it'll fix
> out-of-order sstables and inter-level overlaps caused by previous
> versions of LCS. You need to take nodes down in order to run offline
> scrub.
>

The data was originally created on a 1.1.2 cluster with STCS (i.e. NOT
leveled compaction).
After the upgrade to 1.1.4 we changed from STCS to LCS w/o problems.
Then we ran more tests and created more and very big keys with millions of
columns.
The assertion only shows up with one particular CF containing these big
keys.
So, from your explanation, I don't think an offline scrub will help.

Thanks,
-Rudolf.


Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Omid Aladini
Could you, as Aaron suggested, open a ticket?

-- Omid

On Tue, Sep 11, 2012 at 2:35 PM, Rudolf van der Leeden
 wrote:
>> Which version of Cassandra has your data been created initially with?
>> A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables
>> and inter-level overlaps in CFs with Leveled Compaction. Your sstables
>> generated with 1.1.3 and later should not have this issue [1] [2].
>> In case you have old Leveled-compacted sstables (generated with 1.1.2
>> or earlier. including 1.0.x) you need to run offline scrub using
>> Cassandra 1.1.4 or later via /bin/sstablescrub command so it'll fix
>> out-of-order sstables and inter-level overlaps caused by previous
>> versions of LCS. You need to take nodes down in order to run offline
>> scrub.
>
>
> The data was originally created on a 1.1.2 cluster with STCS (i.e. NOT
> leveled compaction).
> After the upgrade to 1.1.4 we changed from STCS to LCS w/o problems.
> Then we ran more tests and created more and very big keys with millions of
> columns.
> The assertion only shows up with one particular CF containing these big
> keys.
> So, from your explanation, I don't think an offline scrub will help.
>
> Thanks,
> -Rudolf.
>


Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Rudolf van der Leeden
> Could you, as Aaron suggested, open a ticket?
>

Done:  https://issues.apache.org/jira/browse/CASSANDRA-4644


Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-11 Thread Jonathan Ellis
Relatedly, I'd love to learn how to reliably reproduce full GC pauses
on C* 1.1+.

On Mon, Sep 10, 2012 at 12:37 PM, Oleg Dulin  wrote:
> I am currently profiling a Cassandra 1.1.1 set up using G1 and JVM 7.
>
> It is my feeble attempt to reduce Full GC pauses.
>
> Has anyone had any experience with this ? Anyone tried it ?
>
> --
> Regards,
> Oleg Dulin
> NYC Java Big Data Engineer
> http://www.olegdulin.com/
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: JVM 7, Cass 1.1.1 and G1 garbage collector

2012-09-11 Thread Shahryar Sedghi
I was able to run IBM Java 7 with Cassandra (I could not do it with 1.6
because of snappy). It has a new garbage collection policy (called
"balanced") that is good for very large heap sizes (over 8 GB),
documented here; it looks promising for Cassandra. I have not tried it
yet, but I would like to see it in action.

Regards

Shahryar

On Mon, Sep 10, 2012 at 1:37 PM, Oleg Dulin  wrote:

> I am currently profiling a Cassandra 1.1.1 set up using G1 and JVM 7.
>
> It is my feeble attempt to reduce Full GC pauses.
>
> Has anyone had any experience with this ? Anyone tried it ?
>
> --
> Regards,
> Oleg Dulin
> NYC Java Big Data Engineer
> http://www.olegdulin.com/
>
>
>


Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Janne Jalkanen

> A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables
> and inter-level overlaps in CFs with Leveled Compaction. Your sstables
> generated with 1.1.3 and later should not have this issue [1] [2].

Does this mean that LCS on 1.0.x should be considered unsafe to use? I'm using 
them for semi-wide frequently-updated CounterColumns and they're performing 
much better on LCS than on STCS.

> In case you have old Leveled-compacted sstables (generated with 1.1.2
> or earlier. including 1.0.x) you need to run offline scrub using
> Cassandra 1.1.4 or later via /bin/sstablescrub command so it'll fix
> out-of-order sstables and inter-level overlaps caused by previous
> versions of LCS. You need to take nodes down in order to run offline
> scrub.

The 1.1.5 README does not mention this. Should it?

/Janne



Compound Keys: Connecting the dots between CQL3 and Java APIs

2012-09-11 Thread Brian O'Neill
Our data architects (ex-Oracle DBA types) are jumping on the CQL3
bandwagon and creating schemas for us.  That triggered me to write a
quick article mapping the CQL3 schemas to how they are accessed via
Java APIs (for our dev team).

I hope others find this useful as well:
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
Apache Cassandra MVP
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: replace_token code?

2012-09-11 Thread aaron morton
This looks correct…

>  INFO [GossipStage:1] 2012-09-10 08:01:23,036 Gossiper.java (line 850) Node 
> /10.72.201.80 is now part of the cluster

>  INFO [GossipStage:1] 2012-09-10 08:01:23,037 Gossiper.java (line 816) 
> InetAddress /10.72.201.80 is now UP
>  
80 joined the ring because it was in the stored ring state. 

> INFO [GossipStage:1] 2012-09-10 08:01:23,038 StorageService.java (line 1126) 
> Nodes /10.72.201.80 and /10.190.221.204 have the same token 
> 166594924822352415786406422619018814804.  Ignoring /10.72.201.80
New node took ownership

>  INFO [GossipTasks:1] 2012-09-10 08:01:32,967 Gossiper.java (line 830) 
> InetAddress /10.72.201.80 is now dead.
>  INFO [GossipTasks:1] 2012-09-10 08:01:53,976 Gossiper.java (line 644) 
> FatClient /10.72.201.80 has been silent for 3ms, removing from gossip
Old node marked as dead and the process to remove is started. 

Has the 80 node reappeared in the logs?

If it does, can you include the output from nodetool gossipinfo?

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/09/2012, at 5:59 AM, Yang  wrote:

> Thanks Jim, looks I'll have to read into the code to understand what is 
> happening under the hood
> 
> yang
> 
> On Mon, Sep 10, 2012 at 9:45 AM, Jim Cistaro  wrote:
> We have seen various issues from these replaced nodes hanging around.  For 
> clusters where a lot of nodes have been replaced, we see these replaced nodes 
> having an impact on heap/GC and a lot of tcp timeouts/retransmits (because 
> the old nodes no longer exist).  As a result, we have begun cleaning these up 
> using unsafeAssassinateEndpoint via jmx.  We have only started using 
> recently.  So far no bad side effects.  This also helps because those 
> replaced nodes can appear as "unreachable nodes" wrt schema and sometimes 
> prevent things like CF truncation.
> 
> Using unsafeAssassinateEndpoint will clean these from unreachable nodes and 
> will mark them as LEFT in gossip info.  There is a ttl for them in gossipinfo 
> and they should go away after 3 days.  Once they are marked LEFT, you should 
> stop seeing those up/same/dead messages.
> 
> unsafeAssassinateEndpoint is "unsafe" in that, if you specify IP of a real 
> node in cluster, that node will be assassinated.  Otherwise, if you specify 
> nodes that have been replaced, it is supposed to work correctly.
> 
> Hope this helps,
> jc
> 
> 
> From: Yang 
> Reply-To: 
> Date: Mon, 10 Sep 2012 01:10:56 -0700
> To: 
> Subject: replace_token code?
> 
> it looks that by specifying replace_token, the old owner is not removed from 
> gossip (which I had thought it would do).
> Then it's understandable that the old owner would resurface later and we get 
> some warning saying that the same token is owned by both.
> 
> 
> I ran an example with a 2-node cluster, with RF=2.  host 10.72.201.80 was 
> running for a while and had some data, then i shut it down, and 
> booted up 10.190.221.204 with replace_token of the old token owned by the 
> previous host.
> the following log sequence shows that the new host does acquire the token, 
> but it does not at the same time remove 80 forcefully from gossip.
> instead, a few seconds later, it believed that .80 became live again.
> I don't have much understanding of the Gossip protocol, but roughly know that 
> it's probability-based, looks we need an "assertive"/"NOW" 
> membership control message for replace_token.
> 
> 
> 
> 
> thanks
> yang
> 
> 
>  WARN [main] 2012-09-10 08:00:21,855 TokenMetadata.java (line 160) Token 
> 166594924822352415786406422619018814804 changing ownership from /10.72.201.80 
> to /10.190.221.204
>  INFO [main] 2012-09-10 08:00:21,855 StorageService.java (line 753) JOINING: 
> Starting to bootstrap...
>  INFO [CompactionExecutor:2] 2012-09-10 08:00:21,875 CompactionTask.java 
> (line 109) Compacting 
> [SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-1-Data.db'),
>  
> SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-3-Data.db'),
>  
> SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-4-Data.db'),
>  
> SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-2-Data.db')]
>  INFO [CompactionExecutor:2] 2012-09-10 08:00:21,979 CompactionTask.java 
> (line 221) Compacted to 
> [/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-5-Data.db,].  
> 499 to 394 (~78% of original) bytes for 3 keys at 0.003997MB/s.  Time: 94ms.
>  INFO [Thread-4] 2012-09-10 08:00:22,070 StreamInSession.java (line 214) 
> Finished streaming session 1 from /10.72.102.61
>  INFO [main] 2012-09-10 08:00:22,073 ColumnFamilyStore.java (line 643) 
> Enqueuing flush of Memtable-LocationInfo@30624226(77/96 serialized/live 
> bytes, 2 ops)
>  INFO [FlushWriter:2] 2012-09-10 08:00:22,074 Memtable.java (line 266) 
> Writing Memtable-LocationInfo@30624226(77/96 serialized/live bytes, 2 ops)
>  INFO [F

Re: Cassandra 1.1.1 on Java 7

2012-09-11 Thread Oleg Dulin

So, my experiment didn't quite work out.

I was hoping to use the G1 collector to minimize pauses. The pauses
didn't really go away, but what's worse, I think the memtable memory
calculations are driven by CMS, so my memtables would fill up and cause
Cassandra to run out of heap :(



On 2012-09-09 19:04:41 +, Jeremy Hanna said:

Starting with 1.6.0_34, you'll need xss set to 180k.  It's updated with 
the forthcoming 1.1.5 as well as the next minor rev of 1.0.x (1.0.12).

https://issues.apache.org/jira/browse/CASSANDRA-4631
See also the comments on 
https://issues.apache.org/jira/browse/CASSANDRA-4602 for the reference 
to what required a higher stack.


On Sep 9, 2012, at 12:47 PM, Christopher Keller  wrote:

This is necessary under the later 1.6 versions (1.6.0_35) as well. Nodetool
will show the cluster as being down even though individual nodes will
be up.


--Chris


On Sep 9, 2012, at 7:13 AM, dong.yajun  wrote:

After running for a while, you should set -Xss to more than 160k when
using JDK 1.7.

On Sun, Sep 9, 2012 at 3:39 AM, Peter Schuller  wrote:

Has anyone tried running 1.1.1 on Java 7?


Have been running jdk 1.7 on several clusters on 1.1 for a while now.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)



--
Ric Dong
Newegg Ecommerce, MIS department

--
"The downside of being better than everyone else is that people tend to 
assume you're pretentious."



--
Regards,
Oleg Dulin
NYC Java Big Data Engineer
http://www.olegdulin.com/




Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Omid Aladini
On Tue, Sep 11, 2012 at 8:33 PM, Janne Jalkanen
 wrote:
>
>> A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables
>> and inter-level overlaps in CFs with Leveled Compaction. Your sstables
>> generated with 1.1.3 and later should not have this issue [1] [2].
>
> Does this mean that LCS on 1.0.x should be considered unsafe to
> use? I'm using them for semi-wide frequently-updated CounterColumns
> and they're performing much better on LCS than on STCS.

That's true. "Unsafe" in the sense that your data might not be in the
right shape with respect to order of keys in sstables and LCS's
properties and you might need to offline-scrub when you upgrade to the
latest 1.1.x.

>> In case you have old Leveled-compacted sstables (generated with 1.1.2
>> or earlier. including 1.0.x) you need to run offline scrub using
>> Cassandra 1.1.4 or later via /bin/sstablescrub command so it'll fix
>> out-of-order sstables and inter-level overlaps caused by previous
>> versions of LCS. You need to take nodes down in order to run offline
>> scrub.
>
> The  1.1.5 README does not mention this. Should it?

The fix was released on 1.1.3 (LCS fix) and 1.1.4 (offline scrub) and
I agree it would be helpful to have it on NEWS.txt.

Cheers,
Omid

> /Janne
>


Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Mikhail Panchenko
Based on the steps outlined here:
https://issues.apache.org/jira/browse/CASSANDRA-4644?focusedCommentId=13453156&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13453156
it seems that LCS was not used until after 1.1.4, and they were able to do a
full repair/cleanup/compact cycle on 1.1.4 before running into problems.

I don't see any major bugfixes for LCS in 1.1.5 either, so this appears to
be a legitimate bug if the timeline is correct.

On Tue, Sep 11, 2012 at 2:50 PM, Omid Aladini  wrote:

> On Tue, Sep 11, 2012 at 8:33 PM, Janne Jalkanen
>  wrote:
> >
> >> A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables
> >> and inter-level overlaps in CFs with Leveled Compaction. Your sstables
> >> generated with 1.1.3 and later should not have this issue [1] [2].
> >
> > Does this mean that LCS on 1.0.x should be considered unsafe to
> > use? I'm using them for semi-wide frequently-updated CounterColumns
> > and they're performing much better on LCS than on STCS.
>
> That's true. "Unsafe" in the sense that your data might not be in the
> right shape with respect to order of keys in sstables and LCS's
> properties and you might need to offline-scrub when you upgrade to the
> latest 1.1.x.
>
> >> In case you have old Leveled-compacted sstables (generated with 1.1.2
> >> or earlier. including 1.0.x) you need to run offline scrub using
> >> Cassandra 1.1.4 or later via /bin/sstablescrub command so it'll fix
> >> out-of-order sstables and inter-level overlaps caused by previous
> >> versions of LCS. You need to take nodes down in order to run offline
> >> scrub.
> >
> > The  1.1.5 README does not mention this. Should it?
>
> The fix was released on 1.1.3 (LCS fix) and 1.1.4 (offline scrub) and
> I agree it would be helpful to have it on NEWS.txt.
>
> Cheers,
> Omid
>
> > /Janne
> >
>


Re: replace_token code?

2012-09-11 Thread Yang
Replied inline. Thanks,
Yang


I thought the very first log line already shows ownership being acquired,
rather than later in the sequence?


 WARN [main] 2012-09-10 08:00:21,855 TokenMetadata.java (line 160) Token
166594924822352415786406422619018814804 changing ownership from /
10.72.201.80 to /10.190.221.204



On Tue, Sep 11, 2012 at 1:55 PM, aaron morton wrote:

> This looks correct…
>
>   INFO [GossipStage:1] 2012-09-10 08:01:23,036 Gossiper.java (line 850)
>> Node /10.72.201.80 is now part of the cluster
>>
>  INFO [GossipStage:1] 2012-09-10 08:01:23,037 Gossiper.java (line 816)
>> InetAddress /10.72.201.80 is now UP
>>
>
>>
> 80 joined the ring because it was in the stored ring state.
>


This is where I was having a doubt: instead of being allowed to come back
from the "stored ring state", 80 should be purged from ring membership
immediately after the first log line, which claims to have acquired
ownership. It's true that token ownership and ring membership are
orthogonal things, but here an explicit "taking over token" operation
implies that the old owner must be dead and should be kicked out of the
ring. Granted, the later detection of duplicate ownership will kick the
old node out, but I think it leaves room for uncertainty before the
duplication is detected.
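The behaviour Yang is arguing for might be sketched like this. This is hypothetical code, not the actual replace_token implementation; it just shows a takeover that evicts the old owner in the same step instead of waiting for duplicate-token detection:

```python
# Hypothetical: token takeover that purges the old owner immediately,
# closing the window in which gossip can resurrect it.

def replace_token(token_map, members, token, new_owner):
    old_owner = token_map.get(token)
    token_map[token] = new_owner       # ownership changes...
    members.add(new_owner)
    if old_owner is not None:
        members.discard(old_owner)     # ...and the old owner is evicted now
    return token_map, members

tokens = {166594924822352415786406422619018814804: "10.72.201.80"}
members = {"10.72.201.80", "10.72.102.61"}
replace_token(tokens, members,
              166594924822352415786406422619018814804, "10.190.221.204")
print("10.72.201.80" in members)  # False: no resurrection window
```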


>
> INFO [GossipStage:1] 2012-09-10 08:01:23,038 StorageService.java (line
>> 1126) Nodes /10.72.201.80 and /10.190.221.204 have the same token
>> 166594924822352415786406422619018814804.  Ignoring /10.72.201.80
>>
> New node took ownership
>
>  INFO [GossipTasks:1] 2012-09-10 08:01:32,967 Gossiper.java (line 830)
>> InetAddress /10.72.201.80 is now dead.
>>  INFO [GossipTasks:1] 2012-09-10 08:01:53,976 Gossiper.java (line 644)
>> FatClient /10.72.201.80 has been silent for 3ms, removing from gossip
>>
> Old node marked as dead and the process to remove is started.
>
> Has the 80 node re appeared in the logs ?
>
no,

>
> If it does can you include the output from nodetool gossipinfo ?
>






>
> Cheers
>
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 11/09/2012, at 5:59 AM, Yang  wrote:
>
> Thanks Jim, looks I'll have to read into the code to understand what is
> happening under the hood
>
> yang
>
> On Mon, Sep 10, 2012 at 9:45 AM, Jim Cistaro  wrote:
>
>>  We have seen various issues from these replaced nodes hanging around.
>>  For clusters where a lot of nodes have been replaced, we see these
>> replaced nodes having an impact on heap/GC and a lot of tcp
>> timeouts/retransmits (because the old nodes no longer exist).  As a result,
>> we have begun cleaning these up using unsafeAssassinateEndpoint via jmx.
>>  We have only started using recently.  So far no bad side effects.  This
>> also helps because those replaced nodes can appear as "unreachable nodes"
>> wrt schema and sometimes prevent things like CF truncation.
>>
>>  Using unsafeAssassinateEndpoint will clean these from unreachable nodes
>> and will mark them as LEFT in gossip info.  There is a ttl for them in
>> gossipinfo and they should go away after 3 days.  Once they are marked
>> LEFT, you should stop seeing those up/same/dead messages.
>>
>>  unsafeAssassinateEndpoint is "unsafe" in that, if you specify IP of a
>> real node in cluster, that node will be assassinated.  Otherwise, if you
>> specify nodes that have been replaced, it is supposed to work correctly.
>>
>>  Hope this helps,
>> jc
>>
>>
>>   From: Yang 
>> Reply-To: 
>> Date: Mon, 10 Sep 2012 01:10:56 -0700
>> To: 
>> Subject: replace_token code?
>>
>>  it looks that by specifying replace_token, the old owner is not removed
>> from gossip (which I had thought it would do).
>> Then it's understandable that the old owner would resurface later and we
>> get some warning saying that the same token is owned by both.
>>
>>
>>  I ran an example with a 2-node cluster, with RF=2.  host 10.72.201.80
>> was running for a while and had some data, then i shut it down, and
>> booted up 10.190.221.204 with replace_token of the old token owned by the
>> previous host.
>> the following log sequence shows that the new host does acquire the
>> token, but it does not at the same time remove 80 forcefully from gossip.
>> instead, a few seconds later, it believed that .80 became live again.
>> I don't have much understanding of the Gossip protocol, but roughly know
>> that it's probability-based, looks we need an "assertive"/"NOW"
>> membership control message for replace_token.
>>
>>
>>
>>
>>  thanks
>> yang
>>
>>
>>   WARN [main] 2012-09-10 08:00:21,855 TokenMetadata.java (line 160)
>> Token 166594924822352415786406422619018814804 changing ownership from /
>> 10.72.201.80 to /10.190.221.204
>>  INFO [main] 2012-09-10 08:00:21,855 StorageService.java (line 753)
>> JOINING: Starting to bootstrap...
>>  INFO [CompactionExecutor:2] 2012-09-10 08:00:21,875 CompactionTask.java
>> (line 109) Compacting
>> [SSTableReader(path='/mnt/cassandra/data/system

How to replace a dead *seed* node while keeping quorum

2012-09-11 Thread Edward Sargisson

Hi all,
We just ran into an interesting and unexpected situation when restarting 
a downed node.


If the downed node is a seed node, then neither of the replace-a-dead-node 
procedures works (-Dcassandra.replace_token and taking 
initial_token-1). The ring remains split.
The host is listed as a seed in the config of the other members of the 
ring. If we rename the host, it will rejoin the ring.
In other words, if the host name is on the seeds list, it appears 
that the rest of the ring refuses to bootstrap it.


This leads to a problem: if the node needs to be taken out of the seeds 
list on every working node, that requires a restart of each node, 
which means that, for short periods, the ring is missing two nodes and a 
quorum read or write (RF=3) will fail.
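The arithmetic behind this can be made explicit (a minimal sketch using the usual RF/2 + 1 quorum rule, ignoring which specific replicas hold a given key):

```python
# Quorum math for the rolling-restart problem above.

def quorum(rf):
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1

def quorum_available(rf, replicas_down):
    return rf - replicas_down >= quorum(rf)

print(quorum(3))               # 2
print(quorum_available(3, 1))  # True: one replica down is tolerable
print(quorum_available(3, 2))  # False: downed node + node being restarted
```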


Are there any useful tricks for restarting the node with the same 
hostname, or are we expected to rename the node?


Cheers,
Edward
--

Edward Sargisson

senior java developer
Global Relay

edward.sargis...@globalrelay.net 


*866.484.6630*
New York | Chicago | Vancouver | London (+44.0800.032.9829) | Singapore 
(+65.3158.1301)






Re: Number of columns per row for Composite Primary Key CQL 3.0

2012-09-11 Thread Data Craftsman 木匠
Hi Aaron,

Thanks for the suggestion, as always.  :)   I'll read your slides soon.

What does "MM" stand for? Million?

Thanks,
Charlie

On Mon, Sep 10, 2012 at 6:37 PM, aaron morton  wrote:
> In general wider rows take a bit longer to read, however different access
> patterns have different performance. I did some tests here
> http://www.slideshare.net/aaronmorton/cassandra-sf-2012-technical-deep-dive-query-performance
> and http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
>
> I would suggest 1MM cols is fine, if you get to 10MM cols per row you
> probably have gone too far. Remember the byte size of the row is also
> important; larger rows churn memory more and take longer to compact /
> repair.
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 8/09/2012, at 11:05 AM, Data Craftsman 木匠 
> wrote:
>
> Hello experts.
>
> Should I limit the number of rows per Composite Primary Key's leading
> column?
>
> I think the same wide-row good practice for the number of columns per
> row in CQL 2.0 applies here, e.g. 10M or less.
>
> Any comments will be appreciated.
>
> --
> Thanks,
>
> Charlie (@mujiang) 木匠
> ===
> Data Architect Developer 汉唐 田园牧歌DBA
> http://mujiang.blogspot.com


how to enter float value from cassandra-cli ?

2012-09-11 Thread Yuhan Zhang
Hi all,

I'm trying to manually add some double values into a column family. From
the Hector client there's a DoubleSerializer, but it looks like the cli tool
doesn't provide a way to enter floating-point values. Here's the message I
got:

[default@video] set cateogry['1']['sport'] = float('0.5');
Function 'float' not found. Available functions: bytes, integer, long, int,
lexicaluuid, timeuuid, utf8, ascii, countercolumn.

Is there a way to insert a floating-point value from the cli tool?


Thank you.

Yuhan
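One possible workaround, sketched under the assumption that Hector's DoubleSerializer stores the value as the big-endian IEEE-754 bit pattern (what Double.doubleToLongBits produces): compute the 8-byte hex form of the double in Java and feed it to the cli's bytes() function.

```java
public class DoubleToHex {
    // Hex form of a double's big-endian IEEE-754 representation, zero-padded
    // to 16 digits so it is exactly 8 bytes for the cli's bytes() function.
    static String doubleToHex(double value) {
        return String.format("%016x", Double.doubleToLongBits(value));
    }

    public static void main(String[] args) {
        System.out.println(doubleToHex(0.5));  // 3fe0000000000000
    }
}
```

With that value, something like `set cateogry['1']['sport'] = bytes('3fe0000000000000');` should store the same bytes the DoubleSerializer would write. Alternatively, setting the column family's default_validation_class to DoubleType may let the cli interpret plain values, but verify that against your Cassandra version.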


Re: [RELEASE] Apache Cassandra 1.1.5 released

2012-09-11 Thread Jason Axelson
Hi André,

That looks like something I've run into as well on previous
versions of Cassandra. Our workaround was to not drop a keyspace and
then re-use it (which we were doing as part of a test suite).

This is a related stackoverflow post:
http://stackoverflow.com/questions/11623356/cassandra-server-throws-java-lang-assertionerror-decoratedkey-decorated

Jason

On Mon, Sep 10, 2012 at 11:29 PM, André Cruz  wrote:
> I'm also having "AssertionError"s.
>
> ERROR [ReadStage:51687] 2012-09-10 14:33:54,211 AbstractCassandraDaemon.java 
> (line 134) Exception in thread Thread[ReadStage:51687,5,main]
> java.io.IOError: java.io.EOFException
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:64)
> at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:66)
> at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:78)
> at 
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
> at 
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:63)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
> at org.apache.cassandra.db.Table.getRow(Table.java:378)
> at 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
> at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:816)
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1250)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException
> at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
> at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
> at 
> org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:324)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:398)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:380)
> at 
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.(SSTableSliceIterator.java:54)
> ... 14 more
> ERROR [ReadStage:51801] 2012-09-10 14:44:38,852 AbstractCassandraDaemon.java 
> (line 134) Exception in thread Thread[ReadStage:51801,5,main]
> java.lang.AssertionError: 
> DecoratedKey(12064825934064381804725403203980154559, 
> 0bc7e1c580001170726573656e746174696f6e5f707074200017696d6167652f782d706f727461626c652d7069786d617000420013000102487fff8001000469636f6e04c90f4f6527560007554e4b4e4f574e000a746578742f782d74657800420013000102487fff8001000469636f6e04c90f8a0b80ac0007554e4b4e4f574e00466170706c69636174696f6e2f766e642e6f70656e786d6c666f726d6174732d6f696365646f63756d656e742e70726573656e746174696f6e6d6c2e736c69646573686f77004c0013000102487fff8001000469636f6e04c90bc7e19a88001170726573656e746174696f6e5f7070732a746578742f782d632b2b00420013000102487fff8001000469636f6e04c90f4e902aaa0007554e4b4e4f574e000c696d6167652f782d78706d6900420013000102487fff8001000469636f6e04c90f4b8360f20007554e4b4e4f574e0013696d6167652f782d77696e646f77732d626d7000440013000102487fff8001000469636f6e04c90bc7de8969696d6167655f626d7000156170706c69636174696f6e2f782d646f736578656300490013000102487fff8001000469636f6e04c90bc7dd973e61706c69636174696f6e5f6578650009766964656f2f64766400430013000102487fff8001000469636f6e04c90bc7e07598746578745f766f620008746578742f63737300430013000102487fff8001000469636f6e04c90bc7e07d68746578745f637373001d6170706c69636174696f6e2f782d73686f636b776176652d666c61736800440013000102487fff8001000469636f6e04c90bc7deb079766964656f5f7377

Re: nodetool connection refused

2012-09-11 Thread Manu Zhang
Problem solved. I didn't add the jmx_host and jmx_port to the VM arguments in
Eclipse. How come this is not covered in the wiki at
http://wiki.apache.org/cassandra/RunningCassandraInEclipse ? Or is that page
outdated?
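For reference, these are the usual JMX system properties from cassandra-env.sh that an Eclipse launch configuration would need (a sketch of the standard flags for that era, with the default port 7199; adjust to your setup):

```
-Dcom.sun.management.jmxremote.port=7199
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
```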

On Mon, Sep 10, 2012 at 10:11 AM, Manu Zhang wrote:

> It's more like an Eclipse issue now since I find a "0.0.0.0:7199"
> listener when executing "bin/cassandra" in terminal but none when running
> Cassandra in Eclipse.
>
>
> On Sun, Sep 9, 2012 at 12:56 PM, Manu Zhang wrote:
>
>> No, I don't find a listener whose port is 7199. Where to setup? I've been
>> experimenting on my laptop so both of them are local.
>>
>>
>> On Sun, Sep 9, 2012 at 1:28 AM, Senthilvel Rangaswamy <
>> senthil...@gmail.com> wrote:
>>
>>> What is the address of the Thrift listener? Did you put 0.0.0.0:7199?
>>>
>>> On Fri, Sep 7, 2012 at 11:53 PM, Manu Zhang wrote:
>>>
 When I run Cassandra-trunk in Eclipse, nodetool fail to connect with
 the following error
 "Failed to connect to '127.0.0.1:7199': Connection refused"
 But if I run in terminal, all will be fine.

>>>
>>>
>>>
>>> --
>>> ..Senthil
>>>
>>> "If there's anything more important than my ego around, I want it
>>>  caught and shot now."
>>> - Douglas Adams.
>>>
>>>
>>
>


Re: Astyanax InstantiationException when accessing ColumnList

2012-09-11 Thread Ran User
Oops, forgot to mention Cassandra version - 1.1.4

On Tue, Sep 11, 2012 at 5:54 AM, Ran User  wrote:

> Stuck for hours on this one, thanks in advance!
>
> -  Scala 2.9.2
> - Astyanax 1.0.6 (also tried 1.0.5)
> - Using CompositeRowKey, CompositeColumnName
> - No problem inserting into Cassandra
> - Can read a row: ColumnList.size() returns the correct count; however, any
> attempt to access the ColumnList (i.e. iterating it, getColumnByIndex(),
> getColumnByName(), etc.) will throw the following
> exception:
>
> Exception:
>
> java.lang.RuntimeException: java.lang.InstantiationException
>
> relevant stack trace:
>
> java.lang.RuntimeException: java.lang.InstantiationException:
> shops.integration.db.scalaquery.ReportingDao$MetricsLogFileCompositeColumn
> at
> com.netflix.astyanax.serializers.AnnotatedCompositeSerializer.fromByteBuffer(AnnotatedCompositeSerializer.java:136)
> at
> com.netflix.astyanax.serializers.AbstractSerializer.fromBytes(AbstractSerializer.java:40)
> at
> com.netflix.astyanax.thrift.model.ThriftColumnOrSuperColumnListImpl.constructMap(ThriftColumnOrSuperColumnListImpl.java:201)
> at
> com.netflix.astyanax.thrift.model.ThriftColumnOrSuperColumnListImpl.getColumn(ThriftColumnOrSuperColumnListImpl.java:189)
> at
> com.netflix.astyanax.thrift.model.ThriftColumnOrSuperColumnListImpl.getColumnByName(ThriftColumnOrSuperColumnListImpl.java:103)
>
> Relevant sample code:
>
> class TestCompositeColumn(@(Component @field) var logFileId: Long,
> @(Component @field) var dt: String, @(Component @field) var dk: String)
> extends Ordered[TestCompositeColumn] {
> def this() = this(0l, "", "")
> //equals, hashCode, compare all implemented
> }
>
> I've also tried this variation on the class:
>
> class TestCompositeColumn(idIn: Long, key1In: String, key2In: String)
> extends Ordered[TestCompositeColumn] {
> @Component(ordinal = 0) var id: Long = idIn
> @Component(ordinal = 1) var key1: String = key1In
> @Component(ordinal = 2) var key2: String = key2In
>
> def this() = this(0, null, null)
> //equals, hashCode, compare all implemented
> }
> val TEST_COLUMN_FAMILY = new ColumnFamily[TestRowKey, TestCompositeColumn](
> "test_column_family",
> new AnnotatedCompositeSerializer[TestRowKey](classOf[TestRowKey]),
> new
> AnnotatedCompositeSerializer[TestCompositeColumn](classOf[TestCompositeColumn]),
> BytesArraySerializer.get());
>
> var columnList = keyspace.prepareQuery(TEST_COLUMN_FAMILY)
> .getKey(TestRowKey(1l, 2012090100))
> .execute().getResult()
>
> // OK - will return 6 for example, also verified via cassandra-cli
> println(columnList.size())
>
> // ERROR - will throw exception above.  Iterating, or any type of access
> will also throw same exception
> println(columnList.getColumnByIndex(0).getStringValue())
>
> Thank you!!!
>
>
>
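One plausible cause worth checking (an assumption on my part, not confirmed by the thread): AnnotatedCompositeSerializer.fromByteBuffer instantiates the column class reflectively, and Class.newInstance() throws InstantiationException whenever the class has no accessible nullary constructor, which is exactly what happens with a non-static inner class. The stack trace names a nested class (ReportingDao$MetricsLogFileCompositeColumn), so if it is nested inside a Scala class rather than a top-level definition, moving it to the top level may fix this. The failure mode in plain Java:

```java
public class ReflectionDemo {
    class Inner {}          // non-static inner class: its constructor needs an outer instance
    static class Nested {}  // static nested class: has a true no-arg constructor

    static String instantiate(Class<?> clazz) {
        try {
            clazz.newInstance(); // what AnnotatedCompositeSerializer effectively does
            return "ok";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(instantiate(Nested.class)); // ok
        System.out.println(instantiate(Inner.class));  // InstantiationException
    }
}
```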