Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-05 Thread Thomas van Neerijnen
Thanks for the help Aaron.
I've checked NodeIdInfo and LocationInfo as below.
What am I looking at? I'm guessing the first row in NodeIdInfo represents
the ring with the node ids, but the second row perhaps dead nodes with old
schemas? That's a total guess, I'd be very interested to know what it and
the LocationInfo are.
If there's anything else you'd like me to check let me know, otherwise I'll
attempt your workaround later today.

[default@system] list NodeIdInfo ;
Using default limit of 100
---
RowKey: 4c6f63616c
=> (column=b10552c0-ea0f-11e0--cb1f02ccbcff, value=0a1020d2,
timestamp=1317241393645)
=> (column=e64fc8f0-595b-11e1--51be601cd0d7, value=0a1020d2,
timestamp=1329478703871)
=> (column=732d4690-a596-11e1--51be601cd09f, value=0a1020d2,
timestamp=1337860139385)
=> (column=bffd9d40-aa45-11e1--51be601cd0fe, value=0a1020d2,
timestamp=1338375234836)
=> (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
timestamp=1344414498989)
=> (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
timestamp=1345386691897)
---
RowKey: 43757272656e744c6f63616c
=> (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
timestamp=1344414498989)
=> (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
timestamp=1345386691897)

2 Rows Returned.
Elapsed time: 128 msec(s).
[default@system] list LocationInfo ;
Using default limit of 100
---
RowKey: 52696e67
=> (column=00, value=0a1080d2, timestamp=134104900)
=> (column=04a7128b6c83505dcd618720f92028f4, value=0a1020b7,
timestamp=1332360971660)
=> (column=09249249249249249249249249249249, value=0a1080cd,
timestamp=1341136002862)
=> (column=12492492492492492492492492492492, value=0a1020d3,
timestamp=1341135999465)
=> (column=1500, value=0a1060d3,
timestamp=134104671)
=> (column=1555, value=0a1020d3,
timestamp=1344530188382)
=> (column=1b6db6db6db6db6db6db6db6db6db6db, value=0a1020b1,
timestamp=1341135997643)
=> (column=1c71c71c71c71bff, value=0a1080d2,
timestamp=1317241889689)
=> (column=24924924924924924924924924924924, value=0a1060d3,
timestamp=1341135996555)
=> (column=29ff, value=0a1020d3,
timestamp=1317241534292)
=> (column=2aaa, value=0a1060d3,
timestamp=1344530187539)
=> (column=38e38e38e38e37ff, value=0a1060d3,
timestamp=1317241257569)
=> (column=38e38e38e38e38e38e38e38e38e38e38, value=0a1060d3,
timestamp=1343136501647)
=> (column=393170e0207a17d8519f0c1bfe325d51, value=0a1020d3,
timestamp=1345381375120)
=> (column=3fff, value=0a1080d3,
timestamp=134104939)
=> (column=471c71c71c71c71c71c71c71c71c71c6, value=0a1080d3,
timestamp=1343133153701)
=> (column=471c71c71c71c7ff, value=0a1080d3,
timestamp=1317241786636)
=> (column=49249249249249249249249249249249, value=0a1080d3,
timestamp=1341136002693)
=> (column=52492492492492492492492492492492, value=0a106010,
timestamp=1341136002626)
=> (column=53ff, value=0a1020d4,
timestamp=1328473688357)
=> (column=5554, value=0a1060d4,
timestamp=134104910)
=> (column=5b6db6db6db6db6db6db6db6db6db6da, value=0a1060d4,
timestamp=1332389784945)
=> (column=5b6db6db6db6db6db6db6db6db6db6db, value=0a1060d4,
timestamp=1341136001027)
=> (column=638e38e38e38e38e38e38e38e38e38e2, value=0a1060d4,
timestamp=1343125208462)
=> (column=638e38e38e38e3ff, value=0a1060d4,
timestamp=1317241257577)
=> (column=6c00, value=0a1020d3,
timestamp=134104789)
---
RowKey: 4c
=> (column=436c75737465724e616d65,
value=4d6f6e737465724d696e642050726f6420436c7573746572,
timestamp=1317241251097000)
=> (column=47656e65726174696f6e, value=50447e78, timestamp=134104152000)
=> (column=50617274696f6e6572,
value=6f72672e6170616368652e63617373616e6472612e6468742e52616e646f6d506172746974696f6e6572,
timestamp=1317241251097000)
=> (column=546f6b656e, value=2a00,
timestamp=134104214)
---
RowKey: 436f6f6b696573
=>
(column=48696e7473207075726765642061732070617274206f6620757067726164696e672066726f6d20302e362e7820746f20302e37,
value=6f68207965732c20697420746865792077657265207075726765642e,
timestamp=1317241251249)
=> (column=5072652d312e302068696e747320707572676564,
value=6f68207965732c2074686579207765726520707572676564,
timestamp=1326274339337)
---
RowKey: 426f6f747374726170
=> (column=42, value=01, timestamp=134104213)

4 Rows Returned.
Elapsed time: 34 msec(s).
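Those hex row keys and values decode to readable strings, which answers part of the "what am I looking at" question. A quick sketch in plain Python (no Cassandra needed; interpreting the 4-byte values as packed IPv4 addresses is an educated guess, not documented output):

```python
import ipaddress

def decode_key(hex_key: str) -> str:
    """Decode a hex-encoded system CF row key to ASCII."""
    return bytes.fromhex(hex_key).decode("ascii")

def decode_ip(hex_value: str) -> str:
    """The 4-byte values look like packed IPv4 addresses (assumption)."""
    return str(ipaddress.IPv4Address(bytes.fromhex(hex_value)))

# Row keys from the listings above:
for key in ("4c6f63616c", "43757272656e744c6f63616c", "52696e67",
            "436f6f6b696573", "426f6f747374726170"):
    print(key, "->", decode_key(key))
# -> Local, CurrentLocal, Ring, Cookies, Bootstrap

# A value column from the Ring row:
print(decode_ip("0a1020d2"))  # -> 10.16.32.210
```

So the first NodeIdInfo row is keyed "Local" and the second "CurrentLocal", which appears to be the history of this node's node ids versus the current one, rather than dead nodes.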

On Wed, Sep 5, 2012 at 2:42 AM, aaron morton wrote:

> Hmmm, this looks like an error in ctor for NodeId$LocalNodeIdHistory. Are
> there any other ERROR log messages?
>
> Do you see either of these two messages in the log:
> "No saved local node id, using newly generated: {}"
> or
> "Saved local node id: {}"
>
>
> Can you use cassandra-cli / cql

Re: Practical node size limits

2012-09-05 Thread Віталій Тимчишин
You can try increasing streaming throttle.

2012/9/4 Dustin Wenz 

> I'm following up on this issue, which I've been monitoring for the last
> several weeks. I thought people might find my observations interesting.
>
> Ever since increasing the heap size to 64GB, we've had no OOM conditions
> that resulted in a JVM termination. Our nodes have around 2.5TB of data
> each, and the replication factor is four. IO on the cluster seems to be
> fine, though I haven't been paying particular attention to any GC hangs.
>
> The bottleneck now seems to be the repair time. If any node becomes too
> inconsistent, or needs to be replaced, the rebuild time is over a week.
> That issue alone makes this cluster configuration unsuitable for production
> use.
>
> - .Dustin
>
> On Jul 30, 2012, at 2:04 PM, Dustin Wenz  wrote:
>
> > Thanks for the pointer! It sounds likely that's what I'm seeing. CFStats
> reports that the bloom filter size is currently several gigabytes. Is there
> any way to estimate how much heap space a repair would require? Is it a
> function of simply adding up the filter file sizes, plus some fraction of
> neighboring nodes?
> >
> > I'm still curious about the largest heap sizes that people are running
> with on their deployments. I'm considering increasing ours to 64GB (with
> 96GB physical memory) to see where that gets us. Would it be necessary to
> keep the young-gen size small to avoid long GC pauses? I also suspect that
> I may need to keep my memtable sizes small to avoid long flushes; maybe in
> the 1-2GB range.
> >
> >   - .Dustin
> >
> > On Jul 29, 2012, at 10:45 PM, Edward Capriolo 
> wrote:
> >
> >> Yikes. You should read:
> >>
> >> http://wiki.apache.org/cassandra/LargeDataSetConsiderations
> >>
> >> Essentially what it sounds like you are now running into is this:
> >>
> >> The BloomFilters for each SSTable must exist in main memory. Repair
> >> tends to create some extra data which normally gets compacted away
> >> later.
> >>
> >> Your best bet, if you need to save the data, is to temporarily raise
> >> the Xmx heap and adjust the index sampling size (if it is just test
> >> data you may want to give up and start fresh).
> >>
> >> Generally the issue with large disk configurations is that it is hard to
> >> keep a good RAM/disk ratio. Then most reads turn into disk seeks and
> >> the throughput is low. I get the vibe people believe large stripes are
> >> going to help Cassandra. The issue is that stripes generally only
> >> increase sequential throughput, but Cassandra is a random read system.
> >>
> >> How much ram/disk you need is case dependent but 1/5 ratio of RAM to
> >> disk is where I think most people want to be, unless their system is
> >> carrying SSD disks.
> >>
> >> Again, you have to keep your bloom filters in Java heap memory, so any
> >> design that tries to create a quadrillion small rows is going to have
> >> memory issues as well.
> >>
> >> On Sun, Jul 29, 2012 at 10:40 PM, Dustin Wenz 
> wrote:
> >>> I'm trying to determine if there are any practical limits on the
> amount of data that a single node can handle efficiently, and if so,
> whether I've hit that limit or not.
> >>>
> >>> We've just set up a new 7-node cluster with Cassandra 1.1.2 running
> under OpenJDK6. Each node is 12-core Xeon with 24GB of RAM and is connected
> to a stripe of 10 3TB disk mirrors (a total of 20 spindles each) and
> connected via dual SATA-3 interconnects. I can read and write around
> 900MB/s sequentially on the arrays. I started out with Cassandra tuned with
> all-default values, with the exception of the compaction throughput which
> was increased from 16MB/s to 100MB/s. These defaults will set the heap size
> to 6GB.
> >>>
> >>> Our schema is pretty simple; only 4 column families and each has one
> secondary index. The replication factor was set to four, and compression
> disabled. Our access patterns are intended to be about equal numbers of
> inserts and selects, with no updates, and the occasional delete.
> >>>
> >>> The first thing we did was begin to load data into the cluster. We
> could perform about 3000 inserts per second, which stayed mostly flat.
> Things started to go wrong around the time the nodes exceeded 800GB.
> Cassandra began to generate a lot of "mutations messages dropped" warnings,
> and was complaining that the heap was over 75% capacity.
> >>>
> >>> At that point, we stopped all activity on the cluster and attempted a
> repair. We did this so we could be sure that the data was fully consistent
> before continuing. Our mistake was probably trying to repair all of the
> nodes simultaneously - within an hour, Java terminated on one of the nodes
> with a heap out-of-memory message. I then increased all of the heap sizes
> to 8GB, and reduced the heap_newsize to 800MB. All of the nodes were
> restarted, and there was no outside activity on the cluster. I then
> began a repair on a single node. Within a few hours, it OOMed again and
> exited. I then increased
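As a rough illustration of the bloom-filter heap pressure discussed in this thread: pre-1.2 bloom filters live on heap at very roughly 10 bits per row key (an assumed ballpark figure; the row count below is made up for illustration, not taken from the thread):

```python
def bloom_filter_bytes(row_keys: int, bits_per_key: float = 10.0) -> int:
    """Approximate on-heap bloom filter footprint across a node's SSTables."""
    return int(row_keys * bits_per_key / 8)

# Illustrative: 2 billion row keys on one node
size = bloom_filter_bytes(2_000_000_000)
print(f"~{size / 1024**3:.1f} GB of heap for bloom filters alone")  # ~2.3 GB
```

With data volumes in the multi-terabyte-per-node range, a few gigabytes of filter data alone makes a 6-8GB heap tight, consistent with the OOMs reported above.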

Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-05 Thread Thomas van Neerijnen
Forgot to answer your first question. I see this:
INFO 14:31:31,896 No saved local node id, using newly generated:
92109b80-ea0a-11e1--51be601cd0af


On Wed, Sep 5, 2012 at 8:41 AM, Thomas van Neerijnen
wrote:

> Thanks for the help Aaron.
> I've checked NodeIdInfo and LocationInfo as below.
> What am I looking at? I'm guessing the first row in NodeIdInfo represents
> the ring with the node ids, but the second row perhaps dead nodes with old
> schemas? That's a total guess, I'd be very interested to know what it and
> the LocationInfo are.
> If there's anything else you'd like me to check let me know, otherwise
> I'll attempt your workaround later today.
>
> [NodeIdInfo and LocationInfo listings snipped; identical to the previous message]

Cannot bootstrap new nodes in 1.0.11 ring - schema issue

2012-09-05 Thread Jason Harvey
Hey folks,

I have a 1.0.11 ring running in production with 6 nodes. Trying to 
bootstrap a new node in, and I'm getting the following consistently:

 INFO [main] 2012-09-05 04:24:13,317 StorageService.java (line 668) 
JOINING: waiting for schema information to complete


After waiting for over 30 minutes, I restarted the node to try again, and 
got the same thing. Tried wiping out the data dir on the new node, as well. 
Same result.

Turned on DEBUG, and got the following:

 INFO [main] 2012-09-05 03:58:55,205 StorageService.java (line 668) 
JOINING: waiting for schema information to complete
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,440 
DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily 
from /10.140.128.218
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,440 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,631 
DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily 
from /10.140.128.218
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,631 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.


The logs continue with a bunch of failed migration errors from each node in 
the ring.

So, I'm guessing that there is a schema history problem on one of my nodes? 
Any clues on how I can fix this? I had considered wiping out the schema on 
one of my running nodes and starting it back up, but I'm worried it might 
not come back if it gets the same errors.


Also as a random question: is there any way to 'merge' historical schema 
changes together?


Thanks,
Jason


Re: configure KeyCache to use Non-Heap memory ?

2012-09-05 Thread Ananth Gundabattula
Hello Aaron,

Thanks a lot for the response. Raised a request 
https://issues.apache.org/jira/browse/CASSANDRA-4619

Here is the nodetool dump: (from one of the two nodes in the cluster)

Token: 0
Gossip active: true
Thrift active: true
Load : 147.64 GB
Generation No: 1346635362
Uptime (seconds) : 182707
Heap Memory (MB) : 4884.33 / 8032.00
Data Center  : datacenter1
Rack : rack1
Exceptions   : 0
Key Cache: size 777651120 (bytes), capacity 777651120 (bytes), 44354999 
hits, 98275175 requests, 0.451 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN 
recent hit rate, 0 save period in seconds


Number of rows in the 2 node cluster is 74+ Million
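For what it's worth, the 0.451 "recent hit rate" in that dump is simply hits divided by requests; a quick sanity-check sketch (the per-entry overhead used to guess the entry count is an assumption, not something nodetool reports):

```python
hits, requests = 44_354_999, 98_275_175
cache_bytes = 777_651_120

hit_rate = hits / requests
print(f"hit rate: {hit_rate:.3f}")  # 0.451, matching the nodetool info output

# Very rough entry count, assuming ~200 bytes per key cache entry (assumption):
print(f"~{cache_bytes // 200:,} entries")
```

On that assumption the cache holds on the order of a few million keys, a small fraction of the 74+ million rows in the cluster.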



Regards,
Ananth




From: aaron morton <aa...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, September 5, 2012 11:33 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: configure KeyCache to use Non-Heap memory ?

Is there any way I can configure KeyCache to use Non-Heap memory ?
No.
You could add a feature request here 
https://issues.apache.org/jira/browse/CASSANDRA

Could you post some stats on the current key cache size and hit rate ? (from 
nodetool info)
It would be interesting to know how many keys it contains vs the number of rows
on the box and the hit rate.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/09/2012, at 3:01 PM, Ananth Gundabattula
<agundabatt...@threatmetrix.com> wrote:


Is there any way I can configure KeyCache to use Non-Heap memory ?

We have large-memory nodes (~96GB of memory per node) and are effectively using
only 8GB configured for heap (to avoid GC issues because of a large heap).

We have a constraint with respect to :

 1.  Row cache models don't reflect our data query patterns, hence we can only
optimize the key cache
 2.  We are time-constrained and cannot change our schema to be more
NoSQL-specific


Regards,
Ananth



Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Martin Koch
Hi list

We have a 5-node Cassandra cluster with a single 1.0.9 installation and
four 1.0.6 installations.

We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the
instructions on http://www.datastax.com/docs/1.1/install/upgrading).

After bringing up 1.1.4 there are no errors in the log, but the cluster now
suffers from schema disagreement

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] <- The new 1.1.4 node

943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45,
10.10.145.90, 10.38.127.80] <- nodes in the old cluster

The recipe for recovering from schema disagreement (
http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the
new directory layout. The system/Schema directory is empty save for a
snapshots subdirectory. system/schema_columnfamilies and
system/schema_keyspaces contain some files. As described in datastax's
description, we tried running nodetool upgradesstables. When this had finished,
describe schema in the cli showed a schema definition which seemed correct,
but was indeed different from the schema on the other nodes in the cluster.

Any clues on how we should proceed?

Thanks,
/Martin Koch


Re: Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Edward Sargisson

I would try nodetool resetlocalschema.


On 12-09-05 07:08 AM, Martin Koch wrote:

Hi list

We have a 5-node Cassandra cluster with a single 1.0.9 installation 
and four 1.0.6 installations.


We have tried installing 1.1.4 on one of the 1.0.6 nodes (following 
the instructions on http://www.datastax.com/docs/1.1/install/upgrading).


After bringing up 1.1.4 there are no errors in the log, but the 
cluster now suffers from schema disagreement


[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] <- The new 1.1.4 node

943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 
10.10.145.90, 10.38.127.80] <- nodes in the old cluster


The recipe for recovering from schema disagreement 
(http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't 
cover the new directory layout. The system/Schema directory is empty 
save for a snapshots subdirectory. system/schema_columnfamilies and 
system/schema_keyspaces contain some files. As described in datastax's 
description, we tried running nodetool upgradesstables. When this had 
done, describe schema in the cli showed a schema definition which 
seemed correct, but was indeed different from the schema on the other 
nodes in the cluster.


Any clues on how we should proceed?

Thanks,
/Martin Koch


--

Edward Sargisson

senior java developer
Global Relay

edward.sargis...@globalrelay.net 





SurgeCon 2012

2012-09-05 Thread Chris Burroughs
Surge [1] is scalability focused conference in late September hosted in
Baltimore.  It's a pretty cool conference with a good mix of
operationally minded people interested in scalability, distributed
systems, systems level performance and good stuff like that.  You should
go! [2]

For those of you who like historical trivia, Mike Malone gave a well-received
Cassandra talk at the first SurgeCon in 2010 [3].

This year there is organised room for BoFs and such, with several
one-hour slots on Wednesday and Thursday evenings between 9 p.m. and
midnight.  Last year a few of us got together informally around
lunch time [4].

Interested in getting together again this year?  Think we have critical
mass for a BoF?

[1] http://omniti.com/surge/2012

[2] http://omniti.com/surge/2012/register

[3] http://omniti.com/surge/2010/speakers/mike-malone

[4]
http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3c4e82140a.5070...@gmail.com%3E


Re: unsubscribe

2012-09-05 Thread Rob Coli
http://wiki.apache.org/cassandra/FAQ#unsubscribe

On Wed, Aug 29, 2012 at 3:57 PM, Juan Antonio Gomez Moriano <
mori...@exciteholidays.com> wrote:

>
> --
>   *Juan Antonio Gomez Moriano*
> DEVELOPER TEAM LEADER
>
> T +61 2 8061 2917
>
> emori...@exciteholidays.com
>
> Wwww.exciteholidays.com
> A Suite 1901, 101 Grafton St, Bondi Junction, NSW 2022, Australia
>



-- 
=Robert Coli
AIM>ALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Omid Aladini
Do you see exceptions like "java.lang.UnsupportedOperationException:
Not a time-based UUID" in log files of nodes running 1.0.6 and 1.0.9?
Then it's probably due to [1] explained here [2] -- In this case you
either have to upgrade all nodes to 1.1.4 or if you prefer keeping a
mixed-version cluster, the 1.0.6 and 1.0.9 nodes won't be able to join
the cluster again, unless you temporarily upgrade them to 1.0.11.

Cheers,
Omid

[1] https://issues.apache.org/jira/browse/CASSANDRA-1391
[2] https://issues.apache.org/jira/browse/CASSANDRA-4195

On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch  wrote:
>
> Hi list
>
> We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 
> 1.0.6 installations.
>
> We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the 
> instructions on http://www.datastax.com/docs/1.1/install/upgrading).
>
> After bringing up 1.1.4 there are no errors in the log, but the cluster now 
> suffers from schema disagreement
>
> [default@unknown] describe cluster;
> Cluster Information:
>Snitch: org.apache.cassandra.locator.SimpleSnitch
>Partitioner: org.apache.cassandra.dht.RandomPartitioner
>Schema versions:
> 59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] <- The new 1.1.4 node
>
> 943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 
> 10.10.145.90, 10.38.127.80] <- nodes in the old cluster
>
> The recipe for recovering from schema disagreement 
> (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the 
> new directory layout. The system/Schema directory is empty save for a 
> snapshots subdirectory. system/schema_columnfamilies and 
> system/schema_keyspaces contain some files. As described in datastax's 
> description, we tried running nodetool upgradesstables. When this had done, 
> describe schema in the cli showed a schema definition which seemed correct, 
> but was indeed different from the schema on the other nodes in the cluster.
>
> Any clues on how we should proceed?
>
> Thanks,
> /Martin Koch


Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Venkat Rama
Hi,

We have a multi-DC Cassandra ring with 2 DCs. We use LOCAL_QUORUM for
writes and reads. The network between the DCs is sometimes flaky, with
outages lasting from a few minutes to a few tens of minutes.

I wanted to know what is the best way to measure/monitor either the lag or
replication latency between the data centers.  Are there any metrics I can
monitor to find the backlog of data that needs to be transferred?

Thanks in advance.

VR


Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
As far as I know Cassandra doesn't use an internal queueing mechanism specific
to replication. Cassandra sends the write to the remote DC and after that it's
up to the TCP/IP stack to deal with buffering. If requests start to time out,
Cassandra would use hinted handoff (HH) up to a certain time. For a longer
outage you would have to run repair.

Also look at tcp/ip tuning parameters that are helpful with your scenario:

http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

Run iperf and test the latency.

On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama wrote:

> Hi,
>
> We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
> for writes and reads.  The network we have seen between the DC is sometimes
> flaky lasting few minutes to few 10 of minutes.
>
> I wanted to know what is the best way to measure/monitor either the lag or
> replication latency between the data centers.  Are there any metrics I can
> monitor to find the backlog of data that needs to be transferred?
>
> Thanks in advance.
>
> VR
>


Re: Practical node size limits

2012-09-05 Thread Rob Coli
On Sun, Jul 29, 2012 at 7:40 PM, Dustin Wenz  wrote:
> We've just set up a new 7-node cluster with Cassandra 1.1.2 running under 
> OpenJDK6.

It's worth noting that the Cassandra project recommends the Sun JRE. Without
the Sun JRE, you might not be able to use JAMM to determine the live
ratio. Very few people use OpenJDK in production, so using it also
increases the likelihood that you might be the first to encounter a
given issue. FWIW!

=Rob

-- 
=Robert Coli
AIM>ALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Venkat Rama
Thanks for the quick reply, Mohit. Can we measure/monitor the size of
Hinted Handoffs?  Would it be a good enough indicator of my backlog?

Although we know when a network is flaky, we are interested in knowing how
much data is piling up in local DC that needs to be transferred.

Greatly appreciate your help.

VR


On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia wrote:

> As far as I know Cassandra doesn't use internal queueing mechanism
> specific to replication. Cassandra sends the write the remote DC and after
> that it's upto the tcp/ip stack to deal with buffering. If requests starts
> to timeout Cassandra would use HH upto certain time. For longer outage you
> would have to run repair.
>
> Also look at tcp/ip tuning parameters that are helpful with your scenario:
>
> http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html
>
> Run iperf and test the latency.
>
> On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama wrote:
>
>> Hi,
>>
>> We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
>> for writes and reads.  The network we have seen between the DC is sometimes
>> flaky lasting few minutes to few 10 of minutes.
>>
>> I wanted to know what is the best way to measure/monitor either the lag
>> or replication latency between the data centers.  Are there any metrics I
>> can monitor to find the backlog of data that needs to be transferred?
>>
>> Thanks in advance.
>>
>> VR
>>
>
>


Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
Cassandra exposes a lot of metrics through JConsole. You might be able to get
some information from JConsole.
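One JMX-backed number that is easy to get at from the command line is the Pending count of the HintedHandoff stage in `nodetool tpstats`. A parsing sketch, using made-up sample output (column layout assumed from 1.x-era tpstats; not real cluster data):

```python
SAMPLE_TPSTATS = """\
Pool Name                    Active   Pending      Completed
ReadStage                         0         0       98211341
MutationStage                     0         0      122949001
HintedHandoff                     1        12           3510
"""

def pending(pool: str, tpstats: str) -> int:
    """Return the Pending column for a thread-pool stage in tpstats output."""
    for line in tpstats.splitlines():
        parts = line.split()
        if parts and parts[0] == pool:
            return int(parts[2])
    raise ValueError(f"pool {pool!r} not found")

print(pending("HintedHandoff", SAMPLE_TPSTATS))  # 12
```

A growing Pending figure for that stage during a network outage would be one visible sign of hints piling up locally.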

On Wed, Sep 5, 2012 at 8:47 PM, Venkat Rama wrote:

> Thanks for the quick reply, Mohit.Can we measure/monitor the size of
> Hinted Handoffs?  Would it be a good enough indicator of my back log?
>
> Although we know when a network is flaky, we are interested in knowing how
> much data is piling up in local DC that needs to be transferred.
>
> Greatly appreciate your help.
>
> VR
>
>
> On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia wrote:
>
>> As far as I know Cassandra doesn't use internal queueing mechanism
>> specific to replication. Cassandra sends the write the remote DC and after
>> that it's upto the tcp/ip stack to deal with buffering. If requests starts
>> to timeout Cassandra would use HH upto certain time. For longer outage you
>> would have to run repair.
>>
>> Also look at tcp/ip tuning parameters that are helpful with your scenario:
>>
>> http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html
>>
>> Run iperf and test the latency.
>>
>>  On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama wrote:
>>
>>> Hi,
>>>
>>> We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
>>> for writes and reads.  The network we have seen between the DC is sometimes
>>> flaky lasting few minutes to few 10 of minutes.
>>>
>>> I wanted to know what is the best way to measure/monitor either the lag
>>> or replication latency between the data centers.  Are there any metrics I
>>> can monitor to find the backlog of data that needs to be transferred?
>>>
>>> Thanks in advance.
>>>
>>> VR
>>>
>>
>>
>


Re: Schema Disagreement after migration from 1.0.6 to 1.1.4

2012-09-05 Thread Martin Koch
Thanks, this is exactly it. We'd like to do a rolling upgrade - this is a
production cluster - so I guess we'll upgrade 1.0.6 -> 1.0.11 -> 1.1.4,
then.

/Martin

On Thu, Sep 6, 2012 at 2:35 AM, Omid Aladini  wrote:

> Do you see exceptions like "java.lang.UnsupportedOperationException:
> Not a time-based UUID" in log files of nodes running 1.0.6 and 1.0.9?
> Then it's probably due to [1] explained here [2] -- In this case you
> either have to upgrade all nodes to 1.1.4 or if you prefer keeping a
> mixed-version cluster, the 1.0.6 and 1.0.9 nodes won't be able to join
> the cluster again, unless you temporarily upgrade them to 1.0.11.
>
> Cheers,
> Omid
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-1391
> [2] https://issues.apache.org/jira/browse/CASSANDRA-4195
>
> On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch  wrote:
> >
> > Hi list
> >
> > We have a 5-node Cassandra cluster with a single 1.0.9 installation and
> four 1.0.6 installations.
> >
> > We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the
> instructions on http://www.datastax.com/docs/1.1/install/upgrading).
> >
> > After bringing up 1.1.4 there are no errors in the log, but the cluster
> now suffers from schema disagreement
> >
> > [default@unknown] describe cluster;
> > Cluster Information:
> >Snitch: org.apache.cassandra.locator.SimpleSnitch
> >Partitioner: org.apache.cassandra.dht.RandomPartitioner
> >Schema versions:
> > 59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] <- The new 1.1.4 node
> >
> > 943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45,
> 10.10.145.90, 10.38.127.80] <- nodes in the old cluster
> >
> > The recipe for recovering from schema disagreement (
> http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover
> the new directory layout. The system/Schema directory is empty save for a
> snapshots subdirectory. system/schema_columnfamilies and
> system/schema_keyspaces contain some files. Following DataStax's
> instructions, we tried running nodetool upgradesstables. When this had
> finished, describe schema in the cli showed a schema definition which seemed
> correct, but which was indeed different from the schema on the other nodes in
> the cluster.
> >
> > Any clues on how we should proceed?
> >
> > Thanks,
> > /Martin Koch
>
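[Editor's note] The "Not a time-based UUID" symptom mentioned above comes from 1.1 computing schema versions as name-based (MD5, version-3) UUIDs per CASSANDRA-1391, while 1.0.x expects time-based (version-1) UUIDs. A small illustrative Python check, using the schema version the new 1.1.4 node actually reported in the `describe cluster` output above:

```python
import uuid

# Schema version reported by the new 1.1.4 node in the thread above:
new_node = uuid.UUID("59adb24e-f3cd-3e02-97f0-5b395827453f")

# Time-based UUIDs (the kind 1.0.x expects) look like uuid1() output.
old_style = uuid.uuid1()

print(new_node.version)   # 3 -> name-based (MD5), not time-based
print(old_style.version)  # 1 -> time-based
```

A 1.0.x node trying to read a timestamp out of a version-3 UUID is what triggers the UnsupportedOperationException.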


Secondary index read/write explanation

2012-09-05 Thread Venkat Rama
Hi All,

I am a newbie to Cassandra and am trying to understand how secondary indexes
work.  I have been going over the discussion on
https://issues.apache.org/jira/browse/CASSANDRA-749 about local secondary
indexes, and an interesting question at
http://www.mail-archive.com/user@cassandra.apache.org/msg16966.html.  The
discussion seems to assume that the most common use cases are ones with range
queries.  Is this right?

I am trying to understand the low-cardinality reasoning and how a read
gets executed.  I have the following questions; hoping I can explain them
well :)

1.  When a write request is received, the data is written to the base CF and
the secondary index entry to a secondary (hidden) CF. If this is right, will
the secondary index be written locally on the node, or will it follow RP/OPP
and be distributed across nodes?
2.  When a coordinator receives a read request with, say, predicate x=y where
column x is the secondary index, how does the coordinator query the relevant
node(s)? How does it avoid sending the query to all nodes if the index is
stored locally?

If there is any article/blog that can help understand this better, please
let me know.

Thanks again in advance.

VR
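[Editor's note] A toy model (not Cassandra's actual implementation; node names and helper functions are invented for illustration) of why locally stored secondary indexes behave this way: the index entry is co-located with the base row it points to, so the write touches one node, but a query on the indexed value cannot be routed and must fan out to every node:

```python
import hashlib

NODES = ["node0", "node1", "node2"]

def owner(row_key):
    """Toy partitioner: hash the row key to pick the owning node."""
    h = int(hashlib.md5(row_key.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]

# Each node keeps its base rows plus a *local* hidden index CF.
data = {n: {"base": {}, "index": {}} for n in NODES}

def write(row_key, col, value):
    node = owner(row_key)                      # base row placed by partitioner
    data[node]["base"][row_key] = {col: value}
    # The index entry is written on the SAME node that owns the base row,
    # keyed by the indexed value -> list of local row keys.
    data[node]["index"].setdefault((col, value), []).append(row_key)

def query(col, value):
    # The coordinator cannot know which nodes hold matching rows, because
    # the index is partitioned by *base row key*, not by indexed value:
    # it has to ask every node.
    hits = []
    for node in NODES:
        hits.extend(data[node]["index"].get((col, value), []))
    return sorted(hits)

write("user1", "x", "y")
write("user2", "x", "y")
print(query("x", "y"))  # ['user1', 'user2']
```

This sketches the answer to both questions: (1) the index write is local to the node owning the base row, and (2) the coordinator cannot avoid contacting all (ranges of) nodes for an index lookup.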


Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Venkat Rama
Is there a specific metric you can recommend?

VR

On Wed, Sep 5, 2012 at 9:19 PM, Mohit Anchlia wrote:

> Cassandra exposes a lot of metrics through JMX. You might be able to
> get some of this information from JConsole.
>
>
> On Wed, Sep 5, 2012 at 8:47 PM, Venkat Rama wrote:
>
>> Thanks for the quick reply, Mohit. Can we measure/monitor the size of the
>> hinted handoffs?  Would that be a good enough indicator of my backlog?
>>
>> Although we know when a network is flaky, we are interested in knowing
>> how much data is piling up in the local DC that needs to be transferred.
>>
>> Greatly appreciate your help.
>>
>> VR
>>
>>
>> On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia wrote:
>>
>>> As far as I know Cassandra doesn't use an internal queueing mechanism
>>> specific to replication. Cassandra sends the write to the remote DC and
>>> after that it's up to the TCP/IP stack to deal with buffering. If requests
>>> start to time out, Cassandra will use hinted handoff (HH) up to a certain
>>> time. For a longer outage you would have to run repair.
>>>
>>> Also look at TCP/IP tuning parameters that can help in your
>>> scenario:
>>>
>>> http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html
>>>
>>> Run iperf and test the latency.
>>>
>>>  On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama 
>>> wrote:
>>>
 Hi,

 We have a multi-DC Cassandra ring with 2 DCs. We use LOCAL_QUORUM
 for writes and reads. The network between the DCs is sometimes flaky, with
 outages lasting from a few minutes to a few tens of minutes.

 I wanted to know what is the best way to measure/monitor either the lag
 or replication latency between the data centers.  Are there any metrics I
 can monitor to find the backlog of data that needs to be transferred?

 Thanks in advance.

 VR

>>>
>>>
>>
>
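[Editor's note] One concrete metric for the backlog question above: in 1.0/1.1, `nodetool tpstats` includes a HintedHandoff pool, and its Pending column is a rough indicator of hints waiting to be delivered to the remote DC. A hedged Python sketch (the exact column layout is an assumption based on typical tpstats output) that extracts that value:

```python
# Typical `nodetool tpstats` layout (assumed): Pool Name, Active, Pending, Completed.
SAMPLE = """\
Pool Name                    Active   Pending      Completed
ReadStage                         0         0        1396870
MutationStage                     0         0        3837060
HintedHandoff                     1        42            102
"""

def pending_hints(tpstats_output):
    """Return the Pending count of the HintedHandoff pool, or None if absent."""
    for line in tpstats_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "HintedHandoff":
            return int(fields[2])  # Pending column
    return None

print(pending_hints(SAMPLE))  # 42
```

In practice you would feed it the output of `subprocess.check_output(["nodetool", "tpstats"])` on each node and graph the value over time; a persistently growing number during a network blip is exactly the local pile-up being asked about.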


Re: Order of the cyclic group of hashed partitioners

2012-09-05 Thread Tim Wintle
On Wed, 2012-09-05 at 13:23 +1200, aaron morton wrote:
> > I believe the question is why is the maximum 2**127 and not
> > 0x

oops - I got the wrong number of digits there.

> The maximum is the size of the digest created by MD5. 

(I may be mistaken) - isn't the range of MD5 values 
0 <= hash < (2**128)
?

If you're dropping one bit to store as a signed integer to give 127 bits
of entropy then it would be in the range:

0 <= hash < (2**127)

but the range being checked is:

0 <= hash <= (2**127)
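[Editor's note] The raw range is easy to check: MD5 produces a 128-bit digest, so read as an unsigned integer it always falls in 0 <= h < 2**128, an exclusive upper bound; 2**127 is just the value with only the top ("sign") bit set. A quick illustrative check:

```python
import hashlib

def md5_int(data: bytes) -> int:
    """Interpret an MD5 digest as an unsigned 128-bit integer."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")

for sample in [b"", b"cassandra", b"token"]:
    h = md5_int(sample)
    assert 0 <= h < 2 ** 128  # full unsigned MD5 range, upper bound exclusive

# Dropping the top bit (e.g. when storing as the magnitude of a signed
# 128-bit integer) leaves 127 bits: 0 <= h < 2**127, again an exclusive
# upper bound -- not <= 2**127.
assert 0 <= md5_int(b"cassandra") % (2 ** 127) < 2 ** 127
```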

> Does that answer the question?

I meant that what the OP spotted is that it's an inclusive maximum (<=).

0 <= hash <= 2**127 gives (2**127) + 1 different values, and is
mathematically the clock-arithmetic (cyclic) group:
Z/(2**127 + 1)   [0]


I _believe_ the issue is actually the other way around in
AbstractHashedPartitioner (upper and lower bounds are exclusive) - but
the comments are incorrect.

i.e. both the code and the comments have off-by-one errors.


{{{
if (i.compareTo(ZERO) < 0)
    throw new ConfigurationException("Token must be >= 0");
if (i.compareTo(MAXIMUM) > 0)
    throw new ConfigurationException("Token must be <= 2**127");
}}}

The exception messages imply that 0 and 2**127 are both valid tokens (which
they shouldn't be).

The strict comparisons in the code accept both of those endpoint values, and
only reject tokens below 0 or above 2**127.
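[Editor's note] The off-by-one is easy to demonstrate by mirroring the two compareTo checks quoted above in Python (a sketch, not the actual partitioner code): both 0 and 2**127 pass, so the accepted set has (2**127) + 1 elements, one more than 127 bits of hash can produce.

```python
MAXIMUM = 2 ** 127

def token_accepted(i: int) -> bool:
    """Mirror of the quoted Java checks: only i < 0 and i > MAXIMUM throw."""
    if i < 0:
        return False  # would throw "Token must be >= 0"
    if i > MAXIMUM:
        return False  # would throw "Token must be <= 2**127"
    return True

assert token_accepted(0)            # inclusive lower bound
assert token_accepted(MAXIMUM)      # inclusive upper bound -- the off-by-one
assert not token_accepted(-1)
assert not token_accepted(MAXIMUM + 1)
# Number of accepted integer values: MAXIMUM + 1 == 2**127 + 1
```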

Tim
> 
[0] I believe the OP mistyped that as Z/(127+1)

> 
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 3/09/2012, at 8:20 PM, Tim Wintle  wrote:
> 
> > On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
> > > Sorry I don't understand your question. 
> > > 
> > > Can you explain it a bit more or maybe someone else knows.
> > 
> > I believe the question is why is the maximum 2**127 and not
> > 0x
> > 
> > Tim
> > 
> > > 
> > > Cheers
> > > 
> > > -
> > > Aaron Morton
> > > Freelance Developer
> > > @aaronmorton
> > > http://www.thelastpickle.com
> > > 
> > > On 27/08/2012, at 7:16 PM, Romain HARDOUIN
> > >  wrote:
> > > 
> > > > 
> > > > Thank you Aaron. 
> > > > This limit was pushed down in RandomPartitioner but the question
> > > > still exists... 
> > > > 
> > > > 
> > > > aaron morton  a écrit sur 26/08/2012
> > > > 23:35:50 :
> > > > 
> > > > > > AbstractHashedPartitioner 
> > > > > does not exist in the trunk. 
> > > > > https://git-wip-us.apache.org/repos/asf?p=cassandra.git;
> > > > > a=commitdiff;h=a89ef1ffd4cd2ee39a2751f37044dba3015d72f1
> > > > > 
> > > > > 
> > > > > Cheers
> > > > > 
> > > > > -
> > > > > Aaron Morton
> > > > > Freelance Developer
> > > > > @aaronmorton
> > > > > http://www.thelastpickle.com
> > > > > 
> > > > > On 24/08/2012, at 10:51 PM, Romain HARDOUIN
> > > > >  wrote:
> > > > > 
> > > > > > 
> > > > > > Hi, 
> > > > > > 
> > > > > > AbstractHashedPartitioner defines a maximum of 2**127 hence
> > > > > > an 
> > > > > order of (2**127)+1. 
> > > > > > I'd say that tokens of such partitioners are intented to be 
> > > > > distributed in Z/(127), hence a maximum of (2**127)-1. 
> > > > > > Could there be a mix up between maximum and order? 
> > > > > > This is a detail but could someone confirm/invalidate? 
> > > > > > 
> > > > > > Regards, 
> > > > > > 
> > > > > > Romain
> > > > > 
> > > 
> > 
> > 
> 
>