Brisk Unbalanced Ring

2011-07-18 Thread tamara.alexander
We're running Brisk v1 beta2 on 12 nodes in EC2: 8 Cassandra nodes in DC1 and 4 Brisk 
nodes in DC2. We wrote a few TBs of data to the cluster, and unfortunately the load is 
very unbalanced. Every key is the same size, and we are using RandomPartitioner.

There are two replicas of the data in DC1 and one replica in DC2. The load in 
DC2 makes sense (about 250GB per node). DC1 should also have about 250GB per 
node (since it holds twice the data across twice the number of nodes), but as 
can be seen below, two nodes have an inordinate amount of data and the other 
six have only about 128GB:

Address         DC   Rack  Status  State   Load       Owns    Token
                                                              148873535527910577765226390751398592512
10.2.206.127    DC1  RAC1  Up      Normal  901.6 GB   12.50%  0
10.116.230.151  DC2  RAC1  Up      Normal  258.23 GB  6.25%   10633823966279326983230456482242756608
10.110.6.237    DC1  RAC1  Up      Normal  129.08 GB  6.25%   21267647932558653966460912964485513216
10.2.38.43      DC1  RAC1  Up      Normal  128.51 GB  12.50%  42535295865117307932921825928971026432
10.114.39.110   DC2  RAC1  Up      Normal  257.32 GB  6.25%   53169119831396634916152282411213783040
10.210.27.208   DC1  RAC1  Up      Normal  128.67 GB  6.25%   63802943797675961899382738893456539648
10.207.39.230   DC1  RAC2  Up      Normal  643.14 GB  12.50%  85070591730234615865843651857942052864
10.85.157.77    DC2  RAC1  Up      Normal  256.78 GB  6.25%   95704415696513942849074108340184809472
10.2.209.240    DC1  RAC2  Up      Normal  128.96 GB  6.25%   106338239662793269832304564822427566080
10.96.74.213    DC1  RAC2  Up      Normal  128.3 GB   12.50%  127605887595351923798765477786913079296
10.194.205.155  DC2  RAC1  Up      Normal  257.15 GB  6.25%   138239711561631250781995934269155835904
10.201.194.16   DC1  RAC2  Up      Normal  129.46 GB  6.25%   148873535527910577765226390751398592512

I should also note that the first node used to have about 640GB of load, until the 
instance went down and we had to run repair on a new instance in its place.
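As a sanity check, the tokens from the ring output above can be run back through RandomPartitioner's ownership math (token space 0..2**127). A quick sketch in plain Java, with the tokens copied from the output, no Cassandra dependencies:

```java
import java.math.BigInteger;
import java.util.Arrays;

// Sketch: recompute nodetool's "Owns" column from the ring's tokens under
// RandomPartitioner. Each node owns the span from the previous token
// (wrapping at the top of the ring) up to its own token.
public class RingOwnership {
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    // Tokens copied from the nodetool ring output above.
    static final BigInteger[] TOKENS = {
        new BigInteger("0"),
        new BigInteger("10633823966279326983230456482242756608"),
        new BigInteger("21267647932558653966460912964485513216"),
        new BigInteger("42535295865117307932921825928971026432"),
        new BigInteger("53169119831396634916152282411213783040"),
        new BigInteger("63802943797675961899382738893456539648"),
        new BigInteger("85070591730234615865843651857942052864"),
        new BigInteger("95704415696513942849074108340184809472"),
        new BigInteger("106338239662793269832304564822427566080"),
        new BigInteger("127605887595351923798765477786913079296"),
        new BigInteger("138239711561631250781995934269155835904"),
        new BigInteger("148873535527910577765226390751398592512"),
    };

    /** Fraction of the ring each node owns, in sorted token order. */
    static double[] ownership(BigInteger[] tokens) {
        BigInteger[] ts = tokens.clone();
        Arrays.sort(ts);
        double[] owns = new double[ts.length];
        for (int i = 0; i < ts.length; i++) {
            BigInteger prev = ts[(i + ts.length - 1) % ts.length];
            // mod handles the wrap-around range ending at the smallest token.
            BigInteger span = ts[i].subtract(prev).mod(RING);
            owns[i] = span.doubleValue() / RING.doubleValue();
        }
        return owns;
    }

    public static void main(String[] args) {
        double[] owns = ownership(TOKENS);
        for (int i = 0; i < TOKENS.length; i++) {
            System.out.printf("%40s  %.2f%%%n", TOKENS[i], owns[i] * 100);
        }
    }
}
```

This reproduces the Owns column exactly (alternating 12.50% / 6.25%), so token placement itself is balanced for this layout; the imbalance has to be in the data files, which fits the repair history, since repair in this era can over-stream data that only later compaction reclaims.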

Any ideas why this may have happened?

Thanks,
Tamara


This message is for the designated recipient only and may contain privileged, 
proprietary, or otherwise private information. If you have received it in 
error, please notify the sender immediately and delete the original. Any other 
use of the email by you is prohibited.


RE: Need some beginner help with Eclipse+Hector with Cassandra 0.7

2011-01-11 Thread tamara.alexander
What about this logger error? I'm getting it too, and I am also running simple 
code with Hector and Eclipse:
log4j:WARN No appenders could be found for logger 
(me.prettyprint.cassandra.connection.CassandraHostRetryService).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.

I went to the site in the error, and it says: 
This occurs when the default configuration files log4j.properties and log4j.xml 
can not be found and the application performs no explicit configuration.
I don't think that's the problem, because those files are where they should be. 
Any ideas?
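One thing worth checking: log4j looks for log4j.properties on the runtime classpath, not just anywhere in the project, so in Eclipse the file needs to sit in a source folder (or be added to the run configuration's classpath). A minimal file that makes the warning go away; the appender name and pattern here are arbitrary choices:

```properties
# Minimal log4j.properties; must end up on the runtime classpath.
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
```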

Thanks!

-Original Message-
From: Nate McCall [mailto:n...@riptano.com] 
Sent: Monday, January 10, 2011 7:37 PM
To: user@cassandra.apache.org
Cc: hector-us...@googlegroups.com
Subject: Re: Need some beginner help with Eclipse+Hector with Cassandra 0.7

Add commons-pooling to the classpath (the remaining references will be
removed shortly as it is no longer actively used). An updated version
of the Hector doc will be out shortly to reflect a few minor changes -
thanks for pointing out this specifically though.

I've cc'ed hector-us...@googlegroups.com -  feel free to send
hector-specific questions here in the future.
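For anyone building with Maven, the missing jar corresponds roughly to these coordinates; the version is an assumption for 0.7-era Hector, so match whatever your Hector release actually bundles:

```xml
<!-- commons-pool: needed on the classpath by Hector at this version.
     The version number below is a guess; check your Hector release. -->
<dependency>
  <groupId>commons-pool</groupId>
  <artifactId>commons-pool</artifactId>
  <version>1.5.4</version>
</dependency>
```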


On Mon, Jan 10, 2011 at 7:19 PM, Cassy Andra  wrote:
> Hi,
>
> I'm trying to use Eclipse with Hector (latest version) to write a new row to
> Cassandra 0.7RC4. However, I keep getting a Java error. Any ideas?
>
>
> Here is the .java file:
> - - - - - -
> import me.prettyprint.cassandra.serializers.StringSerializer;
> import me.prettyprint.cassandra.service.CassandraHostConfigurator;
> import me.prettyprint.hector.api.Cluster;
> import me.prettyprint.hector.api.Keyspace;
> import me.prettyprint.hector.api.factory.HFactory;
> import me.prettyprint.hector.api.mutation.Mutator;
>
> class HelloWorldApp {
>     private static StringSerializer stringSerializer = StringSerializer.get();
>
>     public static void main(String[] args) {
>         System.out.println("Hello World!"); // Display the string.
>
>         Cluster cluster = HFactory.getOrCreateCluster("Test Cluster",
>                 new CassandraHostConfigurator("170.252.179.233:9160"));
>         Keyspace keyspace = HFactory.createKeyspace("SameerKey", cluster);
>         Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);
>         mutator.insert("cat", "Pets", HFactory.createStringColumn("says", "meow"));
>     }
> }
> - - - - - - - -
>
> Error:
>
> Hello World!
> log4j:WARN No appenders could be found for logger
> (me.prettyprint.cassandra.connection.CassandraHostRetryService).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> Exception in thread "main" java.lang.Error: Unresolved compilation problems:
> The import org.apache.commons.pool cannot be resolved
> GenericObjectPool cannot be resolved to a variable
> GenericObjectPool cannot be resolved to a variable
> GenericObjectPool cannot be resolved to a variable
>
> at me.prettyprint.cassandra.service.CassandraHost.<init>(CassandraHost.java:9)
> at me.prettyprint.cassandra.service.CassandraHostConfigurator.buildCassandraHosts(CassandraHostConfigurator.java:53)
> at me.prettyprint.cassandra.connection.HConnectionManager.<init>(HConnectionManager.java:60)
> at me.prettyprint.cassandra.service.AbstractCluster.<init>(AbstractCluster.java:62)
> at me.prettyprint.cassandra.service.AbstractCluster.<init>(AbstractCluster.java:58)
> at me.prettyprint.cassandra.service.ThriftCluster.<init>(ThriftCluster.java:17)
> at me.prettyprint.hector.api.factory.HFactory.createCluster(HFactory.java:107)
> at me.prettyprint.hector.api.factory.HFactory.getOrCreateCluster(HFactory.java:99)
> at HelloWorldApp.main(test.java:13)
>
>
> - - - - - -
>
> By the way, the Hector PDF said to add the google-collections library as a
> runtime dependency, but I couldn't find v1.0 of this b/c it's been replaced
> by Guava. I added Guava r07 to the build path. Also, instead of slf4j-api &
> slf4j-log4j 1.5.8, I'm using v1.6.1 (not sure if this matters).
>
> Any ideas?
>
>
>





Decommissioning node is causing broken pipe error

2011-05-03 Thread tamara.alexander
Hi all,

I ran decommission on a node in my 32 node cluster. After about an hour of 
streaming files to another node, I got this error on the node being 
decommissioned:
INFO [MiscStage:1] 2011-05-03 21:49:00,235 StreamReplyVerbHandler.java (line 
58) Need to re-stream file /raiddrive/MDR/MeterRecords-f-2283-Data.db to 
/10.206.63.208
ERROR [Streaming:1] 2011-05-03 21:49:01,580 DebuggableThreadPoolExecutor.java 
(line 103) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Broken pipe
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
at 
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:105)
at 
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:67)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
ERROR [Streaming:1] 2011-05-03 21:49:01,581 AbstractCassandraDaemon.java (line 
112) Fatal exception in thread Thread[Streaming:1,1,main]
java.lang.RuntimeException: java.io.IOException: Broken pipe
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
at 
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:105)
at 
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:67)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more

And this message on the node that it was streaming to:
INFO [Thread-333] 2011-05-03 21:49:00,234 StreamInSession.java (line 121) 
Streaming of file 
/raiddrive/MDR/MeterRecords-f-2283-Data.db/(98605680685,197932763967)
 progress=49016107008/99327083282 - 49% from 
org.apache.cassandra.streaming.StreamInSession@33721219 failed: requesting a 
retry.

I tried running decommission again (and running scrub + decommission), but I 
keep getting this error on the same file.

I looked at the file and saw that it is a lot bigger than all the other 
sstables: 184GB, versus about 74MB for the rest. I haven't run a major 
compaction for a while, so I'm trying to stream 658 sstables.

I'm using Cassandra 0.7.4, I have two data directories (I know that's not good 
practice...), and all my nodes are on Amazon EC2.

Any thoughts on what could be going on or how to prevent this?
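A check worth running before a decommission like this: total up the sstable data that will be streamed and compare it against free space on the receiving node's data volume. A sketch in plain Java; the path and the 1.2x safety factor (headroom for tmp files during streaming) are assumptions:

```java
import java.io.File;

// Sketch (paths are assumptions for this cluster): estimate how many bytes
// a decommission will stream by summing live sstable data files, then
// compare against free space on the volume that will receive them.
public class StreamHeadroom {
    /** Total size in bytes of all *-Data.db files directly in dataDir. */
    static long bytesToStream(File dataDir) {
        long total = 0;
        File[] files = dataDir.listFiles();
        if (files == null) return 0;
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith("-Data.db")) {
                total += f.length();
            }
        }
        return total;
    }

    /** True if free space covers the stream plus a safety margin. */
    static boolean receiverHasRoom(long streamBytes, long freeBytes,
                                   double safetyFactor) {
        return streamBytes * safetyFactor <= freeBytes;
    }

    public static void main(String[] args) {
        // Run bytesToStream on the node being decommissioned, and check
        // getUsableSpace on the receiving node's data volume.
        File dataDir = new File("/raiddrive/MDR"); // assumed data directory
        long toStream = bytesToStream(dataDir);
        long free = dataDir.getUsableSpace();
        System.out.println("bytes to stream: " + toStream
                + ", free on volume: " + free
                + ", enough room: " + receiverHasRoom(toStream, free, 1.2));
    }
}
```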

Thanks!
Tamara






RE: Decommissioning node is causing broken pipe error

2011-05-05 Thread tamara.alexander
Unfortunately, there were no messages at ERROR level:

INFO [Thread-460] 2011-05-04 21:31:14,427 StreamInSession.java (line 121) 
Streaming of file 
/raiddrive/MDR/MeterRecords-f-2264-Data.db/(98339515276,197218618166)
 progress=41536315392/98879102890 - 42% from 
org.apache.cassandra.streaming.StreamInSession@4eef9d00 failed: requesting a 
retry.
DEBUG [Thread-460] 2011-05-04 21:31:14,427 FileUtils.java (line 48) Deleting 
MeterRecords-tmp-f-3522-Data.db
DEBUG [Thread-460] 2011-05-04 21:31:16,410 IncomingTcpConnection.java (line 
125) error reading from socket; closing
java.io.IOException: No space left on device
at sun.nio.ch.FileDispatcher.pwrite0(Native Method)
at sun.nio.ch.FileDispatcher.pwrite(FileDispatcher.java:45)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:96)
at sun.nio.ch.IOUtil.write(IOUtil.java:56)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:648)
at 
sun.nio.ch.FileChannelImpl.transferFromArbitraryChannel(FileChannelImpl.java:569)
at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:603)
at 
org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:86)
at 
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91)

Not sure why we didn't think to check available disk space to begin with, but 
it would have been nice to get an error regardless.

Thanks again for your help!

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Thursday, May 05, 2011 4:54 PM
To: user@cassandra.apache.org
Subject: Re: Decommissioning node is causing broken pipe error

Could you provide some of the log messages when the receiver ran out of disk 
space ? Sounds like it should be at ERROR level.

Thanks

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 6 May 2011, at 09:16, Sameer Farooqui wrote:


Just wanted to update you guys that we turned on DEBUG level logging on the 
decommissioned node and the node receiving the decommissioned node's range. We 
did this by editing /conf/log4j-server.properties and changing 
the log4j.rootLogger to DEBUG.
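For anyone following along, the change is a single line in the stock log4j-server.properties; the appender names shown here are from the default 0.7 file and may differ in your install:

```properties
# conf/log4j-server.properties -- raise the root logger level to DEBUG.
# "stdout" and "R" are the console and rolling-file appenders in the
# default Cassandra 0.7 config; keep whatever appenders your file lists.
log4j.rootLogger=DEBUG,stdout,R
```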

We ran decommission again and saw that the receiving node was running out 
of disk space. The 184GB file was not able to fully stream to the receiving 
node.

We simply added more disk space to the receiving node and then decommission ran 
successfully.

Thanks for your help Aaron and also thanks for all those Cassandra articles on 
your blog. We found them helpful.

- Sameer
Accenture Technology Labs

On Thu, May 5, 2011 at 3:59 AM, aaron morton <aa...@thelastpickle.com> wrote:
Yes that was what I was trying to say.

thanks
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5 May 2011, at 18:52, Tyler Hobbs wrote:


On Thu, May 5, 2011 at 1:21 AM, Peter Schuller <peter.schul...@infidyne.com> wrote:
> It's no longer recommended to run nodetool compact regularly as it can mean
> that some tombstones do not get to be purged for a very long time.
I think this is a mis-typing; it used to be that major compactions
were necessary to remove tombstones, but this is no longer the case in
0.7 so that the need for major compactions is significantly lessened
or even eliminated. However, running major compactions won't cause
tombstones *not* to be removed; it's just not required *in order* for
them to be removed.

I think he was suggesting that any tombstones *left* in the large sstable 
generated by the major compaction won't be removed for a long time because that 
sstable itself will not participate in any minor compactions for a long time.  
(In general, rows in that sstable will not be merged for a long time.)

--
Tyler Hobbs
Software Engineer, DataStax
Maintainer of the pycassa Cassandra Python client library




