Re: Restore Snapshots

2015-06-26 Thread Alain RODRIGUEZ
Hi Jean,

Glad to hear it worked this way.

Some other people provided (and continue to provide) similar help to me; I
am just trying to give back to the community as much as I received from it.

See you around.

Alain

2015-06-26 8:44 GMT+02:00 Jean Tremblay :

>  Good morning,
> Alain, thank you so much. This is exactly what I needed.
>
>   In my test I had a node whose data directory had, for whatever reason,
> become corrupted. I keep my snapshots in a separate folder.
>
>  Here are the steps I took to recover my sick node:
>
>  0) Cassandra is stopped on my sick node.
> 1) I wiped out my data directory. My snapshots were kept outside this
> directory.
> 2) I modified my cassandra.yaml: I added auto_bootstrap: false. This is to
> make sure that my node does not sync with the others.
> 3) I restarted Cassandra. This step created a basic structure for my new
> data directory.
> 4) I ran the command: nodetool resetlocalschema. This recreated all the
> folders for my column families.
> 5) I stopped Cassandra on my node.
> 6) I copied my snapshots to the right location. I actually hard-linked
> them, which is very fast.
> 7) I restarted Cassandra.
>
>  That's it.
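>
>  In shell terms, the above boils down to roughly this (paths and service
> commands are illustrative; adjust them to your own setup):
>
>  # 0-1) stop Cassandra and wipe the data directory (snapshots live elsewhere)
>  sudo service cassandra stop
>  rm -rf /var/lib/cassandra/data/*
>  # 2) disable bootstrap (add the line if it is absent)
>  echo "auto_bootstrap: false" >> /etc/cassandra/cassandra.yaml
>  # 3-4) restart, then rebuild the local schema
>  sudo service cassandra start
>  nodetool resetlocalschema
>  # 5-6) stop again and hard-link the snapshot SSTables into place
>  sudo service cassandra stop
>  ln /backup/snapshots/my_ks/my_table/* /var/lib/cassandra/data/my_ks/my_table/
>  # 7) restart
>  sudo service cassandra start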
>
>  Thank you SO MUCH ALAIN for your support. You really helped me a lot.
>
> On 25 Jun 2015, at 18:37, Alain RODRIGUEZ  wrote:
>
>   Hi Jean,
>
>  Answers inline, to be exhaustive:
>
>  - how can I restore the data directory structure in order to copy my
> snapshots at the right position?
> --> Making a script to do it and testing it, I would say. Basically, under
> any table directory you have a "snapshots/snapshot_name" directory
> (snapshot_name is a timestamp if not specified, off the top of my head) and
> then your SSTables.
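>
>  For instance, an illustrative layout (keyspace, table, snapshot name, and
> file names are made up; exact directory naming varies by C* version):
>
>  data/my_ks/my_table/snapshots/my_snapshot/my_ks-my_table-ka-12-Data.db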
>
>  - is it possible to recreate the schema on one node?
> --> The easiest way that comes to my mind is to set "auto_bootstrap: false"
> on a node not already in the ring. If you have trouble with the schema of a
> node in the ring, run a "nodetool resetlocalschema".
>
>  - how can I avoid the node from streaming from the other nodes?
> --> See above (auto_bootstrap: false). BTW, the option might not be present
> in the file at all; just add it.
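>
>  A minimal sketch of the cassandra.yaml change (the option defaults to
> true when absent):
>
>  auto_bootstrap: false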
>
>  - must I also have the snapshot of the system tables in order to restore
> a node from only the snapshot of my tables?
> --> Just your user tables. Yet remember that a snapshot is per node, and as
> such you will just have the part of the data this node used to hold,
> meaning that if the new node has different tokens, there will be unused
> data + missing data for sure.
>
>  Basically, when a node is down I usually remove it, repair the cluster,
> and bootstrap it (auto_bootstrap: true). Streams are part of Cassandra; I
> accept that. Another solution would be to "replace" the node -->
> http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
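>
>  As a sketch, the replace procedure boils down to starting the fresh node
> with this JVM option (assuming a 2.0-era C*; the IP is a placeholder for
> the dead node):
>
>  JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"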
>
>
>  C*heers,
>
>  Alain
>
> 2015-06-25 17:07 GMT+02:00 Jean Tremblay <
> jean.tremb...@zen-innovations.com>:
>
>> Hi,
>>
>>  I am testing snapshot restore procedures in case of a major catastrophe
>> on our cluster. I’m using Cassandra 2.1.7 with RF=3.
>>
>>  The scenario that I am trying to solve is how to quickly get one node
>> back to work after its disk has failed and lost all its data, assuming
>> that the only thing I have is its snapshots.
>>
>>  The procedure that I’m following is the one explained here:
>> http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html
>>
>>  I can do a snapshot; that is straightforward.
>> My problem is with restoring the snapshot.
>>
>>  If I restart Cassandra with an empty data directory, the node will
>> bootstrap.
>> Bootstrap is very nice, since it recreates the schema and reloads the data
>> from its neighbours.
>> But this generates quite heavy traffic and is quite a slow process.
>>
>>  My questions are:
>>
>>  - how can I restore the data directory structure in order to copy my
>> snapshots at the right position?
>> - is it possible to recreate the schema on one node?
>> - how can I avoid the node from streaming from the other nodes?
>> - must I also have the snapshot of the system tables in order to restore
>> a node from only the snapshot of my tables?
>>
>>  Thanks for your comments.
>>
>>  Jean
>>
>>
>>
>>
>>
>


Cassandra stuck at DataSink running on cluster

2015-06-26 Thread Susanne Bülow
Hi,

 

I am trying to write into Cassandra via the CqlBulkOutputFormat from an
Apache Flink program. The program succeeds in writing into a Cassandra
cluster while running locally on my PC.

However, when trying to run the program on the cluster, it seems to get
stuck at SSTableSimpleUnsortedWriter.put(), waiting for the DiskWriter
thread, which is not running anymore.

 

I am using Cassandra version 1.5 and Apache Flink version 0.9.0.

 

Attached is the full stacktrace.

 

Thanks in advance,

Susanne


2015-06-26 11:15:35
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode):

"JMX server connection timeout 68" - Thread t@68
   java.lang.Thread.State: TIMED_WAITING
at java.lang.Object.wait(Native Method)
- waiting on <117d0002> (a [I)
at 
com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(ServerCommunicatorAdmin.java:168)
at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
- None

"RMI TCP Connection(4)-172.16.30.87" - Thread t@67
   java.lang.Thread.State: RUNNABLE
at sun.management.ThreadImpl.dumpThreads0(Native Method)
at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:446)
at sun.reflect.GeneratedMethodAccessor62.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at 
com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:193)
at 
com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:175)
at 
com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:117)
at 
com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:54)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at javax.management.StandardMBean.invoke(StandardMBean.java:405)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at sun.rmi.transport.Transport$1.run(Transport.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
- locked <4f733919> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"RMI Scheduler(0)" - Thread t@66
   java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <258b8c46> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:10

Re: [MASSMAIL]Cassandra stuck at DataSink running on cluster

2015-06-26 Thread Marcos Ortiz

Regards, Susanne.
Which version of Java are you using here?
Have you tested this with more recent versions of Cassandra?

These new versions have a lot of improvements related to SSTable reading
and writing, and much more.


I recommend that you use at least a 2.1.x version.
Best,

--
Marcos Ortiz, Sr. Product Manager (Data Infrastructure) at UCI

@marcosluis2186 

On 26/06/15 08:21, Susanne Bülow wrote:


Hi,

I am trying to write into Cassandra via the CqlBulkOutputFormat from
an Apache Flink program. The program succeeds in writing into a
Cassandra cluster while running locally on my PC.

However, when trying to run the program on the cluster, it seems to
get stuck at SSTableSimpleUnsortedWriter.put(), waiting for the
DiskWriter thread, which is not running anymore.

I am using Cassandra version 1.5 and Apache Flink version 0.9.0.

Attached is the full stacktrace.

Thanks in advance,

Susanne







Re: [MASSMAIL]Cassandra stuck at DataSink running on cluster

2015-06-26 Thread Susanne Bülow
Hi,

 

I am using Java 7. 

The Cassandra version I use is actually 2.1.5, not 1.5. Sorry for the
confusion.
I also tried Cassandra 2.1.6, but the problem remains the same.

 

Best regards,

Susanne

 

From: Marcos Ortiz [mailto:mlor...@uci.cu]
Sent: Friday, 26 June 2015 15:34
To: susanne...@gmx.de
Cc: user@cassandra.apache.org
Subject: Re: [MASSMAIL]Cassandra stuck at DataSink running on cluster

 

Regards, Susanne.
Which version of Java are you using here?
Have you tested this with more recent versions of Cassandra?

These new versions have a lot of improvements related to SSTable reading and
writing, and much more.

I recommend that you use at least a 2.1.x version.
Best,

-- 
Marcos Ortiz, Sr. Product Manager (Data Infrastructure) at UCI
@marcosluis2186

On 26/06/15 08:21, Susanne Bülow wrote:

Hi,

 

I am trying to write into Cassandra via the CqlBulkOutputFormat from an
Apache Flink program. The program succeeds in writing into a Cassandra
cluster while running locally on my PC.

However, when trying to run the program on the cluster, it seems to get
stuck at SSTableSimpleUnsortedWriter.put(), waiting for the DiskWriter
thread, which is not running anymore.

 

I am using Cassandra version 1.5 and Apache Flink version 0.9.0.

 

Attached is the full stacktrace.

 

Thanks in advance,

Susanne

 

 

 



Re: Cassandra stuck at DataSink running on cluster

2015-06-26 Thread Nathan Bijnens
I strongly disagree with recommending version 2.1.x. It only very recently
became more or less stable; anything before 2.1.5 was unusable. You might be
better off with a recent 2.0.x version.

Best regards,
  Nathan

On Fri, Jun 26, 2015 at 3:36 PM Marcos Ortiz  wrote:

>  Regards, Susanne.
> Which version of Java are you using here?
> Have you tested this with more recent versions of Cassandra?
>
> These new versions have a lot of improvements related to SSTable reading
> and writing, and much more.
>
> I recommend that you use at least a 2.1.x version.
> Best,
>
> --
> Marcos Ortiz, Sr. Product Manager (Data Infrastructure) at UCI
> @marcosluis2186 
>
>
> On 26/06/15 08:21, Susanne Bülow wrote:
>
>  Hi,
>
>
>
> I am trying to write into Cassandra via the CqlBulkOutputFormat from an
> Apache Flink program. The program succeeds in writing into a
> Cassandra cluster while running locally on my PC.
>
> However, when trying to run the program on the cluster, it seems to get
> stuck at SSTableSimpleUnsortedWriter.put(), waiting for the
> DiskWriter thread, which is not running anymore.
>
>
>
> I am using Cassandra version 1.5 and Apache Flink version 0.9.0.
>
>
>
> Attached is the full stacktrace.
>
>
>
> Thanks in advance,
>
> Susanne
>
>
>
>
>
>


Mixing incremental repair with sequential

2015-06-26 Thread Carl Hu
Dear colleagues,

We are using incremental repair and have noticed that every few repairs,
the cluster experiences pauses.

We run the repair with the following command: nodetool repair -par -inc

I have tried to run it not in parallel, but get the following error:
"It is not possible to mix sequential repair and incremental repairs."

Does anyone have any suggestions?

Many thanks in advance,
Carl


Re: Mixing incremental repair with sequential

2015-06-26 Thread Alain RODRIGUEZ
"It is not possible to mix sequential repair and incremental repairs."

I guess that is a system limitation, even if I am not sure of it (I haven't
used C* 2.1 yet).

I would focus on tuning your repair by:
- Monitoring performance / logs (see why the cluster hangs)
- Using range repairs (as a workaround to the Merkle tree 32K limit), or at
least running it per table (
http://www.datastax.com/dev/blog/advanced-repair-techniques)
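
For illustration, the valid combinations on 2.1 look roughly like this
(keyspace, table, and token values are placeholders):

nodetool repair -par -inc my_ks              # incremental repairs must be parallel
nodetool repair -par my_ks                   # full, parallel repair
nodetool repair my_ks my_table               # full, sequential (default) repair of one table
nodetool repair -st <start> -et <end> my_ks  # full repair of a token subrange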

Without knowing the root issue that makes your cluster hang, it is hard to
help you.

- If CPU is a limit, then some tuning around compactions or GC might be
needed (or a few more things)
- If you have disk I/O limitations, you might want to add machines or tune
the compaction throughput
- If your network is the issue, there are commands to tune the bandwidth
used by streams.

You need to troubleshoot this and give us more information. I hope you have
a monitoring tool up and running and an easy way to detect errors in your
logs.

C*heers,

Alain

2015-06-26 16:26 GMT+02:00 Carl Hu :

> Dear colleagues,
>
> We are using incremental repair and have noticed that every few repairs,
> the cluster experiences pauses.
>
> We run the repair with the following command: nodetool repair -par -inc
>
> I have tried to run it not in parallel, but get the following error:
> "It is not possible to mix sequential repair and incremental repairs."
>
> Does anyone have any suggestions?
>
> Many thanks in advance,
> Carl
>
>


Slow reads on C* 2.0.15 using Spark Cassandra

2015-06-26 Thread Nathan Bijnens
We are using the Spark Cassandra driver, version 1.2.0 (Spark 1.2.1),
connecting to a 6-node bare-metal (16 GB RAM, Xeon E3-1270 (8-core), 4x 7.2k
SATA disks) Cassandra cluster. Spark runs on a separate Mesos cluster.

We are running a transformation job, where we read the complete contents of
a table into Spark, do some transformations, and write them back to C*. We
are using Spark to do a data migration in C*.

Before we execute, the load on Cassandra is very low.

We notice incredibly slow reads, 600 MB in an hour; we are using LOCAL_ONE
consistency for reads.
The load_one of Cassandra increases from <1 to 60! There is no CPU wait,
only user & nice.

The table & cassandra.yaml:
https://gist.github.com/nathan-gs/908a48aed8a0eb3c3183

Anyone any idea?

Thanks,
  Nathan


Re: Slow reads on C* 2.0.15 using Spark Cassandra

2015-06-26 Thread Nate McCall
> We notice incredibly slow reads, 600 MB in an hour; we are using LOCAL_ONE
consistency for reads.
> The load_one of Cassandra increases from <1 to 60! There is no CPU wait,
only user & nice.

Without seeing the code and query, it's hard to tell, but I noticed
something similar when we had a client incorrectly using the 'take' method
for a result count, like so:
val resultCount = query.take(count).length

'take' can call limit under the hood. The docs for the latter are
interesting:
"The limit will be applied for each created Spark partition. In other
words, unless the data are fetched from a single Cassandra partition the
number of results is unpredictable." [0]

Removing that line (it wasn't necessary for the use case) and just relying
on a simple 'myRDD.select("my_col").toArray.foreach' got performance back
to where it should be. Per the docs, limit (and therefore take) works fine
as long as the partition key is used as a predicate in the where clause
("WHERE test_id = somevalue" in your example).

[0]
https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L92-L101
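
For what it's worth, a minimal Scala sketch of the distinction (connector
1.2-era API; 'keyspace' and 'someId' are placeholders):

// a full-table count without take/limit: predictable, but a full scan
val total = sc.cassandraTable(keyspace, "test").count()

// limit (and therefore take) is only predictable once the query is
// restricted to a single Cassandra partition by a partition key predicate
val page = sc.cassandraTable(keyspace, "test")
  .where("test_id = ?", someId)
  .limit(20)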

--
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: sstableloader "Could not retrieve endpoint ranges"

2015-06-26 Thread Mitch Gitman
I want to follow up on this thread to describe what I was able to get
working. My goal was to switch a cluster to vnodes, in the process
preserving the data for a single table, endpoints.endpoint_messages.
Otherwise, I could afford to start from a clean slate. As should be
apparent, I could also afford to do this within a maintenance window where
the cluster was down. In other words, I had the luxury of not having to add
a new data center to a live cluster per DataStax's documented procedure to
enable vnodes:
http://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configVnodesProduction_t.html
http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configVnodesProduction_t.html

What I got working relies on the nodetool snapshot command to create
various SSTable snapshots under
endpoints/endpoint_messages/snapshots/SNAPSHOT_NAME. The snapshots
represent the data being backed up and restored from. The backup and
restore do not work directly against the original SSTables in the various
endpoints/endpoint_messages/ directories.

   - endpoints/endpoint_messages/snapshots/SNAPSHOT_NAME/: These SSTables
   are being copied off and restored from.
   - endpoints/endpoint_messages/: These SSTables are obviously the source
   of the snapshots but are not being copied off and restored from.

Instead of using sstableloader to load the snapshots into the
re-initialized Cassandra cluster, I used the JMX StorageService.bulkLoad
command after establishing a JConsole session to each node. I copied off
the snapshots to load to a directory path that ends with
endpoints/endpoint_messages/ to give the bulk-loader a path it expects. The
directory path that is the destination for nodetool snapshot and the source
for StorageService.bulkLoad is on the same host as the Cassandra node but
outside the purview of the Cassandra node.

This procedure can be summarized as follows:
1. For each node, create a snapshot of the endpoint_messages table as a
backup.
2. Stop the cluster.
3. On each node, wipe all the data, i.e. the contents of
data_files_directories, commitlog, and saved_caches.
4. Deploy the cassandra.yaml configuration that makes the switch to vnodes
and restart the cluster to apply the vnodes change.
5. Re-create the endpoints keyspace.
6. On each node, bulk-load the snapshots for that particular node.

This summary can be reduced even further:
1. On each node, export the data to preserve.
2. On each node, wipe the data.
3. On all nodes, switch to vnodes.
4. On each node, import back in the exported data.

I'm sure this process could have been streamlined.
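
For concreteness, a sketch of the per-node export/import (snapshot tag,
paths, and token count are illustrative; bulkLoad is the JMX operation on
the org.apache.cassandra.db:type=StorageService MBean):

# export: snapshot the table, then stage the snapshot under a path that
# ends in keyspace/table/, as the bulk-loader expects
nodetool snapshot -t pre_vnodes -cf endpoint_messages endpoints
mkdir -p /backup/endpoints/endpoint_messages
cp /var/lib/cassandra/data/endpoints/endpoint_messages/snapshots/pre_vnodes/* \
   /backup/endpoints/endpoint_messages/
# after wiping data/commitlog/saved_caches, setting num_tokens: 256 in
# cassandra.yaml, restarting, and re-creating the keyspace, import by
# invoking bulkLoad("/backup/endpoints/endpoint_messages") in JConsole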

One caveat for anyone looking to emulate this: Our situation might have
been a little easier to reason about because our original endpoint_messages
table had a replication factor of 1. We used the vnodes switch as an
opportunity to up the RF to 3.

I can only speculate as to why what I was originally attempting wasn't
working. But what I was originally attempting wasn't precisely the use case
I cared about; what I'm following up with now is.

On Fri, Jun 19, 2015 at 8:22 PM, Mitch Gitman  wrote:

> I checked the system.log for the Cassandra node that I did the JConsole
> JMX session against and which had the data to load. Lots of log output
> indicating that it's busy loading the files. Lots of stack traces
> indicating a broken pipe. I have no reason to believe there are
> connectivity issues between the nodes, but verifying that is beyond my
> expertise. What's indicative is this last bit of log output:
>  INFO [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,441
> StreamReplyVerbHandler.java (line 44) Successfully sent
> /srv/cas-snapshot-06-17-2015/endpoints/endpoint_messages/endpoints-endpoint_messages-ic-34-Data.db
> to /10.205.55.101
>  INFO [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,457
> OutputHandler.java (line 42) Streaming session to /10.205.55.101 failed
> ERROR [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,458
> CassandraDaemon.java (line 253) Exception in thread Thread[Streaming to /
> 10.205.55.101:5,5,RMI Runtime]
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> at com.google.common.base.Throwables.propagate(Throwables.java:160)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Broken pipe
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:433)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:565)
> at org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:93)
> at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
> at org.apache.cassandra

Re: Slow reads on C* 2.0.15 using Spark Cassandra

2015-06-26 Thread Nathan Bijnens
Thanks for the suggestion, will take a look.

Our code looks like this:

val rdd = sc.cassandraTable[EventV0](keyspace, "test")

val transformed = rdd.map { e =>
  EventV1(e.testId, e.ts, e.channel, e.groups, e.event)
}
transformed.saveToCassandra(keyspace, "test_v1")

Not sure if this code might translate into limit calls.

The total data in this table is +/- 2 GB on disk; total data for each node
is around 290 GB.

On Fri, Jun 26, 2015 at 7:01 PM Nate McCall  wrote:

> > We notice incredibly slow reads, 600 MB in an hour; we are using
> LOCAL_ONE consistency for reads.
> > The load_one of Cassandra increases from <1 to 60! There is no CPU wait,
> only user & nice.
>
> Without seeing the code and query, it's hard to tell, but I noticed
> something similar when we had a client incorrectly using the 'take' method
> for a result count, like so:
> val resultCount = query.take(count).length
>
> 'take' can call limit under the hood. The docs for the latter are
> interesting:
> "The limit will be applied for each created Spark partition. In other
> words, unless the data are fetched from a single Cassandra partition the
> number of results is unpredictable." [0]
>
> Removing that line (it wasn't necessary for the use case) and just relying
> on a simple 'myRDD.select("my_col").toArray.foreach' got performance back
> to where it should be. Per the docs, limit (and therefore take) works fine
> as long as the partition key is used as a predicate in the where clause
> ("WHERE test_id = somevalue" in your example).
>
> [0]
> https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L92-L101
>
> --
> -
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Re: Mixing incremental repair with sequential

2015-06-26 Thread Carl Hu
Thank you, Alain, for the response. We're using 2.1 indeed. I've lowered the
compaction throughput from 18 to 10 MB/s. Will see what happens.

>  I hope you have a monitoring tool up and running and an easy way to
detect errors in your logs.

We do not have this. What do you use for this?

Thank you,
Carl


On Fri, Jun 26, 2015 at 11:26 AM, Alain RODRIGUEZ 
wrote:

> "It is not possible to mix sequential repair and incremental repairs."
>
> I guess that is a system limitation, even if I am not sure of it (I don't
> have used C*2.1 yet)
>
> I would focus on tuning your repair by :
> - Monitoring performance / logs (see why the cluster hangs)
> - Use range repairs (as a workaround to the Merkle tree 32K limit) or at
> list run it per table (
> http://www.datastax.com/dev/blog/advanced-repair-techniques)
>
> Depending on what's the root issue that makes hang your cluster it is hard
> to help you.
>
> - If CPU is a limit, then some tuning around compactions or GC might be
> needed (or a few more things)
> - if you have Disk IO limitations, you might want to add machines or tune
> compaction throughput
> - If your network is the issue, there are commands to tune the bandwidth
> used by streams.
>
> You need to troubleshot this and give us more informations. I hope you
> have a monitoring tool up and running and an easy way to detect errors on
> your logs.
>
> C*heers,
>
> Alain
>
> 2015-06-26 16:26 GMT+02:00 Carl Hu :
>
>> Dear colleagues,
>>
>> We are using incremental repair and have noticed that every few repairs,
>> the cluster experiences pauses.
>>
>> We run the repair with the following command: nodetool repair -par -inc
>>
>> I have tried to run it not in parallel, but get the following error:
>> "It is not possible to mix sequential repair and incremental repairs."
>>
>> Does anyone have any suggestions?
>>
>> Many thanks in advance,
>> Carl
>>
>>
>


Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?

2015-06-26 Thread Sid Tantia
Hello, I haven’t been able to find any documentation for best practices on
this… Is it okay to set up OpsCenter on a smaller node than the rest of the
cluster?


For instance, on AWS can I have 3 m3.medium nodes for Cassandra and 1 t2.micro 
node for OpsCenter?

Re: Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?

2015-06-26 Thread Jonathan Haddad
It doesn't need to be the same size. It's not part of the cluster.
On Fri, Jun 26, 2015 at 1:34 PM Sid Tantia 
wrote:

>  Hello, I haven’t been able to find any documentation for best practices
> on this… Is it okay to set up OpsCenter on a smaller node than the rest of
> the cluster?
>
> For instance, on AWS can I have 3 m3.medium nodes for Cassandra and 1
> t2.micro node for OpsCenter?
>
>


Re: Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?

2015-06-26 Thread arun sirimalla
Hi Sid,

I would recommend using either c3 or m3 instances for OpsCenter; for the
Cassandra nodes it depends on your use case.
You can go with either c3 or i2 instances for the Cassandra nodes, but I
would recommend running performance tests before selecting the instance
type. If your use case requires more CPU, I would recommend c3s.

On Fri, Jun 26, 2015 at 1:20 PM, Sid Tantia 
wrote:

>  Hello, I haven’t been able to find any documentation for best practices
> on this… Is it okay to set up OpsCenter on a smaller node than the rest of
> the cluster?
>
> For instance, on AWS can I have 3 m3.medium nodes for Cassandra and 1
> t2.micro node for OpsCenter?
>
>


-- 
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html


Re: Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?

2015-06-26 Thread Robert Coli
On Fri, Jun 26, 2015 at 1:20 PM, Sid Tantia 
wrote:

>  For instance, on AWS can I have 3 m3.medium nodes for Cassandra and 1
> t2.micro node for OpsCenter?
>

m3.medium is below the minimum size I would use for Cassandra doing
anything meaningful, for the record.

=Rob


Re: Mixing incremental repair with sequential

2015-06-26 Thread Alain RODRIGUEZ
Here is something I wrote some time ago:

http://planetcassandra.org/blog/interview/video-advertising-platform-teads-chose-cassandra-spm-and-opscenter-to-monitor-a-personalized-ad-experience/

Monitoring is absolutely necessary to understand what is happening in the
system. There is no magic there: once you find bottlenecks, you can think
about how to alleviate them. I would say it matters at least as much as the
design of your data models.

"I've lowered compaction threshhold from 18 to 10mb/s. Will see what
happens."
If you have no SSD and compactions are creating a bottleneck at the disk
the disk, this looks reasonable as long as the "compactions pending" metric
remains low enough.

If it is a CPU issue and you have many cores, I would advise you to try
lowering the concurrent_compactors number (by default, 1 compactor per core).
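
As a sketch, the live and the persistent ways to apply this (values are in
MB/s; the nodetool change does not survive a restart):

nodetool setcompactionthroughput 10
# cassandra.yaml equivalents, read at startup:
# compaction_throughput_mb_per_sec: 10
# concurrent_compactors: 2    (by default, one per core)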

Once again, it will depend on where the pressure is. Anyway, you might want
to try anything on one node only first. Also, change one option at a time
(or a couple that you believe would have a synergy), and monitor the
evolution.

C*heers,

Alain

2015-06-26 21:30 GMT+02:00 Carl Hu :

> Thank you, Alain, for the response. We're using 2.1 indeed. I've lowered
> the compaction throughput from 18 to 10 MB/s. Will see what happens.
>
> >  I hope you have a monitoring tool up and running and an easy way to
> detect errors in your logs.
>
> We do not have this. What do you use for this?
>
> Thank you,
> Carl
>
>
> On Fri, Jun 26, 2015 at 11:26 AM, Alain RODRIGUEZ 
> wrote:
>
>> "It is not possible to mix sequential repair and incremental repairs."
>>
>> I guess that is a system limitation, even if I am not sure of it (I don't
>> have used C*2.1 yet)
>>
>> I would focus on tuning your repair by :
>> - Monitoring performance / logs (see why the cluster hangs)
>> - Use range repairs (as a workaround to the Merkle tree 32K limit) or at
>> list run it per table (
>> http://www.datastax.com/dev/blog/advanced-repair-techniques)
>>
>> Depending on what's the root issue that makes hang your cluster it is
>> hard to help you.
>>
>> - If CPU is a limit, then some tuning around compactions or GC might be
>> needed (or a few more things)
>> - if you have Disk IO limitations, you might want to add machines or tune
>> compaction throughput
>> - If your network is the issue, there are commands to tune the bandwidth
>> used by streams.
>>
>> You need to troubleshot this and give us more informations. I hope you
>> have a monitoring tool up and running and an easy way to detect errors on
>> your logs.
>>
>> C*heers,
>>
>> Alain
>>
>> 2015-06-26 16:26 GMT+02:00 Carl Hu :
>>
>>> Dear colleagues,
>>>
>>> We are using incremental repair and have noticed that every few repairs,
>>> the cluster experiences pauses.
>>>
>>> We run the repair with the following command: nodetool repair -par -inc
>>>
>>> I have tried to run it not in parallel, but get the following error:
>>> "It is not possible to mix sequential repair and incremental repairs."
>>>
>>> Does anyone have any suggestions?
>>>
>>> Many thanks in advance,
>>> Carl
>>>
>>>
>>
>


Re: Mixing incremental repair with sequential

2015-06-26 Thread Carl Hu
Alain,

The reduction of compaction throughput is having a significant impact,
lowering response times for us, especially at the 90th percentile.

For the record, we are using AWS's i2.2xlarge instance types (these have
SSDs). We were running compaction_throughput_mb_per_sec at 18; now we are
running at 10. Latency variation for reads is hugely reduced. This is very
promising.

Thanks, Alain.

Best,
Carl


On Fri, Jun 26, 2015 at 7:40 PM, Alain RODRIGUEZ  wrote:

> Here is something I wrote some time ago:
>
>
> http://planetcassandra.org/blog/interview/video-advertising-platform-teads-chose-cassandra-spm-and-opscenter-to-monitor-a-personalized-ad-experience/
>
> Monitoring is absolutely necessary to understand what is happening in the
> system. There is no magic there: once you find bottlenecks, you can think
> about how to alleviate them. I would say it matters at least as much as
> the design of your data models.
>
> "I've lowered the compaction throughput from 18 to 10 MB/s. Will see what
> happens."
> If you have no SSD and compactions are creating a bottleneck at the disk,
> this looks reasonable as long as the "compactions pending" metric remains
> low enough.
>
> If it is a CPU issue and you have many cores, I would advise you to try
> lowering the concurrent_compactors number (by default, 1 compactor per
> core).
>
> Once again, it will depend on where the pressure is. Anyway, you might want
> to try anything on one node only first. Also, change one option at a time
> (or a couple that you believe would have a synergy), and monitor the
> evolution.
>
> C*heers,
>
> Alain
>
> 2015-06-26 21:30 GMT+02:00 Carl Hu :
>
>> Thank you, Alain, for the response. We're using 2.1 indeed. I've lowered
>> the compaction throughput from 18 to 10 MB/s. Will see what happens.
>>
>> >  I hope you have a monitoring tool up and running and an easy way to
>> detect errors in your logs.
>>
>> We do not have this. What do you use for this?
>>
>> Thank you,
>> Carl
>>
>>
>> On Fri, Jun 26, 2015 at 11:26 AM, Alain RODRIGUEZ 
>> wrote:
>>
>>> "It is not possible to mix sequential repair and incremental repairs."
>>>
>>> I guess that is a system limitation, even if I am not sure of it (I
>>> don't have used C*2.1 yet)
>>>
>>> I would focus on tuning your repair by :
>>> - Monitoring performance / logs (see why the cluster hangs)
>>> - Use range repairs (as a workaround to the Merkle tree 32K limit) or at
>>> list run it per table (
>>> http://www.datastax.com/dev/blog/advanced-repair-techniques)
>>>
>>> Depending on what's the root issue that makes hang your cluster it is
>>> hard to help you.
>>>
>>> - If CPU is a limit, then some tuning around compactions or GC might be
>>> needed (or a few more things)
>>> - if you have Disk IO limitations, you might want to add machines or
>>> tune compaction throughput
>>> - If your network is the issue, there are commands to tune the bandwidth
>>> used by streams.
>>>
>>> You need to troubleshot this and give us more informations. I hope you
>>> have a monitoring tool up and running and an easy way to detect errors on
>>> your logs.
>>>
>>> C*heers,
>>>
>>> Alain
>>>
>>> 2015-06-26 16:26 GMT+02:00 Carl Hu :
>>>
 Dear colleagues,

 We are using incremental repair and have noticed that every few
 repairs, the cluster experiences pauses.

 We run the repair with the following command: nodetool repair -par -inc

 I have tried to run it not in parallel, but get the following error:
 "It is not possible to mix sequential repair and incremental repairs."

 Does anyone have any suggestions?

 Many thanks in advance,
 Carl


>>>
>>
>