Corrupt SSTable after dropping column
Hi,

It seems like dropping a column can cause a "java.io.IOException: Corrupt empty row found in unfiltered partition" exception when existing SSTables are later compacted. This happens with all Cassandra 3.x versions and is very easy to replicate. I've created a JIRA with all the details: https://issues.apache.org/jira/browse/CASSANDRA-13337

This is extra problematic with Cassandra < 3.10, where nodes will fail to start once this happens.

Has anyone else seen this?

/ Jonas
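For context, here is a minimal sketch (in Java, against the DataStax driver) of the kind of sequence being described. The keyspace, table, and column names are illustrative assumptions, the nodetool steps are shown as comments, and the exact reproduction is in the JIRA above, not here:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class DropColumnRepro {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect();
            session.execute("CREATE KEYSPACE IF NOT EXISTS ks WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
            session.execute("CREATE TABLE IF NOT EXISTS ks.t (k int PRIMARY KEY, a int, b int)");
            // Assumption: a row whose only regular column is the one being dropped.
            session.execute("INSERT INTO ks.t (k, b) VALUES (1, 1)");
            // Outside this program: nodetool flush  (persist the row to an SSTable)
            session.execute("ALTER TABLE ks.t DROP b");
            // Outside this program: nodetool compact ks t
            // Compacting the pre-drop SSTable is where the reported
            // "Corrupt empty row found in unfiltered partition" exception appears.
            cluster.close();
        }
    }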
Re: repair performance
You changed compaction_throughput_mb_per_sec, but did you also increase concurrent_compactors?

In reference to the reaper and some other info I received on the user forum to my question on "nodetool repair", here are some useful links/slides:

https://www.datastax.com/dev/blog/repair-in-cassandra
https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/
http://www.slideshare.net/DataStax/real-world-tales-of-repair-alexander-dejanovski-the-last-pickle-cassandra-summit-2016
http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016

From: Roland Otta
Date: Friday, March 17, 2017 at 5:47 PM
To: "user@cassandra.apache.org"
Subject: Re: repair performance

did not recognize that so far. thank you for the hint. i will definitely give it a try

On Fri, 2017-03-17 at 22:32 +0100, benjamin roth wrote:

> The fork from thelastpickle is. I'd recommend giving it a try over pure nodetool.

2017-03-17 22:30 GMT+01:00 Roland Otta <roland.o...@willhaben.at>:

> forgot to mention the version we are using: we are using 3.0.7 - so i guess we should have incremental repairs by default. it also prints out incremental: true when starting a repair
>
> INFO [Thread-7281] 2017-03-17 09:40:32,059 RepairRunnable.java:125 - Starting repair command #7, repairing keyspace xxx with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [ProdDC2], hosts: [], # of ranges: 1758)
>
> 3.0.7 is also the reason why we are not using reaper ... as far as i could figure out it's not compatible with 3.0+

On Fri, 2017-03-17 at 22:13 +0100, benjamin roth wrote:

> It depends a lot ...
> - Repairs can be very slow, yes! (And unreliable, due to timeouts, outages, whatever)
> - You can use incremental repairs to speed things up for regular repairs
> - You can use "reaper" to schedule repairs and run them sliced, automated, failsafe
>
> The time repairs actually take may vary a lot depending on how much data has to be streamed or how inconsistent your cluster is. 50 mbit/s is really a bit low! The actual performance depends on so many factors like your CPU, RAM, HD/SSD, concurrency settings, and the load of the "old nodes" of the cluster. This is a quite individual problem you have to track down individually.

2017-03-17 22:07 GMT+01:00 Roland Otta <roland.o...@willhaben.at>:

> hello,
>
> we are quite inexperienced with cassandra at the moment and are playing around with a new cluster we built up for getting familiar with cassandra and its possibilities.
>
> while getting familiar with that topic we recognized that repairs in our cluster take a long time. to get an idea of our current setup, here are some numbers:
>
> our cluster currently consists of 4 nodes (replication factor 3). these nodes are all on dedicated physical hardware in our own datacenter. all of the nodes have:
> 32 cores @ 2.9 GHz
> 64 GB RAM
> 2 SSDs (RAID 0), 900 GB each, for data
> 1 separate HDD for OS + commitlogs
>
> current dataset:
> approx 530 GB per node
> 21 tables (the biggest one has more than 200 GB / node)
>
> i already tried setting compaction throughput + streaming throughput to unlimited for testing purposes ... but that did not change anything. when checking system resources i cannot see any bottleneck (cpus are pretty idle and we have no iowaits).
>
> when issuing a repair via nodetool repair -local on a node, the repair takes longer than a day. is this normal or could we normally expect a faster repair?
>
> i also recognized that initializing new nodes in the datacenter was really slow (approx 50 mbit/s). also here i expected a much better performance - could those two problems be somehow related?
>
> br // roland
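For reference, the knobs discussed above can be adjusted like this (a sketch; the values are illustrative, not recommendations):

    # Change throttles at runtime on a node; 0 means unthrottled:
    nodetool setcompactionthroughput 0
    nodetool setstreamthroughput 0

    # concurrent_compactors is set in cassandra.yaml (restart required), e.g.:
    # concurrent_compactors: 8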
Grouping time series data into blocks of time
I have a use case where a stream of time series data is coming in. Each item in the stream has a timestamp of when it was sent, and covers the activity that happened within a 5 minute timespan.

I need to group the items together into 30 minute blocks of time. E.g., say I receive the following items:

5:00 PM, 5:05 PM, 5:10 PM... 5:30 PM, 6:20 PM

I need to group the messages from 5:00 PM to 5:30 PM into one block, and put the 6:20 PM message into another block.

It seems simple enough to do if, for each message, I look up the last received message. If it was within 30 minutes, then the message goes into the current block. Otherwise, a new block is started.

My concern is about messages that arrive out of order, or are processed concurrently. Saving and reading them with Consistency=ALL would be bad for performance, and I've had issues where queries have failed due to timeouts with those settings (and timeouts can't be increased on a per-query basis).

Would it be better to use Redis, or another database, as a helper / companion to C*?

Or perhaps all messages should just be stored first, and then ~30 minutes later a job is run which gets all messages within the last 30 mins, sorts them by time, and then sorts them into blocks of time?
Re: Grouping time series data into blocks of time
If it's a sliding 30 min window, you will need to implement it yourself and keep an in-memory timestamp list, but out-of-order messages will always be a headache.

If you are ok with a fixed 30 min window (every 30 min, e.g. 5:00, 5:30, 6:00, ...), then just add a time bucket to the partition key and you are done. Out-of-order messages go into their time bucket partitions and that's it. No need to read before write and worry about consistency.

Depends on what your requirements are.
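To make the fixed-window approach concrete, here is a minimal sketch in Java. The bucket computation is the whole trick; the class and method names are illustrative, not from the thread:

    import java.time.Instant;

    public class TimeBucket {
        // Fixed 30-minute windows: every timestamp maps to the start of its window.
        static final long BUCKET_MILLIS = 30L * 60 * 1000;

        // Floor an event timestamp to the start of its 30-minute bucket.
        static Instant bucketOf(Instant eventTime) {
            return Instant.ofEpochMilli((eventTime.toEpochMilli() / BUCKET_MILLIS) * BUCKET_MILLIS);
        }

        public static void main(String[] args) {
            // 5:05 PM and 5:29 PM land in the 5:00 PM bucket; 6:20 PM in the 6:00 PM bucket.
            System.out.println(bucketOf(Instant.parse("2017-03-18T17:05:00Z"))); // 2017-03-18T17:00:00Z
            System.out.println(bucketOf(Instant.parse("2017-03-18T18:20:00Z"))); // 2017-03-18T18:00:00Z
        }
    }

The bucket then becomes the partition key, e.g. PRIMARY KEY ((bucket), event_time), so an out-of-order message simply lands in whichever partition its own timestamp dictates, with no read-before-write.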
Re: How can I scale my read rate?
+1 for executeAsync - it took me a long time to argue that it's not bad practice, as it would be with a good old RDBMS.

Sent from my Windows 10 Phone

From: Arvydas Jonusonis
Sent: Saturday, March 18, 2017 19:08
To: user@cassandra.apache.org
Subject: Re: How can I scale my read rate?

..then you're not taking advantage of request pipelining. Use executeAsync - this will increase your throughput for sure.

http://www.datastax.com/dev/blog/java-driver-async-queries

On Sat, Mar 18, 2017 at 08:00 S G wrote:

> I have enabled JMX but am not sure what metrics to look for - there are way too many of them.
> I am using session.execute(...)

On Fri, Mar 17, 2017 at 2:07 PM, Arvydas Jonusonis wrote:

> It would be interesting to see some of the driver metrics (in your stress test tool) - if you enable JMX, they should be exposed by default.
> Also, are you using session.execute(..) or session.executeAsync(..)?
Re: How can I scale my read rate?
Thanks. It seems that you guys have found executeAsync to yield good results. I want to share my understanding of how this could benefit performance; some validation from the group would be awesome.

I will call executeAsync() each time I want to get by primary key. That way, my client thread is not blocked anymore and I can submit a lot more requests per unit time. The async requests get piled on the underlying Netty I/O thread, which ensures that it stays busy. Earlier, the Netty I/O thread would have wasted some cycles while the sync execute method was processing the results, and the client thread would also have wasted some cycles waiting for the Netty thread to complete. With executeAsync(), neither of them is waiting. The only thing to ensure is that the Netty thread's queue does not grow indefinitely.

If the above theory is correct, then it sounds like a really good thing to try. If not, please do share some more details.
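On the last point (keeping the queue bounded), one common pattern is to cap the number of in-flight requests with a semaphore that each callback releases. Below is a minimal sketch against the DataStax Java driver's 3.x-era API; the keyspace name, the fooTable schema, and the 1024 cap are illustrative assumptions:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;
    import com.google.common.util.concurrent.MoreExecutors;
    import java.util.concurrent.Semaphore;

    public class BoundedAsyncReads {
        static final int MAX_IN_FLIGHT = 1024; // illustrative cap

        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace"); // hypothetical keyspace
            PreparedStatement ps = session.prepare("SELECT * FROM fooTable WHERE key = ?");
            final Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);

            for (int i = 0; i < 1000000; i++) {
                // Blocks once MAX_IN_FLIGHT requests are outstanding, so the
                // driver's internal queues cannot grow without bound.
                inFlight.acquireUninterruptibly();
                ResultSetFuture future = session.executeAsync(ps.bind("key-" + i));
                Futures.addCallback(future, new FutureCallback<ResultSet>() {
                    @Override public void onSuccess(ResultSet rs) { inFlight.release(); }
                    @Override public void onFailure(Throwable t) { inFlight.release(); }
                }, MoreExecutors.directExecutor());
            }
            // Wait for all outstanding requests before shutting down.
            inFlight.acquireUninterruptibly(MAX_IN_FLIGHT);
            session.close();
            cluster.close();
        }
    }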
Re: How can I scale my read rate?
ok, I gave the executeAsync() a try. The good part is that it was really easy to write the code for it. The bad part is that it did not have a huge effect on my throughput - I gained about a 5% increase. I suspect this is because my queries are all get-by-primary-key queries and were completing in less than 2 milliseconds anyway, so there was not much wait to begin with.

Here is my code:

String getByKeyQueryStr = "Select * from fooTable where key = " + key;
//ResultSet result = session.execute(getByKeyQueryStr); // Previous code
ResultSetFuture future = session.executeAsync(getByKeyQueryStr);
FutureCallback<ResultSet> callback = new MyFutureCallback();
executor = MoreExecutors.sameThreadExecutor();
//executor = Executors.newFixedThreadPool(3); // Tried this too, no effect
//executor = Executors.newFixedThreadPool(10); // Tried this too, no effect
Futures.addCallback(future, callback, executor);

Can I improve the above code in some way? Are there any JMX metrics that can tell me what's going on?

From the vmstat command, I see that CPU idle time is about 70% even though I am running about 60 threads per VM. In total, 20 client VMs with 8 cores each are querying a Cassandra cluster of 16 VMs, also 8-core each.

Thanks
SG
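One possible refinement (a sketch, not a measured fix): prepare the statement once and bind the key, rather than concatenating it into the query string for every request, which saves server-side query parsing and avoids quoting issues. This reuses session, key, and MyFutureCallback from the snippet above:

    // Prepare once, e.g. at startup:
    PreparedStatement ps = session.prepare("SELECT * FROM fooTable WHERE key = ?");

    // Per request: bind the key instead of concatenating strings.
    ResultSetFuture future = session.executeAsync(ps.bind(key));
    Futures.addCallback(future, new MyFutureCallback(), MoreExecutors.directExecutor());

Note that MoreExecutors.sameThreadExecutor() is the deprecated predecessor of directExecutor(); either way, callbacks run on the driver's I/O threads, so if the callbacks do real work, handing Futures.addCallback a separate executor keeps the Netty threads free.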
Re: How can I scale my read rate?
Forgot to mention that this vmstat picture is for the client cluster reading from Cassandra.
Running cassandra
Hi anyone,

I am trying to get started playing with Cassandra, following this doc: http://cassandra.apache.org/doc/latest/getting_started/installing.html#prerequisites

But I always get this error:

qlong@~/ws/cas/apache-cassandra-3.10 $ ./bin/cassandra -f
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file ./bin/../logs/gc.log due to No such file or directory
Error: Could not find or load main class -ea

qlong@~/ws/cas/apache-cassandra-3.10 $ ./bin/cassandra
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file ./bin/../logs/gc.log due to No such file or directory
qlong@~/ws/cas/apache-cassandra-3.10 $ Error: Could not find or load main class -ea

Did I miss something? My Java is 1.8:

qlong@~ $ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

Thanks for any help,
Long