Are counters faster than CAS or vice versa?

2016-07-20 Thread Kevin Burton
We ended up implementing a task/queue system which uses a global pointer. Basically the pointer just increments ... so we have thousands of tasks that just increment this one pointer. The problem is that we're seeing contention on it and not being able to write this record properly. We're just d

Re: Are counters faster than CAS or vice versa?

2016-07-20 Thread Kevin Burton
On Wed, Jul 20, 2016 at 11:53 AM, Jeff Jirsa wrote: > Can you tolerate the value being “close, but not perfectly accurate”? If > not, don’t use a counter. > > > yeah.. agreed.. this is a problem which is something I was considering. I guess it depends on whether they are 10x faster.. -- We’r

Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-02 Thread Kevin Burton
We have a 60 node CS cluster running 2.2.7 and about 20GB of RAM allocated to each C* node. We're aware of the recommended 8GB limit to keep GCs low but our memory has been creeping up (probably) related to this bug. Here's what we're seeing... if we do a low level of writes we think everything g

Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-02 Thread Kevin Burton
index/content_legacy_2016_08_02:1470154500099 (106107128 bytes) On Tue, Aug 2, 2016 at 6:43 PM, Kevin Burton wrote: > We have a 60 node CS cluster running 2.2.7 and about 20GB of RAM allocated > to each C* node. We're aware of the recommended 8GB limit to keep GCs low > but our memory has been cr

Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-03 Thread Kevin Burton
to make your partitions smaller (like > 1/10th of the size). > > Cheers > Ben > <https://issues.apache.org/jira/browse/CASSANDRA-11206> > > On Wed, 3 Aug 2016 at 12:35 Kevin Burton wrote: > >> I have a theory as to what I think is happening here. >> >

Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-03 Thread Kevin Burton
nt, your best >> solution would be to find a way to make your partitions smaller (like >> 1/10th of the size). >> >> Cheers >> Ben >> <https://issues.apache.org/jira/browse/CASSANDRA-11206> >> >> On Wed, 3 Aug 2016 at 12:35 Kevin Burton wrote:

Re: [Marketing Mail] Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-03 Thread Kevin Burton
We usually use 100 per every 5 minutes.. but you're right. We might actually move this use case over to using Elasticsearch in the next couple of weeks. On Wed, Aug 3, 2016 at 11:09 AM, Jonathan Haddad wrote: > Kevin, > > "Our scheme uses large buckets of content where we write to a > bucket/pa

Mutation of X bytes is too large for the maximum size of Y

2016-08-03 Thread Kevin Burton
It seems these are basically impossible to track down. https://support.datastax.com/hc/en-us/articles/207267063-Mutation-of-x-bytes-is-too-large-for-the-maxiumum-size-of-y- has some information but their work around is to increase the transaction log. There's no way to find out WHAT client or wh

Re: Mutation of X bytes is too large for the maximum size of Y

2016-08-03 Thread Kevin Burton
(but other drivers should > have a similar exception): > https://github.com/datastax/python-driver/blob/master/cassandra/protocol.py#L288 > > On Wed, Aug 3, 2016 at 1:59 PM Ryan Svihla wrote: > >> Made a Jira about it already >> https://issues.apache.org/jira/plugi

Re: [Marketing Mail] Re: Memory leak and lockup on our 2.2.7 Cassandra cluster.

2016-08-04 Thread Kevin Burton
BTW. we think we tracked this down to using large partitions to implement inverted indexes. C* just doesn't do a reasonable job at all with large partitions so we're going to migrate this use case to using Elasticsearch On Wed, Aug 3, 2016 at 1:54 PM, Ben Slater wrote: > Yep, that was what I w

iostat -like tool to parse 'nodetool cfstats'

2016-12-20 Thread Kevin Burton
nodetool cfstats has some valuable data but what I would like is a 1 minute delta. Similar to iostat... It's easy to parse this but has anyone done it? I want to see IO throughput and load on C* for each table. -- We’re hiring if you know of any awesome Java Devops or Linux Operations Enginee
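
A minimal sketch of the "1 minute delta" idea from the message above: parse two snapshots of cfstats text and subtract the counters. It assumes each table section begins with a `Table: <name>` line, each keyspace section with a `Keyspace` line, and counters appear as `<label>: <integer>` lines. Field names vary by Cassandra version, and the function names `parse_counters` and `delta` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: sections start with "Table: <name>" and counters look like
# "<label>: <integer>". Adjust the patterns to what your nodes print.
TABLE_RE = re.compile(r"^\s*Table:\s*(\S+)\s*$")
COUNTER_RE = re.compile(r"^\s*([^:]+):\s*(-?\d+)\s*$")


def parse_counters(text):
    """Return {table: {label: int}} for every integer-valued line."""
    stats = defaultdict(dict)
    table = None
    for line in text.splitlines():
        if line.strip().startswith("Keyspace"):
            table = None  # keyspace-level totals are ignored in this sketch
            continue
        m = TABLE_RE.match(line)
        if m:
            table = m.group(1)
            continue
        m = COUNTER_RE.match(line)
        if m and table is not None:
            stats[table][m.group(1).strip()] = int(m.group(2))
    return dict(stats)


def delta(before, after):
    """Per-table difference of the counters present in both snapshots."""
    out = {}
    for table, counters in after.items():
        prev = before.get(table, {})
        out[table] = {k: v - prev[k] for k, v in counters.items() if k in prev}
    return out
```

Taking one snapshot, sleeping 60 seconds, taking another, and printing `delta(...)` would give an iostat-style view. Tables with the same name in different keyspaces collide in this sketch; keying by keyspace and table would avoid that.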

Lots of write timeouts and missing data during decomission/bootstrap

2015-07-01 Thread Kevin Burton
We get lots of write timeouts when we decommission a node. About 80% of them are write timeout and just about 20% of them are read timeout. We’ve tried to adjust streamthroughput (and compaction throughput) for that matter and that doesn’t resolve the issue. We’ve increased write_request_timeout

Re: Lots of write timeouts and missing data during decomission/bootstrap

2015-07-01 Thread Kevin Burton
in failures of CAS? This is Cassandra 2.0.9 btw. On Wed, Jul 1, 2015 at 2:22 PM, Kevin Burton wrote: > We get lots of write timeouts when we decommission a node. About 80% of > them are write timeout and just about 20% of them are read timeout. > > We’ve tried to adjust streamthrou

Re: Lots of write timeouts and missing data during decomission/bootstrap

2015-07-01 Thread Kevin Burton
WOW.. nice. you rock!! On Wed, Jul 1, 2015 at 3:18 PM, Robert Coli wrote: > On Wed, Jul 1, 2015 at 2:58 PM, Kevin Burton wrote: > >> Looks like all of this is happening because we’re using CAS operations >> and the driver is going to SERIAL consistency level. >> ...

Configuring the java client to retry on write failure.

2015-07-12 Thread Kevin Burton
I can’t seem to find a decent resource to really explain this… Our app seems to fail some write requests, a VERY low percentage. I’d like to retry the write requests that fail due to number of replicas not being correct. http://docs.datastax.com/en/developer/java-driver/2.0/common/drivers/refere

TTLs on tables with *only* primary keys?

2015-08-04 Thread Kevin Burton
I have a table which just has primary keys. basically: create table foo ( sequence bigint, signature text, primary key( sequence, signature ) ) I need these to eventually get GCd however it doesn’t seem to work. If I then run: select ttl(sequence) from foo; I get: Cannot use sel

Re: TTLs on tables with *only* primary keys?

2015-08-05 Thread Kevin Burton
RA-9312. > > On Tue, Aug 4, 2015 at 9:22 PM, Kevin Burton wrote: > >> I have a table which just has primary keys. >> >> basically: >> >> create table foo ( >> >> sequence bigint, >> signature text, >> primary key( sequ

Best strategy for hiring from OSS communities.

2015-08-13 Thread Kevin Burton
Mildly off topic but we are looking to hire someone with Cassandra experience.. I don’t necessarily want to spam the list though. We’d like someone from the community who contributes to Open Source, etc. Are there forums for Apache / Cassandra, etc for jobs? I couldn’t find one. -- Founder/CE

Practical limitations of too many columns/cells ?

2015-08-23 Thread Kevin Burton
Is there any advantage to using say 40 columns per row vs using 2 columns (one for the pk and the other for data) and then shoving the data into a BLOB as a JSON object? To date, we’ve been just adding new columns. I profiled Cassandra and about 50% of the CPU time is spent on CPU doing compactio

Re: Practical limitations of too many columns/cells ?

2015-08-23 Thread Kevin Burton
ile=nodes Averages from the middle 80% of > values:interval_op_rate : 23489 > > From: on behalf of Kevin Burton > Reply-To: "user@cassandra.apache.org" > Date: Sunday, August 23, 2015 at 1:02 PM > To: "user@cassandra.apache.org" > Subject: Practical limitation

Store JSON as text or UTF-8 encoded blobs?

2015-08-23 Thread Kevin Burton
Hey. I’m considering migrating my DB from using multiple columns to just 2 columns, with the second one being a JSON object. Is there going to be any real difference between TEXT or UTF-8 encoded BLOB? I guess it would probably be easier to get tools like spark to parse the object as JSON if it’

Re: Practical limitations of too many columns/cells ?

2015-08-23 Thread Kevin Burton
shows a ton of > different examples, but they’re not scientific, and at this point they’re > old versions (and performance varies version to version). > > - Jeff > > From: on behalf of Kevin Burton > Reply-To: "user@cassandra.apache.org" > Date:

Re: Practical limitations of too many columns/cells ?

2015-08-25 Thread Kevin Burton
ge this, but > it's good to have it on the radar. > > > On Sun, Aug 23, 2015 at 10:31 PM Kevin Burton wrote: > >> Agreed. We’re going to run a benchmark. Just realized we grew to 144 >> columns. Fun. Kind of disappointing that Cassandra is so slow in this

Re: Cassandra 2.2 for time series

2015-09-02 Thread Kevin Burton
Check out kairosd for a time series db on Cassandra. On Aug 31, 2015 7:12 AM, "Peter Lin" wrote: > > I didn't realize they had added max and min as stock functions. > > to get the sample time. you'll probably need to write a custom function. > google for it and you'll find people that have done i

Re: Best strategy for hiring from OSS communities.

2015-09-13 Thread Kevin Burton
upport * http://sematext.com/ > > > On Thu, Aug 13, 2015 at 6:02 PM, Kevin Burton wrote: > >> Mildly off topic but we are looking to hire someone with Cassandra >> experience.. >> >> I don’t necessarily want to spam the list though. We’d like someone from >

cassandra-stress on 3.0 with column widths benchmark.

2015-09-13 Thread Kevin Burton
I’m trying to benchmark two scenarios… 10 columns with 150 bytes each vs 150 columns with 10 bytes each. The total row “size” would be 1500 bytes (ignoring overhead). Our app uses 150 columns so I’m trying to see if packing it into a JSON structure using one column would improve performance.

Running Cassandra on Java 8 u60..

2015-09-25 Thread Kevin Burton
Any issues with running Cassandra 2.0.16 on Java 8? I remember there is long term advice on not changing the GC but not the underlying version of Java. Thoughts? -- We’re hiring if you know of any awesome Java Devops or Linux Operations Engineers! Founder/CEO Spinn3r.com Location: *San Francis

Using inline JSON is 2-3x faster than using many columns (>20)

2015-09-26 Thread Kevin Burton
I wanted to share this with the community in the hopes that it might help someone with their schema design. I didn't get any red flags early on to limit the number of columns we use. If anything the community pushes for dynamic schema because Cassandra has super nice online ALTER TABLE. However,

Re: Running Cassandra on Java 8 u60..

2015-09-27 Thread Kevin Burton
k JDK9 will be the one. > > On Sep 25, 2015, at 7:14 PM, Stefano Ortolani wrote: > > I think those were referring to Java7 and G1GC (early versions were buggy). > > Cheers, > Stefano > > > On Fri, Sep 25, 2015 at 5:08 PM, Kevin Burton wrote: > >> Any issu

Maximum node decommission // bootstrap at once.

2015-10-06 Thread Kevin Burton
We're in the middle of migrating datacenters. We're migrating from 13 nodes to 30 nodes in the new datacenter. The plan was to bootstrap the 30 nodes first, wait until they have joined. then we're going to decommission the old ones. How many nodes can we bootstrap at once? How many can we deco

Re: Maximum node decommission // bootstrap at once.

2015-10-06 Thread Kevin Burton
rote: > On Tue, Oct 6, 2015 at 12:32 PM, Kevin Burton wrote: > >> How many nodes can we bootstrap at once? How many can we decommission? >> > > short answer : 1 node can join or part at simultaneously > > longer answer : https://issues.apache.org/jira/browse/CASSANDRA-2

Re: Maximum node decommission // bootstrap at once.

2015-10-06 Thread Kevin Burton
TCP tuning, > > On Tue, Oct 6, 2015 at 1:29 PM, Kevin Burton wrote: > >> I'm not sure which is faster/easier. Just joining one box at a time and >> then decommissioning or using replace_address. >> >> this stuff is always something you do rarely and then more comple

Why can't nodetool status include a hostname?

2015-10-07 Thread Kevin Burton
I find it really frustrating that nodetool status doesn't include a hostname. It makes it harder to track down problems. I realize it PRIMARILY uses the IP but perhaps cassandra.yaml can include an optional 'hostname' parameter that can be set by the user. OR have the box itself include the hostname

Does failing to run "nodetool cleanup" end up causing more data to be transferred during bootstrapping?

2015-10-07 Thread Kevin Burton
Let's say I have 10 nodes, I add 5 more, if I fail to run nodetool cleanup, is excessive data transferred when I add the 6th node? IE do the existing nodes send more data to the 6th node? the documentation is unclear. It sounds like the biggest problem is that the existing data causes things to

Re: Does failing to run "nodetool cleanup" end up causing more data to be transferred during bootstrapping?

2015-10-07 Thread Kevin Burton
e technology, > delivering Apache Cassandra to the world’s most innovative enterprises. > Datastax is built to be agile, always-on, and predictably scalable to any > size. With more than 500 customers in 45 countries, DataStax is the > database technology and transactional backbone of cho

Post portem of a large Cassandra datacenter migration.

2015-10-09 Thread Kevin Burton
We just finished up a pretty large migration of about 30 Cassandra boxes to a new datacenter. We'll be migrating to about 60 boxes here in the next month so scalability (and being able to do so cleanly) is important. We also completed an Elasticsearch migration at the same time. The ES migration

Would we have data corruption if we bootstrapped 10 nodes at once?

2015-10-17 Thread Kevin Burton
We just migrated from a 30 node cluster to a 45 node cluster. (so 15 new nodes) By default we have auto_bootstrap = false so we just push our config to the cluster, the cassandra daemons restart, and they're not cluster members and are the only nodes in the cluster. Anyway. While I was about 1/2

Re: reiserfs - DirectoryNotEmptyException

2015-10-17 Thread Kevin Burton
My advice is to not even consider anything else or make any other changes to your architecture until you get onto a modern and maintained filesystem. VERY VERY VERY few people are deploying anything on ReiserFS so you're going to be the first group encountering any problems. On Thu, Oct 15, 2015

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

2015-10-18 Thread Kevin Burton
Ah shit.. I think we're seeing corruption.. missing records :-/ On Sat, Oct 17, 2015 at 10:45 AM, Kevin Burton wrote: > We just migrated from a 30 node cluster to a 45 node cluster. (so 15 new > nodes) > > By default we have auto_bootstrap = false > > so we just push ou

compact/repair shouldn't compete for normal compaction resources.

2015-10-18 Thread Kevin Burton
I'm doing a big nodetool repair right now and I'm pretty sure the added overhead is impacting our performance. Shouldn't you be able to throttle repair so that normal compactions can use most of the resources? -- We’re hiring if you know of any awesome Java Devops or Linux Operations Engineers!

Re: Would we have data corruption if we bootstrapped 10 nodes at once?

2015-10-18 Thread Kevin Burton
if done on a single > node, is typically correctable with `nodetool repair`. > > If you do it on many nodes at once, it’s possible that the new nodes > could represent all 3 replicas of the data, but don’t physically have any > of that data, leading to missing records. > > >

Re: compact/repair shouldn't compete for normal compaction resources.

2015-10-19 Thread Kevin Burton
ogy, > delivering Apache Cassandra to the world’s most innovative enterprises. > Datastax is built to be agile, always-on, and predictably scalable to any > size. With more than 500 customers in 45 countries, DataStax is the > database technology and transactional backbone of choice for th

Re: compact/repair shouldn't compete for normal compaction resources.

2015-10-19 Thread Kevin Burton
this would resolve this problem. IF anyone else thinks this is an issue I'll create a JIRA. On Mon, Oct 19, 2015 at 3:38 PM, Robert Coli wrote: > On Mon, Oct 19, 2015 at 9:30 AM, Kevin Burton wrote: > >> I think the point I was trying to make is that on highly loaded boxes, >

Using cassandra a BLOB store / web cache.

2016-01-18 Thread Kevin Burton
Internally we have the need for a blob store for web content. It's MOSTLY key/value based but we'd like to have lookups by coarse grained tags. This needs to store normal web content like HTML, CSS, JPEG, SVG, etc. Highly doubt that anything over 5MB would need to be stored. We also need the

Re: Using cassandra a BLOB store / web cache.

2016-01-19 Thread Kevin Burton
18, 2016 at 6:52 PM, Kevin Burton wrote: > >> Internally we have the need for a blob store for web content. It's >> MOSTLY key/value based but we'd like to have lookups by coarse grained >> tags. >> > > I know you know how to operate and scale MySQ

Re: Using cassandra a BLOB store / web cache.

2016-01-20 Thread Kevin Burton
There's also the 'support' issue.. C* is hard enough as it is... maybe you can bring in another system like ES or HDFS but the more you bring in the more your complexity REALLY goes through the roof. Better to keep things simple. I really like the chunking idea for C*... seems like an easy way to

Strategy / order for upgradesstables during rolling upgrade.

2016-01-21 Thread Kevin Burton
I think there are two strategies to upgradesstables after a release. We're doing a 2.0 to 2.1 upgrade (been procrastinating here). I think we can go with B below... Would you agree? Strategy A: - foreach server - upgrade to 2.1 - nodetool upgradesstables Strategy B: -

automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-22 Thread Kevin Burton
Not sure if this is a bug or not or kind of a *fuzzy* area. In 2.0 this worked fine. We have a bunch of automated scripts that go through and create tables... one per day. at midnight UTC our entire CQL went offline.. .took down our whole app. ;-/ The resolution was a full CQL shut down and th

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-22 Thread Kevin Burton
47 PM, Jonathan Haddad wrote: > Instead of using ZK, why not solve your concurrency problem by removing > it? By that, I mean simply have 1 process that creates all your tables > instead of creating a race condition intentionally? > > On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton wrote:

Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-23 Thread Kevin Burton
fic Jira assigned, and the antipattern doc doesn't appear to > reference this scenario. Maybe a committer can shed some more light. > > -- Jack Krupansky > > On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton wrote: > >> I sort of agree.. but we are also considering migrating t

Faster version of 'nodetool status'

2016-02-12 Thread Kevin Burton
Is there a faster way to get the output of 'nodetool status' ? I want us to more aggressively monitor for 'nodetool status' and boxes being DN... I was thinking something like jolokia and REST but I'm not sure if there are variables exported by jolokia for nodetool status. Thoughts? -- We’re

Efficiently filtering results directly in CS

2016-04-07 Thread Kevin Burton
I have a paging model whereby we stream data from CS by fetching 'pages' thereby reading (sequentially) entire datasets. We're using the bucket approach where we write data for 5 minutes, then we can just fetch the bucket for that range. Our app now has TONS of data and we have a piece of middlew
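
The bucket approach described above can be sketched as follows. This is an illustration only: it assumes timestamps are epoch milliseconds and that a bucket is identified by its start time. The names `bucket_start` and `buckets_between` are hypothetical, not part of any Cassandra API.

```python
BUCKET_MS = 5 * 60 * 1000  # five-minute buckets, as described in the message


def bucket_start(timestamp_ms):
    """Round a millisecond timestamp down to the start of its bucket."""
    return timestamp_ms - (timestamp_ms % BUCKET_MS)


def buckets_between(start_ms, end_ms):
    """Yield the start of every bucket overlapping [start_ms, end_ms]."""
    current = bucket_start(start_ms)
    while current <= end_ms:
        yield current
        current += BUCKET_MS
```

A reader that streams a time range would iterate `buckets_between(...)` and fetch one partition per bucket, in order.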

Re: Efficiently filtering results directly in CS

2016-04-08 Thread Kevin Burton
Ha.. Yes... C*... I guess I need something like coprocessors in bigtable. On Fri, Apr 8, 2016 at 1:49 AM, vincent gromakowski < vincent.gromakow...@gmail.com> wrote: > c* I suppose > > 2016-04-07 19:30 GMT+02:00 Jonathan Haddad : > >> What is CS? >> >> O

Connecting to cassandra.

2012-11-10 Thread Kevin Burton
I have installed Cassandra on an Ubuntu server but I fail to see it with either: ps ax or netstat -an | grep 9160 I see a file /etc/init.d/cassandra so I am assuming that it should start up. What else do I need to do? I have edited cassandra.yaml for all the places that specifically

RE: Connecting to cassandra.

2012-11-10 Thread Kevin Burton
ache.org Subject: RE: Connecting to cassandra. Importance: Low The first thing to check is the log files under /var/log/cassandra, should give you some hint. Thanks. -Wei Sent from my Samsung smartphone on AT&T Original message Subject: Connecting to cassandra. From:

RE: CREATE COLUMNFAMILY

2012-11-11 Thread Kevin Burton
sstabletojson 3) If you add a built-in secondary index the type information is needed, strings sort differently than integers 4) columns in rows are sorted by the column name, strings sort differently than integers On Sat, Nov 10, 2012 at 11:55 PM, Kevin Burton wrote: > I am sure this has been as

RE: Connecting to cassandra.

2012-11-11 Thread Kevin Burton
ne on AT&T Original message Subject: RE: Connecting to cassandra. From: Kevin Burton To: user@cassandra.apache.org CC: Thank you in the output.log I see the line: INFO 13:36:59,110 This node will not auto bootstrap because it is configured to be a seed node. A

RE: CREATE COLUMNFAMILY

2012-11-11 Thread Kevin Burton
heers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/11/2012, at 8:06 AM, Kevin Burton wrote: Thank you this helps with my understanding. So the goal here is to supply as many name/type pairs as can reasonably be foreseen when the column fami

CF metadata syntax for an array

2012-11-11 Thread Kevin Burton
I am sorry if this is an FAQ. But I was wondering what the syntax is for describing an array. I have gotten as far as feeling a need to understand a 'super-column' but I fail after that. Once I have the metadata in place to describe an array how do I insert data into the array? Get data from the arra

Re: CF metadata syntax for an array

2012-11-12 Thread Kevin Burton
> Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 12/11/2012, at 8:35 PM, Kevin Burton wrote: > >> I am sorry if this is an FAQ. But I was wondering what the syntax for >> describing an array? I have gotten as far as feeli

RE: CF metadata syntax for an array

2012-11-13 Thread Kevin Burton
le.com On 13/11/2012, at 9:46 AM, Kevin Burton wrote: While this solves the problem for an array of 'primitive' types. What if I want an array or collection of an arbitrary type like list<foo>, where foo is a user defined type? I am guessing that this cannot be done with 'collecti

RE: CF metadata syntax for an array

2012-11-13 Thread Kevin Burton
good starting point http://www.datastax.com/docs/1.1/references/cql/index Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/11/2012, at 2:42 AM, Kevin Burton wrote: Sorry to be so slow but I am just learning CQL. Would this synt

RE: CF metadata syntax for an array

2012-11-14 Thread Kevin Burton
as "variable names" to identify a particular vector or list. They are the storage engine "row key". Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 14/11/2012, at 5:31 PM, Kevin

RE: CF metadata syntax for an array

2012-11-14 Thread Kevin Burton
ndra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 15/11/2012, at 10:38 AM, Kevin Burton wrote: > An array would be a list of groups of items. In my case I want a list/array of line items. An order has certain characteristics and one of them is a list of the items that are being ordered. Say

RE: CF metadata syntax for an array

2012-11-14 Thread Kevin Burton
y uses composite key, which gives you additional capabilities like order by in the where clause On Wed, Nov 14, 2012 at 5:27 PM, Kevin Burton wrote: I hope I am not bugging you but now what is the purpose of PRIMARY_KEY(id, item_id)? By expressing the KEY as two values this basically gives the

Admin for cassandra?

2012-11-15 Thread Kevin Burton
Is there an IDE for a Cassandra database? Similar to the SQL Server Management Studio for SQL server. I mainly want to execute queries and see the results. Preferably that runs under a Windows OS. Thank you.

Table not being created but no error.

2014-08-13 Thread Kevin Burton
I'm tracking down a weird bug and was wondering if you guys had any feedback. I'm trying to create ten tables programmatically... The first one I create, for some reason, isn't created. The other 9 are created without a problem. I'm doing this with the datastax driver's session.execute(). No ex

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
uyHai Doan wrote: > Can you just give the C* version and the complete DDL script to reproduce > the issue ? > > > On Wed, Aug 13, 2014 at 10:08 PM, Kevin Burton wrote: > >> I'm tracking down a weird bug and was wondering if you guys had any >> feedback. >>

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
and I'm certain that the CQL is executing… because I get a ResultSet back and verified that the CQL is correct. On Wed, Aug 13, 2014 at 1:26 PM, Kevin Burton wrote: > 2.0.5… I'm upgrading to 2.0.9 now just to rule this out…. > > I can give you the full CQL for the table,

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
yeah… problem still exists on 2.0.9 On Wed, Aug 13, 2014 at 1:26 PM, Kevin Burton wrote: > and I'm certain that the CQL is executing… because I get a ResultSet back > and verified that the CQL is correct. > > > On Wed, Aug 13, 2014 at 1:26 PM, Kevin Burton wrote: > >

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
ah.. good idea. I'll try that now. On Wed, Aug 13, 2014 at 1:36 PM, DuyHai Doan wrote: > Maybe tracing the requests ? (just the one creating the schema of course) > > > On Wed, Aug 13, 2014 at 10:30 PM, Kevin Burton wrote: > >> yeah… problem still exists on 2.0.9 >

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
ug 13, 2014 at 1:38 PM, Kevin Burton wrote: > ah.. good idea. I'll try that now. > > > On Wed, Aug 13, 2014 at 1:36 PM, DuyHai Doan wrote: > >> Maybe tracing the requests ? (just the one creating the schema of course) >> >> >> On Wed, Aug 13, 2014

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
the tables back out, or run a SELECT against it, it will fail. Hm… On Wed, Aug 13, 2014 at 1:52 PM, Kevin Burton wrote: > It still failed. Tracing shows that the query is being executed. Just > that the table isn't created. I did a diff against the two table names and > the only

Re: Table not being created but no error.

2014-08-13 Thread Kevin Burton
the table? This feels > like code error rather than a database bug. > > > On Wed, Aug 13, 2014 at 1:26 PM, Kevin Burton wrote: > >> 2.0.5… I'm upgrading to 2.0.9 now just to rule this out…. >> >> I can give you the full CQL for the table, but I can't seem t

Could table partitioning be implemented using a customer compaction strategy?

2014-08-14 Thread Kevin Burton
We use log structured tables to hold logs for analysis. It's basically append only, and immutable. Every record has a timestamp for each record inserted. Having this in ONE big monolithic table can be problematic. 1. compactions have to compact old data that might not even be used often. 2.

Best way to format a ResultSet / Row ?

2014-08-18 Thread Kevin Burton
The DataStax java driver has a Row object which has getInt, getLong methods… However, the getString only works on string columns. That's probably reasonable… but if I have a raw Row, how the heck do I easily print it? I need a handy way to dump a ResultSet … -- Founder/CEO Spinn3r.com Location: *

Re: EC2 SSD cluster costs

2014-08-19 Thread Kevin Burton
You're pricing it out at $ per GB… that's not the way to look at it. Price it out at $ per IO… Once you price it that way, SSD makes a LOT more sense. Of course, it depends on your workload. If you're just doing writes, and they're all sequential, then cost per IO might not make a lot of sense.

Re: Best way to format a ResultSet / Row ?

2014-08-19 Thread Kevin Burton
I agree that it belongs on that mailing list but it's set up weird... I can't subscribe to it in Google Groups… I am not sure what exactly is wrong with it.. mailed the admins but it hasn't been resolved. On Tue, Aug 19, 2014 at 1:49 AM, Sylvain Lebresne wrote: > This kind of question belong to

Disk failure policy should let you run some basic commands.

2014-08-20 Thread Kevin Burton
So , right now, I have a full cassandra cluster… all my nodes are down. Fun! And I have a table, which I could just issue a truncate command to. It's just a log table so dropping the data is fine. but instead, I can't do that because my cluster is completely offline. Now, the disk failure poli

Re: Disk failure policy should let you run some basic commands.

2014-08-20 Thread Kevin Burton
> > >> +1, though because you can't drop the snapshots those two commands > automatically create (if the snapshot-before-DROP even works with disk > full, which it probably doesn't...) you still need access to the machines > to reclaim your disk space. > > True.. I actually disabled the snapshot f

stalled nodetool repair?

2014-08-21 Thread Kevin Burton
How do I watch the progress of nodetool repair? Looks like the folklore from the list says to just use nodetool compactionstats nodetool netstats … but the repair seems locked/stalled and neither of these is showing any progress.. granted, this is a lot of data, but it would be nice to at lea

Blocking while a node finishes joining the cluster after restart.

2014-09-16 Thread Kevin Burton
Say I want to do a rolling restart of Cassandra… I can’t just restart all of them because they need some time to gossip and for that gossip to get to all nodes. What is the best strategy for this. It would be something like: /etc/init.d/cassandra restart && wait-for-cassandra.sh … or something
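
One way to approximate the `wait-for-cassandra.sh` step mentioned above is to poll the node's client port until it accepts connections. This is a sketch under stated assumptions: the function name `wait_for_port` is hypothetical, and the port to poll depends on your configuration. An open port only shows that the node accepts client connections; it does not prove that gossip has reached every other node.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=300.0, interval_s=1.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Closing the socket right away is enough; we only test reachability.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)
    return False
```

A rolling restart script would restart one node, call `wait_for_port` for that node, and move on to the next only when it returns True.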

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread Kevin Burton
s wrote: > Hi Kevin, if you are using the latest version of opscenter, then even the > community (= free) edition can do a rolling restart of your cluster. It's > pretty convenient. > > Ciao, Duncan. > > On 16/09/14 19:44, Kevin Burton wrote: > >> Say I want to

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread Kevin Burton
-- > *From:* Duncan Sands > *To:* user@cassandra.apache.org > *Sent:* Tuesday, September 16, 2014 11:09 AM > *Subject:* Re: Blocking while a node finishes joining the cluster after > restart. > > Hi Kevin, if you are using the latest version of opscenter, then even t

Re: Adjusting readahead for SSD disk seeks

2014-09-25 Thread Kevin Burton
I’d advise keeping read ahead low… or turning it off on SSD. Also, noop IO scheduler might help you on that disk.. IF Cassandra DOES perform a contiguous read, read ahead won’t be helpful. It’s essentially obsolete now on SSDs. On Wed, Sep 24, 2014 at 1:20 PM, Daniel Chia wrote: > Cassandra o

simple map / table scans without hadoop?

2014-09-26 Thread Kevin Burton
I have a requirement to periodically run full table scans on our data. It’s mostly for repair tasks or making bulk UPDATEs… but I’d prefer to do it in Java because I need something mildly trivial. Pig / hadoop / etc are mildly overkill for this. I don’t want or need a whole hadoop or HDFS set

Re: Apache Cassandra 2.1.0 : cassandra-stress performance discrepancy between SSD and SATA drive

2014-09-26 Thread Kevin Burton
What SSD was it? There is a lot of variability in terms of SSD performance. 1. Is it a new vs old SSD? Old SSDs can become slower if they’re really worn out 2. was the office SSD near capacity holding other data? 3. what models were they? SSD != SSD… there is a massive amount of performan

paging through an entire table in chunks?

2014-09-27 Thread Kevin Burton
I need a way to do a full table scan across all of our data. Can’t I just use token() for this? This way I could split up our entire keyspace into say 1024 chunks, and then have one activemq task work with range 0, then range 1, etc… that way I can easily just map() my whole table. and since it’
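
The chunking idea above can be sketched like this, assuming Murmur3Partitioner, whose tokens are signed 64-bit integers. The helper names `token_ranges` and `range_query` are hypothetical, and the generated CQL is only a string template; substitute your own table and partition key.

```python
MIN_TOKEN = -(2 ** 63)   # Murmur3Partitioner tokens are signed 64-bit
MAX_TOKEN = 2 ** 63 - 1


def token_ranges(chunks):
    """Split [MIN_TOKEN, MAX_TOKEN] into `chunks` contiguous inclusive ranges."""
    total = MAX_TOKEN - MIN_TOKEN + 1
    ranges = []
    start = MIN_TOKEN
    for i in range(chunks):
        # Integer arithmetic keeps the boundaries exact.
        end = MIN_TOKEN + total * (i + 1) // chunks - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_query(table, key, lo, hi):
    """Build the CQL text for one inclusive token range."""
    return (f"SELECT * FROM {table} "
            f"WHERE token({key}) >= {lo} AND token({key}) <= {hi}")
```

Each of the 1024 tuples from `token_ranges(1024)` could then be handed to a separate worker task. Bounds outside the signed 64-bit range do not fit in a long under this partitioner.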

Re: paging through an entire table in chunks?

2014-09-27 Thread Kevin Burton
rator from the ResultSet > 4) Iterate > > > > On Sat, Sep 27, 2014 at 11:42 PM, Kevin Burton wrote: > >> I need a way to do a full table scan across all of our data. >> >> Can’t I just use token() for this? >> >> This way I could split up our entire keysp

Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread Kevin Burton
I’m trying to query an entire table in parallel by splitting it up in token ranges. However, it’s not working because I get this: cqlsh:blogindex> select token(hashcode), hashcode from source where token(hashcode) >= 0 and token(hashcode) <= 17014118346046923173168730371588410572 limit 10; Bad

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread Kevin Burton
> > On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote: > > I’m trying to query an entire table in parallel by splitting it up in > token ranges. > > However, it’s not working because I get this: > > cqlsh:blogindex> select token(hashcode), hashcode from source where

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread Kevin Burton
nitely 64 bits > > > On Sep 28, 2014, at 5:55 PM, Kevin Burton wrote: > > Hm.. is it 64 bits or 128 bits? > > I’m using Murmur3Partitioner > > … > > I can’t find any documentation on it (as usual.. ha) > > This says: > > http://www.datastax.com/docs/1.1/
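The token-range split discussed in this thread can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes Murmur3Partitioner tokens are signed 64-bit values (Long.MIN_VALUE through Long.MAX_VALUE), which is why the 128-bit-sized upper bound in the failing query could not be parsed as a long. The table `source` and column `hashcode` come from the query quoted in the thread; the class name `TokenRanges` and its methods are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: splits the signed 64-bit token space into contiguous,
// inclusive ranges that independent workers could scan one at a time.
final class TokenRanges {
    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    // Number of distinct signed 64-bit values (2^64); BigInteger avoids long overflow.
    private static final BigInteger SPAN = BigInteger.ONE.shiftLeft(64);

    // Returns `chunks` ranges as {start, end} pairs, both bounds inclusive.
    static List<long[]> split(int chunks) {
        if (chunks < 1) {
            throw new IllegalArgumentException("chunks must be >= 1");
        }
        BigInteger n = BigInteger.valueOf(chunks);
        List<long[]> ranges = new ArrayList<>(chunks);
        for (int i = 0; i < chunks; i++) {
            BigInteger start = MIN.add(SPAN.multiply(BigInteger.valueOf(i)).divide(n));
            BigInteger end = MIN.add(SPAN.multiply(BigInteger.valueOf(i + 1L)).divide(n))
                    .subtract(BigInteger.ONE);
            ranges.add(new long[] { start.longValueExact(), end.longValueExact() });
        }
        return ranges;
    }

    // Builds the CQL text for one range, using the table and column named in the thread.
    static String cql(long[] range) {
        return String.format(Locale.ROOT,
                "SELECT token(hashcode), hashcode FROM source "
                        + "WHERE token(hashcode) >= %d AND token(hashcode) <= %d",
                range[0], range[1]);
    }

    public static void main(String[] args) {
        for (long[] range : split(4)) {
            System.out.println(cql(range));
        }
    }
}
```

Each range could then be handed to a separate task (for example `split(1024)` for the 1024 chunks proposed earlier), with the driver paging through the rows inside each range.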

is lack of full text search hurting cassandra and datastax?

2014-10-02 Thread Kevin Burton
So right now I have plenty of quality and robust full text search systems I can use. Solr cloud, elastic search. They all also have very robust UIs on top of them… kibana, banana, etc. and my alternative for cassandra is… paying for a proprietary database. Which might be fine for some parties…

describe tables… and vertical formatting?

2014-10-12 Thread Kevin Burton
It seems annoying that I can’t get “describe tables” to display vertically. Maybe there’s some option I’m missing? Kevin -- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile

Re: describe tables… and vertical formatting?

2014-10-12 Thread Kevin Burton
huh. That sort of works. The problem now is that there are multiple entries per table... On Sun, Oct 12, 2014 at 10:39 AM, graham sanderson wrote: > select keyspace_name, columnfamily_name from system.schema_columns; > ? > > On Oct 12, 2014, at 10:29 AM, Kevin Burton wrote:

How do you run integration tests for your cassandra code?

2014-10-13 Thread Kevin Burton
Curious to see if any of you have an elegant solution here. Right now I’m using cassandra-unit: https://github.com/jsevellec/cassandra-unit for my integration tests. The biggest problem is that it doesn’t support shutdown, so I can’t stop or clean up after cassandra between tests. I have other

C* on Fusion IO

2014-11-06 Thread Kevin Burton
We’re looking at switching data centers and they’re offering pretty aggressive pricing on boxes with fusion IO cards. 2x 1.2TB Fusion IO 128GB RAM 20 cores. now.. this isn’t the typical cassandra box. Most people are running multiple nodes to scale out vs scale vertically. But these boxes are p

Re: C* on Fusion IO

2014-11-06 Thread Kevin Burton
ends on your >> workload, and how often you need to repair. >> >> Sent from my iPhone >> >> On Nov 6, 2014, at 3:40 PM, Kevin Burton wrote: >> >> We’re looking at switching data centers and they’re offering pretty >> aggressive pricing on boxes with fusion IO

Re: C* on Fusion IO

2014-11-06 Thread Kevin Burton
) using fusion I/O, but >>> with 10GBe connections. I mean why buy a Ferrari and never leave first gear? >>> >>> As far as saturating the network goes, I guess that all depends on your >>> workload, and how often you need to repair. >>> >>> Sent from

Re: C* on Fusion IO

2014-11-06 Thread Kevin Burton
On Thu, Nov 6, 2014 at 2:10 PM, Christopher Brodt wrote: > Yep. The "trouble" with FIOs is that they almost completely remove your > disk throughput problems, so then you're constrained by CPU. Concurrent > compactors and concurrent writes are two params that come to mind but there > are likely o
