Re: best practices for time-series data with massive amounts of records

2015-03-06 Thread Clint Kelly
Hi all, Thanks for the responses, this was very helpful. I don't know yet what the distribution of clicks and users will be, but I expect to see a few users with an enormous amount of interactions and most users having very few. The idea of doing some additional manual partitioning, and then mai

best practices for time-series data with massive amounts of records

2015-03-02 Thread Clint Kelly
Hi all, I am designing an application that will capture time series data where we expect the number of records per user to potentially be extremely high. I am not sure if we will eclipse the max row size of 2B elements, but I assume that we would not want our application to approach that size any
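One common way to keep a single row (partition) far below that limit is to add a time bucket to the partition key, so that one user's events are spread across many partitions. The following is only a minimal sketch of that idea, not something taken from the thread: the function name `partition_key` and the day/month bucket choices are assumptions made for illustration.

```python
from datetime import datetime, timezone


def partition_key(userid: str, event_time: datetime, bucket: str = "day") -> tuple:
    """Return a (userid, bucket) pair to use as a composite partition key.

    Hypothetical helper: bucketing by day or month bounds how many
    records can accumulate under any one partition key.
    """
    formats = {"day": "%Y-%m-%d", "month": "%Y-%m"}
    # Normalize to UTC so the same instant always maps to the same bucket.
    stamp = event_time.astimezone(timezone.utc).strftime(formats[bucket])
    return (userid, stamp)
```

A reader then has to query one bucket at a time, so the bucket size is a trade-off between partition size and the number of queries needed per user.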

Re: how to scan all rows of cassandra using multiple threads

2015-02-25 Thread Clint Kelly
Hi Gaurav, I recommend you just run a MapReduce job for this computation. Alternatively, you can look at the code for the C* MapReduce input format: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlInputFormat.java That should give you what you need to

Any notion of "unions" in C* user-defined types?

2015-02-23 Thread Clint Kelly
Hi all, I am building an application that keeps a time-series record of clickstream data (clicks, impressions, etc.). The data model looks something like: CREATE TABLE clickstream ( userid text, event_time timestamp, interaction frozen , PRIMARY KEY (userid, timestamp) ) WITH CLUSTERING

Re: Why no virtual nodes for Cassandra on EC2?

2015-02-23 Thread Clint Kelly
Hi mck, I'm not familiar with this ticket, but my understanding was that performance of Hadoop jobs on C* clusters with vnodes was poor because a given Hadoop input split has to run many individual scans (one for each vnode) rather than just a single scan. I've run C* and Hadoop in production wit

Re: Running Cassandra + Spark on AWS - architecture questions

2015-02-23 Thread Clint Kelly
>> write only for incoming data and read-only from aggregated table, it is >> less IO intensive than the analytics DC with lot of read & write to compute >> aggregations. >> >> >> >> On Fri, Feb 20, 2015 at 10:17 PM, Clint Kelly >> wrote: >> >>> H

Re: AMI to use to launch a cluster with OpsCenter on AWS

2015-02-20 Thread Clint Kelly
:36 PM, Clint Kelly wrote: > Hi all, > > I am trying to follow the instructions here for installing DSE 4.6 on AWS: > > > http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMIOpsc.html > > I was successful creating a sing

AMI to use to launch a cluster with OpsCenter on AWS

2015-02-20 Thread Clint Kelly
Hi all, I am trying to follow the instructions here for installing DSE 4.6 on AWS: http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMIOpsc.html I was successful creating a single-node instance running OpsCenter, which I intended to bootstrap creat

Re: Why no virtual nodes for Cassandra on EC2?

2015-02-20 Thread Clint Kelly
would mean that paying a small efficiency cost when reading data out of Cassandra initially might not be the end of the world (especially given the benefits of using vnodes). On Fri, Feb 20, 2015 at 8:29 AM, Clint Kelly wrote: > Hi Mark, > > Thanks for your reply. That makes sense. I rec

Running Cassandra + Spark on AWS - architecture questions

2015-02-20 Thread Clint Kelly
Hi all, I read the DSE 4.6 documentation and I'm still not 100% sure what a mixed workload Cassandra + Spark installation would look like, especially on AWS. What I gather is that you use OpsCenter to set up the following: - One "virtual data center" for real-time processing (e.g., ingestion

Re: Why no virtual nodes for Cassandra on EC2?

2015-02-20 Thread Clint Kelly
This is specific to the example AMI and that type of workload. This is by no > means a warning for users to disable vnodes on their Real-Time/Transactional > Cassandra only clusters on EC2. > > > I've used vnodes on EC2 without issue. > > Regards, > Mark > > On 2

Why no virtual nodes for Cassandra on EC2?

2015-02-19 Thread Clint Kelly
Hi all, The guide for installing Cassandra on EC2 says that "Note: The DataStax AMI does not install DataStax Enterprise nodes with virtual nodes enabled." http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/install/installAMI.html Just curious why this is the case

Re: No schema agreement from live replicas?

2015-02-03 Thread Clint Kelly
FWIW increasing the threshold for withMaxSchemaAgreementWaitSeconds to 30sec was enough to fix my problem---I would like to understand whether the cluster has some kind of configuration problem that made doing so necessary, however. Thanks! On Tue, Feb 3, 2015 at 7:44 AM, Clint Kelly wrote

No schema agreement from live replicas?

2015-02-03 Thread Clint Kelly
Hi all, I have an application that uses the Java driver to create a table and then immediately write to it. I see the following warning in my logs: [10.241.17.134] out: 15/02/03 09:32:24 WARN com.datastax.driver.core.Cluster: No schema agreement from live replicas after 10 s. The schema may not

Best practice for emulating a Cassandra timeout during unit tests?

2014-12-09 Thread Clint Kelly
Hi all, I'd like to write some tests for my code that uses the Cassandra Java driver to see how it behaves if there is a read timeout while accessing Cassandra. Is there a best-practice for getting this done? I was thinking about adjusting the settings in the cluster builder to adjust the timeou

Re: any way to get nodetool proxyhistograms data for an entire cluster?

2014-11-19 Thread Clint Kelly
On Nov 19, 2014, at 8:48 PM, Robert Coli wrote: > > On Wed, Nov 19, 2014 at 3:22 PM, Clint Kelly > wrote: > >> Is there any way (other than me cooking up a little script) to >> automatically get the proxyhistogram stats for my entire cluster? >> > > OpsCenter migh

any way to get nodetool proxyhistograms data for an entire cluster?

2014-11-19 Thread Clint Kelly
If I run this tool on a given host, it shows me stats for only the cases where that host was the coordinator node, correct? Is there any way (other than me cooking up a little script) to automatically get the proxyhistogram stats for my entire cluster? -Clint
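The "little script" mentioned above could look roughly like the sketch below. It assumes `nodetool` is on the PATH and that each node accepts remote JMX connections via `nodetool -h <host>`; the function names and the `run` parameter are hypothetical and exist only to keep the example testable. It collects the raw output per host and does not try to parse or merge the histograms.

```python
import subprocess


def proxyhistograms_command(host: str) -> list:
    """Build the nodetool invocation for one host."""
    return ["nodetool", "-h", host, "proxyhistograms"]


def collect(hosts, run=subprocess.check_output):
    """Run proxyhistograms against every host and return {host: raw output}.

    `run` is injectable so the function can be exercised without a cluster.
    """
    return {h: run(proxyhistograms_command(h)).decode() for h in hosts}
```

Because each node reports only the requests it coordinated, the per-host outputs have to be looked at together to get a cluster-wide picture.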

Re: What time range does nodetool cfhistograms use?

2014-11-16 Thread Clint Kelly
e the read latencies shown the latencies within a single host, or are > they the end-to-end latencies from the coordinator node?" --> cfhistograms > shows metrics at table/node level, proxyhistograms shows metrics at > cluster/coordinator level > > On Sun, Nov 16, 2014 at 10:31

Best practices for route tracing

2014-11-16 Thread Clint Kelly
Hi all, I am trying to debug some high-latency outliers (99th percentile) in an application I'm working on. I thought that I could turn on route tracing, print the route traces to logs, and then examine my logs after a load test to find the highest-latency paths and figure out what is going on.

Re: What time range does nodetool cfhistograms use?

2014-11-16 Thread Clint Kelly
t startup and then > re-evaluated during compaction. > > > Mark > > > On 16 November 2014 17:12, Clint Kelly wrote: >> >> Hi all, >> >> Over what time range does "nodetool cfhistograms" operate? >> >> I am using Cassandra 2.0.8.39.

What time range does nodetool cfhistograms use?

2014-11-16 Thread Clint Kelly
Hi all, Over what time range does "nodetool cfhistograms" operate? I am using Cassandra 2.0.8.39. I am trying to debug some very high 95th and 99th percentile read latencies in an application that I'm working on. I tried running nodetool cfhistograms to get a flavor for the distribution of read

best practice for waiting for schema changes to propagate

2014-09-29 Thread Clint Kelly
Hi all, I often have problems with code that I write that uses the DataStax Java driver to create / modify a keyspace or table and then soon after reads the metadata for the keyspace to verify that whatever changes I made to the keyspace or table are complete. As an example, I may create a table cal

nondeterministic NoHostAvailableException occurs while dropping a table

2014-09-05 Thread Clint Kelly
Hi all, TL;DR - I think my unit tests are sometimes failing because of read timeouts to an EmbeddedCassandraService when dropping a table triggers a compaction on a highly-loaded build slave. Does this sound reasonable? What options should I change in my Cluster.Builder (or elsewhere) to prevent

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
> On Tue, Aug 19, 2014 at 11:03 PM, Clint Kelly wrote: >> >> Thanks for the update, Benedict. We are still using 2.0.9 >> unfortunately. :/ I will keep that in mind for when we upgrade. >> >> On Tue, Aug 19, 2014 at 10:51 AM, Benedict Elliott Smith >

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
proved-cassandra-2-1-stress-tool-benchmark-any-schema > > There are however some features up for revision before release in order to > help generate realistic workloads. See > https://issues.apache.org/jira/browse/CASSANDRA-7519 for details. > > > On Tue, Aug 19, 2014 at 1

Re: cassandra-stress with clustering columns?

2014-08-19 Thread Clint Kelly
ttps://github.com/Mishail/CqlJmeter > > -M > > > > On 8/17/14 12:26, Clint Kelly wrote: >> >> Hi all, >> >> Is there a way to use the cassandra-stress tool with clustering columns? >> >> I am trying to figure out whether an application that I'm

Re: cassandra-stress with clustering columns?

2014-08-17 Thread Clint Kelly
ering columns make a big difference in write performance? On Sun, Aug 17, 2014 at 12:26 PM, Clint Kelly wrote: > Hi all, > > Is there a way to use the cassandra-stress tool with clustering columns? > > I am trying to figure out whether an application that I'm running o

cassandra-stress with clustering columns?

2014-08-17 Thread Clint Kelly
Hi all, Is there a way to use the cassandra-stress tool with clustering columns? I am trying to figure out whether an application that I'm running on is slow because of my application logic, C* data model, or underlying C* setup (e.g., I need more nodes or to tune some parameters). My applicatio

Re: question about OpsCenter agent

2014-08-15 Thread Clint Kelly
> configuration options available to the datastax-agent see this page: > datastax.com/documentation/opscenter/5.0/opsc/configure/agentAddressConfiguration.html > > > Mark > > > > On Fri, Aug 15, 2014 at 3:32 AM, Clint Kelly wrote: >> >> Hi all, >> >

question about OpsCenter agent

2014-08-14 Thread Clint Kelly
Hi all, I just installed DataStax Enterprise 4.5. I installed OpsCenter Server on one of my four machines. The port that OpsCenter usually uses () was used by something else, so I modified /usr/share/opscenter/conf/opscenterd.conf to set the port to 8889. When I log into OpsCenter, it says

Re: Cassandra process exiting mysteriously

2014-08-12 Thread Clint Kelly
the log before the shutdown lines in at least an hour before.. > We're using C* 2.0.9. > > > On Thu, Aug 7, 2014 at 12:49 AM, Clint Kelly wrote: >> >> Hi Rob, >> >> Thanks for the clarification; this is really useful. I'll run some >> experime

Re: Cassandra process exiting mysteriously

2014-08-06 Thread Clint Kelly
Hi Rob, Thanks for the clarification; this is really useful. I'll run some experiments to see if the problem is a JVM OOM on our build machine. Best regards, Clint On Wed, Aug 6, 2014 at 1:14 PM, Robert Coli wrote: > On Wed, Aug 6, 2014 at 1:12 PM, Robert Coli wrote: >> >> On Wed, Aug 6, 2014

Re: Cassandra process exiting mysteriously

2014-08-06 Thread Clint Kelly
Hi Duncan, Thanks for your help. I am at a loss as to what is causing this process to stop then. I would not expect the Cassandra process to finish until my code calls Process#destroy, but it seems to non-deterministically stop much earlier sometimes. FWIW I have seen failures on another machin

Re: Cassandra process exiting mysteriously

2014-08-05 Thread Clint Kelly
Best regards, Clint On Tue, Aug 5, 2014 at 9:29 PM, Kevin Burton wrote: > If there is an oom it will be in the logs. > > On Aug 5, 2014 8:17 PM, "Clint Kelly" wrote: >> >> Hi everyone, >> >> For some integration tests, we start up a CassandraD

Cassandra process exiting mysteriously

2014-08-05 Thread Clint Kelly
Hi everyone, For some integration tests, we start up a CassandraDaemon in a separate process (using the Java 7 ProcessBuilder API). All of my integration tests run beautifully on my laptop, but one of them fails on our Jenkins cluster. The failing integration test does around 10k writes to diffe

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
t 11:42 AM, Clint Kelly wrote: > Hi Rob, > > Thanks for your feedback. I understand that use of ALLOW FILTERING is > not a best practice. In this case, however, I am building a tool on > top of Cassandra that allows users to sometimes do things that are > less than optimal. W

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
cy. Thanks for your help! Best regards, Clint On Tue, Aug 5, 2014 at 10:54 AM, Robert Coli wrote: > On Tue, Aug 5, 2014 at 10:01 AM, Clint Kelly wrote: >> >> Allow me to rephrase a question I asked last week. I am performing some >> queries with ALLOW FILTERING and getti

Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Hi all, Allow me to rephrase a question I asked last week. I am performing some queries with ALLOW FILTERING and getting consistent read timeouts like the following: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses we

Re: Occasional read timeouts seen during row scans

2014-08-04 Thread Clint Kelly
ansky > > -Original Message- From: Duncan Sands > Sent: Saturday, August 2, 2014 7:04 AM > To: user@cassandra.apache.org > Subject: Re: Occasional read timeouts seen during row scans > > > Hi Clint, is time correctly synchronized between your nodes? > > Ciao, Duncan

Re: Occasional read timeouts seen during row scans

2014-08-01 Thread Clint Kelly
I was observing the timeout) Best regards, Clint On Fri, Aug 1, 2014 at 5:02 PM, Clint Kelly wrote: > Hi everyone, > > I am seeing occasional read timeouts during multi-row queries, but I'm > having difficulty reproducing them or understanding what the problem > is.

Occasional read timeouts seen during row scans

2014-08-01 Thread Clint Kelly
Hi everyone, I am seeing occasional read timeouts during multi-row queries, but I'm having difficulty reproducing them or understanding what the problem is. First, some background: Our team wrote a custom MapReduce InputFormat that looks pretty similar to the DataStax InputFormat except that it

Re: Index creation sometimes fails

2014-07-25 Thread Clint Kelly
Hi Tyler, FWIW I was not able to reproduce this problem with a smaller example. I'll go ahead and file the JIRA anyway. Thanks for your help! Best regards, Clint On Thu, Jul 17, 2014 at 3:05 PM, Tyler Hobbs wrote: > > On Thu, Jul 17, 2014 at 4:59 PM, Clint Kelly > wrote:

How to maintain the N-most-recent versions of a value?

2014-07-17 Thread Clint Kelly
Hi everyone, I am trying to design a schema that will keep the N-most-recent versions of a value. Currently my table looks like the following: CREATE TABLE foo ( rowkey text, family text, qualifier text, version long, value blob, PRIMARY KEY (rowkey, family, qualifier, ve

Re: Index creation sometimes fails

2014-07-17 Thread Clint Kelly
astax JIRA, correct? Best regards, Clint On Wed, Jul 16, 2014 at 4:32 PM, Tyler Hobbs wrote: > > On Tue, Jul 15, 2014 at 1:40 PM, Clint Kelly wrote: >> >> >> Is there some way to get the driver to block until the schema code has >> propagated everywhere? My cu

Re: Index creation sometimes fails

2014-07-15 Thread Clint Kelly
, 2014 at 11:32 AM, DuyHai Doan wrote: > As far as I know, schema propagation always takes some times in the cluster. > On this mailing list some people in the past faced similar behavior. > > > On Tue, Jul 15, 2014 at 8:20 PM, Clint Kelly wrote: >> >> FWIW I was able to

Re: Index creation sometimes fails

2014-07-15 Thread Clint Kelly
not, pause 5 seconds This loop took three iterations to create the index. Is this expected? This seems really weird! Best regards, Clint On Mon, Jul 14, 2014 at 5:54 PM, Clint Kelly wrote: > BTW I have seen this using versions 2.0.1 and 2.0.3 of the java driver > on a three-node clus
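The retry loop described in this message (check, and if the change is not visible yet, pause and try again) can be sketched generically as below. This is an illustration only and not the code from the thread: `wait_until`, its parameters, and the `check` callable are assumed names, and the real check would query the cluster's schema metadata.

```python
import time


def wait_until(check, attempts=10, pause_seconds=5.0, sleep=time.sleep):
    """Call check() until it returns True; return the number of attempts used.

    Raises TimeoutError if check() never succeeds within `attempts` tries.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            # Give the schema change time to propagate before retrying.
            sleep(pause_seconds)
    raise TimeoutError(f"condition not met after {attempts} attempts")
```

Bounding the number of attempts keeps a test from hanging forever if the change never becomes visible.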

Re: Index creation sometimes fails

2014-07-14 Thread Clint Kelly
BTW I have seen this using versions 2.0.1 and 2.0.3 of the java driver on a three-node cluster with DSE 4.5. On Mon, Jul 14, 2014 at 5:51 PM, Clint Kelly wrote: > Hi everyone, > > I have some code that I've been fiddling with today that uses the > DataStax Java driver to create

Index creation sometimes fails

2014-07-14 Thread Clint Kelly
Hi everyone, I have some code that I've been fiddling with today that uses the DataStax Java driver to create a table and then create a secondary index on a column in that table. I've tested this code fairly thoroughly on a single-node Cassandra instance on my laptop and in unit tests (using the

Re: Setting up DSE 4.5 for mixed workload with BYOH

2014-07-02 Thread Clint Kelly
(I assume this would be an increase of several orders of magnitude in the number of input splits.) Best regards, Clint On Wed, Jul 2, 2014 at 6:04 PM, Clint Kelly wrote: > Hi Tupshin, > > Thanks for the quick reply. Is the performance concern from the > Hadoop integration needi

Re: Setting up DSE 4.5 for mixed workload with BYOH

2014-07-02 Thread Clint Kelly
n't enable vnodes on any Cassandra/DSE > datacenter that is doing hadoop analytics workloads. Other DCs in the > cluster can use vnodes. > > -Tupshin > > On Jul 2, 2014 5:50 PM, "Clint Kelly" wrote: >> >> Hi everyone, >> >> Apologies if this

Setting up DSE 4.5 for mixed workload with BYOH

2014-07-02 Thread Clint Kelly
Hi everyone, Apologies if this is the incorrect forum for a question like this. I am going to set up a mixed-workload (real-time and analytics) installation of DSE 4.5 using bring-your-own Hadoop (BYOH). We are using CDH 5.0. I was reviewing the installation instructions, and I came across the

Re: Is the tarball for a given release in a Maven repository somewhere?

2014-05-22 Thread Clint Kelly
arch.maven.org/#search%7Cga%7C1%7Ca%3A%22apache-cassandra%22 > > > On 05/20/2014 05:30 PM, Clint Kelly wrote: >> >> Hi all, >> >> I am using the maven assembly plugin to build a project that contains >> a development environment for a project that we've bui

Re: Is the tarball for a given release in a Maven repository somewhere?

2014-05-21 Thread Clint Kelly
Thanks, Lewis. I created a ticket here: https://issues.apache.org/jira/browse/CASSANDRA-7283 For now I just copied the "cassandra" and "cassandra.in.sh" scripts into my project, along with custom configuration files. We already have all of the necessary JARs in our project's "lib" directory, si

Is the tarball for a given release in a Maven repository somewhere?

2014-05-20 Thread Clint Kelly
Hi all, I am using the maven assembly plugin to build a project that contains a development environment for a project that we've built at work on top of Cassandra. I'd like this development environment to include the latest release of Cassandra. Is there a maven repo anywhere that contains an ar

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-16 Thread Clint Kelly
Hi Anton, One approach you could look at is to write a custom InputFormat that allows you to limit the token range of rows that you fetch (if the AbstractColumnFamilyInputFormat does not do what you want). Doing so is not too much work. If you look at the class RowIterator within CqlRecordReader

Hadoop InputFormat that supports multiple queries

2014-05-12 Thread Clint Kelly
Hi everyone, A couple of months ago I started working on a new Hadoop InputFormat that we needed for something at my work. It is in a semi-working state now so I thought I would post a link in case anyone is interested: https://github.com/wibiclint/cassandra2-hadoop2 At the time I started worki

Re: Error "evicting cold readers" when launching an EmbeddedCassandraService for a second time

2014-05-02 Thread Clint Kelly
llo Clint > > Why do you need to remove all SSTables or dropping keyspace between tests > ? Truncating tables is not enough to have clean and repeatable tests ? > > Regards > > Duy Hai DOAN > > > On Thu, May 1, 2014 at 5:54 PM, Clint Kelly wrote: > >> Hi,

Re: Error "evicting cold readers" when launching an EmbeddedCassandraService for a second time

2014-05-01 Thread Clint Kelly
move the > SSTables between tests ? I'm using extensively the same infrastructure than > the EmbeddedCassandraService with Achilles and I have no such issue so far > > Regards > > > > On Wed, Apr 30, 2014 at 8:43 PM, Clint Kelly wrote: > >> Hi all, >>

Error "evicting cold readers" when launching an EmbeddedCassandraService for a second time

2014-04-30 Thread Clint Kelly
Hi all, I have a unit test framework for a Cassandra project that I'm working on. For every one of my test classes, I delete all of the data file, commit log, and saved cache locations, start an EmbeddedCassandraService, and populate a keyspace and tables from scratch. Currently, the unit tests t

Per-keyspace partitioners?

2014-04-09 Thread Clint Kelly
Hi everyone, Is there a way to change the partitioner on a per-table or per-keyspace basis? We have some tables for which we'd like to enable ordered scans of rows, so we'd like to use the ByteOrdered partitioner for those, but use Murmur3 for everything else in our cluster. Is this possible? O

Re: Cassandra Chef cookbook - weird bug with broadcast_address: 10.0.2.15

2014-03-31 Thread Clint Kelly
in > vagrant/vbox, with default networking configuration node[:ipaddress] is > equal 10.0.2.15 hence your broadcast_address. > You can setup networking in different way or setup attribute > node[:cassandra][:broadcast_address] manually. > > > > On Mon, Mar 31, 2014 at 3:03 AM, Cli

Re: Meaning of "token" column in system.peers and system.local

2014-03-31 Thread Clint Kelly
and system.peers tables you must make sure > that you don't connect to other nodes. I think the inconsistency you think > you found is because the first and second queries went to different nodes. > the java driver will connect to all nodes and load balance requests by > default. >

Re: Meaning of "token" column in system.peers and system.local

2014-03-30 Thread Clint Kelly
Best regards, Clint On Sun, Mar 30, 2014 at 4:51 PM, Clint Kelly wrote: > Hi all, > > > I am working on a Hadoop InputFormat implementation that uses only the > native protocol Java driver and not the Thrift API. I am currently trying > to replicate some of the behavior

Cassandra Chef cookbook - weird bug with broadcast_address: 10.0.2.15

2014-03-30 Thread Clint Kelly
All, Has anyone used the Cassandra Chef cookbook https://github.com/michaelklishin/cassandra-chef-cookbook and seen "broadcast_address: 10.0.2.15" in /etc/cassandra/cassandra.yaml? I looked through the source code for the cookbook and I have no idea how this is happening. I was able to fix this

Meaning of "token" column in system.peers and system.local

2014-03-30 Thread Clint Kelly
Hi all, I am working on a Hadoop InputFormat implementation that uses only the native protocol Java driver and not the Thrift API. I am currently trying to replicate some of the behavior of *Cassandra.client.describe_ring(myKeyspace)* from the Thrift API. I would like to do the following: -

How to tear down an EmbeddedCassandraService in unit tests?

2014-03-28 Thread Clint Kelly
All, I have a question about how to use the EmbeddedCassandraService in unit tests. I wrote a short collection of unit tests here: https://github.com/wibiclint/cassandra-java-driver-keyspaces I'm trying to start up a new EmbeddedCassandraService for each unit test. I looked at the Cassandra sou

Building a maven project that depends on Cassandra 2.0.6 or 2.1 beta?

2014-03-03 Thread Clint Kelly
Folks, Can anyone instruct me about how to set up a maven project that depends on either 2.0.6 or 2.1? I am interested in using some of the new features (e.g., static columns) in my current project. Being able to just install one of these versions in my local maven repository would be good enoug

Re: Resetting a counter in CQL

2014-03-02 Thread Clint Kelly
work. Over a >> period of time , the counters drift from the correct value. There were >> several open issues and proposal to rewrite the counter implementation >> >> Have you checked if all the issues with counters have been fixed ? >> >> regards >> >>

Re: Resetting a counter in CQL

2014-03-02 Thread Clint Kelly
I checked as in v 2.0.0 counters did not work. Over a >>> period of time , the counters drift from the correct value. There were >>> several open issues and proposal to rewrite the counter implementation >>> >>> Have you checked if all the issues with counters

Re: Resetting a counter in CQL

2014-02-28 Thread Clint Kelly
Great, thanks! On Fri, Feb 28, 2014 at 4:38 PM, Tyler Hobbs wrote: > > On Fri, Feb 28, 2014 at 6:32 PM, Clint Kelly wrote: > >> >> >> What is the best known method for resetting a counter in CQL? Is it best >> to read the counter and then increment it by

Resetting a counter in CQL

2014-02-28 Thread Clint Kelly
Folks, What is the best known method for resetting a counter in CQL? Is it best to read the counter and then increment it by a negative amount? Or to delete the row and then increment it by zero? These are the two methods I could come up with. Both of these seem fine to me---I'm just wondering

Any way to get a list of per-node token ranges using the DataStax Java driver?

2014-02-28 Thread Clint Kelly
Hi everyone, I've been working on a rewrite of the Cassandra InputFormat for Hadoop 2 using the DataStax Java driver instead of the Thrift API. I have a prototype working now, but there is one bit of code that I have not been able to replace with code for the Java driver. In the InputFormat#getS
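For context, the per-node ranges can in principle be derived from each node's tokens: on the ring, a node holding token t is the primary owner of the range from the previous token (exclusive) up to t (inclusive), with the first token wrapping around to the last. The sketch below assumes the tokens have already been read into a plain dictionary (for example from the system.local and system.peers tables); the function name and input shape are hypothetical, and replication is ignored.

```python
def primary_ranges(tokens_by_host):
    """Map each host to the (start, end] token ranges it primarily owns.

    tokens_by_host: {host: [token, ...]} with integer tokens.
    A range whose start is greater than or equal to its end wraps
    around the ring.
    """
    ring = sorted((t, h) for h, ts in tokens_by_host.items() for t in ts)
    ranges = {h: [] for h in tokens_by_host}
    for i, (token, host) in enumerate(ring):
        # Index -1 gives the last token, which handles the wrap-around.
        previous = ring[i - 1][0]
        ranges[host].append((previous, token))
    return ranges
```

With vnodes enabled each host holds many tokens, so each host ends up with many small ranges rather than one large one.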

Re: Getting the most-recent version from time-series data

2014-02-28 Thread Clint Kelly
ll have to indicate (to our software that sits on top of C*) that they are going to use paging, and then we are going to be doing multiple client / server operations anyway. I'd just like to minimize them. :) Best regards, Clint On Fri, Feb 28, 2014 at 9:47 AM, Clint Kelly wrote: > Hi T

Re: Combine multiple SELECT statements into one RPC?

2014-02-28 Thread Clint Kelly
> On Thu, Feb 27, 2014 at 1:00 AM, Clint Kelly wrote: > >> Hi all, >> >> Is there any way to use the DataStax Java driver to combine multiple >> SELECT statements into a single RPC? I assume not (I could not find >> anything about this in the documentati

Re: Getting the most-recent version from time-series data

2014-02-28 Thread Clint Kelly
o it directly on the client. The token aware client >>> will send each request for a row straight to a node that owns it. With a >>> separate connection open to each node, this is done in parallel from the >>> get-go. Fewer hops. Less load on the coordinator. No bottlene

Re: CQL: Any way to have inequalities on multiple clustering columns in a WHERE clause?

2014-02-28 Thread Clint Kelly
;,2013) and qual < 'D' ALLOW FILTERING > > > On Fri, Feb 28, 2014 at 6:57 AM, Clint Kelly wrote: > >> All, >> >> Is there any way to have inequalities comparisons on multiple clustering >> columns in a WHERE clause in CQL? For example, I'd like to

CQL: Any way to have inequalities on multiple clustering columns in a WHERE clause?

2014-02-27 Thread Clint Kelly
All, Is there any way to have inequality comparisons on multiple clustering columns in a WHERE clause in CQL? For example, I'd like to do: select * from foo where fam = 'Info' and qual > 'A' and qual < 'D' and version > 2013 ALLOW FILTERING; I get an error: Bad Request: PRIMARY KEY part

Re: Naming variables in a prepared statement in the DataStax Java driver

2014-02-27 Thread Clint Kelly
Ah never mind, I see, currently you can refer to the ?'s by name by using the name of the column to which the ? refers. And this works as long as each column is present only once in the statement. Sorry for the extra list traffic! On Thu, Feb 27, 2014 at 7:33 PM, Clint Kelly wrote:

Naming variables in a prepared statement in the DataStax Java driver

2014-02-27 Thread Clint Kelly
Folks, Is there a way to name the variables in a prepared statement when using the DataStax Java driver? For example, instead of doing: ByteBuffer byteBuffer = ... // some application logic String query = "SELECT * FROM foo WHERE bar = ?"; PreparedStatement preparedStatement = session.prepare(qu

Combine multiple SELECT statements into one RPC?

2014-02-26 Thread Clint Kelly
Hi all, Is there any way to use the DataStax Java driver to combine multiple SELECT statements into a single RPC? I assume not (I could not find anything about this in the documentation), but I just wanted to check. Thanks! Best regards, Clint

Re: Update multiple rows in a CQL lightweight transaction

2014-02-26 Thread Clint Kelly
m anymore. > > > On Wed, Feb 26, 2014 at 3:18 AM, Tupshin Harper wrote: > >> Unfortunately there is no option to vote for a "resolved" ticket, but if >> you can propose a better syntax that people agree on, you could probably >> get some fresh traction on it. >>

Re: Getting the most-recent version from time-series data

2014-02-25 Thread Clint Kelly
this help you > in your situation? > > Jonathan > > > On Feb 25, 2014, at 7:49 PM, Clint Kelly wrote: > > > > Hi everyone, > > > > Let's say that I have a table that looks like the following: > > > > CREATE TABLE time_series_stuff (

Getting the most-recent version from time-series data

2014-02-25 Thread Clint Kelly
Hi everyone, Let's say that I have a table that looks like the following: CREATE TABLE time_series_stuff ( key text, family text, version int, val text, PRIMARY KEY (key, family, version) ) WITH CLUSTERING ORDER BY (family ASC, version DESC) AND bloom_filter_fp_chance=0.01 AND c

Re: Update multiple rows in a CQL lightweight transaction

2014-02-25 Thread Clint Kelly
ending on your needs, you might be able to use a static > column (coming with 2.0.6) as your conditional flag, as that column is > shared by all rows in the partition. > > -Tupshin > > > > On Mon, Feb 24, 2014 at 3:57 PM, Clint Kelly wrote: > >> Hi Tupshin, >>

Re: Update multiple rows in a CQL lightweight transaction

2014-02-24 Thread Clint Kelly
True > > (both updates succeeded because the check on t succeeded) > > select * from foo; > x | y | t | z > ---+---+---+--- > a | 1 | 2 | 1 > a | 2 | 2 | 2 > > Hope this helps. > > -Tupshin > > > > On Fri, Feb 21, 2014 at 6:05 PM, DuyHai Doan

Update multiple rows in a CQL lightweight transaction

2014-02-21 Thread Clint Kelly
Folks, Does anyone know how I can modify multiple rows at once in a lightweight transaction in CQL3? I saw the following ticket: https://issues.apache.org/jira/browse/CASSANDRA-5633 but it was not obvious to me from the comments how (or whether) this got resolved. I also couldn't find anyt

CQL3 delete using < or > ?

2014-02-08 Thread Clint Kelly
Folks, Is there any way to perform a delete in CQL of all rows where a particular column (that is part of the primary key) is less than a certain value? I believe that the corresponding SELECT statement works, as in this example: cqlsh:fiddle> describe table foo; CREATE TABLE foo ( key text,

Re: Buffering for lots of INSERT or UPDATE calls with DataStax Java driver?

2014-02-08 Thread Clint Kelly
Java driver?" --> Yes, use UNLOGGED batches. More > info here: > http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/batch_r.html > > > On Sat, Feb 8, 2014 at 10:19 PM, Clint Kelly wrote: >> >> Folks, >> >> Is there a recomm

Buffering for lots of INSERT or UPDATE calls with DataStax Java driver?

2014-02-08 Thread Clint Kelly
Folks, Is there a recommended way to perform lots of INSERT operations in a row when using the DataStax Java driver? I notice that the RecordWriter for the CQL3 Hadoop implementation in Cassandra does some per-data-node buffering of CQL3 queries. The DataStax Java driver, on the other hand, supp
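The reply in this thread suggests UNLOGGED batches. As a rough illustration of client-side buffering, the sketch below groups already-formatted CQL statements into fixed-size batch strings. The helper name and the batch size of 50 are assumptions, each statement is assumed to end with a semicolon, and real code would more likely use the driver's own batch statement support with bound values rather than building strings.

```python
def unlogged_batches(statements, batch_size=50):
    """Yield CQL UNLOGGED BATCH strings wrapping up to batch_size statements each."""
    for start in range(0, len(statements), batch_size):
        chunk = statements[start:start + batch_size]
        yield "BEGIN UNLOGGED BATCH\n" + "\n".join(chunk) + "\nAPPLY BATCH;"
```

Keeping batches small avoids sending one very large request to a single coordinator.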

Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Clint Kelly
Okay neat, hopefully it will look reasonable by the end of the month or so! :) On Thu, Feb 6, 2014 at 4:15 PM, Steven A Robenalt wrote: > I am as well. > > Thanks, > Steve > > > > On Thu, Feb 6, 2014 at 4:13 PM, Alex Popescu wrote: >> >> >> On Thu,

Re: Cassandra 2.0 with Hadoop 2.x?

2014-02-06 Thread Clint Kelly
ository for Hadoop Classes but you can use >> others. >> >> Regards >> -- >> Cyril SCETBON >> >> On 03 Feb 2014, at 19:10, Clint Kelly wrote: >> >> > Folks, >> > >> > Has anyone out there used Cassandra 2.0 with

Cassandra 2.0 with Hadoop 2.x?

2014-02-03 Thread Clint Kelly
Folks, Has anyone out there used Cassandra 2.0 with Hadoop 2.x? I saw this discussion on the Cassandra JIRA: https://issues.apache.org/jira/browse/CASSANDRA-5201 but the fix referenced (https://github.com/michaelsembwever/cassandra-hadoop) is for Cassandra 1.2. I put together a similar pat