RE: Use Cassandra thrift API with collection type

2014-06-23 Thread James Campbell
Huilang, Since there hasn't been another reply yet, I'll throw out an idea that worked for us as part of a test, though it does not seem exactly like a "preferred" way since it crosses code-bases. We built the type using straight java type, then used the Datastax v2 driver's DataType class s

Re: Data model for streaming a large table in real time.

2014-06-07 Thread James Campbell
This is a basic question, but having heard that advice before, I'm curious about why the minimum recommended replication factor is three? Certainly additional redundancy, and, I believe, a minimum threshold for paxos. Are there other reasons? On Jun 7, 2014 10:52 PM, Colin wrote: To have any r

RE: Consolidating records and TTL

2014-06-05 Thread James Campbell
I should be thinking about for that sort of batch updating? James Campbell From: Aaron Morton Sent: Thursday, June 5, 2014 5:26 AM To: Cassandra User Cc: charlie@gmail.com Subject: Re: Consolidating records and TTL As Tyler says, with atomic batches which ar

RE: CQL 3 and wide rows

2014-05-19 Thread James Campbell
Maciej, In CQL3 "wide rows" are expected to be created using clustering columns. So while the schema will have a relatively smaller number of named columns, the effect is a wide row. For example: CREATE TABLE keyspace.widerow ( row_key text, wide_row_column text, data_column text, PRIMA

Re: Best partition type for Cassandra with JBOD

2014-05-17 Thread James Campbell
On Fri, May 16, 2014 at 10:29 AM, James Campbell <ja...@breachintelligence.com> wrote: Hi all- What partition type is best/most commonly used for a multi-disk JBOD setup running Cassandra on CentOS 64bit? The datastax production server guidelines recommend XFS for data partitio

Best partition type for Cassandra with JBOD

2014-05-16 Thread James Campbell
Hi all- What partition type is best/most commonly used for a multi-disk JBOD setup running Cassandra on CentOS 64bit? The datastax production server guidelines recommend XFS for data partitions, saying, "Because Cassandra can use almost half your disk space for a single file, use XFS when usin

BulkOutputFormat and CQL3

2014-04-22 Thread James Campbell
Hi Cassandra Users- I have a Hadoop job that uses the pattern in Cassandra 2.0.6's hadoop_cql3_word_count example to load data from HDFS into Cassandra. Having read about BulkOutputFormat as a way to potentially significantly increase the write throughput from Hadoop to Cassandra, I am conside