Cassandra 2 Upgrade

2013-09-11 Thread Christopher Wirt
Hello,

 

I'm keen on moving to 2.0. The new thrift server implementation and other
performance improvements are getting me excited.

I'm currently running 1.2.8 in 3 DCs with 3-3-9 nodes, 64GB RAM, 3x200GB
SSDs, Thrift, LCS, Snappy, and vnodes.

 

Is anyone using 2.0 in production yet? Had any issues? I haven't seen
anything popup here or on JIRA, so either there are none/few, or nobody is
using it yet.

 

How important is it to move to 1.2.9 before 2.0? To me it looks like 1.2.8
to 2.0 will be fine. 

 

Am I OK running a mixed-version cluster for 24 hours? E.g. if I switched just
one DC for 24 hours as a test.

 

Will size-tiered compaction in L0 now be the default behaviour for LCS?

 

Thanks

 

Chris

 

 



Re: Composite Column Grouping

2013-09-11 Thread Laing, Michael
Then you can do this. I handle millions of entries this way and it works
well if you are mostly interested in recent activity.

If you need to span all activity then you can use a separate table to
maintain the 'latest'. This table should also be sharded as entries will be
'hot'. Sharding will spread the heat and the tombstones (compaction load)
around the cluster.

-ml

-- put this in <file> and run using 'cqlsh -f <file>'

DROP KEYSPACE latest;

CREATE KEYSPACE latest WITH replication = {
'class': 'SimpleStrategy',
'replication_factor' : 1
};

USE latest;

CREATE TABLE time_series (
bucket_userid text, -- bucket is the beginning of a datetime span
concatenated with a shard designator
pkid text,
timeuuid text,
colname text,
PRIMARY KEY (bucket_userid, timeuuid)
);

-- the example table is using 15 minute bucket spans and 2 shards for
illustration (you would usually use more shards)
-- adjust these appropriately for your application

UPDATE time_series SET pkid = '1000', colname = 'Col-Name-1' where
bucket_userid = '2013-09-11T05:15-0_XYZ' AND timeuuid='200';
UPDATE time_series SET pkid = '1001', colname = 'Col-Name-2' where
bucket_userid = '2013-09-11T05:15-1_XYZ' AND timeuuid='201';
UPDATE time_series SET pkid = '1000', colname = 'Col-Name-3' where
bucket_userid = '2013-09-11T05:15-0_XYZ' AND timeuuid='202';
UPDATE time_series SET pkid = '1000', colname = 'Col-Name-4' where
bucket_userid = '2013-09-11T05:30-1_XYZ' AND timeuuid='203';
UPDATE time_series SET pkid = '1002', colname = 'Col-Name-5' where
bucket_userid = '2013-09-11T05:30-0_XYZ' AND timeuuid='204';

-- This query assumes that the 'current' span is 2013-09-11T05:30 and I am
interested in this span and the previous one.

SELECT * FROM time_series
WHERE bucket_userid in ( -- go back as many spans as you need to, all
shards in each span (cartesian product)
'2013-09-11T05:15-0_XYZ',
'2013-09-11T05:15-1_XYZ',
'2013-09-11T05:30-0_XYZ',
'2013-09-11T05:30-1_XYZ'
)
ORDER BY timeuuid DESC;

-- returns:
-- bucket_userid          | timeuuid | colname    | pkid
--------------------------+----------+------------+------
-- 2013-09-11T05:30-0_XYZ |      204 | Col-Name-5 | 1002
-- 2013-09-11T05:30-1_XYZ |      203 | Col-Name-4 | 1000
-- 2013-09-11T05:15-0_XYZ |      202 | Col-Name-3 | 1000
-- 2013-09-11T05:15-1_XYZ |      201 | Col-Name-2 | 1001
-- 2013-09-11T05:15-0_XYZ |      200 | Col-Name-1 | 1000

-- do a stable purge on pkid to get the result.
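
Roughly, the client-side pieces could look like this (a sketch only, in
python; the shard count, bucket span and names are illustrative, not
production code):

#!/usr/bin/env python
# Sketch: build the sharded bucket_userid key for writes, and de-duplicate
# query results on pkid (the 'stable purge'), keeping the newest entry per pkid.
import random
from datetime import datetime

NUM_SHARDS = 2  # illustration only - you would usually use more shards

def bucket_userid(userid, when, shard=None):
    # 15-minute bucket: truncate the minutes to the nearest quarter hour
    bucket = when.replace(minute=(when.minute // 15) * 15, second=0, microsecond=0)
    shard = random.randrange(NUM_SHARDS) if shard is None else shard
    return '%s-%d_%s' % (bucket.strftime('%Y-%m-%dT%H:%M'), shard, userid)

def stable_purge(rows):
    # rows are assumed to arrive in timeuuid DESC order (as the query above
    # returns them); keep only the first (newest) row seen for each pkid
    seen, latest = set(), []
    for row in rows:
        if row.pkid not in seen:
            seen.add(row.pkid)
            latest.append(row)
    return latest

print bucket_userid('XYZ', datetime(2013, 9, 11, 5, 20))  # e.g. 2013-09-11T05:15-0_XYZ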


On Wed, Sep 11, 2013 at 1:01 AM, Ravikumar Govindarajan <
ravikumar.govindara...@gmail.com> wrote:

> Thanks Michael,
>
> But I cannot sort the rows in memory, as the number of columns will be
> quite huge.
>
> From the python script above:
>select_stmt = "select * from time_series where userid = 'XYZ'"
>
> This would return me many hundreds of thousands of columns. I need to go
> in time-series order using ranges [Pagination queries].
>
>
> On Wed, Sep 11, 2013 at 7:06 AM, Laing, Michael <michael.la...@nytimes.com> wrote:
>
>> If you have set up the table as described in my previous message, you
>> could run this python snippet to return the desired result:
>>
>> #!/usr/bin/env python
>> # -*- coding: utf-8 -*-
>> import logging
>> logging.basicConfig()
>>
>> from operator import itemgetter
>>
>> import cassandra
>> from cassandra.cluster import Cluster
>> from cassandra.query import SimpleStatement
>>
>> cql_cluster = Cluster()
>> cql_session = cql_cluster.connect()
>> cql_session.set_keyspace('latest')
>>
>> select_stmt = "select * from time_series where userid = 'XYZ'"
>> query = SimpleStatement(select_stmt)
>> rows = cql_session.execute(query)
>>
>> results = []
>> for row in rows:
>> max_time = max(row.colname.keys())
>> results.append((row.userid, row.pkid, max_time,
>> row.colname[max_time]))
>>
>> sorted_results = sorted(results, key=itemgetter(2), reverse=True)
>> for result in sorted_results: print result
>>
>> # prints:
>>
>> # (u'XYZ', u'1002', u'204', u'Col-Name-5')
>> # (u'XYZ', u'1000', u'203', u'Col-Name-4')
>> # (u'XYZ', u'1001', u'201', u'Col-Name-2')
>>
>>
>>
>> On Tue, Sep 10, 2013 at 6:32 PM, Laing, Michael <
>> michael.la...@nytimes.com> wrote:
>>
>>> You could try this. C* doesn't do it all for you, but it will
>>> efficiently get you the right data.
>>>
>>> -ml
>>>
>>> -- put this in <file> and run using 'cqlsh -f <file>'
>>>
>>> DROP KEYSPACE latest;
>>>
>>> CREATE KEYSPACE latest WITH replication = {
>>> 'class': 'SimpleStrategy',
>>> 'replication_factor' : 1
>>> };
>>>
>>> USE latest;
>>>
>>> CREATE TABLE time_series (
>>> userid text,
>>> pkid text,
>>> colname map<text, text>,
>>> PRIMARY KEY (userid, pkid)
>>> );
>>>
>>> UPDATE time_series SET colname = colname + {'200':'Col-Name-1'} WHERE
>>> userid = 'XYZ' AND pkid = '1000';
>>> UPDATE time_series SET colname = colname +
>>> {'201':'Col-Name-2'} WHERE userid = 'XYZ' AND pkid = '1001';
>>> UPDATE time_series SET colname = colname +
>>> {'202':'Col-Name-3'} WHERE userid = 'XYZ' AND pkid = '1000';
>>> UPDATE time_seri

Re: read consistency and clock drift and ntp

2013-09-11 Thread Paulo Motta
Here are some links related to C* and clock synchronization:

http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks

http://ria101.wordpress.com/2011/02/08/cassandra-the-importance-of-system-clocks-avoiding-oom-and-how-to-escape-oom-meltdown/
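
On the monitoring question: one rough approach (a sketch only, assuming every
node runs ntpd and that the third-party ntplib package is installed; the node
list and threshold below are made up) is to query each node's clock and
compare offsets:

#!/usr/bin/env python
# Sketch: warn when a node's clock offset exceeds a threshold.
import ntplib

NODES = ['10.0.0.1', '10.0.0.2', '10.0.0.3']  # your Cassandra nodes (hypothetical)
MAX_OFFSET_MS = 50.0                          # alert threshold, tune to taste

client = ntplib.NTPClient()
for node in NODES:
    try:
        response = client.request(node, version=3)
    except Exception as e:
        print 'WARN could not query %s: %s' % (node, e)
        continue
    offset_ms = response.offset * 1000.0  # offset relative to this machine, in ms
    status = 'OK' if abs(offset_ms) < MAX_OFFSET_MS else 'ALERT'
    print '%s %s offset=%.1f ms' % (status, node, offset_ms)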




2013/9/11 Jimmy Lin 

> hi,
> I have few question around the area how Cassandra use record's timestamp
> to determine which one to return from its replicated nodes ...
>
> -
> A record's timestamp is determined by the Cassandra server node's system
> timestamp when the request arrives at the server, and NOT by the timestamp
> of the client that makes the request (unlike timeuuid)?
>
> -
> so clock synchronization between nodes is very important, but clock drift
> is still possible even if one uses NTP? I wonder what common practices the
> Cassandra community uses to minimize clock drift?
>
> -
> is there a recommended maximum drift allowed in a cluster before things
> can get very ugly?
>
> -
> how do you determine if two nodes in a cluster have out-of-sync clocks?
> (as a monitor or alert so appropriate action can be taken)
>
> Thanks
>
>
>
>
>
>



-- 
Paulo Ricardo

European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com


Re: Long running nodetool move operation

2013-09-11 Thread Ike Walker
The restart worked.

Thanks, Rob!

After the restart I ran 'nodetool move' again, used 'nodetool netstats | grep 
-v "0%"' to verify that data was actively streaming, and the move completed 
successfully.

-Ike

On Sep 10, 2013, at 11:04 AM, Ike Walker  wrote:

> Below is the output of "nodetool netstats".
> 
> I've never run that before, but from what I can read it shows no incoming 
> streams, and a bunch of outgoing streams to two other nodes, all at 0%.
> 
> I'll try the restart.
> 
> Thanks.
> 
> nodetool netstats
> Mode: MOVING
> Streaming to: /10.xxx.xx.xx
> 
> ...
> Streaming to: /10.xxx.xx.xxx
> 
> ...
> Not receiving any streams.
> Pool NameActive   Pending  Completed
> Commandsn/a 0  243401039
> Responses   n/a 0  295522535
> 
> On Sep 9, 2013, at 10:54 PM, Robert Coli  wrote:
> 
>>   On Mon, Sep 9, 2013 at 7:08 PM, Ike Walker  wrote:
>> I've been using nodetool move to rebalance my cluster. Most of the moves 
>> take under an hour, or a few hours at most. The current move has taken 4+ 
>> days so I'm afraid it will never complete. What's the best way to cancel it 
>> and try again?
>> 
>> What does "nodetool netstats" say? If it shows no streams in progress, the 
>> move is probably hung...
>> 
>> Restart the affected node. If that doesn't work, restart other nodes which 
>> might have been receiving a stream. I think in the case of "move" it should 
>> work to just restart the affected node. Restart the move, you will re-stream 
>> anything you already streamed once.
>> 
>> https://issues.apache.org/jira/browse/CASSANDRA-3486
>> 
>> If this ticket were completed, it would presumably include the ability to 
>> stop other hung streaming operations, like "move".
>> 
>> =Rob
> 



Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Keith Freeman
Yes, I started with a fresh keyspace (dropped and re-created) to run 
this test.


On 09/10/2013 02:01 PM, sankalp kohli wrote:

Have you dropped and recreated a keyspace with the same name recently?


On Tue, Sep 10, 2013 at 8:40 AM, Keith Freeman <8fo...@gmail.com> wrote:


While running a heavy insert load, one of my nodes started
throwing this exception when trying a compaction:

 INFO [CompactionExecutor:23] 2013-09-09 16:08:07,528 CompactionTask.java (line 105) Compacting [SSTableReader(path='/var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-6-Data.db'), SSTableReader(path='/var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-5-Data.db'), SSTableReader(path='/var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-1-Data.db'), SSTableReader(path='/var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-4-Data.db'), SSTableReader(path='/var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-2-Data.db')]
ERROR [CompactionExecutor:23] 2013-09-09 16:08:07,611 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:23,1,main]
java.lang.RuntimeException: java.io.FileNotFoundException: /var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-5-Data.db (No such file or directory)
        at org.apache.cassandra.io.util.ThrottledReader.open(ThrottledReader.java:53)
        at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1194)
        at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:54)
        at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:1014)
        at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:1026)
        at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:157)
        at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:163)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:117)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:211)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: /var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-5-Data.db (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
        at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:67)
        at org.apache.cassandra.io.util.ThrottledReader.<init>(ThrottledReader.java:35)
        at org.apache.cassandra.io.util.ThrottledReader.open(ThrottledReader.java:49)
        ... 18 more

This shows up many times in the log.  I figured running a repair
on the node might fix it, but the repair ran for over an hour (the
node only has about 3G of data), so I figured it was hung.  I
tried restarting the repair, but each time it starts the node logs
that same exception immediately:

 INFO [AntiEntropySessions:5] 2013-09-10 09:36:35,526 AntiEntropyService.java (line 651) [repair #c6ab9c00-1a2e-11e3-b0e5-05d1729cecff] new session: will sync /192.168.27.73, /192.168.27.75 on range (4925454539472655923,4991066214171147775] for smdb.[tracedata, processors]
 INFO [AntiEntropySessions:5] 2013-09-10 09:36:35,526 AntiEntropyService.java (line 857) [repair #c6ab9c00-1a2e-11e3-b0e5-05d1729cecff] requesting merkle trees for tracedata (to [/192.168.27.75, /192.168.27.73])
ERROR [ValidationExecutor:2] 2013-09-10 09:36:35,535 Cassan

Re: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

2013-09-11 Thread srmore
Thanks Viktor,

- check (cassandra-env.sh) -Xss size, you may need to increase it for your
JVM;

This seems to have done the trick !

Thanks !


On Tue, Sep 10, 2013 at 12:46 AM, Viktor Jevdokimov <
viktor.jevdoki...@adform.com> wrote:

>  For start:
>
> - check (cassandra-env.sh) -Xss size, you may need to increase it for your
> JVM;
>
> - check (cassandra-env.sh) -Xms and -Xmx size, you may need to increase it
> for your data load/bloom filter/index sizes.
>
>
>Best regards / Pagarbiai
> *Viktor Jevdokimov*
> Senior Developer
>
>  [image: Adform News] 
>
> *Visit us at Dmexco: *Hall 6 Stand B-52
> September 18-19 Cologne, Germany
> Email: viktor.jevdoki...@adform.com
> Phone: +370 5 212 3063, Fax +370 5 261 0453
> J. Jasinskio 16C, LT-03163 Vilnius, Lithuania
> Follow us on Twitter: @adforminsider 
> Take a ride with Adform's Rich Media Suite
>  [image: Dmexco 2013] 
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
> *From:* srmore [mailto:comom...@gmail.com]
> *Sent:* Tuesday, September 10, 2013 6:16 AM
> *To:* user@cassandra.apache.org
> *Subject:* Error during startup - java.lang.OutOfMemoryError: unable to
> create new native thread [heur]
>
>
>
> I have a 5 node cluster with a load of around 300GB each. A node went down
> and does not come up. I can see the following exception in the logs.
>
> ERROR [main] 2013-09-09 21:50:56,117 AbstractCassandraDaemon.java (line
> 139) Fatal exception in thread Thread[main,5,main]
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:640)
> at
> java.util.concurrent.ThreadPoolExecutor.addIfUnderCorePoolSize(ThreadPoolExecutor.java:703)
> at
> java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1392)
> at
> org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor.(JMXEnabledThreadPoolExecutor.java:77)
> at
> org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor.(JMXEnabledThreadPoolExecutor.java:65)
> at
> org.apache.cassandra.concurrent.JMXConfigurableThreadPoolExecutor.(JMXConfigurableThreadPoolExecutor.java:34)
> at
> org.apache.cassandra.concurrent.StageManager.multiThreadedConfigurableStage(StageManager.java:68)
> at
> org.apache.cassandra.concurrent.StageManager.(StageManager.java:42)
> at
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:344)
> at
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:173)
>
>
> The ulimit -u output is 515042
>
> Which is far more than what is recommended [1] (10240) and I am skeptical
> to set it to unlimited as recommended here [2]
>
> Any pointers as to what could be the issue and how to get the node up.
>
>
>
>
> [1]
> http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html?pagename=docs&version=1.2&file=install/recommended_settings#cassandra/install/installRecommendSettings.html
>
> [2]
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201303.mbox/%3CCAPqEvGE474Omea1BFLJ6U_pbAkOwWxk=dwo35_pc-atwb4_...@mail.gmail.com%3E
> 
>
> Thanks !
>

cass 1.2.8 -> 1.2.9

2013-09-11 Thread Christopher Wirt
Anyone had issues upgrading to 1.2.9?

 

I tried upgrading one server in a three node DC.

The server appeared to come online fine without any errors, handshaking,
etc.

 

Looking at tpstats, the machine was serving very few reads.

Looking at it from the server side, we were getting a lot of Unavailable errors.

 

Had to rollback sharpish as this was a live system.

 



Re: making sure 1 copy per availability zone(rack) using EC2Snitch

2013-09-11 Thread rash aroskar
Thanks that is helpful.


On Tue, Sep 10, 2013 at 3:52 PM, Robert Coli  wrote:

> On Mon, Sep 9, 2013 at 11:21 AM, rash aroskar wrote:
>
>> Are you suggesting deploying 1.2.9 only if using Cassandra "DC" outside
>> of EC2 or if I wish to use rack replication at all?
>>
>
> 1) use 1.2.9 no matter what, instead of 1.2.5
> 2) if only *ever* will have clusters in EC2, EC2Snitch is fine, but read
> and understand CASSANDRA-3810, especially if not using vnodes
> 3) if *ever* may have clusters outside of EC2 + inside EC2, use
> GossipingPropertyFileSnitch
> 4) if using vnodes, just create a cluster out of hosts with 50% in each AZ
> and you should be all set.
>
> =Rob
>


Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman

I have RF=2

On 09/10/2013 11:18 AM, Robert Coli wrote:
On Tue, Sep 10, 2013 at 10:17 AM, Robert Coli > wrote:


On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8fo...@gmail.com
> wrote:

On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for
commitlog and data


BTW, is RF=3? If so, you effectively have a 1 node cluster while writing.

=Rob




Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman


On 09/10/2013 11:42 AM, Nate McCall wrote:
With SSDs, you can turn up memtable_flush_writers - try 3 initially (1 
by default) and see what happens. However, given that there are no 
entries in 'All time blocked' for such, they may be something else.
Tried that, it seems to have reduced the loads a little after everything 
warmed-up, but not much.


How are you inserting the data?


A java client on a separate box using the datastax java driver, 48 
threads writing 100 records each iteration as prepared batch statements.


At 5000 records/sec, the servers just can't keep up, so the client backs 
up.  That's only 5M of data/sec, which doesn't seem like much.  As I 
mentioned, switching to SSDs didn't help much, so I'm assuming at this 
point that the server overloads are what's holding up the client.


Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 6:49 AM, Keith Freeman <8fo...@gmail.com> wrote:

>  Yes, I started with a fresh keyspace (dropped and re-created) to run this
> test.
>

https://issues.apache.org/jira/browse/CASSANDRA-4219

=Rob


Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 10:12 AM, Keith Freeman <8fo...@gmail.com> wrote:

>  I had seen that issue before, but it's marked Resolved/Fixed in v1.1.1,
> and I'm on v1.2.8.  Also it talks about not being able to re-create the
> keyspace, while my problem is that after re-creating, I eventually get
> FileNotFound exceptions.  This has happened to me several times in testing,
> this is the first time I've been able to follow-up and report it to the
> mailing list.
>

Sorry, pasted the wrong JIRA.

https://issues.apache.org/jira/browse/CASSANDRA-4857
and
https://issues.apache.org/jira/browse/CASSANDRA-4221
for background

=Rob


Re: Cassandra 2 Upgrade

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 2:59 AM, Christopher Wirt wrote:

> I’m keen on moving to 2.0. The new thrift server implementation and other
> performance improvements are getting me excited.
>
>
> I’m currently running 1.2.8  in 3 DC’s with 3-3-9 nodes 64GB RAM, 3x200GB
> SSDs, thrift, LCS, Snappy,  Vnodes,
>
History indicates that you should not run a Cassandra version x.y.z where z
< 5 in production. Unless your bosses have tasked you with finding
potentially serious bugs in your database software in production.

>
> Is anyone using 2.0 in production yet? Had any issues? I haven’t seen
> anything popup here or on JIRA, so either there are none/few, or nobody is
> using it yet.
>
I'm sure some brave/foolish souls must be...

>
> How important is it to move to 1.2.9 before 2.0? To me it looks like 1.2.8
> to 2.0 will be fine.
>
All upgrades to 2.0 must pass through 1.2.9. I'll be doing a blog post on
the upgrade path from 1.2.x to 2.0.x soon, but for now you can refer to
this thread :

http://mail-archives.apache.org/mod_mbox/cassandra-user/201308.mbox/%3ccalehuf-wjuuoe_7ytqkxdd+bvxhukluccjnnec4kpbmoxs8...@mail.gmail.com%3E

=Rob


Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Keith Freeman
I had seen that issue before, but it's marked Resolved/Fixed in v1.1.1, 
and I'm on v1.2.8.  Also it talks about not being able to re-create the 
keyspace, while my problem is that after re-creating, I eventually get 
FileNotFound exceptions.  This has happened to me several times in 
testing, this is the first time I've been able to follow-up and report 
it to the mailing list.


On 09/11/2013 10:55 AM, Robert Coli wrote:
On Wed, Sep 11, 2013 at 6:49 AM, Keith Freeman <8fo...@gmail.com> wrote:


Yes, I started with a fresh keyspace (dropped and re-created) to
run this test.


https://issues.apache.org/jira/browse/CASSANDRA-4219

=Rob




Re: Cassandra 2 Upgrade

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 2:59 AM, Christopher Wirt wrote:

> Am I ok running a mixed cluster for 24 hours? E.g. I switched just one DC
> for 24 hours as a test.
>

(missed this line)

This is unsupported, generally. It may or may not work. I wouldn't do it.

=Rob


Re: Composite Column Grouping

2013-09-11 Thread Laing, Michael
Here's a slightly better version and a python script. -ml

-- put this in <file> and run using 'cqlsh -f <file>'

DROP KEYSPACE latest;

CREATE KEYSPACE latest WITH replication = {
'class': 'SimpleStrategy',
'replication_factor' : 1
};

USE latest;

CREATE TABLE time_series (
bucket_userid text, -- bucket is the beginning of a datetime span
concatenated with a shard designator
user_id text,
pkid text,
timeuuid text,
colname text,
PRIMARY KEY (bucket_userid, timeuuid)
);

UPDATE time_series
SET
user_id = 'XYZ',
pkid = '1000',
colname = 'Col-Name-1'
WHERE
bucket_userid = '2013-09-11T05:15-0_XYZ' AND
timeuuid='200'
;
UPDATE time_series
SET
user_id = 'XYZ',
pkid = '1001',
colname = 'Col-Name-2'
WHERE
bucket_userid = '2013-09-11T05:15-1_XYZ' AND
timeuuid='201'
;
UPDATE time_series
SET
user_id = 'XYZ',
pkid = '1000',
colname = 'Col-Name-3'
WHERE
bucket_userid = '2013-09-11T05:15-0_XYZ' AND
timeuuid='202'
;
UPDATE time_series
SET
user_id = 'XYZ',
pkid = '1000',
colname = 'Col-Name-4'
WHERE
bucket_userid = '2013-09-11T05:30-1_XYZ' AND
timeuuid='203'
;
UPDATE time_series
SET
user_id = 'XYZ',
pkid = '1002',
colname = 'Col-Name-5'
WHERE
bucket_userid = '2013-09-11T05:30-0_XYZ' AND
timeuuid='204'
;

-- This query assumes that the 'current' span is 2013-09-11T05:30 and I am
interested in this span and the previous one.

SELECT * FROM time_series
WHERE bucket_userid IN ( -- go back as many spans as you need to, all
shards in each span (cartesian product)
'2013-09-11T05:15-0_XYZ',
'2013-09-11T05:15-1_XYZ',
'2013-09-11T05:30-0_XYZ',
'2013-09-11T05:30-1_XYZ'
) -- you could add a range condition on timeuuid to further restrict the
results
ORDER BY timeuuid DESC;

-- returns:
-- bucket_userid          | timeuuid | colname    | pkid | user_id
--------------------------+----------+------------+------+---------
-- 2013-09-11T05:30-0_XYZ |      204 | Col-Name-5 | 1002 | XYZ
-- 2013-09-11T05:30-1_XYZ |      203 | Col-Name-4 | 1000 | XYZ
-- 2013-09-11T05:15-0_XYZ |      202 | Col-Name-3 | 1000 | XYZ
-- 2013-09-11T05:15-1_XYZ |      201 | Col-Name-2 | 1001 | XYZ
-- 2013-09-11T05:15-0_XYZ |      200 | Col-Name-1 | 1000 | XYZ

-- do a stable purge on pkid to get the result


python script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging
logging.basicConfig()

import cassandra
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cql_cluster = Cluster()
cql_session = cql_cluster.connect()
cql_session.set_keyspace('latest')

select_stmt = """
SELECT * FROM time_series
WHERE bucket_userid IN ( -- go back as many spans as you need to, all
shards in each span (cartesian product)
'2013-09-11T05:15-0_XYZ',
'2013-09-11T05:15-1_XYZ',
'2013-09-11T05:30-0_XYZ',
'2013-09-11T05:30-1_XYZ'
)
ORDER BY timeuuid DESC;
"""

query = SimpleStatement(select_stmt)
rows = cql_session.execute(query)

pkids = set()
for row in rows:
if row.pkid in pkids:
continue
else:
print row.user_id, row.timeuuid, row.colname, row.pkid
pkids.add(row.pkid)

# prints:

# XYZ 204 Col-Name-5 1002
# XYZ 203 Col-Name-4 1000
# XYZ 201 Col-Name-2 1001


On Wed, Sep 11, 2013 at 6:13 AM, Laing, Michael
wrote:

> Then you can do this. I handle millions of entries this way and it works
> well if you are mostly interested in recent activity.
>
> If you need to span all activity then you can use a separate table to
> maintain the 'latest'. This table should also be sharded as entries will be
> 'hot'. Sharding will spread the heat and the tombstones (compaction load)
> around the cluster.
>
> -ml
>
> -- put this in <file> and run using 'cqlsh -f <file>'
>
> DROP KEYSPACE latest;
>
> CREATE KEYSPACE latest WITH replication = {
> 'class': 'SimpleStrategy',
> 'replication_factor' : 1
> };
>
> USE latest;
>
> CREATE TABLE time_series (
> bucket_userid text, -- bucket is the beginning of a datetime span
> concatenated with a shard designator
> pkid text,
> timeuuid text,
> colname text,
> PRIMARY KEY (bucket_userid, timeuuid)
> );
>
> -- the example table is using 15 minute bucket spans and 2 shards for
> illustration (you would usually use more shards)
> -- adjust these appropriately for your application
>
> UPDATE time_series SET pkid = '1000', colname = 'Col-Name-1' where
> bucket_userid = '2013-09-11T05:15-0_XYZ' AND timeuuid='200';
> UPDATE time_series SET pkid = '1001', colname = 'Col-Name-2' where
> bucket_userid = '2013-09-11T05:15-1_XYZ' AND timeuuid='201';
> UPDATE time_series SET pkid = '1000', colname = 'Col-Name-3' where
> bucket_userid = '2013-09-11T05:15-0_XYZ' AND timeuuid='202';
> UPDATE time_series SET pkid = '1000', colname = 'Col-Name-4' where
> bucket_userid = '2013-09-11T05:30-1_XYZ' AND timeuuid='203';
> UPDATE time_series SET pkid = '1002', colname = 'Col-Name-5' where
> bucket_use

RE: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Paul Cichonski
How much of the data you are writing is going against the same row key? 

I've experienced some issues using CQL to write a full wide-row at once (across 
multiple threads) that exhibited some of the symptoms you have described (i.e., 
high cpu, dropped mutations). 

This question goes into it a bit more: 
http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque
 . I was able to solve my issue by switching to using the thrift batch_mutate 
to write a full wide-row at once instead of using many CQL INSERT statements. 
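
To make that concrete, here is a minimal sketch of the Thrift path using
pycassa (not my actual code; the pool, column family and column names are
illustrative) -- pycassa's batch_insert() goes through batch_mutate under
the hood:

# Sketch: write an entire wide row's columns with one batch_insert call.
import pycassa

pool = pycassa.ConnectionPool('my_keyspace', ['127.0.0.1:9160'])
cf = pycassa.ColumnFamily(pool, 'wide_rows')

row_key = 'row-123'
columns = dict(('col-%05d' % i, 'value-%d' % i) for i in xrange(100))

# {row_key: {column_name: value, ...}} is translated into Thrift batch_mutate
# mutations, instead of issuing one CQL INSERT per column.
cf.batch_insert({row_key: columns})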

-Paul

> -Original Message-
> From: Keith Freeman [mailto:8fo...@gmail.com]
> Sent: Wednesday, September 11, 2013 9:16 AM
> To: user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
> 
> 
> On 09/10/2013 11:42 AM, Nate McCall wrote:
> > With SSDs, you can turn up memtable_flush_writers - try 3 initially (1
> > by default) and see what happens. However, given that there are no
> > entries in 'All time blocked' for such, they may be something else.
> Tried that, it seems to have reduced the loads a little after everything
> warmed-up, but not much.
> >
> > How are you inserting the data?
> 
> A java client on a separate box using the datastax java driver, 48 threads
> writing 100 records each iteration as prepared batch statements.
> 
> At 5000 records/sec, the servers just can't keep up, so the client backs up.
> That's only 5M of data/sec, which doesn't seem like much.  As I mentioned,
> switching to SSDs didn't help much, so I'm assuming at this point that the
> server overloads are what's holding up the client.


Re: cqlsh error after enabling encryption

2013-09-11 Thread Les Hazlewood
bump.  Any ideas?  We're seeing the same issue on 2.0 as well.

Thanks!

On Tue, Sep 3, 2013 at 2:20 PM, David Laube  wrote:
> Hi All,
>
> After enabling encryption on our Cassandra 1.2.8 nodes, we are receiving the
> error "Connection error: TSocket read 0 bytes" while attempting to use cqlsh
> to talk to the ring. I've followed the docs over at
> http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/security/secureCqlshSSL_t.html
> but can't seem to figure out why this isn't working. Inter-node
> communication seems to be working properly since "nodetool status" shows our
> nodes as up, but the CQLsh client is unable to talk to a single node or any
> node in the cluster (specifying the IP in .cqlshrc or on the CLI) for some
> reason. I'm providing the applicable config file entries below for review.
> Any insight or suggestions would be greatly appreciated! :)
>
>
>
> My ~/.cqlshrc file:
> 
>
> [connection]
> hostname = 127.0.0.1
> port = 9160
> factory = cqlshlib.ssl.ssl_transport_factory
>
> [ssl]
> certfile = /etc/cassandra/conf/cassandra_client.crt
> validate = true ## Optional, true by default.
>
> [certfiles] ## Optional section, overrides the default certfile in the [ssl]
> section.
> 192.168.1.3 = ~/keys/cassandra01.cert
> 192.168.1.4 = ~/keys/cassandra02.cert
> 
>
>
>
> Our cassandra.yaml file config blocks:
> 
> …snip…
>
> server_encryption_options:
> internode_encryption: all
> keystore: /etc/cassandra/conf/.keystore
> keystore_password: yeah-right
> truststore: /etc/cassandra/conf/.truststore
> truststore_password: yeah-right
> # More advanced defaults below:
> # protocol: TLS
> # algorithm: SunX509
> # store_type: JKS
> # cipher_suites:
> [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA]
> # require_client_auth: false
>
> # enable or disable client/server encryption.
> client_encryption_options:
> enabled: true
> keystore: /etc/cassandra/conf/.keystore
> keystore_password: yeah-right
> # require_client_auth: false
> # Set trustore and truststore_password if require_client_auth is true
> # truststore: conf/.truststore
> # truststore_password: cassandra
> # More advanced defaults below:
> protocol: TLS
> algorithm: SunX509
> store_type: JKS
> cipher_suites:
> [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA]
>
> …snip...
> 
>
>
>
>
> Thanks,
> -David Laube
>


Complex JSON objects

2013-09-11 Thread Hartzman, Leslie
Hi,

What would be the recommended way to deal with a complex JSON structure, short 
of storing the whole JSON as a value to a column? What options are there to 
store dynamic data like this?

e.g.,

{
  "readings": [
    {
      "value": 20,
      "rate_of_change": 0.05,
      "timestamp": 1378686742465
    },
    {
      "value": 22,
      "rate_of_change": 0.05,
      "timestamp": 1378686742466
    },
    {
      "value": 21,
      "rate_of_change": 0.05,
      "timestamp": 1378686742467
    }
  ],
  "events": [
    {
      "type": "direction_change",
      "version": 0.1,
      "timestamp": 1378686742465,
      "data": {
        "units": "miles",
        "direction": "NW",
        "offset": 23
      }
    },
    {
      "type": "altitude_change",
      "version": 0.1,
      "timestamp": 1378686742465,
      "data": {
        "rate": 0.2,
        "duration": 18923
      }
    }
  ]
}



[CONFIDENTIALITY AND PRIVACY NOTICE]

Information transmitted by this email is proprietary to Medtronic and is 
intended for use only by the individual or entity to which it is addressed, and 
may contain information that is private, privileged, confidential or exempt 
from disclosure under applicable law. If you are not the intended recipient or 
it appears that this mail has been forwarded to you without proper authority, 
you are notified that any use or dissemination of this information in any 
manner is strictly prohibited. In such cases, please delete this mail from your 
records.

To view this notice in other languages you can either select the following link 
or manually copy and paste the link into the address bar of a web browser: 
http://emaildisclaimer.medtronic.com


Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman
Thanks, I had seen your stackoverflow post.  I've got hundreds of 
(wide-) rows, and the writes are pretty well distributed across them.  
I'm very reluctant to drop back to the thrift interface.


On 09/11/2013 10:46 AM, Paul Cichonski wrote:

How much of the data you are writing is going against the same row key?

I've experienced some issues using CQL to write a full wide-row at once (across 
multiple threads) that exhibited some of the symptoms you have described (i.e., 
high cpu, dropped mutations).

This question goes into it a bit more:
http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque
I was able to solve my issue by switching to using the thrift batch_mutate
to write a full wide-row at once instead of using many CQL INSERT statements.

-Paul


-Original Message-
From: Keith Freeman [mailto:8fo...@gmail.com]
Sent: Wednesday, September 11, 2013 9:16 AM
To:user@cassandra.apache.org
Subject: Re: heavy insert load overloads CPUs, with MutationStage pending


On 09/10/2013 11:42 AM, Nate McCall wrote:

With SSDs, you can turn up memtable_flush_writers - try 3 initially (1
by default) and see what happens. However, given that there are no
entries in 'All time blocked' for such, they may be something else.

Tried that, it seems to have reduced the loads a little after everything
warmed-up, but not much.

How are you inserting the data?

A java client on a separate box using the datastax java driver, 48 threads
writing 100 records each iteration as prepared batch statements.

At 5000 records/sec, the servers just can't keep up, so the client backs up.
That's only 5M of data/sec, which doesn't seem like much.  As I mentioned,
switching to SSDs didn't help much, so I'm assuming at this point that the
server overloads are what's holding up the client.




Re: Complex JSON objects

2013-09-11 Thread Edward Capriolo
I was playing a while back with the concept of storing JSON into cassandra
columns in a sortable way.

Warning: This is kinda just a cool idea, I never productionized it.
https://github.com/edwardcapriolo/Cassandra-AnyType



On Wed, Sep 11, 2013 at 2:26 PM, Hartzman, Leslie <
leslie.d.hartz...@medtronic.com> wrote:

>  Hi,
>
>
> What would be the recommended way to deal with a complex JSON structure,
> short of storing the whole JSON as a value to a column? What options are
> there to store dynamic data like this?
>
>
> e.g.,
>
>
> {
>
>   “ readings”: [
>
> {
>
>“value” : 20,
>
>   “rate_of_change” : 0.05,
>
>   “timestamp” :  1378686742465
>
>  },
>
> {
>
>“value” : 22,
>
>   “rate_of_change” : 0.05,
>
>   “timestamp” :  1378686742466
>
>  },
>
> {
>
>“value” : 21,
>
>   “rate_of_change” : 0.05,
>
>   “timestamp” :  1378686742467
>
>  }
>
>   ],
>
>   “events” : [
>
>  {
>
> “type” : “direction_change”,
>
> “version” : 0.1,
>
> “timestamp”: 1378686742465
>
>  “data” : {
>
>   “units” : “miles”,
>
>   “direction” : “NW”,
>
>   “offset” : 23
>
>   }
>
>},
>
>  {
>
> “type” : “altitude_change”,
>
> “version” : 0.1,
>
> “timestamp”: 1378686742465
>
>  “data” : {
>
>   “rate”: 0.2,
>
>   “duration” : 18923
>
>   }
>
> }
>
>]
>
> }
>
>
>
> [CONFIDENTIALITY AND PRIVACY NOTICE] Information transmitted by this email
> is proprietary to Medtronic and is intended for use only by the individual
> or entity to which it is addressed, and may contain information that is
> private, privileged, confidential or exempt from disclosure under
> applicable law. If you are not the intended recipient or it appears that
> this mail has been forwarded to you without proper authority, you are
> notified that any use or dissemination of this information in any manner is
> strictly prohibited. In such cases, please delete this mail from your
> records. To view this notice in other languages you can either select the
> following link or manually copy and paste the link into the address bar of
> a web browser: http://emaildisclaimer.medtronic.com
>


RE: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Paul Cichonski
I was reluctant to use thrift as well, and I spent about a week trying to
get the CQL inserts to work by partitioning the INSERTs in different ways and
tuning the cluster.

However, nothing worked remotely as well as the batch_mutate when it came to 
writing a full wide-row at once. I think Cassandra 2.0 makes CQL work better 
for these cases (CASSANDRA-4693), but I haven't tested it yet.

-Paul

> -Original Message-
> From: Keith Freeman [mailto:8fo...@gmail.com]
> Sent: Wednesday, September 11, 2013 1:06 PM
> To: user@cassandra.apache.org
> Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
> 
> Thanks, I had seen your stackoverflow post.  I've got hundreds of
> (wide-) rows, and the writes are pretty well distributed across them.
> I'm very reluctant to drop back to the thrift interface.
> 
> On 09/11/2013 10:46 AM, Paul Cichonski wrote:
> > How much of the data you are writing is going against the same row key?
> >
> > I've experienced some issues using CQL to write a full wide-row at once
> (across multiple threads) that exhibited some of the symptoms you have
> described (i.e., high cpu, dropped mutations).
> >
> > This question goes into it a bit
> more:http://stackoverflow.com/questions/18522191/using-cassandra-and-
> cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque  . I was able to
> solve my issue by switching to using the thrift batch_mutate to write a full
> wide-row at once instead of using many CQL INSERT statements.
> >
> > -Paul
> >
> >> -Original Message-
> >> From: Keith Freeman [mailto:8fo...@gmail.com]
> >> Sent: Wednesday, September 11, 2013 9:16 AM
> >> To:user@cassandra.apache.org
> >> Subject: Re: heavy insert load overloads CPUs, with MutationStage
> >> pending
> >>
> >>
> >> On 09/10/2013 11:42 AM, Nate McCall wrote:
> >>> With SSDs, you can turn up memtable_flush_writers - try 3 initially
> >>> (1 by default) and see what happens. However, given that there are
> >>> no entries in 'All time blocked' for such, they may be something else.
> >> Tried that, it seems to have reduced the loads a little after
> >> everything warmed-up, but not much.
> >>> How are you inserting the data?
> >> A java client on a separate box using the datastax java driver, 48
> >> threads writing 100 records each iteration as prepared batch statements.
> >>
> >> At 5000 records/sec, the servers just can't keep up, so the client backs 
> >> up.
> >> That's only 5M of data/sec, which doesn't seem like much.  As I
> >> mentioned, switching to SSDs didn't help much, so I'm assuming at
> >> this point that the server overloads are what's holding up the client.



Re: Cassandra input paging for Hadoop

2013-09-11 Thread Jiaan Zeng
Speaking of the thrift client, i.e. ColumnFamilyInputFormat: yes,
ConfigHelper.setRangeBatchSize() can reduce the number of rows requested from
Cassandra in each batch.

Depending on how big your columns are, you may also want to increase the
thrift message length through setThriftMaxMessageLengthInMb().

Hope that helps.

On Tue, Sep 10, 2013 at 8:18 PM, Renat Gilfanov  wrote:
> Hi,
>
> We have Hadoop jobs that read data from our Cassandra column families and
> write some data back to another column families.
> The input column families are pretty simple CQL3 tables without wide rows.
> In Hadoop jobs we set up corresponding WHERE clause in
> ConfigHelper.setInputWhereClauses(...), so we don't process the whole table
> at once.
> Nevertheless, sometimes the amount of data returned by the input query is big
> enough to cause TimedOutExceptions.
>
> To mitigate this, I'd like to configure Hadoop job in a such way that it
> sequentially fetches input rows by smaller portions.
>
> I'm looking at the ConfigHelper.setRangeBatchSize() and
> CqlConfigHelper.setInputCQLPageRowSize() methods, but a bit confused if
> that's what I need and if yes, which one should I use for those purposes.
>
> Any help is appreciated.
>
> Hadoop version is 1.1.2, Cassandra version is 1.2.8.



-- 
Regards,
Jiaan


Re: Complex JSON objects

2013-09-11 Thread Paulo Motta
What you can do to store a complex JSON object in a C* skinny row is to
serialize each field independently as a JSON string and store each field as
a C* column within the same row (the row representing the JSON object).

So using the example you mentioned, you could store it in cassandra as:

ColumnFamily["objectKey"]["readings"] = "[{reading1}, {reading2},
{reading3}]"
ColumnFamily["objectKey"]["events"] = "[{event1}, {event2}, {event3}]"

But in fact, that isn't an optimal way to store such data in cassandra,
since you would need to de-serialize all the readings if you were
interested in a particular reading or time period.

A better way to store time series data is to store one measurement/event
per column, so you're able to retrieve data for a particular time period
more easily (since columns are stored in sorted order). One way to do that
for your data would be to store them in 2 column families, as in:

Reading["objectKey"]["timestamp3"] = "{reading3}"

Reading["objectKey"]["timestamp2"] = "{reading2}"

Reading["objectKey"]["timestamp1"] = "{reading1}"

Event["objectKey"]["timestamp3"] = "{event3}"

Event["objectKey"]["timestamp2"] = "{event2}"

Event["objectKey"]["timestamp1"] = "{event1}"


So you're able to reconstruct the original JSON "objectKey" by fetching the
columns from Reading["objectKey"] and Event["objectKey"], and you're also
able to efficiently query all readings between timestamp2 and timestamp3
that occurred inside the JSON object, if necessary.


In this post you can find more information on how to store time series data
in C* in an efficient way:
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
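
A rough CQL3 rendering of the same per-column layout, using the DataStax
python driver that appears elsewhere in this digest (the keyspace, table and
sample payload are illustrative, not a prescription):

# Sketch: one reading per clustered column, with the reading JSON as the value.
import json
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('sensors')   # hypothetical keyspace

session.execute("""
    CREATE TABLE reading (
        object_key text,
        ts bigint,
        payload text,
        PRIMARY KEY (object_key, ts)
    )""")

reading = {"value": 20, "rate_of_change": 0.05, "timestamp": 1378686742465}
session.execute(
    "INSERT INTO reading (object_key, ts, payload) VALUES (%s, %s, %s)",
    ('objectKey', reading['timestamp'], json.dumps(reading)))

# Columns are clustered by ts, so a time-slice query stays cheap.
rows = session.execute(
    "SELECT ts, payload FROM reading "
    "WHERE object_key = %s AND ts >= %s AND ts <= %s",
    ('objectKey', 1378686742465, 1378686742467))
for row in rows:
    print row.ts, json.loads(row.payload)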


2013/9/11 Edward Capriolo 

> I was playing a while back with the concept of storing JSON into cassandra
> columns in a sortable way.
>
> Warning: This is kinda just a cool idea, I never productionized it.
> https://github.com/edwardcapriolo/Cassandra-AnyType
>
>
>
> On Wed, Sep 11, 2013 at 2:26 PM, Hartzman, Leslie <
> leslie.d.hartz...@medtronic.com> wrote:
>
>>  Hi,
>>
>>
>> What would be the recommended way to deal with a complex JSON structure,
>> short of storing the whole JSON as a value to a column? What options are
>> there to store dynamic data like this?
>>
>>
>> e.g.,
>>
>>
>> {
>>
>>   “ readings”: [
>>
>> {
>>
>>“value” : 20,
>>
>>   “rate_of_change” : 0.05,
>>
>>   “timestamp” :  1378686742465
>>
>>  },
>>
>> {
>>
>>“value” : 22,
>>
>>   “rate_of_change” : 0.05,
>>
>>   “timestamp” :  1378686742466
>>
>>  },
>>
>> {
>>
>>“value” : 21,
>>
>>   “rate_of_change” : 0.05,
>>
>>   “timestamp” :  1378686742467
>>
>>  }
>>
>>   ],
>>
>>   “events” : [
>>
>>  {
>>
>> “type” : “direction_change”,
>>
>> “version” : 0.1,
>>
>> “timestamp”: 1378686742465
>>
>>  “data” : {
>>
>>   “units” : “miles”,
>>
>>   “direction” : “NW”,
>>
>>   “offset” : 23
>>
>>   }
>>
>>},
>>
>>  {
>>
>> “type” : “altitude_change”,
>>
>> “version” : 0.1,
>>
>> “timestamp”: 1378686742465
>>
>>  “data” : {
>>
>>   “rate”: 0.2,
>>
>>   “duration” : 18923
>>
>>   }
>>
>> }
>>
>>]
>>
>> }
>>
>>
>>
>> [CONFIDENTIALITY AND PRIVACY NOTICE] Information transmitted by this
>> email is proprietary to Medtronic and is intended for use only by the
>> individual or entity to which it is addressed, and may contain information
>> that is private, privileged, confidential or exempt from disclosure under
>> applicable law. If you are not the intended recipient or it appears that
>> this mail has been forwarded to you without proper authority, you are
>> notified that any use or dissemination of this information in any manner is
>> strictly prohibited. In such cases, please delete this mail from your
>> records. To view this notice in other languages you can either select the
>> following link or manually copy and paste the link into the address bar of
>> a web browser: http://emaildisclaimer.medtronic.com
>>
>
>


-- 
Paulo Ricardo

European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST

Re: Complex JSON objects

2013-09-11 Thread Laing, Michael
A way to do this would be to express the JSON structure as (path, value)
tuples and then use a map to store them.

For example, your JSON above can be expressed as shown below where the path
is a list of keys/indices and the value is a scalar.

You could also concatenate the path elements and use them as a column key
instead. The advantage there is that you can do range queries against such
structures, and they will efficiently yield subtrees. E.g. a query for
"path > 'readings.1.' and path < 'readings.1.\u'" will yield the
appropriate rows.

ml

([u'events', 0, u'timestamp'], 1378686742465)

([u'events', 0, u'version'], 0.1)

([u'events', 0, u'type'], u'direction_change')

([u'events', 0, u'data', u'units'], u'miles')

([u'events', 0, u'data', u'direction'], u'NW')

([u'events', 0, u'data', u'offset'], 23)

([u'events', 1, u'timestamp'], 1378686742465)

([u'events', 1, u'version'], 0.1)

([u'events', 1, u'type'], u'altitude_change')

([u'events', 1, u'data', u'duration'], 18923)

([u'events', 1, u'data', u'rate'], 0.2)

([u'readings', 0, u'timestamp'], 1378686742465)

([u'readings', 0, u'value'], 20)

([u'readings', 0, u'rate_of_change'], 0.05)

([u'readings', 1, u'timestamp'], 1378686742466)

([u'readings', 1, u'value'], 22)

([u'readings', 1, u'rate_of_change'], 0.05)

([u'readings', 2, u'timestamp'], 1378686742467)

([u'readings', 2, u'value'], 21)

([u'readings', 2, u'rate_of_change'], 0.05)
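

The flattening itself is only a few lines of python (a sketch; no driver
needed, and the separator for the concatenated-path variant is up to you):

# Sketch: flatten nested JSON into (path, value) tuples, where the path is
# the list of keys/indices leading to each scalar.
def flatten(node, path=()):
    if isinstance(node, dict):
        for key, value in node.items():
            for item in flatten(value, path + (key,)):
                yield item
    elif isinstance(node, list):
        for index, value in enumerate(node):
            for item in flatten(value, path + (index,)):
                yield item
    else:
        yield list(path), node

doc = {'readings': [{'value': 20, 'rate_of_change': 0.05, 'timestamp': 1378686742465}]}
for path, value in flatten(doc):
    print (path, value)            # e.g. (['readings', 0, 'value'], 20)

# for the concatenated-path variant, join the elements:
#   column_key = '.'.join(str(p) for p in path)    # -> 'readings.0.value'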


On Wed, Sep 11, 2013 at 2:26 PM, Hartzman, Leslie <
leslie.d.hartz...@medtronic.com> wrote:

>  Hi,
>
>
> What would be the recommended way to deal with a complex JSON structure,
> short of storing the whole JSON as a value to a column? What options are
> there to store dynamic data like this?
>
>
> e.g.,
>
>
> {
>
>   “ readings”: [
>
> {
>
>“value” : 20,
>
>   “rate_of_change” : 0.05,
>
>   “timestamp” :  1378686742465
>
>  },
>
> {
>
>“value” : 22,
>
>   “rate_of_change” : 0.05,
>
>   “timestamp” :  1378686742466
>
>  },
>
> {
>
>“value” : 21,
>
>   “rate_of_change” : 0.05,
>
>   “timestamp” :  1378686742467
>
>  }
>
>   ],
>
>   “events” : [
>
>  {
>
> “type” : “direction_change”,
>
> “version” : 0.1,
>
> “timestamp”: 1378686742465
>
>  “data” : {
>
>   “units” : “miles”,
>
>   “direction” : “NW”,
>
>   “offset” : 23
>
>   }
>
>},
>
>  {
>
> “type” : “altitude_change”,
>
> “version” : 0.1,
>
> “timestamp”: 1378686742465
>
>  “data” : {
>
>   “rate”: 0.2,
>
>   “duration” : 18923
>
>   }
>
> }
>
>]
>
> }
>
>
>
> [CONFIDENTIALITY AND PRIVACY NOTICE] Information transmitted by this email
> is proprietary to Medtronic and is intended for use only by the individual
> or entity to which it is addressed, and may contain information that is
> private, privileged, confidential or exempt from disclosure under
> applicable law. If you are not the intended recipient or it appears that
> this mail has been forwarded to you without proper authority, you are
> notified that any use or dissemination of this information in any manner is
> strictly prohibited. In such cases, please delete this mail from your
> records. To view this notice in other languages you can either select the
> following link or manually copy and paste the link into the address bar of
> a web browser: http://emaildisclaimer.medtronic.com
>


Re: is the select result grouped by the value of the partition key?

2013-09-11 Thread John Lumby
I would like to make quite sure about this implicit GROUP BY "feature",
since it seems really important yet does not seem to be mentioned in the
CQL reference documentation.

Aaron, you said "yes" -- is that "yes, always, in all scenarios no matter
what" or "yes, usually"? Is it something we can bet the farm and the
farmer's family on?

The kinds of scenarios where I am wondering if it's possible for
partition-key groups to get intermingled are:

  .  what if the node containing the primary copy of a row is down and
     cassandra fetches this row from a replica on a different node
     (e.g. with CONSISTENCY ONE)

  .  what if there is a heavy stream of UPDATE activity from applications
     which connect to all nodes, causing different nodes to have different
     versions of replicas of the same row?

Can you point me to some place in the cassandra source code where this
grouping is ensured?

Many thanks,

John Lumby

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman

I have the defaults as shown in your response.

On 09/10/2013 01:59 PM, sankalp kohli wrote:

What have you set these to?
# commitlog_sync may be either "periodic" or "batch."
# When in batch mode, Cassandra won't ack writes until the commit log
# has been fsynced to disk.  It will wait up to
# commitlog_sync_batch_window_in_ms milliseconds for other writes, before
# performing the sync.
#
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 50
#
# the other option is "periodic" where writes may be acked immediately
# and the CommitLog is simply synced every commitlog_sync_period_in_ms
# milliseconds.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 1000


On Tue, Sep 10, 2013 at 10:42 AM, Nate McCall > wrote:


With SSDs, you can turn up memtable_flush_writers - try 3
initially (1 by default) and see what happens. However, given that
there are no entries in 'All time blocked' for such, they may be
something else.

How are you inserting the data?


On Tue, Sep 10, 2013 at 12:40 PM, Keith Freeman <8fo...@gmail.com
> wrote:


On 09/10/2013 11:17 AM, Robert Coli wrote:

On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman
<8fo...@gmail.com > wrote:

On my 3-node cluster (v1.2.8) with 4-cores each and SSDs
for commitlog and data


On SSD, you don't need to separate commitlog and data. You
only win from this separation if you have a head to not-move
between appends to the commit log. You will get better IO
from a strip with an additional SSD.
Right, actually both partitions are on the same SSD.
Assuming you meant "stripe", would that really make a difference?



Pool Name          Active   Pending   Completed   Blocked   All time blocked
MutationStage           1         9      290394         0                  0
FlushWriter             1         2          20         0                  0


I can't seem to find information about the real meaning of
MutationStage. Is this just normal for lots of inserts?


The mutation stage is the stage in which mutations to rows in
memtables ("writes") occur.

The FlushWriter stage is the stage that turns memtables into
SSTables by flushing them.

However, 9 pending mutations is a very small number. For
reference on an overloaded cluster which was being written to
death I recently saw 1216434 pending MutationStage. What
problem other than "high CPU load" are you experiencing? 2
Pending FlushWriters is slightly suggestive of some sort of
bound related to flushing..

So the basic problem is that write performance is lower than I
expected.  I can't get sustained writing of 5000 ~1024-byte
records / sec at RF=2 on a good 3-node cluster, and my only
guess is that's because of the heavy CPU loads on the server
(loads over 10 on 4-CPU systems).  I've tried both a single
client writing 5000 rows/second and 2 clients (on separate
boxes) writing 2500 rows/second, and in both cases the
server(s) doesn't respond quickly enough to maintain that
rate.  It keeps up ok with 2000 or 3000 rows per second (and
has lower server loads).








VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
Hello,

We are deciding whether to get VMs or physical machines for a Cassandra
cluster. I know this is a very high-level question that depends on lots of
factors; in fact I want to know how to tackle it and what factors we should
take into consideration while trying to find the answer.

Data size? Write speed (whether the use cases are write-heavy or not)? Random
read use cases? Column family design/how we store data?

Any pointers, documents, guidance, advise would be appreciated.

Thanks a lot.

Regards,
Shahab


Re: VMs versus Physical machines

2013-09-11 Thread Aaron Turner
Physical machines unless you're running your cluster in the cloud (AWS/etc).

Reason is simple: Look how Cassandra scales and provides redundancy.

Aaron Turner
http://synfin.net/ Twitter: @synfinatic
https://github.com/synfinatic/tcpreplay - Pcap editing and replay tools for
Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin



On Wed, Sep 11, 2013 at 4:21 PM, Shahab Yunus wrote:

> Hello,
>
> We are deciding whether to get VMs or physical machines for a Cassandra
> cluster. I know this is a very high-level question depending on lots of
> factors and in fact I want to know that how to tackle this is and what
> factors should we take into consideration while trying to find the answer.
>
> Data size? Writing speed (whether write heavy usecases or not)? Random ead
> use-cases? column family design/how we store data?
>
> Any pointers, documents, guidance, advise would be appreciated.
>
> Thanks a lot.
>
> Regards,
> Shahab
>


Re: VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
Thanks Aaron for the reply. Yes, the VMs/nodes will be in the cloud if we
don't go the physical route.

" Look how Cassandra scales and provides redundancy.  "
But how does it differ for physical machines or VMs (in cloud.) Or after
your first comment, are you saying that there is no difference whether we
use physical or VMs (in cloud)?

Regards,
Shahab


On Wed, Sep 11, 2013 at 7:34 PM, Aaron Turner  wrote:

> Physical machines unless you're running your cluster in the cloud
> (AWS/etc).
>
> Reason is simple: Look how Cassandra scales and provides redundancy.
>
> Aaron Turner
> http://synfin.net/ Twitter: @synfinatic
> https://github.com/synfinatic/tcpreplay - Pcap editing and replay tools
> for Unix & Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
> -- Benjamin Franklin
>
>
>
> On Wed, Sep 11, 2013 at 4:21 PM, Shahab Yunus wrote:
>
>> Hello,
>>
>> We are deciding whether to get VMs or physical machines for a Cassandra
>> cluster. I know this is a very high-level question depending on lots of
>> factors and in fact I want to know that how to tackle this is and what
>> factors should we take into consideration while trying to find the answer.
>>
>> Data size? Writing speed (whether write heavy usecases or not)? Random
>> ead use-cases? column family design/how we store data?
>>
>> Any pointers, documents, guidance, advise would be appreciated.
>>
>> Thanks a lot.
>>
>> Regards,
>> Shahab
>>
>
>


Re: VMs versus Physical machines

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus wrote:

> But how does it differ for physical machines or VMs (in cloud.) Or after
> your first comment, are you saying that there is no difference whether we
> use physical or VMs (in cloud)?
>

Physical will always outperform virtual. He's just saying don't buy one big
physical box, virtualize on it, and then run cassandra on those VMs.

=Rob


Re: is the select result grouped by the value of the partition key?

2013-09-11 Thread Aaron Morton
> GROUP BY "feature",
I would not think of it like that; this is about the physical order of rows.

> since it seems really important yet does not seem to be mentioned in the
> CQL reference documentation.
It's baked in; this is how the data is organised on the row.

http://www.datastax.com/dev/blog/thrift-to-cql3
We often say the PRIMARY KEY is composed of the PARTITION KEY and the CLUSTERING COLUMNS (which do the grouping):
http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/create_table_r.html

See also http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html

> Is it something we can bet the farm and farmer's family on?
Sure. 

> The kinds of scenarios where I am wondering if it's possible for 
> partition-key groups
> to get intermingled are :
All instances of the table entity with the same value(s) for the PARTITION KEY
portion of the PRIMARY KEY exist in the same storage engine row.
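
For example, here is a minimal sketch in cqlsh style (the table and data are
hypothetical, just to illustrate the layout):

CREATE TABLE readings (
    sensor_id text,   -- PARTITION KEY: all rows sharing a sensor_id live in one storage engine row
    seq int,          -- clustering column: defines the order within that row
    value text,
    PRIMARY KEY (sensor_id, seq)
);

-- inserts can arrive in any order, interleaved across partition keys
INSERT INTO readings (sensor_id, seq, value) VALUES ('a', 1, 'first');
INSERT INTO readings (sensor_id, seq, value) VALUES ('b', 1, 'other');
INSERT INTO readings (sensor_id, seq, value) VALUES ('a', 2, 'second');

-- a plain scan walks one storage engine row (partition) at a time, so the two
-- 'a' rows always come back next to each other, ordered by seq, and 'b' is
-- never interleaved between them; which partition comes first is token order
SELECT * FROM readings;

That contiguity is a property of the storage layout, which is the same on every
replica, so it holds no matter which node answers the read.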

>   .   what if the node containing primary copy of a row is down
There is no primary copy of a row. 

>   .   what if there is a heavy stream of UPDATE activity from applications 
> which
>   connect to all nodes,   causing different nodes to have different 
> versions of replicas of same row?
That's fine with me. 
It's only an issue when the data is read, and at that point the Consistency 
Level determines what we do. 

Hope that helps. 


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/09/2013, at 7:43 AM, John Lumby  wrote:

> I would like to make quite sure about this implicit GROUP BY "feature",
> 
> since it seems really important yet does not seem to be mentioned in the
> CQL reference documentation.
> 
> 
> 
> Aaron,   you said "yes"  --   is that "yes,  always,   in all scenarios no 
> matter what"
> 
> or "yes usually"?  Is it something we can bet the farm and farmer's 
> family on?
> 
> 
> 
> The kinds of scenarios where I am wondering if it's possible for 
> partition-key groups
> to get intermingled are :
> 
> 
> 
>   .   what if the node containing primary copy of a row is down
> and 
> cassandra fetches this row from a replica on a different node
>(e.g.  with CONSISTENCY ONE)
> 
>   .   what if there is a heavy stream of UPDATE activity from applications 
> which
>   connect to all nodes,   causing different nodes to have different 
> versions of replicas of same row?
> 
> 
> 
> Can you point me to some place in the cassandra source code where this 
> grouping is ensured?
> 
> 
> 
> Many thanks,
> 
> John Lumby  



Re: Cassandra input paging for Hadoop

2013-09-11 Thread Aaron Morton
>> 
>> I'm looking at the ConfigHelper.setRangeBatchSize() and
>> CqlConfigHelper.setInputCQLPageRowSize() methods, but I'm a bit confused about
>> whether that's what I need and, if so, which one I should use for this purpose.
If you are using CQL 3 via Hadoop, CqlConfigHelper.setInputCQLPageRowSize is the
one you want.

It maps to the LIMIT clause of the SELECT statement the input reader will
generate; the default is 1,000.
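
Roughly the shape of what it issues per page (just a sketch with placeholder
names, not the literal generated statement; the token bounds come from the
Hadoop split):

SELECT * FROM <keyspace>.<column family>
 WHERE token(<partition key>) > <split start token>
   AND token(<partition key>) <= <split end token>
 LIMIT 1000;  -- the configured page row size

A smaller page row size just means more, smaller queries per split rather than
fewer, larger ones, which is usually what you want when the big reads are
timing out.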

A
 
-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/09/2013, at 9:04 AM, Jiaan Zeng  wrote:

> Speaking of the Thrift client, i.e. ColumnFamilyInputFormat: yes,
> ConfigHelper.setRangeBatchSize() can reduce the number of rows requested from
> Cassandra in each batch.
> 
> Depending on how big your columns are, you may also want to increase the Thrift
> message length through setThriftMaxMessageLengthInMb().
> 
> Hope that helps.
> 
> On Tue, Sep 10, 2013 at 8:18 PM, Renat Gilfanov  wrote:
>> Hi,
>> 
>> We have Hadoop jobs that read data from our Cassandra column families and
>> write some data back to other column families.
>> The input column families are pretty simple CQL3 tables without wide rows.
>> In Hadoop jobs we set up corresponding WHERE clause in
>> ConfigHelper.setInputWhereClauses(...), so we don't process the whole table
>> at once.
>> Nevertheless, sometimes the amount of data returned by the input query is big
>> enough to cause TimedOutExceptions.
>> 
>> To mitigate this, I'd like to configure the Hadoop job in such a way that it
>> sequentially fetches input rows in smaller portions.
>> 
>> I'm looking at the ConfigHelper.setRangeBatchSize() and
>> CqlConfigHelper.setInputCQLPageRowSize() methods, but I'm a bit confused about
>> whether that's what I need and, if so, which one I should use for this purpose.
>> 
>> Any help is appreciated.
>> 
>> Hadoop version is 1.1.2, Cassandra version is 1.2.8.
> 
> 
> 
> -- 
> Regards,
> Jiaan



Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Sankalp Kohli
The reason this is happening is that there are two instances of the SSTableReader
object. A restart of Cassandra will fix the issue.




On Sep 11, 2013, at 10:23, Robert Coli  wrote:

> On Wed, Sep 11, 2013 at 10:12 AM, Keith Freeman <8fo...@gmail.com> wrote:
>> I had seen that issue before, but it's marked Resolved/Fixed in v1.1.1, and 
>> I'm on v1.2.8.  Also it talks about not being able to re-create the 
>> keyspace, while my problem is that after re-creating, I eventually get 
>> FileNotFound exceptions.  This has happened to me several times in testing;
>> this is the first time I've been able to follow up and report it to the
>> mailing list.
> 
> Sorry, pasted the wrong JIRA.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-4857
> and
> https://issues.apache.org/jira/browse/CASSANDRA-4221
> for background
> 
> =Rob 


Re: VMs versus Physical machines

2013-09-11 Thread Aaron Turner
On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus wrote:

> Thanks, Aaron, for the reply. Yes, the VMs/nodes will be in the cloud if we
> don't go the physical route.
>
> "Look how Cassandra scales and provides redundancy."
> But how does it differ between physical machines and VMs (in the cloud)? Or,
> after your first comment, are you saying that there is no difference whether we
> use physical machines or VMs (in the cloud)?
>

They're different, but both can and do work... VMs just require more
virtual servers than going the physical route.

Sorry, but without you providing any actual information about your needs,
all you're going to get is generalizations and hand-waving.


Re[2]: Cassandra input paging for Hadoop

2013-09-11 Thread Renat Gilfanov
 Hello,

So does that mean the job will process only the first "cassandra.input.page.row.size"
rows and ignore the rest? Or does CqlPagingRecordReader support paging through the
entire result set?


  Aaron Morton :
>>>
>>>I'm looking at the ConfigHelper.setRangeBatchSize() and
>>>CqlConfigHelper.setInputCQLPageRowSize() methods, but I'm a bit confused about
>>>whether that's what I need and, if so, which one I should use for this purpose.
>
>If you are using CQL 3 via Hadoop, CqlConfigHelper.setInputCQLPageRowSize is the
>one you want.
>
>It maps to the LIMIT clause of the SELECT statement the input reader will
>generate; the default is 1,000.
>
>A
> 
>-
>Aaron Morton
>New Zealand
>@aaronmorton
>
>Co-Founder & Principal Consultant
>Apache Cassandra Consulting
>http://www.thelastpickle.com
>
>On 12/09/2013, at 9:04 AM, Jiaan Zeng < l.alle...@gmail.com > wrote:
>>Speaking of the Thrift client, i.e. ColumnFamilyInputFormat: yes,
>>ConfigHelper.setRangeBatchSize() can reduce the number of rows requested from
>>Cassandra in each batch.
>>
>>Depending on how big your columns are, you may also want to increase the Thrift
>>message length through setThriftMaxMessageLengthInMb().
>>
>>Hope that helps.
>>
>>On Tue, Sep 10, 2013 at 8:18 PM, Renat Gilfanov < gren...@mail.ru > wrote:
>>>Hi,
>>>
>>>We have Hadoop jobs that read data from our Cassandra column families and
>>>write some data back to other column families.
>>>The input column families are pretty simple CQL3 tables without wide rows.
>>>In Hadoop jobs we set up corresponding WHERE clause in
>>>ConfigHelper.setInputWhereClauses(...), so we don't process the whole table
>>>at once.
>>>Nevertheless, sometimes the amount of data returned by the input query is big
>>>enough to cause TimedOutExceptions.
>>>
>>>To mitigate this, I'd like to configure the Hadoop job in such a way that it
>>>sequentially fetches input rows in smaller portions.
>>>
>>>I'm looking at the ConfigHelper.setRangeBatchSize() and
>>>CqlConfigHelper.setInputCQLPageRowSize() methods, but I'm a bit confused about
>>>whether that's what I need and, if so, which one I should use for this purpose.
>>>
>>>Any help is appreciated.
>>>
>>>Hadoop version is 1.1.2, Cassandra version is 1.2.8.
>>
>>
>>
>>-- 
>>Regards,
>>Jiaan
>