Re: Unable to drop secondary index

2013-04-19 Thread Michal Michalski

Hi Aaron,


Was the schema created with CQL or the CLI?


It was created using Pycassa and - as far as I know - it was managed 
only by CLI.


> (It's not a good idea to manage one with the other)

Yes, I know - I only tried using CQL after I realized that the CLI was not 
working and I had to make it work somehow (which didn't happen, though ;-) ), 
because my secondary index was returning wrong results and I wasn't able 
to rebuild it.
However, I can't tell for sure that no-one else has ever modified it 
using CQL before.



Can you provide the schema after the update and the update cf statement?


Update CF statement? I'm not updating it, I'm just trying to drop the 
index using a DROP statement:


cli:   DROP INDEX ON Users.username;
cqlsh: DROP INDEX Users_username_idx;

No other updates have been made.
Here's the Users CF schema printed by CLI:

[default@production] describe Users;
ColumnFamily: Users
  Key Validation Class: org.apache.cassandra.db.marshal.LexicalUUIDType
  Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
  Columns sorted by: org.apache.cassandra.db.marshal.AsciiType
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: [Users.Users_active_idx, Users.Users_email_idx, Users.Users_username_idx]

  Column Metadata:
Column Name: date_created
  Validation Class: org.apache.cassandra.db.marshal.LongType
Column Name: active
  Validation Class: org.apache.cassandra.db.marshal.IntegerType
  Index Name: Users_active_idx
  Index Type: KEYS
Column Name: email
  Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Index Name: Users_email_idx
  Index Type: KEYS
Column Name: username
  Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Index Name: Users_username_idx
  Index Type: KEYS
Column Name: default_account_id
  Validation Class: org.apache.cassandra.db.marshal.LexicalUUIDType
  Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy

  Compression Options:
    sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor


cqlsh says (and shows a notification):

/usr/lib/pymodules/python2.7/cqlshlib/cql3handling.py:1519: 
UnexpectedTableStructure: Unexpected table structure; may not translate 
correctly to CQL. Compact storage CF Users has no column aliases, but 
comparator is not UTF8Type.


CREATE TABLE "Users" (
  key 'org.apache.cassandra.db.marshal.LexicalUUIDType' PRIMARY KEY,
  active varint,
  date_created bigint,
  default_account_id 'org.apache.cassandra.db.marshal.LexicalUUIDType',
  email text,
  username text
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.00 AND
  replicate_on_write='true' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

CREATE INDEX Users_active_idx ON "Users" (active);

CREATE INDEX Users_email_idx ON "Users" (email);

CREATE INDEX Users_username_idx ON "Users" (username);


M.


Re: differences between DataStax Community Edition and Cassandra package

2013-04-19 Thread Goktug YILDIRIM
I am sorry if this is a very well-known topic and I missed it. I wonder why
one must use CFS. What is unavailable in Cassandra without CFS?

Best,

-- Goktug


On Thu, Apr 18, 2013 at 9:33 PM, aaron morton wrote:

> sorry to ask in this thread, but for some time I have been wondering how CFS
> can be installed on normal Cassandra?
>
> CFS is part of the DataStax Enterprise product.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/04/2013, at 1:35 AM, Nikolay Mihaylov  wrote:
>
> I thought so,
> sorry to ask in this thread, but for some time I have been wondering how CFS
> can be installed on normal Cassandra?
>
>
> On Thu, Apr 18, 2013 at 3:23 PM, Michal Michalski wrote:
>
>> Probably Robert meant CFS:
>> http://www.datastax.com/wp-content/uploads/2012/09/WP-DataStax-HDFSvsCFS.pdf :-)
>>
>> On 18.04.2013 14:10, Nikolay Mihaylov wrote:
>>
>>> what's CDFS? I am sure you are not referring to iso9660, e.g. the CD-ROM
>>> filesystem? :)
>>>
>>> On Wed, Apr 17, 2013 at 10:42 PM, Robert Coli wrote:
>>>
>>>> On Wed, Apr 17, 2013 at 11:19 AM, aaron morton wrote:
>>>>
>>>>> It's the same as the Apache version, but DSC comes with samples and the
>>>>> free version of Ops Centre.
>>>>
>>>> DSE also comes with Solr special sauce and CDFS.
>>>>
>>>> =Rob


Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-19 Thread Stuart Broad
Hi,

I am using Cassandra.Client prepare_cql3_query/execute_prepared_cql3_query
to create and run some prepared statements.  It is working well but I am
unclear as to how long the server side 'caches' the prepared statements.
Should a prepared statement be prepared for every new Cassandra.Client?
Based on my limited testing it seems like I can create some prepared
statements in one Cassandra.Client and use them in another, but I am not sure
how reliable/lasting this is, i.e. if I called the prepared statement again
the next day, would it still exist?  What about if cassandra was re-started?

*Background:*
I am creating prepared statements for batch updates of pre-defined lengths
(e.g. 1, 1000, 500, 250, 50, 10, 1) and wanted to know if these could
just be set up once.  We felt that using the prepared statements was easier
than escaping values within a CQL statement and probably more performant.

Thanks in advance for your help.

Regards,

Stuart

p.s. I am relatively new to cassandra.


Re: Key-Token mapping in cassandra

2013-04-19 Thread Hiller, Dean
It should be Key4 --> 123

And you store it in the Image CF or the Documents CF or the MultiMedia CF… then 
it all ends up on the same machine, even though the CFs break the row apart.

Dean

From: Alicia Leong <lccali...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, April 18, 2013 8:24 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Key-Token mapping in cassandra

Hi Ravi

Key1 --> 123/IMAGE
Key2 --> 123/DOCUMENTS
Key3 --> 123/MULTIMEDIA


Which one is your ROW KEY?  Is it Key1, Key2, Key3?




On Thu, Apr 18, 2013 at 3:56 PM, aaron morton <aa...@thelastpickle.com> wrote:
All rows with the same key go on the same nodes. So if you use the same row key 
in different CF's they will be on the same nodes. i.e. have CF's called Image, 
Documents, Meta and store rows in all of them with the 123 key.
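
For illustration, a minimal CLI sketch of that layout (hypothetical CF and
column names):

[default@ks] create column family Image with key_validation_class = UTF8Type and comparator = UTF8Type and default_validation_class = UTF8Type;
[default@ks] create column family Documents with key_validation_class = UTF8Type and comparator = UTF8Type and default_validation_class = UTF8Type;
[default@ks] set Image['123']['cover.jpg'] = 'image blob';
[default@ks] set Documents['123']['readme.txt'] = 'document blob';

Both rows use the key '123', so they hash to the same token and are stored on 
the same replicas.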

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/04/2013, at 1:32 PM, Ravikumar Govindarajan 
<ravikumar.govindara...@gmail.com> wrote:

Thanks Aaron.
 We are looking at co-locating all keys for a given user in one Cassandra node.
Are there any other ways to achieve this?

--
Ravi

On Thursday, April 18, 2013, aaron morton wrote:
CASSANDRA-1034
That ticket is about removing an assumption which was not correct.

I would like all keys with "123" as prefix to be mapped to a single token.
Why?
it's not possible nor desirable IMHO. Tokens are used to identify a single row 
internally.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/04/2013, at 11:25 PM, Ravikumar Govindarajan 
 wrote:

We would like to map multiple keys to a single token in cassandra. I believe 
this should be possible now with CASSANDRA-1034

Ex:

Key1 --> 123/IMAGE
Key2 --> 123/DOCUMENTS
Key3 --> 123/MULTIMEDIA

I would like all keys with "123" as prefix to be mapped to a single token.

Is this possible? What should be the Partitioner that I should most likely 
extend and write my own to achieve the desired result?

--
Ravi





Datastax Java Driver connection issue

2013-04-19 Thread Abhijit Chanda
Hi,

I have downloaded the CQL driver provided by Datastax using

<dependency>
  <groupId>com.datastax.cassandra</groupId>
  <artifactId>cassandra-driver-core</artifactId>
  <version>1.0.0-beta2</version>
</dependency>

Then tried a sample program to connect to the cluster
Cluster cluster = Cluster.builder()
.addContactPoints(db1)
.withPort(9160)
.build();

But sadly it's returning
com.datastax.driver.core.exceptions.NoHostAvailableException:
All host(s) tried for query failed

I am using cassandra 1.2.2

Can anyone suggest what's wrong with that?

And I am really sorry for posting a Datastax Java driver related question in
this forum; I can't find a better place for an instant reaction.


-Abhijit


Re: Datastax Java Driver connection issue

2013-04-19 Thread Gabriel Ciuloaica

Have you started the native transport on cassandra nodes?

Look into the cassandra.yaml file for start_native_transport. By default it is disabled.
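
For reference, a minimal sketch of the relevant settings (assuming the stock
1.2 cassandra.yaml; 9042 is the default native port):

start_native_transport: true
native_transport_port: 9042

Note also that withPort(9160) in the original snippet points the driver at the
Thrift port; the native protocol listens on 9042, or you can simply omit
withPort() to take the default.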

Br,
Gabi




Re: Datastax Java Driver connection issue

2013-04-19 Thread Keith Wright
Did you enable the binary protocol in Cassandra.yaml?



Re: Datastax Java Driver connection issue

2013-04-19 Thread Abhijit Chanda
@Gabriel, @Wright: thanks, how silly of me.





-- 
-Abhijit


Re: Unable to drop secondary index

2013-04-19 Thread Michal Michalski
It seems we can't update schemas at all. I tried to change 
read_repair_chance and the result looks the same. However, in this case I'm 99% 
sure that some of the CFs I tried to update were NOT updated using CQL 
*ever* - only CLI. Not good...


But, as I mentioned before - we did the same on the test cluster (probably 
with even more CLI & CQL mixing) and it works there.


M.



Advice on memory warning

2013-04-19 Thread Michael Theroux
Hello,

We've recently upgraded from m1.large to m1.xlarge instances on AWS to handle 
additional load, but also to relieve memory pressure.  It appears to have 
accomplished both; however, we are still getting a warning, 0-3 times a day, on 
our database nodes:

WARN [ScheduledTasks:1] 2013-04-19 14:17:46,532 GCInspector.java (line 145) 
Heap is 0.7529240824406468 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically

This is happening much less frequently than before the upgrade, but after 
essentially doubling the amount of available memory, I'm curious on what I can 
do to determine what is happening during this time.  

I am collecting all the JMX statistics.  Memtable space is elevated but not 
extraordinarily high.  No GC messages are being output to the log.   

These warnings do seem to occur during compactions of column families 
using LCS with wide rows, but I'm not sure there is a direct correlation.

We are running Cassandra 1.1.9, with a maximum heap of 8G.  
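
For reference, the 1.1-era cassandra.yaml thresholds this warning refers to
are (stock defaults shown):

flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6

Raising flush_largest_memtables_at would only quiet the safety valve rather
than address the underlying pressure, so I'd rather understand the cause.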

Any advice?
Thanks,
-Mike

Re: Moving cluster

2013-04-19 Thread Kais Ahmed
Hello and thank you for your answers.

The first solution is much easier for me because I use vnodes.

What is the risk of the first solution?

thank you,


2013/4/18 aaron morton 

> This is roughly the lift and shift process I use.
>
> Note that disabling thrift and gossip does not stop an existing repair
> session. So I often drain and then shutdown, and copy the live data dir
> rather than a snapshot dir.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/04/2013, at 4:10 AM, Michael Theroux  wrote:
>
> This should work.
>
> Another option is to follow a process similar to what we recently did.  We
> recently and successfully upgraded 12 instances from large to xlarge
> instances in AWS.  I chose not to replace nodes as restoring data from the
> ring would have taken significant time and put the cluster under some
> additional load.  I also wanted to eliminate the possibility that any
> issues on the new nodes could be blamed on new configuration/operating
> system differences.  Instead we followed the following procedure (removing
> some details that would likely be unique to our infrastructure).
>
> For a node being upgraded:
>
> 1) nodetool disable thrift
> 2) nodetool disable gossip
> 3) Snapshot the data (nodetool snapshot ...)
> 4) Backup the snapshot data to EBS (assuming you are on ephemeral)
> 5) Stop cassandra
> 6) Move the cassandra.yaml configuration file to cassandra.yaml.bak (to
> prevent any future restart from starting cassandra unintentionally)
> 7) Shutdown the instance
> 8) Take an AMI of the instance
> 9) Start a new instance from the AMI with the desired hardware
> 10) If you assign the new instance a new IP Address, make sure any entries
> in /etc/hosts, or the broadcast_address in cassandra.yaml is updated
> 11) Attach the volume you backed up your snapshot data to to the new
> instance and mount it
> 12) Restore the snapshot data
> 13) Restore cassandra.yaml file
> 13) Restart cassandra
>
> - I recommend practicing this on a test cluster first
> - As you replace nodes with new IP Addresses, eventually all your seeds
> will need to be updated.  This is not a big deal until all your seed nodes
> have been replaced.
> - Don't forget about NTP!  Make sure it is running on all your new nodes.
>  Myself, to be extra careful, I actually deleted the ntp drift file and let
> NTP recalculate it because it's a new instance, and it took over an hour to
> restore our snapshot data... but that may have been overkill.
> - If you have the opportunity, depending on your situation, increase
> the max_hint_window_in_ms
> - Your details may vary
>
> Thanks,
> -Mike
>
> On Apr 18, 2013, at 11:07 AM, Alain RODRIGUEZ wrote:
>
> I would say add your 3 servers to the 3 tokens where you want them, let's
> say :
>
> {
> "0": {
> "0": 0,
> "1": 56713727820156410577229101238628035242,
> "2": 113427455640312821154458202477256070485
> }
> }
>
> or these tokens -1 or +1 if you already have those tokens in use. And then
> just decommission the m1.xlarge nodes. You should be good to go.
>
>
>
> 2013/4/18 Kais Ahmed 
>
>> Hi,
>>
>> What is the best pratice to move from a cluster of 7 nodes (m1.xlarge) to
>> 3 nodes (hi1.4xlarge).
>>
>> Thanks,
>>
>
>
>
>


Re: differences between DataStax Community Edition and Cassandra package

2013-04-19 Thread Robert Coli
On Fri, Apr 19, 2013 at 4:18 AM, Goktug YILDIRIM wrote:
>
> I am sorry if this is a very well-known topic and I missed it. I wonder why
> one must use CFS. What is unavailable in Cassandra without CFS?


The SOLR stuff (also only-in-DSE) uses CFS for Hadoop-like storage.
This is so you can use full SOLR support with DSE Cassandra without
needing Hadoop.

http://www.datastax.com/dev/blog/cassandra-file-system-design
"
The Cassandra File System (CFS) is an HDFS compatible filesystem built
to replace the traditional Hadoop NameNode, Secondary NameNode and
DataNode daemons. It is the foundation of our Hadoop support in
DataStax Enterprise.

The main design goals for the Cassandra File System were to first,
simplify the operational overhead of Hadoop by removing the single
points of failure in the Hadoop NameNode. Second, to offer easy Hadoop
integration for Cassandra users (one distributed system is enough)
"

=Rob


Re: Key-Token mapping in cassandra

2013-04-19 Thread Ravikumar Govindarajan
I think I have simplified my example a little too much.

Lets assume that there are groups and users.

Ideally a grpId becomes the key and it holds some meta-data.

Lets say GroupMetaCF

grpId --> key, entityId --> col-name, blobdata --> col-value

Now we have a UserTimeSeriesCF

grpId/userId --> key, UUID --> col-name, entityId --> col-value

[Each user will view a subset of the grp data, based on roles etc...]

There are many more such CFs, all with keys prefixed by grpId. By hashing grpId
to cassandra's token, I thought we could co-locate all of a group's data on
one set of replica nodes.

Is there a way to achieve this?

--
Ravi




Batch get queries

2013-04-19 Thread Keith Wright
Hi all,

   I am using C* 1.2.4 with CQL3 and Astyanax to consume a large amount of 
user-based data (around 50-100K / sec).  Requests come in based on user cookies, 
which I then need to link to a user (as users can change their cookies).  This 
is done using a link table:

CREATE TABLE cookie_user_lookup (
cookie TEXT PRIMARY KEY,
user_id BIGINT,
creation_time TIMESTAMP
) with  
compression={'crc_check_chance':0.1,'sstable_compression':'LZ4Compressor'} and
compaction={'class':'LeveledCompactionStrategy'} and
gc_grace_seconds = 86400;

As I said, I am handling a large number of these per second and wanted to get 
your take on how best to do the lookup.  I find that there are 3 ways:

 *   Serially fetch 1 by 1.  The latency is very low at 0.1 ms, but multiplying 
that by thousands per second becomes substantial.  This is too slow.
 *   Serially fetch 1 by 1 but on separate threads.  This would require a very 
large number of concurrent connections (unless I change to datastax's binary 
protocol) as well as threads.  Seems heavy.
 *   Batch fetch.  This is what I'm doing now, where I build a very large select 
* from cookie_user_lookup where cookie in (a,b,c,.. etc); see the sketch below. 
I am actually doing around 10K of these at a time and getting a response time 
in my cluster of around 100 ms.  This is very acceptable, but wanted to get 
everyone's take as I have seen messages about this "starving" the request pool. 
Note that I'm running in HSHA and am rarely seeing any reads waiting.
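
For concreteness, the batched lookup is essentially (hypothetical cookie
values; the real statement carries ~10K of them):

SELECT cookie, user_id
FROM cookie_user_lookup
WHERE cookie IN ('cookie-a', 'cookie-b', 'cookie-c');

Selecting cookie alongside user_id lets me match results back to requests,
since rows for unknown cookies simply don't come back.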

I appreciate your input!


index filter

2013-04-19 Thread Kanwar Sangha
Guys - Quick question. The index filter file created for an sstable contains all 
keys/index offsets for the sstable? I know that when we bring up the node, it 
reads a sample of the keys from this file. So this file contains all keys and a 
sample is read on startup?

Thanks,
Kanwar





Re: index filter

2013-04-19 Thread Robert Coli
On Fri, Apr 19, 2013 at 10:38 AM, Kanwar Sangha  wrote:
> Guys – Quick question. The index filter file created for a sstable contains
> all keys/index offset for a sstable ? I know that when we bring up the node,
> it reads a sample of the keys from this file. So this file contains all keys
> and a sample is read on startup ?

The -Index file and the -Filter file are two different things. The
-Index file is what you describe. The -Filter is a bloom filter.

=Rob


Re: CQL

2013-04-19 Thread Sri Ramya
Please can anybody help me solve the above problem? I am stuck with it
in my project. Help me.


Re: CQL

2013-04-19 Thread David McNelis
In order to do a query like that you'll need to have a timestamp/date as
the second portion of the primary key.

You'll only be able to do queries where you already know the key.  Unless
you're using an OrderPreservingPartitioner, there is no way to get a
continuous set of information back based on the key for the row (the first
key if you have a multi-column primary key).
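
For example, a minimal sketch (hypothetical table) of a layout that supports
greater-than/less-than timestamp queries:

CREATE TABLE sensor_events (
  sensor_id text,
  event_time timestamp,
  payload text,
  PRIMARY KEY (sensor_id, event_time)
);

SELECT payload FROM sensor_events
WHERE sensor_id = 'sensor-42'
AND event_time >= '2013-04-01'
AND event_time < '2013-04-19';

The range predicate works because event_time is a clustering column: rows
within a sensor_id partition are stored in time order.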

I suggest you take a read of this DataStax post about time series data:
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

Last, this is definitely a message better served by the user list and not
the dev list (which is primarily for communication about development
efforts of cassandra).

Good luck.


On Fri, Apr 19, 2013 at 1:06 PM, Sri Ramya  wrote:

> Please can any body help me to solve the above problem. I stuck with that
> problem in my project. help me
>


Re: CQL

2013-04-19 Thread Michael Theroux
A lot more details on your use case and requirements would help.  You need to 
make specific considerations in cassandra when you have requirements around 
ordering.  Ordering can be achieved across columns.  Ordering across rows is a 
bit trickier and may require the use of specific partitioners...

I did a real quick google search on "cassandra ordering" and found some good 
links (http://ayogo.com/blog/sorting-in-cassandra/).

To do queries with ordering across columns, using the "FIRST" keyword will 
return results based on the comparator you defined in your schema:

http://cassandra.apache.org/doc/cql/CQL.html#SELECT
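
For example, a quick sketch in the older CQL 2 syntax (hypothetical CF; FIRST
limits the number of columns returned, in comparator order):

SELECT FIRST 10 * FROM events WHERE KEY = 'user42';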

Hope this helps,
-Mike

On Apr 19, 2013, at 7:32 AM, Sri Ramya wrote:

> hi,
> 
> I am working with CQL. I want to perform a query based on timestamp. Can 
> anyone help me out with how to get dates greater than or less than a given 
> timestamp in cassandra.



Re: index filter

2013-04-19 Thread Kanwar Sangha
Let me rephrase. I am talking about the index file on disk created per sstable. 
Does that contain all key indexes?

Sent from Samsung mobile

Robert Coli  wrote:


On Fri, Apr 19, 2013 at 10:38 AM, Kanwar Sangha  wrote:
> Guys – Quick question. The index filter file created for a sstable contains
> all keys/index offset for a sstable ? I know that when we bring up the node,
> it reads a sample of the keys from this file. So this file contains all keys
> and a sample is read on startup ?

The -Index file and the -Filter file are two different things. The
-Index file is what you describe. The -Filter is a bloom filter.

=Rob


Re: index filter

2013-04-19 Thread Robert Coli
On Fri, Apr 19, 2013 at 11:36 AM, Kanwar Sangha  wrote:
> Let me rephrase. I am talking about the index file on disk created per 
> sstable. Does that contain all key indexes?

Yes. That's how sstablekeys can relatively efficiently produce a list
of keys, by parsing the index file.
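
For example (a sketch; the exact data path and file names depend on your
install and sstable generation):

$ sstablekeys /var/lib/cassandra/data/MyKeyspace/MyCF/MyKeyspace-MyCF-hf-1-Data.db

sstablekeys takes the -Data.db file as its argument and reads the keys out of
the matching -Index.db component.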

=Rob


[RELEASE] Apache Cassandra 1.1.11

2013-04-19 Thread Eric Evans

The Cassandra team is pleased to announce the release of Apache Cassandra
version 1.1.11.

Downloads of source and binary distributions are listed on the website:

  http://cassandra.apache.org/download/

As usual this release includes a Debian package, but you can expect
some delay in seeing it published to the APT repository.  In the meantime
you can download it manually from:

  http://people.apache.org/~eevans

This is a maintenance/bug fix release[1] on the 1.1 series. As always,
please pay careful attention to the release notes[2] and let us know[3]
right away if you have any problems.

Enjoy,

[1]: http://goo.gl/QfZlg (CHANGES.txt)
[2]: http://goo.gl/O55QF (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA
[4]: http://goo.gl/KbiRm (CHEC)

-- 
Eric Evans
eev...@sym-link.com


Re: Datastax API which uses Binary protocol- Quick question

2013-04-19 Thread Tyler Hobbs
On Thu, Apr 18, 2013 at 9:02 PM, Techy Teck  wrote:

>
> When I was working with Cassandra CLI using the Netflix client(Astyanax
> client), then I created the column family like this-
>
> create column family profile
> with key_validation_class = 'UTF8Type'
> and comparator = 'UTF8Type'
> and default_validation_class = 'UTF8Type'
> and column_metadata = [
>   {column_name : crd, validation_class : 'DateType'}
>   {column_name : lmd, validation_class : 'DateType'}
>   {column_name : account, validation_class : 'UTF8Type'}
>   {column_name : advertising, validation_class : 'UTF8Type'}
>   {column_name : behavior, validation_class : 'UTF8Type'}
>   {column_name : info, validation_class : 'UTF8Type'}
>   ];
>
> Now I was trying to do the same thing using Datastax API. So to start
> working with Datastax API, do I need to create the column family in some
> different way as mentioned above? Or the above column familiy will work
> fine whenever I will try to insert data into Cassandra database using
> Datastax API.
>

If this column family already exists, the java-driver will be able to use
it.  It will resemble a column family created WITH COMPACT STORAGE through
cql3.


>
> If the above column family will not work then-
>
> First of all I have created the KEYSPACE like below-
>
> `CREATE KEYSPACE USERS WITH strategy_class = 'SimpleStrategy' AND
> strategy_options:replication_factor = '1';`
>
> Now I am confuse how to create the table? I am not sure which is the right
> way to do that?
>
> Should I create like this?
>
> `CREATE TABLE profile (
> id varchar,
> account varchar,
> advertising varchar,
> behavior varchar,
> info varchar,
> PRIMARY KEY (id)
> );`
>
> or should I create like this?
>
> `CREATE COLUMN FAMILY profile (
> id varchar,
> account varchar,
> advertising varchar,
> behavior varchar,
> info varchar,
> PRIMARY KEY (id)
> );`
>

You can use either "TABLE" or "COLUMN FAMILY".  They are equivalent.


>
> And also how to add-
>
> crd as DateType
> lmd as DateType
>
> in above table or column family while working with Datastax API?
>

In cql3, "timestamp" corresponds to DateType.

Use ALTER TABLE to add columns to a table:
http://www.datastax.com/docs/1.2/cql_cli/cql/ALTER_TABLE#cql-alter-columnfamily
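
For instance, a sketch against the profile table from above:

ALTER TABLE profile ADD crd timestamp;
ALTER TABLE profile ADD lmd timestamp;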

-- 
Tyler Hobbs
DataStax 


Re: differences between DataStax Community Edition and Cassandra package

2013-04-19 Thread Nikolay Mihaylov
Are there alternative file systems running on top of cassandra?





Re: Prepared Statement - cache duration (CQL3 - Cassandra 1.2.4)

2013-04-19 Thread Sorin Manolache

On 2013-04-19 13:57, Stuart Broad wrote:

Hi,

I am using Cassandra.Client
prepare_cql3_query/execute_prepared_cql3_query to create and run some
prepared statements.  It is working well but I am unclear as to how long
the server side 'caches' the prepared statements.  Should a prepared
statement be prepared for every new Cassandra.Client?  Based on my
limited testing it seems like I can create some prepared statements in
one Cassandra.Client and use in another but I am not sure how
reliable/lasting this is i.e.  If I called the prepared statement again
the next day would it still exist?  What about if cassandra was re-started?


I don't know the answer and I have the same question, but have a look at 
this discussion, dating from September 2012:


https://issues.apache.org/jira/browse/CASSANDRA-4449

Apparently prepared statements are now shared server-wide (they were 
per-connection previously), they don't survive server restarts, there is an 
LRU eviction mechanism, and you get a special exception if a prepared 
statement has "disappeared", so you can prepare it again.
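
If that's right, a guard like this around the execute call should be enough (a
rough Java sketch against the 1.2 Thrift API; the retry-on-eviction behaviour
is my reading of the ticket, not something I have verified):

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Compression;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.CqlResult;
import org.apache.cassandra.thrift.InvalidRequestException;
import org.apache.cassandra.utils.ByteBufferUtil;

public class RetryingPreparedExecutor {
    private final Cassandra.Client client;
    private final String cql;
    private int stmtId = -1;

    public RetryingPreparedExecutor(Cassandra.Client client, String cql) {
        this.client = client;
        this.cql = cql;
    }

    public CqlResult execute(List<ByteBuffer> values) throws Exception {
        if (stmtId < 0)
            stmtId = prepare();
        try {
            return client.execute_prepared_cql3_query(stmtId, values, ConsistencyLevel.QUORUM);
        } catch (InvalidRequestException evictedOrRestarted) {
            // The id is no longer known to the server (eviction or restart):
            // prepare the statement again and retry once.
            stmtId = prepare();
            return client.execute_prepared_cql3_query(stmtId, values, ConsistencyLevel.QUORUM);
        }
    }

    private int prepare() throws Exception {
        return client.prepare_cql3_query(ByteBufferUtil.bytes(cql), Compression.NONE).getItemId();
    }
}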


Regards,
Sorin







Re: Cassandra bulk import confusion

2013-04-19 Thread Viksit Gaur
> On 30 Jul 2011, at 04:10, Jeff Schmidt wrote:
> Hello:
> I'm relatively new to Cassandra, but I've been searching around, and it
> looks like Cassandra 0.8.x has improved support for bulk importing of
> data.  I keep finding references to the json2sstable command, and I've
> read about that on the Datastax and Apache documentation pages.
>
> There's a lot of detail here if you want it, otherwise please skip to
> the end. json2sstable seems to run successfully, but I cannot see the
> data in the new CF using the CLI.

This is a reply after a long time, but the main way to resolve this is:

- Run nodetool refresh <keyspace> <cfname>
- Ensure that the data files are named correctly according to the
  keyspace-cfname convention
- Viksit



Re: Datastax API which uses Binary protocol- Quick question

2013-04-19 Thread Techy Teck
Thanks a lot Tyler. That clears lot of my doubt. I have couple more
questions related to Datastax Java Driver-

1) Firstly, is there any way to figure out what version of CQL I am
running? Is it CQL 3 or something else? Is there any command that we can
use to check? And also, is CQL3 behavior off by default so that I need to
enable it, or does it come by default in all Cassandra versions? By the
way, I am running Cassandra 1.2.3.

2) Secondly, I have created my column family in my keyspace like this-

 create column family profile
 with key_validation_class = 'UTF8Type'
 and comparator = 'UTF8Type'
 and default_validation_class = 'UTF8Type'
 and column_metadata = [
 {column_name : crd, validation_class : 'DateType'}
 {column_name : lmd, validation_class : 'DateType'}
 {column_name : account, validation_class : 'UTF8Type'}
 {column_name : advertising, validation_class : 'UTF8Type'}
 {column_name : behavior, validation_class : 'UTF8Type'}
 {column_name : info, validation_class : 'UTF8Type'}
 ];

Now I am trying to upsert data into the above column family. I am not able to
understand how I should upsert the data, as I am not able to find much
documentation with a simple example. Below is my upsert method, which has
two parameters-

userId and columnsNameAndValue

columnsNameAndValue is a map containing the column name as the key and the
corresponding value as the value.

/**
 * Performs an upsert of the specified attributes for the specified id.
 */
public void upsertAttributes(final String userId, final Map<String, String> columnsNameAndValue) {

    // I am not sure what I am supposed to do here to upsert the data?

}

Can you provide an example how to do that?


3) Thirdly, my last question- I am also trying to retrieve the data from
Cassandra using the same Datastax Java Driver, given a row key-


/**
 * Retrieves and returns the specified attributes for the specified id.
 * @return a Map of attribute name and its corresponding value
 */
public Map<String, String> getAttributes(final String userId, final Collection<String> columnNames) {

    // Now I am not sure what to do here either, to retrieve the data from
    // Cassandra using the Datastax Java Driver.
    // Here columnNames will be the list of columns that I want to retrieve
    // from Cassandra, with userId as the row key.

}

Any example on this will also be of great help.


I am totally new to Cassandra and the Datastax Java driver, so that is the
reason I am having problems.

Thanks for the help.





Re: Datastax Java Driver connection issue

2013-04-19 Thread Techy Teck
I am also running into this problem. I have already enabled
start_native_transport: true

And by this, I am trying to make a connection-

private CassandraDatastaxConnection() {
    try {
        cluster = Cluster.builder().addContactPoint("localhost").build();
        session = cluster.connect("my_keyspace");
    } catch (NoHostAvailableException e) {
        throw new RuntimeException(e);
    }
}

And every time it gives me the same exception-

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
tried for query failed (tried: [localhost/127.0.0.1])

Any idea how to fix this problem?

Thanks for the help.


Building SSTables using SSTableSimpleUnsortedWriter (v. 1.2.3)

2013-04-19 Thread David McNelis
Was trying to do a test of writing SSTs for a CQL3 table.  So I created the
following table:

CREATE TABLE test_sst_load (
  mykey1 ascii,
  mykey2 ascii,
  value1 ascii,
  PRIMARY KEY (mykey1, mykey2)
)

I then set up my writer like so: (moved to gist:
https://gist.github.com/dmcnelis/5424756 )

This created my SST files ok and they imported without throwing any sorts
of errors (had -v and --debug on) when using sstableloader.

When I went to query my data in cqlsh, I got an rpc error.  In my
system.log I saw an exception: java.lang.RuntimeException:
java.lang.IllegalArgumentException (also at the gist above).

I had a feeling that it wouldn't work... but I can't see a way with the
SSTableSimpleUnsortedWriter (or in the AbstractSSTableWriter) to create an
sstable file that is going to work with CQL3 tables.  I know it's got to be
possible; I can import SSTs with the sstableloader from one cluster to
another, where the tables are CQL3.

What am I missing here?
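
My best guess so far (a rough, untested sketch against the 1.2 internals,
with hypothetical paths and values) is that the writer's comparator has to be
the CompositeType behind the CQL3 layout - component 0 the clustering column
(mykey2) and component 1 the CQL column name - and that each CQL row also
needs the empty "row marker" cell:

import java.io.File;
import java.util.Arrays;

import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.db.marshal.AsciiType;
import org.apache.cassandra.db.marshal.CompositeType;
import org.apache.cassandra.db.marshal.UTF8Type;
import org.apache.cassandra.dht.Murmur3Partitioner;
import org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter;
import org.apache.cassandra.utils.ByteBufferUtil;

public class Cql3SstWriter {
    public static void main(String[] args) throws Exception {
        // Comparator mirroring PRIMARY KEY (mykey1, mykey2):
        // component 0 = mykey2 (ascii), component 1 = the CQL column name.
        CompositeType comparator = CompositeType.getInstance(
                Arrays.<AbstractType<?>>asList(AsciiType.instance, UTF8Type.instance));

        SSTableSimpleUnsortedWriter writer = new SSTableSimpleUnsortedWriter(
                new File("/tmp/ks/test_sst_load"), new Murmur3Partitioner(),
                "ks", "test_sst_load", comparator, null, 64);

        long ts = System.currentTimeMillis() * 1000;  // microseconds

        writer.newRow(ByteBufferUtil.bytes("partition-1"));  // mykey1

        // CQL3 row marker: (mykey2, "") -> empty value
        writer.addColumn(
                new CompositeType.Builder(comparator)
                        .add(ByteBufferUtil.bytes("clustering-1"))
                        .add(ByteBufferUtil.EMPTY_BYTE_BUFFER).build(),
                ByteBufferUtil.EMPTY_BYTE_BUFFER, ts);

        // The actual cell: (mykey2, "value1") -> value
        writer.addColumn(
                new CompositeType.Builder(comparator)
                        .add(ByteBufferUtil.bytes("clustering-1"))
                        .add(ByteBufferUtil.bytes("value1")).build(),
                ByteBufferUtil.bytes("some value"), ts);

        writer.close();
    }
}

Does that look right, or is there something else the CQL3 layout needs?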