about the consistency level

2011-01-16 Thread raoyixuan (Shandy)
How to set the consistency level in Cassandra 0.7? I mean what command?

Re: balancing load

2011-01-16 Thread aaron morton
The nodes will not automatically delete stale data; to do that you need to run nodetool cleanup. See step 3 in Range Changes > Bootstrap: http://wiki.apache.org/cassandra/Operations#Range_changes If you are feeling paranoid beforehand, you could run nodetool repair on each node in turn to

RE: balancing load

2011-01-16 Thread raoyixuan (Shandy)
You can issue the nodetool cleanup to clean up the data in old nodes. -Original Message- From: Karl Hiramoto [mailto:k...@hiramoto.org] Sent: Monday, January 17, 2011 3:34 PM To: user@cassandra.apache.org Subject: Re: balancing load Thanks for the help. I used "nodetool move", so now ea

Re: balancing load

2011-01-16 Thread Karl Hiramoto
Thanks for the help. I used "nodetool move", so now each node owns 20% of the space, but it seems that the data load is still mostly on 2 nodes. nodetool --host slave4 ring Address Status State Load Owns Token

RE: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-16 Thread Viktor Jevdokimov
- Cassandra 0.7 beta1 on virtual Windows Server 2008 64bit machines (8 total). - In-house built C# client for .NET app connecting using Thrift; it was worth it to build our own client. - 150M transactions/day load and growing.

Re: is it possible to map an one from a a file and an one from cassandra?

2011-01-16 Thread Jun Young Kim
thanks for all. - Junyoung Kim (juneng...@gmail.com) On 01/17/2011 10:58 AM, Aaron Morton wrote: Thanks for the update. Aaron On 17 Jan, 2011,at 02:51 PM, Brandon Williams wrote: 2011/1/16 Jun Young Kim > Hi aron. I think that if the pig is able to

Re: is it possible to map an one from a a file and an one from cassandra?

2011-01-16 Thread Aaron Morton
Thanks for the update. Aaron On 17 Jan, 2011, at 02:51 PM, Brandon Williams wrote: 2011/1/16 Jun Young Kim Hi aron. I think that if the pig is able to support to map it, the same job could be represented in java code itself. I believe that we can call a map function by loading

Re: is it possible to map an one from a a file and an one from cassandra?

2011-01-16 Thread Brandon Williams
2011/1/16 Jun Young Kim > Hi aron. > > I think that if the pig is able to support to map it, the same job could be > represented in java code itself. > > I believe that we can call a map function by loading a file and cassandra > at the same time. > > Ps) I dont need to join from them. I just wan

Re: RE: Cassandra and -XX:+UseLargePages

2011-01-16 Thread Aaron Morton
I pretty much had the same thoughts as you. I think setting -Xms to the same value as -Xmx helps the JVM allocate all the memory up front. See the comments in conf/cassandra-env.sh. I was confident in defeating the imaginary bats, but was a little concerned that they should turn to dragons if I ignored the

RE: Cassandra and -XX:+UseLargePages

2011-01-16 Thread David Dabbs
-Original Message- From: Aaron Morton [mailto:aa...@thelastpickle.com] Sent: Sunday, January 16, 2011 7:07 PM To: user@cassandra.apache.org Subject: Re: Cassandra and -XX:+UseLargePages Chris, could you provide some more info on your experience? Were you using mmapped files? Using row or

Re: best way to do a count

2011-01-16 Thread Aaron Morton
Ah, was looking at something old. There is also a multiget_count() :) Sounds like you're in business. Aaron On 17/01/2011, at 11:08 AM, Brandon Williams wrote: > On Sun, Jan 16, 2011 at 3:36 PM, Aaron Morton wrote: > Not that I know of. > > In 0.7 you have to pass a predicate to get_count (an

Re: Cassandra and -XX:+UseLargePages

2011-01-16 Thread Chris Goffinet
DiskAccessMode: mmap; mlockall() used; no row caches; default key cache; default memtable thresholds. -Chris On Jan 16, 2011, at 5:06 PM, Aaron Morton wrote: > Chris, could you provide some more info on your experience? Were you using > mmapped files? Using row or key caches? What were the memtable thre

Re: Cassandra and -XX:+UseLargePages

2011-01-16 Thread Aaron Morton
Chris, could you provide some more info on your experience? Were you using mmapped files? Using row or key caches? What were the memtable thresholds? Using mlockall()? There are a couple of issues listed in the first paragraphs here that at first glance may cause issues http://www.oracle.com/te

Re: is it possible to map an one from a a file and an one from cassandra?

2011-01-16 Thread Aaron Morton
Yup, everything you can do in Pig is doable in normal Hadoop. When you say you want to compare the keys, you're sort of doing an outer join. That's why I thought Pig may make your life a bit easier. Good luck. Aaron On 17/01/2011, at 1:07 PM, Jun Young Kim wrote: > Hi aron. > > I think that

Re: Cassandra and -XX:+UseLargePages

2011-01-16 Thread Chris Goffinet
I've seen about a 13% improvement in practice. -Chris On Jan 16, 2011, at 4:01 PM, David Dabbs wrote: > Hello. > > Can anyone comment on the performance impact (positive or negative) > of running Cassandra configured to use large pages under Linux? > Yes, YMMV applies, but I thought I'd ask be

Re: is it possible to map an one from a a file and an one from cassandra?

2011-01-16 Thread Jun Young Kim
Hi Aaron. I think that if Pig is able to support this mapping, the same job could be written in Java code itself. I believe that we can call a map function that loads a file and Cassandra at the same time. P.S. I don't need to join them. I just want to compare the keys which are read from

Cassandra and -XX:+UseLargePages

2011-01-16 Thread David Dabbs
Hello. Can anyone comment on the performance impact (positive or negative) of running Cassandra configured to use large pages under Linux? Yes, YMMV applies, but I thought I'd ask before enlisting sysadmin Fu, etc. Thanks! David

Re: Bulk Loader for Cassandra 0.6.8

2011-01-16 Thread Jonathan Ellis
We're working on a new one for 0.7.1 (https://issues.apache.org/jira/browse/CASSANDRA-1278) but it won't be backported to 0.6. On Sun, Jan 16, 2011 at 11:36 AM, akshatbakli...@gmail.com wrote: > Hi All, > I am a newbie experimenting with Cassandra. While going through various > client options ava

Re: best way to do a count

2011-01-16 Thread Brandon Williams
On Sun, Jan 16, 2011 at 3:36 PM, Aaron Morton wrote: > Not that I know of. > In 0.7 you have to pass a predicate to get_count (and use a small hack to get the old behavior: https://github.com/driftx/Telephus/blob/0.7/telephus/client.py#L109 ) -Brandon

Re: balancing load

2011-01-16 Thread Peter Schuller
> So for full cluster balance required invoke nodetool move sequential over > all tokens? For a new cluster, the recommended method is to pre-calculate the tokens and bring nodes up with appropriate tokens. For existing clusters, it depends. E.g. if you're doubling the amount of nodes you can jus

Re: Cassandra in less than 1G of memory?

2011-01-16 Thread Peter Schuller
> bigger and bigger) that cassandra ram memory consumption is going through > the roof. mmap():ed memory will be counted as virtual address space. Disable mmap() and use standard I/O if you want to see how it behaves for real; then if you want mmap() for performance you can re-enable it.

Re: balancing load

2011-01-16 Thread ruslan usifov
2011/1/16 Edward Capriolo > On Sun, Jan 16, 2011 at 11:45 AM, Karl Hiramoto wrote: > > Hi, > > > > I have a keyspace with Replication Factor: 2 > > and it seems though that most of my data goes to one node. > > > > > > What am I missing to have Cassandra balance more evenly? > > > > ./nodetool

Re: Cassandra in less than 1G of memory?

2011-01-16 Thread Victor Kabdebon
If it's because of swapping done by Linux, wouldn't I only see the swap memory consumption rise? Because the problem is (apart from swap becoming bigger and bigger) that Cassandra's RAM consumption is going through the roof. However, I want to give the proposed method a try. Thank you ver

Re: best way to do a count

2011-01-16 Thread Aaron Morton
Not that I know of. Can you share some more information on your application? You may be able to design your way around it by denormalising. Aaron On 17/01/2011, at 5:22 AM, Michael Fortin wrote: > From what I can tell, get_count(), returns the total number of columns, is > there a way to get t

Re: Cassandra in less than 1G of memory?

2011-01-16 Thread Aaron Morton
The OS will make its best guess as to how much memory it can give over to mmapped files. Unfortunately it will not always make the best decision; see the information on adding JNA and mlockall() support in Cassandra 0.6.5 http://www.datastax.com/blog/whats-new-cassandra-065 As Jonathan says, try s

Re: cass0.7: Creating colum family & Sorting

2011-01-16 Thread Victor Kabdebon
The comparator only sorts the columns inside a key. Key sorting is done by your partitioner. Best regards, Victor Kabdebon 2011/1/16 kh jo > I am having some problems with creating column families and sorting them, > > I want to create a countries column family where I can get a sorted list o
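
The distinction in the reply above can be sketched in a few lines of Python. This is only a simplified model, not Cassandra code: `sort_columns` and `sort_row_keys_by_hash` are hypothetical helper names, byte-wise ordering stands in for a UTF8Type-style comparator, and an MD5-derived integer stands in for the token a hash-based partitioner would assign to a row key.

```python
import hashlib


def sort_columns(column_names):
    """Order column names within one row by their UTF-8 bytes
    (assumed stand-in for a UTF8Type-style comparator)."""
    return sorted(column_names, key=lambda name: name.encode("utf-8"))


def sort_row_keys_by_hash(row_keys):
    """Order row keys by an MD5-derived integer
    (assumed stand-in for a hash-based partitioner's token order)."""
    def token(key):
        return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return sorted(row_keys, key=token)


countries = ["Spain", "France", "Italy"]
print(sort_columns(countries))           # alphabetical when used as column names
print(sort_row_keys_by_hash(countries))  # hash order when used as row keys
```

Under this model, a sorted list of countries falls out naturally when the country names are column names inside a single row, while using them as row keys gives an order determined by the hash.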

cass0.7: Creating colum family & Sorting

2011-01-16 Thread kh jo
I am having some problems with creating column families and sorting them. I want to create a countries column family where I can get a sorted list of countries (by country name). The following command fails: create column family Countries with comparator=LongType and column_metadata=[ {colu

Re: is it possible to map an one from a a file and an one from cassandra?

2011-01-16 Thread Aaron Morton
The Pig readers are just the same as any other data source, so you should be able to mix and match them as you please. The sample Pig script in contrib/pig/example-script.pig specifies the CassandraStorage source when loading data: rows = LOAD 'cassandra://Keyspace1/Standard1' USING Cassan

Re: balancing load

2011-01-16 Thread Edward Capriolo
On Sun, Jan 16, 2011 at 11:45 AM, Karl Hiramoto wrote: > Hi, > > I have a keyspace with  Replication Factor: 2 > and it seems though that most of my data goes to one node. > > > What am I missing to have Cassandra balance more evenly? > > ./nodetool  -h host1 ring > Address         Status State  

Re: cassandra-cli 0.7 "name" as a column name!

2011-01-16 Thread Tyler Hobbs
I created https://issues.apache.org/jira/browse/CASSANDRA-1995 for this. Thanks for reporting the issue! - Tyler On Sun, Jan 16, 2011 at 1:51 PM, kh jo wrote: > example: > > create column family Countries > with comparator=UTF8Type > and column_metadata=[ > {column_name: name, validation_class:

Re: Cassandra-Maven-Plugin

2011-01-16 Thread Stephen Connolly
it will be an attachment to an as yet un raised jira. look out for it tomorrow/tuesday - Stephen On 16 Jan 2011 17:52, "Hellmut Adolphs"

Re: cassandra-cli 0.7 "name" as a column name!

2011-01-16 Thread kh jo
example: create column family Countries with comparator=UTF8Type and column_metadata=[ {column_name: name, validation_class: UTF8Type} ]; this gives the following error: Command not found: `create column family Countries with comparator=UTF8Type and column_metadata=[ {column_name: name, valida

Re: cassandra-cli 0.7 "name" as a column name!

2011-01-16 Thread Tyler Hobbs
You can. Can you give us the line that you are trying to execute? - Tyler On Sun, Jan 16, 2011 at 1:22 PM, kh jo wrote: > Why can't I use "name" as a column name? > >

cassandra-cli 0.7 "name" as a column name!

2011-01-16 Thread kh jo
Why can't I use "name" as a column name?

Re: How can I correct this Cassandra load imbalance?

2011-01-16 Thread Peter Schuller
> Node 1: > strings/grep/wc: 979,123 > space used: 2,061,497,786 > > Node 2: > strings/grep/wc: 443,558 > space used: 854,213,778 > > Node 3: > strings/grep/wc: 2,103,294 > space used: 4,505,048,405 Was this figured out? Could it be so simple as a compaction discrepancy (did you try running compac

Bulk Loader for Cassandra 0.6.8

2011-01-16 Thread akshatbakli...@gmail.com
Hi All, I am a newbie experimenting with Cassandra. While going through the various client options available for Cassandra, I was wondering if there exists some bulk loader other than BinaryMemtable. I found that BinaryMemtable is a Java-based bulk loader. I am writing my application in Python and would

inputs to map from a file and cassandra at a time.

2011-01-16 Thread JunYoung Kim
Hi, I want to run map operations using two inputs: one from a file and the other from Cassandra. I know how to get input files from a directory or input data from Cassandra, but I am not sure whether it is possible to get input from both at once. Here are more hints t

Re: balancing load

2011-01-16 Thread Mark Zitnik
Hi, if you are starting the whole cluster at once and not adding nodes to an existing cluster, try calculating the tokens. Here is a Python script to calculate them:

def tokens(nodes):
    for x in xrange(nodes):
        print 2 ** 127 / nodes * x

Also read the operations section in the Cassandra wiki: http://wik
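
For readers on Python 3, the token calculation from the message above can be sketched as follows. It keeps the same formula and the same ring size of 2 ** 127 used in the original script; returning a list instead of printing is a choice made here for illustration, not part of the original.

```python
def tokens(nodes):
    """Return one evenly spaced initial token per node on a 0 .. 2**127 ring."""
    # Integer division (//) mirrors the Python 2 behaviour of "/" on ints.
    return [2 ** 127 // nodes * x for x in range(nodes)]


# Example: print the initial tokens for a five-node cluster, one per line.
for token in tokens(5):
    print(token)
```

Each printed value would be assigned to one node as its initial token before the node is first started.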

balancing load

2011-01-16 Thread Karl Hiramoto
Hi, I have a keyspace with Replication Factor: 2 and it seems though that most of my data goes to one node. What am I missing to have Cassandra balance more evenly? ./nodetool -h host1 ring Address Status State Load Owns Token

Re: best way to do a count

2011-01-16 Thread Michael Fortin
From what I can tell, get_count() returns the total number of columns. Is there a way to get the count on a slice? The docs for Counters don't make any reference to slices either. On Jan 12, 2011, at 4:07 PM, Aaron Morton wrote: > There is a get_count() API function http://wiki.apache

Tombstone lifespan after multiple deletions

2011-01-16 Thread David Boxenhorn
If I delete a row, and later on delete it again, before GCGraceSeconds has elapsed, does the tombstone live longer? In other words, if I have the following scenario: GCGraceSeconds = 10 days On day 1 I delete a row On day 5 I delete the row again Will the tombstone be removed on day 10 or day 15

Accessing the 'time' name from a ColumnFamily

2011-01-16 Thread vicent roca daniel
Hi guys, I'm trying to get a key->value result from my ColumnFamily, but so far nothing. The ColumnFamily is ordered by LongType (I'm storing a Time object as the column name). Some sample code could be: data = cassandradb.get(:Data, "00:15:6D:E6:D9:A4-networkl") I can access the values, but then

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-16 Thread Jools
Ironically, we started with a Thrift-based application stack which used MySQL as its backend storage. At some point I was introduced to Cassandra, and after a very short time we implemented it as our backend storage mechanism. The first version of our application used the Cassandra thrift client

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-16 Thread Dave Gardner
We use PHP and Thrift directly (although this is wrapped in code that is basically our own bespoke client). Dave On Saturday, 15 January 2011, Dave Viner wrote: > Perl using the thrift interface directly. > On Sat, Jan 15, 2011 at 6:10 AM, Daniel Lundin wrote: > > python + pycassa > scala + Hec