the former, but also see http://issues.apache.org/jira/browse/CASSANDRA-1530
On Wed, Oct 6, 2010 at 9:22 PM, MK wrote:
> Say I have a cluster of N nodes and I have started all the nodes with
> a replication factor of N. So effectively all data is being mirrored
> everywhere.
>
> Now, when I write
Say I have a cluster of N nodes and I have started all the nodes with
a replication factor of N. So effectively all data is being mirrored
everywhere.
Now, when I write to a node, how does this data get propagated to the
remaining N-1 nodes?
1) Does this one origin node do N-1 network operations
Creating indexes takes extra space (does in MySQL, PGSQL, etc too).
https://issues.apache.org/jira/browse/CASSANDRA-749 has quite a bit of
detail about how the secondary indexes currently work.
On Wed, Oct 6, 2010 at 7:17 PM, Alvin UW wrote:
> Hello,
>
> Before 0.7, actually we can create an ex
Rob is correct.
drain is really only there for when you need the commit log to be empty (some
upgrades or a complete backup of a shutdown cluster).
There really is no point in using it to shut down C* normally, just kill it...
On Wed, Oct 6, 2010 at 4:18 PM, Rob Coli wrote:
> On 10/6/10 1:13 PM, Aar
I'm seeing cases where the count in the SliceRange predicate is not
respected. This only happens for super columns. I'm running
Cassandra 0.6.4 on a single node.
Steps to reproduce, using the Keyspace1.Super1 CF:
* insert three super columns, bar1, bar2, and bar3, under the same key
* delete bar1
Hello,
Before 0.7, we could actually create an extra ColumnFamily as a secondary
index, if we needed one.
I was wondering whether the secondary index mechanism in 0.7 is just like
creating an extra ColumnFamily as an index.
The only difference is that users don't have to take care of the maintenance
of the secondary index.
can you tar.gz the filter/index/data files for this sstable and attach
it to a ticket so we can debug?
if you can't make the data public you can send it to me off list and I
can have a look.
On Wed, Oct 6, 2010 at 11:37 AM, Narendra Sharma
wrote:
> Has any one used sstable2json on 0.6.5 and noti
Ryan,
Independent of this ambiguous requirement, what were you thinking about? What I
am trying to ask is: can you be more specific/concrete about when you can
Simon Reavely
On Oct 5, 2010, at 11:30 AM, Ryan King wrote:
> On Tue, Oct 5, 2010 at 8:23 AM, Ian Rogers
> wrote:
>>
>> Does Cassan
On 10/6/10 1:13 PM, Aaron Morton wrote:
To shut down cleanly, say in a production system, use nodetool drain
first. This will flush the memtables and put the node into a read-only
mode. AFAIK this also gives the other nodes a faster way of detecting
that the node is down, via the drained node gossiping
That's a lot of questions, I'll try to answer some...
Read/Write latency as reported for a CF is the time taken to perform a local
read on that node. Read/Write latency reported on the
o.a.c.service.StorageProxy is the time taken to process a complete request,
including local and remote reads when C
The SCs are stored on disk in the order defined by the compareWith setting,
so if you want them back in a different order either someone is sorting them
(C*, which doesn't sort them right now, or the client; which doesn't make
much of a difference, it's just moving the load around) or you're
denormalizing
>
> PS. Are other ppl interested in this functionality ?
> I could file it to JIRA as well...
>
>
Yes, please file it to Jira. It seems like it would be pretty useful for
various things and fairly easy to change the code to move it to another
directory whenever C* thinks it should be deleted...
To shut down cleanly, say in a production system, use nodetool drain first.
This will flush the memtables and put the node into a read-only mode. AFAIK
this also gives the other nodes a faster way of detecting that the node is
down, via the drained node gossiping its new status. Then kill.
Aaron
On 07 Oct
Some relevant reading if you're interested:
http://dslab.epfl.ch/pubs/crashonly/
http://web.archive.org/web/20060426230247/http://crash.stanford.edu/
On Wed, Oct 6, 2010 at 1:46 PM, Scott Mann wrote:
> Yes. ctrl-C if running in the foreground. Use kill , if running
> in the background (see the
There is an explanation of how to lock the JVM into memory here:
http://www.riptano.com/blog/whats-new-cassandra-065
However, from the JVM Heap Size section here:
http://wiki.apache.org/cassandra/MemtableThresholds
For a rough rule of thumb, Cassandra's internal datastructures will require
about memtabl
As jbellis mentioned, the secondary indexes in 0.7 will work for this, but in
the meantime you can still index this manually in .6 (which will continue
to work in .7 if need be).
There are several ways to attack this now. If you don't have too many users
you can have a row with "age" as the row k
Ok, Thank you all. More reading to do :)
On Oct 6, 2010, at 3:21 PM, Jonathan Ellis wrote:
> On Wed, Oct 6, 2010 at 1:49 PM, Brayton Thompson
> wrote:
>> Ok, let me tweak the scenario a tiny bit. What if I wanted something
>> extremely arbitrary, for instance... simple comparisons like a WHERE
On Wed, Oct 6, 2010 at 1:49 PM, Brayton Thompson wrote:
> Ok, let me tweak the scenario a tiny bit. What if I wanted something
> extremely arbitrary, for instance... simple comparisons like a WHERE clause
> in SQL
> get Users.someuser['uuid'] where Users.someuser['age'] > 33
>
> From what i'
So would my best bet be to simply get ALL of my users' uuids and ages,
then throw away those that do not meet the required test?
And in fact this is also what a traditional database does when you need a
table scan. And this will happen if you have not prepared an index on
that column. (
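The scan-and-filter approach described above can be sketched as follows. This is a hypothetical stand-in: rows are plain dicts, where a real client would page through keys with get_range_slices.

```python
# Hypothetical sketch of a client-side "table scan": fetch every row,
# then keep only those passing the predicate. With a real Cassandra 0.6
# client you would page through keys (e.g. get_range_slices) instead of
# holding a dict in memory.

def scan_users(rows, predicate):
    """Return the rows whose columns satisfy the predicate."""
    return {uuid: cols for uuid, cols in rows.items() if predicate(cols)}

users = {
    "u1": {"age": 41, "email": "a@example.com"},
    "u2": {"age": 25, "email": "b@example.com"},
    "u3": {"age": 37, "email": "c@example.com"},
}

# The WHERE age > 33 case from the thread, done entirely client-side.
over_33 = scan_users(users, lambda cols: cols["age"] > 33)
print(sorted(over_33))  # ['u1', 'u3']
```

The cost is proportional to the total number of rows, which is exactly why an index (manual in 0.6, built-in in 0.7) is preferable for large user counts.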
Commitlog segments remain until all the data in them has been flushed.
Reduce MemtableFlushAfterMinutes.
If I had to guess why the node went down without seeing your error log, I
would guess you exceeded the open file handle allowance. You can
increase that with the standard ulimit or /etc/security/lim
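A quick way to inspect the current file-handle limit from Python (raising it is done with ulimit or limits.conf as the post says); `resource` is POSIX-only:

```python
# Check the open-file-handle limit the process is running under.
# Cassandra keeps a handle per open sstable component, so a node with
# many sstables can exhaust a low default (often 1024).
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open file handles: soft={soft} hard={hard}")
```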
Hello Experts,
I see some queer behavior from one of the Cassandra nodes in my cluster:
the data is not flushed off the commitlogs, and the number of commitlog files
keeps growing. I was inserting data into the cluster, and since yesterday this
node has had more than 900 commitlog files.
-rw-r--r-- 1 dev dev
Ok, let me tweak the scenario a tiny bit. What if I wanted something extremely
arbitrary, for instance... simple comparisons like a WHERE clause in SQL:
get Users.someuser['uuid'] where Users.someuser['age'] > 33
From what I've read, this functionality defeats the point of Cassandra becau
Yes. Ctrl-C if running in the foreground. Use kill if running
in the background (see the man page for kill if you are unfamiliar
with it). Killing Cassandra is the only way to terminate it.
On Wed, Oct 6, 2010 at 11:03 AM, Alberto Velandia wrote:
> So, is ctrl + C how you stop cassandra? or I'm
On 10/6/10 9:05 AM, Utku Can Topçu wrote:
The nodes are still swapping, even though the swappiness is set to zero
right now. After swapping comes the OOM.
https://issues.apache.org/jira/browse/CASSANDRA-1214
?
=Rob
Hi,
On a first pass, that patch seems to have solved the problem.
I'll be testing that functionality repeatedly over the next day or so; I'll
let you know how it fares.
Thanks
Jason
On Wed, Oct 6, 2010 at 4:06 PM, Stu Hood wrote:
> Hey JT,
>
> I believe this issue should be fixed by CASSANDRA-15
As Norman said, secondary indexes are only in .7, but you can create standard
indexes yourself in both .6 and .7.
Basically, have an email_domain_idx CF where the row key is the domain and the
column names hold the row ids of the users (the column value is unused in this
scenario). This sounds basically like wh
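The manual-index pattern described here can be sketched with plain dicts standing in for the two column families. The names (users, email_domain_idx, add_user) are illustrative, not a real client API:

```python
# Sketch of a manual secondary index: one "CF" for users, one "CF"
# keyed by email domain whose column names are the matching user row
# keys (the column value is unused, as in the post).

users = {}             # row key -> columns
email_domain_idx = {}  # domain  -> {user row key: ''}

def add_user(user_id, email):
    users[user_id] = {"email": email}
    domain = email.split("@", 1)[1]
    # Index write must accompany every data write, since the client
    # (not Cassandra 0.6) maintains this index.
    email_domain_idx.setdefault(domain, {})[user_id] = ""

def users_in_domain(domain):
    # One read of the index row yields every matching user key.
    return sorted(email_domain_idx.get(domain, {}))

add_user("u1", "alice@aol.com")
add_user("u2", "bob@example.com")
add_user("u3", "carol@aol.com")
print(users_in_domain("aol.com"))  # ['u1', 'u3']
```

The trade-off, as the thread notes, is that the application must keep the index row in sync on every insert and delete; 0.7's built-in secondary indexes move that maintenance into Cassandra.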
Only in 0.7
Bye,
Norman
2010/10/6 Brayton Thompson :
> Are secondary indexes available in .6.5? Or are they only in .7?
> On Oct 6, 2010, at 1:15 PM, Tyler Hobbs wrote:
>
> If you're interested in only checking part of a column's value, you can
> generally
> just store that part of the value in a
Are secondary indexes available in .6.5? Or are they only in .7?
On Oct 6, 2010, at 1:15 PM, Tyler Hobbs wrote:
> If you're interested in only checking part of a column's value, you can
> generally
> just store that part of the value in a different column. So, have an
> "email_addr" column
> a
Hmm, I thought the Thrift API was moved to 18 before beta2 was released.
I'll make a matching release for pycassa in just a moment. Thanks for the
notice.
By the way, there is a pycassa specific mailing list,
pycassa-disc...@googlegroups.com
- Tyler
On Wed, Oct 6, 2010 at 12:13 PM, Dipti Mathu
If you're interested in only checking part of a column's value, you can
generally just store that part of the value in a different column. So, have
an "email_addr" column and an "email_domain" column, which stores "aol.com",
for example.
Then you can just use a secondary index on the "email_domain
Hi All,
I was trying to connect to Cassandra using the pycassa module. Looks like
there is an API version mismatch. Any ideas where I can get the right version
of the APIs?
I am using:
INFO 22:11:50,860 Cassandra version: 0.7.0-beta2
INFO 22:11:50,861 Thrift API version: 17.1.0
Error message on p
So, is Ctrl + C how you stop Cassandra? Or am I better off doing it another way?
Thanks
On Oct 6, 2010, at 11:59 AM, Norman Maurer wrote:
> Ctrl + Z does not stop a program, it just suspends it. You will need to
> resume it with "fg" and then hit Ctrl + C to stop it.
>
> For some basic background:
Ctrl + Z does not stop a program, it just suspends it. You will need to
resume it with "fg" and then hit Ctrl + C to stop it.
For some basic background:
http://linuxreviews.org/beginner/jobs/
Bye,
Norman
2010/10/6 Alberto Velandia :
> Hi I've stopped cassandra hitting Ctrl + Z and tried to rest
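The suspend-versus-terminate distinction Norman describes can be demonstrated with signals on a POSIX system (Ctrl+Z sends SIGTSTP, `fg` sends SIGCONT, Ctrl+C sends SIGINT; SIGTERM is used here as the "really stop it" signal):

```python
# Show that a stop signal only suspends a process, while a terminating
# signal actually ends it. Uses a throwaway `sleep` child process.
import signal
import subprocess

child = subprocess.Popen(["sleep", "60"])

child.send_signal(signal.SIGTSTP)   # like Ctrl+Z: suspended, not gone
assert child.poll() is None         # the process still exists

child.send_signal(signal.SIGCONT)   # like `fg`: resume it
child.terminate()                   # like Ctrl+C / kill: really stop it
child.wait()
print(child.returncode)             # negative signal number on POSIX
```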
Hi, I've stopped Cassandra by hitting Ctrl + Z, tried to restart it, and got this
message:
INFO 11:46:16,039 JNA not found. Native methods will be disabled.
INFO 11:46:16,159 DiskAccessMode 'auto' determined to be mmap, indexAccessMode
is mmap
ERROR 11:46:16,449 Fatal exception during initializa
Thanks Oleg!
Could you please share the patch? I have built Cassandra from source before.
I can definitely give it a try.
-Naren
On Wed, Oct 6, 2010 at 3:55 AM, Oleg Anastasyev wrote:
> > Is it possible to retain the commit logs?
>
> In off-the-shelf cassandra 0.6.5 this is not possible, AFAIK.
Has anyone used sstable2json on 0.6.5 and noticed the issue I described in
my email below? This doesn't look like a data corruption issue, as sstablekeys
shows the keys.
Thanks,
Naren
On Tue, Oct 5, 2010 at 8:09 PM, Narendra Sharma
wrote:
> 0.6.5
>
> -Naren
>
>
> On Tue, Oct 5, 2010 at 6:56 PM, J
Hi Oleg,
I've been also looking into these after some research.
I've been tackling it with:
1. Setting the default max and min heap from 1G to 1500M.
2. I'm not using row caches, and the key caches are set to 1000; before they
were the default of 200K.
3. I've lowered the memtable throughput to 32MB
4. We
> PS. Are other ppl interested in this functionality ?
> I could file it to JIRA as well...
I was about to post that such a thing was useful for point-in-time
recovery before reading your post, so yes :)
--
/ Peter Schuller
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ok, I am VERY new to Cassandra and trying to get my head around its core ideas.
So let's say I have a CF of Users that contains all the info I would ever want
to know about them. One day I decide (for some reason) that I want to send a
mass email to
Hey JT,
I believe this issue should be fixed by CASSANDRA-1571... if you're able to
test that patch, it would be very helpful.
Thanks,
Stu
-----Original Message-----
From: "J T"
Sent: Tuesday, October 5, 2010 9:50pm
To: cassandra-u...@incubator.apache.org
Subject: Null Pointer Exception / Seco
Ah, great, thanks. I was looking under trunk/src/java/... instead of
trunk/interface/...
Dan
From: Michal Augustýn [mailto:augustyn.mic...@gmail.com]
Sent: October-06-10 10:38
To: user@cassandra.apache.org
Subject: Re: Column TTL
Hi,
I checked Cassandra.thrift file and found:
@
Hi,
I checked Cassandra.thrift file and found:
@param ttl. An optional, positive delay (in seconds) after which the column
will be automatically deleted.
Augi
2010/10/6 Dan Hendry
> Hi,
>
>
>
> I have a quick and quite frankly ridiculous question regarding the column
> TTL value; what are t
Hi,
I have a quick and quite frankly ridiculous question regarding the column
TTL value: what are the time units? Milliseconds, seconds, or something else?
I initially thought milliseconds, given that it is Java and that is what
timestamps are in, but the data type used in the setTtl() Java thrif
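Per the Cassandra.thrift comment quoted earlier in the thread, TTL is a delay in seconds. A toy model of the semantics (names and the dict-based store are illustrative, not a client API):

```python
# Toy model of column TTL: a column written at time t with ttl n is
# treated as deleted once now >= t + n. All times are in seconds.
import time

def write(store, name, value, ttl, now=None):
    now = time.time() if now is None else now
    store[name] = (value, now + ttl)   # record the expiry timestamp

def read(store, name, now=None):
    now = time.time() if now is None else now
    value, expires = store.get(name, (None, 0.0))
    return value if now < expires else None   # expired reads as absent

cols = {}
write(cols, "session", "abc123", ttl=30, now=1000.0)
print(read(cols, "session", now=1020.0))  # abc123 (still live)
print(read(cols, "session", now=1031.0))  # None (expired)
```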
I have been seeing some strange trends in read latency that I wanted to
throw out there to find some explanations. We are running .6.5 in a 10-node
cluster with rf=3. We find that the read latency reported by cfstats is
always about 1/4 of the actual time it takes to get the data back to the
python
> Is it possible to retain the commit logs?
In off-the-shelf Cassandra 0.6.5 this is not possible, AFAIK.
I developed a patch we use internally in our company for commit
log archiving and replay.
I can share the patch with you, if you dare to patch the Cassandra
sources yourself ;-)
PS. Are o
>
> Hi All,
> We're currently starting to get OOM exceptions in our cluster. I'm trying
> to push the limitations of our machines. Currently we have 1.7G of memory
> (ec2-medium). I'm wondering, by tweaking some of Cassandra's configuration
> settings, is it possible to make it live in peace with less memory?
Yes - the HadoopSupport should be updated for the functionality that was added
in 0.7. It's still a little in flux. There is an output format and output
streaming support on trunk/0.7 beta2. The output format has a Java example in
the contrib/word_count example code. The output streaming, whi
AFAIK you can submit a Pig job to the Hadoop job server via the Pig command
line interface. If you have not done so already, have a read of the Hadoop
book; it discusses Pig as well:
http://bit.ly/9gGRyH
Not sure how you go about monitoring the Hadoop job, though.
There is support for hadoop to o
> PHP: I basicaly need to start pig program from a php script (via thrift or
> something..?)
Can't you just execute a Pig script from PHP by calling Pig with a PHP exec
function call? I'm not sure what you're trying to do with it, but that's one
way you could do it.
> PIG: there is a LoadFunc
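The shell-out approach suggested above looks the same in any language; here Python's subprocess stands in for PHP's exec, and `script.pig` is a hypothetical path. The guard keeps the sketch runnable on machines without Pig installed:

```python
# Launch a Pig script as an external process, the way PHP's exec would.
# "script.pig" is a placeholder; -x local runs Pig in local mode.
import shutil
import subprocess

cmd = ["pig", "-x", "local", "script.pig"]

if shutil.which("pig") is None:
    print("pig not on PATH; would run:", " ".join(cmd))
else:
    result = subprocess.run(cmd, capture_output=True, text=True)
    print("pig exited with", result.returncode)
```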
If you turn the log level up to DEBUG, that will include information about
each request. Would that help? You could restrict it by setting a logging
configuration for the specific classes that output the messages you are
interested in.
Not sure about retaining the commit logs.
Aaron
On 6 Oct 20
You're sort of right on point two. The comparators you define in the keyspace
definition are for the names of the columns (or super columns), not their
values. So it's not possible to sort by the value of your name column; you'll
need to do it client side.
The indexing features in 0.7 can sort the value
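The name-order versus value-order distinction above can be sketched client-side; a dict of column name to value stands in for a fetched slice:

```python
# Columns come back from Cassandra ordered by *name* (compareWith);
# ordering by *value* has to happen client-side after fetching a slice.

columns = {"u1": "Smith", "u2": "Adams", "u3": "Jones"}  # name -> value

by_name = sorted(columns)                    # the order Cassandra returns
by_value = sorted(columns, key=columns.get)  # the re-sort you do yourself

print(by_name)   # ['u1', 'u2', 'u3']
print(by_value)  # ['u2', 'u3', 'u1']
```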
Cassandra Version: 0.6.5
I am running a long-duration test and I need to keep the commit log to see
the sequence of operations to debug a few application issues. Is it possible
to retain the commit logs? Apart from increasing the value of
CommitLogRotationThresholdInMB,
what is the other way to achie
Pig does not have a Thrift interface, but I believe you could create one.
Another option is to create a web service for your Pig job and
call the web service from your PHP.
On Wed, Oct 6, 2010 at 4:17 PM, Petr Odut wrote:
> Hi,
> PHP: I basically need to start a pig program from a php script (vi
Hi,
I've been battling some errors that only seem to crop up when I'm
messing around with secondary indices in 0.7-beta2.
Namely, errors like this start to happen after I 'delete' a
row in a CF that has a couple of secondary indices on it and then at some
point later try to
Hi,
PHP: I basically need to start a Pig program from a PHP script (via Thrift or
something..?)
PIG: there is a LoadFunc that loads data from Cassandra; is there also a
StoreFunc?
On Tue, Oct 5, 2010 at 9:22 PM, Aaron Morton wrote:
> There is an example for pig in contrib/pig and a hadoop example i
Aaron,
first of all, thanks for your time.
1. You cannot return just the super columns, you have to get their sub columns
as well. The returned data is ordered; please provide an example of where it
is not.
I don't know what I did before, but now I checked and the data is sorted as I
expected th