date:20110615

last record rowId

2011-06-15 Thread karim abbouh

In my Java application, whenever we insert we need to know the last rowId so that the new record can be inserted at rowId+1, so we currently save this rowId in a file. Is there another way to find the last record's rowId? Thanks, B.R

Re: Where is my data?

2011-06-15 Thread Sylvain Lebresne

You can use the Thrift call describe_ring(). It returns a map that associates each range of the ring with its replicas. Once all of a range's endpoints are unavailable, that range of the data is unavailable. -- Sylvain On Tue, Jun 14, 2011 at 11:33 PM, AJ wrote: > Is there an official det
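The availability rule in the reply above can be sketched without any Cassandra client. This is a minimal Python illustration, assuming the ring description has already been fetched and reduced to a plain mapping from token range to replica endpoints. The `ring` data, the addresses, and the function name `unavailable_ranges` are hypothetical, and no real Thrift call is made.

```python
def unavailable_ranges(ring, live):
    """Return the ranges whose replicas are all down.

    ring: dict mapping (start_token, end_token) -> list of replica endpoints
    live: set of endpoints currently believed to be up
    """
    return [rng for rng, endpoints in ring.items()
            if not any(ep in live for ep in endpoints)]


# Hypothetical three-node ring with two replicas per range.
ring = {
    ("0", "100"): ["10.0.0.1", "10.0.0.2"],
    ("100", "200"): ["10.0.0.2", "10.0.0.3"],
    ("200", "0"): ["10.0.0.3", "10.0.0.1"],
}

# Only 10.0.0.1 is up, so the range served solely by .2 and .3 is lost.
print(unavailable_ranges(ring, live={"10.0.0.1"}))
```

A range counts as available as long as at least one replica is up; whether a given consistency level can still be met is a separate question that this sketch does not address.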

Re: possible 'coming back to life' bug with counters

2011-06-15 Thread Sylvain Lebresne

Let me point out that the current thread is about counter removal, not about counter TTL. Counter expiration has other problems, so that even if you do not care about incrementing a counter again after it expires, it will still not work for you (please look at the discussion on https://issues.apac

Cassandra DC Upcoming Meetup

2011-06-15 Thread Chris Burroughs

Cassandra DC's first meetup of the pizza and talks variety will be on July 6th. There will be an introductory sort of presentation and a totally cool one on Pig integration. If you are in the DC area it would be great to see you there. http://www.meetup.com/Cassandra-DC-Meetup/events/22145481/

Re: New web client & future API

2011-06-15 Thread AJ

Nice interface... and someone has good taste in music. BTW, I'm new to web programming, what did you use for the web components? JSF, JavaScript, something else? On 6/14/2011 7:42 AM, Markus Wiesenbacher | Codefreun.de wrote: Hi, what is the future API for Cassandra? Thrift, Avro, CQL? I

Re: New web client & future API

2011-06-15 Thread Holger Hoffstaette

On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote: > Avro is dead. Just so that this is not misunderstood: "for Cassandra". Avro itself (and -ipc) is far from dead. -h

Re: last record rowId

2011-06-15 Thread Utku Can Topçu

As far as I can tell, this functionality doesn't exist. However, you could insert the rowId into another column within a separate row, and request the latest column. I think this would work for you. However, every insert would need a get request, which I think would be performance

Re: New web client & future API

2011-06-15 Thread Jeremy Hanna

Yes - Avro is alive and well. Avro as an RPC alternative for Cassandra is dead. See reasoning here: http://goo.gl/urENc On Jun 15, 2011, at 8:28 AM, Holger Hoffstaette wrote: > On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote: > >> Avro is dead. > > Just so that this is not misundersto

Re: Where is my data?

2011-06-15 Thread AJ

Thanks On 6/15/2011 3:20 AM, Sylvain Lebresne wrote: You can use the Thrift call describe_ring(). It returns a map that associates each range of the ring with its replicas. Once all of a range's endpoints are unavailable, that range of the data is unavailable. -- Sylvain

Re: New web client & future API

2011-06-15 Thread Eric Evans

On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote: > Actually from what I understood (please correct me if I am wrong) CQL > is based on Thrift / Avro. In this project, we tend to use the word "Thrift" as a sort of shorthand for "Cassandra's RPC interface", and not, "The serialization and R

Re: New web client & future API

2011-06-15 Thread Markus Wiesenbacher | Codefreun.de

I am using a JavaScript framework, Sencha ExtJS. The format between UI and servlets is JSON. Thanks for your response, and for agreeing with my taste in music ;) On 15.06.2011 at 15:48, AJ wrote: > Nice interface... and someone has good taste in music. > > BTW, I'm new to web programming, what

Re: New web client & future API

2011-06-15 Thread Victor Kabdebon

Ok, thanks for the update. I thought the query string was translated to Thrift, then sent to a server. Victor Kabdebon 2011/6/15 Eric Evans > On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote: > > Actually from what I understood (please correct me if I am wrong) CQL > > is based on Thrift

Atomicity of batch updates

2011-06-15 Thread Artem Orobets

Hi, the wiki says that a write operation is atomic within a ColumnFamily (http://wiki.apache.org/cassandra/ArchitectureOverview chapter "write properties"). If I use a batch update for a single CF and get an exception in the last mutation operation, does it mean that all previous operations will be reverted? If

Re: New web client & future API

2011-06-15 Thread Jeffrey Kesselman

Correct me if I'm wrong, but AFAIK Hector is the only higher-level API I would consider "complete" right now, with support for things like fail-over. I notice in the latest Hector build he is starting to add CQL support, so that's what I'm sticking with. When he has CQL support done I'll decide i

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Shotaro Kamio

We've encountered a situation in which compacted sstable files aren't deleted after node repair. Even when gc is triggered via jmx, it sometimes leaves compacted files. In one case, a lot of files were left. Some files have already stayed for more than 10 hours. There is no guarantee that gc will clean up all compa

Re: New web client & future API

2011-06-15 Thread Nate McCall

CQL support in Hector was available as of the 0.8.0 release. See details here: https://github.com/rantav/hector/wiki/Using-CQL On Wed, Jun 15, 2011 at 9:46 AM, Jeffrey Kesselman wrote: > Correct me if I'm wrong, but AFAIK Hector is the only higher level > API I would consider "complete" right now,

sstable2json2sstable bug with json data stored

2011-06-15 Thread Timo Nentwig

Hi! I couldn't find anyone via Google who has run into this yet, so here goes (0.8): { "foo":{ "foo":{ "foo":"bar", "foo":"bar", "foo":"bar", "foo":"", "foo":"bar", "foo":"bar", "id":123456 } }, "foo":null } (the JSON can likely be boiled down even more...)

Re: cascading failures due to memory

2011-06-15 Thread AJ

Sasha, Did you ever nail down the cause of this problem? On 5/31/2011 4:01 AM, Sasha Dolgy wrote: hi everyone, the current nodes i have deployed (4) have all been working fine, with not a lot of data ... more reads than writes at the moment. as i had monitoring disabled, when one node's OS ki

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen

Even if the gc call cleaned all files, it is not really acceptable on a decent sized cluster due to the impact full gc has on performance. Especially non-needed ones. The delay in file deletion can also at times make it hard to see how much spare disk you actually have. We easily see 100% increas

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen

On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen < tmarthinus...@gmail.com> wrote: > Even if the gc call cleaned all files, it is not really acceptable on a > decent sized cluster due to the impact full gc has on performance. > Especially non-needed ones. > > Not acceptable as running GC on ev

What triggers hint delivery?

2011-06-15 Thread Terje Marthinussen

Hi, I was looking quickly at the source code tonight. As far as I could see from a quick code scan, hint delivery is only triggered by a node's state change from down to up? If this is indeed the case, it would potentially explain why we sometimes have hints on machines which

Re: cascading failures due to memory

2011-06-15 Thread Sasha Dolgy

No. Upgraded to 0.8 and now monitor the systems more closely. We schedule a repair every 24 hrs via cron and so far no problems. On Jun 15, 2011 5:44 PM, "AJ" wrote: > Sasha, > > Did you ever nail down the cause of this problem? > > On 5/31/2011 4:01 AM, Sasha Dolgy wrote: >> hi everyone, >> >> the current

Re: Forcing Cassandra to free up some space

2011-06-15 Thread AJ

With regard to cleaning up old sstable files, I posed this question before, as I noticed that after taking a snapshot, the older (pre-compaction) files shared no links with the snapshots. Therefore (if the Cass snapshot functionality is working correctly), those older files can be manually deleted.

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King

There's a ticket open to address this: https://issues.apache.org/jira/browse/CASSANDRA-1974 -ryan On Wed, Jun 15, 2011 at 8:49 AM, Terje Marthinussen wrote: > > > On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen > wrote: >> >> Even if the gc call cleaned all files, it is not really accepta

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Peter Schuller

> Even if the gc call cleaned all files, it is not really acceptable on a > decent sized cluster due to the impact full gc has on performance. > Especially non-needed ones. You can run with -XX:+ExplicitGCInvokesConcurrent to "safely" trigger CMS cycles. However that also means System.gc() semanti

Re: last record rowId

2011-06-15 Thread Jonathan Ellis

You're better served using UUIDs than numeric row IDs for surrogate keys. (Of course natural keys work fine too.) On Wed, Jun 15, 2011 at 9:16 AM, Utku Can Topçu wrote: > As far as I can tell, this functionality doesn't exist. > > However you can use such a method to insert the rowId into anothe
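The suggestion above removes the need to track a "last rowId" at all, because each writer can mint a key independently. The following is a small Python sketch of that idea using only the standard library; the function name `new_row_key` is hypothetical and nothing here talks to Cassandra.

```python
import uuid


def new_row_key():
    """Return a fresh surrogate key as a string.

    A random (version 4) UUID needs no shared counter and no
    read-before-write, unlike a rowId+1 scheme.
    """
    return str(uuid.uuid4())


# Several writers could call this concurrently without coordinating.
keys = [new_row_key() for _ in range(5)]
for key in keys:
    print(key)
```

The trade-off is that such keys carry no insertion order; if ordering matters, a time-based UUID used as a column name is the usual alternative discussed elsewhere on this list.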

Re: What triggers hint delivery?

2011-06-15 Thread Jonathan Ellis

On Wed, Jun 15, 2011 at 10:53 AM, Terje Marthinussen wrote: > I was looking quickly at the source code tonight. > As far as I could see from a quick code scan, hint delivery is only > triggered by a node's state change from down to up? Right. > If this is indeed the case, it

useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Jeremy Hanna

We started doing this recently and thought it might be useful to others. Pig (and Hive) have a sample function that allows you to sample data from your data store. In pig it looks something like this: mysample = SAMPLE myrelation 0.01; One possible use for this, with pig and cassandra is to sol

Force a node to form part of quorum

2011-06-15 Thread A J

Is there a way to make a node always participate (or never participate) in fulfilling read consistency as well as write consistency? Thanks AJ

Re: Docs: Token Selection

2011-06-15 Thread Vijay

The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1 (which will cause uneven distribution of data across the nodes). It is easier to think of each DC as a ring, split it equally, and interleave them together: DC1 Node 1 : token 0 DC1 Node 2 : t
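The "split each DC equally, then interleave" idea can be sketched as a small calculation. This Python example is only an illustration under stated assumptions: the per-DC offset of 1 is chosen here simply to keep tokens unique, the default ring size of 2**127 is the RandomPartitioner token space, and the function name `interleaved_tokens` is hypothetical. It does not reproduce the truncated list in the message above.

```python
def interleaved_tokens(dc_names, nodes_per_dc, ring_size=2**127):
    """Assign evenly spaced tokens per data center, offset by DC index.

    Each DC is treated as its own ring and split into equal slices;
    the small offset keeps tokens from colliding across DCs.
    """
    tokens = {}
    for offset, dc in enumerate(dc_names):
        tokens[dc] = [i * ring_size // nodes_per_dc + offset
                      for i in range(nodes_per_dc)]
    return tokens


# A toy ring of size 16 makes the spacing easy to read.
for dc, toks in interleaved_tokens(["DC1", "DC2"], 2, ring_size=16).items():
    for n, tok in enumerate(toks, start=1):
        print(f"{dc} Node {n}: token {tok}")
```

With this layout each node inside a DC owns an equal share of that DC's ring, which avoids the 25%/75% imbalance described in the thread.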

Re: Docs: Token Selection

2011-06-15 Thread Vijay

Correction "The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1" should be "The problem in the above approach is you have 1 node between 0-4 (25%) and one node covering the rest which is 4-16, 0-0 (75%)" Regards, On Wed, Jun

prep for cassandra storage from pig

2011-06-15 Thread William Oberman

I think I'm stuck on typing issues trying to store data in Cassandra. To verify, Cassandra wants (key, {tuples}). My Pig script is fairly brief: raw = LOAD 'cassandra://test_in/test_cf' USING CassandraStorage() AS (key:chararray, columns:bag {column:tuple (name, value)}); --columns == timeUUID -> J

Re: Atomicity of batch updates

2011-06-15 Thread chovatia jaydeep

A Cassandra write operation is atomic across all the columns/super columns of a given row key in a Column Family. So in your case, not all previous operations (assuming each operation was on a separate key) will be reverted. Thank you, Jaydeep From: Artem Orobets To:
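Read literally, the rule in the reply above makes the row key within a column family the unit of atomicity, so a batch touching several keys consists of several independent units. The Python sketch below only groups a batch that way to make the units visible; the tuple layout, the sample data, and the function name `atomic_units` are hypothetical, and no Cassandra API is involved.

```python
from collections import defaultdict


def atomic_units(mutations):
    """Group (column_family, row_key, column, value) tuples by row.

    Under the rule stated above, each resulting group succeeds or
    fails as a whole, but different groups are independent.
    """
    units = defaultdict(list)
    for cf, key, column, value in mutations:
        units[(cf, key)].append((column, value))
    return dict(units)


batch = [
    ("Users", "alice", "email", "a@example.com"),
    ("Users", "alice", "city", "Kyiv"),
    ("Users", "bob", "email", "b@example.com"),
]

for unit, columns in atomic_units(batch).items():
    print(unit, columns)
```

Here a failure while writing the `bob` row would not, by that rule, undo the two columns already written for `alice`.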

upgrading from cassandra 0.7.3 to 0.8.0

2011-06-15 Thread Anurag Gujral

Hi All, I had a Cassandra node running 0.7.3. Without changing the data directories I installed Cassandra 0.8.0, but when I query data I get timeouts. Can someone please guide me on how to go about upgrading from Cassandra 0.7.3 to Cassandra 0.8.0? Thanks Anurag

Re: useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Jeremy Hanna

Cool - thanks Dmitriy! On Jun 15, 2011, at 12:54 PM, Dmitriy Ryaboy wrote: > Another tip: > If you parametrize your load statements, it becomes easy to switch > between loading from something like Cassandra, and reading from HDFS > or local fs directly. > > Also: > Try using Pig's "illustrate" c

Re: When does it make sense to use TimeUUID?

2011-06-15 Thread chovatia jaydeep

Hi Sameer, One example is storing all the tweets for a given user in a Column Family, where the row key is the user name/user id and each column name is a TimeUUID that represents the tweet's arrival time. Users would generally like to see the tweets sorted by arrival time. So TimeUUID will help h
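The tweet example above can be mimicked in plain Python, where `uuid.uuid1()` produces a time-based (version 1) UUID. This is a sketch only: the dictionary stands in for one user's row, the function name `timeline` is hypothetical, and the sort is done client-side here rather than by a column comparator.

```python
import uuid


def timeline(columns):
    """Return tweet texts ordered by the timestamp inside each UUID.

    columns: dict mapping a version 1 UUID (column name) -> tweet text.
    Sorting uses the embedded 60-bit timestamp (UUID.time), not the
    raw UUID value, whose leading bits are the low part of the time.
    """
    ordered = sorted(columns.items(), key=lambda item: item[0].time)
    return [text for _, text in ordered]


# One user's row: column name = TimeUUID, column value = tweet text.
row = {}
for text in ["first tweet", "second tweet", "third tweet"]:
    row[uuid.uuid1()] = text

for text in timeline(row):
    print(text)
```

Because the column name itself carries the arrival time, no separate timestamp column or counter is needed to reconstruct the order.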

Re: upgrading from cassandra 0.7.3 to 0.8.0

2011-06-15 Thread Jonathan Ellis

Are there exceptions in the Cassandra log? On Wed, Jun 15, 2011 at 1:54 PM, Anurag Gujral wrote: > Hi All, > I had a cassandra node which was running on cassandra 0.7.3. > Without changing the data directories I installed cassandra 0.8.0 but when I > query data I get timeouts. > Can som

Re: prep for cassandra storage from pig

2011-06-15 Thread Jeremy Hanna

Hi Will, That's partly why I like to use FromCassandraBag and ToCassandraBag from pygmalion - it does the work for you to get it back into a form that cassandra understands. Others may know better how to massage the data into that form using just pig, but if all else fails, you could write a u

Re: prep for cassandra storage from pig

2011-06-15 Thread William Oberman

My problem is the column names are dynamic (a date), and pygmalion seems to want the column names to be fixed at "compile time" (the script). On Wed, Jun 15, 2011 at 3:04 PM, Jeremy Hanna wrote: > Hi Will, > > That's partly why I like to use FromCassandraBag and ToCassandraBag from > pygmalion -

Re: prep for cassandra storage from pig

2011-06-15 Thread William Oberman

I'll do a reply all, to keep this more consistent (sorry!). Rather than staying stuck, I wrote a custom function: TupleToBagOfTuple. I'm curious if I could have avoided it with proper pig scripting though. On Wed, Jun 15, 2011 at 3:08 PM, William Oberman wrote: > My problem is the column names a

Re: prep for cassandra storage from pig

2011-06-15 Thread Jeremy Hanna

Yeah - for completely dynamic column names, then yeah - From/To Cassandra Bag doesn't handle that. It does handle prefixed names though - like link* will get a bag of all the columns that start with link. But sounds like you are doing what I would have to do if I got into a nested data conundr

Re: Docs: Token Selection

2011-06-15 Thread AJ

On 6/15/2011 12:14 PM, Vijay wrote: Correction "The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1" should be "The problem in the above approach is you have 1 node between 0-4 (25%) and one node covering the rest which is 4

Re: Multi data center configuration - A question on read correction

2011-06-15 Thread Selva Kumar

Thanks Jonathan. Can we turn off RR by setting READ_REPAIR_CHANCE = 0? Please advise. Selva From: Jonathan Ellis To: user@cassandra.apache.org Sent: Tue, June 14, 2011 8:59:41 PM Subject: Re: Multi data center configuration - A question on read correction That's just

Re: Docs: Token Selection

2011-06-15 Thread Vijay

All you heard is right... You are not overriding Cassandra's token assignment by saying here is your token... Logic is: Calculate a token for the given key... find the node in each region independently (If you use NTS and if you set the strategy options which says you want to replicate to the othe

Re: Docs: Token Selection

2011-06-15 Thread AJ

Vijay, thank you for your thoughtful reply. Will Cass complain if I don't set up my tokens like in the examples? On 6/15/2011 2:41 PM, Vijay wrote: All you heard is right... You are not overriding Cassandra's token assignment by saying here is your token... Logic is: Calculate a token for th

Slowdowns during repair

2011-06-15 Thread Aurynn Shaw

Hey all; So, we have Cassandra running on a 5-server ring, with a RF of 3, and we're regularly seeing major slowdowns in read & write performance while running nodetool repair, as well as the occasional Cassandra crash during the repair window - slowdowns past 10 seconds to perform a single w

Easy way to overload a single node on purpose?

2011-06-15 Thread Suan Aik Yeo

Here's a weird one... what's the best way to get a Cassandra node into a "half-crashed" state? We have a 3-node cluster running 0.7.5. A few days ago this happened organically to node1 - the partition the commitlog was on was 100% full and there was a "No space left on device" error, and after a w

Re: Is there a way from a running Cassandra node to determine whether or not itself is "up"?

2011-06-15 Thread Suan Aik Yeo

Thanks, Aaron, but we determined that adding Java into the equation just brings in too much complexity for something that's called out of an Nginx Perl module. Right now I'm having trouble even replicating the above scenario and posted a question here: http://cassandra-user-incubator-apache-org.306

Re: Docs: Token Selection

2011-06-15 Thread Vijay

No, it won't; it will assume you are doing the right thing... Regards, On Wed, Jun 15, 2011 at 2:34 PM, AJ wrote: > Vijay, thank you for your thoughtful reply. Will Cass complain if I don't > set up my tokens like in the examples? > > > On 6/15/2011 2:41 PM, Vijay wrote: > > All you heard

Re: What triggers hint delivery?

2011-06-15 Thread Terje Marthinussen

I suspect a few possibilities: 1. I have not checked, but what happens (in terms of hint delivery) if a node tries to write something but the write times out even if the node is marked as up? 2. I would assume there can be ever so slight variations in how different nodes in the cluster think the re

Re: What triggers hint delivery?

2011-06-15 Thread Jonathan Ellis

You're right, those could all cause what you are seeing. We used to have a "re-check hourly" scheduled task, but took it out because it was very very performance intensive -- at the time, hints were not stored by machine so asking "does machine X have any hints" required scanning all hints. Shoul

downgrading from cassandra 0.8 to 0.7.3

2011-06-15 Thread Anurag Gujral

Hi All, I moved to Cassandra 0.8.0 from Cassandra 0.7.3. When I try to move back I get the following error: java.lang.RuntimeException: Can't open sstables from the future! Current version f, found file: /data/cassandra/data/system/Schema-g-9. Please suggest. Thanks Anurag
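The error message suggests that the node compares a version tag embedded in the file name against the newest version it understands. A rough Python sketch of such a check follows, assuming the name has the form `<ColumnFamily>-<version>-<generation>` as in `Schema-g-9` and that version tags order alphabetically. Both points are inferred from the message above rather than taken from the Cassandra source, and the function names are hypothetical.

```python
def sstable_version(path):
    """Extract the version tag from a path like .../Schema-g-9.

    Assumes the column family name itself contains no hyphen.
    """
    name = path.rsplit("/", 1)[-1]
    return name.split("-")[1]


def can_open(path, current_version):
    """True if the file's version is not newer than the node's version."""
    return sstable_version(path) <= current_version


path = "/data/cassandra/data/system/Schema-g-9"
print(sstable_version(path))   # version tag written by the newer release
print(can_open(path, "f"))     # a node whose current version is "f"
```

Under these assumptions, files written after the upgrade carry a tag the older release refuses to read, which would explain why simply reinstalling 0.7.3 over the same data directories fails.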

Re: downgrading from cassandra 0.8 to 0.7.3

2011-06-15 Thread Terje Marthinussen

Can't help you with that. You may have to go the json2sstable route and re-import into 0.7.3. But... why would you want to go back to 0.7.3? Terje On Thu, Jun 16, 2011 at 10:30 AM, Anurag Gujral wrote: > Hi All, > I moved to cassandra 0.8.0 from cassandra-0.7.3 when I try to > move ba

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Terje Marthinussen

Watching this on a node here right now and it sort of shows how bad this can get. This node still has 109GB free disk by the way... INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space INFO [CompactionExecutor:5] 2011-06-16 09:12:23,

Re: Docs: Token Selection

2011-06-15 Thread AJ

Ok. I understand the reasoning you laid out. But, I think it should be documented more thoroughly. I was trying to get an idea as to how flexible Cass lets you be with the various combinations of strategies, snitches, token ranges, etc.. It would be instructional to see what a graphical rep

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Jeffrey Kesselman

The GC cleanup approach, if depending on specific objects being GCd, is fundamentally flawed. I brought this up earlier, won't restart that thread. It should be in the archives. On Wed, Jun 15, 2011 at 10:17 PM, Terje Marthinussen wrote: > Watching this on a node here right now and it sort of

Re: Is there a way from a running Cassandra node to determine whether or not itself is "up"?

2011-06-15 Thread Jake Luciani

To force a node "down" you can use nodetool disablegossip On Wed, Jun 15, 2011 at 6:42 PM, Suan Aik Yeo wrote: > Thanks, Aaron, but we determined that adding Java into the equation just > brings in too much complexity for something that's called out of an Nginx > Perl module. Right now I'm havin

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King

There's a ticket open for this: https://issues.apache.org/jira/browse/CASSANDRA-2521. Vote on it if you think it's important. -ryan On Wed, Jun 15, 2011 at 7:34 PM, Jeffrey Kesselman wrote: > The GC cleanup approach, if depending on specific objects being GCd, > is fundamentally flawed. > > I bro

Re: Docs: Token Selection

2011-06-15 Thread Vijay

+1 for more documentation (I guess contributions are always welcomed) I will try to write it down sometime when we have a bit more time... 0.8 nodetool ring command adds the DC and RAC information http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers http://www

Re: What's the best approach to search in Cassandra

2011-06-15 Thread Mark Kerzner

Jake, "You need to maintain a huge number of distinct indexes." Are we talking about secondary indexes? If yes, this sounds like exactly my problem. There is so little documentation! - but I think that if I read all there is on GitHub, I can probably start using it. Thank you, Mark On Fri

Re: What's the best approach to search in Cassandra

2011-06-15 Thread Sasha Dolgy

Datastax has pretty sufficient documentation on their site for secondary indexes. On Jun 16, 2011 6:57 AM, "Mark Kerzner" wrote: > Jake, > > *You need to maintain a huge number of distinct indexes.* > * > * > *Are we talking about secondary indexes? If yes, this sounds like exactly my > problem. T

Important Variables for Scaling

2011-06-15 Thread Schuilenga, Jan Taeke

Which variables (for instance: throughput, CPU, I/O, connections) are decisive when deciding to add a node to a Cassandra setup that is under strain? We are trying to prove scalability, but when is the right time to add a node to get the optimum scalability result?

