In my Java application, every time we insert we need to know the
last rowId
so that we can insert the new record at rowId+1. For that, we have to save this
rowId in a file.
Is there another way to find the last record's rowId?
Thanks,
B.R
You can use the Thrift call describe_ring(). It returns a map
that associates each range of the
ring with its replicas. Once all of a range's endpoints are
unavailable, that range of the data is unavailable.
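
A minimal sketch of that availability check, assuming the describe_ring() result has already been converted into a plain dict from a (start_token, end_token) pair to a list of replica endpoints. The dict shape, the function name and the addresses are hypothetical and used only for illustration; no Thrift call is made here.

```python
def unavailable_ranges(ring, down):
    """Return the token ranges whose replicas are all in the `down` set."""
    down = set(down)
    return [
        token_range
        for token_range, endpoints in ring.items()
        if endpoints and all(endpoint in down for endpoint in endpoints)
    ]


# Hypothetical ring: (start_token, end_token) -> replica endpoints.
ring = {
    ("0", "4"): ["10.0.0.1", "10.0.0.2"],
    ("4", "8"): ["10.0.0.2", "10.0.0.3"],
    ("8", "0"): ["10.0.0.3", "10.0.0.1"],
}

# With two of the three nodes down, only one range loses every replica.
print(unavailable_ranges(ring, {"10.0.0.2", "10.0.0.3"}))
```

A range is reported only when every one of its endpoints is down; a range with at least one live replica is treated as still reachable.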
--
Sylvain
On Tue, Jun 14, 2011 at 11:33 PM, AJ wrote:
> Is there an official det
Let me point out that the current thread is about counter removal, not about
counter TTL. Counter expiration has other problems, so even if you do not
care about incrementing a counter again after it expires, it will
still not work for you
(please look at the discussion on
https://issues.apac
Cassandra DC's first meetup of the pizza and talks variety will be on
July 6th. There will be an introductory sort of presentation and a
totally cool one on Pig integration.
If you are in the DC area it would be great to see you there.
http://www.meetup.com/Cassandra-DC-Meetup/events/22145481/
Nice interface... and someone has good taste in music.
BTW, I'm new to web programming, what did you use for the web
components? JSF, JavaScript, something else?
On 6/14/2011 7:42 AM, Markus Wiesenbacher | Codefreun.de wrote:
Hi,
what is the future API for Cassandra? Thrift, Avro, CQL?
I
On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote:
> Avro is dead.
Just so that this is not misunderstood: "for Cassandra".
Avro itself (and -ipc) is far from dead.
-h
As far as I can tell, this functionality doesn't exist.
However, you could insert the rowId into another column
within a separate row, and request the latest column.
I think this would work for you. However every insert would need a get
request, which I think would be performance
Yes - avro is alive and well. Avro as an RPC alternative for Cassandra is
dead. See reasoning here: http://goo.gl/urENc
On Jun 15, 2011, at 8:28 AM, Holger Hoffstaette wrote:
> On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote:
>
>> Avro is dead.
>
> Just so that this is not misundersto
Thanks
On 6/15/2011 3:20 AM, Sylvain Lebresne wrote:
You can use the Thrift call describe_ring(). It returns a map
that associates each range of the
ring with its replicas. Once all of a range's endpoints are
unavailable, that range of the data is unavailable.
--
Sylvain
On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote:
> Actually from what I understood (please correct me if I am wrong) CQL
> is based on Thrift / Avro.
In this project, we tend to use the word "Thrift" as a sort of shorthand
for "Cassandra's RPC interface", and not, "The serialization and R
I am using a Javascript framework, Sencha ExtJS. The format between UI and
servlets is JSON.
Thanks for your response, and for agreeing with my taste in music ;)
On 15.06.2011 at 15:48, AJ wrote:
> Nice interface... and someone has good taste in music.
>
> BTW, I'm new to web programming, what
OK, thanks for the update. I thought the query string was translated to
Thrift, then sent to a server.
Victor Kabdebon
2011/6/15 Eric Evans
> On Tue, 2011-06-14 at 09:49 -0400, Victor Kabdebon wrote:
> > Actually from what I understood (please correct me if I am wrong) CQL
> > is based on Thrift
Hi,
The wiki says that a write operation is atomic within a ColumnFamily
(http://wiki.apache.org/cassandra/ArchitectureOverview chapter "write
properties").
If I use a batch update for a single CF and get an exception in the last mutation
operation, does that mean all previous operations will be reverted?
If
Correct me if I'm wrong, but AFAIK Hector is the only higher-level
API I would consider "complete" right now, with support for things
like fail-over.
I notice in the latest Hector build he is starting to add CQL support,
so that's what I'm sticking with. When he has CQL support done I'll
decide i
We've encountered the situation that compacted sstable files aren't
deleted after node repair. Even when gc is triggered via jmx, it
sometimes leaves compacted files. In one case, a lot of files were left.
Some files have remained for more than 10 hours already. There is no guarantee that
gc will cleanup all compa
CQL support in Hector was available as of the 0.8.0 release. See details here:
https://github.com/rantav/hector/wiki/Using-CQL
On Wed, Jun 15, 2011 at 9:46 AM, Jeffrey Kesselman wrote:
> Correct me if I'm wrong, but AFAIK Hector is the only higher level
> APi I would consider "complete' right now,
Hi!
I couldn't find anyone via Google who has experienced this yet, so I'm reporting it (0.8):
{
  "foo": {
    "foo": {
      "foo": "bar",
      "foo": "bar",
      "foo": "bar",
      "foo": "",
      "foo": "bar",
      "foo": "bar",
      "id": 123456
    }
  },
  "foo": null
}
(json can likely be boiled down even more...)
Sasha,
Did you ever nail down the cause of this problem?
On 5/31/2011 4:01 AM, Sasha Dolgy wrote:
hi everyone,
the current nodes i have deployed (4) have all been working fine, with
not a lot of data ... more reads than writes at the moment. as i had
monitoring disabled, when one node's OS ki
Even if the gc call cleaned all files, it is not really acceptable on a
decent sized cluster due to the impact full gc has on performance.
Especially unneeded ones.
The delay in file deletion can also at times make it hard to see how much
spare disk space you actually have.
We easily see 100% increas
On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen <
tmarthinus...@gmail.com> wrote:
> Even if the gc call cleaned all files, it is not really acceptable on a
> decent sized cluster due to the impact full gc has on performance.
> Especially non-needed ones.
>
>
Not acceptable as running GC on ev
Hi,
I was looking quickly at source code tonight.
As far as I could see from a quick code scan, hint delivery is only
triggered by a node's state change from down to up?
If this is indeed the case, it would potentially explain why we sometimes
have hints on machines which
No. We upgraded to 0.8 and monitor the systems more. We schedule a repair
every 24 hours via cron and so far there have been no problems.
On Jun 15, 2011 5:44 PM, "AJ" wrote:
> Sasha,
>
> Did you ever nail down the cause of this problem?
>
> On 5/31/2011 4:01 AM, Sasha Dolgy wrote:
>> hi everyone,
>>
>> the current
With regard to cleaning up old sstable files, I posed this question
before, as I noticed that after taking a snapshot, the older files
(pre-compaction) shared no links with the snapshots. Therefore, (if the
Cass snapshot functionality is working correctly) those older files can
be manually deleted.
There's a ticket open to address this:
https://issues.apache.org/jira/browse/CASSANDRA-1974
-ryan
On Wed, Jun 15, 2011 at 8:49 AM, Terje Marthinussen
wrote:
>
>
> On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen
> wrote:
>>
>> Even if the gc call cleaned all files, it is not really accepta
> Even if the gc call cleaned all files, it is not really acceptable on a
> decent sized cluster due to the impact full gc has on performance.
> Especially non-needed ones.
You can run with -XX:+ExplicitGCInvokesConcurrent to "safely" trigger
CMS cycles. However that also means System.gc() semanti
You're better served using UUIDs than numeric row IDs for surrogate
keys. (Of course natural keys work fine too.)
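
A small sketch of the UUID approach, using Python's standard `uuid` module for brevity; the same idea applies in a Java client with `java.util.UUID.randomUUID()`. The function name `new_row_key` is hypothetical. The point is that each writer can mint a key on its own, without reading or storing a "last rowId" anywhere.

```python
import uuid


def new_row_key():
    """Generate a random (version 4) UUID string to use as a surrogate row key."""
    return str(uuid.uuid4())


# Every call yields a fresh key; no shared counter or file is consulted.
keys = [new_row_key() for _ in range(1000)]
print(keys[0])
```

The trade-off is that such keys carry no insertion order, so "the latest row" has to be modelled some other way if the application needs it.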
On Wed, Jun 15, 2011 at 9:16 AM, Utku Can Topçu wrote:
> As far as I can tell, this functionality doesn't exist.
>
> However you can use such a method to insert the rowId into anothe
On Wed, Jun 15, 2011 at 10:53 AM, Terje Marthinussen
wrote:
> I was looking quickly at source code tonight.
> As far as I could see from a quick code scan, hint delivery is only
> triggered as a state change from a node is down to when it enters up state?
Right.
> If this is indeed the case, it
We started doing this recently and thought it might be useful to others.
Pig (and Hive) have a sample function that allows you to sample data from your
data store.
In Pig it looks something like this:
mysample = SAMPLE myrelation 0.01;
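
For readers who do not use Pig, here is a rough Python equivalent of what that statement does, on the assumption that SAMPLE keeps each row independently with the given probability, so the size of the result is approximate rather than exact. The function name and the optional `rng` parameter are hypothetical.

```python
import random


def sample(rows, fraction, rng=None):
    """Keep each row independently with probability `fraction`."""
    if rng is None:
        rng = random.Random()
    return [row for row in rows if rng.random() < fraction]


rows = list(range(10000))
# Roughly 1% of the rows survive; the exact count varies from run to run.
picked = sample(rows, 0.01)
print(len(picked))
```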
One possible use for this, with pig and cassandra is to sol
Is there a way to make a node always participate (or never
participate) in fulfilling read consistency as well as write
consistency?
Thanks
AJ
The problem in the above approach is you have 2 nodes between 12 to 4 in DC1
but from 4 to 12 you just have 1 (which will cause uneven distribution
of data across the nodes).
It is easier to think of each DC as its own ring, split it equally, and interleave
them together
DC1 Node 1 : token 0
DC1 Node 2 : t
Correction
"The problem in the above approach is you have 2 nodes between 12 to 4 in
DC1 but from 4 to 12 you just have 1"
should be
"The problem in the above approach is you have 1 node between 0-4 (25%)
and one node covering the rest which is 4-16, 0-0 (75%)"
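
The 25%/75% split can be checked with a short calculation. This is a sketch under two assumptions: the illustrative ring has 16 positions (as in the 0-16 numbers above), and a node with token T owns the range from the previous token (exclusive) up to T (inclusive), wrapping around the ring. The function name `ownership` is hypothetical.

```python
from fractions import Fraction


def ownership(tokens, ring_size):
    """Map each token to the fraction of the ring owned by its node.

    A node with token T is assumed to own (previous token, T], wrapping
    around at ring_size.
    """
    ordered = sorted(tokens)
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # for i == 0 this wraps to the last token
        width = (token - previous) % ring_size or ring_size
        shares[token] = Fraction(width, ring_size)
    return shares


# Tokens 0 and 4 on a 16-position ring: one node gets 1/4, the other 3/4.
print(ownership([0, 4], 16))
# Evenly spaced tokens give each node half of the ring.
print(ownership([0, 8], 16))
```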
Regards,
On Wed, Jun
I think I'm stuck on typing issues trying to store data in cassandra. To
verify, cassandra wants (key, {tuples})
My pig script is fairly brief:
raw = LOAD 'cassandra://test_in/test_cf' USING CassandraStorage() AS
(key:chararray, columns:bag {column:tuple (name, value)});
--colums == timeUUID -> J
A Cassandra write operation is atomic across all the columns/super columns for a
given row key in a Column Family. So in your case, the previous operations
(assuming each operation was on a separate key) will not be reverted.
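
One way to picture this is to group a batch by row key: each group is the unit that is applied atomically, and a failure in one group does not undo another. The sketch below uses plain tuples and dicts rather than a Cassandra client; the tuple layout, the function name and the sample data are all hypothetical.

```python
from collections import defaultdict


def atomic_units(mutations):
    """Group (row_key, column, value) mutations by row key."""
    units = defaultdict(dict)
    for row_key, column, value in mutations:
        units[row_key][column] = value
    return dict(units)


batch = [
    ("user1", "name", "alice"),
    ("user1", "email", "alice@example.com"),
    ("user2", "name", "bob"),
]

# Two row keys, so two independent atomic units.
for row_key, columns in atomic_units(batch).items():
    print(row_key, columns)
```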
Thank you,
Jaydeep
From: Artem Orobets
To:
Hi All,
I had a cassandra node which was running on cassandra 0.7.3.
Without changing the data directories I installed cassandra 0.8.0 but when I
query data I get timeouts.
Can someone please guide me on how to go about upgrading from Cassandra 0.7.3 to
Cassandra 0.8.0?
Thanks
Anurag
Cool - thanks Dmitriy!
On Jun 15, 2011, at 12:54 PM, Dmitriy Ryaboy wrote:
> Another tip:
> If you parametrize your load statements, it becomes easy to switch
> between loading from something like Cassandra, and reading from HDFS
> or local fs directly.
>
> Also:
> Try using Pig's "illustrate" c
Hi Sameer,
One example is storing all the tweets for a given user in a Column
Family, where the row key is the user name/user id and the column name is of
TimeUUID type, representing the tweet's arrival time. Users would generally
like to see tweets sorted by arrival time. So TimeUUID
will help h
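
A rough illustration of the tweet example, using Python's standard `uuid` module and a plain dict in place of a Cassandra row. It assumes that ordering columns by the timestamp embedded in each version 1 UUID is a fair stand-in for how a TimeUUID comparator sorts them; the function names are hypothetical.

```python
import uuid


def add_tweet(row, text):
    """Store a tweet under a freshly generated time-based (version 1) UUID."""
    name = uuid.uuid1()
    row[name] = text
    return name


def tweets_by_arrival(row):
    """Return tweet texts ordered by the timestamp embedded in each UUID."""
    return [row[name] for name in sorted(row, key=lambda u: u.time)]


timeline = {}  # stands in for one user's row
for text in ["first", "second", "third"]:
    add_tweet(timeline, text)

print(tweets_by_arrival(timeline))
```

Because the column name itself encodes the arrival time, no separate timestamp column or index is needed to read a user's tweets in order.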
Are there exceptions in the Cassandra log?
On Wed, Jun 15, 2011 at 1:54 PM, Anurag Gujral wrote:
> Hi All,
> I had a cassandra node which was running on cassandra 0.7.3.
> Without changing the data directories I installed cassandra 0.8.0 but when I
> query data I get timeouts.
> Can som
Hi Will,
That's partly why I like to use FromCassandraBag and ToCassandraBag from
pygmalion - it does the work for you to get it back into a form that cassandra
understands.
Others may know better how to massage the data into that form using just pig,
but if all else fails, you could write a u
My problem is the column names are dynamic (a date), and pygmalion seems to
want the column names to be fixed at "compile time" (the script).
On Wed, Jun 15, 2011 at 3:04 PM, Jeremy Hanna wrote:
> Hi Will,
>
> That's partly why I like to use FromCassandraBag and ToCassandraBag from
> pygmalion -
I'll do a reply all, to keep this more consistent (sorry!).
Rather than staying stuck, I wrote a custom function: TupleToBagOfTuple. I'm
curious if I could have avoided it with proper pig scripting though.
On Wed, Jun 15, 2011 at 3:08 PM, William Oberman
wrote:
> My problem is the column names a
Yeah - for completely dynamic column names, From/To Cassandra Bag
doesn't handle that. It does handle prefixed names though - like link* will
get a bag of all the columns that start with link. But it sounds like you are
doing what I would have to do if I got into a nested data conundr
On 6/15/2011 12:14 PM, Vijay wrote:
Correction
"The problem in the above approach is you have 2 nodes between 12 to 4
in DC1 but from 4 to 12 you just have 1"
should be
"The problem in the above approach is you have 1 node between 0-4
(25%) and one node covering the rest which is 4
Thanks Jonathan. Can we turn off RR by setting READ_REPAIR_CHANCE = 0? Please advise.
Selva
From: Jonathan Ellis
To: user@cassandra.apache.org
Sent: Tue, June 14, 2011 8:59:41 PM
Subject: Re: Multi data center configuration - A question on read correction
That's just
All you heard is right...
You are not overriding Cassandra's token assignment by saying here is your
token...
Logic is:
Calculate a token for the given key...
find the node in each region independently (If you use NTS and if you set
the strategy options which says you want to replicate to the othe
Vijay, thank you for your thoughtful reply. Will Cass complain if I
don't set up my tokens like in the examples?
On 6/15/2011 2:41 PM, Vijay wrote:
All you heard is right...
You are not overriding Cassandra's token assignment by saying here is
your token...
Logic is:
Calculate a token for th
Hey all;
So, we have Cassandra running on a 5-server ring, with a RF of 3, and
we're regularly seeing major slowdowns in read & write performance while
running nodetool repair, as well as the occasional Cassandra crash
during the repair window - slowdowns past 10 seconds to perform a single
w
Here's a weird one... what's the best way to get a Cassandra node into a
"half-crashed" state?
We have a 3-node cluster running 0.7.5. A few days ago this happened
organically to node1 - the partition the commitlog was on was 100% full and
there was a "No space left on device" error, and after a w
Thanks, Aaron, but we determined that adding Java into the equation just
brings in too much complexity for something that's called out of an Nginx
Perl module. Right now I'm having trouble even replicating the above
scenario and posted a question here:
http://cassandra-user-incubator-apache-org.306
No, it won't; it will assume you are doing the right thing...
Regards,
On Wed, Jun 15, 2011 at 2:34 PM, AJ wrote:
> Vijay, thank you for your thoughtful reply. Will Cass complain if I don't
> setup my tokens like in the examples?
>
>
> On 6/15/2011 2:41 PM, Vijay wrote:
>
> All you heard
I suspect a few possibilities:
1. I have not checked, but what happens (in terms of hint delivery) if a
node tries to write something but the write times out even if the node is
marked as up?
2. I would assume there can be ever so slight variations in how different
nodes in the cluster think the re
You're right, those could all cause what you are seeing.
We used to have a "re-check hourly" scheduled task, but took it out
because it was very very performance intensive -- at the time, hints
were not stored by machine so asking "does machine X have any hints"
required scanning all hints. Shoul
Hi All,
I moved to Cassandra 0.8.0 from Cassandra 0.7.3. When I try to
move back, I get the following error:
java.lang.RuntimeException: Can't open sstables from the future! Current
version f, found file: /data/cassandra/data/system/Schema-g-9.
Please suggest.
Thanks
Anurag
Can't help you with that.
You may have to go the json2sstable route and re-import into 0.7.3
But... why would you want to go back to 0.7.3?
Terje
On Thu, Jun 16, 2011 at 10:30 AM, Anurag Gujral wrote:
> Hi All,
> I moved to cassandra 0.8.0 from cassandra-0.7.3 when I try to
> move ba
Watching this on a node here right now and it sort of shows how bad this can
get.
This node still has 109GB free disk by the way...
INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java
(line 2071) requesting GC to free disk space
INFO [CompactionExecutor:5] 2011-06-16 09:12:23,
Ok. I understand the reasoning you laid out. But, I think it should be
documented more thoroughly. I was trying to get an idea as to how
flexible Cass lets you be with the various combinations of strategies,
snitches, token ranges, etc.
It would be instructional to see what a graphical rep
The GC cleanup approach, if depending on specific objects being GCd,
is fundamentally flawed.
I brought this up earlier, won't restart that thread. It should be in
the archives.
On Wed, Jun 15, 2011 at 10:17 PM, Terje Marthinussen
wrote:
> Watching this on a node here right now and it sort of
To force a node "down" you can use nodetool disablegossip
On Wed, Jun 15, 2011 at 6:42 PM, Suan Aik Yeo wrote:
> Thanks, Aaron, but we determined that adding Java into the equation just
> brings in too much complexity for something that's called out of an Nginx
> Perl module. Right now I'm havin
There's a ticket open for this:
https://issues.apache.org/jira/browse/CASSANDRA-2521. Vote on it if
you think it's important.
-ryan
On Wed, Jun 15, 2011 at 7:34 PM, Jeffrey Kesselman wrote:
> The GC cleanup approach, if depending on specific objects being GCd,
> is fundamentally flawed.
>
> I bro
+1 for more documentation (I guess contributions are always welcome). I
will try to write it down sometime when we have a bit more time...
In 0.8, the nodetool ring command adds the DC and RAC information
http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
http://www
Jake,
"You need to maintain a huge number of distinct indexes."

Are we talking about secondary indexes? If yes, this sounds like exactly my
problem. There is so little documentation! - but I think that if I read all
there is on GitHub, I can probably start using it.
Thank you,
Mark
On Fri
DataStax has reasonably thorough documentation on their site for secondary
indexes.
On Jun 16, 2011 6:57 AM, "Mark Kerzner" wrote:
> Jake,
>
> *You need to maintain a huge number of distinct indexes.*
> *
> *
> *Are we talking about secondary indexes? If yes, this sounds like exactly
my
> problem. T
Which variables (for instance: throughput, CPU, I/O, connections) are
decisive when deciding to add a node to a Cassandra setup that is
under strain? We are trying to prove scalability, but when is the right
time to add a node to get the optimal scalability result?