AW: Cassandra at Amazon AWS

2013-01-21 Thread Roland Gude
relatively small cluster of 4 nodes where the switch to leveled compaction increased backup cost by 800 Euro a month. Greetings Roland From: Roland Gude [mailto:roland.g...@ez.no] Sent: Friday, 18 January 2013 09:23 To: user@cassandra.apache.org Subject: AW: Cassandra at Amazon AWS Priam is good for

AW: Cassandra at Amazon AWS

2013-01-18 Thread Roland Gude
Priam is good for backups, but it is another complex (though very good) part to add to a software stack. A simple solution is to take regular snapshots (via cron), compress them, and put them into S3. On S3 you can simply choose how many days the files are kept. This can be done with a couple of lines of she
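The cron-driven approach described in this mail could look roughly like the following Python sketch (the mail suggests a few lines of shell; Python is used here only for illustration). The function names, the archive naming scheme, and the bucket name are assumptions, not something from the original thread. `nodetool snapshot -t <tag>` and `aws s3 cp` are the standard command-line tools; retention would be handled by an expiration rule on the bucket rather than in the script.

```python
import tarfile
from pathlib import Path


def snapshot_command(tag):
    """Command that asks the local node to take a snapshot with the given tag."""
    return ["nodetool", "snapshot", "-t", tag]


def archive_snapshot(snapshot_dir, out_dir, tag):
    """Compress one snapshot directory into <out_dir>/<tag>.tar.gz and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"{tag}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps the paths inside the archive relative to the tag
        tar.add(str(snapshot_dir), arcname=tag)
    return archive


def upload_command(archive, bucket):
    """Command that copies the archive to S3 (the bucket name is a placeholder)."""
    return ["aws", "s3", "cp", str(archive), f"s3://{bucket}/{Path(archive).name}"]
```

A cron job would run `snapshot_command(...)` through `subprocess.run`, call `archive_snapshot` for each snapshot directory, and then run `upload_command(...)`. Where the snapshot directories live depends on the Cassandra version and data directory configuration, so that path is deliberately left as a parameter.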

AW: Replication Factor and Consistency Level Confusion

2012-12-19 Thread Roland Gude
Hi, RF 2 means that 2 nodes are responsible for any given row (no matter how many nodes are in the cluster). For your cluster with three nodes, let's just assume the following responsibilities:

Node          A     B     C
Primary keys  0-5   6-10  11-15
R
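As a rough illustration of "RF 2 means that 2 nodes are responsible for any given row", the sketch below uses the three primary ranges from the mail and places the second replica on the next node of the ring. That placement rule is an assumption (simple ring-order placement); the mail's own replica row is cut off in this listing, and the names `RING` and `replicas` are made up for the example.

```python
from bisect import bisect_left

# Upper bound of each node's primary key range, as in the mail:
# A owns 0-5, B owns 6-10, C owns 11-15.
RING = [(5, "A"), (10, "B"), (15, "C")]


def replicas(key, rf=2, ring=RING):
    """Return the rf nodes holding key: the primary plus the following nodes on the ring."""
    uppers = [upper for upper, _ in ring]
    if not 0 <= key <= uppers[-1]:
        raise ValueError(f"key {key} is outside the ring")
    primary = bisect_left(uppers, key)  # index of the node whose range contains key
    return [ring[(primary + i) % len(ring)][1] for i in range(rf)]


for key in (3, 8, 12):
    print(key, replicas(key))
```

With three nodes and RF 2, every key lives on exactly two of the three nodes, so each node stores its own primary range plus one neighbour's range.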

AW: TTL on SecondaryIndex Columns. A bug?

2012-12-19 Thread Roland Gude
I think this might be CASSANDRA-4670 (https://issues.apache.org/jira/browse/CASSANDRA-4670). Unfortunately, apart from me, no one has been able to reproduce it yet. Check whether the data is available before/after compaction. If you use leveled compaction it is hard to test, because you cannot trigger compaction manually. -Urspr

AW: secondery indexes TTL - strange issues

2012-09-17 Thread Roland Gude
DEBUG level logs that would be helpful. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/09/2012, at 10:08 PM, Roland Gude <mailto:roland.g...@ez.no> wrote: I am not sure it is compacting an old file: the same thing happens

AW: secondery indexes TTL - strange issues

2012-09-14 Thread Roland Gude
action? Are you able to replicate the problem with a fresh testing CF and some test data? If it's only a problem with imported data, can you provide a sample of the failing query? And maybe the CF definition? Cheers - Aaron Morton Freelance Developer @aaronmorton http://w

secondery indexes TTL - strange issues

2012-09-13 Thread Roland Gude
Hi, we have been running a system on Cassandra 0.7 heavily relying on secondary indexes for columns with TTL. This has been working like a charm, but we are trying hard to move forward with Cassandra and are struggling at that point: When we put our data into a new cluster (any 1.1.x version -

AW: AW: How to control location of data?

2012-01-10 Thread Roland Gude
Each node in the cluster is assigned a token (this can be done automatically, but usually should not be). The token of a node is the start token of the partition it is responsible for (and the token of the next node is the end token of the current token's partition). Assume you have the following nodes/

AW: How to control location of data?

2012-01-10 Thread Roland Gude
Hi, I think everything is called a replica, so if data is on 3 nodes you have 3 replicas. There is no such thing as an original. A partitioner decides which partition a piece of data belongs to. A replica placement strategy decides which partition goes on which node. You cannot suppress the part

AW: Garbage collection freezes cassandra node

2011-12-19 Thread Roland Gude
Tuning garbage collection is really hard, especially if you do not know why garbage collection stalls. In general I must say I have never seen software that shipped with such a good garbage collection configuration as Cassandra. The thing that looks suspicious is that the major collections a

AW: Pending ReadStage is exploding on only one node

2011-11-23 Thread Roland Gude
Are you using indexslicequeries? I described a similar problem a couple of months ago (and mechanisms to reproduce the behavior) but unfortunately failed to create an issue for it (shame on me). The mail thread is in the archives http://www.mail-archive.com/user@cassandra.apache.org/msg16157.htm

AW: flushwriter all time blocked

2011-08-29 Thread Roland Gude
public int getTotalBlockedTasks();

/**
 * Get the number of tasks currently blocked, waiting to be accepted by
 * the executor (because all threads are busy and the backing queue is full).
 */
public int getCurrentlyBlockedTasks();

On Mon, Aug 29, 2011 at 3:39 AM, Roland Gude w

flushwriter all time blocked

2011-08-29 Thread Roland Gude
Hi all, on a 0.7.8 cluster, in tpstats I can see the flushwriter stage having several tasks in state all-time-blocked (immediately after node restart it's 8, but it grows over time to around 300). What does it mean (or how can I find out), and what can I do about it? -- YOOCHOOSE GmbH Roland Gude

AW: IndexSliceQuery issue - ReadStage piling up (looks like deadlock/infinite loop or similar)

2011-08-11 Thread Roland Gude
p://www.thelastpickle.com On 10 Aug 2011, at 03:33, Roland Gude wrote: Hi, I experience issues when doing an indexslicequery with multiple expressions if one of the expressions is about a non-indexed column. I did the equivalent of this example (but with my data) from http://www.datastax.com/dev/bl

IndexSliceQuery issue - ReadStage piling up (looks like deadlock/infinite loop or similar)

2011-08-09 Thread Roland Gude
xception
    at org.apache.cassandra.db.ColumnFamily.addAll(ColumnFamily.java:131)
    at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1615)
    at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
    ... 4 more

Can anybody rep

AW: results of index slice query

2011-07-29 Thread Roland Gude
? -Original Message- From: Roland Gude [mailto:roland.g...@yoochoose.com] Sent: Thursday, 28 July 2011 11:22 To: user@cassandra.apache.org Subject: AW: results of index slice query Created https://issues.apache.org/jira/browse/CASSANDRA-2964 -Original Message- From: Jonathan

AW: results of index slice query

2011-07-28 Thread Roland Gude
Wed, Jul 27, 2011 at 6:44 AM, Roland Gude wrote:
> Hi,
>
> I was just experiencing that when I do an IndexSliceQuery with the index
> column not in the slice range the index column will be returned anyway. Is
> this behavior intended or is it a bug (if so - is it a Cassandra bug or

results of index slice query

2011-07-27 Thread Roland Gude
, roland -- YOOCHOOSE GmbH Roland Gude Software Engineer Im Mediapark 8, 50670 Köln +49 221 4544151 (Tel) +49 221 4544159 (Fax) +49 171 7894057 (Mobile) Email: roland.g...@yoochoose.com WWW: www.yoochoose.com YOOCHOOSE GmbH Managing Director: Dr. Uwe Alkemper, M

AW: Multi-type column values in single CF

2011-07-03 Thread Roland Gude
You could do the serialization for all your supported datatypes yourself (many serialization libraries are available, and a pretty thorough benchmark of them can be found here: https://github.com/eishay/jvm-serializers/wiki) and prepend the serialized bytes with an identifier for your dat
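The idea of prepending an identifier to the serialized bytes can be sketched as follows. This is only an illustration under assumptions: the one-byte tag values, the three supported types, and the function names `encode` and `decode` are invented for the example, and the standard `struct` module stands in for whatever serialization library is actually chosen.

```python
import struct

# Hypothetical one-byte type tags; any stable mapping would do.
TAG_LONG, TAG_DOUBLE, TAG_TEXT = 1, 2, 3


def encode(value):
    """Serialize value and prefix it with a one-byte type tag."""
    if isinstance(value, bool):
        raise TypeError("bool is not supported in this sketch")
    if isinstance(value, int):
        return bytes([TAG_LONG]) + struct.pack(">q", value)
    if isinstance(value, float):
        return bytes([TAG_DOUBLE]) + struct.pack(">d", value)
    if isinstance(value, str):
        return bytes([TAG_TEXT]) + value.encode("utf-8")
    raise TypeError(f"unsupported type: {type(value).__name__}")


def decode(data):
    """Read the tag byte and deserialize the rest accordingly."""
    tag, payload = data[0], data[1:]
    if tag == TAG_LONG:
        return struct.unpack(">q", payload)[0]
    if tag == TAG_DOUBLE:
        return struct.unpack(">d", payload)[0]
    if tag == TAG_TEXT:
        return payload.decode("utf-8")
    raise ValueError(f"unknown type tag: {tag}")
```

The column value stored in Cassandra would then be the tagged byte string, which suits a column family whose values are validated as plain bytes.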

AW: Column value type

2011-06-22 Thread Roland Gude
There is a comparator type (for the name) and a validation type (for the value). If you have set the validation to be UTF8, you can only store data that is valid UTF8 there. The default validation is BytesType, so it should accept everything unless otherwise specified. I cannot tell anything regard
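The difference between the two validation types can be mimicked in a few lines. This is a toy model, not Cassandra's validator code: the function names are invented, and Python's UTF-8 decoder is assumed to be a reasonable stand-in for the server-side check.

```python
def accepts_utf8(value: bytes) -> bool:
    """Mimics a UTF8 validator: only byte strings that decode as UTF-8 pass."""
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def accepts_bytes(value: bytes) -> bool:
    """Mimics a bytes validator: every byte string passes."""
    return True
```

For example, `"Köln".encode("utf-8")` passes both checks, while the raw bytes `b"\xff\xfe"` pass only the bytes check.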

Re: "range query" vs "slice range query"

2011-05-25 Thread Roland Gude
That is correct. Random partitioner orders rows according to the MD5 sum. On 25.05.2011 at 16:11, "Robert Jackson" <mailto:robe...@promedicalinc.com> wrote: Also, it is my understanding that if you are not using OrderPreservingPartitioner a get_range_slices may not return what you would expec
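To see why a key range behaves unexpectedly under the random partitioner, it helps to order a few keys by their MD5 sum. The sketch below is a simplification: it treats the digest as an unsigned integer, whereas the exact token computation inside Cassandra may differ, and the names `md5_token` and `ring_order` are made up for the example.

```python
import hashlib


def md5_token(row_key: bytes) -> int:
    """MD5 digest of the row key, read as an unsigned big-endian integer."""
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")


def ring_order(row_keys):
    """Order row keys the way a hash-based partitioner would, not lexically."""
    return sorted(row_keys, key=md5_token)


keys = [b"alice", b"bob", b"carol", b"dave"]
print("lexical order:", sorted(keys))
print("md5 order:    ", ring_order(keys))
```

Because neighbouring keys end up scattered across the hash space, iterating "from key X to key Y" follows hash order rather than key order.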

Re: "range query" vs "slice range query"

2011-05-25 Thread Roland Gude
I cannot display the book page you are referring to, but your general understanding is correct. A range refers to several rows, a slice refers to several columns. A RangeSlice is a combination of both: from all rows in a range, get a specific slice of columns. On 25.05.2011 at 10:43, "dav
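The distinction between a range, a slice, and a range slice can be illustrated with a toy in-memory column family. Everything here is an assumption made for the example: the data, the function names, and the lexical comparison of row keys (which corresponds to an order-preserving partitioner); these are not the Thrift API signatures.

```python
# Toy column family: row key -> {column name -> value}.
CF = {
    "row1": {"a": 1, "b": 2, "c": 3},
    "row2": {"a": 4, "c": 5},
    "row3": {"b": 6, "d": 7},
}


def slice_columns(columns, start, finish):
    """Slice: the columns of ONE row whose names fall in [start, finish]."""
    return {name: columns[name] for name in sorted(columns) if start <= name <= finish}


def range_slice(cf, start_key, end_key, col_start, col_finish):
    """Range slice: for every row in [start_key, end_key], take the same column slice."""
    return {
        key: slice_columns(cf[key], col_start, col_finish)
        for key in sorted(cf)
        if start_key <= key <= end_key
    }


print(slice_columns(CF["row1"], "a", "b"))
print(range_slice(CF, "row1", "row2", "b", "c"))
```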

AW: Does anyone have Cassandra running on OpenSolaris?

2011-05-09 Thread Roland Gude
Use bash as a shell: #bash bin/cassandra -f -Original Message- From: Jeffrey Kesselman [mailto:jef...@gmail.com] Sent: Monday, 9 May 2011 17:12 To: user@cassandra.apache.org Subject: Does anyone have Cassandra running on OpenSolaris? I get this error: bin/cassandra: syntax

Re: low performance inserting

2011-05-03 Thread Roland Gude
Hi, not sure this is the cause of your bad performance, but you are measuring data creation and insertion together. Your data creation involves lots of class casts, which are probably quite slow. Try timing only the b.send part and see how long that takes. Roland On 03.05.2011 at 12:30,
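Separating the two measurements could look like this sketch. The functions `build_rows` and `send` are stand-ins invented for the example (the real code in the thread calls `b.send`, which is not shown in this listing), so only the timing pattern itself carries over.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def build_rows(n):
    """Stand-in for the data creation step."""
    return [(f"key{i}", {"col": str(i)}) for i in range(n)]


def send(rows):
    """Stand-in for the real insertion call; here it only counts the rows."""
    return len(rows)


rows, create_seconds = timed(build_rows, 10_000)
sent, send_seconds = timed(send, rows)
print(f"create: {create_seconds:.4f}s  send: {send_seconds:.4f}s")
```

If the creation time dominates, the insert path is not the bottleneck and tuning Cassandra will not help much.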

AW: AW: Two versions of schema

2011-04-19 Thread Roland Gude
Yeah, it happens from time to time, even if everything seems to be fine, that schema changes don't work correctly. But it's always repairable with the described procedure. Therefore having an operator available is a must, I think. Drain is a nodetool command. The node flushes data and stops ac

AW: Two versions of schema

2011-04-18 Thread Roland Gude
Schema updates in Cassandra trickle through the cluster over time, very much like normal writes do. But they keep some state indicating the "parent" schema, and they will only be applied to a node if the parent schema is correct, thus ensuring the correct order of schema changes. This process is

Re: Atomicity Strategies

2011-04-10 Thread Roland Gude
A strategy that should cover at least some use cases is roughly like this: given that CFs A and B should be in sync, in the write 'a' to CF A add another column 'Synchronisation_token' and write a TimeUUID 'T' (or a timestamp or some other value that allows (time-based) ordering) as its value. On the related

Re: Site Not Surviving a Single Cassandra Node Crash

2011-04-10 Thread Roland Gude
Not sure about that Hector version, but there was a Hector bug where Hector did not stop using a dead node as proxy and did not do proper load balancing of the requests. If you enable trace logs for Hector you can see which nodes it uses for requests. If there is a newer 0.6 Hector you sh

Re: Secondary Index keeping track of column names

2011-04-07 Thread Roland Gude
You could simulate it though. Just add some meta column with a boolean value indicating whether the referred column is in the row or not. Then add an index on that meta column and query for it. I.e. Row a: (c=1234),(has_c=Yes) Query: List cf where has_c=Yes On 06.04.2011 at 18:52, "Jonatha
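The meta-column trick from this mail can be modelled in memory as below. The helper names `with_marker` and `where` are invented for the example, and the linear scan in `where` only stands in for what a secondary index on `has_c` would answer; it is not how Cassandra evaluates the query.

```python
def with_marker(columns, tracked="c", marker="has_c"):
    """Return a copy of the row with a marker column saying whether `tracked` is present."""
    row = dict(columns)
    row[marker] = "Yes" if tracked in columns else "No"
    return row


# Toy column family: row key -> columns, written through with_marker.
CF = {
    "a": with_marker({"c": 1234}),
    "b": with_marker({"d": 42}),
}


def where(cf, column, value):
    """Stand-in for an indexed equality query such as: where has_c = Yes."""
    return sorted(key for key, cols in cf.items() if cols.get(column) == value)


print(where(CF, "has_c", "Yes"))
```

The cost of this approach is that every write (and every delete of column `c`) has to keep the marker column up to date.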

AW: Strange nodetool repair behaviour

2011-04-04 Thread Roland Gude
I am experiencing the same behavior but had it on previous versions of 0.7 as well. -Original Message- From: Jonas Borgström [mailto:jonas.borgst...@trioptima.com] Sent: Monday, 4 April 2011 12:26 To: user@cassandra.apache.org Subject: Strange nodetool repair behaviour Hi,

AW: too many open files - maybe a fd leak in indexslicequeries

2011-04-02 Thread Roland Gude
Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Friday, 1 April 2011 06:07 To: user@cassandra.apache.org Cc: Roland Gude; Juergen Link; Johannes Hoerle Subject: Re: too many open files - maybe a fd leak in indexslicequeries Index queries (ColumnFamilyStore.scan) don'

too many open files - maybe a fd leak in indexslicequeries

2011-03-31 Thread Roland Gude
anybody know about it? Where could I look? Greetings, roland

AW: problems while TimeUUIDType-index-querying with two expressions

2011-03-15 Thread Roland Gude
] Sent: Tuesday, 15 March 2011 07:54 To: user@cassandra.apache.org Cc: Juergen Link; Roland Gude; her...@datastax.com Subject: Re: problems while TimeUUIDType-index-querying with two expressions Perfectly reasonable, created https://issues.apache.org/jira/browse/CASSANDRA-2328 Aaron On 15 Mar 2011

AW: cant seem to figure out secondary index definition

2011-03-04 Thread Roland Gude
find out which indexes exist for which columns in a given cluster? Any help would be greatly appreciated. -Original Message- From: Roland Gude [mailto:roland.g...@yoochoose.com] Sent: Monday, 21 February 2011 18:36 To: user@cassandra.apache.org Subject: Re: cant seem to figur

Re: cant seem to figure out secondary index definition

2011-02-21 Thread Roland Gude
ave an equals
> clause with that UUID as the column name?
>
> On Thu, Feb 17, 2011 at 11:32 AM, Roland Gude
> wrote:
>> Hi again,
>>
>> i am still having trouble with this.
>>
>> If I define the index using cli with these commands:
>>

AW: cant seem to figure out secondary index definition

2011-02-17 Thread Roland Gude
ail.com] Sent: Tuesday, 15 February 2011 16:22 To: user@cassandra.apache.org Subject: Re: cant seem to figure out secondary index definition Ah, ok. I checked that in the source and the problem is that you wrote "validation_class" but you should write "validator_class". Augi 2011

AW: rename index

2011-02-17 Thread Roland Gude
names causing a problem? Aaron On 17/02/2011, at 6:15 AM, Roland Gude <mailto:roland.g...@yoochoose.com> wrote: Hi, unfortunately I made a copy-paste error and created two indexes called “myindex” on different column families. What can I do to fix this? Below the output from descri

rename index

2011-02-16 Thread Roland Gude
tion Class: org.apache.cassandra.db.marshal.UTF8Type Index Name: MyIndex Index Type: KEYS

AW: cant seem to figure out secondary index definition

2011-02-15 Thread Roland Gude
"validation_class" but you should write "validator_class". Augi 2011/2/15 Roland Gude <mailto:roland.g...@yoochoose.com> Yeah, I know about that, but the definition I have is for a cluster that is started/stopped from a unit test with Hector's EmbeddedServerHelper, which takes definit

AW: cant seem to figure out secondary index definition

2011-02-15 Thread Roland Gude
yspace definition is for demonstration purposes only. Cassandra will not load these definitions during startup. See http://wiki.apache.org/cassandra/FAQ#no_keyspaces for an explanation." So you should perform all schema-related operations via the Thrift/Avro API, or you can use the Cassandra CLI. Au

cant seem to figure out secondary index definition

2011-02-15 Thread Roland Gude
there? Greetings, roland

AW: cassandra solaris x64 support

2011-02-11 Thread Roland Gude
This is a problem with the start scripts, not with Cassandra itself (or any of its configuration). The shell you are using cannot run the cassandra shell script. Try: #bash bin/cassandra -f As far as I know, it should work fine. Actually it should work with sh as well... -Ursprüngliche N

AW: Data ends up in wrong Columnfamily

2011-02-11 Thread Roland Gude
Yes, this could very well be the issue. As far as I can see, it's already fixed for 0.7.1. Hopefully it will pass a vote soon. Thanks, Roland -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On behalf of Peter Schuller Sent: Friday, 11 February 2011 09:11 To: user@cassand

AW: Why is it when I removed a row the RowKey is still there?

2011-02-11 Thread Roland Gude
It has something to do with the way data is deleted in Cassandra. You are not doing anything wrong. See http://wiki.apache.org/cassandra/FAQ#range_ghosts or, for some more detail, http://wiki.apache.org/cassandra/DistributedDeletes -Original Message- From: Joshua Partogi [

AW: Data ends up in wrong Columnfamily

2011-02-11 Thread Roland Gude
chine A even know the other CF name? Can you log the batch mutations you are sending? When it appears in the other CF, is the data complete? There is also a Hector list, perhaps they can help. Aaron On 10/02/2011, at 11:58 PM, Roland Gude <mailto:roland.g...@yoochoose.com> wrote: Hi, I am

Data ends up in wrong Columnfamily

2011-02-10 Thread Roland Gude
data that was written (according to my application logs) from Machine A to CF_A ends up in CF_A and in one of the other column families. Any ideas why this could be happening? I am using Cassandra 0.7.0 and Hector 0.7.0-23. Greetings, Roland

strange issue with timeUUID columns

2010-12-22 Thread Roland Gude
seem to be sufficient) I can query the data again. The Cassandra I use is a single-node 0.7.0-rc2. I am querying with Hector. Has anyone else experienced such issues? Can someone think of an explanation for this? Kind regards, roland

Streaming Row Ranges

2010-12-16 Thread Roland Gude
them (and being limited by the count of rows). Of course client-side libraries could hide the paging stuff, but that would not improve latency. Is something like this possible? Is it perhaps already implemented? Greetings, roland