Re: Read efficiency question

2016-12-30 Thread Janne Jalkanen
In practice, the performance you’re getting is likely to be impacted by your reading patterns.  If you do a lot of sequential reads where key1 and key2 stay the same, and only key3 varies, then you may be getting better performance out of the second option due to hitting the row and disk caches more

Re: Revisit Cassandra EOL Policy

2016-01-07 Thread Janne Jalkanen
If you wish to have a specific EOL policy, you need to basically buy it. It's unusual for open source projects to give any sort of an EOL policy; that's something that people with very specific requirements are willing to cough up a lot of money on. And getting money by giving support on older

Re: [RELEASE] Apache Cassandra 3.1 released

2015-12-13 Thread Janne Jalkanen
> There's not going to be a 3.3.x series, there will be one 3.3 release (unless > there is a critical bug, as mentioned above). > > There are two separate release lines going on: > > 3.0.1 -> 3.0.2 -> 3.0.3 -> 3.0.4 -> ... (every release is a bugfix) > > 3.1 -> 3.2 -> 3.3 -> 3.4 -> ... (odd num

Re: [RELEASE] Apache Cassandra 3.1 released

2015-12-11 Thread Janne Jalkanen
Thanks for this clarification, however... > So, for the 3.x line: > If you absolutely must have the most stable version of C* and don't care at > all about the new features introduced in even versions of 3.x, you want the > 3.0.N release. So there is no reason why you would ever want to run 3.1

Re: [RELEASE] Apache Cassandra 3.1 released

2015-12-09 Thread Janne Jalkanen
I’m sorry, I don’t understand the new release scheme at all. Both of these are bug fixes on 3.0? What’s the actual difference? If I just want to run the most stable 3.0, should I run 3.0.1 or 3.1? Will 3.0 gain new features which will not go into 3.1, because that’s a bug fix release on 3.0?

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-05 Thread Janne Jalkanen
path. Although the > table is defined to use Snappy Compression. Is this compression library or > some other transitive dependency pulled in by Astyanax enabling compression > of the payload i.e. sent over the wire and account for the difference in tp99? > Regards > Sachin > > On

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-03 Thread Janne Jalkanen
n Sat, Aug 1, 2015 at 11:50 PM, Janne Jalkanen <mailto:janne.jalka...@ecyrd.com>> wrote: > No, this just tells that your client (S3 using Datastax driver) cannot > communicate to the Cassandra cluster using a compressed protocol, since the > necessary libraries are mi

Re: Cassandra Data Stax java driver & Snappy Compression library

2015-08-01 Thread Janne Jalkanen
No, this just tells that your client (S3 using Datastax driver) cannot communicate to the Cassandra cluster using a compressed protocol, since the necessary libraries are missing on the client side. Servers will still compress the data they receive when they write it to disk. In other words C

Re: User click count

2014-12-30 Thread Janne Jalkanen
ur interface will have very > unstable read time. > > Pick the best solution (or combination) for your use case. Those > disadvantages lists are not exhaustive, just things that came to my mind > right now. > > C*heers > > Alain > > 2014-12-29 13:33 GMT+01:0

Re: User click count

2014-12-29 Thread Janne Jalkanen
Hi! It’s really a tradeoff between accurate and fast and your read access patterns; if you need it to be fairly fast, use counters by all means, but accept the fact that they will (especially in older versions of cassandra or adverse network conditions) drift off from the true click count. If

Re: Practical use of counters in the industry

2014-12-23 Thread Janne Jalkanen
On 20 Dec 2014, at 09:46, Robert Coli wrote: > On Thu, Dec 18, 2014 at 7:19 PM, Rajath Subramanyam > wrote: > Thanks Ken. Any other use cases where counters are used apart from Rainbird ? > > Disqus use(d? s?) them behind an in-memory accumulator which batches and > periodically flushes. Th

Re: are repairs in 2.0 more expensive than in 1.2

2014-10-24 Thread Janne Jalkanen
m. > > Sean > > [1] https://issues.apache.org/jira/browse/CASSANDRA-8177 > > On Thu, Oct 23, 2014 at 2:04 PM, Janne Jalkanen > wrote: > > On 23 Oct 2014, at 21:29 , Robert Coli wrote: > >> On Thu, Oct 23, 2014 at 9:33 AM, Sean Bridges wrote: >> The chang

Re: are repairs in 2.0 more expensive than in 1.2

2014-10-23 Thread Janne Jalkanen
On 23 Oct 2014, at 21:29 , Robert Coli wrote: > On Thu, Oct 23, 2014 at 9:33 AM, Sean Bridges wrote: > The change from parallel to sequential is very dramatic. For a small cluster > with 3 nodes, using cassandra 2.0.10, a parallel repair takes 2 hours, and > io throughput peaks at 6 mb/s.

Re: Moving Cassandra from EC2 Classic into VPC

2014-09-09 Thread Janne Jalkanen
Alain Rodriguez outlined this procedure that he was going to try, but failed to mention whether this actually worked :-) https://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201406.mbox/%3cca+vsrlopop7th8nx20aoz3as75g2jrjm3ryx119deklynhq...@mail.gmail.com%3E /Janne On 8 Sep 2014,

Re: What % of cassandra developers are employed by Datastax?

2014-05-16 Thread Janne Jalkanen
Don’t know, but as a potential customer of DataStax I’m also concerned at the fact that there does not seem to be a competitor offering Cassandra support and services. All innovation seems to be occurring only in the OSS version or DSE(*). I’d welcome a competitor for DSE - it does not even ha

Weird row cache behaviour

2014-04-06 Thread Janne Jalkanen
Heya! I’ve been observing some strange and worrying behaviour all this week with row cache hits taking hundreds of milliseconds. Cassandra 1.2.15, Datastax CQL driver 1.0.4. EC2 m1.xlarge instances RF=3, N=4 vnodes in use key cache: 200M row cache: 200M row_cache_provider: SerializingCacheProvid

Re: Row cache vs. OS buffer cache

2014-01-23 Thread Janne Jalkanen
Our experience is that you want to have all your very hot data fit in the row cache (assuming you don’t have very large rows), and leave the rest for the OS. Unfortunately, it completely depends on your access patterns and data what is the right size for the cache - zero makes sense for a lot

Re: Setting up Cassandra to store on a specific node and not replicate

2013-12-19 Thread Janne Jalkanen
Probably yes, if you also disabled any sort of failovers from the token-aware client… (Talking about this makes you realize how many failsafes Cassandra has. And still you can lose data… :-P) /Janne On 18 Dec 2013, at 20:31, Robert Coli wrote: > On Wed, Dec 18, 2013 at 2:44 AM, Sylvain Lebr

Re: Setting up Cassandra to store on a specific node and not replicate

2013-12-18 Thread Janne Jalkanen
This may be hard because the coordinator could store hinted handoff (HH) data on disk. You could turn HH off and have RF=1 to keep data on a single instance, but you would be likely to lose data if you had any problems with your instances… Also you would need to tweak the memtable flushing so t

Re: user / password authentication advice

2013-12-11 Thread Janne Jalkanen
Hi! You're right, this isn't really Cassandra-specific. Most languages/web frameworks have their own way of doing user authentication, and then you just typically write a plugin that just stores whatever data the system needs in Cassandra. For example, if you're using Java (or Scala or Groovy

Re: Data loss when swapping out cluster

2013-11-27 Thread Janne Jalkanen
A-yup. Got burned by this too some time ago myself. If you do accidentally try to bootstrap a seed node, the solution is to run repair after adding the new node but before removing the old one. However, during this time the node will advertise itself as owning a range, but when queried, it'll retu

Re: Data loss when swapping out cluster

2013-11-26 Thread Janne Jalkanen
That sounds bad! Did you run repair at any stage? Which CL are you reading with? /Janne On 25 Nov 2013, at 19:00, Christopher J. Bottaro wrote: > Hello, > > We recently experienced (pretty severe) data loss after moving our 4 node > Cassandra cluster from one EC2 availability zone to an

Re: Efficient IP address location lookup

2013-11-16 Thread Janne Jalkanen
Idea: Put only range end points in the table with primary key (part, remainder) insert into location (part, remainder, city) values (100,10,Sydney) // 100.0.0.1-100.0.0.10 is Sydney insert into location (part, remainder, city) values (100,50,Melbourne) // 100.0.0.11-100.0.0.50 is Melb then look
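
A minimal sketch of the lookup this idea seems to imply, assuming the query picks the first stored end point at or above the requested remainder within one `part` (in CQL terms, something like a `remainder >= ?` slice limited to one row). The dict below is only an in-memory stand-in for the `location` table, and the names `LOCATION` and `lookup_city` are hypothetical, not from the original message.

```python
from bisect import bisect_left

# In-memory stand-in for the `location` table (assumption for illustration):
# part -> list of (range_end, city), sorted by range_end like a clustering column.
LOCATION = {
    100: [(10, "Sydney"), (50, "Melbourne")],
}

def lookup_city(part, remainder):
    """Return the city of the first range end >= remainder, or None if no range covers it."""
    rows = LOCATION.get(part, [])
    ends = [end for end, _ in rows]
    i = bisect_left(ends, remainder)  # index of first end point >= remainder
    return rows[i][1] if i < len(rows) else None

print(lookup_city(100, 7))   # falls in the range ending at 10
print(lookup_city(100, 11))  # falls in the range ending at 50
print(lookup_city(100, 51))  # past the last stored end point
```

Storing only end points keeps one row per range rather than one per address; the cost is that a gap between ranges needs its own marker row, which this sketch does not model.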

Re: [RELEASE] Apache Cassandra 1.2.11 released

2013-10-23 Thread Janne Jalkanen
Question - is https://issues.apache.org/jira/browse/CASSANDRA-6102 in 1.2.11 or not? CHANGES.txt says it's not, JIRA says it is. /Janne (temporarily unable to check out the git repo) On Oct 22, 2013, at 13:48 , Sylvain Lebresne wrote: > The Cassandra team is pleased to announce the release of

Re: Disappearing index data.

2013-10-07 Thread Janne Jalkanen
https://issues.apache.org/jira/browse/CASSANDRA-5732 There is now a reproducible test case. /Janne On Oct 7, 2013, at 16:29 , Michał Michalski wrote: > I had similar issue (reported many times here, there's also a JIRA issue, but > people reporting this problem were unable to reproduce it).

Re: Mystery PIG issue with 1.2.10

2013-09-26 Thread Janne Jalkanen
Sorry, got sidetracked :) https://issues.apache.org/jira/browse/CASSANDRA-6102 /Janne On Sep 26, 2013, at 20:04 , Robert Coli wrote: > On Thu, Sep 26, 2013 at 1:00 AM, Janne Jalkanen > wrote: > > Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear > a

Re: Mystery PIG issue with 1.2.10

2013-09-26 Thread Janne Jalkanen
13, at 3:57 AM, Chad Johnston wrote: > >> As an FYI, creating the table without the "WITH COMPACT STORAGE" and using >> CqlStorage works just fine in 1.2.10. >> >> I know that CqlStorage and AbstractCassandraStorage got changed for 1.2.10 - >> maybe th

Mystery PIG issue with 1.2.10

2013-09-25 Thread Janne Jalkanen
Heya! I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle integer values. Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1. Single node for testing this. First a table: > CREATE TABLE testc ( key text PRIMARY KEY, ivalue int, svalue text, value big

Re: [Pig] ERROR 2118: Could not get input splits

2013-09-20 Thread Janne Jalkanen
I just started moving our scripts to Pig 0.11.1 from 0.9.2 and I see the same issue - about 75-80% time it fails. So I'm not moving :-/. I am using OSX + Oracle Java7 and CassandraStorage, but I did not see any difference between CassandraStorage and CqlStorage. Cassandra 1.2.9, though 1.1.10

Re: Failed decommission

2013-08-25 Thread Janne Jalkanen
ache.org/jira/browse/CASSANDRA-5857?focusedCommentId=13748998&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13748998 > > > Cheers, > > Mike > > > On Sun, Aug 25, 2013 at 4:06 AM, Janne Jalkanen > wrote: > This on cass 1.2.

Re: Failed decommission

2013-08-25 Thread Janne Jalkanen
Aug 25, 2013 at 3:06 AM, Janne Jalkanen > wrote: > This on cass 1.2.8 > > Ring state before decommission > > -- Address Load Owns Host ID > TokenRack > UN 10.0.0.1 38.82 GB 3

Failed decommission

2013-08-25 Thread Janne Jalkanen
This on cass 1.2.8 Ring state before decommission -- Address Load Owns Host ID TokenRack UN 10.0.0.1 38.82 GB 33.3% 21a98502-dc74-4ad0-9689-0880aa110409 1 1a UN 10.0.

Re: Cassandra JVM heap sizes on EC2

2013-08-24 Thread Janne Jalkanen
We've been trying to keep the heap as small as possible; the disk access penalty on EC2 is big enough - even on instance store - that you want to give as much memory to disk caches as you can. Of course, then you will need to keep extra vigilant on your garbage collection and tune various thin

Re: understanding memory footprint

2013-08-15 Thread Janne Jalkanen
Also, if you are using leveled compaction, remember that each SSTable will take a couple of MB of heap space. You can tune this by choosing a good sstable_size_in_mb value for those CFs which are on LCS and contain lots of data. Default is 5 MB, which is for many cases inadequate, so most peo

Re: Which of these VPS configurations would perform better for Cassandra ?

2013-08-06 Thread Janne Jalkanen
Well, Amazon is expensive. Hetzner will sell you dedicated SSD RAID-1 servers with 32GB RAM and 4 cores with HT for €59/mth. However, if pricing is an issue, you could start with: 1 server : read at ONE, write at ONE, RF=1. You will have consistency, but not high availability. This is the sam

Re: sstable size change

2013-07-22 Thread Janne Jalkanen
I don't think upgradesstables is enough, since it's more of a "change this file to a new format but don't try to merge sstables and compact" -thing. Deleting the .json -file is probably the only way, but someone more familiar with cassandra LCS might be able to tell whether manually editing the

Re: Cassandra Out of Memory on startup while reading cache

2013-07-22 Thread Janne Jalkanen
Sounds like this: https://issues.apache.org/jira/browse/CASSANDRA-5706, which is fixed in 1.2.7. /Janne On 22 Jul 2013, at 20:40, Jason Tyler wrote: > Hello, > > Since upgrading from 1.1.9 to 1.2.6 over the last week, we've had two > instances where cassandra was unable, but kept trying to

Re: Why does cassandra PoolingSegmentedFile recycle the RandomAccessReader?

2013-07-15 Thread Janne Jalkanen
I had exactly the same problem, so I increased the sstable size (from 5 to 50 MB - the default 5MB is most certainly too low for serious usecases). Now the number of SSTableReader objects is manageable, and my heap is happier. Note that for immediate effect I stopped the node, removed the *.js
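
A rough back-of-the-envelope sketch of why the SSTable size matters here: with leveled compaction the number of SSTables (and so of SSTableReader objects) scales roughly with data size divided by SSTable size. The 100 GB per-node figure and the function name are assumptions for illustration only, and the estimate ignores compression and partially filled SSTables.

```python
import math

def estimated_sstable_count(data_size_gb, sstable_size_mb):
    """Very rough estimate: total data divided by the target SSTable size."""
    return math.ceil(data_size_gb * 1024 / sstable_size_mb)

# Hypothetical node holding 100 GB in LCS column families.
for size_mb in (5, 50):
    print(size_mb, "MB ->", estimated_sstable_count(100, size_mb), "SSTables")
```

Going from 5 MB to 50 MB cuts the estimated count by a factor of ten, which matches the direction of the change described above.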

Re: Billions of counters

2013-06-13 Thread Janne Jalkanen
Hi! We have a similar situation of millions of events on millions of items - turns out that this isn't really a problem, because there tends to be a very strong power -distribution: very few of the items get a lot of hits, some get some, and the majority gets no hits (though most of them do ge

Re: best practices on EC2 question

2013-05-16 Thread Janne Jalkanen
On May 16, 2013, at 17:05 , Brian Tarbox wrote: > An alternative that we had explored for a while was to do a two stage backup: > 1) copy a C* snapshot from the ephemeral drive to an EBS drive > 2) do an EBS snapshot to S3. > > The idea being that EBS is quite reliable, S3 is still the emergency

Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-16 Thread Janne Jalkanen
Might you be experiencing this? https://issues.apache.org/jira/browse/CASSANDRA-4417 /Janne On May 16, 2013, at 14:49 , Alain RODRIGUEZ wrote: > @Rob: Thanks about the feedback. > > Yet I have a weird behavior still unexplained about repairing. Are counters > supposed to be "repaired" too ?

Re: normal thread counts?

2013-05-01 Thread Janne Jalkanen
This sounds very much like https://issues.apache.org/jira/browse/CASSANDRA-5175, which was fixed in 1.1.10. /Janne On Apr 30, 2013, at 23:34 , aaron morton wrote: >> Many many many of the threads are trying to talk to IPs that aren't in the >> cluster (I assume they are the IP's of dead hos

Re: secondary index problem

2013-03-15 Thread Janne Jalkanen
This could be either of the following bugs (which might be the same thing). I get it too every time I recycle a node on 1.1.10. https://issues.apache.org/jira/browse/CASSANDRA-4973 or https://issues.apache.org/jira/browse/CASSANDRA-4785 /Janne On Mar 15, 2013, at 23:24 , Brett Tinling wrote:

Re: HintedHandoff IOError?

2013-03-15 Thread Janne Jalkanen
mpaction and flushing sstables. > > Cheers > > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 11/03/2013, at 11:19 PM, Janne Jalkanen wrote: > >> >> Oop

Re: HintedHandoff IOError?

2013-03-11 Thread Janne Jalkanen
> > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 11/03/2013, at 2:13 PM, Robert Coli wrote: > >> On Mon, Mar 11, 2013 at 7:05 AM, Janne Jalkanen >> wrote: >>

HintedHandoff IOError?

2013-03-11 Thread Janne Jalkanen
I keep seeing these in my log. Three-node cluster, one node is working fine, but two other nodes have increased latencies and these in the error logs (might of course be unrelated). No obvious GC pressure, no disk errors that I can see. Ubuntu 12.04 on EC2, Java 7. Repair is run regularly. My

Re: LCS and counters

2013-02-25 Thread Janne Jalkanen
At least for our use case (reading slices from varyingly sized rows from 10-100k composite columns with counters and hundreds of writes/second) LCS has a nice ~75% lower read latency than Size Tiered. And compactions don't stop the world anymore. Repairs do easily trigger a few hundred compact

Re: Wide rows in CQL 3

2013-01-09 Thread Janne Jalkanen
On 10 Jan 2013, at 01:30, Edward Capriolo wrote: > Column families that mix static and dynamic columns are pretty common. In > fact it is pretty much the default case, you have a default validator then > some columns have specific validators. In the old days people used to say > "You only nee

Re: Cassandra nodes failing with OOM

2012-11-19 Thread Janne Jalkanen
Something that bit us recently was the size of bloom filters: we have a column family which is mostly written to, and only read sequentially, so we were able to free a lot of memory and decrease GC pressure by increasing bloom_filter_fp_chance for that particular CF. This on 1.0.12. /Janne O

Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-12 Thread Janne Jalkanen
On 12 Sep 2012, at 00:50, Omid Aladini wrote: > On Tue, Sep 11, 2012 at 8:33 PM, Janne Jalkanen > wrote: >> >> Does this mean that LCS on 1.0.x should be considered unsafe to >> use? I'm using them for semi-wide frequently-updated CounterColumns >> and th

Re: Assertions running Cleanup on a 3-node cluster with Cassandra 1.1.4 and LCS

2012-09-11 Thread Janne Jalkanen
> A bug in Cassandra 1.1.2 and earlier could cause out-of-order sstables > and inter-level overlaps in CFs with Leveled Compaction. Your sstables > generated with 1.1.3 and later should not have this issue [1] [2]. Does this mean that LCS on 1.0.x should be considered unsafe to use? I'm using th

Re: Data aggregation - averages, sums, etc.

2012-05-19 Thread Janne Jalkanen
> 2. I know I have counter columns. I can do sums. But can I do averages ? One counter column for the sum, one counter column for the count. Divide for average :-) /Janne
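
A minimal sketch of the two-counter approach, using a plain dict in place of the counter columns. The names `counters`, `record` and `average` are hypothetical; in Cassandra both increments would be counter updates on the same row.

```python
# Stand-in for two counter columns on one row (assumption for illustration).
counters = {"sum": 0, "count": 0}

def record(value):
    """Increment both counters for one observed value."""
    counters["sum"] += value
    counters["count"] += 1

def average():
    """Divide at read time; return None when nothing has been recorded yet."""
    if counters["count"] == 0:
        return None
    return counters["sum"] / counters["count"]

for v in (10, 20, 60):
    record(v)
print(average())
```

The division happens on the client at read time, so the two counters may be read at slightly different moments; for most averages that small skew is probably acceptable.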

Re: Column Family per User

2012-04-18 Thread Janne Jalkanen
Each CF takes a fair chunk of memory regardless of how much data it has, so this is probably not a good idea, if you have lots of users. Also using a single CF means that compression is likely to work better (more redundant data). However, Cassandra distributes the load across different nodes b

Re: cql shell error

2012-04-15 Thread Janne Jalkanen
> > > > ta...@tok-media.com > Tel: +972 2 6409736 > Mob: +972 54 8356490 > Fax: +972 2 5612956 > > > > > > On Sun, Apr 15, 2012 at 7:46 PM, Janne Jalkanen > wrote: > > You might have hit this bug: > https://issues.apache.org/jira/br

Re: cql shell error

2012-04-15 Thread Janne Jalkanen
You might have hit this bug: https://issues.apache.org/jira/browse/CASSANDRA-4003 /Janne On Apr 15, 2012, at 17:21 , Tamar Fraenkel wrote: > Hi! > I have an error when I try to read column value using cql but I can read it > when I use cli. > > When I read in cli I get: > get cf['a52efb7a-b

Re: issue with composite row key on CassandraStorage pig?

2012-04-10 Thread Janne Jalkanen
r composite columns support on > CassandraStorage.java. > Do you have any pointers for implementing composite row key feature? > > Thanks. > > On Mon, Apr 9, 2012 at 11:32 AM, Janne Jalkanen > wrote: > > I don't think the Pig code supports Composite *keys* yet. The 1.0.9 code >

Re: issue with composite row key on CassandraStorage pig?

2012-04-09 Thread Janne Jalkanen
I don't think the Pig code supports Composite *keys* yet. The 1.0.9 code supports Composite Column Names tho'... /Janne On Apr 8, 2012, at 06:02 , Janwar Dinata wrote: > Hi, > > I have a column family that uses DynamicCompositeType for its > keys_validation_class. > When I try to dump the ro

Re: cassandra 1.0.9 is out!

2012-04-06 Thread Janne Jalkanen
...or if you're a Pig user, you get support for both counter columns and composite columns. /Janne On Apr 7, 2012, at 07:46 , Watanabe Maki wrote: > 1.0.9 is a maintenance release, so it's basically bug fixes with some minor > improvements. > If you plan to use LeveledCompaction, you should b

Re: multi region EC2

2012-03-31 Thread Janne Jalkanen
I've switched from SS to NTS on 1.0.x on a single-az cluster with RF3 (which obviously created a single-dc, single-rack NTS cluster). Worked without a hitch. Also switched from SimpleSnitch to Ec2Snitch on-the-fly. I had about 12GB of data per node. Of course, your mileage may vary, so while I

Re: Lots of 0 Bytes tmp Data/Index files remain in data folder

2012-02-09 Thread Janne Jalkanen
Yup, that's exactly it. You can get rid of those either by restarting the node or upgrading to 1.0.7. /Janne On Feb 10, 2012, at 02:49 , Roshan wrote: > I have deployed 2 node Cassandra 1.0.6 cluster in production and it running > almost t weeks without any issue. But I can see lots of (more t

Re: how stable is 1.0 these days?

2012-01-26 Thread Janne Jalkanen
1.0.5 and 1.0.6 we had some longer-term stability problems with (fd leaks, etc), but so far 1.0.7 is running like a train for us. /Janne On Jan 26, 2012, at 08:43 , Radim Kolar wrote: > Dne 26.1.2012 2:32, David Carlton napsal(a): >> How stable is 1.0 these days? > good. but hector 1.0 is unst

Re: cassandra hit a wall: Too many open files (98567!)

2012-01-18 Thread Janne Jalkanen
1.0.6 has a file leak problem, fixed in 1.0.7. Perhaps this is the reason? https://issues.apache.org/jira/browse/CASSANDRA-3616 /Janne On Jan 18, 2012, at 03:52 , dir dir wrote: > Very Interesting Why you open so many file? Actually what kind of > system that is built by you until open so

Re: Counters and Top 10

2011-12-24 Thread Janne Jalkanen
In our case we didn't need an exact daily top-10 list of pages, just a good guess of it. So the way we did it was to insert a column with a short TTL (e.g. 12 hours) with the page id as the column name. Then, when constructing the top-10 list, we'd just slice through the entire list of unexpi

Re: 1.0.3 CLI oddities

2011-12-14 Thread Janne Jalkanen
Correct. 1.0.6 fixes this for me. /Janne On 12 Dec 2011, at 02:57, Chris Burroughs wrote: > Sounds like https://issues.apache.org/jira/browse/CASSANDRA-3558 and the > other tickets reference there. > > On 11/28/2011 05:05 AM, Janne Jalkanen wrote: >> Hi! >> >>

Re: Upgrade from 0.6 to 1.0

2011-12-07 Thread Janne Jalkanen
I did this just last week, 0.6.13 -> 1.0.5. Basically, I grabbed the 0.7 distribution and ran the configuration conversion tool there first, but since the config it produced wasn't compatible with 1.0, in the end I just opened two editor windows, one with my 0.6 config and one with the 1.0 cas

Re: [RELEASE] Apache Cassandra 1.0.5 released

2011-12-02 Thread Janne Jalkanen
Would be glad to be of any help; it's kind of annoying. * Nothing unusual on any nodes that I can see * Cannot reproduce on a single-node cluster; I see it only on our prod cluster which was running 0.6.13 until this point (cluster conf is attached to the JIRA issue mentioned below). Let me kn

1.0.3 CLI oddities

2011-11-28 Thread Janne Jalkanen
Hi! (Asked this on IRC too, but didn't get anyone to respond, so here goes...) Is it just me, or are these real bugs? On 1.0.3, from CLI: "update column family XXX with gc_grace = 36000;" just says "null" with nothing logged. Previous value is the default. Also, on 1.0.3, "update column fami

Munin plugins stupid question

2011-06-22 Thread Janne Jalkanen
Heya! I know I should probably be able to figure this out on my own, but... The Cassandra Munin plugins (all of them) define in their storageproxy_latency.conf the following (this is from a 0.6 config): read_latency.jmxObjectName org.apache.cassandra.db:type=StorageProxy read_latency.jmxAttribu

Re: Flush / Snapshot Triggering Full GCs, Leaving Ring

2011-04-10 Thread Janne Jalkanen
On Apr 7, 2011, at 23:43 , Jonathan Ellis wrote: > The history is that, way back in the early days, we used to max it out > the other way (MTT=128) but observed behavior is that objects that > survive 1 new gen collection are very likely to survive "forever." Just a quick note: my own tests seem

Re: Explaining the Replication Factor, N and W and R

2011-02-13 Thread Janne Jalkanen
> Excellent! How about adding Hinted Handoff enabled/disabled option? Sure, once I understand it ;-) /Janne

Explaining the Replication Factor, N and W and R

2011-02-13 Thread Janne Jalkanen
Folks, as it seems that wrapping the brain around the R+W>N concept is a big hurdle for a lot of users, I made a simple web page that allows you to try out the different parameters and see how they affect the system. http://www.ecyrd.com/cassandracalculator/ Let me know if you have any suggest
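
The R+W>N rule itself can be sketched in a few lines. This is only an illustration of the inequality, not the logic of the calculator page; the function names are hypothetical.

```python
def quorum(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1

def reads_see_latest_write(n, r, w):
    """True when every read set of r replicas overlaps every write set of w replicas."""
    return r + w > n

n = 3  # replication factor
print(reads_see_latest_write(n, quorum(n), quorum(n)))  # QUORUM reads and writes
print(reads_see_latest_write(n, 1, 1))                  # ONE / ONE
print(reads_see_latest_write(n, 1, n))                  # read ONE, write ALL
```

The inequality holds exactly when the read and write replica sets must share at least one node, which is why QUORUM/QUORUM and ONE/ALL both satisfy it while ONE/ONE does not for N=3.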

Re: unsubscribe

2011-02-02 Thread Janne Jalkanen
How about adding an autosignature with unsubscription info? /Janne On Feb 2, 2011, at 19:42 , Norman Maurer wrote: > To make it short.. No it can't. > > Bye, > Norman > > (ASF Infrastructure Team) > > 2011/2/2 F. Hugo Zwaal : >> Can't the mailinglist server be changed to treat messages with

Re: cassandra as session store

2011-02-01 Thread Janne Jalkanen
If your sessions are fairly long-lived (more like hours instead of minutes) and you crank up a suitable row cache and make sure your db is consistent (via quorum read/writes or write:all, read:1) - sure, why not? Especially if you're already familiar with Cassandra; possibly even have a deploy

Re: Best strategy for adding new nodes to the cluster

2010-09-28 Thread Janne Jalkanen
On 28 Sep 2010, at 08:37, Michael Dürgner wrote: >> What do you mean by "running live"? I am also planning to use cassandra on >> EC2 using small nodes. Small nodes have 1/4 cpu of the large ones, 1/4 cost, >> but I/O is more than 1/4 (amazon does not give explicit I/O numbers...), so >> I thi

Re: Minor question on index design

2010-09-15 Thread Janne Jalkanen
as the Object Name or Timestamp (if it has one) so you can slice against it, e.g. to support paging operations. Make the column value the key for the object. Aaron On 15 Sep, 2010,at 02:41 AM, Janne Jalkanen wrote: Hi all! I'm pondering between a couple of alternatives here: I'

Re: OrderPreservingPartitioner for get_range_slices

2010-09-15 Thread Janne Jalkanen
Correct. You can use get_range_slices with RandomPartitioner too, BUT the iteration order is non-predictable, that is, you will not know in which order you get the rows (RandomPartitioner would probably better be called ObscurePartitioner - it ain't random, but it's as good as if it were
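
A small illustration of why the order looks random but is not: RandomPartitioner derives tokens from an MD5 hash of the row key, so rows come back in hash order rather than key order. The sketch below only mimics that idea; `md5_token` is a hypothetical helper and does not reproduce Cassandra's exact token computation.

```python
import hashlib

def md5_token(key):
    """Illustrative token: the MD5 digest of the key read as a big integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

keys = ["apple", "banana", "cherry", "date"]
by_token = sorted(keys, key=md5_token)

print("key order:  ", sorted(keys))
print("token order:", by_token)  # deterministic, but unrelated to key order
```

Running it twice gives the same token order, which is the point: the iteration is stable enough to page through, just not meaningful to a human reader.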

Minor question on index design

2010-09-14 Thread Janne Jalkanen
Hi all! I'm pondering between a couple of alternatives here: I've got two CFs, one which contains Objects, and one which contains Users. Now, each Object has an owner associated to it, so obviously I need some sort of an index to point from Users to Objects. This would be of course the perfect

Re: 4k keyspaces... Maybe we're doing it wrong?

2010-09-07 Thread Janne Jalkanen
7:53 PM, Benjamin Black wrote: On Mon, Sep 6, 2010 at 12:41 AM, Janne Jalkanen wrote: > > So if I read this right, using lots of CF's is also a Bad Idea(tm)? > Yes, lots of CFs is bad means lots of CFs is also bad. -- Virtually, Ned Wolpert "Settle thy studies, Faustus, and begin..." --Marlowe

Re: 4k keyspaces... Maybe we're doing it wrong?

2010-09-06 Thread Janne Jalkanen
#2. Performance: Will Cassandra work better with a single keyspace + lots of keys, or thousands of keyspaces? Thousands is a non-starter. There is an active memtable for every CF defined and caches (row and key) are per CF. Assuming even 2 CFs per keyspace, with 4000 keyspaces you will hav

Re: JConsole/SSH tunneling tip

2010-09-01 Thread Janne Jalkanen
On Sep 1, 2010, at 16:03 , Matthew Conway wrote: If you need to tunnel jconsole to a remote cassandra instance, the SSH socks proxy (ssh -D)is the easiest, least intrusive way. More details: http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html Totally awesome. I've lost sev

Re: column family names

2010-08-30 Thread Janne Jalkanen
I've been doing it for years with no technical problems. However, using "%" as the escape char tends to, in some cases, confuse a certain operating system whose name may or may not begin with "W", so using something else makes sense. However, it does require an extra cognitive step for th