Re: RFC: Cassandra Virtual Nodes

2012-03-23 Thread Peter Schuller
istributed tokens (hashed keys), all sstables are likely to have almost the entire possible token range in them. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-22 Thread Peter Schuller
de terminology, be stored separately in the file system.) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-21 Thread Peter Schuller
ion to responsible node. I.e., it probably means the vnode information must be kept as state. It is probably difficult to reconcile with balancing solutions like consistent hashing/crush/etc. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-20 Thread Peter Schuller
is the ring delay stuff which makes it un-workable to do at high granularity, but that should apply to the active range solution too. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-19 Thread Peter Schuller
pack is limited to a handful of instances. In order for vnodes to be useful with random placement, we'd need much more than a handful of vnodes per node (cassandra instances in a "pack" in that model). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
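
To illustrate why random placement needs many vnodes per node, here is a minimal simulation sketch. It assumes a unit ring with uniformly random tokens, where each token owns the range back to the previous token; the function name `ownership_imbalance` and all parameters are hypothetical and not taken from Cassandra or from the proposal under discussion.

```python
import random


def ownership_imbalance(num_nodes: int, vnodes_per_node: int, seed: int = 0) -> float:
    """Return max node ownership divided by the ideal (mean) ownership.

    Assumption: tokens are drawn uniformly at random on a ring [0, 1),
    and each token owns the range (previous token, token].
    """
    rng = random.Random(seed)
    tokens = sorted(
        (rng.random(), node)
        for node in range(num_nodes)
        for _ in range(vnodes_per_node)
    )
    owned = [0.0] * num_nodes
    # The first token's range wraps around from the last token on the ring.
    prev = tokens[-1][0] - 1.0
    for token, node in tokens:
        owned[node] += token - prev
        prev = token
    ideal = 1.0 / num_nodes
    return max(owned) / ideal


if __name__ == "__main__":
    for vnodes in (1, 4, 16, 256):
        ratio = ownership_imbalance(num_nodes=50, vnodes_per_node=vnodes)
        print(f"{vnodes:4d} vnodes/node -> most loaded node owns {ratio:.2f}x the ideal share")
```

Under these assumptions, the most loaded node's share moves toward the ideal share only as the number of vnodes per node grows well beyond a handful.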

Re: RFC: Cassandra Virtual Nodes

2012-03-19 Thread Peter Schuller
> I will have to re-read your original post. I seem to have missed something :) I did, and I may or may not understand what you mean. Are you comparing vnodes + hashing, with CRUSH + pre-partitioning by hash + identity hash as you traverse down the topology tree? --

Re: RFC: Cassandra Virtual Nodes

2012-03-19 Thread Peter Schuller
in unconvinced thus far. Further, even looking at just the math, the claim cannot possibly hold as N grows sufficiently large. At some point you will bottleneck on the network and no longer benefit from a higher RDF, but the probability of data loss doesn't drop off until you reach DF=number of partitions (because at that point an increased cluster size doesn't increase the number of nodes with data sharing with another node). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Peter Schuller
Point of clarification: My use of the term "bucket" is completely unrelated to the term "bucket" used in the CRUSH paper. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Peter Schuller
ssion and details to fill in. I apologize, but again, I really want to post something now that this is being brought up. BEGIN un-polished text ("we" = "I"): CRUSHing Cassandra Author: Peter Schuller This is a proposal for a significant re-design of some fundamentals of C

Re: [VOTE] Release Apache Cassandra 1.0.8

2012-02-22 Thread Peter Schuller
+1 (but FYI changelog has a typo "ahndling"). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: [VOTE] Release Apache Cassandra 1.1.0-beta1

2012-02-16 Thread Peter Schuller
+1 -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Rules of thumbs for committers

2012-02-14 Thread Peter Schuller
commits on every pull+push iteration. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Welcome committer Peter Schuller

2012-02-13 Thread Peter Schuller
> The Apache Cassandra PMC has voted to add Peter as a committer.  Thank > you Peter, and we look forward to continuing to work with you! Thank *you*, as do I :) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: What's the point of the deviation of java code style in Cassandra?

2012-01-27 Thread Peter Schuller
> that's a disambiguation wiki page. what exactly are you talking about? http://en.wiktionary.org/wiki/when_in_Rome,_do_as_the_Romans_do Can we *please* stop this thread? -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra has moved to Git

2012-01-09 Thread Peter Schuller
into a JIRA ticket after the fact to figure out what reasoning was). * You're not rebasing published branches. The downside I suppose is that the branch count increases. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Using quorum to read data

2012-01-09 Thread Peter Schuller
(I don't remember off hand how to tell hector to use auto-discovery.) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra has moved to Git

2012-01-04 Thread Peter Schuller
(And btw, major +1 on the transition to git!) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra has moved to Git

2012-01-04 Thread Peter Schuller
out the specific issue of "git pull" vs "git pull --rebase" in the simple hacking-away-at-a-single-branch case. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: cassandra node is not starting

2012-01-02 Thread Peter Schuller
> Could this just be commit log replay of the truncate? Nevermind :) -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: cassandra node is not starting

2012-01-02 Thread Peter Schuller
Could this just be commit log replay of the truncate? -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: major version release schedule

2011-12-20 Thread Peter Schuller
on unless developers ship things early to get it into the release. But also keep in mind: If we reach a point where major users of Cassandra need to run on significantly divergent versions of Cassandra because the release is just too old, the "normal" mainstream release will en

Re: major version release schedule

2011-12-20 Thread Peter Schuller
are about being stable, and working, and the version you're upgrading to should be stable. (2) Critical fixes still need to be maintained for the version you're running (else you are in fact kind of forced to upgrade). -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: [VOTE] Release Apache Cassandra 1.0.0-rc2 (Release Candidate 2)

2011-09-30 Thread Peter Schuller
> [1]: http://goo.gl/YtJLq (CHANGES.txt) Contains merge markers. >>>>>>> .merge-right.r1176712 0.8.6 -- / Peter Schuller (@scode on twitter)

Re: Releasing 0.8.6 due to CASSANDRA-3166?

2011-09-17 Thread Peter Schuller
> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201109.mbox/%3CCAKkz8Q307TaOfw=7tpkaooal_a+ry_gewnyo-vwnugoenv3...@mail.gmail.com%3E Oops, I'm sorry. I did actually search my mailbox first, but obviously I failed. -- / Peter Schuller (@scode on twitter)

Releasing 0.8.6 due to CASSANDRA-3166?

2011-09-17 Thread Peter Schuller
As came up in a thread on user@, I would suggest that CASSANDRA-3166[1] is enough reason to release 0.8.6. Asking people to build from source and patch to perform a rolling upgrade isn't good. [1] https://issues.apache.org/jira/browse/CASSANDRA-3166 -- / Peter Schuller (@scode on twitter)

Re: [VOTE] Release Apache Cassandra 0.8.1 (take #4)

2011-06-27 Thread Peter Schuller
probably be updated to reflect that it is slated for 0.8.2? -- / Peter Schuller

Re: [VOTE] Release Apache Cassandra 0.8.1

2011-06-20 Thread Peter Schuller
ster was built with non-released code > (sporting a different message version). I believe it is expected in this case due to https://issues.apache.org/jira/browse/CASSANDRA-2280 -- / Peter Schuller

Re: [ANN] Branched; freeze in effect

2011-04-11 Thread Peter Schuller
rg/jira/browse/CASSANDRA-2420 -- / Peter Schuller

Cassandra documentation (and in this case the datastax anti-entropy docs)

2011-03-31 Thread Peter Schuller
being sensitive, gossip delays, bootstrapping multiple nodes at once, etc). I'm not sure how to get there. It's not like I'm *so* motivated and have *so* much time that if people agree I'll sit down and write 500 pages of Cassandra handbook. So the question is how to achieve something incrementally that is yet more organized than the wiki. Thoughts? -- / Peter Schuller

Re: Please unsubscribe me from the email list.

2011-02-14 Thread Peter Schuller
> Please unsubscribe gary.mo...@xerox.com from this email list. http://wiki.apache.org/cassandra/FAQ#unsubscribe -- / Peter Schuller

Re: Maintenance releases

2011-02-11 Thread Peter Schuller
e mailing lists and JIRA, adjusting the release engineering a bit seems like a high-priority change towards that goal. -- / Peter Schuller

Clarification on intended bootstrapping semantics

2011-01-20 Thread Peter Schuller
uction cluster, except: (7a) New nodes being brought in as seeds (7b) During the very first initial cluster setup with no data (7) The above is intended and on purpose, and it would be correct to operate under these assumptions when updating/improving documentation. -- / Peter Schuller

Re: API pages on the wiki

2011-01-11 Thread Peter Schuller
entation is worse than the user having to read two versions to get a sense of differences, it seems to make sense. -- / Peter Schuller

API pages on the wiki

2011-01-11 Thread Peter Schuller
e for each version? -- / Peter Schuller

Re: [SOLVED] Very high memory utilization (not caused by mmap on sstables)

2010-12-19 Thread Peter Schuller
s/0.6/operations/tuning - but without more information it's difficult to know what specifically it is that you're hitting. Are you seriously saying you're running for 15-20 days with only 2 mb of live data? -- / Peter Schuller

Re: [SOLVED] Very high memory utilization (not caused by mmap on sstables)

2010-12-16 Thread Peter Schuller
> Sorry for spam again. :-) No, thanks a lot for tracking that down and reporting details! Presumably a significant number of users are on that version of Ubuntu running with openjdk. -- / Peter Schuller

Re: Any chance of getting cassandra releases published to repo1.maven.org?

2010-12-14 Thread Peter Schuller
ally. I may be interested in helping out trying to maintain one, but I'm not sure I have sufficient maven fu yet to be effective (but I'm getting there).) Regardless, the Riptano maven repository is greatly appreciated as it appears already. -- / Peter Schuller

mmap:ed i/o and buffer sizes

2010-12-10 Thread Peter Schuller
e queue) if there is no read-ahead until the first successive access. I have not checked what actually does happen, nor have I benchmarked for comparison. But I'd be interested in hearing if people have already addressed this in the past. -- / Peter Schuller

Re: Reducing confusion around client libraries

2010-12-04 Thread Peter Schuller
d trying to find places on the wiki that links to the thrift API page and re-consider whether (or at least how) to link, etc. -- / Peter Schuller

Re: Atomically adding a column to columns_

2010-09-29 Thread Peter Schuller
> It would be good to document this, or, since the > correct-even-for-remove logic is not much more complicated, switch to > that. Submitted: https://issues.apache.org/jira/browse/CASSANDRA-1559 -- / Peter Schuller

Re: Atomically adding a column to columns_

2010-09-27 Thread Peter Schuller
rds, I do not believe the remove() code path should ever be taken concurrently with insertions (by design, and not by accident). Anyone care to confirm/deny? -- / Peter Schuller

Re: improving read performance

2010-09-20 Thread Peter Schuller
uld make checking the bloom > filters unnecessary in most cases for me, but I'm not sure it's worth the > effort. Write-through row caching seems like a more direct approach to me personally, off hand. Also to the extent that you're worried about false positive rates, larger bloom filters may still be an option (not currently configurable; would require source changes). -- / Peter Schuller
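
For a rough sense of how filter size relates to the false positive rate mentioned above, here is a minimal sketch using the standard Bloom filter approximation p ≈ (1 - e^(-kn/m))^k. The function names and sample numbers are illustrative assumptions; this is not how Cassandra sizes its filters.

```python
import math


def bloom_false_positive_rate(num_keys: int, num_bits: int, num_hashes: int) -> float:
    """Approximate false positive rate for n keys, m bits, and k hash functions."""
    if num_keys < 0 or num_bits <= 0 or num_hashes <= 0:
        raise ValueError("num_keys must be >= 0; num_bits and num_hashes must be > 0")
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


def optimal_num_hashes(num_keys: int, num_bits: int) -> int:
    """Hash count that minimizes the approximation above: k = (m / n) * ln 2."""
    return max(1, round(num_bits / num_keys * math.log(2)))


if __name__ == "__main__":
    keys = 1_000_000  # hypothetical number of keys in one sstable
    for bits_per_key in (5, 10, 15, 20):
        bits = keys * bits_per_key
        k = optimal_num_hashes(keys, bits)
        rate = bloom_false_positive_rate(keys, bits, k)
        print(f"{bits_per_key:2d} bits/key, k={k:2d} -> false positive rate ~{rate:.5f}")
```

The approximation shows the trade-off in question: each additional bit per key buys a lower false positive rate at the cost of memory.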

Re: improving read performance

2010-09-20 Thread Peter Schuller
t write-through or not though. -- / Peter Schuller

Re: GC Storm

2010-07-18 Thread Peter Schuller
suggestion; might be 1 gig). (e) log_m(n) will never be large enough for it to be a scaling problem that you have one thread per "level" Thoughts? -- / Peter Schuller

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Peter Schuller
es, but I may have missed them. The *.Data.db files are indeed sstables. -- / Peter Schuller

Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Peter Schuller
eams are closed. Are the deleted files indeed sstable, or was that a bad assumption on my part? -- / Peter Schuller

Re: Minimizing the impact of compaction on latency and throughput

2010-07-13 Thread Peter Schuller
rge amounts of data is not what you want (there are any number of practical situations where this has been an issue for me, if nothing else). But if I'm overlooking something that would mean that this optimization, trying to avoid eviction, is useless with Cassandra please do explain it to me :)

Re: Minimizing the impact of compaction on latency and throughput

2010-07-13 Thread Peter Schuller
ds though. We'll see. I'll try to make time for trying this out. -- / Peter Schuller

Re: Minimizing the impact of compaction on latency and throughput

2010-07-08 Thread Peter Schuller
simple rate limiter might help significantly - albeit be something that has to be tweaked very specifically for the situation/hardware rather than being auto-tuned. If I have the time I may look into posix_fadvise() to begin with (but I'm not promising anything). Thanks for the input! -- / Peter Schuller
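
As an illustration of the "simple rate limiter" idea, here is a minimal token-bucket sketch. The class name `RateLimiter`, its parameters, and the injectable `clock`/`sleep` hooks are hypothetical; this is not Cassandra code, and the rate would still have to be tuned by hand for the hardware, as noted above.

```python
import time


class RateLimiter:
    """Token bucket: allows `bytes_per_sec` on average, with bursts up to `burst_bytes`."""

    def __init__(self, bytes_per_sec, burst_bytes=None, clock=time.monotonic, sleep=time.sleep):
        if bytes_per_sec <= 0:
            raise ValueError("bytes_per_sec must be positive")
        self.rate = float(bytes_per_sec)
        self.capacity = float(bytes_per_sec if burst_bytes is None else burst_bytes)
        self.tokens = self.capacity
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def acquire(self, nbytes):
        """Account for nbytes of I/O, sleeping if the budget is exhausted.

        Returns the number of seconds slept (0.0 if none).
        """
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # Negative balance is a debt; sleep just long enough to pay it off.
        wait = -self.tokens / self.rate
        self.sleep(wait)
        return wait


if __name__ == "__main__":
    limiter = RateLimiter(bytes_per_sec=1_000_000)  # hypothetical 1 MB/s budget
    for _ in range(3):
        limiter.acquire(1_000_000)  # call before each chunk of background I/O
        print("chunk allowed at", round(time.monotonic(), 2))
```

A background task such as compaction would call `acquire()` before each chunk it reads or writes, so that its throughput stays under the configured budget.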

Re: Minimizing the impact of compaction on latency and throughput

2010-07-07 Thread Peter Schuller
pect it to potentially work pretty well without separation, if you do have such a setup). -- / Peter Schuller

Minimizing the impact of compaction on latency and throughput

2010-07-07 Thread Peter Schuller
h the goal? -- / Peter Schuller

Re: Cassandra performance and read/write latency

2010-07-06 Thread Peter Schuller
cking writes to the commit log for example (are you running with periodic fsync or batch wise fsync?). -- / Peter Schuller

Re: Hello

2010-07-02 Thread Peter Schuller
:s such that generated javadocs are easier to navigate in terms of the overall structure and the roles of packages. -- / Peter Schuller

Re: Packaging Cassandra for Debian [was: Packaging Cassandra for Ubuntu]

2010-06-04 Thread Peter Schuller
ines as an easier-to-accomplish goal for the Cassandra developers, yet providing high payoff to users. -- / Peter Schuller

Re: Cassandra on top of B-Tree

2010-03-30 Thread Peter Schuller
ases is not an expected use case - beyond some hints in the documentation that would indicate it's meant for smaller databases.) -- / Peter Schuller