nodes contain data for (prevTokenInRing, nodesOwnToken] (i.e. exclusive of the
previous token and inclusive of the node's own token). So .179 will contain
things that hash in the range (152896308109140433971537345591636551711,0]
and .12 will contain things that hash in range
(0,152896308109140433971537345
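A minimal sketch of the (prevTokenInRing, nodesOwnToken] rule described above, using the two tokens and node names from this message. The `RING` list and the `owner` function are hypothetical names for illustration, not Cassandra code, and the sketch only finds the first owner of a token (it ignores replication).

```python
from bisect import bisect_left

# Tokens and node names as quoted in the message above.
RING = [
    (0, ".179"),
    (152896308109140433971537345591636551711, ".12"),
]

def owner(token, ring=RING):
    """Return the node whose range (previous_token, own_token] contains token."""
    ordered = sorted(ring)
    tokens = [t for t, _ in ordered]
    # Find the first node whose token is >= the given token. If the given
    # token is larger than every node token, wrap around to the first node.
    idx = bisect_left(tokens, token)
    return ordered[idx % len(ordered)][1]
```

With this ring, `owner(0)` is `.179` (inclusive upper bound), `owner(1)` is `.12`, and anything above the larger token wraps around to `.179`.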
Is your ReplicationFactor (RF) really set to 0? Don't do that; it needs to
be at least 1 and probably needs to be 3 in production if you care about
your data. It must be greater than 0 and no greater than the number of nodes in
your ring. It represents the number of nodes to copy/replicate data to.
An
On Mon, Feb 14, 2011 at 6:28 PM, Aaron Morton wrote:
> Will take a closer look at the code tonight, perhaps we should return an
> error if you try to use Network Topology and it cannot detect any DCs.
>
>
+1
On Mon, Feb 14, 2011 at 2:54 PM, Robert Coli wrote:
> Regarding very large memtables, it is important to recognize that
> throughput refers only to the size of the COLUMN VALUES, and not, for
> example, their names.
>
That would be a bug in its own right. There are lots of use cases that
only
> > Write Latency: NaN ms.
> > Pending Tasks: 0
> > Key cache capacity: 20
> > Key cache size: 0
> > Key cache hit rate: NaN
> > Row cache: disabled
> > Compacted row minimum size: 0
> > Compacted row maximum size: 0
> > Compacted row mean size: 0
On Mon, Feb 14, 2011 at 6:58 PM, Dan Hendry wrote:
> > 1) If I insert a key and want to verify which node it went to then how do
> I
> > do that?
>
> I don't think you can and there should be no reason to care. Cassandra
> abstracts where data is being stored, think in terms of consistency levels
no, it's actually worse to do that.
1) you're introducing single points of failure (your array).
2) you're introducing complexity and expense
3) you're introducing latency
4) you're introducing bottlenecks
5) some other reasons...
You do want your commit log on a separate disk though. The o
regardless of increasing RF or not, RR happens based on the
read_repair_chance setting. RR happens after the request has been replied
to though, so it's possible that if you increase the RF and then read that
the read might get stale/missing data. RR would then put the correct value
on all the co
0.7.1 is what I would go with right now. It's likely you'll eventually have
to upgrade that as well, but moving to other 0.7.x releases should be fairly
painless. Most development is happening on the 0.7 releases, which already
have lots of fixes over the 0.6 series (not to mention performance
im
But you cannot depend on such behavior. If you do a write and you get an
unavailable exception, the only thing you know is at that time it was not
able to be placed on all the nodes required to meet your CL. It may
eventually end up on all those nodes, it may not be on any of the nodes or
at the
1. Yes, the coordinator node propagates requests to the correct nodes.
2. Most (all?) higher-level clients (pycassa, hector, etc.) load balance for
you. In general your client and/or the caller of the client needs to catch
exceptions and retry. If you're using RRDNS and some of the nodes are
temp
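A rough sketch of the "catch exceptions and retry" advice above, for a caller that has no load-balancing client. Everything here is an assumption for illustration: `call_with_retry` and `operation` are hypothetical names, and `ConnectionError` stands in for whatever exceptions the real client library raises.

```python
import itertools
import random

def call_with_retry(hosts, operation, attempts=3):
    """Call operation(host), moving on to another host after each failure."""
    if attempts < 1 or not hosts:
        raise ValueError("need at least one host and one attempt")
    # Shuffle once so different callers spread load across the nodes,
    # then cycle so consecutive retries go to different hosts.
    order = random.sample(hosts, len(hosts))
    last_error = None
    for host in itertools.islice(itertools.cycle(order), attempts):
        try:
            return operation(host)
        except ConnectionError as exc:  # stand-in for the client's exceptions
            last_error = exc
    raise last_error
```

The caller still has to decide whether the operation is safe to repeat; the sketch only shows the retry loop.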
tor node could have been avoided somehow.
> Does the write on the coordinator node (in case it is not part of the N
> replica nodes for that key) get deleted before response of the write is
> returned back to the client ?
>
>
> On Tue, Feb 15, 2011 at 4:40 PM, Matthew Dennis wro
You have a single HAProxy node in front of the cluster or you have a HAProxy
node on each machine that is a client of Cassandra that points at all the
nodes in the cluster?
The former has a SPOF and bottleneck (the HAProxy instance), the latter does
not (and is somewhat common, especially for thin
Assuming you aren't changing the RC, the normal bootstrap process takes care
of all the problems like that, making sure things work correctly.
Most importantly, if something fails (either the new node or any of the
existing nodes) you can recover from it.
Just don't connect clients directly to th
+1 on avoiding OPP
On Wed, Feb 16, 2011 at 3:27 PM, Tyler Hobbs wrote:
>
> Thanks for your input, but we have a set key that consists of name:timestamp
>> that we are using.. and we need to also retrieve the oldest data as well..
>>
>
> Then you'll need to denormalize and store every row three wa
Data is in Memtables from writes before they get flushed (based on first
threshold of ops/size/time exceeded; all are configurable) to SSTables on
disk.
There is a keycache and a rowcache. The keycache caches offsets into
SSTables for the rows. The rowcache caches the entire row. There is also
The map returned by multiget_slice (what I suspect is the underlying thrift
call for getColumnsFromRows) is not an order-preserving map; it's a HashMap,
so the order of the returned results cannot be depended on. Even if it were
an order-preserving map, not all languages would be able to make use of t
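Since the result map has no useful order, one option is to impose the order on the client side from the list of keys that was requested. A minimal sketch, assuming the results arrive as a plain dict keyed by row key; `in_requested_order` is a hypothetical helper, not part of any client library.

```python
def in_requested_order(requested_keys, results):
    """Return (key, value) pairs in the order the keys were requested.

    Keys with no entry in results are skipped.
    """
    return [(key, results[key]) for key in requested_keys if key in results]
```

For example, `in_requested_order(["b", "a"], {"a": 1, "b": 2})` yields the pair for `"b"` first, regardless of how the map itself iterates.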
ow <http://www.sparrowmailapp.com>
>
> On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote:
>
> The map returned by multiget_slice (what I suspect is the underlying thrift
> call for getColumnsFromRows) is not an order-preserving map, it's a HashMap
> s
or d...@riptano.com
On Wed, Sep 29, 2010 at 11:43 AM, Jonathan Ellis wrote:
> We'll get those fixed.
>
> Here or tho...@riptano.com directly is fine.
>
> Thanks!
>
As Norman said, secondary indexes are only in .7 but you can create standard
indexes in both .6 and .7
Basically have an email_domain_idx CF where the row key is the domain and the
column names have the row id of the user (the column value is unused in this
scenario). This sounds basically like wh
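The index layout described above can be modelled with nested dicts: the outer key plays the role of the row key (the domain), the inner keys are the column names (user row ids), and the column values are left empty. This is only an in-memory illustration of the data model; `index_user` and `users_for_domain` are hypothetical helpers and no Cassandra client calls are involved.

```python
from collections import defaultdict

# Row key (domain) -> {column name (user row id): unused column value}
email_domain_idx = defaultdict(dict)

def index_user(user_id, email):
    """Add the user's row id as a column under the row for their email domain."""
    domain = email.rsplit("@", 1)[1].lower()
    email_domain_idx[domain][user_id] = b""  # value is unused in this scheme

def users_for_domain(domain):
    """Return the row ids of all users indexed under the given domain."""
    return sorted(email_domain_idx.get(domain.lower(), {}))
```

Looking up a domain is then a single-row read of the index, followed by fetching the user rows by id.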
uld my best bet be to simply get ALL of my users uuids and ages, then
> throw away all of those that do not meet the required test?
>
> Thank you.
>
> On Oct 6, 2010, at 2:09 PM, Matthew Dennis wrote:
>
> As Norman said, secondary indexes are only in .7 but you can create
> s
Some relevant reading if you're interested:
http://dslab.epfl.ch/pubs/crashonly/
http://web.archive.org/web/20060426230247/http://crash.stanford.edu/
On Wed, Oct 6, 2010 at 1:46 PM, Scott Mann wrote:
> Yes. ctrl-C if running in the foreground. Use kill , if running
> in the background (see the
>
> PS. Are other ppl interested in this functionality ?
> I could file it to JIRA as well...
>
>
Yes, please file it to Jira. It seems like it would be pretty useful for
various things and fairly easy to change the code to move it to another
directory whenever C* thinks it should be deleted...
The SCs are stored on disk in the order defined by the compareWith setting
so if you want them back in a different order either someone is sorting them
(C*, which doesn't sort them right now, or the client; which doesn't make
much of a difference, it's just moving the load around) or you're
denorma
Rob is correct.
drain is really on there for when you need the commit log to be empty (some
upgrades or a complete backup of a shutdown cluster).
There really is no point in using drain to shut down C* normally; just kill it...
On Wed, Oct 6, 2010 at 4:18 PM, Rob Coli wrote:
> On 10/6/10 1:13 PM, Aar
Creating indexes takes extra space (does in MySQL, PGSQL, etc too).
https://issues.apache.org/jira/browse/CASSANDRA-749 has quite a bit of
detail about how the secondary indexes currently work.
On Wed, Oct 6, 2010 at 7:17 PM, Alvin UW wrote:
> Hello,
>
> Before 0.7, actually we can create an ex
If I remember correctly the only operator supported for secondary indexes
right now is EQ, not LTE (or the others).
On Thu, Oct 7, 2010 at 6:13 AM, Christian Decker wrote:
> I'm currently trying to get started on secondary indices in Cassandra
> 0.7.0svn, but without any luck so far. I have the
Keep in mind that .7 and on will have per-CF settings for most things so
there will be even more control over the tuning...
On Oct 7, 2010 3:10 PM, "Peter Schuller"
wrote:
>> What if there is more than one keyspace in the system ? Assuming each
>> keyspace has the same number of column familie
+1 on disabling swap
On Oct 7, 2010 3:27 PM, "Peter Schuller"
wrote:
>> The nodes are still swapping, even though the swappiness is set to zero
>> right now. After swapping comes the OOM.
>
> In addition to what's already been said, consider just flat out
> disabling swap completely, unless you ha
n CL.ONE.
On Thu, Oct 7, 2010 at 7:11 PM, David McIntosh wrote:
> Are there any data loss concerns if you have the commit log sync set to
> periodic and are writing with CL One or Any?
>
>
>
> *From:* Matthew Dennis [mailto:mden...@riptano.com]
> *Sent:* We
Allan,
I'm confused on why removetoken doesn't do anything and would be interested
in finding out why, but to answer your question:
You can shut down your last node, nuke the system directory (make a
backup just in case), restart the node, load the schema (export it first if
need be) and be o
Also, in general, you probably want to set Xms = Xmx (regardless of the
value you eventually decide on for that).
If you set them equal, the JVM will just go ahead and allocate that amount
on startup. If they're different, then when you grow above Xms it has to
allocate more and move a bunch of s
2 GiB is pretty small for a C* node. You can also try reducing all the
caching to zero with so little memory. If you have lots of CFs you probably
want to reduce the memtable throughput too.
On Wed, Oct 27, 2010 at 12:43 PM, Koert Kuipers <
koert.kuip...@diamondnotch.com> wrote:
> While bootst
You need to specify your initial tokens. LoadBalance really doesn't do a
good job of balancing the load. Take a look at "Load Balancing" in
http://wiki.apache.org/cassandra/Operations There is a little python script
in there to help you pick tokens for a given cluster size.
If you don't want to
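As an illustration of picking evenly spaced initial tokens, here is a sketch that is not the script from the wiki page. It assumes the RandomPartitioner with a token space of 0 to 2**127; `initial_tokens` and its `ring_size` parameter are hypothetical names.

```python
def initial_tokens(node_count, ring_size=2**127):
    """Return node_count evenly spaced tokens over [0, ring_size)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the tokens exact for very large values.
    return [i * ring_size // node_count for i in range(node_count)]

if __name__ == "__main__":
    for i, token in enumerate(initial_tokens(4)):
        print(f"node {i}: {token}")
```

Each value would then be assigned to one node as its initial token before that node joins the ring.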
34 matches