Even though the client did not get a success message, it is possible
that the write may have succeeded on one of the replicas. Let us say that
the client did a retry and the write succeeded.
Let us also assume that I was trying to withdraw $100. Initially $100
was withdrawn as per one of the replica
This is where things start getting subtle.
If Cassandra's failure detector knows ahead of time that not enough
writes are available, that is the only time we truly fail a write, and
nothing will be written anywhere. But if a write starts during the
window where a node is failed but we don't know
http://goo.gl/3sjE5
On Fri, 2011-02-25 at 10:33 +0800, Ardi Chen wrote:
> 2011/2/25 Jun Young Kim
>
> >
> > --
> > Junyoung Kim (juneng...@gmail.com)
--
Eric Evans
eev...@rackspace.com
2011/2/25 Jun Young Kim
>
> --
> Junyoung Kim (juneng...@gmail.com)
>
>
--
Junyoung Kim (juneng...@gmail.com)
1. Why 24GB of heap? Do you need a heap this large? A bigger heap can lead to
longer GC cycles, but 15 min looks too long.
2. Do you have the row cache enabled?
3. How many column families do you have?
4. Enable GC logs and monitor what GC is doing to get an idea of why it is
taking so long. You can add following
Right, so I'm interpreting silence as a confirmation on all points. I
opened:
https://issues.apache.org/jira/browse/CASSANDRA-2245
https://issues.apache.org/jira/browse/CASSANDRA-2246
to work on these.
On Wed, Feb 23, 2011 at 5:31 PM, Matt Kennedy wrote:
> Let me start out by saying that I thin
I failed to mention: this is just doing repeated data retrievals using the
index.
> ...
>
> Sample run: Secondary index.
>
> DEBUG Retrieved THS / 7293 rows, in 2012 ms
> DEBUG Retrieved THS / 7293 rows, in 1956 ms
> DEBUG Retrieved THS / 7293 rows, in 1843 ms
...
FWIW, for me the advantage of homebrew indexes is that they can be a lot more
sophisticated than the standard -- I can hash combinations of column values
to whatever I want. I also put counters on column values in the index, so
there is lots of functionality. Of course, I can do it because my data
I am doing some experimenting with indexing. My data CF has about 25000 rows
around 1KB each. I set up a special column of boolean value to use as the
secondary index. I also created my own index in a separate CF where each index
is one row and the column names are the data keys.
The implem
Retrieving data by row key is the primary way to get data from
Cassandra, so it's highly optimized.
First, the node responsible for the row is computed using the partitioner. You
can use RandomPartitioner (distributes by MD5 of keys) or
OrderPreservingPartitioner (key must be a UTF-8 string).
Then the r
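The node lookup described above can be sketched roughly as follows. This is
only an illustration, assuming a made-up three-node ring with evenly spaced
tokens; the names md5_token, owner and ring are hypothetical, and the way the
MD5 digest is turned into an integer here is not claimed to match
RandomPartitioner's exact encoding.

```python
import hashlib
from bisect import bisect_left

def md5_token(key):
    """Hash a row key (bytes) to an integer token.

    Illustrative only: not Cassandra's exact token encoding.
    """
    return int.from_bytes(hashlib.md5(key).digest(), "big")

def owner(token, ring):
    """Return the node owning `token`.

    `ring` is a list of (node_token, node_name) pairs sorted by token.
    The owner is the first node whose token is >= the key's token,
    wrapping around to the first node past the end of the ring.
    """
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token)
    return ring[i % len(ring)][1]

# Hypothetical ring: three nodes evenly spaced over the 128-bit MD5 space.
SPACE = 2 ** 128
ring = sorted((i * SPACE // 3, "node%d" % (i + 1)) for i in range(3))

# The same key always hashes to the same token, hence the same node.
node_for_xyz = owner(md5_token(b"xyz"), ring)
```

Because the node is computed from the hash alone, finding the replica does
not get slower as the number of rows grows; only the lookup inside that node
depends on its local data structures.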
On Thu, Feb 24, 2011 at 3:56 PM, A J wrote:
> While we are at it, there's more to consider than just CAP in distributed :)
> http://voltdb.com/blog/clarifications-cap-theorem-and-data-related-errors
>
> On Thu, Feb 24, 2011 at 3:31 PM, Edward Capriolo
> wrote:
>> On Thu, Feb 24, 2011 at 3:03 PM,
Hey all,
Our setup is 5 machines running Cassandra 0.7.0 with 24GB of heap and 1.5TB
disk each collocated in a DC. We're doing bulk imports from each of the nodes
with RF = 2 and write consistency ANY (write perf is very important). The
behavior we're seeing is this:
- Nodes often se
On Thu, Feb 24, 2011 at 3:07 PM, mcasandra wrote:
>
> Thanks! I just started reading about Bloom Filter. Is this something that
> is
> inbuilt by default or is it something that need to be explicitly
> configured?
>
It's built in, no configuration needed.
--
Tyler Hobbs
Software Engineer, Data
I should mention that it took me a while to figure this out too. Might be a
candidate for an improvement in the cli?
On Thu, Feb 24, 2011 at 4:01 PM, buddhasystem wrote:
>
> Thanks! You are right. I see exception but have no idea what went wrong.
>
>
> ERROR [ReadStage:14] 2011-02-24 21:51:29,3
Thanks! I just started reading about Bloom filters. Is this something that is
built in by default, or is it something that needs to be explicitly configured?
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Understanding-Indexes-tp6058238p6062010.html
On Thu, Feb 24, 2011 at 3:55 PM, mcasandra wrote:
>
> Either I am not explaning properly or I don't understand the data model just
> yet. Please check again:
>
> In below example this is what I understand:
>
> 1) UserProfile is a CF
> 2) is a row key
> 3) username is a column. Each row (eg 11
Thanks! You are right. I see an exception but have no idea what went wrong.
ERROR [ReadStage:14] 2011-02-24 21:51:29,374 AbstractCassandraDaemon.java
(line 113) Fatal exception in thread Thread[ReadStage:14,5,main]
java.io.IOError: java.io.EOFException
at
org.apache.cassandra.db.columnitera
While we are at it, there's more to consider than just CAP in distributed :)
http://voltdb.com/blog/clarifications-cap-theorem-and-data-related-errors
On Thu, Feb 24, 2011 at 3:31 PM, Edward Capriolo wrote:
> On Thu, Feb 24, 2011 at 3:03 PM, A J wrote:
>> yes, that is difficult to digest and one
Either I am not explaining properly or I don't understand the data model just
yet. Please check again:
In the example below, this is what I understand:
1) UserProfile is a CF
2) is a row key
3) username is a column. Each row (eg ) has username column
My understanding is that secondary indexe
When I've gotten "null" as a result in cassandra-cli, it turned out to mean
that there were exceptions being thrown on the server side. Have you checked
your Cassandra logs?
On Thu, Feb 24, 2011 at 3:44 PM, buddhasystem wrote:
>
> Thanks Tyler,
>
>ColumnFamily: index1
> Columns sorted b
On Thu, Feb 24, 2011 at 3:34 PM, mcasandra wrote:
>
> I wasn't aware that there is an index on primary key (that is row keys). So
> from what I understand there is by default an index on for eg: , in
> below example? Where can I read more about it?
>
> UserProfile = { // this is a ColumnFa
Thanks Tyler,
ColumnFamily: index1
Columns sorted by: org.apache.cassandra.db.marshal.AsciiType
Row cache size / save period: 0.0/0
Key cache size / save period: 1.0/3600
Memtable thresholds: 0.8765625/50/60
GC grace seconds: 864000
Compaction min/max t
On Thu, Feb 24, 2011 at 2:36 PM, Anthony John wrote:
>
> Does "ALL" succeed even if there is a single surviving replica for the
> given piece of data ?
> Again, tolerates node failure. Does it really mean - from ALL surviving
> nodes ?
>
All replicas (RF) for that row must respond before an ope
All:
So the "ANY" CL seems to mean: write (and read) on any node, even if it is
only a hinted handoff, and return success. Correct?
Guessing this accommodates node failure - right?
Does "ALL" succeed even if there is a single surviving replica for the
given piece of data?
Again, tolerates node fa
I wasn't aware that there is an index on primary key (that is row keys). So
from what I understand there is by default an index on for eg: , in
below example? Where can I read more about it?
UserProfile = { // this is a ColumnFamily
{ // this is the key to this Row inside the C
On Thu, Feb 24, 2011 at 3:03 PM, A J wrote:
> yes, that is difficult to digest and one has to be sure if the use
> case can afford it.
>
> Some other NOSQL databases deals with it differently (though I don't
> think any of them use atomic 2-phase commit). MongoDB for example will
> ask you to read
On Thu, Feb 24, 2011 at 2:27 PM, buddhasystem wrote:
>
> I'm doing insertion with a pycassa client. It seems to work in most cases,
> but sometimes, when I go to Cassandra-cli, and query with key and column
> that I inserted, I get "null" whereas I shouldn't. What could be causes for
> that?
>
C
You're welcome!
On Thu, Feb 24, 2011 at 5:30 PM, mcasandra wrote:
>
> Thanks. This helps a lot!
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Understand-eventually-consistent-tp6038330p6061838.html
> Sent from the cassandra-u...@incubato
Thanks. This helps a lot!
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Understand-eventually-consistent-tp6038330p6061838.html
I'm doing insertions with a pycassa client. It seems to work in most cases,
but sometimes, when I go to cassandra-cli and query with the key and column
that I inserted, I get "null" whereas I shouldn't. What could be the causes
of that?
--
View this message in context:
http://cassandra-user-incubator-a
No, since you are explicitly asking that at least a QUORUM of the replicas be
written. In your scenario only 1 node of 3 is up, and the QUORUM value is 2,
so the operation will fail and no HH is made.
A read won't succeed either, since you are asking that the data to be
returned must be validated at
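The arithmetic behind that answer can be written down in a few lines. This is
a minimal sketch, assuming the usual quorum size of RF/2+1 (integer division);
the function names quorum and can_satisfy are made up for illustration and
are not part of any Cassandra API.

```python
def quorum(rf):
    """Number of replicas that must respond at CL QUORUM for a given RF."""
    return rf // 2 + 1

def can_satisfy(required, live_replicas):
    """True if enough replicas are alive to meet the required count."""
    return live_replicas >= required

# Scenario from the thread: RF = 3, only 1 replica is up.
needed = quorum(3)                      # 2
write_ok = can_satisfy(needed, 1)       # False: the operation fails

# With RF = 1, QUORUM is 1 replica, which is the same as ALL.
quorum_equals_all_at_rf1 = quorum(1) == 1
```

With RF = 3 and two replicas up, can_satisfy(quorum(3), 2) is True, which is
why a single node failure does not block QUORUM operations at that RF.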
It all depends on what you're trying to do. What you're proposing doing, by
definition, is creating a secondary index. The primary index is your row
key. Depending on the partitioner, it might or might not be a conveniently
iterable index or sorted index. If you need your keys sorted in a differ
Javier Canillas wrote:
>
> HH is some kind of write repair, so it has nothing to do with CL that is a
> requirement of the operation; and it won't be used over reads.
>
> In your example QUORUM is the same as ALL, since you only have 1 RF (only
> the data holder - coordinator). If that node fai
Yes, that is difficult to digest, and one has to be sure the use
case can afford it.
Some other NoSQL databases deal with it differently (though I don't
think any of them use atomic 2-phase commit). MongoDB, for example, will
ask you to read from the node you wrote to first (the primary node) unless
you
Hi everyone,
I am new to Java and Cassandra.
I have just started installing Cassandra.
My machine runs Debian 5.0.6.
I installed jdk1.6.0_24 to /usr/local.
The output of java -version is as follows:
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) Server VM (build 19.1-
The leap of faith here is that an error does not mean a clean backing out to
the prior state, as we are used to with databases. It means that the operation
in error could have gone through partially.
Again, this is not entirely unfamiliar territory and can be dealt with.
-JA
On Thu, Feb 24, 201
HH is a kind of write repair, so it has nothing to do with CL, which is a
requirement of the operation; and it won't be used for reads.
In your example QUORUM is the same as ALL, since you have an RF of only 1 (only
the data holder - coordinator). If that node fails, all reads / writes will
fail.
Now,
I'm not saying you shouldn't. If you feel there is a problem, you may
think of splitting the column family into N. But I think you won't hit that
problem. You can read about RowCacheSize and KeyCache support in Cassandra
0.7.X; if your rows are small, you may cache a lot of them and avoid a
On Thu, Feb 24, 2011 at 1:26 PM, mcasandra wrote:
>
> Does HH count towards QUORUM? Say RF=1 and CL of W=QUORUM and one node
> that
> owns the key dies. Would subsequent write operations for that key be
> successful? I am guessing it will not succeed.
>
No, it would not succeed. It would only s
Does HH count towards QUORUM? Say RF=1 and CL of W=QUORUM and one node that
owns the key dies. Would subsequent write operations for that key be
successful? I am guessing it will not succeed.
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Under
Thanks! I am thinking more in terms of cases where you have millions of keys
(rows), e.g. a UUID as a row key, or there could be millions of users.
So are we saying that we should NOT create column families with this many
keys? What are the other options in such cases?
UserProfile = { // this is a Column
>>but could be broken in case of a failed write<<
You can think of a scenario where R + W > N still leads to
inconsistency even for successful writes. Say you keep W=1 and R=N.
Let's say the one node where a write happened with success goes down
before it made it to the other N-1 nodes. Let's say it goe
Thanks Narendra. I read again the wiki quote you pasted below and now it
does make sense. Cassandra's design behavior is to propagate the failed
write if it was ever written successfully to at least one server. I was
having a hard time trying to work around this but I guess I am starting to
think the
You are missing the point. The coordinator node that is handling the request
won't wait for all the nodes to return their copy/digest of data. It just
waits for Q (RF/2+1) nodes to return. This is the reason I explained two
possible scenarios.
Further, on what basis will Cassandra know that the dat
Well, it will need all nodes that are required for the operation to be up
and to respond in a timely fashion; even an RPC timeout on 1 replica will
get you a failure response.
CL is calculated based on the RF configured for the ColumnFamily.
"The ConsistencyLevel is an enum that controls both read
Thanks, all, for the good detail and clarification. I just wanted to get things
clear and understand correctly what the expected behavior is when working
with Cassandra under various failure conditions, so that the application can be
designed accordingly and provide proper locking/synchronization if require
I really don't see the point. Again, suppose a cluster with 3 nodes, where
there is a ColumnFamily that will hold data whose key basically consists
of a 2-letter word (pretty simple). That makes a total of 729 possible
keys.
RandomPartitioner then will tokenize each key and assign them to
Javier Canillas wrote:
>
> Instead, when you execute the same OP using CL QUORUM, then it means
> RF /2+1, it will try to write on the coordinator node and replica.
> Considering only 1 replica is down, the OP will success too.
>
I am assuming even read will succeed when CL QUORUM and RF=3 and
I see the point - apologies for putting everyone through this!
It was just militating against my mental model.
In summary, here is my takeaway - simple stuff but - IMO - important to
conclude this thread (I hope):-
1. I was splitting hairs over a failed (partial) Q write. Such an event
should b
Not sure if there is a particular reason for you using different regions,
but Amazon states that each zone is a different physical location completely
separate from others, e.g. us-east-1a and us-east-1b. Using the Amazon
internal IPs (10.x. etc) reduces latency greatly by not going outbound
throu
What I am trying to ask is: what if there are billions of row keys (e.g.
abc, def, xyz in the example below) and a client then does a lookup/query on
a row, say xyz (get all cols for row xyz)? Now, since there are billions of
rows, is a lookup using the hash mechanism going to be slow? What algorithm will be
Gotcha, I had forgotten about the gossip piece; that makes sense.
-Original Message-
From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
Sent: Wednesday, February 23, 2011 5:00 PM
To: Truelove, Jeremy: IT (NYK)
Cc: user@cassandra.apache.org
Subject: Re: Multiple Seeds
On Wed, Feb 23, 201
If you mean does it make sense to have a CF where each row contains a set of
keys to other rows in another CF, then yes, that's a common design pattern,
although usually it's because you're creating collections of those rows
(i.e. a Groups CF where each row consists of a set of keys to rows in the
Another possibility is this:
why not set up 2 nodes in 1 region in 1 AZ, and get that to work.
Then, open a third node in the same region, but different AZ, and get that
to work.
Then, once you have that working, open a fourth node in a different region
and get that to work.
Seems like taking a pi
On Thu, Feb 24, 2011 at 6:33 PM, Anthony John wrote:
> Completely understand!
>
> All that I am quibbling over is whether a CL of quorum guarantees
> consistency or not. That is what the documentation says - right. IF for a CL
> of Q read - it depends on which node returns read first to determine
Generally no. But yes, if retrieving the key through an index is faster than
going through the hash buckets.
Currently I am thinking there could be hundreds of millions or billions of rows,
and in that case, if we have to retrieve a row, which will be faster: going
through the hash buckets or the index? I am thinkin
Completely understand!
All that I am quibbling over is whether a CL of quorum guarantees
consistency or not. That is what the documentation says - right. IF for a CL
of Q read - it depends on which node returns read first to determine the
actual returned result or other more convoluted conditions
On Thu, Feb 24, 2011 at 6:01 PM, Anthony John wrote:
> If you are correct and you are probably closer to the code - then CL of
> Quorum does not guarantee a consistency.
If the operation succeeds, it does (for some definition of consistency which
is, following reads at Quorum will be guaranteed
If you are correct and you are probably closer to the code - then CL of
Quorum does not guarantee a consistency.
On Thu, Feb 24, 2011 at 10:54 AM, Sylvain Lebresne wrote:
> On Thu, Feb 24, 2011 at 5:34 PM, Anthony John wrote:
>
>> >>Time stamps are not used for conflict resolution - unless is is
>Time stamps are not used for conflict resolution - unless is is part of the
application logic!!!
This is false. In fact, the main reason Cassandra keeps timestamps is to do
conflict resolution. If there is a conflict between two replicas, when doing
a read or a repair, then the highest timestamp
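The rule being described, where the update with the highest timestamp wins,
can be illustrated with a short sketch. It assumes each replica returns its
version of a column as a (name, value, timestamp) record; the Column type and
the resolve function are hypothetical names for illustration, not Cassandra
internals, and how ties between equal timestamps are broken is left out.

```python
from typing import NamedTuple

class Column(NamedTuple):
    name: str
    value: str
    timestamp: int  # supplied by the client, e.g. microseconds since epoch

def resolve(versions):
    """Pick the winning version of a column: the one with the highest timestamp.

    Tie-breaking for equal timestamps is not modeled here.
    """
    return max(versions, key=lambda c: c.timestamp)

# Two replicas disagree about the same column; the later write wins.
replica_a = Column("balance", "100", 1298560000000000)
replica_b = Column("balance", "0", 1298560000500000)
winner = resolve([replica_a, replica_b])
```

In this sketch winner is replica_b's version, because its timestamp is the
larger of the two, regardless of the order in which the replicas answered.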
On Thu, Feb 24, 2011 at 5:34 PM, Anthony John wrote:
> >>Time stamps are not used for conflict resolution - unless is is part of
>> the application logic!!!
>>
>
> >>What is you definition of conflict resolution ? Because if you update
> twice the same column (which
> >>I'll call a conflict), the
>
> >>Time stamps are not used for conflict resolution - unless is is part of
> the application logic!!!
>
>>What is you definition of conflict resolution ? Because if you update
twice the same column (which
>>I'll call a conflict), then the timestamps are used to decide which update
wins (which I
Hello,
Have there been Cassandra implementations in non-Latin languages? In
particular: Mandarin (China), Devanagari (India), Korean (Korea).
I am interested in finding if there are storage, sorting or other
types of issues one should be aware of in these languages.
Thanks.
Do not copy the entire thread, only hit reply!
It seems as the thread grows in responses, the spam word count somehow kicks
in.
Thx,
-JA
On Thu, Feb 24, 2011 at 9:44 AM, Sasha Dolgy wrote:
> have you tried replying without copying in the entire conversation
> thread to the message?
>
> On Thu
have you tried replying without copying in the entire conversation
thread to the message?
On Thu, Feb 24, 2011 at 1:40 PM, Anthony John wrote:
> To the list owners - the error text that gmail comes back with is below
> Now I understand that much of what I write is spam quality, so the mail
> filt
On Thu, Feb 24, 2011 at 4:08 AM, Thibaut Britz
wrote:
> Hi,
>
> How would you use rsync instead of repair in case of a node failure?
>
> Rsync all files from the data directories from the adjacant nodes
> (which are part of the quorum group) and then run a compactation which
> will? remove all the
Hi
I'm using a 3-node cluster of Cassandra 0.6.1 together with Hector as the API
for the Java client.
Every few days I get a situation where I cannot connect to Cassandra. Besides
that, the data dir fills up the whole disk and the
synchronization stops at these times. The exceptions I get are
Himanshi,
my bad, try this for iptables:
# SNAT outgoing connections
iptables -t nat -A POSTROUTING -p tcp --dport 7000 -d 175.41.143.192 -j SNAT --to-source INTERNALIP
As for tcpdump the argument for the -i option is the interface name (eth0,
cassth0, etc...), and not the IP. So, it should be
t
On Thu, Feb 24, 2011 at 3:22 AM, Anthony John wrote:
> Apologies : For some reason my response on the original mail keeps bouncing
> back, thus this new one!
> > From the other hand, the same article says:
> > "For conditional writes to work, the condition must be evaluated at all
> update
> > si
>>c. Read with CL = QUORUM. If read hits node1 and node2/node3, new data
that was written to node1 will be returned.
>>In this case - N1 will be identified as a discrepancy and the change will
be discarded via read repair
>>[Naren] How will Cassandra know this is a discrepancy?
Because at Q - on
To the list owners - the error text that gmail comes back with is below
Now I understand that much of what I write is spam quality, so the mail
filter might actually be smart ;).
New posts work, as this one hopefully will. It is on reply that I have a
problem. Any pointers to avoid this situatio
First of all, in your example, is W = CL?
If so, then the success of any read / write operation will be
determined by whether the required CL can be satisfied at that moment.
If you write with CL ONE over a CF with RF 3 when 1 node of the
replicas is down, then the operation will succeed and HintedHandoff
Thanks Daniel.
But the SNAT command is not working, and when I try tcpdump it gives
[root@ip-10-136-75-201 ~]# tcpdump -i 50.18.60.117 -n port 7000
tcpdump: Invalid adapter index
I am not able to figure out what this is.
Thanks,
Himanshi
From:
Daniel van Ham Colchete
To:
user@cassandra.apache.org
Da
My 2 cents ..
1. Focus should be on the core problem Cassandra is solving i.e.
Availability, Partitioning and a form of consistency that works (in spite of
all the questions) . All this with high performance is a huge step forward -
architecturally!
2. Any enhancement should shore up the core valu
I don't think I got the point of your question. But if you are thinking
about key indexes (like PKs), keep in mind that Cassandra will manage
keys using the partition strategy. By doing so, it will be able to
determine on which node the row with such a key should be held.
So, in other words, inside
Himanshi,
you could try adding your public IP address to an internal interface and
DNAT the packets to it. This shouldn't give you any problems with your
normal traffic. Tell Cassandra to listen on the public IPs and it should
work.
Linux commands would be:
# Create an internal interface using b
Hi,
How would you use rsync instead of repair in case of a node failure?
Rsync all files from the data directories from the adjacent nodes
(which are part of the quorum group) and then run a compaction which
will remove all the unneeded keys?
Thanks,
Thibaut
On Thu, Feb 24, 2011 at 4:22 AM,
If it cannot protect against lost updates, isn't that an issue? How is the client
supposed to protect against concurrency? I see a lot of users mentioning the
use of cages (i.e. use ZooKeeper), but involving locks on every write at the
application level is certainly not acceptable. And again, the applica