I forgot to mention that the cluster cannot be taken down, it needs to
continue serving... is there another way?
On May 21, 2010 3:03 AM, "Jonathan Ellis" wrote:
One possibility:
rsync the data to the next node in the ring that is in the same DC.
(specifically, rsync once, then flush on the sou
Hi Fellows,
I just joined this mailing list but I've been on the IRC for a while. Pardon if
this post is a repeat but I would like to share with you some of my experiences
with the Cassandra Thrift interface that comes with the nightly build and probably
0.7. I came across an issue last night that
What inner mechanism does Cassandra adopt to get this kind of fault
tolerance?
2010/5/20 Simon Smith
> On Thu, May 20, 2010 at 8:08 AM, 史英杰 wrote:
> > Hi, All,
> > I am now learning the mechanism Cassandra adopts to get high
> > availability and fault tolerance. As I know, we should connec
Excellent leads, thanks. cassandra.in.sh has a heap of 6GB, but I didn't
realize that I was trying to float so many memtables. I'll poke tomorrow
and report if it gets fixed.
Ian
On Thu, May 20, 2010 at 10:40 AM, Jonathan Ellis wrote:
> Some possibilities:
>
> You didn't adjust Cassandra heap
Were you bootstrapping or otherwise moving nodes around?
I don't think anyone's tracked this bug down farther than "if you
restart the entire cluster, it goes away."
On Wed, May 19, 2010 at 10:05 PM, Keith Thornhill wrote:
> in a 5 node cluster, i noticed in our client error log that one of the
One possibility:
rsync the data to the next node in the ring that is in the same DC.
(specifically, rsync once, then flush on the source node and rsync
again.) Then stop the entire cluster, and restart everyone but those
two nodes. Then run nodetool repair on each machine.
If your client is not
Look in /contrib it's already there.
On May 20, 2010, at 6:23 PM, Mark Robson wrote:
On 20 May 2010 23:16, Ryan Daum wrote:
I personally would love to see Cassandra add the concept of a read-only
'proxy' node which acts like the embedded read-only mode (Java
'fat client') but sits as a
On 20 May 2010 23:16, Ryan Daum wrote:
> I personally would love to see Cassandra add the concept of a read-only
> 'proxy' node which acts like the embedded read-only mode (Java 'fat
> client') but sits as a standalone server. It would know the entire ring
> and watch Gossip and thus be abl
I personally would love to see Cassandra add the concept of a read-only
'proxy' node which acts like the embedded read-only mode (Java 'fat
client') but sits as a standalone server. It would know the entire ring
and watch Gossip and thus be able to direct requests to the most appropriate
node
On 20 May 2010 20:17, David Wellman wrote:
> I have a 5 node cassandra cluster and I am wondering if there is any
> advantage of setting up a connection pool that is balanced across all 5
> nodes (IE: 50 connections = 10 per node) over one pool all to one server (50
> connection => one node)
De
Or possibly in a language for which there are R<->$language IDL bindings. I
would not be surprised if you could do this using SWIG.
http://www.swig.org/Doc1.3/R.html
On Thu, May 20, 2010 at 2:01 PM, Jonathan Ellis wrote:
> You'd probably need to build a proxy in a language that Thrift supports
No, CM is not exposed to nodetool yet. (You should really be putting
metrics into a real monitoring system rather than relying on nodetool.
Some example munin plugins are at
http://github.com/jbellis/cassandra-munin-plugins, for instance.)
CM also has BytesCompacted/BytesTotalInProgress.
Backup
"disseminating load info" is not related to your problem.
certainly you should be using connection pooling rather than opening a
ton of sockets.
You didn't say what CL you are using, but you should not use CL.ZERO
for load tests.
On Thu, May 20, 2010 at 11:14 AM, xavier manach wrote:
> Hi.
>
>
Yes, the extra I/O and CPU load caused by decommission will affect reads (and
writes, to a lesser degree).
On Thu, May 20, 2010 at 10:18 AM, Maxim Kramarenko
wrote:
> It reports that node 3 is transferring data to node 1. As I remember, node 2
> doesn't send or receive data.
>
> BTW, after a few hrs (probably, a
HBase has the same problem.
Your choices are basically (a) figure out a way to not do all writes
sequentially or (b) figure out a way to model w/o OPP.
Most Cassandra users go with option (b).
On Thu, May 20, 2010 at 8:21 AM, Sonny Heer wrote:
> Yes, I'm using OOP, because of the way we modeled
You'd probably need to build a proxy in a language that Thrift supports.
On Thu, May 20, 2010 at 2:49 AM, Kyusik Chung wrote:
> Does anyone have suggestions on how to access cassandra from R
> (http://www.r-project.org/)?
>
> Thanks!
>
> Kyusik Chung
--
Jonathan Ellis
Project Chair, Apache C
Hello,
> It's unclear if you're looking for data that can be stored in Cassandra or
> an example of someone using Cassandra to store a network; I'm assuming the
> former.
>
You're assuming incorrectly. I'm looking for an example of someone using
Cassandra to store a graph.
> You will have a hard time
Hi,
In the 0.5.x series there was a COMPACTION-POOL which kept track of
in process, pending and completed compactions. With 0.6.x this seems
to have vanished and instead we only have the CompactionManager PendingTasks
statistic. Is there also a completed tasks somewhere? Is there any
way to d
I have a 5 node cassandra cluster and I am wondering if there is any advantage
of setting up a connection pool that is balanced across all 5 nodes (IE: 50
connections = 10 per node) over one pool all to one server (50 connection =>
one node)
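The two layouts in the question can be compared with a small sketch. It only models how connections are assigned, not a real Thrift client; the host names and the `assign_connections` and `surviving_connections` helpers are hypothetical and exist purely for illustration.

```python
import itertools
from collections import Counter

def assign_connections(hosts, total_connections):
    """Assign connections to hosts round-robin and return one host per connection."""
    ring = itertools.cycle(hosts)
    return [next(ring) for _ in range(total_connections)]

def surviving_connections(assignment, failed_host):
    """Count the connections that do not point at the failed host."""
    return sum(1 for host in assignment if host != failed_host)

# Hypothetical host names for a 5 node cluster.
hosts = ["node1", "node2", "node3", "node4", "node5"]

balanced = assign_connections(hosts, 50)        # 50 connections spread over 5 nodes
single = assign_connections(hosts[:1], 50)      # 50 connections to one node

per_host = Counter(balanced)
balanced_after_failure = surviving_connections(balanced, "node1")
single_after_failure = surviving_connections(single, "node1")
```

In this model the balanced pool keeps most of its connections when one node fails, while the single-node pool loses all of them. Whether that matters in practice depends on how the client reconnects, which the sketch does not cover.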
Hi.
I am a newbie to Cassandra. I am studying and comparing database performance to
choose my future architecture.
I am trying to load a lot of data into Cassandra.
I use Python with the Thrift protocol (very simple, without threading).
I make sequential requests: client.batch_mutate and client.get_slice.
(a
It's unclear if you're looking for data that can be stored in Cassandra or an
example of someone using Cassandra to store a network; I'm assuming the former.
You will have a hard time finding a social network dataset with relationships
already well-defined for free. I have seen crawls of Twitte
On Wed, May 19, 2010 at 7:15 PM, Torsten Curdt wrote:
> We are currently working on a prototype that is using Cassandra for
> realtime-ish statistics system. This seems to be quite a common use
> case. If people are interested - maybe it be worth collaborating on
> this beyond design discussions
It reports that node 3 is transferring data to node 1. As I remember, node 2
doesn't send or receive data.
BTW, after a few hrs (probably after the decommission finished), node 2
started working again, without a restart. Can the decommission of node 3 affect
reads?
On 20.05.2010 18:51, Jonathan Ellis wrote:
Mine is under development. Sorry I can't help you at the moment :(
Regards,
Michael
On Thu, May 20, 2010 at 12:09 PM, Valerio Schiavoni <
valerio.schiav...@gmail.com> wrote:
> Not strictly Facebook.
> Any online social network is ok to me, as long as it has a reasonable
> number of users and
Not strictly Facebook.
Any online social network is ok to me, as long as it has a reasonable number
of users and that it's built on top of a schema-less storage system.
> Are you looking for Facebook stuff? Good luck on getting a data set from any
> real world model.
>
>
> Hello everyone,
>> i'm a
Are you looking for Facebook stuff? Good luck on getting a data set from any
real world model.
Regards,
Michael
On Thu, May 20, 2010 at 11:53 AM, Valerio Schiavoni <
valerio.schiav...@gmail.com> wrote:
> Hello everyone,
> i'm a phd student looking for some real-world dataset of any social
> ne
Hello everyone,
i'm a phd student looking for some real-world dataset of any social networks
built on top of some schema-less storage system.
The dataset should at least provide a means to reconstruct the graph of
users.
Due to possibly sensitive information in the dataset, the dataset can be
very p
meant to say OPP :)
On Thu, May 20, 2010 at 8:21 AM, Sonny Heer wrote:
> Yes, I'm using OOP, because of the way we modeled our data. Does
> Cassandra not handle OOP intensive write operations? Is HBase a
> better approach if one must use OOP?
>
>
> On Thu, May 20, 2010 at 7:41 AM, Jonathan Elli
Yes, I'm using OOP, because of the way we modeled our data. Does
Cassandra not handle OOP intensive write operations? Is HBase a
better approach if one must use OOP?
On Thu, May 20, 2010 at 7:41 AM, Jonathan Ellis wrote:
> Are you using OOP? That will tend to create hot spots like this,
> whi
On Wed, May 19, 2010 at 1:37 AM, Peng Guo wrote:
> Thanks for your information.
>
> I looked at some of the implementation's source code. There are still some questions:
>
> 1. How do I know that the binary write message was sent to the endpoint successfully?
It doesn't. It's fire-and-forget. If you look at the example it
No.
On Wed, May 19, 2010 at 4:12 PM, Beier Cai wrote:
> Thanks Jonathan, using MySQL as an id sequence generator definitely is a
> good option. One thing though, does using sequential ids defeat the purpose
> of random partitioner?
>
> On Tue, May 18, 2010 at 11:25 PM, Jonathan Ellis wrote:
>>
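The "No." above can be illustrated with a rough sketch. It assumes the random partitioner places each key by the MD5 hash of the key, read as a signed big-endian integer with the absolute value taken; that is an approximation for illustration, not the project's code. The four equal token ranges and the `owning_range` helper are hypothetical.

```python
import hashlib

def random_partitioner_token(key: bytes) -> int:
    """Approximate a random-partitioner token: absolute value of the MD5 digest
    read as a signed big-endian integer (an assumption for this sketch)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))

# Hypothetical ring: four nodes that each own an equal slice of the token space.
NODE_COUNT = 4
RANGE_WIDTH = 2 ** 127 // NODE_COUNT

def owning_range(token: int) -> int:
    """Return the index of the slice that contains the token."""
    return min(token // RANGE_WIDTH, NODE_COUNT - 1)

# Sequential ids, as an external id generator would hand them out.
keys = [str(i).encode() for i in range(1000, 2000)]
tokens = [random_partitioner_token(k) for k in keys]

per_range = [0] * NODE_COUNT
for token in tokens:
    per_range[owning_range(token)] += 1

print(per_range)
```

Consecutive ids land on unrelated tokens, so the counts per range come out roughly even rather than piling onto one node.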
What does JMX report as described in
http://wiki.apache.org/cassandra/Streaming ?
2010/5/19 Maxim Kramarenko :
> Hello!
>
> I have 3 node cluster: node1, node2, node3. Replication factor = 2.
> I run decommission on node3 and it's in progress, moving data to node1
>
> Ring on all nodes show all 3
Are you using OOP? That will tend to create hot spots like this,
which is why most people deploy on RP.
If you are using RP you may simply need to add C* capacity, or take
TimeoutException as a signal to throttle your activity.
On Tue, May 18, 2010 at 4:37 PM, Sonny Heer wrote:
> Yeah there are
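Treating TimeoutException as a signal to throttle could look like the sketch below. The `TimeoutException` class here is a stand-in for the one a Thrift client would raise, and `call_with_backoff`, its parameters, and `flaky_write` are hypothetical names chosen for illustration.

```python
import time

class TimeoutException(Exception):
    """Stand-in for the client's TimeoutException (hypothetical in this sketch)."""

def call_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run operation(); on TimeoutException wait, double the wait, and retry."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutException:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(delay)
            delay *= 2

# Demo: a write that times out twice before succeeding.
attempts = {"count": 0}

def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutException()
    return "ok"

delays = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_write, sleep=delays.append)
```

Doubling the wait after each timeout lowers the request rate while the cluster is overloaded. The starting delay and the attempt limit are arbitrary here and would need tuning against a real workload.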
Some possibilities:
You didn't adjust Cassandra heap size in cassandra.in.sh (1GB is too small)
You're inserting at CL.ZERO (ROW-MUTATION-STAGE in tpstats will show
large pending ops -- large = 100s)
You're creating large rows a bit at a time and Cassandra OOMs when it
tries to compact (the oom sh
Here are the basics I discuss in Riptano's training classes:
http://github.com/jbellis/cassandra-munin-plugins
On Mon, May 17, 2010 at 3:02 PM, Maxim Kramarenko
wrote:
> Hi!
>
> Which JMX metrics do you use for Cassandra monitoring ? Which values can be
> used for alerts ?
>
--
Jonathan Ellis
Yes, if you add nodes when the existing one doesn't have enough data
to guess a good token from the keys it has, it uses a random token.
Created https://issues.apache.org/jira/browse/CASSANDRA-1112 to use
midpoint instead.
On Mon, May 17, 2010 at 4:06 PM, Chris Shorrock wrote:
> I have a feeling
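As a rough illustration of what "midpoint" means on a token ring, here is a minimal sketch. It assumes tokens live in the range 0 to 2**127 and that ranges may wrap past the end of the ring; it is not taken from the CASSANDRA-1112 patch, and the `midpoint` helper is hypothetical.

```python
RING_SIZE = 2 ** 127  # assumed size of the token space for this sketch

def midpoint(left: int, right: int, ring_size: int = RING_SIZE) -> int:
    """Return the token halfway along the range that runs from left to right,
    moving clockwise around the ring."""
    if left < right:
        return (left + right) // 2
    # The range wraps past the end of the ring (or covers the whole ring).
    span = ring_size - left + right
    return (left + span // 2) % ring_size

simple = midpoint(0, 100)
wrapped = midpoint(RING_SIZE - 10, 10)
```

A new node placed at the midpoint of an existing range takes over half of that range's token space, which is more predictable than a random token when there is too little data to guess from.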
You can also easily use VMs (with NAT instead of bridged networking, to avoid DHCP
changes if your computer connects to a different network) to launch several
nodes with different IPs on your single host machine (I suppose you want to
do this for training).
On Thu, 20 May 2010 07:03:57 -0700, Jonathan Ellis
w
You can easily run on 127.0.0.1, 127.0.0.2, etc.
On Thu, May 20, 2010 at 7:00 AM, Yan Virin wrote:
> It seems to be impossible to run several Cassandra instances on a
> local machine, due to the fact that the seeds are described as IP addresses
> and not pairs of IP address and port.
> Is this c
On Thu, May 20, 2010 at 8:08 AM, 史英杰 wrote:
> Hi, All,
> I am now learning the mechanism Cassandra adopts to get high
> availability and fault tolerance. As I know, we should connect to one
> server of Cassandra first, then we can read or write data through it, so if
> the server which we co
Hi, All,
I am now learning the mechanism Cassandra adopts to get high
availability and fault tolerance. As I know, we should connect to one
server of Cassandra first, then we can read or write data through it, so if
the server which we connect to goes down, what will happen? Should we have to