@michael - benjamin answered your question.
The thing is, if you use MySQL just for indices you are not using the
benefits of the whole relational database engine (which is fine), but you are
still inheriting all its disadvantages.
You can use mysql for storing indices and then write your own sharding
Hi,
We have an application that uses Cassandra to store data. The application is
deployed on multiple nodes that are part of an application cluster. We are
at present using a single Cassandra node. We have noticed a few errors in the
application, and our analysis revealed that the root cause was that the c
On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed wrote:
> Where can i find the documentation for BinaryMemTable (btm_example in contrib)
> to use CassandraBulkLoader? What is the input to be supplied to
> CassandraBulkLoader?
> How to form the input data and what is the format of an input data?
The
They are not complicated; it's more that they are not in the package that
they should be in.
I assume the client package exposes the functionality of the server, and it
does not have the ability to manage the tables in the database, which to me
seems extremely limiting.
When I did not see that co
On Mon, Jul 12, 2010 at 11:44 PM, Benjamin Black wrote:
> We use Cassandra (multidimensional metrics) *and* redis (counters and
> alerts) *and* MySQL (supporting Rails). Right tool for each job. The
> idea that it is a good thing to cram everything into a single database
> (and data model), beat
For read, the bottleneck is usually the disk.
Use iostat to check the utilization of your disks.
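A typical invocation (assuming the sysstat package that provides iostat is installed) looks like:

```shell
# Extended per-device stats, refreshed every 2 seconds; sustained high
# %util and large await values suggest the disks are the read bottleneck.
iostat -x 2
```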
On Tue, Jul 13, 2010 at 2:07 PM, Peter Schuller wrote:
> > Has anyone experimented with different settings for concurrent reads? I
> > have set our servers to 4 ( 2 per processor core ). I have notic
I'm not entirely sure, but I think you can only use get_range_slices
with start_key/end_key on a cluster using OrderPreservingPartitioner.
Don't know if that is intentional or buggy like Jonathan suggests, but I
saw the same "duplicates" behaviour when trying to iterate all rows
using RP and start_key/
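The usual workaround for that duplicates behaviour can be sketched in Python. The `fetch_page` helper below is a hypothetical stand-in for get_range_slices, simulating a token-ordered key range; the loop passes the last key seen as the next start_key and drops the duplicated first row (note the page size must be at least 2 for this to make progress):

```python
# Sketch of iterating all rows under RandomPartitioner: get_range_slices
# returns ranges inclusive of start_key, so each page after the first
# begins with a row we have already seen and must skip.

ROWS = ["k%02d" % i for i in range(10)]  # stand-in for token-ordered keys

def fetch_page(start_key, count):
    """Hypothetical get_range_slices stand-in: rows >= start_key, inclusive."""
    if start_key == "":
        return ROWS[:count]
    i = ROWS.index(start_key)
    return ROWS[i:i + count]

def iterate_all(page_size=4):
    # page_size must be >= 2: after dropping the duplicate, a page of 1
    # would always look empty and the loop would stop early.
    seen = []
    start = ""
    while True:
        page = fetch_page(start, page_size)
        if start != "":
            page = page[1:]  # drop the duplicated start_key row
        if not page:
            break
        seen.extend(page)
        start = seen[-1]
    return seen
```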
Thanks for the links.
Actually it is pretty easy to catch those tombstoned keys on the
client side. However, in certain applications it can generate some
additional overhead on the network.
I think it would be nice to have a forced garbage collection in the
API. This would IMHO make it easier to write Unit
>I'm not entirely sure but I think you can only use get_range_slices
>with start_key/end_key on a cluster using OrderPreservingPartitioner.
>Dont know if that is intentional or buggy like Jonathan suggest but I
>saw the same "duplicates" behaviour when trying to iterate all rows
>using RP and start
On Tue, Jul 13, 2010 at 7:38 AM, Thomas Heller wrote:
> I'm not entirely sure but I think you can only use get_range_slices
> with start_key/end_key on a cluster using OrderPreservingPartitioner.
> Dont know if that is intentional or buggy like Jonathan suggest but I
> saw the same "duplicates" be
You should use ntp in daemon mode, not as a one-time fix.
http://linux.die.net/man/1/ntpd
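To run it continuously rather than as a one-shot step, a typical setup (commands vary by distribution; these assume a Debian/Ubuntu-style system) is:

```shell
# Install and start the ntp daemon so the clock is disciplined
# continuously instead of being stepped once:
sudo apt-get install ntp
sudo service ntp start
# Verify that peers are being tracked:
ntpq -p
```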
On Tue, Jul 13, 2010 at 2:45 AM, Narendra Sharma
wrote:
> Hi,
>
> We have an application that uses Cassandra to store data. The application is
> deployed on multiple nodes that are part of an application clu
The only issue I see (please correct me if I am wrong) is that you now have
single points of failure in the system, i.e. redis etc.
On Tue, Jul 13, 2010 at 3:33 AM, Sandeep Kalidindi at PaGaLGuY.com <
sandeep.kalidi...@pagalguy.com> wrote:
> @michael - benjamin answered your que
The iostat numbers are rather low as is cpu utilization. We have a couple
of nightly jobs which do a lot of reads in a short amount of time. That is
when the pending reads were climbing. I'm going to bump up the number and
see how things run.
Lee Parker
On Tue, Jul 13, 2010 at 6:18 AM, Schubert
@Ahmed -
we are trying to use Redis + gizzard - with gizzard responsible for sharding
and maintaining replicas. We need to test it well before plunging into
production though.
Cheers,
Deepu.
On Tue, Jul 13, 2010 at 7:46 PM, S Ahmed wrote:
> The only issue I see (please correct me if I am wrong)
I recently ran across a blog posting with a comment from a Cassandra committer
that indicated a performance penalty when having a large number of columns per
row/key. Unfortunately I didn't bookmark the blog posting and now I can't find
it. Regardless, since our current plan and design is to h
Hi,
I have set up a ring with a couple of servers and wanted to run some
stress tests.
Unfortunately, there is some kind of bottleneck at the client side.
I'm using Hector and Cassandra 0.6.1.
The subsequent profile results are based on a small Java program that
sequentially inserts records, wit
Currently there is a limitation that each row must fit in memory (with some
not insignificant overhead), thus having lots of columns per row can trigger
out-of-memory errors. This limitation should be removed in a future
release.
Please see:
- http://wiki.apache.org/cassandra/CassandraLimitation
Yes
-Original Message-
From: Jonathan Ellis [jbel...@gmail.com]
Received: 7/12/10 9:15 PM
To: user@cassandra.apache.org [u...@cassandra.apache.org]
Subject: Re: GCGraceSeconds per ColumnFamily/Keyspace
Probably. Can you open a ticket?
On Mon, Jul 12, 2010 at 10:41 PM, Todd Burruss wr
On Tue, Jul 13, 2010 at 2:43 AM, Paul Prescod wrote:
> On Mon, Jul 12, 2010 at 11:44 PM, Benjamin Black wrote:
>> We use Cassandra (multidimensional metrics) *and* redis (counters and
>> alerts) *and* MySQL (supporting Rails). Right tool for each job. The
>> idea that it is a good thing to cram
On Tue, Jul 13, 2010 at 5:47 AM, Samuru Jackson
wrote:
> Thanks for the links.
>
> Actually it is pretty easy to catch those tombstoned keys on the
> client side. However, in certain applications it can generate some
> additional overhead on the network.
>
> I think it would be nice to have a forc
I updated the Ruby client to 0.7, but I am not a Cassandra committer
(and not much of a Java guy), so haven't touched the Java client. Is
there more to it than regenerating Thrift bindings?
On Tue, Jul 13, 2010 at 1:42 AM, GH wrote:
> They are not complicated, its more that they are not in the p
We are planning a rollout of our online product ~September 1. Cassandra is a
major part of our online system.
We need some Cassandra consulting + general online consulting for
determining our server configuration so it will support Cassandra under all
possible scenarios.
Does anybody have any ide
http://riptano.com
On Tue, Jul 13, 2010 at 9:14 AM, David Boxenhorn wrote:
> We are planning a rollout of our online product ~September 1. Cassandra is a
> major part of our online system.
>
> We need some Cassandra consulting + general online consulting for
> determining our server configuration
On Tue, Jul 13, 2010 at 12:45 AM, Narendra Sharma
wrote:
> How are other Cassandra users handling the clock sync in production
> environment?
>
By structuring access in the app such that there are never conflicts
in the first place, for example by using UUIDs for row and column
names. At the p
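A minimal Python sketch of that approach (the helper name is illustrative): each writer generates its own UUID-based column name, so concurrent writers never touch the same (row, column) pair and clock skew cannot cause one write to silently shadow another.

```python
import uuid

def new_column_name():
    # uuid.uuid1() embeds a timestamp and the node's hardware address;
    # uuid.uuid4() is purely random. Either yields names that are unique
    # across app nodes without requiring coordinated clocks.
    return str(uuid.uuid1())

# Two writes, whether from the same node or different ones, get
# distinct column names, so neither overwrites the other.
a, b = new_column_name(), new_column_name()
```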
Are there any plans or talks of adding SSL/encryption support between
Cassandra nodes? This would make setting up secure cross-country Cassandra
clusters much easier, without having to setup a secure overlay network.
MySQL supports this in its replication.
-Ben
On Mon, Jul 12, 2010 at 11:23 P
Since you're using hector hector-users@ is a good place to be, so
u...@cassandra to bcc
operateWithFailover is one stop before sending the request over the network
and waiting, so it makes a lot of sense that a significant part of the
application's time is spent in it.
On Tue, Jul 13, 2010 at 6:22 PM, Sa
Thanks Torsten.
Jonathan's blog on Fact Vs Fiction says that
Fact: It has always been straightforward to send the output of Hadoop jobs
to Cassandra, and Facebook, Digg, and others have been using Hadoop like
this as a Cassandra bulk-loader for over a year.
Does anyone from Facebook or Digg shar
Just want some clarifications on thrift.
1. thrift creates a layer between Cassandra and the client, specific to
whatever language you want.
2. thrift generates an interface to Cassandra's service endpoints
*3. when Cassandra's endpoints have been modified, thrift needs to be
re-generated (along
https://issues.apache.org/jira/browse/CASSANDRA-1276
On Tue, 2010-07-13 at 09:05 -0700, Todd Burruss wrote:
> From: Jonathan Ellis [jbel...@gmail.com]
> Received: 7/12/10 9:15 PM
> To: user@cassandra.apache.org [u...@cassandra.apache.org]
> Subject: Re: GCGraceSeconds per ColumnFamily/Keyspace
>
> Just want some clarifications on thrift.
> 1. thrift creates a layer between Cassandra and the client, specific to
> whatever language you want.
Well, thrift allows cassandra to expose an RPC interface in a language
neutral fashion.
> 2. thrift generates an interface to Cassandra's service endp
So it would appear that 0.7 will have removed the requirement that a single row
must fit in memory. That issue aside, how would one expect the
read/write performance to be in the scenarios listed below?
From: Mason Hale [mailto:ma...@onespot.com]
Sent:
look at contrib/bmt_example, with the caveat that it's usually
premature optimization
On Tue, Jul 13, 2010 at 12:31 PM, Mubarak Seyed wrote:
> Thanks Torsten.
> Jonathan's blog on Fact Vs Fiction says that
> Fact: It has always been straightforward to send the output of Hadoop jobs
> to Cassandra
We would like to do one in Europe in October.
On Fri, Jul 9, 2010 at 11:02 AM, Dave Gardner wrote:
>
> Do you have a rough estimate as to when there might be a training day in
> London (UK). I'm currently weighing up whether I should be making a journey
> across the pond for one of the US-based e
On Fri, Jul 9, 2010 at 9:36 AM, Jeremy Dunck wrote:
> On Fri, Jul 2, 2010 at 1:08 PM, Jonathan Ellis wrote:
>> Riptano's one day Cassandra training is coming to NYC in August, our
>> first public session on the East coast:
>> http://www.eventbrite.com/event/749518831
>
> Is there a calendar where
did you look at compaction activity?
On Mon, Jul 12, 2010 at 9:31 AM, Olivier Rosello wrote:
>> > But in Cassandra output log :
>> > r...@cassandra-2:~# tail -f /var/log/cassandra/output.log
>> > INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600
>> reclaimed leaving 1684169392 u
It's been suggested, but it's not very useful w/o having encryption
for Thrift as well (in case a client has to fail over to the
cross-country Cassandra nodes). So using a secure VPN makes the most
sense to me.
On Tue, Jul 13, 2010 at 12:02 PM, Ben Standefer wrote:
> Are there any plans or talks
Many apps would find it realistic or feasible to failover database
connections across the country (going from <1ms latency to ~90ms latency).
The scheme of failing over client database connections across the country
is probably the minority case. SSL between Cassandra nodes, even without
encrypti
Err, find it *unrealistic*
-Ben
On Tue, Jul 13, 2010 at 2:22 PM, Ben Standefer wrote:
> Many apps would find it realistic or feasible to failover database
> connections across the country (going from <1ms latency to ~90ms latency).
> The scheme of failing over client database connections acro
> look at contrib/bmt_example, with the caveat that it's usually
> premature optimization
I wish that was true for us :)
>> Fact: It has always been straightforward to send the output of Hadoop jobs
>> to Cassandra, and Facebook, Digg, and others have been using Hadoop like
>> this as a Cassandra
If you do not need range scans (and assuming Random Partitioner), I would probably go with B. I tend to feel better when things are spread out. I'm not sure about any overhead from asking the coordinator to send requests to a lot of nodes, but I feel that it will make better use of new nodes added to th
Benjamin,
Yes, I have seen this when adding a new node into the cluster. The new node
doesn't see the complete ring through nodetool, but the strange part is that
looking at the ring through JConsole shows the complete ring. It's as if
there is a bug in nodetool publishing the actual ring. Has anyo
Are you interested in contributing this?
On Tue, Jul 13, 2010 at 4:22 PM, Ben Standefer wrote:
> Many apps would find it realistic or feasible to failover database
> connections across the country (going from <1ms latency to ~90ms latency).
> The scheme of failing over client database connection
Hi, has anyone been able to load balance a Cassandra cluster with an AWS
Elastic Load Balancer? I've setup an ELB with the obvious settings (namely,
--listener "lb-port=9160,instance-port=9160,protocol=TCP") but clients
simply hang trying to load records from the ELB hostname:9160.
Thanks,
--Bria
I know Cassandra 0.7 isn't released yet, but I was wondering if anyone
has used Pelops with the latest builds of Cassandra? I'm having some
issues, but I wanted to make sure that somebody else isn't working on
a branch of Pelops to support Cassandra 7. I have downloaded and built
the latest code fr
I just built today's trunk successfully and am getting the following exception
on startup, which seems bogus to me as the method exists, but I don't know why:
ERROR 15:27:00,957 Exception encountered during startup.
java.lang.NoSuchMethodError: org.apache.cassandra.db.ColumnFamily.id()I
Hector doesn't have 0.7 support yet
On Jul 14, 2010 1:34 AM, "Peter Harrison" wrote:
I know Cassandra 0.7 isn't released yet, but I was wondering if anyone
has used Pelops with the latest builds of Cassandra? I'm having some
issues, but I wanted to make sure that somebody else isn't working on
a
ant clean
On Tue, Jul 13, 2010 at 5:33 PM, Arya Goudarzi wrote:
> I just build today's trunk successfully and am getting the following
> exception on startup which to me it seams bogus as the method exists but I
> don't know why:
>
> ERROR 15:27:00,957 Exception encountered during startup.
> ja
I haven't used ELB, but I've setup HAProxy to do it... appears to work well
so far.
Dave Viner
On Tue, Jul 13, 2010 at 3:30 PM, Brian Helfrich wrote:
> Hi, has anyone been able to load balance a Cassandra cluster with an AWS
> Elastic Load Balancer? I've setup an ELB with the obvious settings (
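For reference, a minimal HAProxy fragment for TCP-balancing the Thrift port (server names and addresses here are illustrative) might look like:

```text
listen cassandra
    bind *:9160
    mode tcp
    balance roundrobin
    server cass1 10.0.0.1:9160 check
    server cass2 10.0.0.2:9160 check
```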
Hi Gary,
Thanks for the reply. I tried this again today. Streams gets stuck, pls read my
comment:
https://issues.apache.org/jira/browse/CASSANDRA-1221
-arya
- Original Message -
From: "Gary Dusbabek"
To: user@cassandra.apache.org
Sent: Wednesday, June 23, 2010 5:40:02 AM
Subject: Re:
To be honest I do not know how to regenerate the bindings; I will look into
that.
Following your email, I went on and took the unit test code and created a
client. Given that this code works I am guessing that the thrift bindings
are in place and it is more that the client code does not support the
http://github.com/danwashusen/pelops/tree/cassandra-0.7.0
p.s. Pelops doesn't have any test coverage and my implicit tests (my app
integration tests) don't touch anywhere near all of the Pelops API.
p.p.s. I've made API breaking changes to support the new 0.7.0 API and
Dominic (the original Pelop
On Wed, Jul 14, 2010 at 2:43 PM, Dan Washusen wrote:
> http://github.com/danwashusen/pelops/tree/cassandra-0.7.0
Doh - I've just finished making most of the changes for the new API.
> p.s. Pelops doesn't have any test coverage and my implicit tests (my app
> integration tests) don't touch anywhe
Check out step 4 of this page:
https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP
./compiler/cpp/thrift -gen php
../PATH-TO-CASSANDRA/interface/cassandra.thrift
That is how to compile the thrift client from the cassandra bindings. Just
replace the "php" with the language of your ch
Very cool stuff, thanks for the info Dave, I will give this a shot...
On Wed, Jul 14, 2010 at 1:03 PM, Dave Viner wrote:
> Check out step 4 of this page:
> https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP
>
> ./compiler/cpp/thrift -gen php
> ../PATH-TO-CASSANDRA/interface/cassan
Yes, possibly. We haven't written it yet, and I was putting some feelers
out there to see if there's any interest or buy-in from committers if we did
contribute it.
-Ben
On Tue, Jul 13, 2010 at 3:23 PM, Jonathan Ellis wrote:
> Are you interested in contributing this?
>
> On Tue, Jul 13, 2010
Yep, as Ben said, we're not asking for anyone to write this for us.
We've been playing with some ideas around encryption between EC2
data-centers/regions (intra-region is already secure enough for us -- it's
all switches / dedicated lines) and the easiest solution seems to be to wrap
the inter-Cass