Re: Install Cassandra on EC2

2011-08-03 Thread Dave Viner
Hi Eldad, Check out http://wiki.apache.org/cassandra/CloudConfig There are a few ways listed there including a step-by-step guide. Dave Viner On Wed, Aug 3, 2011 at 7:49 AM, Eldad Yamin wrote: > Thanks! > But I prefer to learn how to Install first - if you have any good > refe

[OT] shout out for riptano training

2010-12-09 Thread Dave Viner
Just wanted to give a shout-out to Jonathan Ellis & the Riptano team for the awesome training they provided yesterday in Santa Monica. It was awesome, and I'd highly recommend it for anyone who is using or seriously considering using Cassandra. Just. freakin awesome. Dave Viner

Re: Quorum and Datacenter loss

2010-12-12 Thread Dave Viner
S1 and DC1-S2 are still responding, your read query will succeed. Of course, you are on the knife-edge. If you lost any machine in DC1 while DC2 is out, then you would not be able to satisfy QUORUM reads or writes. Note that in 0.7, the new LOCAL_QUORUM on reads would solve the issue, I believe.
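
The "knife-edge" above follows from the quorum arithmetic: a QUORUM operation needs floor(RF / 2) + 1 replicas to respond. The following is a minimal sketch of that arithmetic. The thread's actual replication factor is cut off in the snippet, so the RF of 3 (two replicas in DC1, one in DC2) used below is an assumption for illustration only, and the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def can_satisfy_quorum(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas are still up to reach QUORUM."""
    return live_replicas >= quorum(replication_factor)


# Assumed layout (not stated in the snippet): RF = 3, two replicas in DC1, one in DC2.
rf = 3
print(can_satisfy_quorum(rf, live_replicas=2))  # DC2 lost, both DC1 replicas up
print(can_satisfy_quorum(rf, live_replicas=1))  # DC2 lost and one DC1 replica lost
```

Under that assumed layout, losing DC2 leaves exactly the two replicas QUORUM needs, and losing one more machine in DC1 drops below it.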

Re: Cassandra Monitoring

2010-12-19 Thread Dave Viner
How does mx4j compare with the earlier jmx-to-rest bridge listed in the operations page: "JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge" Thanks Dave Viner On Sun, Dec 19, 2010 at 7:01 AM, Ran Tavory wrote: > FYI, I just added an mx4j s

Re: Cassandra Monitoring

2010-12-19 Thread Dave Viner
dra jmx-to-rest runs in a separate jvm. > > It also has a nice useful HTML interface that you can look into any > > running host. > > > > On Sunday, December 19, 2010, Dave Viner wrote: > >> How does mx4j compare with the earlier jmx-to-rest bridge listed in the >

Re: Virtual IP / hardware load balancing for cassandra nodes

2010-12-20 Thread Dave Viner
You can put a Cassandra cluster behind a load balancer. One thing to be cautious of is the health check. Just because the node is listening on port 9160 doesn't mean that it's healthy to serve requests. It is required, but not sufficient. The real test is the JMX values. Dave Vine

Re: WELCOME to user@cassandra.apache.org

2010-12-29 Thread Dave Viner
I think you can probably find all the information on the cassandra wiki at http://wiki.apache.org/cassandra/ for free. Dave Viner On Wed, Dec 29, 2010 at 8:51 PM, asil klin wrote: > Can't I get it for free from anywhere? I am a student researching on > Cassandra datastore

Re: Does Cassandra run better on Amazon EC2 or Rackspace cloud servers?

2011-01-03 Thread Dave Viner
Since it's all pay-for-use, you could build your system on both, then do whatever stress testing you want. The cassandra part of your app should be unchanged between different cloud providers. Personally, I'm using EC2 and don't have any complaints. Dave Viner On Mon, Jan 3,

anyone using Cassandra as an analytics/data warehouse?

2011-01-04 Thread Dave Viner
1/2010 from the US"? - "tell me how many page views occurred between 12/01/2010 and 12/31/2010 from the US in the 9th hour of the day (in gmt)"? Time slicing and dimension slicing seems like it might be very challenging (especially since the windows of time would not be known in advance). Thanks Dave Viner

Re: anyone using Cassandra as an analytics/data warehouse?

2011-01-04 Thread Dave Viner
n the query you will be > doing. Don't get too hung up on traditional data structures and queries as > they have little relationship to a Cassandra approach. > > > > On Wed, Jan 5, 2011 at 2:34 PM, Dave Viner wrote: > >> Does anyone use Cassandra to power an analytics

upgrading to 0.7 from 0.6.x

2011-01-11 Thread Dave Viner
Is the idea to install 0.7 alongside the 0.6 instance or to replace the entire machine/node with a new node running 0.7? If so, is there a step 1b, which is "install 0.7 code & executables" ? Thanks Dave Viner

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-15 Thread Dave Viner
Perl using the thrift interface directly. On Sat, Jan 15, 2011 at 6:10 AM, Daniel Lundin wrote: > python + pycassa > scala + Hector > > On Fri, Jan 14, 2011 at 6:24 PM, Ertio Lew wrote: > > Hey, > > > > If you have a site in production environment or considering so, what > > is the client that

Re: Super CF or two CFs?

2011-01-17 Thread Dave Viner
can you give an example of the data and how you'd access it? what would your expected columns (and/or supercolumns) be? Dave Viner On Mon, Jan 17, 2011 at 11:05 AM, Steven Mac wrote: > How can I best map an object containing two maps, one of which is updated > very frequently an

Re: Cassandra automatic startup script on ubuntu

2011-01-20 Thread Dave Viner
You can also use the apt-get repository version, which installs the startup script. On http://wiki.apache.org/cassandra/CloudConfig, see the Cassandra Basic Setup section. It applies to any debian based machine, not just cloud instances. HTH Dave Viner On Thu, Jan 20, 2011 at 9:11 AM, Donal

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Dave Viner
" would be fantastic. Dave Viner On Fri, Jan 21, 2011 at 1:01 PM, Aaron Morton wrote: > Yup, you can use diff ports and you can give them different cluster names > and different seed lists. > > After you upgrade the second cluster partition the data should repair > across, eith

quick shout-out to the riptano/datastax folks!

2011-02-02 Thread Dave Viner
Just a quick shout-out to the riptano folks and becoming part of/forming DataStax! Congrats!

Re: Cassandra nodes on EC2 in two different regions not communicating

2011-02-22 Thread Dave Viner
ity-groups" Assuming that both nodes are in the same security group, ensure that the SG is configured to allow other members of the SG to communicate on port 7000 to each other. HTH, Dave Viner On Tue, Feb 22, 2011 at 8:59 PM, Himanshi Sharma wrote: > > Hi, > > I am ne

Re: Cassandra nodes on EC2 in two different regions not communicating

2011-02-23 Thread Dave Viner
Try using the IP address, not the dns name in the cassandra.yaml. If you can telnet from one to the other on port 7000, and both nodes have the other node in their config, it should work. Dave Viner On Wed, Feb 23, 2011 at 1:43 AM, Himanshi Sharma wrote: > > Ya they do. Have specified

Re: Cassandra nodes on EC2 in two different regions not communicating

2011-02-23 Thread Dave Viner
the other node, which IP address did you use - the 10.x address or the public ip address? And what is the seed/non-seed configuration in both cassandra.yaml files? Dave Viner On Wed, Feb 23, 2011 at 8:12 AM, Frank LoVecchio wrote: > The internal Amazon IP address is what you will want to use

cassandra as user-profile data store

2011-02-23 Thread Dave Viner
window, user x must be interested in buying a car). I don't have specifics as yet... just some general thoughts. But this feels like a Cassandra type problem. (User profile can have lots of columns per user, but the exact columns might differ from user to user... very scalable, etc) Thanks Dave Viner

Re: Cassandra nodes on EC2 in two different regions not communicating

2011-02-23 Thread Dave Viner
hat is routable from the other node. I would first try with the actual public IP address (not the Elastic IP). Once you get that to work, then shutdown the cluster, change the listen_address to the EIP, boot up and try again. Dave Viner On Wed, Feb 23, 2011 at 8:54 PM, Himanshi Sharma wrote: >

Re: Cassandra nodes on EC2 in two different regions not communicating

2011-02-23 Thread Dave Viner
ddress > field, Cassandra gives the same exception but if leave it blank then > Cassandra runs but again in the nodetool command with ring option it doesn't > show the node in another region. > > Thanks, > Himanshi > > > -Dave Viner wrote: ----- > > To: user

Re: Cassandra nodes on EC2 in two different regions not communicating

2011-02-24 Thread Dave Viner
piece-meal approach would be beneficial here. Dave Viner On Thu, Feb 24, 2011 at 6:11 AM, Daniel van Ham Colchete < daniel.colch...@gmail.com> wrote: > Himanshi, > > my bad, try this for iptables: > > # SNAT outgoing connections > iptables -t nat -A POSTROUTIN

Re: cassandra as user-profile data store

2011-03-01 Thread Dave Viner
e timelines for each user to build a "Profile"? Are you on 0.7 or 0.6.x? Dave Viner On Tue, Mar 1, 2011 at 1:31 AM, Dave Gardner wrote: > Dave > > Tyler's answer already covers CFs etc.. > > We are using Cassandra to store user profile data for exactly the sort o

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Dave Viner
similarly with EBS. In EBS you pay for the disk size you allocate. There's a tiny additional charge for IO (currently $0.10 per 1M io requests). HTH, Dave Viner On Wed, Mar 9, 2011 at 8:48 AM, Sasha Dolgy wrote: > Hi Will, > > http://wiki.apache.org/cassandra/Operations#B
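
As a worked example of the IO charge quoted above, here is a minimal sketch that applies the $0.10 per 1M IO requests rate from the message. The rate is the 2011 figure given in the thread, not a current price, and the request count in the example is made up for illustration.

```python
from decimal import Decimal

# Rate quoted in the message above (2011); not a current price.
PRICE_PER_MILLION_IO = Decimal("0.10")


def ebs_io_charge(io_requests: int) -> Decimal:
    """Dollar charge for the given number of EBS IO requests at the quoted rate."""
    return PRICE_PER_MILLION_IO * Decimal(io_requests) / Decimal(1_000_000)


# Hypothetical month with 250 million IO requests.
print(ebs_io_charge(250_000_000))
```

At that rate, even a few hundred million requests a month add tens of dollars, which is consistent with the message calling the IO charge small next to the allocated-size charge.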

Re: EC2 - 2 regions

2011-03-18 Thread Dave Viner
ode that's in a separate region. Taking it step-by-step will ensure that any issues are specific to the region-to-region communication, rather than intra-zone connectivity or cassandra cluster configuration. Dave Viner On Fri, Mar 18, 2011 at 8:34 AM, A J wrote: > Hello, > >

Re: EC2 - 2 regions

2011-03-18 Thread Dave Viner
From the us-west instance, are you able to connect to the us-east instance using telnet on port 7000 and 9160? If not, then you need to open those ports for communication (via your Security Group) Dave Viner On Fri, Mar 18, 2011 at 10:20 AM, A J wrote: > Thats exactly what I am doing.
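
The telnet check described above can also be scripted. This is a minimal sketch using only the Python standard library; the function name and the example address are hypothetical. A successful TCP connect only shows that the port is open through the security group, not that the Cassandra node behind it is healthy.

```python
import socket


def reachable_ports(host, ports, timeout=3.0):
    """Try a plain TCP connect to each port and report which ones accept it."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Run from the us-west instance, a call such as `reachable_ports("203.0.113.10", [7000, 9160])` (the address is a placeholder) would return a dict mapping each port to `True` or `False`.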

Re: EC2 - 2 regions

2011-03-21 Thread Dave Viner
Hi Milind, Great work here. Can you provide the patch against the 2 files? Perhaps there's some way to incorporate it into the trunk of cassandra so that this is feasible (in a future release) without patching the source code. Dave Viner On Mon, Mar 21, 2011 at 9:41 AM, A J wrote: >

Re: what kind of bug?

2011-03-23 Thread Dave Viner
I saw this once when my servers ran out of file descriptors. This caused totally weird problems. Make sure all nodes in the cluster are listening on the gossip port (7000 by default). Also check out http://www.datastax.com/docs/0.7/troubleshooting/index#view-of-ring-differs-between-some-nodes or

Re: LB scenario

2011-04-05 Thread Dave Viner
overloaded. In practice, I've found it better for the client to have a pool of connections, and then retry as needed to distinct nodes rather than use a load balancer. HTH Dave Viner On Tue, Apr 5, 2011 at 9:51 AM, A J wrote: > Can someone comment on this ? Or is the question t
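
The client-side approach described above (hold connections to several nodes and retry against a different node on failure) could be sketched roughly as follows. This is an illustration only, not the code used in the thread: `call_with_failover` and the `operation` callback are hypothetical names, and a real client would reuse pooled connections rather than plain node identifiers.

```python
import random


def call_with_failover(nodes, operation, max_attempts=3, rng=None):
    """Run operation(node) against distinct nodes until one succeeds.

    nodes: node identifiers (for example host names); each is tried at most once.
    operation: callable taking one node and returning a result or raising.
    """
    rng = rng or random.Random()
    candidates = list(nodes)
    rng.shuffle(candidates)  # spread load instead of always hitting the first node
    attempted = candidates[:max_attempts]
    last_error = None
    for node in attempted:
        try:
            return operation(node)
        except Exception as exc:  # any failure moves on to the next distinct node
            last_error = exc
    raise RuntimeError(f"all {len(attempted)} attempted nodes failed") from last_error
```

Because every retry goes to a node that has not been tried yet, a single slow or failing node costs at most one attempt per request.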

Re: Property file snitch for Cassandra?

2010-07-07 Thread Dave Viner
RAC1. I've not yet proven to myself that this is accurate, but it definitely stops the error messages and, from looking at the code, seems like it should work. Is this correct? Thanks Dave Viner On Wed, Jul 7, 2010 at 6:22 PM, Eric Evans wrote: > > Let's move this to the user@

Backing up the data stored in cassandra

2010-07-07 Thread Dave Viner
ing those files? (That is, bring up a new node, copy the backed up files from the crashed node onto the new node, then have the new node join the cluster?) Thanks Dave Viner

Re: How to add a new Keyspace?

2010-07-07 Thread Dave Viner
y_cf_config http://www.mail-archive.com/user@cassandra.apache.org/msg02498.html HTH Dave Viner On Wed, Jul 7, 2010 at 11:39 PM, Peter Schuller wrote: > > If I want to add a new Keyspace, does it mean I have to distribute my > > storage-conf.xml to whole nodes? and restart whole

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Dave Viner
same region allows for higher thruput numbers. Dave Viner On Fri, Jul 9, 2010 at 10:36 AM, maneela a wrote: > Are there any known performance issues if cassandra cluster launched with > RackAwareStrategy because I see huge performance difference between > RackAwareStrategy vs RackUnAwareS

Re: Elastic Load Balancing Cassandra

2010-07-13 Thread Dave Viner
I haven't used ELB, but I've set up HAProxy to do it... appears to work well so far. Dave Viner On Tue, Jul 13, 2010 at 3:30 PM, Brian Helfrich wrote: > Hi, has anyone been able to load balance a Cassandra cluster with an AWS > Elastic Load Balancer? I've set up an ELB with

Re: Is anyone using version 0.7 schema update API

2010-07-13 Thread Dave Viner
ge of your choosing. According to http://wiki.apache.org/thrift/, "Thrift has generators for C++, C#, Erlang, Haskell, Java, Objective C/Cocoa, OCaml, Perl, PHP, Python, Ruby, and Squeak" HTH Dave Viner On Tue, Jul 13, 2010 at 6:05 PM, GH wrote: > > To be honest I do not know how t

Re: A very short summary on Cassandra for a book

2010-07-15 Thread Dave Viner
s at any time from any node. "Also sorting happens on insert time" Yes, I believe this is true. Dave Viner On Thu, Jul 15, 2010 at 4:26 PM, Karoly Negyesi wrote: > Hi, > > I am writing a scalability chapter in a book and I need to mention > Apache Cassandra although it

Re: Suggestion for the storage.conf

2010-07-19 Thread Dave Viner
Added: http://wiki.apache.org/cassandra/StorageConfiguration On Mon, Jul 19, 2010 at 2:55 AM, Dimitry Lvovsky wrote: > I think it would be a good idea to add a bit more explanation > storage-conf.xml/wiki regarding the replication factor. It caused some > confusion until we dug around the mail

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread Dave Viner
Cloud, but perhaps they have something similar? This feels like the kind of problem that might be easier for someone else to setup and quickly test. (The beauty of the virtual server - quick setup and quick tear down) Dave Viner On Mon, Jul 19, 2010 at 10:24 AM, Peter Schuller < peter.schul...

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread Dave Viner
release of Cassandra... Dave Viner On Mon, Jul 19, 2010 at 2:35 PM, Peter Schuller wrote: > > CPU was approximately equal across the cluster; it was around 50%. > > > > stress.py generates keys randomly or using a gaussian distribution, both > methods showed the same resul

Re: setting up a cluster

2010-07-21 Thread Dave Viner
Zs in the same region, you can use internal IPs in the storage-conf.xml file. If you use multiple regions, you must use external IPs (and open the security groups accordingly). Dave Viner On Wed, Jul 21, 2010 at 2:33 PM, Aaron Morton wrote: > This page may help > http://wiki.apache.or

Re: Cassandra Chef recipe and EC2 snitch

2010-07-22 Thread Dave Viner
. Pure EC2 api calls to setup your cluster. You can also use rackaware-ness in EC2. Just add in the PropertyFile endpoint and put your rack file in /etc/cassandra/rack.properties. Dave Viner On Thu, Jul 22, 2010 at 10:08 AM, Allan Carroll wrote: > Hi all, > > I'm setting up a new clu

Re: SV: How to stop cassandra server, installed from debian/ubuntupackage

2010-07-26 Thread Dave Viner
Yes... if you're using debian cassandra you can do: /etc/init.d/cassandra stop On Mon, Jul 26, 2010 at 8:05 AM, Lee Parker wrote: > Which debian/ubuntu packages are you using? I am using the ones that are > maintained by Eric Evans and the init.d script stops the server correctly. > > Lee Par

Re: Design questions/Schema help

2010-07-26 Thread Dave Viner
and aggregates from that directly in a hadoop post-processing... But perhaps others will have better ideas. If you haven't read http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model, go read it now. It won't answer your question directly, but will describe the process of modeling a bl

Re: Design questions/Schema help

2010-07-26 Thread Dave Viner
AFAIK, atomic increments are not available. There recently has been quite a bit of discussion about them. So, you might search the archives. Dave Viner On Mon, Jul 26, 2010 at 7:02 PM, Mark wrote: > On 7/26/10 6:06 PM, Dave Viner wrote: > > I'd love to hear other's o

Re: Quick Poll: Server names

2010-07-27 Thread Dave Viner
I've seen & used several... names of children of employees of the company names of streets near office names of diseases (led to very hard to spell names after a while, but was quite educational for most developers) names of characters from famous books (e.g., lord of the rings, asimov novels, et

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Dave Viner
t check out the new async Thrift client in Java for inspiration: http://blog.rapleaf.com/dev/2010/06/23/fully-async-thrift-client-in-java/ Or, even better, port the Thrift async client to work for python and other languages. Dave Viner On Tue, Jul 27, 2010 at 8:44 AM, Peter Schuller wrote:

iterating over all rows keys gets duplicate key returns

2010-07-28 Thread Dave Viner
forming the iteration over the keys. Any suggestions on how I can properly iterate? Thanks Dave Viner

Re: iterating over all rows keys gets duplicate key returns

2010-07-28 Thread Dave Viner
, adding the keys to a hash table. When an iteration returns no new keys, assume that all keys have been seen and exit. -> this also fails, since a particular result set can be full of duplicates, but the iteration has not traversed the entire row-key spectrum. Dave Viner On Wed, Jul 28, 2010 at 3:4

Re: Please need help with Munin: Cassandra Munin plugin problem

2010-07-29 Thread Dave Viner
Is your code posted somewhere such that others could try it? On Thu, Jul 29, 2010 at 5:57 AM, Miriam Allalouf wrote: > Hi, > Please, can someone help us with Munin?? > Thanks, > Miriam > > > On Mon, Jul 26, 2010 at 1:58 PM, osishkin osishkin > wrote: > > Hi, > > > > I'm trying to use Munin to m

Re: error using get_range_slice with random partitioner

2010-08-06 Thread Dave Viner
} } } $have_more = 1; } # end results loop if ($keyrange->{'start_key'} eq $previous_start_key) { $have_more = 0; } } # end while() loop $t
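
The termination test visible in the Perl fragment above (stop once the start key no longer advances) can be sketched in Python as follows. The full Perl code is cut off in this listing, so this is a reading of the idea and not a port of it. `fetch_page` is a hypothetical stand-in for a range query that returns up to `page_size` keys in partitioner order, beginning with `start_key` itself when that key exists.

```python
def iterate_all_keys(fetch_page, page_size=100):
    """Yield every row key once, paging until the start key stops advancing."""
    if page_size < 2:
        # Each page repeats the previous page's last key, so a page of one
        # key could never make progress.
        raise ValueError("page_size must be at least 2")
    seen = set()      # filters the overlap between consecutive pages
    start_key = ""    # assumption: empty string means "start of the range"
    while True:
        page = fetch_page(start_key, page_size)
        for key in page:
            if key not in seen:
                seen.add(key)
                yield key
        # Done when nothing came back or the last key equals the start key.
        if not page or page[-1] == start_key:
            break
        start_key = page[-1]
```

The `seen` set grows with the number of keys, so for very large column families it may be preferable to drop only the first key of each page instead of tracking every key.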

Re: Thrift + PHP: help!

2010-08-19 Thread Dave Viner
I am a user of the perl api - so I'd like to lurk in case there are things that can benefit both perl & php. Dave Viner On Wed, Aug 18, 2010 at 1:35 PM, Gabriel Sosa wrote: > I would like to help with this too! > > > On Wed, Aug 18, 2010 at 5:15 PM, Bas Kok wrote: > &

Re: need help on cassandra client

2010-08-30 Thread Dave Viner
e now in gen-perl/Cassandra* You can copy those to your perl directory of choice. Once you have the libraries installed, you should be able to run the sample code from http://wiki.apache.org/cassandra/ThriftExamples#Perl to connect. HTH, Dave Viner On Sat, Aug 28, 2010 at 8:22 AM, Eric Evans

Re: Cassandra & HAProxy

2010-08-30 Thread Dave Viner
t to drop those nodes from the rotation. I'd be happy to help with this, as I know how it works with haproxy and standard web servers or other tcp servers. But, I'm not sure how to make it work with Cassandra, since, as Ben points out, it can return valid tcp responses (that say "error-

Re: Cassandra & HAProxy

2010-08-30 Thread Dave Viner
yond the simple "machine not reachable" issue and covers more common scenarios that temporarily impact service time, but aren't so drastic as to cause machine outage. Dave Viner On Mon, Aug 30, 2010 at 9:52 AM, Edward Capriolo wrote: > On Mon, Aug 30, 2010 at 12:40 PM, Dave Vi

Re: Monitoring with Cacti

2010-09-12 Thread Dave Viner
l and column family stats - https://support.cloudkick.com/Cassandra_Checks. Dave Viner On Fri, Sep 10, 2010 at 8:31 PM, Edward Capriolo wrote: > On Fri, Sep 10, 2010 at 7:29 PM, aaron morton > wrote: > > Am going through the rather painful process of trying to monitor > cassandra using

Re: Buildding a Ubuntu / Debian package for Cassandra

2010-09-16 Thread Dave Viner
Hi Francois, Any reason http://wiki.apache.org/cassandra/DebianPackaging isn't working for you? Dave Viner On Thu, Sep 16, 2010 at 10:30 AM, Francois Richard wrote: > Guys, > > > > I am trying to build a debian package in order to deploy Cassandra 0.6.5 on > Ubuntu.

Re: Dazed and confused with Cassandra on EC2 ...

2010-09-17 Thread Dave Viner
ng to put words in your mouth - but I want to make sure that I understand what you're asking about (because I have similar ec2-related thoughts). Let me know if this is an accurate summary. Dave Viner On Fri, Sep 17, 2010 at 7:41 AM, Jedd Rashbrooke < jedd.rashbro...@imagini.net> wr

Re: Advice on settings

2010-10-07 Thread Dave Viner
ou do not pay inbound-outbound fees for the data xfer. HTH, Dave Viner On Thu, Oct 7, 2010 at 10:26 AM, B. Todd Burruss wrote: > if you are updating columns quite rapidly, you will scatter the columns > over many sstables as you update them over time. this means that a read of > a s

Re: Cold boot performance problems

2010-10-08 Thread Dave Viner
Has anyone found solid step-by-step docs on how to raid0 the ephemeral disks in ec2 for use by Cassandra? On Fri, Oct 8, 2010 at 12:11 PM, Jason Horman wrote: > We are currently using EBS with 4 volumes striped with LVM. Wow, we > didn't realize you could raid the ephemeral disks. I thought the

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Dave Viner
extremely nice) guy and I'm sure he could make HBase bend to his will at any point. Dave Viner On Sun, Nov 21, 2010 at 4:16 PM, Todd Lipcon wrote: > On Sun, Nov 21, 2010 at 2:06 PM, Edward Ribeiro > wrote: > >> >> Also I believe saying HBASE is consistent is not tru