Re: Storing big objects into columns

2011-01-14 Thread Peter Schuller
> In a project I would like to store "big" objects in columns, serialized. For > example entire images (several KB to several MB), flash animations (several > MB) etc... > Does anyone use Cassandra with such relatively big columns, and if yes, does > it work well? Are there any drawbacks using this

Re: about the data directory

2011-01-14 Thread Peter Schuller
> As an administrator, I want to know why I can read the data from any node, > when the data is only kept on the replicas. Can you tell me? Thanks in advance. It's part of the point of Cassandra. You talk to the cluster, period. It's Cassandra's job to keep track of where data lives, and client app

Re: Usage Pattern : "unique" value of a key.

2011-01-14 Thread Oleg Anastasyev
> > You're right when you say it's unlikely that 2 threads have the same > timestamp, but it can happen. So it could work for user creation, but maybe > not on a more write intensive problem. Um, sorry, I thought you were solving the exact case of duplicate user creation. If you're trying to solve the concurre

RE: about the data directory

2011-01-14 Thread raoyixuan (Shandy)
Thanks very much -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Friday, January 14, 2011 4:40 PM To: user@cassandra.apache.org Subject: Re: about the data directory > as a administrator, I want to know why I can read the data from any n

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
It's possible that I am misunderstanding the question in some way. The row keys can be Time UUIDs, and with those row keys as column names, you can use the TimeUUIDType comparator to have them sorted by time automatically. On Fri, Jan 14, 2011 at 9:18 AM, Aaron Morton wrote: > You could make the time a
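
A minimal sketch of why a time-aware comparator matters for the suggestion above. It builds two version 1 (time-based) UUIDs by hand and orders them two ways: by embedded timestamp, which is roughly what a TimeUUIDType comparator is understood to do, and by raw bytes, which is what a plain byte comparator would do. The helper time_uuid, the node value, and the row keys are hypothetical and exist only for illustration; no Cassandra client is involved.

```python
import uuid


def time_uuid(timestamp_100ns, clock_seq=0, node=0x123456789ABC):
    """Build a version 1 UUID from a 60-bit timestamp (hypothetical helper)."""
    time_low = timestamp_100ns & 0xFFFFFFFF
    time_mid = (timestamp_100ns >> 32) & 0xFFFF
    time_hi_version = ((timestamp_100ns >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Index row: column name = TimeUUID, column value = key of a row elsewhere.
older = time_uuid(0x0FFFFFFFF)  # earlier timestamp
newer = time_uuid(0x100000000)  # one tick later
index_row = {newer: "row-key-B", older: "row-key-A"}

# Order by the timestamp embedded in each UUID (time-based comparison).
by_time = [index_row[u] for u in sorted(index_row, key=lambda u: u.time)]

# Order by raw bytes (plain byte comparison); the low timestamp bits come
# first in the byte layout, so this order does not follow time.
by_bytes = [index_row[u] for u in sorted(index_row, key=lambda u: u.bytes)]

print(by_time)
print(by_bytes)
```

The two orderings differ here because the least significant timestamp bits are stored first in a version 1 UUID, so comparing bytes left to right does not compare timestamps.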

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-14 Thread Jairam Chandar
The cassandra logs strangely show no errors at the time of failure. Changing the RPCTimeoutInMillis seemed to help. Though it slowed down the job considerably, it seems to be finishing by changing the timeout value to 1 min. Unfortunately, I cannot be sure if it will continue to work if the data in

Different comparator types for column and supercolumn don't work

2011-01-14 Thread Karin Kirsch
Hello, I'm new to cassandra. I'm using cassandra release 0.7.0 (local, single node). I can't perform write operations in case the column and supercolumn families have different comparator types. For example if I use the code given in Issue: https://issues.apache.org/jira/browse/CASSANDRA-1712 by

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
@Roshan Yes, I thought about that, but then I wouldn't be able to use the Random Partitioner. @Aaron Do you mean like this: 'timeUUID+ row_key' as the supercolumn names? Then, when retrieving the row_key from this column name, will I be required to parse the name? How do I do that exactly? >So

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
On Fri, Jan 14, 2011 at 7:15 PM, Aklin_81 wrote: > @Roshan > Yes, I thought about that, but then I wouldn't be able to use the > Random Partitioner. > > Can you please expand a bit on this? What is this restriction? Can you point me to some relevant documentation on this? Thanks.

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Rajkumar Gupta
I am not sure but I guess because all the rows of certain time range will go to just one node & will not be evenly distributed because the timeUUID will not be random but sequential according to time... I am not sure anyways... On Fri, Jan 14, 2011 at 7:18 PM, Roshan Dawrani wrote: > On Fri, Jan

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
I too believed so! but not totally sure. On 1/14/11, Rajkumar Gupta wrote: > I am not sure but I guess because all the rows of certain time range will go > to just one node & will not be evenly distributed because the timeUUID will > not be random but sequential according to time... I am not sur

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
I am not clear what you guys are trying to do and say :-) So, let's take some specifics... Say you want to create rows in some column family (say CF_A), and as you create them, you want to store their row key in column names in some other column family (say CF_B) - possibly for filtering keys bas

Problem starting Cassandra on Ubuntu

2011-01-14 Thread kh jo
Hi, I just installed Cassandra on Ubuntu using the package manager, but I cannot start it. I get the following error in the logs: INFO [main] 2011-01-14 15:37:49,758 AbstractCassandraDaemon.java (line 74) Heap size: 1051525120/1051525120 WARN [main] 2011-01-14 15:37:49,826 CLibrary.java (line 73) O

Re: Different comparator types for column and supercolumn don't work

2011-01-14 Thread Roshan Dawrani
On Fri, Jan 14, 2011 at 6:04 PM, Karin Kirsch wrote: > Hello, > > I'm new to cassandra. I'm using cassandra release 0.7.0 (local, single > node). I can't perform write operations in case the column and supercolumn > families have > different comparator types. For example if I use the code given in

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
I just read that Cassandra internally creates an MD5 hash of the key, which is used to distribute the load by sending the row to the node responsible for the range within which that MD5 hash falls. So even when we create sequential keys, their MD5 hashes are not the same, and hence they are not all sent to the same node. This was
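
A rough sketch of the point above: sequential row keys hash to scattered tokens, so they spread across nodes. It assumes a simplified model of the Random Partitioner (token = absolute value of the key's MD5 digest, each node owning the token range ending at its own token); the four-node ring, the evenly spaced node tokens, and the key names are made up for illustration.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

RING_SIZE = 2 ** 127  # tokens in this model fall in 0 .. 2**127


def token_for(key):
    """Token = absolute value of the MD5 digest read as a signed integer."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def node_for(token, node_tokens):
    """Index of the first node whose token is >= the key token, wrapping around."""
    index = bisect_left(node_tokens, token)
    return index % len(node_tokens)


# Hypothetical four-node ring with evenly spaced tokens.
node_tokens = [i * RING_SIZE // 4 for i in range(4)]

# Sequential keys, as in the discussion.
keys = ["user%04d" % i for i in range(1000)]
placement = Counter(node_for(token_for(k), node_tokens) for k in keys)

# Number of keys landing on each node.
print(sorted(placement.items()))
```

Each node should end up with roughly a quarter of the keys, even though the keys themselves are consecutive.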

Re: limiting columns in a row

2011-01-14 Thread Sylvain Lebresne
Hi, > does this seem like a generally useful feature? I do think this could be a useful feature. If only because I don't think there is any satisfactory/efficient way to do this client side. > if so, would it be hard to implement (maybe it could be done at compaction > time like the TTL feature)

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
On Fri, Jan 14, 2011 at 8:51 PM, Aklin_81 wrote: > I just read that cassandra internally creates an MD5 hash that is used > for distributing the load by sending it to a node responsible for the > range within which that MD5 hash falls, so even when we create > sequential keys, their MD5 hash is not

live data migration from mysql to cassandra

2011-01-14 Thread ruslan usifov
Hello. Dear community, please share your experience: how did you do a live (without downtime) migration from MySQL or another RDBMS to Cassandra?

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
No, you do not need to shut up, please! :) you may be clearing up my further misconceptions on the topic! Anyways, the link b/w 1st and 2nd para was that since the rows distribution among nodes is not affected by key(as you rightly said) but by md5 hash of the key thus I can use just any key incl

Re: live data migration from mysql to cassandra

2011-01-14 Thread Edward Capriolo
On Fri, Jan 14, 2011 at 10:40 AM, ruslan usifov wrote: > Hello > > Dear community please share your experience, how you make live (without > stop) migration from mysql or other RDBMS to cassandra > There is no built-in way to do this. I remember hearing at Hadoop World this year that the hbase guy

Re: live data migration from mysql to cassandra

2011-01-14 Thread Victor Kabdebon
I personally did it the other way around: from Cassandra to PostgreSQL. I needed a hybrid system: Cassandra solidly holds all data while PostgreSQL holds less data, but requests are simple and efficient (with SELECT WHERE). This is pretty easy once you master key browsing and iterating. I thin

Re: live data migration from mysql to cassandra

2011-01-14 Thread Victor Kabdebon
gosh, sorry for the mistakes I am tired ! Victor K. 2011/1/14 Victor Kabdebon > I personnally did it the other way around : from Cassandra to PostGreSQL, I > needed an hybrid system : Cassandra solidly holds all data while PostGreSQL > holds fewer data but request are simple and efficient ( wit

Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ertio Lew
Hey, If you have a site in production environment or considering so, what is the client that you use to interact with Cassandra. I know that there are several clients available out there according to the language you use but I would love to know what clients are being used widely in production env

Re: cassandra row cache

2011-01-14 Thread Mike Malone
Digest reads could be being dropped..? On Thu, Jan 13, 2011 at 4:11 PM, Jonathan Ellis wrote: > On Thu, Jan 13, 2011 at 2:00 PM, Edward Capriolo > wrote: > > Is it possible that you are reading at READ.ONE and that READ.ONE > > only warms cache on 1 of your three nodes. 2nd read warms anot

Re: Cassandra freezes under load when using libc6 2.11.1-0ubuntu7.5

2011-01-14 Thread Mike Malone
That's interesting. For us, the 7.5 version of libc was causing problems. Either way, I'm looking forward to hearing about anything you find. Mike On Thu, Jan 13, 2011 at 11:47 PM, Erik Onnen wrote: > Too similar to be a coincidence I'd say: > > Good node (old AZ): 2.11.1-0ubuntu7.5 > Bad node

Re: cassandra row cache

2011-01-14 Thread Jonathan Ellis
That's possible, yes. He'd want to make sure there aren't any of those WARN messages in the logs. On Fri, Jan 14, 2011 at 11:46 AM, Mike Malone wrote: > Digest reads could be being dropped..? > > On Thu, Jan 13, 2011 at 4:11 PM, Jonathan Ellis wrote: >> >> On Thu, Jan 13, 2011 at 2:00 PM, Edwar

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ran Tavory
I use Hector, if that counts. .. On Jan 14, 2011 7:25 PM, "Ertio Lew" wrote: > Hey, > > If you have a site in production environment or considering so, what > is the client that you use to interact with Cassandra. I know that > there are several clients available out there according to the > lang

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ertio Lew
What technology stack do you use? On 1/14/11, Ran Tavory wrote: > I use Hector, if that counts. .. > On Jan 14, 2011 7:25 PM, "Ertio Lew" wrote: >> Hey, >> >> If you have a site in production environment or considering so, what >> is the client that you use to interact with Cassandra. I

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ran Tavory
Java On Jan 14, 2011 8:25 PM, "Ertio Lew" wrote: > what is the technology stack do you use? > > On 1/14/11, Ran Tavory wrote: >> I use Hector, if that counts. .. >> On Jan 14, 2011 7:25 PM, "Ertio Lew" wrote: >>> Hey, >>> >>> If you have a site in production environment or considering so, what >

phpcassa never return(infinite loop)?!!!

2011-01-14 Thread kh jo
I am trying to use phpcassa. I use the following example:
CassandraConn::add_node('localhost', 9160);
$users = new CassandraCF('rhg', 'Users'); // ColumnFamily
$users->insert('1', array('email' => 't...@example.com', 'password' => 'test'));
When I run it, it never returns, and apache p

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Victor Kabdebon
Same here Hector + java Best Regards, Victor K 2011/1/14 Ran Tavory > Java > On Jan 14, 2011 8:25 PM, "Ertio Lew" wrote: > > what is the technology stack do you use? > > > > On 1/14/11, Ran Tavory wrote: > >> I use Hector, if that counts. .. > >> On Jan 14, 2011 7:25 PM, "Ertio Lew" wrote: >

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Mike Wynholds
We have one in production with Ruby / fauna Cassandra gem and Cassandra 0.6.x. The project is live but is stuck in a sort of private beta, so it hasn't really been run through any load scenarios. ..mike.. -- Michael Wynholds | Carbon Five | 310.821.7125 x13 | m...@carbonfive.com On Fri, Jan 1

Re: phpcassa never return(infinite loop)?!!!

2011-01-14 Thread Tyler Hobbs
Answered in the phpcassa ML here: http://groups.google.com/group/phpcassa/browse_thread/thread/2771112a323860f7 - Tyler On Fri, Jan 14, 2011 at 12:36 PM, kh jo wrote: > I am trying to use phpcasse > > I use the following example > > > CassandraConn::add_node('localhost', 9160); > $users = new

Cassandra in less than 1G of memory?

2011-01-14 Thread Rajat Chopra
Hello. According to the JVM heap size topic at http://wiki.apache.org/cassandra/MemtableThresholds , Cassandra would need at least 1G of memory to run. Is it possible to have a running Cassandra cluster with machines that have less than that memory... say 512M? I can live with slow transactions, no

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Mark Moseley
On Thu, Jan 13, 2011 at 2:32 PM, Mark Moseley wrote: > On Thu, Jan 13, 2011 at 1:08 PM, Gary Dusbabek wrote: >> It is impossible to properly bootstrap a new node into a system where >> there are not enough nodes to satisfy the replication factor.  The >> cluster as it stands doesn't contain all t

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Dan Kuebrich
We've done hundreds of gigs in and out of cassandra 0.6.8 with pycassa 0.3. Working on upgrading to 0.7 and pycassa 1.03. I don't know if we're using it wrong, but the "connection object is tied to a particular keyspace" constraint isn't that awesome--we have a number of keyspaces used simultaneo

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Victor Kabdebon
Dear Rajat, Yes, it is possible; I have the same constraints. However, I must warn you: from what I see, Cassandra memory consumption is not bounded in 0.6.X on Debian 64-bit. Here is an example of an instance launched on a node: root 19093 0.1 28.3 1210696 *570052* ? Sl Jan11 9:08 /usr

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Mark Moseley
> Perhaps the better question would be, if I have a two node cluster and > I want to be able to lose one box completely and replace it (without > losing the cluster), what settings would I need? Or is that an > impossible scenario? In production, I'd imagine a 3 node cluster being > the minimum but

Cassandra-Maven-Plugin

2011-01-14 Thread Stephen Connolly
OK, I nearly have the Cassandra-Maven-Plugin ready. It has the following goals: run: launches Cassandra in the foreground and blocks until you press ^C at which point Maven terminates. Use-case: Running integration tests from your IDE. Live development from your IDE. start: launches Cassandr

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Aaron Morton
Here's some slides I did last year that have a simple explanation of RF http://www.slideshare.net/mobile/aaronmorton/well-railedcassandra24112010-5901169 Short version is, generally no single node contains all the data in the db. Normally the RF is going to be less than the number of nodes, and
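
A small sketch of the replication factor (RF) point above, assuming a simplified ring where each row's replicas are its primary node plus the next nodes clockwise, and where primaries are spread evenly. The function names are hypothetical; this is a toy model, not Cassandra's actual placement code.

```python
def replicas_for(primary, node_count, replication_factor):
    """Primary node plus the next nodes clockwise on the ring."""
    copies = min(replication_factor, node_count)
    return [(primary + i) % node_count for i in range(copies)]


def share_of_data_per_node(node_count, replication_factor):
    """Fraction of all rows each node stores, assuming evenly spread primaries."""
    held = [0] * node_count
    for primary in range(node_count):
        for node in replicas_for(primary, node_count, replication_factor):
            held[node] += 1
    return [count / node_count for count in held]


# RF below the node count: no single node holds everything.
print(share_of_data_per_node(node_count=4, replication_factor=2))

# RF equal to the node count: every node holds a full copy.
print(share_of_data_per_node(node_count=2, replication_factor=2))
```

In this model, four nodes with RF 2 each hold half of the data, while two nodes with RF 2 each hold all of it, which is the case relevant to the two-node question in this thread.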

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Mark Moseley
On Fri, Jan 14, 2011 at 4:29 PM, Aaron Morton wrote: > Here's some slides I did last year that have a simple explanation of RF > http://www.slideshare.net/mobile/aaronmorton/well-railedcassandra24112010-5901169 > > Short version is, generally no single node contains all the data in the db. > Norm

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Edward Capriolo
On Fri, Jan 14, 2011 at 2:13 PM, Victor Kabdebon wrote: > Dear rajat, > > Yes it is possible, I have the same constraints. However I must warn you, > from what I see Cassandra memory consumption is not bounded in 0.6.X on > debian 64 Bit > > Here is an example of an instance launch in a node : > >

is it possible to map an one from a a file and an one from cassandra?

2011-01-14 Thread 김준영
hi, cassandra supports hadoop to map & reduce from cassandra. now I am digging to find out a way to map from a file and cassandra together. I mean if both of them are files in my disk, it is possible by using splits. but, in this kind of a situation, which way is possible? for example. in

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Jonathan Ellis
mmapping only consumes memory that the OS can afford to feed it. On Fri, Jan 14, 2011 at 7:29 PM, Edward Capriolo wrote: > On Fri, Jan 14, 2011 at 2:13 PM, Victor Kabdebon > wrote: >> Dear rajat, >> >> Yes it is possible, I have the same constraints. However I must warn you, >> from what I see C

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Victor Kabdebon
Hi Jonathan, hi Edward, Jonathan: but it looks like mmapping wants to consume the entire memory of my server. It goes up to 1.7 GB for a ridiculously small amount of data. Am I doing something wrong or is there something I should change to prevent this never ending increase of memory consumption ?