Can Cassandra make real use of several DataFileDirectories?
I have a configuration like this in storage-conf.xml:

<DataFileDirectories>
    <DataFileDirectory>/storage01/cassandra/data</DataFileDirectory>
    <DataFileDirectory>/storage02/cassandra/data</DataFileDirectory>
    <DataFileDirectory>/storage03/cassandra/data</DataFileDirectory>
</DataFileDirectories>

After loading a big chunk of data into Cassandra, I end up with some 70GB in the first directory, and only about 10GB in the second and third ones. All rows are quite small, so it's not just some big rows that contain the majority of the data. Does Cassandra have the ability to 'see' the maximum available space in these directories? I'm asking myself this question since my limit is 100GB per directory, and the first directory is approaching this limit... And wouldn't it be better if Cassandra tried to 'load-balance' the files across the directories? That would result in better (read) performance if the directories are on different disks (which is the case for me). Any help is appreciated. Roland
Re: how to store file in the cassandra?
Hi Jonathan, Cassandra does not seem to have a Blob data type; to handle binary large object data we have to use arrays of bytes. I have a question for you. Suppose I have a 15 MB MPEG video file. To save this video file in the Cassandra database I will store it as an array of bytes. One day I decide this video is no longer needed, so I delete it from the database. My question is: after I delete this video from the Cassandra database, do I need to perform some defragmentation operation on Cassandra's data files? Thank you. On Mon, Apr 26, 2010 at 8:28 AM, Jonathan Ellis wrote: > Cassandra stores byte arrays. You can certainly store file data in > it, although if it is larger than a few MB you should chunk it into > multiple columns. > > On Sun, Apr 25, 2010 at 8:21 PM, Shuge Lee wrote: > > Yes. > > > > Cassandra does save raw string data only, not a file, and shouldn't save > a > > file. > > > > 2010/4/26 刘兵兵 > >> > >> sorry i'm not very familiar with python, are you meaning that the files > >> are stored in the file system of the os? > >> > >> then , the cassandra just stores the path to access the files? > >> > >> > >> On Mon, Apr 26, 2010 at 8:57 AM, Shuge Lee wrote: > >>> > >>> In Python: > >>> > >>> keyspace.columnfamily[key][column] = value > >>> > >>> files.video[uuid.uuid4()]['name'] = 'foo.flv' > >>> files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv' > >>> > >>> create a mapping > >>> files.video = { > >>> uuid.uuid4() : { > >>> 'name' : 'foo.flv', > >>> 'path' : '/var/files/foo.flv', > >>> } > >>> } > >>> > >>> if most of sizes >= 0.5MB, use sys-fs/reiser4progs, else use ext4. > >>> > >>> > >>> 2010/4/26 Bingbing Liu > > any suggestion? > > 2010-04-26 > > Bingbing Liu > >>> > >>> > >>> -- > >>> Shuge Lee | Lee Li | 李蠡 > >> > >> > >> > >> -- > >> Bingbing Liu > >> > >> Web and Mobile Data Management lab > >> > >> Renmin University of China > > > > > > > > -- > > Shuge Lee | Lee Li | 李蠡 > > >
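To make the chunking advice in the reply above concrete, here is a minimal sketch in Python of splitting a large file into column-sized pieces before insertion. The chunk size, the 'chunk-NNNNNNNN' column-naming scheme, and the client call in the final comment are assumptions made for illustration, not part of any particular Cassandra client API.

CHUNK_SIZE = 1 * 1024 * 1024  # assumed 1 MB per column

def file_to_columns(path):
    """Read a file and return a {column_name: chunk_bytes} dict for a single row."""
    columns = {}
    with open(path, 'rb') as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            # zero-padded names keep the chunks ordered under a bytes/ASCII comparator
            columns['chunk-%08d' % index] = chunk
            index += 1
    return columns

columns = file_to_columns('video.mpg')
# hypothetical client call: client.batch_insert('Files', 'video-key', {'FileData': columns})

Reassembling the file is then just a matter of reading the columns back in name order and concatenating their values. Deleted data is reclaimed later by compaction, so no manual defragmentation step is involved.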
Re: when i use the OrderPreservingPartition, the load is very imbalance
1) you can re-balance a node with bin/nodetool -h token [] specify a new token manually or let the system guess one. 2) take a look into your system.log to find out why your nodes are dying. 2010/4/26 刘兵兵 > i do some INSERT ,because i will do some scan operations, i use the > OrderPreservingPartition method. > > the state of the cluster is showed below. > > as i predicated the load is very imbalance, and some of the nodes down (in > some nodes,the Cassandra processes died and in others the processes are > > alive but they still down), > > so i have two questions: > > 1)how to do balance after the insert ends? > > 2)why the nodes died? how to make them up again (when the situation is that > the process is alive but the node'state is down) > > thx > > 10.37.17.241 Up 47.65 GB > 0p6ovvUXMJ4cdd1L |<--| > 10.37.17.234 Up 67.41 GB > 5OxiS2DKBZLeISPg | ^ > 10.37.17.235 Up 67.54 GB > 7UDcS0SToePuQACe v | > 10.37.17.246 Up 555 bytes > OCvC3nqKLeKA5n0I | ^ > 10.37.17.233 Up 830 bytes > SJp6cQRNox52av2Y v | > 10.37.17.249 Up 830 bytes > SxVmCVcruOpoS48B | ^ > 10.37.17.247 Up 555 bytes > TGctCMvfNuRo7RjS v | > 10.37.17.245 Up 555 bytes > j2smY0OOtQ0SeeHY | ^ > 10.37.17.250 Up 830 bytes > jNwBPchW58i5tGxp v | > 10.37.17.248 Up 830 bytes > jYWaJC93OyMdWDaN | ^ > 10.37.17.237 Up 830 bytes > mPwhLOsKlbPart6j v | > 10.37.17.236 Up 830 bytes > noh0t8HJgw4hmz7I | ^ > 10.37.17.244 Up 555 bytes > q8c8SPYEkWEzmFcR v | > 10.37.17.238 Up 555 bytes > rIuuq3AR4DVK989X | ^ > 10.37.17.242 Up 555 bytes > smebTmIvQBMG56Zf v | > 10.37.17.243 Up 555 bytes > tWTYyiqAKQVw7197 | ^ > 10.37.17.232 Up 830 bytes > uVdBQkR9Dszm5deK v | > 10.37.17.239 Up 555 bytes > xXQkDQn1vvg8e1xS | ^ > 10.37.17.240 Up 555 bytes > yQRrq9RG2dUsHUyR |-->| > > > -- > Bingbing Liu > > Web and Mobile Data Management lab > > Renmin University of China >
Re: how to store file in the cassandra?
On 26 April 2010 00:57, Shuge Lee wrote: > In Python: > > keyspace.columnfamily[key][column] = value > > files.video[uuid.uuid4()]['name'] = 'foo.flv' > files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv' > Hi. Storing the filename in the database will not solve the file storage problem. Cassandra is a distributed database, and a file stored locally will not be available on other client nodes. If you're using Cassandra at all, that probably implies that you have lots of client nodes. A non-redundant NFS server (for example) would not offer high availability, so would be inadequate for the OP's situation. Storing files *IN* Cassandra is very useful because you can then retrieve them from anywhere with high availability. However, as others have discussed, they should be split across multiple columns, or if very big, multiple rows. I prefer to split by row because this scales better to very large files. During compaction, as is well noted, Cassandra needs the entire row in memory, which will cause a FAIL once you have files more than a few gigs. Mark
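Following on from the row-splitting preference above, a second sketch (again Python; the chunk size and all names are assumptions) shows the alternative layout in which every chunk becomes its own row, so no single row ever grows large enough to matter at compaction time.

CHUNK_SIZE = 4 * 1024 * 1024  # assumed 4 MB per chunk/row

def file_to_rows(file_id, path):
    """Yield (row_key, chunk_bytes) pairs; each pair is written as a separate row."""
    with open(path, 'rb') as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            # zero-padded indexes keep the chunk rows in scan order under an
            # order-preserving partitioner
            yield ('%s:%010d' % (file_id, index), chunk)
            index += 1

for row_key, chunk in file_to_rows('video-123', 'big-video.mpg'):
    pass  # hypothetical client call: client.insert('FileChunks', row_key, {'data': chunk})

A small per-file metadata row (total size, chunk count) makes retrieval and deletion straightforward.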
Re: when i use the OrderPreservingPartition, the load is very imbalance
sorry, if specifying the token manually, use: bin/nodetool -h move 2010/4/26 Roland Hänel > 1) you can re-balance a node with > > bin/nodetool -h token [] > > specify a new token manually or let the system guess one. > > 2) take a look into your system.log to find out why your nodes are dying. > > > 2010/4/26 刘兵兵 > > i do some INSERT ,because i will do some scan operations, i use the >> OrderPreservingPartition method. >> >> the state of the cluster is showed below. >> >> as i predicated the load is very imbalance, and some of the nodes down (in >> some nodes,the Cassandra processes died and in others the processes are >> >> alive but they still down), >> >> so i have two questions: >> >> 1)how to do balance after the insert ends? >> >> 2)why the nodes died? how to make them up again (when the situation is >> that the process is alive but the node'state is down) >> >> thx >> >> 10.37.17.241 Up 47.65 GB >> 0p6ovvUXMJ4cdd1L |<--| >> 10.37.17.234 Up 67.41 GB >> 5OxiS2DKBZLeISPg | ^ >> 10.37.17.235 Up 67.54 GB >> 7UDcS0SToePuQACe v | >> 10.37.17.246 Up 555 bytes >> OCvC3nqKLeKA5n0I | ^ >> 10.37.17.233 Up 830 bytes >> SJp6cQRNox52av2Y v | >> 10.37.17.249 Up 830 bytes >> SxVmCVcruOpoS48B | ^ >> 10.37.17.247 Up 555 bytes >> TGctCMvfNuRo7RjS v | >> 10.37.17.245 Up 555 bytes >> j2smY0OOtQ0SeeHY | ^ >> 10.37.17.250 Up 830 bytes >> jNwBPchW58i5tGxp v | >> 10.37.17.248 Up 830 bytes >> jYWaJC93OyMdWDaN | ^ >> 10.37.17.237 Up 830 bytes >> mPwhLOsKlbPart6j v | >> 10.37.17.236 Up 830 bytes >> noh0t8HJgw4hmz7I | ^ >> 10.37.17.244 Up 555 bytes >> q8c8SPYEkWEzmFcR v | >> 10.37.17.238 Up 555 bytes >> rIuuq3AR4DVK989X | ^ >> 10.37.17.242 Up 555 bytes >> smebTmIvQBMG56Zf v | >> 10.37.17.243 Up 555 bytes >> tWTYyiqAKQVw7197 | ^ >> 10.37.17.232 Up 830 bytes >> uVdBQkR9Dszm5deK v | >> 10.37.17.239 Up 555 bytes >> xXQkDQn1vvg8e1xS | ^ >> 10.37.17.240 Up 555 bytes >> yQRrq9RG2dUsHUyR |-->| >> >> >> -- >> Bingbing Liu >> >> Web and Mobile Data Management lab >> >> Renmin University of China >> > >
Re: when i use the OrderPreservingPartition, the load is very imbalance
On 26 April 2010 01:18, 刘兵兵 wrote: > i do some INSERT ,because i will do some scan operations, i use the > OrderPreservingPartition method. > > the state of the cluster is showed below. > > as i predicated the load is very imbalance I think the solution to this would be to choose your nodes' tokens wisely before you start inserting data, and if possible, modify the keys to split them better between the nodes. For example, suppose your key has two parts, one of which you want to range scan and another which you don't. Say you have a customer_id and a timestamp. The customer ID does not need to be range scanned, so you can hash it into a hex value (say), then append the timestamp (in a lexically sortable way of course). So you'd end up with keys like <hash>-0012345-0001234567890 where <hash> is a hash of the customer ID, 0012345 is the customer ID, and the rest is a timestamp. You'd be able to do a time range scan by using the known prefixes, and distributing your nodes' tokens equally from <hash> = 00... to <hash> = ff... would result in fairly even data (provided you don't have a very small number of very large customers). Mark
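A small sketch of the key scheme described above, in Python. The hash function (MD5), the 8-character prefix length, and the timestamp width are arbitrary choices made for illustration; any stable hash with a fixed-width hex prefix works the same way.

import hashlib
import time

def make_key(customer_id, ts_millis=None):
    """Build an order-preserving key of the form <hash>-<customer_id>-<timestamp>."""
    if ts_millis is None:
        ts_millis = int(time.time() * 1000)
    # a fixed-width hex prefix spreads customers evenly over the token range
    prefix = hashlib.md5(str(customer_id).encode('utf-8')).hexdigest()[:8]
    # zero-padded, fixed-width fields keep the whole key lexically sortable
    return '%s-%07d-%013d' % (prefix, customer_id, ts_millis)

print(make_key(12345))  # e.g. 'a6f1b2c3-0012345-1272300000000' (illustrative output)

A time-range scan for one customer then uses keys built from the same prefix and customer ID, with the desired start and end timestamps as the range bounds.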
Re: when i use the OrderPreservingPartition, the load is very imbalance
When starting your cassandra cluster, please configure the InitialToken for each node, which make the key range balance. On Mon, Apr 26, 2010 at 6:17 PM, Mark Robson wrote: > On 26 April 2010 01:18, 刘兵兵 wrote: > >> i do some INSERT ,because i will do some scan operations, i use the >> OrderPreservingPartition method. >> >> the state of the cluster is showed below. >> >> as i predicated the load is very imbalance > > > > I think the solution to this would be to choose your nodes' tokens wisely > before you start inserting data, and if possible, modify the keys to split > them better between the nodes. > > For example, if your key has two parts, one of which you want to range > scan, another which you don't. Say you have customer_id and a timestamp. The > customer ID does not need to be range scanned, so you can hash it into a hex > value (say), then append the timestamp (in a lexically sortable way of > course). So you'd end up with keys like > > -0012345-0001234567890 > > Where is a hash of the customer ID, 0012345 is the customer ID, and > the rest is a timestamp. > > You'd be able to do a time range scan by using the known prefixes, and > distributing your nodes equally from to would result in fairly > even data (provided you don't have a very small number of very large > customers). > > Mark >
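To make the InitialToken advice concrete for the hashed-prefix keys discussed in this thread: evenly spaced hex prefixes are reasonable token choices. A tiny sketch (Python; the two-hex-digit granularity is just an assumption) computes them for an N-node cluster, and each value would go into the corresponding node's InitialToken setting in storage-conf.xml.

def initial_tokens(num_nodes):
    """Return num_nodes token strings spread evenly over the 00..ff hex prefix space."""
    return ['%02x' % (i * 256 // num_nodes) for i in range(num_nodes)]

print(initial_tokens(4))  # ['00', '40', '80', 'c0']

With OrderPreservingPartitioner the tokens are compared as strings, so each node then owns roughly an equal slice of the hashed key space.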
Re: Re: when i use the OrderPreservingPartition, the load is very imbalance
thank you so much for your help! 2010-04-26 Bingbing Liu From: Mark Robson Sent: 2010-04-26 18:17:53 To: user Cc: Subject: Re: when i use the OrderPreservingPartition, the load is very imbalance On 26 April 2010 01:18, 刘兵兵 wrote: i do some INSERT ,because i will do some scan operations, i use the OrderPreservingPartition method. the state of the cluster is showed below. as i predicated the load is very imbalance I think the solution to this would be to choose your nodes' tokens wisely before you start inserting data, and if possible, modify the keys to split them better between the nodes. For example, if your key has two parts, one of which you want to range scan, another which you don't. Say you have customer_id and a timestamp. The customer ID does not need to be range scanned, so you can hash it into a hex value (say), then append the timestamp (in a lexically sortable way of course). So you'd end up with keys like <hash>-0012345-0001234567890 where <hash> is a hash of the customer ID, 0012345 is the customer ID, and the rest is a timestamp. You'd be able to do a time range scan by using the known prefixes, and distributing your nodes' tokens equally from <hash> = 00... to <hash> = ff... would result in fairly even data (provided you don't have a very small number of very large customers). Mark
Re: Can Cassandra make real use of several DataFileDirectories?
Please refer to the code:

org.apache.cassandra.db.ColumnFamilyStore

    public String getFlushPath()
    {
        long guessedSize = 2 * DatabaseDescriptor.getMemtableThroughput() * 1024*1024; // 2* adds room for keys, column indexes
        String location = DatabaseDescriptor.getDataFileLocationForTable(table_, guessedSize);
        if (location == null)
            throw new RuntimeException("Insufficient disk space to flush");
        return new File(location, getTempSSTableFileName()).getAbsolutePath();
    }

and we can go through org.apache.cassandra.config.DatabaseDescriptor:

    public static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
    {
        long maxFreeDisk = 0;
        int maxDiskIndex = 0;
        String dataFileDirectory = null;
        String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);

        for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ )
        {
            File f = new File(dataDirectoryForTable[i]);
            if( maxFreeDisk < f.getUsableSpace())
            {
                maxFreeDisk = f.getUsableSpace();
                maxDiskIndex = i;
            }
        }
        // Load factor of 0.9 we do not want to use the entire disk that is too risky.
        maxFreeDisk = (long)(0.9 * maxFreeDisk);
        if( expectedCompactedFileSize < maxFreeDisk )
        {
            dataFileDirectory = dataDirectoryForTable[maxDiskIndex];
            currentIndex = (maxDiskIndex + 1 )%dataDirectoryForTable.length ;
        }
        else
        {
            currentIndex = maxDiskIndex;
        }
        return dataFileDirectory;
    }

So, DataFileDirectories is meant for multiple disks or disk partitions: each flush is written to whichever directory currently has the most usable space. I think your storage01, storage02 and storage03 are on the same disk or disk partition.

2010/4/26 Roland Hänel > I have a configuration like this: > > > /storage01/cassandra/data > /storage02/cassandra/data > /storage03/cassandra/data > > > After loading a big chunk of data into cassandra, I end up wich some 70GB > in the first directory, and only about 10GB in the second and third one. All > rows are quite small, so it's not just some big rows that contain the > majority of data. > > Does Cassandra have the ability to 'see' the maximum available space in > these directory? I'm asking myself this question since my limit is 100GB, > and the first directory is approaching this limit... > > And, wouldn't it be better if Cassandra tried to 'load-balance' the files > inside the directories because this will result in better (read) performance > if the directories are on different disks (which is the case for me)? > > Any help is appreciated. > > Roland > >
Re: Can Cassandra make real use of several DataFileDirectories?
Thanks very much. Precisely answers my questions. :-) 2010/4/26 Schubert Zhang > Please refer the code: > > org.apache.cassandra.db.ColumnFamilyStore > > public String getFlushPath() > { > long guessedSize = 2 * DatabaseDescriptor.getMemtableThroughput() * > 1024*1024; // 2* adds room for keys, column indexes > String location = > DatabaseDescriptor.getDataFileLocationForTable(table_, guessedSize); > if (location == null) > throw new RuntimeException("Insufficient disk space to flush"); > return new File(location, > getTempSSTableFileName()).getAbsolutePath(); > } > > and we can go through org.apache.cassandra.config.DatabaseDescriptor: > > public static String getDataFileLocationForTable(String table, long > expectedCompactedFileSize) > { > long maxFreeDisk = 0; > int maxDiskIndex = 0; > String dataFileDirectory = null; > String[] dataDirectoryForTable = > getAllDataFileLocationsForTable(table); > > for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ ) > { > File f = new File(dataDirectoryForTable[i]); > if( maxFreeDisk < f.getUsableSpace()) > { > maxFreeDisk = f.getUsableSpace(); > maxDiskIndex = i; > } > } > // Load factor of 0.9 we do not want to use the entire disk that is > too risky. > maxFreeDisk = (long)(0.9 * maxFreeDisk); > if( expectedCompactedFileSize < maxFreeDisk ) > { > dataFileDirectory = dataDirectoryForTable[maxDiskIndex]; > currentIndex = (maxDiskIndex + 1 )%dataDirectoryForTable.length ; > } > else > { > currentIndex = maxDiskIndex; > } > return dataFileDirectory; > } > > So, DataFileDirectories means multiple disks or disk-partitions. > I think your storage01, storage02 and storage03 are in same disk or disk > partition. > > > 2010/4/26 Roland Hänel > > I have a configuration like this: >> >> >> /storage01/cassandra/data >> /storage02/cassandra/data >> /storage03/cassandra/data >> >> >> After loading a big chunk of data into cassandra, I end up wich some 70GB >> in the first directory, and only about 10GB in the second and third one. All >> rows are quite small, so it's not just some big rows that contain the >> majority of data. >> >> Does Cassandra have the ability to 'see' the maximum available space in >> these directory? I'm asking myself this question since my limit is 100GB, >> and the first directory is approaching this limit... >> >> And, wouldn't it be better if Cassandra tried to 'load-balance' the files >> inside the directories because this will result in better (read) performance >> if the directories are on different disks (which is the case for me)? >> >> Any help is appreciated. >> >> Roland >> >> >
Re: when i use the OrderPreservingPartition, the load is very imbalance
Hello Mark, On 26/04/2010, at 07:17, Mark Robson wrote: > I think the solution to this would be to choose your nodes' tokens wisely > before you start inserting data, and if possible, modify the keys to split > them better between the nodes. > > For example, if your key has two parts, one of which you want to range scan, > another which you don't. Say you have customer_id and a timestamp. The > customer ID does not need to be range scanned, so you can hash it into a hex > value (say), then append the timestamp (in a lexically sortable way of > course). So you'd end up with keys like > > <hash>-0012345-0001234567890 > > Where <hash> is a hash of the customer ID, 0012345 is the customer ID, and the > rest is a timestamp. > > You'd be able to do a time range scan by using the known prefixes, and > distributing your nodes equally from <hash> = 00... to <hash> = ff... would result in fairly even > data (provided you don't have a very small number of very large customers). How do you ask cassandra to do a range scan with a prefix? As far as I can tell, you can't do something like: db.get_range('SomeCF', :start => '<hash>-0012345-*') ...do you? Regards -- Lucas Di Pentima - Santa Fe, Argentina Jabber: lu...@di-pentima.com.ar MSN: ldipent...@hotmail.com
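A wildcard like '*' is not supported; a prefix scan is normally expressed by turning the prefix into start/finish bounds for get_range_slices. A minimal sketch in Python (the client calls in the trailing comments are hypothetical and only indicate where the bounds would be used):

def prefix_range(prefix):
    """Return (start, finish) bounds covering every key that begins with prefix."""
    # incrementing the last character gives a bound just past the prefix;
    # a key equal to the bound itself would not match and can be filtered out
    finish = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, finish

start, finish = prefix_range('a6f1b2c3-0012345-')
# Ruby gem (hypothetical usage): db.get_range('SomeCF', :start => start, :finish => finish)
# raw Thrift (hypothetical usage): get_range_slices with a KeyRange(start_key=start, end_key=finish)

This only behaves as expected under an order-preserving partitioner, which is what this thread assumes.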
Re: value size, is there a suggested limit?
I think that is not what Cassandra is good at. On Mon, Apr 26, 2010 at 4:22 AM, Mark Greene wrote: > http://wiki.apache.org/cassandra/CassandraLimitations > > > On Sun, Apr 25, 2010 at 4:19 PM, S Ahmed wrote: > >> Is there a suggested sized maximum that you can set the value of a given >> key? >> >> e.g. could I convert a document to bytes and store it as a value to a key? >> if yes, which I presume so, what if the file is 10mb? or 100mb? >> > >
Cassandra use cases: as a datagrid ? as a distributed cache ?
Hi, Cassandra comes closer and closer to a data grid like Oracle Coherence: Cassandra includes distributed "hash maps", partitioning, high availability, map/reduce processing, (some) query capability, etc. So I am wondering about the two following (possible?) Cassandra use cases: (1) Has anyone already used Cassandra as an in-memory data grid? If not, does anyone know how far such a database is from, let's say, Oracle Coherence? Does Cassandra provide, for example, a (synchronized) cache on the client side? (2) Has anyone already used Cassandra as a distributed cache? Are there testimonials somewhere about this use case? Thanks for your help. Regards, Dominique
Re: MapReduce, Timeouts and Range Batch Size
OPP will be marginally faster. Maybe 10%? I don't think anyone has benchmarked it. On Fri, Apr 23, 2010 at 10:30 AM, Joost Ouwerkerk wrote: > In that case I should probably wait for 0.7. Is there any fundamental > performance difference in get_range_slices between Random and > Order-Preserving partitioners. If so, by what factor? > joost. > > On Fri, Apr 23, 2010 at 10:47 AM, Jonathan Ellis wrote: >> >> You could look into it, but it's not going to be an easy backport >> since SSTableReader and SSTableScanner got split into two classes in >> trunk. >> >> On Fri, Apr 23, 2010 at 9:39 AM, Joost Ouwerkerk >> wrote: >> > Awesome. In the meantime, I hacked something similar myself. The >> > performance difference does not appear to be material. I think the real >> > killer is the get_range_slices call. Relative to that, the cost of >> > getting >> > the connection appears to be more or less trivial. What can I do to >> > alleviate that cost? CASSANDRA-821 looks interesting -- can I apply >> > that to >> > 0.6.1 ? >> > joost. >> > On Fri, Apr 23, 2010 at 9:39 AM, Jonathan Ellis >> > wrote: >> >> >> >> Great! Created https://issues.apache.org/jira/browse/CASSANDRA-1017 >> >> to track this. >> >> >> >> On Fri, Apr 23, 2010 at 4:12 AM, Johan Oskarsson >> >> wrote: >> >> > I have written some code to avoid thrift reconnection, it just keeps >> >> > the >> >> > connection open between get_range_slices calls. >> >> > I can extract that and put it up but not until early next week. >> >> > >> >> > /Johan >> >> > >> >> > On 23 apr 2010, at 05.09, Jonathan Ellis wrote: >> >> > >> >> >> That would be an easy win, sure. >> >> >> >> >> >> On Thu, Apr 22, 2010 at 9:27 PM, Joost Ouwerkerk >> >> >> >> >> >> wrote: >> >> >>> I was getting client timeouts in >> >> >>> ColumnFamilyRecordReader.maybeInit() >> >> >>> when >> >> >>> MapReducing. So I've reduced the Range Batch Size to 256 (from >> >> >>> 4096) >> >> >>> and >> >> >>> this seems to have fixed my problem, although it has slowed things >> >> >>> down a >> >> >>> bit -- presumably because there are 16x more calls to >> >> >>> get_range_slices. >> >> >>> While I was in that code I noticed that a new client was being >> >> >>> created >> >> >>> for >> >> >>> each batch get. By decreasing the batch size, I've increased this >> >> >>> overhead. I'm thinking of re-writing ColumnFamilyRecordReader to >> >> >>> do >> >> >>> some >> >> >>> connection pooling. Anyone have any thoughts on that? >> >> >>> joost. >> >> >>> >> >> > >> >> > >> > >> > > >
Re: newbie question on how columns names are indexed/lucene limitations?
The column index within a row is a sorted, blocked index (like a b-tree), just like Bigtable's. On Mon, Apr 26, 2010 at 2:43 AM, Stu Hood wrote: > The indexes within rows are _not_ implemented with Lucene: there is a > custom index structure that allows for random access within a row. But, you > should probably read http://wiki.apache.org/cassandra/CassandraLimitations to > understand the current limitations of the file format, some of which are > scheduled to be fixed soon. > > -Original Message- > From: "TuX RaceR" > Sent: Sunday, April 25, 2010 11:54am > To: user@cassandra.apache.org > Subject: newbie question on how columns names are indexed/lucene > limitations? > > Hello Cassandra Users, > > When using the RandomPartitioner and a simple ColumnFamily/Columns (i.e. > no SuperColumns), my understanding is that one single Row can store > millions of columns. > > If I look at the http://wiki.apache.org/cassandra/API, I understand that > I can get a subset of the millions of columns defined above using: > SlicePredicate->ColumnNames or SlicePredicate->SliceRange > > My question is about the implementation of this column 'selection'. > I vaguely remember reading somewhere (but I cannot find the link again) > that this was implemented using a Lucene index over the column names for > each row. > Is that true? Is there a small lucene index per row? > > Also we know that Lucene has some limitations > (http://lucene.apache.org/java/3_0_1/fileformats.html#Limitations): you > cannot index more than 2.1 billion documents as a document ID is mapped > to a 32 bit int. > > As I plan to store in column names the IDs of my cassandra documents (the > global number of documents can go well beyond 2.1 billion), will I be > hit by the lucene limitations? I.e. can I store cassandra document IDs > (i.e. keys) in column names, if in each individual row there are no more > than a few million of those IDs? I guess the answer is "yes I can", > because lucandra uses a similar schema, but it is not clear to me why. > Is that because the lucene index is made on each row, and what really > matters is the number of columns in one single row and not the number of > distinct column names (globally over all the rows)? > > > Thanks in advance > TuX > > >
RE: Does anybody work about transaction on cassandra ?
Orthogonal in this case means "at cross purposes" Transactions can't really be done with eventual consistency because all nodes don't have all the info at the time the transaction is done. I think they recommend zookeeper for this kind of stuff, but I don't know why you want to use Cassandra vs a RDBMS if you really want transactions. From: dir dir [mailto:sikerasa...@gmail.com] Sent: Saturday, April 24, 2010 12:08 PM To: user@cassandra.apache.org Subject: Re: Does anybody work about transaction on cassandra ? >Transactions are orthogonal to the design of Cassandra Sorry, Would you want to tell me what is an orthogonal mean in this context?? honestly I do not understand what is it. Thank you. On Thu, Apr 22, 2010 at 9:14 PM, Miguel Verde mailto:miguelitov...@gmail.com>> wrote: No, as far as I know no one is working on transaction support in Cassandra. Transactions are orthogonal to the design of Cassandra[1][2], although a system could be designed incorporating Cassandra and other elements a la Google's MegaStore[3] to support transactions. Google uses Paxos, one might be able to use Zookeeper[4] to design such a system, but it would be a daunting task. [1] http://www.julianbrowne.com/article/viewer/brewers-cap-theorem [2] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html [3] http://perspectives.mvdirona.com/2008/07/10/GoogleMegastore.aspx [4] http://hadoop.apache.org/zookeeper/ On Thu, Apr 22, 2010 at 2:56 AM, Jeff Zhang mailto:zjf...@gmail.com>> wrote: Hi all, I need transaction support on cassandra, so wondering is anybody work on it ? -- Best Regards Jeff Zhang
Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token
Hi Jonathan Ellis and Stu Hood, I think, finally, we should provide a user customizable key abstract class. User can define what types of key and its class, which define how to compare keys. Schubert On Sat, Apr 24, 2010 at 1:16 PM, Stu Hood wrote: > Your keys cannot be an encoded as binary for OPP, since Cassandra will > attempt to decode them as UTF-8, meaning that they may not come back in the > same format. > > 0.7 supports byte keys using the ByteOrderedPartitioner, and tokens are > specified using hex. > > -Original Message- > From: "Mark Jones" > Sent: Friday, April 23, 2010 10:55am > To: "user@cassandra.apache.org" > Subject: RE: org.apache.cassandra.dht.OrderPreservingPartitioner Initial > Token > > So if my keys are binary, is there any way to escape the keysequence in? > > I have 20 bytes (any value 0x0-0xff is possible) as the key. > > Are they compared as an array of bytes? So that I can use truncation? > > 4 nodes, broken up by 0x00, 0x40, 0x80, 0xC0? > > > -Original Message- > From: Jonathan Ellis [mailto:jbel...@gmail.com] > Sent: Friday, April 23, 2010 10:22 AM > To: user@cassandra.apache.org > Subject: Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial > Token > > a normal String from the same universe as your keys. > > On Fri, Apr 23, 2010 at 7:23 AM, Mark Jones wrote: > > How is this specified? > > > > Is it a large hex #? > > > > A string of bytes in hex? > > > > > > > > http://wiki.apache.org/cassandra/StorageConfiguration doesn't say. > > >
Re: ORM in Cassandra?
I think you should forget these RDBMS tech. On Sat, Apr 24, 2010 at 11:00 AM, aXqd wrote: > On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert > wrote: > > There is nothing wrong with what you are asking. Some work has been done > to > > get an ORM layer ontop of cassandra, for example, with a RubyOnRails > > project. I'm trying to simplify cassandra integration with grails with > the > > plugin I'm writing. > > The problem is ORM solutions to date are wrapping a relational database. > > (The 'R' in ORM) Cassandra isn't a relational database so it does not map > > cleanly. > > Thanks. I noticed this problem before. I just want to know, in the > first place, what exactly is the right way to model relations in > Cassandra(a no-relational database). > So far, I still have those entities, and, without foreign keys, I use > relational entities, which contains the IDs of both sides of > relations. > In some other cases, I just duplicate data, and maintain the relations > manually by updating all the data in the same time. > > Is this the right way to go? Or what I am doing is still trying to > convert Cassandra to a RDBMS? > > > > > On Fri, Apr 23, 2010 at 1:29 AM, aXqd wrote: > >> > >> On Fri, Apr 23, 2010 at 3:03 PM, Benoit Perroud > >> wrote: > >> > I understand the question more like : Is there already a lib which > >> > help to get rid of writing hardcoded and hard to maintain lines like : > >> > > >> > MyClass data; > >> > String[] myFields = {"name", "label", ...} > >> > List columns; > >> > for (String field : myFields) { > >> >if (field == "name") { > >> > columns.add(new Column(field, data.getName())) > >> >} else if (field == "label") { > >> > columns.add(new Column(field, data.getLabel())) > >> >} else ... > >> > } > >> > (same for loading (instanciating) automagically the object). > >> > >> Yes, I am talking about this question. > >> > >> > > >> > Kind regards, > >> > > >> > Benoit. > >> > > >> > 2010/4/23 dir dir : > >> >>>So maybe it's weird to combine ORM and Cassandra, right? Is there > >> >>>anything we can take from ORM? > >> >> > >> >> Honestly I do not understand what is your question. It is clear that > >> >> you can not combine ORM such as Hibernate or iBATIS with Cassandra. > >> >> Cassandra it self is not a RDBMS, so you will not map the table into > >> >> the object. > >> >> > >> >> Dir. > >> > >> Sorry, English is not my mother tongue. > >> > >> I do understand I cannot combine ORM with Cassandra, because they are > >> totally different ways for building our data model. But I think there > >> are still something can be learnt from ORM to make Cassandra easier to > >> use, just as what ORM did to RDBMS before. > >> > >> IMHO, domain model is still intact when we design our software, hence > >> we need another way to map them to Cassandra's entity model. Relation > >> does not just go away in this case, hence we need another way to > >> express those relations and have a tool to set up Keyspace / > >> ColumnFamily automatically as what django's SYNCDB does. > >> > >> According to my limited experience with Cassandra, now, we do more > >> when we write, and less when we read/query. Hence I think the problem > >> lies exactly in how we duplicate our data to do queries. > >> > >> Please correct me if I got these all wrong. > >> > >> >> > >> >> On Fri, Apr 23, 2010 at 12:12 PM, aXqd wrote: > >> >>> > >> >>> Hi, all: > >> >>> > >> >>> I know many people regard O/R Mapping as rubbish. 
However it is > >> >>> undeniable that ORM is quite easy to use in most simple cases, > >> >>> Meanwhile Cassandra is well known as No-SQL solution, a.k.a. > >> >>> No-Relational solution. > >> >>> So maybe it's weird to combine ORM and Cassandra, right? Is there > >> >>> anything we can take from ORM? > >> >>> I just hate to write CRUD functions/Data layer for each object in > even > >> >>> a disposable prototype program. > >> >>> > >> >>> Regards. > >> >>> -Tian > >> >> > >> >> > >> > > > > > > > > > -- > > Virtually, Ned Wolpert > > > > "Settle thy studies, Faustus, and begin..." --Marlowe > > >
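The hardcoded field-by-field mapping quoted above is exactly what a thin mapping layer can automate with reflection. A minimal sketch in Python (the class, the field list, and the string conversion are made-up illustrations; a real mapper would also handle key generation, typed values, and loading back from Cassandra):

class Article(object):
    def __init__(self, name, label):
        self.name = name
        self.label = label

def to_columns(obj, fields):
    """Turn selected attributes of an object into a {column_name: value} dict."""
    return dict((f, str(getattr(obj, f))) for f in fields)

def from_columns(cls, columns):
    """Rebuild an object from a {column_name: value} dict."""
    return cls(**columns)

cols = to_columns(Article('foo', 'bar'), ['name', 'label'])  # {'name': 'foo', 'label': 'bar'}
obj = from_columns(Article, cols)

The dict produced by to_columns is what would be handed to the client's insert call; nothing here depends on a relational model, which is why this part of ORM carries over to Cassandra even though the relational part does not.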
Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token
this is what IPartitioner does On Mon, Apr 26, 2010 at 10:16 AM, Schubert Zhang wrote: > Hi Jonathan Ellis and Stu Hood, > > I think, finally, we should provide a user customizable key abstract class. > User can define what types of key and its class, which define how to compare > keys. > > Schubert > > On Sat, Apr 24, 2010 at 1:16 PM, Stu Hood wrote: >> >> Your keys cannot be an encoded as binary for OPP, since Cassandra will >> attempt to decode them as UTF-8, meaning that they may not come back in the >> same format. >> >> 0.7 supports byte keys using the ByteOrderedPartitioner, and tokens are >> specified using hex. >> >> -Original Message- >> From: "Mark Jones" >> Sent: Friday, April 23, 2010 10:55am >> To: "user@cassandra.apache.org" >> Subject: RE: org.apache.cassandra.dht.OrderPreservingPartitioner Initial >> Token >> >> So if my keys are binary, is there any way to escape the keysequence in? >> >> I have 20 bytes (any value 0x0-0xff is possible) as the key. >> >> Are they compared as an array of bytes? So that I can use truncation? >> >> 4 nodes, broken up by 0x00, 0x40, 0x80, 0xC0? >> >> >> -Original Message- >> From: Jonathan Ellis [mailto:jbel...@gmail.com] >> Sent: Friday, April 23, 2010 10:22 AM >> To: user@cassandra.apache.org >> Subject: Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial >> Token >> >> a normal String from the same universe as your keys. >> >> On Fri, Apr 23, 2010 at 7:23 AM, Mark Jones wrote: >> > How is this specified? >> > >> > Is it a large hex #? >> > >> > A string of bytes in hex? >> > >> > >> > >> > http://wiki.apache.org/cassandra/StorageConfiguration doesn't say. >> >> > >
Re: running cassandra as a service on windows
Hi all, Have you tried Tanuki's Java Service Wrapper? It's very easy to deploy on Windows... -aah 2010/4/23, Miguel Verde : > https://issues.apache.org/jira/browse/CASSANDRA-292 points to > http://commons.apache.org/daemon/procrun.html which is used by other Apache > software to implement Windows services in Java. CassandraDaemon conforms to > the Commons Daemon spec. > On Fri, Apr 23, 2010 at 2:20 PM, Jonathan Ellis wrote: > >> you could do it with standard techniques to run java apps as windows >> services. i understand it's a bit painful. >> >> On Fri, Apr 23, 2010 at 2:05 PM, S Ahmed wrote: >> > Is it possible to have Cassandra run in the background on a windows >> server? >> > i.e. as a service so if the server reboots, cassandra will automatically >> > run? >> > I really hate how windows handles services >> > -- Sent from my mobile device
Re: ORM in Cassandra?
I am going to agree with axQd. Having something that does for Cassandra what say, Hibernate does for RDBMS seems an effort well worth pursuing. I have some complex object graphs written in Java. If I could annotate them and get persistence with a well laid out schema. It would be good. On Mon, Apr 26, 2010 at 8:21 AM, Schubert Zhang wrote: > I think you should forget these RDBMS tech. > > > > On Sat, Apr 24, 2010 at 11:00 AM, aXqd wrote: > >> On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert >> wrote: >> > There is nothing wrong with what you are asking. Some work has been done >> to >> > get an ORM layer ontop of cassandra, for example, with a RubyOnRails >> > project. I'm trying to simplify cassandra integration with grails with >> the >> > plugin I'm writing. >> > The problem is ORM solutions to date are wrapping a relational database. >> > (The 'R' in ORM) Cassandra isn't a relational database so it does not >> map >> > cleanly. >> >> Thanks. I noticed this problem before. I just want to know, in the >> first place, what exactly is the right way to model relations in >> Cassandra(a no-relational database). >> So far, I still have those entities, and, without foreign keys, I use >> relational entities, which contains the IDs of both sides of >> relations. >> In some other cases, I just duplicate data, and maintain the relations >> manually by updating all the data in the same time. >> >> Is this the right way to go? Or what I am doing is still trying to >> convert Cassandra to a RDBMS? >> >> > >> > On Fri, Apr 23, 2010 at 1:29 AM, aXqd wrote: >> >> >> >> On Fri, Apr 23, 2010 at 3:03 PM, Benoit Perroud >> >> wrote: >> >> > I understand the question more like : Is there already a lib which >> >> > help to get rid of writing hardcoded and hard to maintain lines like >> : >> >> > >> >> > MyClass data; >> >> > String[] myFields = {"name", "label", ...} >> >> > List columns; >> >> > for (String field : myFields) { >> >> >if (field == "name") { >> >> > columns.add(new Column(field, data.getName())) >> >> >} else if (field == "label") { >> >> > columns.add(new Column(field, data.getLabel())) >> >> >} else ... >> >> > } >> >> > (same for loading (instanciating) automagically the object). >> >> >> >> Yes, I am talking about this question. >> >> >> >> > >> >> > Kind regards, >> >> > >> >> > Benoit. >> >> > >> >> > 2010/4/23 dir dir : >> >> >>>So maybe it's weird to combine ORM and Cassandra, right? Is there >> >> >>>anything we can take from ORM? >> >> >> >> >> >> Honestly I do not understand what is your question. It is clear that >> >> >> you can not combine ORM such as Hibernate or iBATIS with Cassandra. >> >> >> Cassandra it self is not a RDBMS, so you will not map the table into >> >> >> the object. >> >> >> >> >> >> Dir. >> >> >> >> Sorry, English is not my mother tongue. >> >> >> >> I do understand I cannot combine ORM with Cassandra, because they are >> >> totally different ways for building our data model. But I think there >> >> are still something can be learnt from ORM to make Cassandra easier to >> >> use, just as what ORM did to RDBMS before. >> >> >> >> IMHO, domain model is still intact when we design our software, hence >> >> we need another way to map them to Cassandra's entity model. Relation >> >> does not just go away in this case, hence we need another way to >> >> express those relations and have a tool to set up Keyspace / >> >> ColumnFamily automatically as what django's SYNCDB does. 
>> >> >> >> According to my limited experience with Cassandra, now, we do more >> >> when we write, and less when we read/query. Hence I think the problem >> >> lies exactly in how we duplicate our data to do queries. >> >> >> >> Please correct me if I got these all wrong. >> >> >> >> >> >> >> >> On Fri, Apr 23, 2010 at 12:12 PM, aXqd wrote: >> >> >>> >> >> >>> Hi, all: >> >> >>> >> >> >>> I know many people regard O/R Mapping as rubbish. However it is >> >> >>> undeniable that ORM is quite easy to use in most simple cases, >> >> >>> Meanwhile Cassandra is well known as No-SQL solution, a.k.a. >> >> >>> No-Relational solution. >> >> >>> So maybe it's weird to combine ORM and Cassandra, right? Is there >> >> >>> anything we can take from ORM? >> >> >>> I just hate to write CRUD functions/Data layer for each object in >> even >> >> >>> a disposable prototype program. >> >> >>> >> >> >>> Regards. >> >> >>> -Tian >> >> >> >> >> >> >> >> > >> > >> > >> > >> > -- >> > Virtually, Ned Wolpert >> > >> > "Settle thy studies, Faustus, and begin..." --Marlowe >> > >> > >
Re: Trying To Understand get_range_slices Results When Using RandomPartitioner
RandomPartioner is for row-keys. #1 no #2 yes #3 yes On Sat, Apr 24, 2010 at 4:33 AM, Larry Root wrote: > I trying to better understand how using the RandomPartitioner will affect > my ability to select ranges of keys. Consider my simple example where we > have many online games across different game genres (GameType). These games > need to store data for each one of their users. With that in mind consider > the following data model: > > enum GameType {'RPG', 'FPS', 'ARCADE'} > > { > "GameData": { // Super Column Family > > *GameType+"1234"*: {// Row (concat gametype with a > game id for example) > *"user-data:5678"*:{// Super column (user data) > *"user_prop_name"*: "value",// Subcolumn (arbitrary user > properties and values) > *"another_prop_name"*: "value", > ... > }, > *"user-data:9012"*:{ > *"**user_prop_name**"*: "value", > ... > } > }, > > * GameType+"3456"*: {...}, > *GameType+"7890"*: {...}, > ... > } > } > > Assume we have a multi node cluster running Cassandra 0.6.1. In that > scenario could some one help me understand what the result would be in the > following cases: > >1. We use a range slice to grab keys for all 'RPG' games (range slice >at the ROW level). Would we be able to get all games back in a single query >or would that not be guaranteed? > >2. For a given game we use a range slice to grab all user-data keys in >which the ID starts with '5' (range slice at the COLUMN level). Again, > would >we be able to get all keys in one call (assuming number of keys in the >result was not an issue)? > >3. Finally for a given game and a given user we do a range slice to >grab all user properties that start with 'a' (range slice at the SUBCOLUMN >level of a SUPERCOLUMN). Is that possible in one call? > > I'm trying to understand at what level the RandomPartioner affects my > example data model. Is it at a fixed level like just ROWS (the sub data is > fixed to the same node) or is all data at every level *randomized* across > all nodes. > > Are there any tricks to doing these sort of range slices using RP? For > example if I set my consistency level to 'ALL' when doing a range slice > would that effectively compile a complete result set for me? > > Thanks for the help! > > larry
Re: ORM in Cassandra?
I don't think you are trying to convert Cassandra to a RDBMS with what you want. The issue is that finding a way to map these objects to Cassandra in a meaningful way is hard. Its not as easy as saying 'do what hibernate does' simply because its not an RDBMS... but it is a reasonable and useful goal. I'm trying to accomplish this myself with the grails Cassandra plugin. On Fri, Apr 23, 2010 at 8:00 PM, aXqd wrote: > On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert > wrote: > > There is nothing wrong with what you are asking. Some work has been done > to > > get an ORM layer ontop of cassandra, for example, with a RubyOnRails > > project. I'm trying to simplify cassandra integration with grails with > the > > plugin I'm writing. > > The problem is ORM solutions to date are wrapping a relational database. > > (The 'R' in ORM) Cassandra isn't a relational database so it does not map > > cleanly. > > Thanks. I noticed this problem before. I just want to know, in the > first place, what exactly is the right way to model relations in > Cassandra(a no-relational database). > So far, I still have those entities, and, without foreign keys, I use > relational entities, which contains the IDs of both sides of > relations. > In some other cases, I just duplicate data, and maintain the relations > manually by updating all the data in the same time. > > Is this the right way to go? Or what I am doing is still trying to > convert Cassandra to a RDBMS? > > > > > On Fri, Apr 23, 2010 at 1:29 AM, aXqd wrote: > >> > >> On Fri, Apr 23, 2010 at 3:03 PM, Benoit Perroud > >> wrote: > >> > I understand the question more like : Is there already a lib which > >> > help to get rid of writing hardcoded and hard to maintain lines like : > >> > > >> > MyClass data; > >> > String[] myFields = {"name", "label", ...} > >> > List columns; > >> > for (String field : myFields) { > >> >if (field == "name") { > >> > columns.add(new Column(field, data.getName())) > >> >} else if (field == "label") { > >> > columns.add(new Column(field, data.getLabel())) > >> >} else ... > >> > } > >> > (same for loading (instanciating) automagically the object). > >> > >> Yes, I am talking about this question. > >> > >> > > >> > Kind regards, > >> > > >> > Benoit. > >> > > >> > 2010/4/23 dir dir : > >> >>>So maybe it's weird to combine ORM and Cassandra, right? Is there > >> >>>anything we can take from ORM? > >> >> > >> >> Honestly I do not understand what is your question. It is clear that > >> >> you can not combine ORM such as Hibernate or iBATIS with Cassandra. > >> >> Cassandra it self is not a RDBMS, so you will not map the table into > >> >> the object. > >> >> > >> >> Dir. > >> > >> Sorry, English is not my mother tongue. > >> > >> I do understand I cannot combine ORM with Cassandra, because they are > >> totally different ways for building our data model. But I think there > >> are still something can be learnt from ORM to make Cassandra easier to > >> use, just as what ORM did to RDBMS before. > >> > >> IMHO, domain model is still intact when we design our software, hence > >> we need another way to map them to Cassandra's entity model. Relation > >> does not just go away in this case, hence we need another way to > >> express those relations and have a tool to set up Keyspace / > >> ColumnFamily automatically as what django's SYNCDB does. > >> > >> According to my limited experience with Cassandra, now, we do more > >> when we write, and less when we read/query. 
Hence I think the problem > >> lies exactly in how we duplicate our data to do queries. > >> > >> Please correct me if I got these all wrong. > >> > >> >> > >> >> On Fri, Apr 23, 2010 at 12:12 PM, aXqd wrote: > >> >>> > >> >>> Hi, all: > >> >>> > >> >>> I know many people regard O/R Mapping as rubbish. However it is > >> >>> undeniable that ORM is quite easy to use in most simple cases, > >> >>> Meanwhile Cassandra is well known as No-SQL solution, a.k.a. > >> >>> No-Relational solution. > >> >>> So maybe it's weird to combine ORM and Cassandra, right? Is there > >> >>> anything we can take from ORM? > >> >>> I just hate to write CRUD functions/Data layer for each object in > even > >> >>> a disposable prototype program. > >> >>> > >> >>> Regards. > >> >>> -Tian > >> >> > >> >> > >> > > > > > > > > > -- > > Virtually, Ned Wolpert > > > > "Settle thy studies, Faustus, and begin..." --Marlowe > > > -- Virtually, Ned Wolpert "Settle thy studies, Faustus, and begin..." --Marlowe
Is SuperColumn necessary?
I don't think the SuperColumn is really necessary. I think this level of logic can be left to the application. Do you think so? And if SuperColumn is kept, then as https://issues.apache.org/jira/browse/CASSANDRA-598 points out, we should build indexes both at the SuperColumn level and at the SubColumn level, which means too many levels of index.
Re: Does anybody work about transaction on cassandra ?
Better fault tolerance? Scalability to large data volumes? A combination of ZooKeeper based transactions and Cassandra may have better characteristics than RDBMS on these criteria. There's no question that trade-offs are involved, but as far as these issues are concerned, you'd be starting from a better vantage point than a SPOF relational database. On Apr 26, 2010, at 10:24 AM, Mark Jones wrote: > Orthogonal in this case means “at cross purposes” Transactions can’t really > be done with eventual consistency because all nodes don’t have all the info > at the time the transaction is done. I think they recommend zookeeper for > this kind of stuff, but I don’t know why you want to use Cassandra vs a RDBMS > if you really want transactions. > > From: dir dir [mailto:sikerasa...@gmail.com] > Sent: Saturday, April 24, 2010 12:08 PM > To: user@cassandra.apache.org > Subject: Re: Does anybody work about transaction on cassandra ? > > >Transactions are orthogonal to the design of Cassandra > > Sorry, Would you want to tell me what is an orthogonal mean in this context?? > honestly I do not understand what is it. > > Thank you. > > > On Thu, Apr 22, 2010 at 9:14 PM, Miguel Verde wrote: > No, as far as I know no one is working on transaction support in Cassandra. > Transactions are orthogonal to the design of Cassandra[1][2], although a > system could be designed incorporating Cassandra and other elements a la > Google's MegaStore[3] to support transactions. Google uses Paxos, one might > be able to use Zookeeper[4] to design such a system, but it would be a > daunting task. > > [1] http://www.julianbrowne.com/article/viewer/brewers-cap-theorem > [2] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html > [3] http://perspectives.mvdirona.com/2008/07/10/GoogleMegastore.aspx > [4] http://hadoop.apache.org/zookeeper/ > > On Thu, Apr 22, 2010 at 2:56 AM, Jeff Zhang wrote: > Hi all, > > I need transaction support on cassandra, so wondering is anybody work on it ? > > > -- > Best Regards > > Jeff Zhang > >
Re: Is SuperColumn necessary?
I think that once we have built-in indexing (CASSANDRA-749) you can make a good case for dropping supercolumns (at least, dropping them from the public API and reserving them for internal use). On Mon, Apr 26, 2010 at 11:05 AM, Schubert Zhang wrote: > I don't think the SuperColumn is so necessary. > I think this level of logic can be leaved to application. > > Do you think so? > > If SuperColumn is needed, as > https://issues.apache.org/jira/browse/CASSANDRA-598, we should build index > in SuperColumns level and SubColumns level. > Thus, the levels of index is too many. > >
Re: ORM in Cassandra?
Clearly Cassandra is not an RDBMS. The intent of my Hibernate reference was to be more lyrical. Sorry if that didn't come through. Nonetheless, the need remains to relieve ourselves from excessive boilerplate coding. On Mon, Apr 26, 2010 at 9:00 AM, Ned Wolpert wrote: > I don't think you are trying to convert Cassandra to a RDBMS with what you > want. The issue is that finding a way to map these objects to Cassandra in a > meaningful way is hard. Its not as easy as saying 'do what hibernate does' > simply because its not an RDBMS...but it is a reasonable and useful goal. > I'm trying to accomplish this myself with the grails Cassandra plugin. > > > On Fri, Apr 23, 2010 at 8:00 PM, aXqd wrote: > >> On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert >> wrote: >> > There is nothing wrong with what you are asking. Some work has been done >> to >> > get an ORM layer ontop of cassandra, for example, with a RubyOnRails >> > project. I'm trying to simplify cassandra integration with grails with >> the >> > plugin I'm writing. >> > The problem is ORM solutions to date are wrapping a relational database. >> > (The 'R' in ORM) Cassandra isn't a relational database so it does not >> map >> > cleanly. >> >> Thanks. I noticed this problem before. I just want to know, in the >> first place, what exactly is the right way to model relations in >> Cassandra(a no-relational database). >> So far, I still have those entities, and, without foreign keys, I use >> relational entities, which contains the IDs of both sides of >> relations. >> In some other cases, I just duplicate data, and maintain the relations >> manually by updating all the data in the same time. >> >> Is this the right way to go? Or what I am doing is still trying to >> convert Cassandra to a RDBMS? >> >> > >> > On Fri, Apr 23, 2010 at 1:29 AM, aXqd wrote: >> >> >> >> On Fri, Apr 23, 2010 at 3:03 PM, Benoit Perroud >> >> wrote: >> >> > I understand the question more like : Is there already a lib which >> >> > help to get rid of writing hardcoded and hard to maintain lines like >> : >> >> > >> >> > MyClass data; >> >> > String[] myFields = {"name", "label", ...} >> >> > List columns; >> >> > for (String field : myFields) { >> >> >if (field == "name") { >> >> > columns.add(new Column(field, data.getName())) >> >> >} else if (field == "label") { >> >> > columns.add(new Column(field, data.getLabel())) >> >> >} else ... >> >> > } >> >> > (same for loading (instanciating) automagically the object). >> >> >> >> Yes, I am talking about this question. >> >> >> >> > >> >> > Kind regards, >> >> > >> >> > Benoit. >> >> > >> >> > 2010/4/23 dir dir : >> >> >>>So maybe it's weird to combine ORM and Cassandra, right? Is there >> >> >>>anything we can take from ORM? >> >> >> >> >> >> Honestly I do not understand what is your question. It is clear that >> >> >> you can not combine ORM such as Hibernate or iBATIS with Cassandra. >> >> >> Cassandra it self is not a RDBMS, so you will not map the table into >> >> >> the object. >> >> >> >> >> >> Dir. >> >> >> >> Sorry, English is not my mother tongue. >> >> >> >> I do understand I cannot combine ORM with Cassandra, because they are >> >> totally different ways for building our data model. But I think there >> >> are still something can be learnt from ORM to make Cassandra easier to >> >> use, just as what ORM did to RDBMS before. >> >> >> >> IMHO, domain model is still intact when we design our software, hence >> >> we need another way to map them to Cassandra's entity model. 
Relation >> >> does not just go away in this case, hence we need another way to >> >> express those relations and have a tool to set up Keyspace / >> >> ColumnFamily automatically as what django's SYNCDB does. >> >> >> >> According to my limited experience with Cassandra, now, we do more >> >> when we write, and less when we read/query. Hence I think the problem >> >> lies exactly in how we duplicate our data to do queries. >> >> >> >> Please correct me if I got these all wrong. >> >> >> >> >> >> >> >> On Fri, Apr 23, 2010 at 12:12 PM, aXqd wrote: >> >> >>> >> >> >>> Hi, all: >> >> >>> >> >> >>> I know many people regard O/R Mapping as rubbish. However it is >> >> >>> undeniable that ORM is quite easy to use in most simple cases, >> >> >>> Meanwhile Cassandra is well known as No-SQL solution, a.k.a. >> >> >>> No-Relational solution. >> >> >>> So maybe it's weird to combine ORM and Cassandra, right? Is there >> >> >>> anything we can take from ORM? >> >> >>> I just hate to write CRUD functions/Data layer for each object in >> even >> >> >>> a disposable prototype program. >> >> >>> >> >> >>> Regards. >> >> >>> -Tian >> >> >> >> >> >> >> >> > >> > >> > >> > >> > -- >> > Virtually, Ned Wolpert >> > >> > "Settle thy studies, Faustus, and begin..." --Marlowe >> > >> > > > > -- > Virtually, Ned Wolpert > > "Settle thy studies, Faustus, and begin..." --Marlowe >
Re: Can Cassandra make real use of several DataFileDirectories?
I would recommend using RAID-0 rather that multiple data directories. -ryan 2010/4/26 Roland Hänel : > I have a configuration like this: > > > /storage01/cassandra/data > /storage02/cassandra/data > /storage03/cassandra/data > > > After loading a big chunk of data into cassandra, I end up wich some 70GB in > the first directory, and only about 10GB in the second and third one. All > rows are quite small, so it's not just some big rows that contain the > majority of data. > > Does Cassandra have the ability to 'see' the maximum available space in > these directory? I'm asking myself this question since my limit is 100GB, > and the first directory is approaching this limit... > > And, wouldn't it be better if Cassandra tried to 'load-balance' the files > inside the directories because this will result in better (read) performance > if the directories are on different disks (which is the case for me)? > > Any help is appreciated. > > Roland > >
Re: How do you construct an index and use it, especially in Ruby
On Sun, Apr 25, 2010 at 11:14 AM, Bob Hutchison wrote: > > Hi, > > I'm new to Cassandra and trying to work out how to do something that I've > implemented any number of times (e.g. TokyoCabinet, Perst, even the > filesystem using grep :-) I've managed to get some of this working in > Cassandra but not all. > > So here's the core of the situation. > > I have this opaque chunk of data that I want to store in Cassandra and then > find it again. > > I can generate a key when the data is created very easily, and I've stored it > in a straight forward manner: in a column with a key whose value is the data. > And I can retrieve it when I know the key. No difficulties here at all, works > fine. > > Now I want to index this data taking what I imagine to be a pretty typical > approach. > > Lets say there's two many-to-one indexes: 'colour', and 'size'. Each colour > value will have more than one chunk of data, same for size. > > What I thought I'd do is make a super column and index the chunk of data kind > of like: { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1}} with the > key equal to the key of the chunk of data. And Cassandra stores it without > error like that. So using the Ruby gem, it'd be something along the lines of: > > cassandra.insert(:Indexes, key-of-the-chunk-of-data, { 'colour' => { 'blue' > => 1 }, 'size' => { 'large' => 1 } }) > > Q1: is this a reasonable approach? It *seems* to be what I've read is > supposed to be done. The 1 is meaningless. Anyway, it executes without error > in Ruby. No. In order to index your data, you need to invert it. Since you're working in ruby I'd recommend CassandraObject: http://github.com/nzKoz/cassandra_object. It has indexing built in. -ryan > Q2: what is the syntax of the (Ruby) query to find the keys of all 'blue' > chunks of data? I'm assuming get_range is the correct method, but what are > the parameters? The docs say: get_range(column_family, options={}) but that > seems to be missing a bit of detail, in particular the super column name. > > Q2a: So I know there's a :start and :finish key supported in the options > hash, inclusive, exclusive respectively. How do you define a range for equals > with a UTF8 key? Surely not 'blue'.succ?? or by some kind of suffix?? > > Q2b: How do you specify the super column name 'colour'? Looking at the (Ruby) > source of the get_range method and I'm unconvinced that this is implemented > (seems to be a constant '' used where the super column name makes sense to > be.) > > Anyway I ended up hacking at the Ruby gem's source to use the column name > where the '' was in the original, and didn't really get anywhere useful (I > can find nothing, or everything, nothing in between). > > Q3: If I am correct about what is supposed to be done, does the Ruby gem > support it? > > Q4: Does anyone know of some Ruby code that does and indexed lookup that they > could point me at. (lots of code that indexes but nothing that searches by > the index) > > I'll try to take a look at some of the other Cassandra client implementations > and see if I can get this model to work. Maybe just a Ruby problem?? With any > luck, it'll be me messing up. > > If it'd help I can post the source of what I have, but it'll need some > cleanup. Let me know. > > Thanks for taking the time to read this far :-) > > Bob > > > Bob Hutchison > Recursive Design Inc. > http://www.recursive.ca/ > weblog: http://xampl.com/so > > > > Bob Hutchison > Recursive Design Inc. > http://www.recursive.ca/ > weblog: http://xampl.com/so > > > > >
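To illustrate what "invert it" means in practice: instead of writing { 'colour' => { 'blue' => 1 } } under the data chunk's key, you write the data chunk's key as a column name under a row keyed by the indexed value. A small sketch (in Python for brevity even though the thread uses the Ruby gem; the column family and row-key naming are assumptions):

def index_entries(data_key, attrs):
    """Yield (index_row_key, column_name) pairs to write alongside the data row."""
    for field, value in attrs.items():
        yield ('%s:%s' % (field, value), data_key)

for row_key, column_name in index_entries('chunk-42', {'colour': 'blue', 'size': 'large'}):
    pass  # hypothetical Ruby-gem call: cassandra.insert(:Indexes, row_key, { column_name => '' })

Looking up all 'blue' chunks is then a single-row read of the row 'colour:blue', whose column names are the matching data keys; no key-range scan is needed, so it works under RandomPartitioner. Updates and deletes have to maintain these index rows as well (remove the old column, add the new one), since Cassandra 0.6 does not do that for you.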
cassandra 0.5.1 java.lang.OutOfMemoryError: Java heap space issue
Hello. I have a six node cassandra cluster running on modest hardware with 1G of heap assigned to cassandra. After inserting about 245 million rows of data, cassandra failed with a java.lang.OutOfMemoryError: Java heap space error. I raised the java heap to 2G, but still get the same error when trying to restart cassandra. I am using Cassandra 0.5.1 with Sun jre1.6.0_18. Any thoughts on how to resolve this issue are greatly appreciated. Here are log excerpts from two of the nodes:

DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dcf9f19e [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dd04bf9c [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dd08981a [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dd7f7ac9 [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 SliceQueryFilter.java (line 116) collecting SuperColumn(dde1d4cf [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(de32aec3 [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(de378105 [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(deb5d591 [0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(ded75dee [0a011d0c,0a011d0d,])
DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 SliceQueryFilter.java (line 116) collecting SuperColumn(defe3445 [0a011d0c,0a011d0d,])
INFO [FLUSH-TIMER] 2010-04-23 16:20:00,071 ColumnFamilyStore.java (line 393) IpTag has reached its threshold; switching in a fresh Memtable
INFO [FLUSH-TIMER] 2010-04-23 16:20:00,072 ColumnFamilyStore.java (line 1035) Enqueuing flush of Memtable(IpTag)@7816
INFO [FLUSH-SORTER-POOL:1] 2010-04-23 16:20:00,072 Memtable.java (line 183) Sorting Memtable(IpTag)@7816
INFO [FLUSH-WRITER-POOL:1] 2010-04-23 16:20:00,107 Memtable.java (line 192) Writing Memtable(IpTag)@7816
DEBUG [Timer-0] 2010-04-23 16:20:00,130 LoadDisseminator.java (line 39) Disseminating load info ...
ERROR [ROW-MUTATION-STAGE:41] 2010-04-23 16:20:00,348 CassandraDaemon.java (line 71) Fatal exception in thread Thread[ROW-MUTATION-STAGE:41,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Unknown Source)
        at java.lang.String.<init>(Unknown Source)
        at java.lang.StringBuilder.toString(Unknown Source)
        at org.apache.cassandra.db.marshal.AbstractType.getColumnsString(AbstractType.java:87)
        at org.apache.cassandra.db.ColumnFamily.toString(ColumnFamily.java:344)
        at org.apache.commons.lang.ObjectUtils.toString(ObjectUtils.java:241)
        at org.apache.commons.lang.StringUtils.join(StringUtils.java:3073)
        at org.apache.commons.lang.StringUtils.join(StringUtils.java:3133)
        at org.apache.cassandra.db.RowMutation.toString(RowMutation.java:263)
        at java.lang.String.valueOf(Unknown Source)
        at java.lang.StringBuilder.append(Unknown Source)
        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:46)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:38)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
---
DEBUG [main] 2010-04-23 17:15:45,501 CommitLog.java (line 312) Reading mutation at 57527476
DEBUG [main] 2010-04-23 17:16:11,375 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5c0,])}
DEBUG [main] 2010-04-23 17:16:45,293 CommitLog.java (line 312) Reading mutation at 57527686
DEBUG [main] 2010-04-23 17:16:45,294 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])}
DEBUG [main] 2010-04-23 17:16:54,311 CommitLog.java (line 312) Reading mutation at 57527919
DEBUG [main] 2010-04-23 17:17:46,344 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])}
DEBUG [main] 2010-04-23 17:17:55,530 CommitLog.java (line 312) Reading mutation at 57528129
DEBUG [main] 2010-04-23 17:18:20,266 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c607,])}
DEBUG [main] 2010-04-23 17:18:38,273 CommitLog.java (line 312) Reading mutation at 57528362
DEBUG [main] 2010-04-23 17:21:53,966 CommitLog.java (line 340) replaying mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c607,])}
DEBUG [mai
Cassandra Job in Pasadena
Hi, OpenX is looking for someone to work fulltime on Cassandra, we're located in Pasadena, CA. Here's a link to the job description http://www.openx.org/jobs/position/software-engineer-infrastructure We've been running cassandra in production since 0.3.0, and currently have 3 cassandra clusters. Feel free to email me offlist any questions you might have, and if you are interested please send your resume. Thanks, -Anthony -- Anthony Molinaro
Re: ORM in Cassandra?
On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: > Clearly Cassandra is not an RDBMS. The intent of my Hibernate > reference was to be more lyrical. Sorry if that didn't come through. > Nonetheless, the need remains to relieve ourselves from excessive > boilerplate coding. I agree with eliminating boilerplate code. Chris Shorrock wrote a simple object mapper in Scala for his Cascal Cassandra client. You may want to check out the wiki on GitHub (http://wiki.github.com/shorrockin/cascal/). In my opinion, a mapping solution for Cassandra should be more like a Template. Something that helps map (back and forth) rows to objects, columns to properties, etc. Since the data model can vary so much depending on data access patters, any overly structured approach that prescribes a particular schema will be of limited use. If you're from the Java world, think of iBATIS vs. Hibernate. > > On Mon, Apr 26, 2010 at 9:00 AM, Ned Wolpert = wrote: > I don't think you are trying to convert Cassandra to a RDBMS with what = you want. The issue is that finding a way to map these objects to = Cassandra in a meaningful way is hard. Its not as easy as saying 'do = what hibernate does' simply because its not an RDBMS...but it is a = reasonable and useful goal. I'm trying to accomplish this myself with = the grails Cassandra plugin. >=20 >=20 > On Fri, Apr 23, 2010 at 8:00 PM, aXqd wrote: > On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert = wrote: > > There is nothing wrong with what you are asking. Some work has been = done to > > get an ORM layer ontop of cassandra, for example, with a RubyOnRails > > project. I'm trying to simplify cassandra integration with grails = with the > > plugin I'm writing. > > The problem is ORM solutions to date are wrapping a relational = database. > > (The 'R' in ORM) Cassandra isn't a relational database so it does = not map > > cleanly. >=20 > Thanks. I noticed this problem before. I just want to know, in the > first place, what exactly is the right way to model relations in > Cassandra(a no-relational database). > So far, I still have those entities, and, without foreign keys, I use > relational entities, which contains the IDs of both sides of > relations. > In some other cases, I just duplicate data, and maintain the relations > manually by updating all the data in the same time. >=20 > Is this the right way to go? Or what I am doing is still trying to > convert Cassandra to a RDBMS? >=20 > > > > On Fri, Apr 23, 2010 at 1:29 AM, aXqd wrote: > >> > >> On Fri, Apr 23, 2010 at 3:03 PM, Benoit Perroud = > >> wrote: > >> > I understand the question more like : Is there already a lib = which > >> > help to get rid of writing hardcoded and hard to maintain lines = like : > >> > > >> > MyClass data; > >> > String[] myFields =3D {"name", "label", ...} > >> > List columns; > >> > for (String field : myFields) { > >> >if (field =3D=3D "name") { > >> > columns.add(new Column(field, data.getName())) > >> >} else if (field =3D=3D "label") { > >> > columns.add(new Column(field, data.getLabel())) > >> >} else ... > >> > } > >> > (same for loading (instanciating) automagically the object). > >> > >> Yes, I am talking about this question. > >> > >> > > >> > Kind regards, > >> > > >> > Benoit. > >> > > >> > 2010/4/23 dir dir : > >> >>>So maybe it's weird to combine ORM and Cassandra, right? Is = there > >> >>>anything we can take from ORM? > >> >> > >> >> Honestly I do not understand what is your question. 
It is clear = that > >> >> you can not combine ORM such as Hibernate or iBATIS with = Cassandra. > >> >> Cassandra it self is not a RDBMS, so you will not map the table = into > >> >> the object. > >> >> > >> >> Dir. > >> > >> Sorry, English is not my mother tongue. > >> > >> I do understand I cannot combine ORM with Cassandra, because they = are > >> totally different ways for building our data model. But I think = there > >> are still something can be learnt from ORM to make Cassandra easier = to > >> use, just as what ORM did to RDBMS before. > >> > >> IMHO, domain model is still intact when we design our software, = hence > >> we need another way to map them to Cassandra's entity model. = Relation > >> does not just go away in this case, hence we need another way to > >> express those relations and have a tool to set up Keyspace / > >> ColumnFamily automatically as what django's SYNCDB does. > >> > >> According to my limited experience with Cassandra, now, we do more > >> when we write, and less when we read/query. Hence I think the = problem > >> lies exactly in how we duplicate our data to do queries. > >> > >> Please correct me if I got these all wrong. > >> > >> >> > >> >> On Fri, Apr 23, 2010 at 12:12 PM, aXqd = wrote: > >> >>> > >> >>> Hi, all: > >> >>> > >> >>> I know many people regard O/R Mapping as rubbish. However it is > >> >>> undeniable that ORM is quite easy to use in most simple cases, > >> >>> Meanwhile Cassandra is well known as No-SQL solution, a.k.a. >
Re: ORM in Cassandra?
There is, of course, also cassandra_object on the ruby side. I assume this thread has the implicit requirement of Java, though. -- Jeff On Mon, Apr 26, 2010 at 10:26 AM, Isaac Arias wrote: > On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: > >> Clearly Cassandra is not an RDBMS. The intent of my Hibernate >> reference was to be more lyrical. Sorry if that didn't come through. > >> Nonetheless, the need remains to relieve ourselves from excessive >> boilerplate coding. > > I agree with eliminating boilerplate code. Chris Shorrock wrote a > simple object mapper in Scala for his Cascal Cassandra client. You may > want to check out the wiki on GitHub > (http://wiki.github.com/shorrockin/cascal/). > > In my opinion, a mapping solution for Cassandra should be more like a > Template. Something that helps map (back and forth) rows to objects, > columns to properties, etc. Since the data model can vary so much > depending on data access patters, any overly structured approach that > prescribes a particular schema will be of limited use. > > If you're from the Java world, think of iBATIS vs. Hibernate. > > >> >> On Mon, Apr 26, 2010 at 9:00 AM, Ned Wolpert = > wrote: >> I don't think you are trying to convert Cassandra to a RDBMS with what = > you want. The issue is that finding a way to map these objects to = > Cassandra in a meaningful way is hard. Its not as easy as saying 'do = > what hibernate does' simply because its not an RDBMS...but it is a = > reasonable and useful goal. I'm trying to accomplish this myself with = > the grails Cassandra plugin. >>=20 >>=20 >> On Fri, Apr 23, 2010 at 8:00 PM, aXqd wrote: >> On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert = > wrote: >> > There is nothing wrong with what you are asking. Some work has been = > done to >> > get an ORM layer ontop of cassandra, for example, with a RubyOnRails >> > project. I'm trying to simplify cassandra integration with grails = > with the >> > plugin I'm writing. >> > The problem is ORM solutions to date are wrapping a relational = > database. >> > (The 'R' in ORM) Cassandra isn't a relational database so it does = > not map >> > cleanly. >>=20 >> Thanks. I noticed this problem before. I just want to know, in the >> first place, what exactly is the right way to model relations in >> Cassandra(a no-relational database). >> So far, I still have those entities, and, without foreign keys, I use >> relational entities, which contains the IDs of both sides of >> relations. >> In some other cases, I just duplicate data, and maintain the relations >> manually by updating all the data in the same time. >>=20 >> Is this the right way to go? Or what I am doing is still trying to >> convert Cassandra to a RDBMS? >>=20 >> > >> > On Fri, Apr 23, 2010 at 1:29 AM, aXqd wrote: >> >> >> >> On Fri, Apr 23, 2010 at 3:03 PM, Benoit Perroud = > >> >> wrote: >> >> > I understand the question more like : Is there already a lib = > which >> >> > help to get rid of writing hardcoded and hard to maintain lines = > like : >> >> > >> >> > MyClass data; >> >> > String[] myFields =3D {"name", "label", ...} >> >> > List columns; >> >> > for (String field : myFields) { >> >> > if (field =3D=3D "name") { >> >> > columns.add(new Column(field, data.getName())) >> >> > } else if (field =3D=3D "label") { >> >> > columns.add(new Column(field, data.getLabel())) >> >> > } else ... >> >> > } >> >> > (same for loading (instanciating) automagically the object). >> >> >> >> Yes, I am talking about this question. 
>> >> >> >> > >> >> > Kind regards, >> >> > >> >> > Benoit. >> >> > >> >> > 2010/4/23 dir dir : >> >> >>>So maybe it's weird to combine ORM and Cassandra, right? Is = > there >> >> >>>anything we can take from ORM? >> >> >> >> >> >> Honestly I do not understand what is your question. It is clear = > that >> >> >> you can not combine ORM such as Hibernate or iBATIS with = > Cassandra. >> >> >> Cassandra it self is not a RDBMS, so you will not map the table = > into >> >> >> the object. >> >> >> >> >> >> Dir. >> >> >> >> Sorry, English is not my mother tongue. >> >> >> >> I do understand I cannot combine ORM with Cassandra, because they = > are >> >> totally different ways for building our data model. But I think = > there >> >> are still something can be learnt from ORM to make Cassandra easier = > to >> >> use, just as what ORM did to RDBMS before. >> >> >> >> IMHO, domain model is still intact when we design our software, = > hence >> >> we need another way to map them to Cassandra's entity model. = > Relation >> >> does not just go away in this case, hence we need another way to >> >> express those relations and have a tool to set up Keyspace / >> >> ColumnFamily automatically as what django's SYNCDB does. >> >> >> >> According to my limited experience with Cassandra, now, we do more >> >> when we write, and less when we read/query. Hence I think the = > problem >> >> lies exactly in how we duplicate our data to do queries. >> >> >> >> Please corr
Re: ORM in Cassandra?
On 04/26/2010 01:26 PM, Isaac Arias wrote: On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: Clearly Cassandra is not an RDBMS. The intent of my Hibernate reference was to be more lyrical. Sorry if that didn't come through. Nonetheless, the need remains to relieve ourselves from excessive boilerplate coding. I agree with eliminating boilerplate code. Chris Shorrock wrote a simple object mapper in Scala for his Cascal Cassandra client. You may want to check out the wiki on GitHub (http://wiki.github.com/shorrockin/cascal/). In my opinion, a mapping solution for Cassandra should be more like a Template. Something that helps map (back and forth) rows to objects, columns to properties, etc. Since the data model can vary so much depending on data access patters, any overly structured approach that prescribes a particular schema will be of limited use. For what it's worth, this is exactly my opinion after looking at the problem for a bit, and I'm actively developing such a solution in Ruby. I spent some time playing with the CassandraObject project, but felt that despite all the good work that went in there, it didn't feel to me like it fit the problem space in an idiomatic manner. No criticism intended there; it seems to lean a little more towards a very structured schema, with less flexibility for things like collection attributes the members of which all have a key that matches a pattern (which is a use case we have). So, for my approach, there's one project that gives metaprogramming semantics for building the mapping behavior you describe: build classes that are oriented towards mapping between simple JSON-like structures and full-blown business objects. And a separate project that layers Cassandra specifics on top of that underlying mapper tool. The rub being: it's for a client, and we're collectively sorting out the details for releasing the code in some useful, public manner. But hopefully I'll get something useful out there for potential Ruby enthusiasts before too long. Hopefully a week or two. Thanks. - Ethan -- Ethan Rowe End Point Corporation et...@endpoint.com
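For readers trying to picture the difference between such a "template" style mapper and a full ORM, here is a deliberately tiny Python sketch of the row-to-object / column-to-property mapping Isaac and Ethan describe. All names are invented for illustration, and it enforces no schema beyond the declared mapping.

    class ColumnMap:
        """Maps between a row (dict of column name -> value) and an object's attributes."""
        def __init__(self, **columns):              # e.g. name='display_name'
            self.columns = columns

        def to_object(self, cls, row):
            obj = cls()
            for column, attr in self.columns.items():
                setattr(obj, attr, row.get(column))
            return obj

        def to_row(self, obj):
            return dict((column, getattr(obj, attr, None))
                        for column, attr in self.columns.items())

    class User(object):
        pass

    user_map = ColumnMap(name="display_name", email="email")
    user = user_map.to_object(User, {"name": "Bob", "email": "bob@example.com"})
    assert user.display_name == "Bob"
    assert user_map.to_row(user)["name"] == "Bob"

A Cassandra-specific layer would then only need to translate between this row dict and the Thrift Column structs, which is the kind of layering described above.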
Re: ORM in Cassandra?
The real tragedy is that we have not created a new acronym for this yet... OKVM... it makes more sense... On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: > On 04/26/2010 01:26 PM, Isaac Arias wrote: > >> On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: >> >> >> >>> Clearly Cassandra is not an RDBMS. The intent of my Hibernate >>> reference was to be more lyrical. Sorry if that didn't come through. >>> >>> >> >> >>> Nonetheless, the need remains to relieve ourselves from excessive >>> boilerplate coding. >>> >>> >> I agree with eliminating boilerplate code. Chris Shorrock wrote a >> simple object mapper in Scala for his Cascal Cassandra client. You may >> want to check out the wiki on GitHub >> (http://wiki.github.com/shorrockin/cascal/). >> >> In my opinion, a mapping solution for Cassandra should be more like a >> Template. Something that helps map (back and forth) rows to objects, >> columns to properties, etc. Since the data model can vary so much >> depending on data access patters, any overly structured approach that >> prescribes a particular schema will be of limited use. >> >> > > For what it's worth, this is exactly my opinion after looking at the > problem for a bit, and I'm actively developing such a solution in Ruby. I > spent some time playing with the CassandraObject project, but felt that > despite all the good work that went in there, it didn't feel to me like it > fit the problem space in an idiomatic manner. No criticism intended there; > it seems to lean a little more towards a very structured schema, with less > flexibility for things like collection attributes the members of which all > have a key that matches a pattern (which is a use case we have). > > So, for my approach, there's one project that gives metaprogramming > semantics for building the mapping behavior you describe: build classes that > are oriented towards mapping between simple JSON-like structures and > full-blown business objects. And a separate project that layers Cassandra > specifics on top of that underlying mapper tool. > > The rub being: it's for a client, and we're collectively sorting out the > details for releasing the code in some useful, public manner. But hopefully > I'll get something useful out there for potential Ruby enthusiasts before > too long. Hopefully a week or two. > > Thanks. > - Ethan > > -- > Ethan Rowe > End Point Corporation > et...@endpoint.com > >
Re: Can Cassandra make real use of several DataFileDirectories?
Ryan - You (or maybe someone else) mentioned using RAID-0 instead of multiple data directories at the Cassandra hackathon as well. Could you explain the motivation behind that? Thanks, Edmond On Mon, Apr 26, 2010 at 9:53 AM, Ryan King wrote: > I would recommend using RAID-0 rather that multiple data directories. > > -ryan > > 2010/4/26 Roland Hänel : >> I have a configuration like this: >> >> >> /storage01/cassandra/data >> /storage02/cassandra/data >> /storage03/cassandra/data >> >> >> After loading a big chunk of data into cassandra, I end up wich some 70GB in >> the first directory, and only about 10GB in the second and third one. All >> rows are quite small, so it's not just some big rows that contain the >> majority of data. >> >> Does Cassandra have the ability to 'see' the maximum available space in >> these directory? I'm asking myself this question since my limit is 100GB, >> and the first directory is approaching this limit... >> >> And, wouldn't it be better if Cassandra tried to 'load-balance' the files >> inside the directories because this will result in better (read) performance >> if the directories are on different disks (which is the case for me)? >> >> Any help is appreciated. >> >> Roland >> >> >
Re: Can Cassandra make real use of several DataFileDirectories?
http://wiki.apache.org/cassandra/CassandraHardware On Mon, Apr 26, 2010 at 1:06 PM, Edmond Lau wrote: > Ryan - > > You (or maybe someone else) mentioned using RAID-0 instead of multiple > data directories at the Cassandra hackathon as well. Could you > explain the motivation behind that? > > Thanks, > Edmond > > On Mon, Apr 26, 2010 at 9:53 AM, Ryan King wrote: >> I would recommend using RAID-0 rather that multiple data directories. >> >> -ryan >> >> 2010/4/26 Roland Hänel : >>> I have a configuration like this: >>> >>> >>> /storage01/cassandra/data >>> /storage02/cassandra/data >>> /storage03/cassandra/data >>> >>> >>> After loading a big chunk of data into cassandra, I end up wich some 70GB in >>> the first directory, and only about 10GB in the second and third one. All >>> rows are quite small, so it's not just some big rows that contain the >>> majority of data. >>> >>> Does Cassandra have the ability to 'see' the maximum available space in >>> these directory? I'm asking myself this question since my limit is 100GB, >>> and the first directory is approaching this limit... >>> >>> And, wouldn't it be better if Cassandra tried to 'load-balance' the files >>> inside the directories because this will result in better (read) performance >>> if the directories are on different disks (which is the case for me)? >>> >>> Any help is appreciated. >>> >>> Roland >>> >>> >> >
Re: range get over subcolumns on supercolumn family
Just found the way... keyRange start and end key will be the same and instead of specifying the count and start on KeyRange it has to be specified on SliceRange and then keySlices will come with a single key and a list of columns... 2010/4/25 Rafael Ribeiro > Hi all! > > I am trying to do a paginated query on the subcolumns of a superfamily > column but sincerely I am a little bit confused. > I have already been able to do a range query but only over the keys of a > regular column family. > For the keys case I've been able to do so using the code below: > > KeyRange keyRange = new KeyRange(count); > keyRange.setStart_key(startKey); > keyRange.setEnd_key(""); > > SliceRange range = new SliceRange(); > range.setStart(new byte[] {}); > range.setFinish(new byte[] {}); > > SlicePredicate predicate = new SlicePredicate(); > predicate.setSlice_range(range); > > ColumnParent cp = new ColumnParent("ColumnFamily"); > > List keySlices = client.get_range_slices("Keyspace", > cp, predicate, keyRange, ConsistencyLevel.ALL); > > Is there any way I can do a similar approach to do the range query on the > subcolumns? Would I need to do some trick over ColumnParent? I tried setting > the supercolumn attribute but with no success (sincerely I knew it wont work > but it was worth trying). Only to clarify a little bit... I am still > exercising what is possible to do with Cassandra and I was willing to store > a key over a supercolumnfamily with uuid keys under it so I could scan it > using an ordering scheme but without loading the whole data under the top > level key. > > best regards, > Rafael Ribeiro > >
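A sketch of what Rafael describes, in the same Python/Thrift style used later in this thread: pin the KeyRange to a single key and drive pagination from the SliceRange instead. The 'Super1' column family and the key names are placeholders, and whether super_column belongs on the ColumnParent depends on whether you are paging the sub-columns of one super column or the super columns themselves.

    from cassandra import ttypes

    def page_subcolumns(conn, key, super_column, start_column="", page_size=100):
        parent = ttypes.ColumnParent(column_family="Super1", super_column=super_column)
        predicate = ttypes.SlicePredicate(slice_range=ttypes.SliceRange(
            start=start_column, finish="", count=page_size))
        # start_key == end_key, so exactly one KeySlice comes back; pagination is
        # driven entirely by the SliceRange start column and count.
        key_range = ttypes.KeyRange(start_key=key, end_key=key, count=1)
        slices = conn.get_range_slices("Keyspace1", parent, predicate, key_range,
                                       ttypes.ConsistencyLevel.QUORUM)
        return slices[0].columns if slices else []

To fetch the next page, pass the last column name you received as start_column (it will be returned again, so skip the first element).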
Re: Can Cassandra make real use of several DataFileDirectories?
Hm... I understand that RAID0 would help to create a bigger pool for compactions. However, it might impact read performance: if I have several CF's (with their SSTables), random read requests for the CF files that are on separate disks will behave nicely - however if it's RAID0 then a random read on any file will create a random read on all of the hard disks. Correct? -Roland 2010/4/26 Jonathan Ellis > http://wiki.apache.org/cassandra/CassandraHardware > > On Mon, Apr 26, 2010 at 1:06 PM, Edmond Lau wrote: > > Ryan - > > > > You (or maybe someone else) mentioned using RAID-0 instead of multiple > > data directories at the Cassandra hackathon as well. Could you > > explain the motivation behind that? > > > > Thanks, > > Edmond > > > > On Mon, Apr 26, 2010 at 9:53 AM, Ryan King wrote: > >> I would recommend using RAID-0 rather that multiple data directories. > >> > >> -ryan > >> > >> 2010/4/26 Roland Hänel : > >>> I have a configuration like this: > >>> > >>> > >>> /storage01/cassandra/data > >>> /storage02/cassandra/data > >>> /storage03/cassandra/data > >>> > >>> > >>> After loading a big chunk of data into cassandra, I end up wich some > 70GB in > >>> the first directory, and only about 10GB in the second and third one. > All > >>> rows are quite small, so it's not just some big rows that contain the > >>> majority of data. > >>> > >>> Does Cassandra have the ability to 'see' the maximum available space in > >>> these directory? I'm asking myself this question since my limit is > 100GB, > >>> and the first directory is approaching this limit... > >>> > >>> And, wouldn't it be better if Cassandra tried to 'load-balance' the > files > >>> inside the directories because this will result in better (read) > performance > >>> if the directories are on different disks (which is the case for me)? > >>> > >>> Any help is appreciated. > >>> > >>> Roland > >>> > >>> > >> > > >
Re: Can Cassandra make real use of several DataFileDirectories?
2010/4/26 Roland Hänel : > Hm... I understand that RAID0 would help to create a bigger pool for > compactions. However, it might impact read performance: if I have several > CF's (with their SSTables), random read requests for the CF files that are > on separate disks will behave nicely - however if it's RAID0 then a random > read on any file will create a random read on all of the hard disks. > Correct? Without RAID0 you will end up with hot spots (a compaction could end up putting a large SSTable on one disk, while the others have smaller SSTables). If you have many CFs this might average out, but it might not and there are no guarantees here. I'd recommend RAID0 unless you have reason to do something else. -ryan
Re: ORM in Cassandra?
On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: > On 04/26/2010 01:26 PM, Isaac Arias wrote: >> >> On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: >> ... >> In my opinion, a mapping solution for Cassandra should be more like a >> Template. Something that helps map (back and forth) rows to objects, >> columns to properties, etc. Since the data model can vary so much >> depending on data access patters, any overly structured approach that >> prescribes a particular schema will be of limited use. >> > For what it's worth, this is exactly my opinion after looking at the problem > for a bit, and I'm actively developing such a solution in Ruby. I spent ... > So, for my approach, there's one project that gives metaprogramming > semantics for building the mapping behavior you describe: build classes that > are oriented towards mapping between simple JSON-like structures and > full-blown business objects. And a separate project that layers Cassandra > specifics on top of that underlying mapper tool. +1 I think proper layering is the way to go: it makes problem (of simple construction of services that use Cassandra as the storage system) much easier to solve, divide and conquer. There are pretty decent OJM/OXM solutions that are mostly orthogonal wrt distributed storage part. I understand that there are some trade-offs (some things are easiest to optimize when Cassandra core handles them), but flexibility and best-tool-for-the-job have their benefits too. -+ Tatu +-
Cassandra cluster runs into OOM when bulk loading data
I have a cluster of 5 machines building a Cassandra datastore, and I load bulk data into this using the Java Thrift API. The first ~250GB runs fine, then one of the nodes starts to throw OutOfMemory exceptions. I'm not using any row or index caches, and since I only have 5 CF's and some 2.5 GB of RAM allocated to the JVM (-Xmx2500M), in theory, that shouldn't happen. All inserts are done with consistency level ALL. I hope with this I have avoided all the 'usual dummy errors' that lead to OOM's. I have begun to troubleshoot the issue with JMX, however, it's difficult to catch the JVM in the right moment because it runs well for several hours before this thing happens. One thing comes to my mind, maybe one of the experts could confirm or reject this idea for me: is it possible that when one machine slows down a little bit (for example because a big compaction is going on), the memtables don't get flushed to disk as fast as they are building up under the continuing bulk import? That would result in a downward spiral: the system gets slower and slower on disk I/O, but since more and more data arrives over Thrift, finally OOM. I'm using the "periodic" commit log sync, maybe also this could create a situation where the commit log writer is too slow to catch up with the data intake, resulting in ever growing memory usage? Maybe these thoughts are just bullshit. Let me know if so... ;-)
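One client-side note on bulk loads like this (not an answer to the server-side question): the usual defence is to back off when a node signals it cannot keep up, i.e. on TimedOutException or UnavailableException, rather than continuing to push at full speed. A rough sketch reusing the Thrift insert() call; the retry/backoff policy here is an assumption for illustration, not anything Cassandra prescribes.

    import time
    from cassandra import ttypes

    def insert_with_backoff(conn, key, col_path, value, consistency, retries=5):
        delay = 0.5
        for attempt in range(retries):
            try:
                conn.insert("Keyspace1", key, col_path, value,
                            int(time.time() * 1e6), consistency)
                return
            except (ttypes.TimedOutException, ttypes.UnavailableException):
                time.sleep(delay)    # give the slow node room to flush/compact
                delay *= 2
        raise RuntimeError("giving up on key %r after %d attempts" % (key, retries))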
Re: Cassandra cluster runs into OOM when bulk loading data
Which version of Cassandra? Which version of Java JVM are you using? What do your I/O stats look like when bulk importing? When you run `nodeprobe -host tpstats` is any thread pool backing up during the import? -Chris 2010/4/26 Roland Hänel > I have a cluster of 5 machines building a Cassandra datastore, and I load > bulk data into this using the Java Thrift API. The first ~250GB runs fine, > then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using > and row or index caches, and since I only have 5 CF's and some 2,5 GB of RAM > allocated to the JVM (-Xmx2500M), in theory, that should happen. All inserts > are done with consistency level ALL. > > I hope with this I have avoided all the 'usual dummy errors' that lead to > OOM's. I have begun to troubleshoot the issue with JMX, however, it's > difficult to catch the JVM in the right moment because it runs well for > several hours before this thing happens. > > One thing gets to my mind, maybe one of the experts could confirm or reject > this idea for me: is it possible that when one machine slows down a little > bit (for example because a big compaction is going on), the memtables don't > get flushed to disk as fast as they are building up under the continuing > bulk import? That would result in a downward spiral, the system gets slower > and slower on disk I/O, but since more and more data arrives over Thrift, > finally OOM. > > I'm using the "periodic" commit log sync, maybe also this could create a > situation where the commit log writer is too slow to catch up with the data > intake, resulting in ever growing memory usage? > > Maybe these thoughts are just bullshit. Let me now if so... ;-) > > >
Re: Can Cassandra make real use of several DataFileDirectories?
Ryan, I agree with you on the hot spots, however for the physical disk performance, even the worst case hot spot is not worse than RAID0: in a hot spot scenario, it might be that 90% of your reads go to one hard drive. But with RAID0, 100% of your reads will go to *all* hard drives. But you're right, individual disks might waste up to 50% of your total disk space... I came to consider this idea because Hadoop DFS explicitely recommends different disks. But the design is not exactly the same, they don't have to deal with very big files on the native FS layer. -Roland 2010/4/26 Ryan King > 2010/4/26 Roland Hänel : > > Hm... I understand that RAID0 would help to create a bigger pool for > > compactions. However, it might impact read performance: if I have several > > CF's (with their SSTables), random read requests for the CF files that > are > > on separate disks will behave nicely - however if it's RAID0 then a > random > > read on any file will create a random read on all of the hard disks. > > Correct? > > Without RAID0 you will end up with host spots (a compaction could end > up putting a large SSTable on one disk, while the others have smaller > SSTables). If you have many CFs this might average out, but it might > not and there are no guarantees here. I'd reccomend RAID0 unless you > have reason to do something else. > > -ryan >
Re: Question about TimeUUIDType
On Sun, Apr 25, 2010 at 5:43 PM, Jonathan Ellis wrote: > On Sun, Apr 25, 2010 at 5:40 PM, Tatu Saloranta wrote: >>> Now with TimeUUIDType, if two UUID have the same timestamps, they are >>> ordered >>> by bytes order. >> >> Naively for the whole UUID? That would not be good, given that >> timestamp within UUID is not stored in expected lexical order, but >> with sort of little-endian mess (first bytes are least-significant >> bytes of timestamp). > > I think the code here is clearer than explaining in English. :) > > comparing timeuuids o1 and o2: > > long t1 = LexicalUUIDType.getUUID(o1).timestamp(); > long t2 = LexicalUUIDType.getUUID(o2).timestamp(); > return t1 < t2 ? -1 : (t1 > t2 ? 1 : > FBUtilities.compareByteArrays(o1, o2)); :-) Yes, this makes sense, so it is a two-part sort, not just latter part. -+ Tatu +- ps. Not sure if this matters, but I am finally working on Java Uuid Generator v3, which might help with time-location based UUIDs. Will announce it on the list when it's ready (in couple of weeks)
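For anyone who wants to sanity-check orderings client-side, the same two-part comparison can be sketched in Python (uuid.UUID.time is the 60-bit version 1 timestamp, so this only applies to time-based UUIDs, and the byte-order tie-break here glosses over the signed/unsigned details of the Java byte comparison):

    import uuid

    def compare_timeuuid_bytes(b1, b2):
        # Order by the embedded 60-bit timestamp first, then fall back to raw bytes.
        t1, t2 = uuid.UUID(bytes=b1).time, uuid.UUID(bytes=b2).time
        if t1 != t2:
            return -1 if t1 < t2 else 1
        return (b1 > b2) - (b1 < b2)

    earlier = uuid.uuid1().bytes
    later = uuid.uuid1().bytes
    compare_timeuuid_bytes(earlier, later)   # -1 unless the two timestamps collide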
Re: ORM in Cassandra?
On 04/26/2010 03:11 PM, Tatu Saloranta wrote: On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: On 04/26/2010 01:26 PM, Isaac Arias wrote: On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: ... In my opinion, a mapping solution for Cassandra should be more like a Template. Something that helps map (back and forth) rows to objects, columns to properties, etc. Since the data model can vary so much depending on data access patters, any overly structured approach that prescribes a particular schema will be of limited use. For what it's worth, this is exactly my opinion after looking at the problem for a bit, and I'm actively developing such a solution in Ruby. I spent ... So, for my approach, there's one project that gives metaprogramming semantics for building the mapping behavior you describe: build classes that are oriented towards mapping between simple JSON-like structures and full-blown business objects. And a separate project that layers Cassandra specifics on top of that underlying mapper tool. +1 I think proper layering is the way to go: it makes problem (of simple construction of services that use Cassandra as the storage system) much easier to solve, divide and conquer. There are pretty decent OJM/OXM solutions that are mostly orthogonal wrt distributed storage part. I understand that there are some trade-offs (some things are easiest to optimize when Cassandra core handles them), but flexibility and best-tool-for-the-job have their benefits too. Right. Additionally, this mapping layer between "simple" (i.e. JSON-ready) structures and "complex" (i.e. business objects) would seem to be of much more general value than a Cassandra-specific mapper. I would think most any environment with a heavy reliance on Thrift services would benefit from such tools. -- Ethan Rowe End Point Corporation et...@endpoint.com
Re: Cassandra cluster runs into OOM when bulk loading data
Cassandra Version 0.6.1 OpenJDK Server VM (build 14.0-b16, mixed mode) Import speed is about 10MB/s for the full cluster; if a compaction is going on the individual node is I/O limited tpstats: caught me, didn't know this. I will set up a test and try to catch a node during the critical time. Thanks, Roland 2010/4/26 Chris Goffinet > Which version of Cassandra? > Which version of Java JVM are you using? > What do your I/O stats look like when bulk importing? > When you run `nodeprobe -host tpstats` is any thread pool backing up > during the import? > > -Chris > > > 2010/4/26 Roland Hänel > > I have a cluster of 5 machines building a Cassandra datastore, and I load >> bulk data into this using the Java Thrift API. The first ~250GB runs fine, >> then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using >> and row or index caches, and since I only have 5 CF's and some 2,5 GB of RAM >> allocated to the JVM (-Xmx2500M), in theory, that should happen. All inserts >> are done with consistency level ALL. >> >> I hope with this I have avoided all the 'usual dummy errors' that lead to >> OOM's. I have begun to troubleshoot the issue with JMX, however, it's >> difficult to catch the JVM in the right moment because it runs well for >> several hours before this thing happens. >> >> One thing gets to my mind, maybe one of the experts could confirm or >> reject this idea for me: is it possible that when one machine slows down a >> little bit (for example because a big compaction is going on), the memtables >> don't get flushed to disk as fast as they are building up under the >> continuing bulk import? That would result in a downward spiral, the system >> gets slower and slower on disk I/O, but since more and more data arrives >> over Thrift, finally OOM. >> >> I'm using the "periodic" commit log sync, maybe also this could create a >> situation where the commit log writer is too slow to catch up with the data >> intake, resulting in ever growing memory usage? >> >> Maybe these thoughts are just bullshit. Let me now if so... ;-) >> >> >> >
Re: Cassandra cluster runs into OOM when bulk loading data
Upgrade to b20 of Sun's version of JVM. This OOM might be related to LinkedBlockQueue issues that were fixed. -Chris 2010/4/26 Roland Hänel > Cassandra Version 0.6.1 > OpenJDK Server VM (build 14.0-b16, mixed mode) > Import speed is about 10MB/s for the full cluster; if a compaction is going > on the individual node is I/O limited > tpstats: caught me, didn't know this. I will set up a test and try to catch > a node during the critical time. > > Thanks, > Roland > > > 2010/4/26 Chris Goffinet > > Which version of Cassandra? >> Which version of Java JVM are you using? >> What do your I/O stats look like when bulk importing? >> When you run `nodeprobe -host tpstats` is any thread pool backing up >> during the import? >> >> -Chris >> >> >> 2010/4/26 Roland Hänel >> >> I have a cluster of 5 machines building a Cassandra datastore, and I load >>> bulk data into this using the Java Thrift API. The first ~250GB runs fine, >>> then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using >>> and row or index caches, and since I only have 5 CF's and some 2,5 GB of RAM >>> allocated to the JVM (-Xmx2500M), in theory, that should happen. All inserts >>> are done with consistency level ALL. >>> >>> I hope with this I have avoided all the 'usual dummy errors' that lead to >>> OOM's. I have begun to troubleshoot the issue with JMX, however, it's >>> difficult to catch the JVM in the right moment because it runs well for >>> several hours before this thing happens. >>> >>> One thing gets to my mind, maybe one of the experts could confirm or >>> reject this idea for me: is it possible that when one machine slows down a >>> little bit (for example because a big compaction is going on), the memtables >>> don't get flushed to disk as fast as they are building up under the >>> continuing bulk import? That would result in a downward spiral, the system >>> gets slower and slower on disk I/O, but since more and more data arrives >>> over Thrift, finally OOM. >>> >>> I'm using the "periodic" commit log sync, maybe also this could create a >>> situation where the commit log writer is too slow to catch up with the data >>> intake, resulting in ever growing memory usage? >>> >>> Maybe these thoughts are just bullshit. Let me now if so... ;-) >>> >>> >>> >> >
Re: Can Cassandra make real use of several DataFileDirectories?
2010/4/26 Roland Hänel : > Ryan, I agree with you on the hot spots, however for the physical disk > performance, even the worst case hot spot is not worse than RAID0: in a hot > spot scenario, it might be that 90% of your reads go to one hard drive. But > with RAID0, 100% of your reads will go to *all* hard drives. RAID0 is designed specifically to improve performance (both latency and bandwidth). I'm unclear about why you think it would decrease performance. Perhaps you're thinking of another RAID type? Paul Prescod
Re: Cassandra cluster runs into OOM when bulk loading data
Thanks Chris 2010/4/26 Chris Goffinet > Upgrade to b20 of Sun's version of JVM. This OOM might be related to > LinkedBlockQueue issues that were fixed. > > -Chris > > > 2010/4/26 Roland Hänel > >> Cassandra Version 0.6.1 >> OpenJDK Server VM (build 14.0-b16, mixed mode) >> Import speed is about 10MB/s for the full cluster; if a compaction is >> going on the individual node is I/O limited >> tpstats: caught me, didn't know this. I will set up a test and try to >> catch a node during the critical time. >> >> Thanks, >> Roland >> >> >> 2010/4/26 Chris Goffinet >> >> Which version of Cassandra? >>> Which version of Java JVM are you using? >>> What do your I/O stats look like when bulk importing? >>> When you run `nodeprobe -host tpstats` is any thread pool backing up >>> during the import? >>> >>> -Chris >>> >>> >>> 2010/4/26 Roland Hänel >>> >>> I have a cluster of 5 machines building a Cassandra datastore, and I load bulk data into this using the Java Thrift API. The first ~250GB runs fine, then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using and row or index caches, and since I only have 5 CF's and some 2,5 GB of RAM allocated to the JVM (-Xmx2500M), in theory, that should happen. All inserts are done with consistency level ALL. I hope with this I have avoided all the 'usual dummy errors' that lead to OOM's. I have begun to troubleshoot the issue with JMX, however, it's difficult to catch the JVM in the right moment because it runs well for several hours before this thing happens. One thing gets to my mind, maybe one of the experts could confirm or reject this idea for me: is it possible that when one machine slows down a little bit (for example because a big compaction is going on), the memtables don't get flushed to disk as fast as they are building up under the continuing bulk import? That would result in a downward spiral, the system gets slower and slower on disk I/O, but since more and more data arrives over Thrift, finally OOM. I'm using the "periodic" commit log sync, maybe also this could create a situation where the commit log writer is too slow to catch up with the data intake, resulting in ever growing memory usage? Maybe these thoughts are just bullshit. Let me now if so... ;-) >>> >> >
Re: The Difference Between Cassandra and HBase
On Sat, Apr 24, 2010 at 10:20 AM, dir dir wrote: > In general what is the difference between Cassandra and HBase?? > > Thanks. > Others have already said it ... Cassandra has a peer architecture, with all peers being essentially equivalent (minus the concept of a "seed," as far as I can tell). This is a great architectural advantage of Cassandra and Cassandra-like systems. It wasn't really possible to make practical systems like this in earlier ages because of computing (memory, CPU, disk) limitations which made characteristic times (including expected characteristic response, recovery, replication, etc. times) and system dynamics almost impossible to deal with. This problem persists but has become far more manageable because expected response times haven't evolved or narrowed any faster than computational capabilities. HBase on the other hand is a layered system already. It relies on the underlying HDFS, beyond and above the OS. As a more layered systems, it has better service architecture, in a sense, but it relies and is limited to the capabilities of those "services" ... say the distributed file service. Cassandra rolls its own partitioning and replication mechanisms at the level of its peers. It does not rely on some underlying system service for these capabilities. Cassandra is definitely easier to provision and use, from an operational point of view, and this is a great advantage -- although installations that afford scanning (through ordered partitioning) would become more involved. (As suggested by others, reading the BigTable and Dynamo paper will help you to establish the difference between HBase and Cassandra in more clear, architectural terms.) - m.
Announcing Riptano professional Cassandra support and services
Short version: Matt Pfeil and I have founded http://riptano.com to provide production Cassandra support, training, and professional services. Yes, we're hiring. Long version: http://spyced.blogspot.com/2010/04/and-now-for-something-completely.html We're happy to answer questions on- or off-list. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Can Cassandra make real use of several DataFileDirectories?
RAID0 decreases the performance of muliple, concurrent random reads because for each read request (I assume that at least a couple of stripe sizes are read), all hard disks are involved in that read. Consider the following example: you want to read 1MB out of each of two files a) both files are on the same RAID0 of two disks. For the first 1MB read request, both disks contain some stripes of this request, both disks have to move their heads to the correct location and do the read. The second read request has to wait until the first one finishes, because it is served from the same disks and depends on the same disk heads. b) files are on seperate disks. Both reads can be done at the same time, because disk heads can move independently. Or look at it this way: if you issue a read request on a RAID0, and your disks have 8ms access time, then after the read request, the whole RAID0 is completely blocked for 8ms. If you handle the disks independently, only the disk containing the file is blocked. RAID0 has its advantages of course. Streaming reads/writes (e.g. during a compaction) will be extremely fast. -Roland 2010/4/26 Paul Prescod > 2010/4/26 Roland Hänel : > > Ryan, I agree with you on the hot spots, however for the physical disk > > performance, even the worst case hot spot is not worse than RAID0: in a > hot > > spot scenario, it might be that 90% of your reads go to one hard drive. > But > > with RAID0, 100% of your reads will go to *all* hard drives. > > RAID0 is designed specifically to improve performance (both latency > and bandwidth). I'm unclear about why you think it would decrease > importance. Perhaps you're thinking of another RAID type? > > Paul Prescod >
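Roland's argument can be put into rough numbers. A back-of-envelope model in Python, under the same simplifying assumptions he states (8 ms positioning time, two 1 MB requests arriving together) plus an assumed 100 MB/s per-disk transfer rate:

    seek = 0.008          # 8 ms positioning time per request, per disk
    rate = 100e6          # assumed sequential transfer rate per disk, bytes/s
    size = 1e6            # 1 MB per read request, two requests arriving together

    # Two independent disks, one file on each: the reads proceed in parallel.
    independent = seek + size / rate              # ~0.018 s until both requests finish

    # Two-disk RAID0: each request pulls size/2 from *both* disks, and the second
    # request has to queue behind the first on both spindles.
    raid0 = 2 * (seek + (size / 2) / rate)        # ~0.026 s until both requests finish

With only a single outstanding request the RAID0 finishes sooner (~0.013 s vs ~0.018 s); it is concurrent random reads that pay, which is the point being made. Real controllers, read-ahead and stripe sizes will move these numbers, so treat this purely as an illustration of the argument.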
How to generate 'unique' identifiers for use in Cassandra
Typically, in the SQL world we use things like AUTO_INCREMENT columns that let us create a unique key automatically if a row is inserted into a table. What do you guys usually do to create identifiers for use in Cassandra? Do we only rely on "currentTimeMills() + random()" to create something that is 'unique enough' (but theoretically not fail-safe)? Or are some people here using systems like ZooKeeper for this purpose? -Roland
Re: How to generate 'unique' identifiers for use in Cassandra
http://wiki.apache.org/cassandra/UUID if you don't need transactional ordering, ZooKeeper or something comparable if you do. 2010/4/26 Roland Hänel > Typically, in the SQL world we use things like AUTO_INCREMENT columns that > let us create a unique key automatically if a row is inserted into a table. > > What do you guys usually do to create identifiers for use in Cassandra? > > Do we only rely on "currentTimeMills() + random()" to create something that > is 'unique enough' (but theoretically not fail-safe)? Or are some people > here using systems like ZooKeeper for this purpose? > > -Roland > >
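For the non-coordinated case, a version 1 (time-based) or version 4 (random) UUID is usually all that is needed; for example, in Python:

    import uuid

    row_key = uuid.uuid1()    # time + node + clock sequence; sorts by time under TimeUUIDType
    alt_key = uuid.uuid4()    # 122 random bits; collision probability is negligible in practice

    raw = row_key.bytes       # Cassandra's TimeUUIDType expects the raw 16 bytes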
Re: Can Cassandra make real use of several DataFileDirectories?
I think it might be worst case that you read all the disks. If your block size is large enough to hold an entire row, you should only have to read one disk to get that data. I, for instance, stopped using multiple data directories and instead use a RAID0. The number of blocks read is not the same for all the disks as you suggest it would be if every disk was involved in every transaction.

Device:            tps    Blk_read/s    Blk_wrtn/s    Blk_read    Blk_wrtn
sda1             11.80          1.60        105.60           8         528
sdb              17.20        867.20          0.00        4336           0
sdc               2.60          0.00        155.20           0         776
sdd              16.40        796.80          0.00        3984           0
sde              21.80       1113.60          8.00        5568          40
md0              56.00       2777.60          8.00       13888          40

sdb, sdd and sde are raided on md0 on an ec2 xlarge instance; the number of blocks is different. Of course my rows are small (1-2 Kb), so I should rarely cross a block boundary, with 1MB rows you are more likely to, so multiple data directories might be better for you. I think it all sort of depends on your data size. -Anthony

On Mon, Apr 26, 2010 at 10:09:58PM +0200, Roland Hänel wrote: > RAID0 decreases the performance of multiple, concurrent random reads because > for each read request (I assume that at least a couple of stripe sizes are > read), all hard disks are involved in that read. > > Consider the following example: you want to read 1MB out of each of two > files > > a) both files are on the same RAID0 of two disks. For the first 1MB read > request, both disks contain some stripes of this request, both disks have to > move their heads to the correct location and do the read. The second read > request has to wait until the first one finishes, because it is served from > the same disks and depends on the same disk heads. > > b) files are on separate disks. Both reads can be done at the same time, > because disk heads can move independently. > > Or look at it this way: if you issue a read request on a RAID0, and your > disks have 8ms access time, then after the read request, the whole RAID0 is > completely blocked for 8ms. If you handle the disks independently, only the > disk containing the file is blocked. > > RAID0 has its advantages of course. Streaming reads/writes (e.g. during a > compaction) will be extremely fast. > > -Roland > > > 2010/4/26 Paul Prescod > > > 2010/4/26 Roland Hänel : > > > Ryan, I agree with you on the hot spots, however for the physical disk > > > performance, even the worst case hot spot is not worse than RAID0: in a > > hot > > > spot scenario, it might be that 90% of your reads go to one hard drive. > > But > > > with RAID0, 100% of your reads will go to *all* hard drives. > > > > RAID0 is designed specifically to improve performance (both latency > > and bandwidth). I'm unclear about why you think it would decrease > > performance. Perhaps you're thinking of another RAID type? > > > > Paul Prescod > > -- Anthony Molinaro
Re: Can Cassandra make real use of several DataFileDirectories?
On Mon, Apr 26, 2010 at 2:15 PM, Anthony Molinaro wrote: > I think it might be worse case that you read all the disks. If your > block size is large enough to hold an entire row, you should only have to > read one disk to get that data. And conversely, for a large enough row you might benefit from streaming from two disks at once rather than one. Paul
Re: strange get_range_slices behaviour v0.6.1
I've broken this case down further to some Python code that works against the Thrift-generated client and am still getting the same odd results. With keys object1, object2 and object3, an open ended get_range_slice starting with "object1" only returns object1 and object3. I'm guessing that I've got something wrong or my expectation of how get_range_slice works is wrong, but I cannot see where I've gone wrong. Any help would be appreciated. The Python code to add and read keys is below; it assumes a Cassandra.Client connection.

    import time
    from cassandra import Cassandra, ttypes
    from thrift import Thrift
    from thrift.protocol import TBinaryProtocol
    from thrift.transport import TSocket, TTransport

    def add_data(conn):
        col_path = ttypes.ColumnPath(column_family="Standard1", column="col_name")
        consistency = ttypes.ConsistencyLevel.QUORUM
        for key in ["object1", "object2", "object3"]:
            conn.insert("Keyspace1", key, col_path, "col_value",
                        int(time.time() * 1e6), consistency)
        return

    def read_range(conn, start_key, end_key):
        col_parent = ttypes.ColumnParent(column_family="Standard1")
        predicate = ttypes.SlicePredicate(column_names=["col_name"])
        range = ttypes.KeyRange(start_key=start_key, end_key=end_key, count=1000)
        consistency = ttypes.ConsistencyLevel.QUORUM
        return conn.get_range_slices("Keyspace1", col_parent, predicate, range, consistency)

Below is the result of calling read_range with different start values. I've also included the debug log for each call; the line starting with "reading RangeSliceCommand" seems to show that key hash for "object2" is greater than "object3".

#expect to return objects 1,2 and 3
In [37]: cass_test.read_range(conn, "object1", "")
Out[37]:
[KeySlice(columns=[ColumnOrSuperColumn(column=Column(timestamp=1272315595268837, name='col_name', value='col_value'), super_column=None)], key='object1'),
 KeySlice(columns=[ColumnOrSuperColumn(column=Column(timestamp=1272315595272693, name='col_name', value='col_value'), super_column=None)], key='object3')]

DEBUG 09:29:59,791 range_slice
DEBUG 09:29:59,791 RangeSliceCommand{keyspace='Keyspace1', column_family='Standard1', super_column=null, predicate=SlicePredicate(column_names:[...@257b40fe]), range=[121587881847328893689247922008234581399,0], max_keys=1000}
DEBUG 09:29:59,791 Adding to restricted ranges [121587881847328893689247922008234581399,0] for (75349581786326521367945210761838448174,75349581786326521367945210761838448174]
DEBUG 09:29:59,791 reading RangeSliceCommand{keyspace='Keyspace1', column_family='Standard1', super_column=null, predicate=SlicePredicate(column_names:[...@257b40fe]), range=[121587881847328893689247922008234581399,0], max_keys=1000} from 1...@localhost/127.0.0.1
DEBUG 09:29:59,791 Sending RangeSliceReply{rows=Row(key='object1', cf=ColumnFamily(Standard1 [636f6c5f6e616d65:false:9...@1272315595268837,])),Row(key='object3', cf=ColumnFamily(Standard1 [636f6c5f6e616d65:false:9...@1272315595272693,]))} to 1...@localhost/127.0.0.1
DEBUG 09:29:59,791 Processing response on a callback from 1...@localhost/127.0.0.1
DEBUG 09:29:59,791 range slices read object1
DEBUG 09:29:59,791 range slices read object3

In [38]: cass_test.read_range(conn, "object2", "")
Out[38]:
[KeySlice(columns=[ColumnOrSuperColumn(column=Column(timestamp=1272315595271798, name='col_name', value='col_value'), super_column=None)], key='object2'),
 KeySlice(columns=[ColumnOrSuperColumn(column=Column(timestamp=1272315595268837, name='col_name', value='col_value'), super_column=None)], key='object1'),
 KeySlice(columns=[ColumnOrSuperColumn(column=Column(timestamp=1272315595272693, name='col_name', value='col_value'), super_column=None)], key='object3')]

DEBUG 09:34:48,133 range_slice
DEBUG 09:34:48,133 RangeSliceCommand{keyspace='Keyspace1', column_family='Standard1', super_column=null, predicate=SlicePredicate(column_names:[...@7966340c]), range=[28312518014678916505369931620527723964,0], max_keys=1000}
DEBUG 09:34:48,133 Adding to restricted ranges [28312518014678916505369931620527723964,0] for (75349581786326521367945210761838448174,75349581786326521367945210761838448174]
DEBUG 09:34:48,133 reading RangeSliceCommand{keyspace='Keyspace1', column_family='Standard1', super_column=null, predicate=SlicePredicate(column_names:[...@7966340c]), range=[28312518014678916505369931620527723964,0], max_keys=1000} from 1...@localhost/127.0.0.1
DEBUG 09:34:48,133 Sending RangeSliceReply{rows=Row(key='object2', cf=ColumnFamily(Standard1 [636f6c5f6e616d65:false:9...@1272315595271798,])),Row(key='object1', cf=ColumnFamily(Standard1 [636f6c5f6e616d65:false:9...@1272315595268837,])),Row(key='object3', cf=ColumnFamily(Standard1 [636f6c5f6e616d65:false:9...@1272315595272693,]))} to 1...@localhost/127.0.0.1
DEBUG 09:34:48,133 Processing response on a callback from 1...@localhost/127.0.0.1
DEBUG 09:34:48,133 range slices read object2
DEBUG 09:34:48,133 range slices read object1
DEBUG 09:34:48,133 rang
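For what it's worth, the behaviour above is what RandomPartitioner gives you: rows are ordered by the MD5-derived token of the key, not by the key itself, so an open-ended range starting at object1's token can legitimately skip object2. A rough re-creation of the token calculation (an approximation for illustration, not the exact code path):

    import hashlib

    def approx_token(key):
        # RandomPartitioner's token is (roughly) the absolute value of the MD5
        # digest of the key interpreted as a signed 128-bit integer.
        u = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
        return 2**128 - u if u >= 2**127 else u

    keys = ["object1", "object2", "object3"]
    ring_order = sorted(keys, key=approx_token)
    # get_range_slices walks rows in this token order, not in key order.

If you need ranges that follow key order, that is what OrderPreservingPartitioner is for, with the load-balancing caveats discussed elsewhere on this list.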
Quorum consistency in a changing ring
Hello, Is my interpretation correct that Cassandra is intended to guarantee quorum consistency (overlapping read/write sets) at all times, including a ring that is actively changing? I.e., there are no (intended) cases where quorum consistency is defeated due to writes or reads going to nodes that are actively participating in tokens moving? If yes, is there any material on how this is accomplished and/or pointers to roughly which parts of the implementation are responsible for ensuring this works? Thanks! -- / Peter Schuller
Re: Quorum consistency in a changing ring
Increasing the replication level is known to break it. --Original Message-- From: Peter Schuller Sender: sc...@scode.org To: user@cassandra.apache.org ReplyTo: user@cassandra.apache.org Subject: Quorom consistency in a changing ring Sent: Apr 26, 2010 21:55 Hello, Is my interpretation correct that Cassandra is intended to guarantee quorom consistency (overlapping read/write sets) at all times, including a ring that is actively changing? I.e., there are no (intended) cases where qurom consistency is defeated due to writes or reads going to nodes that are actively participating in token:s moving? If yes, is there any material on how this is accomplished and/or pointers to roughly which parts of the implementation is responsible for ensuring this works? Thanks! -- / Peter Schuller
Re: Quorum consistency in a changing ring
> Increasing the replication level is known to break it. Thanks! Yes, of that I am aware. When I said ring changes I meant nodes being added and removed, or just re-balanced, implying tokens moving around the ring. -- / Peter Schuller aka scode
Re: ORM in Cassandra?
I call Tragedy a 'Cassandra Object Abstraction' (COA), because I try to write a reusable implementation of patterns that are commonly used for cassandra data modeling. E.g. using TimeUUID columns for storing an Index is a pattern. Then various strategies to partition these Indexes are another pattern. I'm hoping that after some iteration a good mix of high-level abstractions that can be reused for all kinds of apps will emerge. It feels ambitious to me to try to implement cross-nosql-store abstractions before these patterns and best practices have been documented and battle-proven. On that note, if such documentation does exist, or you know cool patterns, I'd love to hear about them! Paul On Mon, Apr 26, 2010 at 10:46 AM, banks wrote: > The real tragedy is that we have not created a new acronym for this yet... > > OKVM... it makes more sense... > > > On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: >> >> On 04/26/2010 01:26 PM, Isaac Arias wrote: >>> >>> On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: >>> >>> Clearly Cassandra is not an RDBMS. The intent of my Hibernate reference was to be more lyrical. Sorry if that didn't come through. >>> >>> Nonetheless, the need remains to relieve ourselves from excessive boilerplate coding. >>> >>> I agree with eliminating boilerplate code. Chris Shorrock wrote a >>> simple object mapper in Scala for his Cascal Cassandra client. You may >>> want to check out the wiki on GitHub >>> (http://wiki.github.com/shorrockin/cascal/). >>> >>> In my opinion, a mapping solution for Cassandra should be more like a >>> Template. Something that helps map (back and forth) rows to objects, >>> columns to properties, etc. Since the data model can vary so much >>> depending on data access patterns, any overly structured approach that >>> prescribes a particular schema will be of limited use. >>> >> >> For what it's worth, this is exactly my opinion after looking at the >> problem for a bit, and I'm actively developing such a solution in Ruby. I >> spent some time playing with the CassandraObject project, but felt that >> despite all the good work that went in there, it didn't feel to me like it >> fit the problem space in an idiomatic manner. No criticism intended there; >> it seems to lean a little more towards a very structured schema, with less >> flexibility for things like collection attributes the members of which all >> have a key that matches a pattern (which is a use case we have). >> >> So, for my approach, there's one project that gives metaprogramming >> semantics for building the mapping behavior you describe: build classes that >> are oriented towards mapping between simple JSON-like structures and >> full-blown business objects. And a separate project that layers Cassandra >> specifics on top of that underlying mapper tool. >> >> The rub being: it's for a client, and we're collectively sorting out the >> details for releasing the code in some useful, public manner. But hopefully >> I'll get something useful out there for potential Ruby enthusiasts before >> too long. Hopefully a week or two. >> >> Thanks. >> - Ethan >> >> -- >> Ethan Rowe >> End Point Corporation >> et...@endpoint.com >> > >
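To make the TimeUUID-index pattern mentioned above concrete, here is a minimal sketch (an illustration, not code from the thread); insert_column is a hypothetical helper standing in for whatever client call you use, and the row and column names are made up:

import time
import uuid

def index_new_item(insert_column, item_key):
    # One index row per day; version-1 (time-based) UUIDs as column names keep
    # entries in insertion-time order when the column family compares with TimeUUIDType.
    index_row = "items-by-time-" + time.strftime("%Y%m%d")
    insert_column(row_key=index_row,
                  column_name=uuid.uuid1().bytes,
                  value=item_key)

Partitioning the index (the second pattern mentioned) then mostly amounts to choosing how index_row is derived: per day, per hour, or per hash bucket of the item key.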
Re: Quorum consistency in a changing ring
Live nodes that have tokens indicating they should receive a copy of data count towards write quorum. This means if a node is down (not decommissioned) the copy sent to the node acting as the hinted handoff replica will not count towards achieving quorum. If a token is moved, it is moved. It is not in 2 places at once. If you are using CL.QUORUM and it succeeds, it really is reading or writing RF / 2 + 1 copies. b 2010/4/26 Peter Schüller : >> Increasing the replication level is known to break it. > > Thanks! Yes, of that I am aware. When I said ring changes I meant > nodes being added and removed, or just re-balanced, implying tokens > moving around the ring. > > -- > / Peter Schuller aka scode >
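To spell out the RF / 2 + 1 arithmetic above (a trivial illustration, not part of the original mail):

def quorum(replication_factor):
    # Number of replicas that must respond for CL.QUORUM to succeed.
    return replication_factor // 2 + 1

print(quorum(3))  # 2 of 3 replicas
print(quorum(5))  # 3 of 5 replicas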
how to get apache cassandra version with thrift client ?
Hi all: How do I get the Apache Cassandra version with a Thrift client? Thanks for any reply. -- Shuge Lee | Lee Li | 李蠡
Re: Super and Regular Columns
On Fri, Apr 23, 2010 at 3:32 PM, Robert wrote: > I am starting out with Cassandra and I had a couple of questions, I read a > lot of the documentation including: > http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model > First I wanted to make sure I understand this > bug: http://issues.apache.org/jira/browse/CASSANDRA-598 > Borrowing from the example provided in that article, would an example > subcolumn be 'friend1' or 'street'? friend1 is the name of a supercolumn; street is the name of a subcolumn > Second, for a one to many map where ordering is not important what are the > tradeoffs between these two options? > > A. Use a ColumnFamily where the key maps to an item id, and in each row each > column is one of the items it is mapped to? > > B. Use SuperColumnFamily where each key is an item id, and each column (are > these the right terms?) is one of the items it is mapped to, and the value > is essentially empty? I don't see what using supercolumns gives you here, so don't use them. :) -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
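For readers comparing the two layouts, a rough sketch of option A from the question as a plain Python dict (keys and names are invented for illustration):

# Standard column family: the row key is the source item id, each column name is a
# mapped item id, and the column value can be left empty since the name carries the data.
mapping_cf = {
    "item-42": {
        "item-7": "",
        "item-9": "",
        "item-13": "",
    },
}

# Reading the whole mapping for "item-42" is then a single column slice of that row.
print(sorted(mapping_cf["item-42"].keys()))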
Re: Cassandra reverting deletes?
How are you checking that the rows are gone? Are you experiencing node outages during this? DC_QUORUM is unfinished code right now, you should avoid using it. Can you reproduce with normal QUORUM? On Sat, Apr 24, 2010 at 12:23 PM, Joost Ouwerkerk wrote: > I'm having trouble deleting rows in Cassandra. After running a job that > deletes hundreds of rows, I run another job that verifies that the rows are > gone. Both jobs run correctly. However, when I run the verification job an > hour later, the rows have re-appeared. This is not a case of "ghosting" > because the verification job actually checks that there is data in the > columns. > > I am running a cluster with 12 nodes and a replication factor of 3. I am > using DC_QUORUM consistency when deleting. > > Any ideas? > Joost. > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Cassandra use cases: as a datagrid ? as a distributed cache ?
On Mon, Apr 26, 2010 at 9:04 AM, Dominique De Vito wrote: > (1) has anyone already used Cassandra as an in-memory data grid ? > If no, does anyone know how far such a database is from, let's say, Oracle > Coherence ? > Does Cassandra provide, for example, a (synchronized) cache on the client > side ? If you mean an in-process cache on the client side, no. > (2) has anyone already used Cassandra as a distributed cache ? > Are there some testimonials somewhere about this use case ? That's basically what reddit is using it for. http://blog.reddit.com/2010/03/she-who-entangles-men.html -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: cassandra 0.5.1 java.lang.OutOfMemoryError: Java heap space issue
0.5 has a bug that allows it to OOM itself from replaying the log too fast. You should upgrade to 0.6.1. On Mon, Apr 26, 2010 at 12:11 PM, elsif wrote: > > Hello. I have a six node cassandra cluster running on modest hardware > with 1G of heap assigned to cassandra. After inserting about 245 > million rows of data, cassandra failed with a > java.lang.OutOfMemoryError: Java heap space error. I rasied the java > heap to 2G, but still get the same error when trying to restart cassandra. > > I am using Cassandra 0.5.1 with Sun jre1.6.0_18. > > Any thoughts on how to resolve this issue are greatly appreciated. > > Here are log excerpts from two of the nodes: > > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 > SliceQueryFilter.java (line 116) collecting SuperColumn(dcf9f19e > [0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 > SliceQueryFilter.java (line 116) collecting SuperColumn(dd04bf9c > [0a011d0c,0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 > SliceQueryFilter.java (line 116) collecting SuperColumn(dd08981a > [0a011d0c,0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 > SliceQueryFilter.java (line 116) collecting SuperColumn(dd7f7ac9 > [0a011d0c,0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490 > SliceQueryFilter.java (line 116) collecting SuperColumn(dde1d4cf > [0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 > SliceQueryFilter.java (line 116) collecting SuperColumn(de32aec3 > [0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 > SliceQueryFilter.java (line 116) collecting SuperColumn(de378105 > [0a011d0c,0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 > SliceQueryFilter.java (line 116) collecting SuperColumn(deb5d591 > [0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 > SliceQueryFilter.java (line 116) collecting SuperColumn(ded75dee > [0a011d0c,0a011d0d,]) > DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491 > SliceQueryFilter.java (line 116) collecting SuperColumn(defe3445 > [0a011d0c,0a011d0d,]) > INFO [FLUSH-TIMER] 2010-04-23 16:20:00,071 ColumnFamilyStore.java (line > 393) IpTag has reached its threshold; switching in a fresh Memtable > INFO [FLUSH-TIMER] 2010-04-23 16:20:00,072 ColumnFamilyStore.java (line > 1035) Enqueuing flush of Memtable(IpTag)@7816 > INFO [FLUSH-SORTER-POOL:1] 2010-04-23 16:20:00,072 Memtable.java (line > 183) Sorting Memtable(IpTag)@7816 > INFO [FLUSH-WRITER-POOL:1] 2010-04-23 16:20:00,107 Memtable.java (line > 192) Writing Memtable(IpTag)@7816 > DEBUG [Timer-0] 2010-04-23 16:20:00,130 LoadDisseminator.java (line 39) > Disseminating load info ... 
> ERROR [ROW-MUTATION-STAGE:41] 2010-04-23 16:20:00,348 > CassandraDaemon.java (line 71) Fatal exception in thread > Thread[ROW-MUTATION-STAGE:41,5,main] > java.lang.OutOfMemoryError: Java heap space > at java.util.Arrays.copyOfRange(Unknown Source) > at java.lang.String.(Unknown Source) > at java.lang.StringBuilder.toString(Unknown Source) > at > org.apache.cassandra.db.marshal.AbstractType.getColumnsString(AbstractType.java:87) > at > org.apache.cassandra.db.ColumnFamily.toString(ColumnFamily.java:344) > at > org.apache.commons.lang.ObjectUtils.toString(ObjectUtils.java:241) > at org.apache.commons.lang.StringUtils.join(StringUtils.java:3073) > at org.apache.commons.lang.StringUtils.join(StringUtils.java:3133) > at > org.apache.cassandra.db.RowMutation.toString(RowMutation.java:263) > at java.lang.String.valueOf(Unknown Source) > at java.lang.StringBuilder.append(Unknown Source) > at > org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:46) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:38) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown > Source) > at java.lang.Thread.run(Unknown Source) > > --- > > DEBUG [main] 2010-04-23 17:15:45,501 CommitLog.java (line 312) Reading > mutation at 57527476 > DEBUG [main] 2010-04-23 17:16:11,375 CommitLog.java (line 340) replaying > mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5c0,])} > DEBUG [main] 2010-04-23 17:16:45,293 CommitLog.java (line 312) Reading > mutation at 57527686 > DEBUG [main] 2010-04-23 17:16:45,294 CommitLog.java (line 340) replaying > mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])} > DEBUG [main] 2010-04-23 17:16:54,311 CommitLog.java (line 312) Reading > mutation at 57527919 > DEBUG [main] 2010-04-23 17:17:46,344 CommitLog.java (line 340) replaying > mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])} > DEBUG [main] 2010-04-23 17:17:55,530 CommitLog.java (line 312) Reading > mutation at 57528129 > DEBUG [main] 2010-04-23 17:18:20,266 CommitLog.java (line 340) replayi
Re: how to get apache cassandra version with thrift client ?
You can't get the Cassandra release version, but you can get the Thrift api version, which is more useful. It's compiled as a constant VERSION string in your client library. See the comments in interface/cassandra.thrift. On Mon, Apr 26, 2010 at 8:14 PM, Shuge Lee wrote: > Hi all: > How to get apache cassandra version with thrift client ? > Thanks for reply. > > -- > Shuge Lee | Lee Li | 李蠡 > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
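As a small illustration of reading that constant from Python, assuming the Thrift-generated package follows the usual layout and exposes a constants module (check your generated code; this is an assumption, not guaranteed for every client):

from cassandra import constants

# The Thrift API version compiled into the generated client library.
print(constants.VERSION)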
Re: Cassandra cluster runs into OOM when bulk loading data
I have the same problem here, and I analyzed the hprof file with mat; as you said, LinkedBlockingQueue used 2.6GB. I think the thread pools in Cassandra should limit their queue size.

cassandra 0.6.1

java version
$ java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

iostat
$ iostat -x -l 1
Device:  rrqm/s   wrqm/s     r/s    w/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda       81.00  8175.00  224.00  17.00  23984.00  2728.00    221.68      1.01   1.86   0.76  18.20

tpstats, of course, this node is still alive
$ ./nodetool -host localhost tpstats
Pool Name                    Active   Pending      Completed
FILEUTILS-DELETE-POOL             0         0           1281
STREAM-STAGE                      0         0              0
RESPONSE-STAGE                    0         0      473617241
ROW-READ-STAGE                    0         0              0
LB-OPERATIONS                     0         0              0
MESSAGE-DESERIALIZER-POOL         0         0      718355184
GMFD                              0         0         132509
LB-TARGET                         0         0              0
CONSISTENCY-MANAGER               0         0              0
ROW-MUTATION-STAGE                0         0      293735704
MESSAGE-STREAMING-POOL            0         0              6
LOAD-BALANCER-STAGE               0         0              0
FLUSH-SORTER-POOL                 0         0              0
MEMTABLE-POST-FLUSHER             0         0           1870
FLUSH-WRITER-POOL                 0         0           1870
AE-SERVICE-STAGE                  0         0              5
HINTED-HANDOFF-POOL               0         0             21

On Tue, Apr 27, 2010 at 3:32 AM, Chris Goffinet wrote: > Upgrade to b20 of Sun's version of JVM. This OOM might be related to > LinkedBlockQueue issues that were fixed. > > -Chris > > > 2010/4/26 Roland Hänel > >> Cassandra Version 0.6.1 >> OpenJDK Server VM (build 14.0-b16, mixed mode) >> Import speed is about 10MB/s for the full cluster; if a compaction is >> going on the individual node is I/O limited >> tpstats: caught me, didn't know this. I will set up a test and try to >> catch a node during the critical time. >> >> Thanks, >> Roland >> >> >> 2010/4/26 Chris Goffinet >> >> Which version of Cassandra? >>> Which version of Java JVM are you using? >>> What do your I/O stats look like when bulk importing? >>> When you run `nodeprobe -host tpstats` is any thread pool backing up >>> during the import? >>> >>> -Chris >>> >>> >>> 2010/4/26 Roland Hänel >>> >>> I have a cluster of 5 machines building a Cassandra datastore, and I load bulk data into this using the Java Thrift API. The first ~250GB runs fine, then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using and row or index caches, and since I only have 5 CF's and some 2,5 GB of RAM allocated to the JVM (-Xmx2500M), in theory, that should happen. All inserts are done with consistency level ALL. I hope with this I have avoided all the 'usual dummy errors' that lead to OOM's. I have begun to troubleshoot the issue with JMX, however, it's difficult to catch the JVM in the right moment because it runs well for several hours before this thing happens. One thing gets to my mind, maybe one of the experts could confirm or reject this idea for me: is it possible that when one machine slows down a little bit (for example because a big compaction is going on), the memtables don't get flushed to disk as fast as they are building up under the continuing bulk import? That would result in a downward spiral, the system gets slower and slower on disk I/O, but since more and more data arrives over Thrift, finally OOM. I'm using the "periodic" commit log sync, maybe also this could create a situation where the commit log writer is too slow to catch up with the data intake, resulting in ever growing memory usage? Maybe these thoughts are just bullshit. Let me now if so... ;-) >>> >> >
Re: value size, is there a suggested limit?
Hi Ahmed, Cassandra has a limit on how large a single value can be: the maximum size is 2^31-1 bytes. If you have more data than that, I suggest splitting it into several chunks. On Mon, Apr 26, 2010 at 3:19 AM, S Ahmed wrote: > Is there a suggested sized maximum that you can set the value of a given > key? > > e.g. could I convert a document to bytes and store it as a value to a key? > if yes, which I presume so, what if the file is 10mb? or 100mb? >
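A minimal sketch of the chunking idea (my own illustration, not from the original mail; insert_column and get_columns are hypothetical helpers standing in for whatever client calls you use, and the 1 MB chunk size is an arbitrary choice):

CHUNK_SIZE = 1024 * 1024  # 1 MB per column; pick something well below any hard limit

def store_blob(insert_column, row_key, data):
    # Split a large byte string across many columns in a single row.
    for i in range(0, len(data), CHUNK_SIZE):
        chunk_name = "chunk-%08d" % (i // CHUNK_SIZE)
        insert_column(row_key, column_name=chunk_name, value=data[i:i + CHUNK_SIZE])

def load_blob(get_columns, row_key):
    # Reassemble by concatenating the chunk columns in name order.
    columns = get_columns(row_key)  # expected: list of (column_name, value) pairs
    return b"".join(value for _, value in sorted(columns))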
Re: how to get apache cassandra version with thrift client ?
I know I can get the Thrift API version. However, I am writing a CLI for Cassandra in Python with readline support, and it will support one-key deploy/upgrade of cassandra+thrift on remote hosts, so I need to get the Apache Cassandra version to make sure it has deployed successfully. 2010/4/27 Jonathan Ellis > You can't get the Cassandra release version, but you can get the > Thrift api version, which is more useful. It's compiled as a constant > VERSION string in your client library. See the comments in > interface/cassandra.thrift. > > On Mon, Apr 26, 2010 at 8:14 PM, Shuge Lee wrote: > > Hi all: > > How to get apache cassandra version with thrift client ? > > Thanks for reply. > > > > -- > > Shuge Lee | Lee Li | 李蠡 > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com > -- Shuge Lee | Lee Li | 李蠡
Re: Cassandra cluster runs into OOM when bulk loading data
I'll work on doing more tests around this. In 0.5 we used a different data structure that required polling. But this does seem problematic. -Chris On Apr 26, 2010, at 7:04 PM, Eric Yu wrote: > I have the same problem here, and I analysised the hprof file with mat, as > you said, LinkedBlockQueue used 2.6GB. > I think the ThreadPool of cassandra should limit the queue size. > > cassandra 0.6.1 > > java version > $ java -version > java version "1.6.0_20" > Java(TM) SE Runtime Environment (build 1.6.0_20-b02) > Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) > > iostat > $ iostat -x -l 1 > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await svctm %util > sda 81.00 8175.00 224.00 17.00 23984.00 2728.00 221.68 > 1.011.86 0.76 18.20 > > tpstats, of coz, this node is still alive > $ ./nodetool -host localhost tpstats > Pool NameActive Pending Completed > FILEUTILS-DELETE-POOL 0 0 1281 > STREAM-STAGE 0 0 0 > RESPONSE-STAGE0 0 473617241 > ROW-READ-STAGE0 0 0 > LB-OPERATIONS 0 0 0 > MESSAGE-DESERIALIZER-POOL 0 0 718355184 > GMFD 0 0 132509 > LB-TARGET 0 0 0 > CONSISTENCY-MANAGER 0 0 0 > ROW-MUTATION-STAGE0 0 293735704 > MESSAGE-STREAMING-POOL0 0 6 > LOAD-BALANCER-STAGE 0 0 0 > FLUSH-SORTER-POOL 0 0 0 > MEMTABLE-POST-FLUSHER 0 0 1870 > FLUSH-WRITER-POOL 0 0 1870 > AE-SERVICE-STAGE 0 0 5 > HINTED-HANDOFF-POOL 0 0 21 > > > On Tue, Apr 27, 2010 at 3:32 AM, Chris Goffinet wrote: > Upgrade to b20 of Sun's version of JVM. This OOM might be related to > LinkedBlockQueue issues that were fixed. > > -Chris > > > 2010/4/26 Roland Hänel > Cassandra Version 0.6.1 > OpenJDK Server VM (build 14.0-b16, mixed mode) > Import speed is about 10MB/s for the full cluster; if a compaction is going > on the individual node is I/O limited > tpstats: caught me, didn't know this. I will set up a test and try to catch a > node during the critical time. > > Thanks, > Roland > > > 2010/4/26 Chris Goffinet > > Which version of Cassandra? > Which version of Java JVM are you using? > What do your I/O stats look like when bulk importing? > When you run `nodeprobe -host tpstats` is any thread pool backing up > during the import? > > -Chris > > > 2010/4/26 Roland Hänel > > I have a cluster of 5 machines building a Cassandra datastore, and I load > bulk data into this using the Java Thrift API. The first ~250GB runs fine, > then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using > and row or index caches, and since I only have 5 CF's and some 2,5 GB of RAM > allocated to the JVM (-Xmx2500M), in theory, that should happen. All inserts > are done with consistency level ALL. > > I hope with this I have avoided all the 'usual dummy errors' that lead to > OOM's. I have begun to troubleshoot the issue with JMX, however, it's > difficult to catch the JVM in the right moment because it runs well for > several hours before this thing happens. > > One thing gets to my mind, maybe one of the experts could confirm or reject > this idea for me: is it possible that when one machine slows down a little > bit (for example because a big compaction is going on), the memtables don't > get flushed to disk as fast as they are building up under the continuing bulk > import? That would result in a downward spiral, the system gets slower and > slower on disk I/O, but since more and more data arrives over Thrift, finally > OOM. 
> > I'm using the "periodic" commit log sync, maybe also this could create a > situation where the commit log writer is too slow to catch up with the data > intake, resulting in ever growing memory usage? > > Maybe these thoughts are just bullshit. Let me now if so... ;-) > > > > > >
error during snapshot
I was attempting to get a snapshot on our cassandra nodes. I get the following error every time I run nodetool ... snapshot. Exception in thread "main" java.io.IOException: Cannot run program "ln": java.io.IOException: error=12, Cannot allocate memory at java.lang.ProcessBuilder.start(ProcessBuilder.java:459) at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:221) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1060) at org.apache.cassandra.db.Table.snapshot(Table.java:256) at org.apache.cassandra.service.StorageService.takeAllSnapshot(StorageService.java:1005) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1426) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1264) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1359) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory at java.lang.UNIXProcess.(UNIXProcess.java:148) at java.lang.ProcessImpl.start(ProcessImpl.java:65) at java.lang.ProcessBuilder.start(ProcessBuilder.java:452) ... 34 more The nodes are both Amazon EC2 Large instances with 7.5G RAM (6 allocated for Java heap) with two cores and only 70G of data in casssandra. They have plenty of available RAM and HD space. Has anyone else run into this error? Lee Parker
Re: Cassandra use cases: as a datagrid ? as a distributed cache ?
great talk tonight in NYC I attended in regards to using Cassandra as a Lucene Index store (really great idea nicely implemented) http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-based-lucene-backend/ so Lucinda uses Cassandra as a distributed cache of indexes =8^) On Mon, Apr 26, 2010 at 9:47 PM, Jonathan Ellis wrote: > On Mon, Apr 26, 2010 at 9:04 AM, Dominique De Vito > wrote: >> (1) has anyone already used Cassandra as an in-memory data grid ? >> If no, does anyone know how far such a database is from, let's say, Oracle >> Coherence ? >> Does Cassandra provide, for example, a (synchronized) cache on the client >> side ? > > If you mean an in-process cache on the client side, no. > >> (2) has anyone already used Cassandra as a distributed cache ? >> Are there some testimonials somewhere about this use case ? > > That's basically what reddit is using it for. > http://blog.reddit.com/2010/03/she-who-entangles-men.html > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com > -- /* Joe Stein http://www.linkedin.com/in/charmalloc */
Re: Cassandra use cases: as a datagrid ? as a distributed cache ?
(sp) Lucandra http://github.com/tjake/Lucandra On Mon, Apr 26, 2010 at 11:08 PM, Joseph Stein wrote: > great talk tonight in NYC I attended in regards to using Cassandra as > a Lucene Index store (really great idea nicely implemented) > http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-based-lucene-backend/ > > so Lucinda uses Cassandra as a distributed cache of indexes =8^) > > > On Mon, Apr 26, 2010 at 9:47 PM, Jonathan Ellis wrote: >> On Mon, Apr 26, 2010 at 9:04 AM, Dominique De Vito >> wrote: >>> (1) has anyone already used Cassandra as an in-memory data grid ? >>> If no, does anyone know how far such a database is from, let's say, Oracle >>> Coherence ? >>> Does Cassandra provide, for example, a (synchronized) cache on the client >>> side ? >> >> If you mean an in-process cache on the client side, no. >> >>> (2) has anyone already used Cassandra as a distributed cache ? >>> Are there some testimonials somewhere about this use case ? >> >> That's basically what reddit is using it for. >> http://blog.reddit.com/2010/03/she-who-entangles-men.html >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cassandra >> co-founder of Riptano, the source for professional Cassandra support >> http://riptano.com >> > > > > -- > /* > Joe Stein > http://www.linkedin.com/in/charmalloc > */ > -- /* Joe Stein http://www.linkedin.com/in/charmalloc */