Re: Cassandra use cases: as a datagrid ? as a distributed cache ?

2010-04-26 Thread Joseph Stein
(sp) Lucandra http://github.com/tjake/Lucandra On Mon, Apr 26, 2010 at 11:08 PM, Joseph Stein wrote: > great talk tonight in NYC I attended in regards to using Cassandra as > a Lucene Index store (really great idea nicely implemented) > http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-bas

Re: Cassandra use cases: as a datagrid ? as a distributed cache ?

2010-04-26 Thread Joseph Stein
great talk tonight in NYC I attended in regards to using Cassandra as a Lucene Index store (really great idea nicely implemented) http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-based-lucene-backend/ so Lucinda uses Cassandra as a distributed cache of indexes =8^) On Mon, Apr 26, 2010 a

error during snapshot

2010-04-26 Thread Lee Parker
I was attempting to get a snapshot on our cassandra nodes. I get the following error every time I run nodetool ... snapshot. Exception in thread "main" java.io.IOException: Cannot run program "ln": java.io.IOException: error=12, Cannot allocate memory at java.lang.ProcessBuilder.start(ProcessBuil

Re: Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Chris Goffinet
I'll work on doing more tests around this. In 0.5 we used a different data structure that required polling. But this does seem problematic. -Chris On Apr 26, 2010, at 7:04 PM, Eric Yu wrote: > I have the same problem here, and I analysised the hprof file with mat, as > you said, LinkedBlockQu

Re: how to get apache cassandra version with thrift client ?

2010-04-26 Thread Shuge Lee
I know I can get the Thrift API version. However, I am writing a CLI for Cassandra in Python with readline support, and it will support one-key remote deploy/upgrade of cassandra+thrift, so I need to get the Apache Cassandra version to make sure it has deployed successfully. 2010/4/27 Jonathan Ellis > You can't

Re: value size, is there a suggested limit?

2010-04-26 Thread dir dir
Hi Ahmed, Cassandra has a limit on the size of a value stored in the database: the maximum is 2^31-1 bytes. If you have more than 2^31-1 bytes, I suggest splitting the data into several chunks. On Mon, Apr 26, 2010 at 3:19 AM, S Ahmed wrote: > Is there a suggested sized maximum that you can set the value of
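
The chunking suggested in this reply could look something like the following minimal Python sketch. The chunk size, the `chunk_NNNNNN` naming scheme and both helper names are illustrative assumptions, not part of any Cassandra client API; a real application would write each chunk as its own column (or row) through whatever client library it uses.

```python
from typing import Dict

# Hypothetical chunk size, chosen to sit far below the 2^31-1 byte limit.
CHUNK_SIZE = 1024 * 1024  # 1 MiB


def split_value(value: bytes, chunk_size: int = CHUNK_SIZE) -> Dict[str, bytes]:
    """Split a large value into ordered, named chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = {}
    for index, offset in enumerate(range(0, len(value), chunk_size)):
        # Zero-padded names sort lexicographically in chunk order.
        chunks["chunk_%06d" % index] = value[offset:offset + chunk_size]
    return chunks


def join_chunks(chunks: Dict[str, bytes]) -> bytes:
    """Reassemble the original value from its named chunks."""
    return b"".join(chunks[name] for name in sorted(chunks))
```

Zero-padding the chunk index keeps the names in order under a plain byte-wise or string comparison, so the chunks can be read back with a single ordered slice.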

Re: Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Eric Yu
I have the same problem here. I analyzed the hprof file with MAT and, as you said, LinkedBlockingQueue used 2.6GB. I think Cassandra's ThreadPool should limit the queue size. cassandra 0.6.1 java version $ java -version java version "1.6.0_20" Java(TM) SE Runtime Environment (build 1.6.0_20-b

Re: how to get apache cassandra version with thrift client ?

2010-04-26 Thread Jonathan Ellis
You can't get the Cassandra release version, but you can get the Thrift api version, which is more useful. It's compiled as a constant VERSION string in your client library. See the comments in interface/cassandra.thrift. On Mon, Apr 26, 2010 at 8:14 PM, Shuge Lee wrote: > Hi all: > How to get

Re: cassandra 0.5.1 java.lang.OutOfMemoryError: Java heap space issue

2010-04-26 Thread Jonathan Ellis
0.5 has a bug that allows it to OOM itself from replaying the log too fast. You should upgrade to 0.6.1. On Mon, Apr 26, 2010 at 12:11 PM, elsif wrote: > > Hello.  I have a six node cassandra cluster running on modest hardware > with 1G of heap assigned to cassandra.  After inserting about 245 >

Re: Cassandra use cases: as a datagrid ? as a distributed cache ?

2010-04-26 Thread Jonathan Ellis
On Mon, Apr 26, 2010 at 9:04 AM, Dominique De Vito wrote: > (1) has anyone already used Cassandra as an in-memory data grid ? > If no, does anyone know how far such a database is from, let's say, Oracle > Coherence ? > Does Cassandra provide, for example, a (synchronized) cache on the client > sid

Re: Cassandra reverting deletes?

2010-04-26 Thread Jonathan Ellis
How are you checking that the rows are gone? Are you experiencing node outages during this? DC_QUORUM is unfinished code right now, you should avoid using it. Can you reproduce with normal QUORUM? On Sat, Apr 24, 2010 at 12:23 PM, Joost Ouwerkerk wrote: > I'm having trouble deleting rows in Cas

Re: Super and Regular Columns

2010-04-26 Thread Jonathan Ellis
On Fri, Apr 23, 2010 at 3:32 PM, Robert wrote: > I am starting out with Cassandra and I had a couple of questions, I read a > lot of the documentation including: > http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model > First I wanted to make sure I understand this > bug: http://issues.apa

how to get apache cassandra version with thrift client ?

2010-04-26 Thread Shuge Lee
Hi all: How to get apache cassandra version with thrift client ? Thanks for reply. -- Shuge Lee | Lee Li | 李蠡

Re: Quorom consistency in a changing ring

2010-04-26 Thread Benjamin Black
Live nodes that have tokens indicating they should receive a copy of data count towards write quorum. This means if a node is down (not decommissioned) the copy sent to the node acting as the hinted handoff replica will not count towards achieving quorum. If a token is moved, it is moved. It is
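
The rule described in this reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, quorum is taken as a majority of the replication factor (floor(RF / 2) + 1), and the caller passes one liveness flag per natural replica. Hinted-handoff targets are deliberately not in that list, since per the message above they do not count toward quorum.

```python
from typing import Iterable


def quorum_size(replication_factor: int) -> int:
    """Majority of the replicas: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def write_reaches_quorum(natural_replica_is_live: Iterable[bool],
                         replication_factor: int) -> bool:
    """Return True if enough natural replicas are live to satisfy quorum.

    Only nodes whose tokens make them owners of the data are listed;
    a node merely holding a hint for a down replica is not counted.
    """
    live = sum(1 for is_live in natural_replica_is_live if is_live)
    return live >= quorum_size(replication_factor)
```

With a replication factor of 3, two live natural replicas are enough for a quorum write, while one live replica plus a hinted-handoff node is not.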

Re: ORM in Cassandra?

2010-04-26 Thread Paul Bohm
I call tragedy a 'Cassandra Object Abstraction' (COA), because I try to write a reusable implementation of patterns that are commonly used for Cassandra data modeling. E.g. using TimeUUID columns for storing an Index is a pattern. Then various strategies to partition these Indexes are another pattern.

Re: Quorom consistency in a changing ring

2010-04-26 Thread Peter Schüller
> Increasing the replication level is known to break it. Thanks! Yes, of that I am aware. When I said ring changes I meant nodes being added and removed, or just re-balanced, implying tokens moving around the ring. -- / Peter Schuller aka scode

Re: Quorom consistency in a changing ring

2010-04-26 Thread David Timothy Strauss
Increasing the replication level is known to break it. --Original Message-- From: Peter Schuller Sender: sc...@scode.org To: user@cassandra.apache.org ReplyTo: user@cassandra.apache.org Subject: Quorom consistency in a changing ring Sent: Apr 26, 2010 21:55 Hello, Is my interpretation co

Quorom consistency in a changing ring

2010-04-26 Thread Peter Schuller
Hello, Is my interpretation correct that Cassandra is intended to guarantee quorum consistency (overlapping read/write sets) at all times, including a ring that is actively changing? I.e., there are no (intended) cases where quorum consistency is defeated due to writes or reads going to nodes that

Re: strange get_range_slices behaviour v0.6.1

2010-04-26 Thread aaron
I've broken this case down further to some Python code that works against the Thrift-generated client and am still getting the same odd results. With keys object1, object2 and object3, an open-ended get_range_slices starting with "object1" only returns object1 and 2. I'm guessing that I've got so

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Paul Prescod
On Mon, Apr 26, 2010 at 2:15 PM, Anthony Molinaro wrote: > I think it might be worst case that you read all the disks. If your > block size is large enough to hold an entire row, you should only have to > read one disk to get that data. And conversely, for a large enough row you might benefit fro

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Anthony Molinaro
I think it might be worst case that you read all the disks. If your block size is large enough to hold an entire row, you should only have to read one disk to get that data. I, for instance, stopped using multiple data directories and instead use a RAID0. The number of blocks read is not the same

Re: How to generate 'unique' identifiers for use in Cassandra

2010-04-26 Thread Miguel Verde
http://wiki.apache.org/cassandra/UUID if you don't need transactional ordering, ZooKeeper or something comparable if you do. 2010/4/26 Roland Hänel > Typically, in the SQL world we use things like AUTO_INCREMENT columns that > let us create a unique key automatically if a row is inserted into a
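
As a rough illustration of the UUID approach recommended here, Python's standard `uuid` module can generate both flavours. The wrapper function names are hypothetical; `uuid.uuid1()` produces a time-based (version 1) UUID and `uuid.uuid4()` a random (version 4) one. Neither gives the cross-client transactional ordering for which the reply suggests ZooKeeper.

```python
import uuid


def new_time_uuid() -> uuid.UUID:
    """Time-based (version 1) UUID: embeds a timestamp plus node/clock bits."""
    return uuid.uuid1()


def new_random_uuid() -> uuid.UUID:
    """Random (version 4) UUID: no ordering, collisions are very unlikely."""
    return uuid.uuid4()


if __name__ == "__main__":
    key = new_time_uuid()
    # The 16 raw bytes are what would be stored as a key or column name.
    print(key, len(key.bytes))
```

A version 1 UUID is usually the better fit when identifiers should also sort roughly by creation time; a version 4 UUID is enough when only uniqueness matters.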

How to generate 'unique' identifiers for use in Cassandra

2010-04-26 Thread Roland Hänel
Typically, in the SQL world we use things like AUTO_INCREMENT columns that let us create a unique key automatically if a row is inserted into a table. What do you guys usually do to create identifiers for use in Cassandra? Do we only rely on "currentTimeMills() + random()" to create something tha

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Roland Hänel
RAID0 decreases the performance of multiple, concurrent random reads because for each read request (I assume that at least a couple of stripe sizes are read), all hard disks are involved in that read. Consider the following example: you want to read 1MB out of each of two files a) both files are o

Announcing Riptano professional Cassandra support and services

2010-04-26 Thread Jonathan Ellis
Short version: Matt Pfeil and I have founded http://riptano.com to provide production Cassandra support, training, and professional services. Yes, we're hiring. Long version: http://spyced.blogspot.com/2010/04/and-now-for-something-completely.html We're happy to answer questions on- or off-list

Re: The Difference Between Cassandra and HBase

2010-04-26 Thread Masood Mortazavi
On Sat, Apr 24, 2010 at 10:20 AM, dir dir wrote: > In general what is the difference between Cassandra and HBase?? > > Thanks. > Others have already said it ... Cassandra has a peer architecture, with all peers being essentially equivalent (minus the concept of a "seed," as far as I can tell).

Re: Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Roland Hänel
Thanks Chris 2010/4/26 Chris Goffinet > Upgrade to b20 of Sun's version of JVM. This OOM might be related to > LinkedBlockingQueue issues that were fixed. > > -Chris > > > 2010/4/26 Roland Hänel > >> Cassandra Version 0.6.1 >> OpenJDK Server VM (build 14.0-b16, mixed mode) >> Import speed is about

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Paul Prescod
2010/4/26 Roland Hänel : > Ryan, I agree with you on the hot spots, however for the physical disk > performance, even the worst case hot spot is not worse than RAID0: in a hot > spot scenario, it might be that 90% of your reads go to one hard drive. But > with RAID0, 100% of your reads will go to *

Re: Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Chris Goffinet
Upgrade to b20 of Sun's version of JVM. This OOM might be related to LinkedBlockingQueue issues that were fixed. -Chris 2010/4/26 Roland Hänel > Cassandra Version 0.6.1 > OpenJDK Server VM (build 14.0-b16, mixed mode) > Import speed is about 10MB/s for the full cluster; if a compaction is going >

Re: Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Roland Hänel
Cassandra Version 0.6.1 OpenJDK Server VM (build 14.0-b16, mixed mode) Import speed is about 10MB/s for the full cluster; if a compaction is going on the individual node is I/O limited tpstats: caught me, didn't know this. I will set up a test and try to catch a node during the critical time. Than

Re: ORM in Cassandra?

2010-04-26 Thread Ethan Rowe
On 04/26/2010 03:11 PM, Tatu Saloranta wrote: On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: On 04/26/2010 01:26 PM, Isaac Arias wrote: On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: ... In my opinion, a mapping solution for Cassandra should be more like a Template. Something

Re: Question about TimeUUIDType

2010-04-26 Thread Tatu Saloranta
On Sun, Apr 25, 2010 at 5:43 PM, Jonathan Ellis wrote: > On Sun, Apr 25, 2010 at 5:40 PM, Tatu Saloranta wrote: >>> Now with TimeUUIDType, if two UUID have the same timestamps, they are >>> ordered >>> by bytes order. >> >> Naively for the whole UUID? That would not be good, given that >> timest

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Roland Hänel
Ryan, I agree with you on the hot spots, however for the physical disk performance, even the worst case hot spot is not worse than RAID0: in a hot spot scenario, it might be that 90% of your reads go to one hard drive. But with RAID0, 100% of your reads will go to *all* hard drives. But you're rig

Re: Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Chris Goffinet
Which version of Cassandra? Which version of Java JVM are you using? What do your I/O stats look like when bulk importing? When you run `nodeprobe -host tpstats` is any thread pool backing up during the import? -Chris 2010/4/26 Roland Hänel > I have a cluster of 5 machines building a Cass

Cassandra cluster runs into OOM when bulk loading data

2010-04-26 Thread Roland Hänel
I have a cluster of 5 machines building a Cassandra datastore, and I load bulk data into this using the Java Thrift API. The first ~250GB runs fine, then, one of the nodes starts to throw OutOfMemory exceptions. I'm not using any row or index caches, and since I only have 5 CF's and some 2,5 GB of

Re: ORM in Cassandra?

2010-04-26 Thread Tatu Saloranta
On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: > On 04/26/2010 01:26 PM, Isaac Arias wrote: >> >> On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: >> ... >> In my opinion, a mapping solution for Cassandra should be more like a >> Template. Something that helps map (back and forth) rows to

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Ryan King
2010/4/26 Roland Hänel : > Hm... I understand that RAID0 would help to create a bigger pool for > compactions. However, it might impact read performance: if I have several > CF's (with their SSTables), random read requests for the CF files that are > on separate disks will behave nicely - however i

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Roland Hänel
Hm... I understand that RAID0 would help to create a bigger pool for compactions. However, it might impact read performance: if I have several CF's (with their SSTables), random read requests for the CF files that are on separate disks will behave nicely - however if it's RAID0 then a random read o

Re: range get over subcolumns on supercolumn family

2010-04-26 Thread Rafael Ribeiro
Just found the way... keyRange start and end key will be the same and instead of specifying the count and start on KeyRange it has to be specified on SliceRange and then keySlices will come with a single key and a list of columns... 2010/4/25 Rafael Ribeiro > Hi all! > > I am trying to do a p

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/CassandraHardware On Mon, Apr 26, 2010 at 1:06 PM, Edmond Lau wrote: > Ryan - > > You (or maybe someone else) mentioned using RAID-0 instead of multiple > data directories at the Cassandra hackathon as well.  Could you > explain the motivation behind that? > > Tha

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Edmond Lau
Ryan - You (or maybe someone else) mentioned using RAID-0 instead of multiple data directories at the Cassandra hackathon as well. Could you explain the motivation behind that? Thanks, Edmond On Mon, Apr 26, 2010 at 9:53 AM, Ryan King wrote: > I would recommend using RAID-0 rather than multipl

Re: ORM in Cassandra?

2010-04-26 Thread banks
The real tragedy is that we have not created a new acronym for this yet... OKVM... it makes more sense... On Mon, Apr 26, 2010 at 10:35 AM, Ethan Rowe wrote: > On 04/26/2010 01:26 PM, Isaac Arias wrote: > >> On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: >> >> >> >>> Clearly Cassandra is

Re: ORM in Cassandra?

2010-04-26 Thread Ethan Rowe
On 04/26/2010 01:26 PM, Isaac Arias wrote: On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: Clearly Cassandra is not an RDBMS. The intent of my Hibernate reference was to be more lyrical. Sorry if that didn't come through. Nonetheless, the need remains to relieve ourselves

Re: ORM in Cassandra?

2010-04-26 Thread Jeff Hodges
There is, of course, also cassandra_object on the ruby side. I assume this thread has the implicit requirement of Java, though. -- Jeff On Mon, Apr 26, 2010 at 10:26 AM, Isaac Arias wrote: > On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: > >> Clearly Cassandra is not an RDBMS.  The intent o

Re: ORM in Cassandra?

2010-04-26 Thread Isaac Arias
On Apr 26, 2010, at 12:13 PM, Geoffry Roberts wrote: > Clearly Cassandra is not an RDBMS. The intent of my Hibernate > reference was to be more lyrical. Sorry if that didn't come through. > Nonetheless, the need remains to relieve ourselves from excessive > boilerplate coding. I agree with eli

Cassandra Job in Pasadena

2010-04-26 Thread Anthony Molinaro
Hi, OpenX is looking for someone to work fulltime on Cassandra, we're located in Pasadena, CA. Here's a link to the job description http://www.openx.org/jobs/position/software-engineer-infrastructure We've been running cassandra in production since 0.3.0, and currently have 3 cassandra cluste

cassandra 0.5.1 java.lang.OutOfMemoryError: Java heap space issue

2010-04-26 Thread elsif
Hello. I have a six node cassandra cluster running on modest hardware with 1G of heap assigned to cassandra. After inserting about 245 million rows of data, cassandra failed with a java.lang.OutOfMemoryError: Java heap space error. I raised the java heap to 2G, but still get the same error when

Re: How do you construct an index and use it, especially in Ruby

2010-04-26 Thread Ryan King
On Sun, Apr 25, 2010 at 11:14 AM, Bob Hutchison wrote: > > Hi, > > I'm new to Cassandra and trying to work out how to do something that I've > implemented any number of times (e.g. TokyoCabinet, Perst, even the > filesystem using grep :-) I've managed to get some of this working in > Cassandra

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Ryan King
I would recommend using RAID-0 rather than multiple data directories. -ryan 2010/4/26 Roland Hänel : > I have a configuration like this: > >   >   /storage01/cassandra/data >   /storage02/cassandra/data >   /storage03/cassandra/data >   > > After loading a big chunk of data into cas

Re: ORM in Cassandra?

2010-04-26 Thread Geoffry Roberts
Clearly Cassandra is not an RDBMS. The intent of my Hibernate reference was to be more lyrical. Sorry if that didn't come through. Nonetheless, the need remains to relieve ourselves from excessive boilerplate coding. On Mon, Apr 26, 2010 at 9:00 AM, Ned Wolpert wrote: > I don't think you are t

Re: Is SuperColumn necessary?

2010-04-26 Thread Jonathan Ellis
I think that once we have built-in indexing (CASSANDRA-749) you can make a good case for dropping supercolumns (at least, dropping them from the public API and reserving them for internal use). On Mon, Apr 26, 2010 at 11:05 AM, Schubert Zhang wrote: > I don't think the SuperColumn is so necessary

Re: Does anybody work about transaction on cassandra ?

2010-04-26 Thread Cagatay Kavukcuoglu
Better fault tolerance? Scalability to large data volumes? A combination of ZooKeeper based transactions and Cassandra may have better characteristics than RDBMS on these criteria. There's no question that trade-offs are involved, but as far as these issues are concerned, you'd be starting from

Is SuperColumn necessary?

2010-04-26 Thread Schubert Zhang
I don't think the SuperColumn is so necessary. I think this level of logic can be left to the application. Do you think so? If SuperColumn is needed, as https://issues.apache.org/jira/browse/CASSANDRA-598, we should build index in SuperColumns level and SubColumns level. Thus, the levels of index

Re: ORM in Cassandra?

2010-04-26 Thread Ned Wolpert
I don't think you are trying to convert Cassandra to an RDBMS with what you want. The issue is that finding a way to map these objects to Cassandra in a meaningful way is hard. It's not as easy as saying 'do what hibernate does' simply because it's not an RDBMS... but it is a reasonable and useful goa

Re: Trying To Understand get_range_slices Results When Using RandomPartitioner

2010-04-26 Thread Schubert Zhang
RandomPartitioner is for row-keys. #1 no #2 yes #3 yes On Sat, Apr 24, 2010 at 4:33 AM, Larry Root wrote: > I'm trying to better understand how using the RandomPartitioner will affect > my ability to select ranges of keys. Consider my simple example where we > have many online games across differ

Re: ORM in Cassandra?

2010-04-26 Thread Geoffry Roberts
I am going to agree with axQd. Having something that does for Cassandra what, say, Hibernate does for RDBMS seems an effort well worth pursuing. I have some complex object graphs written in Java. If I could annotate them and get persistence with a well laid out schema, it would be good. On Mon, A

Re: running cassandra as a service on windows

2010-04-26 Thread Antonio Alvarado Hernández
Hi all, Have you tried Tanuki's Java Wrapper? It's so easy to deploy in Windows... -aah 2010/4/23, Miguel Verde : > https://issues.apache.org/jira/browse/CASSANDRA-292 points to > http://commons.apache.org/daemon/procrun.html which is used by other Apache > software to implement Windows servic

Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token

2010-04-26 Thread Jonathan Ellis
this is what IPartitioner does On Mon, Apr 26, 2010 at 10:16 AM, Schubert Zhang wrote: > Hi Jonathan Ellis and Stu Hood, > > I think, finally, we should provide a user customizable key abstract class. > User can define what types of key and its class, which define how to compare > keys. > > Schub

Re: ORM in Cassandra?

2010-04-26 Thread Schubert Zhang
I think you should forget these RDBMS tech. On Sat, Apr 24, 2010 at 11:00 AM, aXqd wrote: > On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert > wrote: > > There is nothing wrong with what you are asking. Some work has been done > to > > get an ORM layer ontop of cassandra, for example, with a RubyO

Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token

2010-04-26 Thread Schubert Zhang
Hi Jonathan Ellis and Stu Hood, I think, finally, we should provide a user customizable key abstract class. User can define what types of key and its class, which define how to compare keys. Schubert On Sat, Apr 24, 2010 at 1:16 PM, Stu Hood wrote: > Your keys cannot be an encoded as binary fo

RE: Does anybody work about transaction on cassandra ?

2010-04-26 Thread Mark Jones
Orthogonal in this case means "at cross purposes" Transactions can't really be done with eventual consistency because all nodes don't have all the info at the time the transaction is done. I think they recommend zookeeper for this kind of stuff, but I don't know why you want to use Cassandra v

Re: newbie question on how columns names are indexed/lucene limitations?

2010-04-26 Thread Schubert Zhang
The column index in a row is a sorted-blocked index (like b-tree), just like bigtable. On Mon, Apr 26, 2010 at 2:43 AM, Stu Hood wrote: > The indexes within rows are _not_ implemented with Lucene: there is a > custom index structure that allows for random access within a row. But, you > should p

Re: MapReduce, Timeouts and Range Batch Size

2010-04-26 Thread Jonathan Ellis
OPP will be marginally faster. Maybe 10%? I don't think anyone has benchmarked it. On Fri, Apr 23, 2010 at 10:30 AM, Joost Ouwerkerk wrote: > In that case I should probably wait for 0.7.  Is there any fundamental > performance difference in get_range_slices between Random and > Order-Preserving

Cassandra use cases: as a datagrid ? as a distributed cache ?

2010-04-26 Thread Dominique De Vito
Hi, Cassandra comes closer and closer to a data grid like Oracle Coherence: Cassandra includes distributed "hash maps", partitioning, high availability, map/reduce processing, (some) request capability, etc. So, I am wondering about the 2 following (and possible ?) Cassandra's use cases :

Re: value size, is there a suggested limit?

2010-04-26 Thread Schubert Zhang
I think that is not what Cassandra is good at. On Mon, Apr 26, 2010 at 4:22 AM, Mark Greene wrote: > http://wiki.apache.org/cassandra/CassandraLimitations > > > On Sun, Apr 25, 2010 at 4:19 PM, S Ahmed wrote: > >> Is there a suggested sized maximum that you can set the value of a given >> key? >>

Re: when i use the OrderPreservingPartition, the load is very imbalance

2010-04-26 Thread Lucas Di Pentima
Hello Mark, On 26/04/2010, at 07:17, Mark Robson wrote: > I think the solution to this would be to choose your nodes' tokens wisely > before you start inserting data, and if possible, modify the keys to split > them better between the nodes. > > For example, if your key has two parts, on

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Roland Hänel
Thanks very much. Precisely answers my questions. :-) 2010/4/26 Schubert Zhang > Please refer the code: > > org.apache.cassandra.db.ColumnFamilyStore > > public String getFlushPath() > { > long guessedSize = 2 * DatabaseDescriptor.getMemtableThroughput() * > 1024*1024; // 2* adds

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Schubert Zhang
Please refer the code: org.apache.cassandra.db.ColumnFamilyStore public String getFlushPath() { long guessedSize = 2 * DatabaseDescriptor.getMemtableThroughput() * 1024*1024; // 2* adds room for keys, column indexes String location = DatabaseDescriptor.getDataFileLocationF

Re: Re: when i use the OrderPreservingPartition, the load is veryimbalance

2010-04-26 Thread Bingbing Liu
Thank you so much for your help! 2010-04-26 Bingbing Liu From: Mark Robson Sent: 2010-04-26 18:17:53 To: user Cc: Subject: Re: when i use the OrderPreservingPartition, the load is veryimbalance On 26 April 2010 01:18, 刘兵兵 wrote: i do some INSERT ,because i will do some scan operation

Re: when i use the OrderPreservingPartition, the load is very imbalance

2010-04-26 Thread Schubert Zhang
When starting your Cassandra cluster, please configure the InitialToken for each node so that the key ranges are balanced. On Mon, Apr 26, 2010 at 6:17 PM, Mark Robson wrote: > On 26 April 2010 01:18, 刘兵兵 wrote: > >> i do some INSERT ,because i will do some scan operations, i use the >> OrderPres
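
One way to pick balanced InitialToken values for an order-preserving partitioner is to take them from a sorted sample of the keys you expect to insert. The Python sketch below is only an illustration under several assumptions: the function name is hypothetical, the sample is assumed to be representative of the real key distribution, and each node is assumed to own the key range ending at its own token.

```python
from typing import List, Sequence


def pick_initial_tokens(sample_keys: Sequence[str], node_count: int) -> List[str]:
    """Choose one token per node so each range holds a similar share of keys."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    keys = sorted(sample_keys)
    if len(keys) < node_count:
        raise ValueError("need at least one sample key per node")
    # Node i receives the last key of the i-th equal-sized slice of the sample.
    return [keys[((i + 1) * len(keys)) // node_count - 1]
            for i in range(node_count)]


if __name__ == "__main__":
    sample = ["user%04d" % n for n in range(100)]
    for node, token in enumerate(pick_initial_tokens(sample, 4)):
        print("node %d: InitialToken = %s" % (node, token))
```

If the real keys drift away from the sample over time, the ranges become unbalanced again and tokens have to be moved, as other replies in this thread describe.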

Re: when i use the OrderPreservingPartition, the load is very imbalance

2010-04-26 Thread Mark Robson
On 26 April 2010 01:18, 刘兵兵 wrote: > i do some INSERT ,because i will do some scan operations, i use the > OrderPreservingPartition method. > > the state of the cluster is showed below. > > as i predicated the load is very imbalance I think the solution to this would be to choose your nodes' t

Re: when i use the OrderPreservingPartition, the load is very imbalance

2010-04-26 Thread Roland Hänel
sorry, if specifying the token manually, use: bin/nodetool -h move 2010/4/26 Roland Hänel > 1) you can re-balance a node with > > bin/nodetool -h token [] > > specify a new token manually or let the system guess one. > > 2) take a look into your system.log to find out why your nodes

Re: how to store file in the cassandra?

2010-04-26 Thread Mark Robson
On 26 April 2010 00:57, Shuge Lee wrote: > In Python: > > keyspace.columnfamily[key][column] = value > > files.video[uuid.uuid4()]['name'] = 'foo.flv' > files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv' > Hi. Storing the filename in the database will not solve the file storage problem. C

Re: when i use the OrderPreservingPartition, the load is very imbalance

2010-04-26 Thread Roland Hänel
1) you can re-balance a node with bin/nodetool -h token [] specify a new token manually or let the system guess one. 2) take a look into your system.log to find out why your nodes are dying. 2010/4/26 刘兵兵 > i do some INSERT ,because i will do some scan operations, i use the > OrderPres

Re: how to store file in the cassandra?

2010-04-26 Thread dir dir
Hi Jonathan, Cassandra does not seem to have a Blob data type. To handle binary large object data, we have to use a byte array. I have a question for you. Suppose I have a 15 MB MPEG video file. To save this video file into a Cassandra database, I would store it as a byte array. One day, I feel thi

Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Roland Hänel
I have a configuration like this: /storage01/cassandra/data /storage02/cassandra/data /storage03/cassandra/data After loading a big chunk of data into cassandra, I end up with some 70GB in the first directory, and only about 10GB in the second and third one. All rows are q