strange get_range_slices behaviour v0.6.1

2010-04-25 Thread aaron
I've been looking at the get_range_slices feature and have found some odd behaviour I do not understand. Basically the keys returned in a range query do not match what I would expect to see. I think it may have something to do with the ordering of keys that I don't know about, but I'm just guessin

Re: Re: how to store file in the cassandra?

2010-04-25 Thread Bingbing Liu
Thanks. 2010-04-26 Bingbing Liu From: Jonathan Ellis Sent: 2010-04-26 09:29:28 To: user Cc: Subject: Re: how to store file in the cassandra? Cassandra stores byte arrays. You can certainly store file data in it, although if it is larger than a few MB you should chunk it into multipl

Re: how to store file in the cassandra?

2010-04-25 Thread Jonathan Ellis
Cassandra stores byte arrays. You can certainly store file data in it, although if it is larger than a few MB you should chunk it into multiple columns. On Sun, Apr 25, 2010 at 8:21 PM, Shuge Lee wrote: > Yes. > > Cassandra does save raw string data only, not a file, and shouldn't save a > file.
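
The chunking advice above can be sketched in Python. This is a minimal illustration only: the 1 MB chunk size, the "chunk-NNNNNNNN" column naming scheme, and both function names are assumptions made for this example, and no Cassandra client call is shown. The resulting dictionary stands in for the columns of a single row.

```python
def chunk_bytes(data: bytes, chunk_size: int = 1024 * 1024) -> dict:
    """Split data into fixed-size chunks keyed by zero-padded column names.

    The column naming scheme is hypothetical; zero padding keeps the
    names in chunk order when they are sorted as strings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        f"chunk-{offset // chunk_size:08d}": data[offset:offset + chunk_size]
        for offset in range(0, len(data), chunk_size)
    }


def join_chunks(columns: dict) -> bytes:
    """Reassemble the original bytes by concatenating chunks in name order."""
    return b"".join(columns[name] for name in sorted(columns))
```

Each value in the returned dictionary would be written as one column of the row that represents the file, and join_chunks reverses the process after the columns are read back.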

Re: how to store file in the cassandra?

2010-04-25 Thread Shuge Lee
Yes. Cassandra does save raw string data only, not a file, and shouldn't save a file. 2010/4/26 刘兵兵 > sorry i'm not very familiar with python, are you meaning that the files are > stored in the file system of the os? > > then , the cassandra just stores the path to access the files? > > > > On

when i use the OrderPreservingPartition, the load is very imbalance

2010-04-25 Thread 刘兵兵
I am doing some inserts; because I will also do scan operations, I use the OrderPreservingPartitioner. The state of the cluster is shown below. As I predicted, the load is very imbalanced, and some of the nodes are down (on some nodes the Cassandra processes died, and on others the processes are alive

Re: how to store file in the cassandra?

2010-04-25 Thread 刘兵兵
Sorry, I'm not very familiar with Python. Do you mean that the files are stored in the file system of the OS, and that Cassandra just stores the path used to access them? On Mon, Apr 26, 2010 at 8:57 AM, Shuge Lee wrote: > In Python: > > keyspace.columnfamily[key][column] = value > > file

Re: how to store file in the cassandra?

2010-04-25 Thread Shuge Lee
In Python: keyspace.columnfamily[key][column] = value files.video[uuid.uuid4()]['name'] = 'foo.flv' files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv' create a mapping files.video = { uuid.uuid4() : { 'name' : 'foo.flv', 'path' : '/var/files/foo.flv', } } if most
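
The mapping described above can be modelled with plain dictionaries. This is only a sketch: it does not use a Cassandra client, and the names video and store_file_reference are hypothetical. It generates the row key once and reuses it, since each call to uuid.uuid4() returns a different value and two separate calls would put the name and the path in different rows.

```python
import uuid
from collections import defaultdict

# Plain-dict stand-in for a column family: row key -> {column name: value}.
video = defaultdict(dict)


def store_file_reference(column_family, name, path):
    """Store a file's name and path under one freshly generated row key."""
    key = uuid.uuid4()  # generated once so both columns share the same row
    column_family[key]["name"] = name
    column_family[key]["path"] = path
    return key


key = store_file_reference(video, "foo.flv", "/var/files/foo.flv")
```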

how to store file in the cassandra?

2010-04-25 Thread Bingbing Liu
any suggestion? 2010-04-26 Bingbing Liu

Re: Question about TimeUUIDType

2010-04-25 Thread Jonathan Ellis
On Sun, Apr 25, 2010 at 5:40 PM, Tatu Saloranta wrote: >> Now with TimeUUIDType, if two UUID have the same timestamps, they are ordered >> by bytes order. > > Naively for the whole UUID? That would not be good, given that > timestamp within UUID is not stored in expected lexical order, but > with
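
The point about lexical order can be checked with Python's standard uuid module. In a version-1 UUID the low 32 bits of the timestamp come first in the byte layout, so comparing raw bytes does not always agree with comparing timestamps. The helper uuid1_from_timestamp is a hypothetical function written for this illustration (clock sequence and node are fixed to zero), and the sketch says nothing about how Cassandra's TimeUUIDType itself compares values.

```python
import uuid


def uuid1_from_timestamp(ts: int) -> uuid.UUID:
    """Build a version-1 UUID from a 60-bit timestamp.

    Clock sequence and node are fixed to zero purely for illustration.
    """
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, 0))


# Two timestamps one tick apart, straddling a carry out of the low 32 bits.
earlier = uuid1_from_timestamp(0x00000000FFFFFFFF)
later = uuid1_from_timestamp(0x0000000100000000)

by_time = sorted([earlier, later], key=lambda u: u.time)
by_bytes = sorted([earlier, later], key=lambda u: u.bytes)
```

Here by_time keeps the chronological order, while by_bytes places the later UUID first, because its leading time_low field is all zeros.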

Re: Question about TimeUUIDType

2010-04-25 Thread Tatu Saloranta
On Sat, Apr 24, 2010 at 2:08 AM, Sylvain Lebresne wrote: > On Sat, Apr 24, 2010 at 12:53 AM, Jesse McConnell > wrote: >> try LexicalUUIDType, that will distinguish the secs correctly >> >> imo based on the existing impl (last I checked at least) TimeUUIDType >> was equivalent to LongType > > It u

Re: Lucandra - Lucene/Solr on Cassandra: April 26, NYC

2010-04-25 Thread Utku Can Topçu
Could you please publish the talk somewhere after it has been given? Best Regards, Utku On Thu, Apr 22, 2010 at 6:51 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Hello folks, > > Those of you in or near NYC and using Lucene or Solr should come to > "Lucandra - a Cassandra-based backen

range get over subcolumns on supercolumn family

2010-04-25 Thread Rafael Ribeiro
Hi all! I am trying to do a paginated query on the subcolumns of a super column family, but honestly I am a little confused. I have already been able to do a range query, but only over the keys of a regular column family. For the keys case I've been able to do so using the code below:

Re: value size, is there a suggested limit?

2010-04-25 Thread Mark Greene
http://wiki.apache.org/cassandra/CassandraLimitations On Sun, Apr 25, 2010 at 4:19 PM, S Ahmed wrote: > Is there a suggested sized maximum that you can set the value of a given > key? > > e.g. could I convert a document to bytes and store it as a value to a key? > if yes, which I presume so, wh

value size, is there a suggested limit?

2010-04-25 Thread S Ahmed
Is there a suggested maximum size for the value stored under a given key? E.g., could I convert a document to bytes and store it as the value for a key? If yes, which I presume is the case, what if the file is 10 MB? Or 100 MB?

RE: newbie question on how columns names are indexed/lucene limitations?

2010-04-25 Thread Stu Hood
The indexes within rows are _not_ implemented with Lucene: there is a custom index structure that allows for random access within a row. But, you should probably read http://wiki.apache.org/cassandra/CassandraLimitations to understand the current limitations of the file format, some of which are

Re: Cassandra - Thread leak when high concurrent load

2010-04-25 Thread Brandon Williams
On Sun, Apr 25, 2010 at 12:09 PM, JKnight JKnight wrote: > Thanks Robson, > > The number of thread gradually increase to 7000. And the server hang up. > I know threadpool is used to prevent creating large number of thread. > > So why Cassandra create large number of thread when high concurrent loa

How do you construct an index and use it, especially in Ruby

2010-04-25 Thread Bob Hutchison
Hi, I'm new to Cassandra and trying to work out how to do something that I've implemented any number of times (e.g. TokyoCabinet, Perst, even the filesystem using grep :-) I've managed to get some of this working in Cassandra but not all. So here's the core of the situation. I have this opaq

Re: Cassandra - Thread leak when high concurrent load

2010-04-25 Thread JKnight JKnight
Thanks Robson, The number of threads gradually increases to 7000, and then the server hangs. I know a thread pool is used to prevent the creation of a large number of threads. So why does Cassandra create a large number of threads under high concurrent load? On Sun, Apr 25, 2010 at 5:38 PM, Mark Robson wrote: > > > On 2

newbie question on how columns names are indexed/lucene limitations?

2010-04-25 Thread TuX RaceR
Hello Cassandra Users, When using the RandomPartitioner and a simple ColumnFamily/Columns (i.e. no SuperColumns), my understanding is that one single row can store millions of columns. If I look at the http://wiki.apache.org/cassandra/API, I understand that I can get a subset of the millions o

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joseph Stein
it is kind of the classic distinction between OLTP & OLAP. Cassandra is to OLTP as HBase is to OLAP (for those SAT nutz). Both are useful and valuable in their own right, agreed. On Sun, Apr 25, 2010 at 12:20 PM, Jeff Hodges wrote: > HBase is awesome when you need high throughput and don't care

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joe Stump
On Apr 25, 2010, at 5:18 PM, Eric Hauser wrote: > Out of curiosity, are you planning on copying the data you store in > HBase/Hive into separate Hadoop cluster in a different data center or backing > up HDFS in some other manner? Redundancy isn't an issue within the cluster; > it's more a con

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Jeff Hodges
HBase is awesome when you need high throughput and don't care so much about latency. Cassandra is generally the opposite. They are wonderfully complementary. -- Jeff On Sun, Apr 25, 2010 at 8:19 AM, Lenin Gali wrote: > I second Joe. > > Lenin > Sent from my BlackBerry® wireless handheld > > -

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Eric Hauser
Out of curiosity, are you planning on copying the data you store in HBase/Hive into separate Hadoop cluster in a different data center or backing up HDFS in some other manner? Redundancy isn't an issue within the cluster; it's more a concern of storing all your HDFS data in one physical location.

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Lenin Gali
I second Joe. Lenin Sent from my BlackBerry® wireless handheld -Original Message- From: Joe Stump Date: Sun, 25 Apr 2010 13:04:50 To: Subject: Re: The Difference Between Cassandra and HBase On Apr 25, 2010, at 11:40 AM, Mark Robson wrote: > For me an important difference is that Cas

Re: Cassandra-cli tutorials

2010-04-25 Thread Roger Schildmeijer
On 25 apr 2010, at 15.15em, S Ahmed wrote: > Ok excited I got it up and running on windows 7, yah! > > Curious, are there any tutorials or examples of using the cassandra-cli? http://wiki.apache.org/cassandra/CassandraCli > > BTW, the cassandra-cli is pretty cool, even comes with tab-complete

Cassandra-cli tutorials

2010-04-25 Thread S Ahmed
Ok excited I got it up and running on windows 7, yah! Curious, are there any tutorials or examples of using the cassandra-cli? BTW, the cassandra-cli is pretty cool, even comes with tab-complete, is that an OS thing or someone coded that feature up? I'm going to dig into the code for thistha

Re: getting cassandra setup on windows 7

2010-04-25 Thread S Ahmed
great that worked thanks! On Fri, Apr 23, 2010 at 2:28 PM, Mark Greene wrote: > Try the > cassandra-with-fixes.bat > file > attached to the issue. I had the same issue an that bat file got cassandra > to start.

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joe Stump
On Apr 25, 2010, at 11:40 AM, Mark Robson wrote: > For me an important difference is that Cassandra is operationally much more > straightforward - there is only one type of node, and it is fully redundant > (depending what consistency level you're using). > > This seems to be an advantage in C

Re: tcp CLOSE_WAIT bug

2010-04-25 Thread yangfeng
I encountered the same problem! Hope to get some help. Thanks. 2010/4/22 Ingram Chen > arh! That's right. > > I check OutboundTcpConnection and it only does closeSocket() after > something went wrong. I will log more in OutboundTcpConnection to see what > actually happens. > > Thank your help. > > >

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Mark Robson
For me an important difference is that Cassandra is operationally much more straightforward - there is only one type of node, and it is fully redundant (depending what consistency level you're using). This seems to be an advantage in Cassandra vs most other distributed storage systems, which almos

Re: Cassandra - Thread leak when high concurrent load

2010-04-25 Thread Mark Robson
On 25 April 2010 10:48, JKnight JKnight wrote: > Dear all, > > My Cassandra server had thread leak when high concurrent load. I used > jconsole and saw many, many thread occur. > The fact that there are a lot of threads does not by itself imply a thread leak. Cassandra uses a lot of threads. Do you see t

Cassandra - Thread leak when high concurrent load

2010-04-25 Thread JKnight JKnight
Dear all, My Cassandra server had a thread leak under high concurrent load. I used jconsole and saw a great many threads being created. I know Cassandra uses TThreadPoolServer for handling requests, and that it uses DebuggableThreadPoolExecutor to handle commands (read/write). I want to know the reason of thr