Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Vijay
What I did for one of our projects was similar: use a super column to store file and directory metadata, and use another row (keyed by UUID) to store the directory contents (files and subdirectories). We used UUIDs instead of paths because there will be renames or moves. Store the small files in Cassandra. We used I
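The layout described in this message (metadata in one place, directory contents in a separate row keyed by UUID) can be sketched with plain dictionaries standing in for column families. This is only an illustration under assumptions: the names `metadata`, `dir_contents`, `create`, `rename` and `resolve` are hypothetical, nothing here talks to Cassandra, and the project's actual schema is not shown in the message.

```python
import uuid

# In-memory stand-ins for two column families (hypothetical names).
metadata = {}      # node UUID -> {"name": ..., "type": "file" or "dir"}
dir_contents = {}  # directory UUID -> {child name: child UUID}


def create(parent_id, name, node_type):
    """Create a file or directory and link it into its parent, if any."""
    node_id = uuid.uuid4()
    metadata[node_id] = {"name": name, "type": node_type}
    if node_type == "dir":
        dir_contents[node_id] = {}
    if parent_id is not None:
        dir_contents[parent_id][name] = node_id
    return node_id


def rename(parent_id, old_name, new_name):
    """Rename a child; only the parent's listing and one metadata entry change."""
    node_id = dir_contents[parent_id].pop(old_name)
    dir_contents[parent_id][new_name] = node_id
    metadata[node_id]["name"] = new_name
    return node_id


def resolve(root_id, path):
    """Walk a slash-separated path from the root, one directory row per step."""
    node_id = root_id
    for part in [p for p in path.split("/") if p]:
        node_id = dir_contents[node_id][part]
    return node_id
```

Because every node is keyed by its UUID, renaming a directory does not touch the rows of anything beneath it, which seems to be the point of avoiding path-based keys.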

Re: SuperColumns

2010-04-14 Thread Vijay
Yes, a super column can only have columns in it. Regards, On Wed, Apr 14, 2010 at 10:28 PM, Christian Torres wrote: > I'm defining a ColumnFamily (Table) of type Super. Is it possible to have a > SuperColumn inside another SuperColumn, or can SuperColumns only have normal > columns? > > -- > Christi

SuperColumns

2010-04-14 Thread Christian Torres
I'm defining a ColumnFamily (Table) of type Super. Is it possible to have a SuperColumn inside another SuperColumn, or can SuperColumns only have normal columns? -- Christian Torres * Web Developer * Guegue.com * Mobile: +505 84 65 92 62 * Loving of the Programming

Re: Lucandra or some way to query

2010-04-14 Thread Jake Luciani
Lucandra spreads the data randomly by index + field combination so you do get "some" distribution for free. Otherwise you can use "nodetool loadbalance" to alter the token ring to alleviate hotspots. On Thu, Apr 15, 2010 at 2:04 AM, HubertChang wrote: > > If you worked with Lucandra in a dedicat

TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-04-14 Thread richard yao
I am trying out Cassandra, using PHP to access it through the Thrift API. I got an error like this: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160 What's wrong? Thanks.

Re: Starting Cassandra Fauna

2010-04-14 Thread Nirmala Agadgar
Hi, I'm using the Ruby client as of now. Can you give details for the Ruby client, and if possible the Java client too? Thanks for the reply. - Nirmala On Thu, Apr 15, 2010 at 10:02 AM, richard yao wrote: > try this > https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP > > > > > On Thu, Apr 15, 2010 a

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Tatu Saloranta
On Wed, Apr 14, 2010 at 7:26 PM, Avinash Lakshman wrote: > OPP is not required here. You would be better off using a Random partitioner > because you want to get a random distribution of the metadata. Not for splitting, but for actual file system hierarchy it would. How else would you traverse hi

Re: Starting Cassandra Fauna

2010-04-14 Thread Paul Prescod
There is a tutorial here: * http://www.sodeso.nl/?p=80 This page includes data inserts: * http://www.sodeso.nl/?p=251 Like: c.setColumn(new Column("email".getBytes("utf-8"), "ronald (at) sodeso.nl".getBytes("utf-8"), timestamp)); columns.add(c); The sample code is attached to that blog post.

Re: Starting Cassandra Fauna

2010-04-14 Thread richard yao
try this https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP On Thu, Apr 15, 2010 at 12:23 PM, Nirmala Agadgar wrote: > Hi, > > I want to insert data into Cassandra programmatically in a loop. > Also i'm a newbie to Linux world and Github. Started to work on Linux for > only rea

Re: Starting Cassandra Fauna

2010-04-14 Thread Nirmala Agadgar
Hi, I want to insert data into Cassandra programmatically in a loop. I'm also a newbie to the Linux world and GitHub; I started working on Linux only to implement Cassandra, and have been digging into Cassandra for the last week. How do I insert data into Cassandra and test it? Can anyone help me out with this? - Nim

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Ken Sandney
I tried CassFS; it is not stable yet, but it may be a good prototype to start from. On Thu, Apr 15, 2010 at 12:15 PM, Michael Greene wrote: > On Wed, Apr 14, 2010 at 11:01 PM, Ken Sandney wrote: > >> a fuse based FS maybe better I guess > > > This has been done, for better or worse, by jdarcy of http://pl.atyp.

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Michael Greene
On Wed, Apr 14, 2010 at 11:01 PM, Ken Sandney wrote: > a fuse based FS maybe better I guess This has been done, for better or worse, by jdarcy of http://pl.atyp.us/: http://github.com/jdarcy/CassFS

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Ken Sandney
A FUSE-based FS may be better, I guess. On Thu, Apr 15, 2010 at 11:50 AM, Jonathan Ellis wrote: > You forked Cassandra 0.5 for that? > > That's... a strange way to do it. > > On Wed, Apr 14, 2010 at 9:36 PM, Jeff Zhang wrote: > > We are currently doing such things, and now we are still at the sta

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Jonathan Ellis
You forked Cassandra 0.5 for that? That's... a strange way to do it. On Wed, Apr 14, 2010 at 9:36 PM, Jeff Zhang wrote: > We are currently doing such things, and now we are still at the start stage. > Currently we only plan to store small files. For large files, splitting to > small blocks is re

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread HubertChang
Note: there are GlusterFS, Ceph, Btrfs and Lustre. There is also DRBD. -- View this message in context: http://n2.nabble.com/Is-that-possible-to-write-a-file-system-over-Cassandra-tp4905111p4905312.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Jeff Zhang
We are currently doing such things, and now we are still at the start stage. Currently we only plan to store small files. For large files, splitting to small blocks is really one of our options. You can check out from here http://code.google.com/p/cassandra-fs/ Document for this project is lack n

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Miguel Verde
On Wed, Apr 14, 2010 at 9:26 PM, Avinash Lakshman < avinash.laksh...@gmail.com> wrote: > OPP is not required here. You would be better off using a Random > partitioner because you want to get a random distribution of the metadata. Not required, certainly. However, it strikes me that 1 cluster i

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Avinash Lakshman
OPP is not required here. You would be better off using a Random partitioner because you want to get a random distribution of the metadata. Avinash On Wed, Apr 14, 2010 at 7:25 PM, Avinash Lakshman < avinash.laksh...@gmail.com> wrote: > Exactly. You can split a file into blocks of any size and y

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Avinash Lakshman
Exactly. You can split a file into blocks of any size and you can actually distribute the metadata across a large set of machines. You wouldn't have the issue of having small files in this approach. The issue may be the eventual consistency - not sure that is a paradigm that would be acceptable for

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Miguel Verde
On Wed, Apr 14, 2010 at 9:15 PM, Ken Sandney wrote: > Large files can be split into small blocks, and the size of block can be > tuned. It may increase the complexity of writing such a file system, but can > be for general purpose (not only for relative small files) Right, this is the path tha

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Ken Sandney
Large files can be split into small blocks, and the block size can be tuned. It may increase the complexity of writing such a file system, but it can then be general purpose (not only for relatively small files). On Thu, Apr 15, 2010 at 10:08 AM, Tatu Saloranta wrote: > On Wed, Apr 14, 2010 at 6:42 P
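The block-splitting idea from this message can be sketched as follows. It is a minimal illustration, assuming a dictionary keyed by (file id, block index) as a stand-in for the block store; the function names and the key layout are hypothetical and are not taken from CassFS or cassandra-fs.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store_file(store, file_id, data, block_size):
    """Write each block under its own (file_id, index) key; return the block count."""
    blocks = split_into_blocks(data, block_size)
    for index, block in enumerate(blocks):
        store[(file_id, index)] = block
    return len(blocks)


def load_file(store, file_id, block_count):
    """Reassemble a file by reading its blocks back in index order."""
    return b"".join(store[(file_id, i)] for i in range(block_count))
```

The block count would need to live in the file's metadata so that a reader knows how many keys to fetch; tuning `block_size` trades the number of keys per file against the size of each value.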

Re: Is that possible to write a file system over Cassandra?

2010-04-14 Thread Tatu Saloranta
On Wed, Apr 14, 2010 at 6:42 PM, Zhuguo Shi wrote: > Hi, > Cassandra has a good distributed model: decentralized, auto-partition, > auto-recovery. I am evaluating about writing a file system over Cassandra > (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know if > Cassandra is good a

Re: Lucandra or some way to query

2010-04-14 Thread HubertChang
If you worked with Lucandra in a dedicated search-purpose cluster, you could balance the data very well with some effort. >>I think Lucandra is really a great idea, but since it needs order-preserving-partitioner, does that mean >>there may be some 'hot-spot' during searching? -- View this

Is that possible to write a file system over Cassandra?

2010-04-14 Thread Zhuguo Shi
Hi, Cassandra has a good distributed model: decentralized, auto-partition, auto-recovery. I am evaluating writing a file system over Cassandra (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know whether Cassandra is good at such a use case. Regards

Re: Lucandra or some way to query

2010-04-14 Thread Zhuguo Shi
I think Lucandra is really a great idea, but since it needs order-preserving-partitioner, does that mean there may be some 'hot-spot' during searching?

Re: KeysCached and sstable

2010-04-14 Thread Jonathan Ellis
On Wed, Apr 14, 2010 at 10:23 AM, Paul Prescod wrote: > The inline docs say: > >       ~ The optional KeysCached attribute specifies >       ~ the number of keys per sstable whose locations we keep in >       ~ memory in "mostly LRU" order. > > There are a few confusing bits in that sentence. > >

Re: Did 0.6 break sstable2jason? or am I missing something?

2010-04-14 Thread Brandon Williams
On Wed, Apr 14, 2010 at 11:53 AM, Chris Beaumont wrote: > I enjoy very much being able to quickly get a peak at my data once stored, > and so > far sstable2json was a great help... > > I just completed switching from 0.5.1 to 0.6, and here is what I am getting > now: > $ sstable2json Standard2-1-I

Did 0.6 break sstable2jason? or am I missing something?

2010-04-14 Thread Chris Beaumont
I enjoy very much being able to quickly get a peek at my data once stored, and so far sstable2json was a great help... I just completed switching from 0.5.1 to 0.6, and here is what I am getting now: $ sstable2json Standard2-1-Index.db Exception in thread "main" java.lang.NullPointerException

Re: Lucandra or some way to query

2010-04-14 Thread Jake Luciani
Hi, What doesn't work with lucandra exactly? Feel free to msg me. -Jake On Wed, Apr 14, 2010 at 9:30 PM, Jesus Ibanez wrote: > I will explore Lucandra a little more and if I can't get it to work today, > I will go for Option 2. > Using SQL will not be efficient in the future, if my website gr

Re: Lucandra or some way to query

2010-04-14 Thread Jesus Ibanez
I will explore Lucandra a little more and if I can't get it to work today, I will go for Option 2. Using SQL will not be efficient in the future, if my website grows. Thanks for your answer Eric! Jesús. 2010/4/14 Eric Evans > On Wed, 2010-04-14 at 06:45 -0300, Jesus Ibanez wrote: > > Option 1

Re: Reading thousands of columns

2010-04-14 Thread James Golick
The values are empty. It's 3000 UUIDs. On Wed, Apr 14, 2010 at 12:40 PM, Avinash Lakshman < avinash.laksh...@gmail.com> wrote: > How large are the values? How much data on disk? > > On Wednesday, April 14, 2010, James Golick wrote: > > Just for the record, I am able to repeat this locally. > > I

Re: Reading thousands of columns

2010-04-14 Thread Avinash Lakshman
How large are the values? How much data on disk? On Wednesday, April 14, 2010, James Golick wrote: > Just for the record, I am able to repeat this locally. > I'm seeing around 150ms to read 1000 columns from a row that has 3000 in it. > If I enable the rowcache, that goes down to about 90ms. Acc

Re: Reading thousands of columns

2010-04-14 Thread James Golick
Just for the record, I am able to repeat this locally. I'm seeing around 150ms to read 1000 columns from a row that has 3000 in it. If I enable the rowcache, that goes down to about 90ms. According to my profile, 90% of the time is being spent waiting for cassandra to respond, so it's not thrift.

Re: [RELEASE] 0.6.0

2010-04-14 Thread Ted Zlatanov
On Wed, 14 Apr 2010 12:23:19 -0500 Eric Evans wrote: EE> On Wed, 2010-04-14 at 10:16 -0500, Ted Zlatanov wrote: >> Can it support a non-root user through /etc/default/cassandra? I've >> been patching the init script myself but was hoping this would be >> standard. EE> It's the first item on d

Re: Reading thousands of columns

2010-04-14 Thread Paul Prescod
On Wed, Apr 14, 2010 at 10:31 AM, Mike Malone wrote: > ... > > Couldn't you cache a list of keys that were returned for the key range, then > cache individual rows separately or not at all? > By "blowing away rows queried by key" I'm guessing you mean "pushing them > out of the LRU cache," not exp

Re: History values

2010-04-14 Thread Paul Prescod
If you want to use Cassandra, you should probably store each historical value as a new column in the row. On Wed, Apr 14, 2010 at 12:34 AM, Yésica Rey wrote: > I am new to using cassandra. In the documentation I have read, understand, > that as in other non-documentary databases, to update the va
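The suggestion above, one new column per historical value, can be sketched like this. It is an illustration only: a nested dictionary stands in for a row, the timestamp is used as the column name, and the function names are hypothetical.

```python
# Row key -> {timestamp used as column name: value}. Hypothetical stand-in
# for a column family whose columns sort by timestamp.
history_rows = {}


def put_version(row_key, timestamp, value):
    """Add a new column instead of overwriting, so older values are kept."""
    history_rows.setdefault(row_key, {})[timestamp] = value


def latest(row_key):
    """Return the value stored under the highest timestamp."""
    columns = history_rows[row_key]
    return columns[max(columns)]


def all_versions(row_key):
    """Return (timestamp, value) pairs from oldest to newest."""
    columns = history_rows[row_key]
    return [(ts, columns[ts]) for ts in sorted(columns)]
```

Writing to the same column name twice would still replace the earlier value, so the history is kept only because each write uses a distinct column name.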

Re: Reading thousands of columns

2010-04-14 Thread Mike Malone
On Wed, Apr 14, 2010 at 7:45 AM, Jonathan Ellis wrote: > 35-50ms for how many rows of 1000 columns each? > > get_range_slices does not use the row cache, for the same reason that > oracle doesn't cache tuples from sequential scans -- blowing away > 1000s of rows worth of recently used rows querie

Re: [RELEASE] 0.6.0

2010-04-14 Thread Eric Evans
On Wed, 2010-04-14 at 10:16 -0500, Ted Zlatanov wrote: > Can it support a non-root user through /etc/default/cassandra? I've > been patching the init script myself but was hoping this would be > standard. It's the first item on debian/TODO, but, you know, patches welcome and all that. -- Eric

Re: Lucandra or some way to query

2010-04-14 Thread Eric Evans
On Wed, 2010-04-14 at 06:45 -0300, Jesus Ibanez wrote: > Option 1 - insert data in all different ways I need in order to be > able to query? Rolling your own indexes is fairly common with Cassandra. > Option 2 - implement Lucandra? Can you link me to a blog or an article > that guides me on how t

Re: Reading thousands of columns

2010-04-14 Thread James Golick
That helped a little. But, it's still quite slow. Now, it's around 20-35ms on average, sometimes as high as 70ms. On Wed, Apr 14, 2010 at 8:50 AM, James Golick wrote: > Right - that make sense. I'm only fetching one row. I'll give it a try with > get_slice(). > > Thanks, > > -James > > > On Wed,

Re: Reading thousands of columns

2010-04-14 Thread James Golick
Right - that makes sense. I'm only fetching one row. I'll give it a try with get_slice(). Thanks, -James On Wed, Apr 14, 2010 at 7:45 AM, Jonathan Ellis wrote: > 35-50ms for how many rows of 1000 columns each? > > get_range_slices does not use the row cache, for the same reason that > oracle do

Re: History values

2010-04-14 Thread Mike Gallamore
Hear, hear on documentation. For example, there are Thrift examples in Python and Java. That is great, but I've never coded in either (and am limited to Perl or C at work because we have 5 years' worth of code and experience with other modules provided for those languages). So I'm stuck with whatever the

KeysCached and sstable

2010-04-14 Thread Paul Prescod
The inline docs say: ~ The optional KeysCached attribute specifies ~ the number of keys per sstable whose locations we keep in ~ memory in "mostly LRU" order. There are a few confusing bits in that sentence. 1. Why is it "keys per sstable" rather than "keys per column family"?

Re: [RELEASE] 0.6.0

2010-04-14 Thread Ted Zlatanov
On Tue, 13 Apr 2010 15:54:39 -0500 Eric Evans wrote: EE> I leaned into it. An updated package has been uploaded to the Cassandra EE> repo (see: http://wiki.apache.org/cassandra/DebianPackaging). Thank you for providing the release to the repository. Can it support a non-root user through /etc/

Re: Reading thousands of columns

2010-04-14 Thread Jonathan Ellis
35-50ms for how many rows of 1000 columns each? get_range_slices does not use the row cache, for the same reason that oracle doesn't cache tuples from sequential scans -- blowing away 1000s of rows worth of recently used rows queried by key, for a swath of rows from the scan, is the wrong call mor

Re: Time-series data model

2010-04-14 Thread alex kamil
James, I'm a big fan of Cassandra, but have you looked at RRDtool ( http://en.wikipedia.org/wiki/RRDtool )? It is natively built for this type of problem. Alex On Wed, Apr 14, 2010 at 9:02 AM, Jean-Pierre Bergamin wrote: > Hello everyone > > We are currently evaluating a new DB system (replacing MySQL) to st

Re: Reading thousands of columns

2010-04-14 Thread Gautam Singaraju
Yes, I find that get_range_slices takes an incredibly long time to return the results. --- Gautam On Tue, Apr 13, 2010 at 2:00 PM, James Golick wrote: > Hi All, > I'm seeing about 35-50ms to read 1000 columns from a CF using > get_range_slices. The columns are TimeUUIDType with empty values. > The

Re: Time-series data model

2010-04-14 Thread Ted Zlatanov
On Wed, 14 Apr 2010 15:02:29 +0200 "Jean-Pierre Bergamin" wrote: JB> The metrics are stored together with a timestamp. The queries we want to JB> perform are: JB> * The last value of a specific metric of a device JB> * The values of a specific metric of a device between two timestamps t1 and
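The two queries quoted in this message (the last value of a metric, and the values between two timestamps) can be sketched with one sorted row per device and metric. This is a minimal sketch under assumptions: the class name `MetricRow` and the one-row-per-(device, metric) layout are hypothetical, and sorted Python lists stand in for columns ordered by timestamp.

```python
import bisect


class MetricRow:
    """One row per (device, metric); columns are kept sorted by timestamp."""

    def __init__(self):
        self.timestamps = []
        self.values = []

    def insert(self, timestamp, value):
        """Insert in timestamp order; the same timestamp overwrites the old value."""
        i = bisect.bisect_left(self.timestamps, timestamp)
        if i < len(self.timestamps) and self.timestamps[i] == timestamp:
            self.values[i] = value
        else:
            self.timestamps.insert(i, timestamp)
            self.values.insert(i, value)

    def last(self):
        """Return the newest (timestamp, value) pair."""
        return self.timestamps[-1], self.values[-1]

    def between(self, t1, t2):
        """Return all (timestamp, value) pairs with t1 <= timestamp <= t2."""
        lo = bisect.bisect_left(self.timestamps, t1)
        hi = bisect.bisect_right(self.timestamps, t2)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))
```

Both queries touch a single row and a contiguous run of columns, which is why keeping the columns ordered by time matters for this kind of model.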

Re: Time-series data model

2010-04-14 Thread Zhiguo Zhang
First of all, I am a newbie to NoSQL, so take my opinion only as a reference. If I were you, I would use 2 column families: 1. a CF whose key is the device; 2. a CF whose key is a TimeUUID. What do you think about that? Mike On Wed, Apr 14, 2010 at 3:02 PM, Jean-Pierre Bergamin wrote: > Hello everyone > > We are

Time-series data model

2010-04-14 Thread Jean-Pierre Bergamin
Hello everyone We are currently evaluating a new DB system (replacing MySQL) to store massive amounts of time-series data. The data are various metrics from various network and IT devices and systems. Metrics i.e. could be CPU usage of the server "xy" in percent, memory usage of server "xy" in MB,

Re: History values

2010-04-14 Thread Jonathan Ellis
The closest is http://github.com/driftx/chiton On Wed, Apr 14, 2010 at 2:57 AM, Yésica Rey wrote: > Ok, thank you very much for your reply. > I have another question may seem stupid ... Cassandra has a graphical > console, such as mysql for SQL databases? > > Regards! >

Re: Starting Cassandra Fauna

2010-04-14 Thread Jonathan Ellis
there are two "installing on centos" articles linked on http://wiki.apache.org/cassandra/ArticlesAndPresentations On Wed, Apr 14, 2010 at 1:28 AM, Nirmala Agadgar wrote: > Hi, > > Can anyone please list steps to install and run cassandra in centos. > It can help me to follow and check where i mis

Re: RE : Re: RE : Re: Two dimensional matrices

2010-04-14 Thread Philippe
> I'm confused : don't range queries such as the ones we've been > > discussing require using an orderedpartitionner ? > > Alright, so distribution depends on your choice of token. > Ah yes, I get it now : with a naive orderedpartitioner, the key is associated with the node whose token is the clos

server crash - how to invertigate

2010-04-14 Thread Ran Tavory
I'm running a 0.6.0 cluster with four nodes and one of them just crashed. The logs all seem normal and I haven't seen anything special in the jmx counters before the crash. I have one client writing and reading using 10 threads and using 3 different column families: KvAds, KvImpressions and KvUse

Re: History values

2010-04-14 Thread aXqd
On Wed, Apr 14, 2010 at 5:13 PM, Zhiguo Zhang wrote: > I think it is still to young, and have to wait or write your self the > "graphical console", at least, I don't find any until now. Frankly speaking, I'm OK to be without GUI...But I am really disappointed by those so-called 'documents'. I rea

Lucandra or some way to query

2010-04-14 Thread Jesus Ibanez
Hello. I need to know how to search in Cassandra. I could save the data in different ways so I can then retrieve it, for example like this: get keyspace.users['123'] => (column=name, value=John, timestamp=xx) get keyspace.searchByName['John'] => (column=userID, value=123, timestamp=xxx
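The two lookups in this message amount to maintaining a second, hand-written index alongside the data. A minimal sketch, with dictionaries standing in for the `users` and `searchByName` column families: the function names are hypothetical, and unlike the message's layout (a single `userID` column) the index here uses the user id as the column name, an assumption made so that two users with the same name do not overwrite each other.

```python
users = {}           # user id -> {"name": ...}
search_by_name = {}  # name -> {user id: ""}; column names carry the ids


def insert_user(user_id, name):
    """Write the user row and the matching index entry together."""
    users[user_id] = {"name": name}
    search_by_name.setdefault(name, {})[user_id] = ""


def find_by_name(name):
    """Look up user ids through the index row instead of scanning all users."""
    return sorted(search_by_name.get(name, {}))
```

The application has to keep both writes in step itself; a rename, for instance, would need to remove the old index column as well as add the new one.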

Re: History values

2010-04-14 Thread Zhiguo Zhang
I think it is still too young, and you have to wait or write the "graphical console" yourself; at least, I haven't found one so far. On Wed, Apr 14, 2010 at 10:04 AM, Bertil Chapuis wrote: > I'm also new to cassandra and about the same question I asked me if using > super columns with one key per ve

Re: New User: OSX vs. Debian on Cassandra 0.5.0 with Thrift

2010-04-14 Thread Zhiguo Zhang
Hi, sorry I can't help you, but could you please tell me, how could you get the charts in the attachment? Thanks. Mike On Wed, Apr 14, 2010 at 6:38 AM, Heath Oderman wrote: > Hi, > > I wrote a few days ago and got a few good suggestions. I'm still seeing > dramatic differences between Cassand

Re: History values

2010-04-14 Thread Bertil Chapuis
I'm also new to Cassandra and, on the same question, I wondered whether using super columns with one key per version is feasible. Are there limitations to this use case (or better practices)? Thank you and best regards, Bertil Chapuis On 14 April 2010 09:45, Sylvain Lebresne wrote: > > I am new to

Re: History values

2010-04-14 Thread Yésica Rey
Ok, thank you very much for your reply. I have another question that may seem stupid: does Cassandra have a graphical console, like MySQL has for SQL databases? Regards!

Re: History values

2010-04-14 Thread Sylvain Lebresne
> I am new to using cassandra. In the documentation I have read, understand, > that as in other non-documentary databases, to update the value of a > key-value tuple, this new value is stored with a timestamp different but > without entirely losing the old value. > I wonder, as I can restore the hi

Re: History values

2010-04-14 Thread Benjamin Black
Values with newer timestamps completely replace the old values. There is no way to access historic values. On Wed, Apr 14, 2010 at 12:34 AM, Yésica Rey wrote: > I am new to using cassandra. In the documentation I have read, understand, > that as in other non-documentary databases, to update the

History values

2010-04-14 Thread Yésica Rey
I am new to using Cassandra. From the documentation I have read, I understand that, as in other non-documentary databases, when the value of a key-value tuple is updated, the new value is stored with a different timestamp, without entirely losing the old value. I wonder, as I can restore the historic

Re: Caching is a full row?

2010-04-14 Thread Sylvain Lebresne
Yes, it will put the whole row in cache even if you read only a bunch of columns. It means in particular that with row cache, every time you read a row, the full row will be read on a cache miss. Thus it may hurt your reads badly in some scenarios (typically with big rows) instead of helping them. En
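The cost described above can be illustrated with a toy model. This is not Cassandra's implementation: the class `RowCachedStore` and its counter are hypothetical, and the only behaviour modelled is that a cache miss loads every column of the row, however few were requested.

```python
class RowCachedStore:
    """Toy store whose row cache always holds complete rows."""

    def __init__(self, rows):
        self.rows = rows                 # row key -> {column name: value}
        self.cache = {}                  # row key -> full copy of the row
        self.columns_read_from_disk = 0  # counts columns loaded on misses

    def get_columns(self, key, names):
        """Return the requested columns, loading the whole row on a miss."""
        if key not in self.cache:
            row = self.rows[key]
            self.columns_read_from_disk += len(row)
            self.cache[key] = dict(row)
        row = self.cache[key]
        return {name: row[name] for name in names if name in row}
```

With a row of 3000 columns, the first request for two columns loads all 3000 into the cache; later requests for that row are served from memory, so the trade-off depends on row size and on how often the same row is read again.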