Re: Cassandra to store 1 billion small 64KB Blobs

2010-07-23 Thread Michael Widmann
Hi Jonathan, thanks for your very valuable input on this. I may not have given enough explanation, so I'll try to clarify. Here are some thoughts: - binary data will not be indexed, only stored. - The file name of the binary data (a hash) should be indexed for search - We could group the ha

Re: Cassandra basics

2010-07-23 Thread themana...@juno.com
Sonia, If you haven't visited the site below, I think it does the best job of explaining what Cassandra is all about: http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model Consider a group of ColumnFamilies within a Keyspace like a group of tables in a database. CompareWith tells Cassa

Cassandra basics

2010-07-23 Thread sonia gehlot
Hi Guys, Recently I have started reading about Cassandra from the material available on the net. But I am very confused about the basics of Cassandra. I understand columnFamilies and super columnFamilies, but I have no clear understanding of 1. I don't understand wha

Re: CRUD test

2010-07-23 Thread Jonathan Shook
I think you are getting it. As far as what means what at which level, it's really about using them consistently in every case. The [row] key (or [row] key range) is a top-level argument for all of the operations, since it is the "key" to mapping the set of responsible nodes. The key is the part of

RE: CRUD test

2010-07-23 Thread Peter Minearo
Consequently the remove should look like: ColumnPath cp1 = new ColumnPath("Super2"); cp1.setSuper_column("Best Western".getBytes()); client.remove(KEYSPACE, "hotel", cp1,

RE: CRUD test

2010-07-23 Thread Peter Minearo
CORRECTION: ColumnPath cp1 = new ColumnPath("Super2"); cp1.setSuper_column("Best Western".getBytes()); cp1.setColumn("name".getBytes()); client.insert(KEYSPACE, "hotel", cp1, "Best Western of SF".getBytes(), System.currentTimeMillis(), ConsistencyLevel.ALL); -Original Message- From:

RE: CRUD test

2010-07-23 Thread Peter Minearo
Interesting!! Let me rephrase to make sure I understood what is going on: When Inserting data via the "insert" function/method: void insert(string keyspace, string key, ColumnPath column_path, binary value, i64 timestamp, ConsistencyLevel consistency_level) The "key" parameter is the actual Key

Re: CRUD test

2010-07-23 Thread Jonathan Shook
Correct. After the initial insert, cassandra> get Keyspace1.Super2['name'] => (super_column=hotel, (column=Best Western, value=Best Western of SF, timestamp=1279916772571) (column=Econolodge, value=Econolodge of SF, timestamp=1279916772573)) Returned 1 results. ... and ... cassandra>

RE: CRUD test

2010-07-23 Thread Peter Minearo
The Model Should look like: Super2 = { hotel: { Best Western: {name: "Best Western of SF"} Econolodge: {name: "Econolodge of SF"} } } Are the CRUD Operations not referencing this correctly? -Original Message- From: Jonath
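The nesting described in this message (row key, then super column, then column) can be pictured with plain in-memory maps. This is only a sketch of the intended shape: the class name `SuperColumnModel` and method `buildSuper2` are hypothetical, and nothing here uses the Cassandra client API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the Super2 layout from the message above:
// row key -> super column name -> column name -> value.
class SuperColumnModel {

    static Map<String, Map<String, Map<String, String>>> buildSuper2() {
        Map<String, String> bestWestern = new LinkedHashMap<>();
        bestWestern.put("name", "Best Western of SF");

        Map<String, String> econolodge = new LinkedHashMap<>();
        econolodge.put("name", "Econolodge of SF");

        // Super columns stored under the row key "hotel".
        Map<String, Map<String, String>> hotelRow = new LinkedHashMap<>();
        hotelRow.put("Best Western", bestWestern);
        hotelRow.put("Econolodge", econolodge);

        Map<String, Map<String, Map<String, String>>> super2 = new LinkedHashMap<>();
        super2.put("hotel", hotelRow);
        return super2;
    }

    public static void main(String[] args) {
        // Look up: row key "hotel", super column "Best Western", column "name".
        System.out.println(buildSuper2().get("hotel").get("Best Western").get("name"));
    }
}
```

Read against this shape, the `insert` call in the CORRECTION message uses "hotel" as the row key, "Best Western" as the super column, and "name" as the column.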

Re: CRUD test

2010-07-23 Thread Jonathan Shook
There seem to be data consistency bugs in the test. Are "name" and "hotel" being used in a pair-wise way? Specifically, the first test is creating one and checking for the other. On Fri, Jul 23, 2010 at 2:46 PM, Oleg Tsvinev wrote: > Jonathan, > I followed your suggestion. Unfortunately, C

Re: CRUD test

2010-07-23 Thread Oleg Tsvinev
Jonathan, I followed your suggestion. Unfortunately, the CRUD test still does not work for me. Can you provide the simplest possible CRUD test that works? On Fri, Jul 23, 2010 at 10:59 AM, Jonathan Shook wrote: > I suspect that it is still your timestamps. > You can verify this with a fake timestamp

Re: Cassandra to store 1 billion small 64KB Blobs

2010-07-23 Thread Peter Schuller
> We plan to use Cassandra as data storage on at least 2 nodes with RF=2 > for about 1 billion small files. > We have about 48TB of disk space behind each node. > > Now my question is: is this possible with Cassandra, reliably, meaning > that every blob is stored on 2 JBODs? > > we may grow up

Failing to create a 2 Node cluster on a Windows machine

2010-07-23 Thread Alaa Zubaidi
Hi, I am new to Cassandra, and I want to create a 2-node cluster on the SAME machine running Windows for testing my application before we get new hardware. I am getting "No other nodes seen! Unable to bootstrap". The first node starts successfully while the second node gives me the error. All po

Re: Cassandra to store 1 billion small 64KB Blobs

2010-07-23 Thread Jonathan Shook
There are two scaling factors to consider here. In general the worst case growth of operations in Cassandra is kept near to O(log2(N)). Any worse growth would be considered a design problem, or at least a high priority target for improvement. This is important for considering the load generated by
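To get a feel for what O(log2(N)) growth means at the sizes discussed in this thread, the arithmetic can be sketched as below. This is only an illustration of how slowly a base-2 logarithm grows; the class name `LogGrowth` is hypothetical and the code does not model any Cassandra internals.

```java
// Hypothetical helper: shows how slowly ceil(log2(N)) grows as N increases.
class LogGrowth {

    // Smallest k such that 2^k >= n, for n >= 1.
    static int ceilLog2(long n) {
        return 64 - Long.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        long[] sizes = {1_000L, 1_000_000L, 1_000_000_000L};
        for (long n : sizes) {
            System.out.println(n + " items -> about " + ceilLog2(n) + " steps");
        }
    }
}
```

Going from a million to a billion items multiplies N by a thousand but adds only about ten to log2(N).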

Re: CRUD test

2010-07-23 Thread Jonathan Shook
I suspect that it is still your timestamps. You can verify this with a fake timestamp generator that is simply incremented on each getTimestamp(). 1 millisecond is a long time for code that is wrapped tightly in a test. You are likely using the same logical timestamp for multiple operations. On
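A fake timestamp generator of the kind suggested here could look like the following minimal sketch. The class name `FakeTimestampGenerator` is an assumption; only the method name `getTimestamp()` comes from the message. It hands out a strictly increasing value on each call, so two operations in a tight test loop never share a timestamp the way two `System.currentTimeMillis()` calls within the same millisecond would.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical test helper: every call returns a value one larger than the
// previous one, regardless of how little wall-clock time has passed.
class FakeTimestampGenerator {

    private final AtomicLong counter;

    FakeTimestampGenerator(long start) {
        this.counter = new AtomicLong(start);
    }

    long getTimestamp() {
        return counter.incrementAndGet();
    }

    public static void main(String[] args) {
        FakeTimestampGenerator clock = new FakeTimestampGenerator(0L);
        long insertTs = clock.getTimestamp();
        long removeTs = clock.getTimestamp();
        // The later operation always carries the larger timestamp.
        System.out.println(insertTs + " < " + removeTs);
    }
}
```

In a test, the value from `getTimestamp()` would be passed wherever the test currently passes `System.currentTimeMillis()`.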

Re: get the latest column fails in cassandra 7

2010-07-23 Thread Jonathan Ellis
maybe you're confusing "timestamp order" with "column name order?" hard to say without a complete example including the inserts. On Tue, Jul 20, 2010 at 10:10 PM, Bujji4Tech wrote: > hi all, > I am trying Cassandra 7 (using the latest build) and have a problem getting the > latest column in a row. > >

Cassandra to store 1 billion small 64KB Blobs

2010-07-23 Thread Michael Widmann
Hi, We plan to use Cassandra as data storage on at least 2 nodes with RF=2 for about 1 billion small files. We have about 48TB of disk space behind each node. Now my question is: is this possible with Cassandra, reliably, meaning that every blob is stored on 2 JBODs? We may grow up to nearly
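A rough back-of-the-envelope check of the numbers in this question can be sketched as follows. It assumes every blob is a full 64 KB (the subject line only says "small 64KB"), that data is spread evenly across nodes, and it ignores all storage overhead; the class name `BlobCapacityEstimate` is hypothetical.

```java
// Hypothetical estimate: raw bytes each node must hold, ignoring overhead.
class BlobCapacityEstimate {

    static long perNodeBytes(long blobCount, long blobBytes, int replicationFactor, int nodes) {
        return blobCount * blobBytes * replicationFactor / nodes;
    }

    public static void main(String[] args) {
        long perNode = perNodeBytes(1_000_000_000L, 64L * 1024L, 2, 2);
        // Report in decimal terabytes (10^12 bytes).
        System.out.println("Raw data per node: " + (perNode / 1e12) + " TB");
    }
}
```

Under these assumptions, with 2 nodes and RF=2 each node holds a full copy, which comes to roughly 65.5 TB of raw data per node before any overhead. The real figure depends on the actual blob size distribution.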

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-23 Thread Julie
Jonathan Ellis writes: > > then obsolete sstables is not your culprit. > I believe I figured out how to force my node disk usage to go down. I had been letting Cassandra perform its own data management, and did not use nodetool to force anything since in our real system, the data w

Cassandra and Lucene

2010-07-23 Thread Michelan Arendse
Hi, I have recently started working on Cassandra as I need to build a distributed Lucene index, and found that Lucandra was the best for this. Since then I have configured everything and it's working OK. Now the problem comes in when I need to write this Lucene index to Cassandra or convert it so tha

Re: Adding TimeUUID as Column Name using a PAST time?

2010-07-23 Thread themana...@juno.com
Disregard... Something was wrong with my instance such that I could not insert anything. Since I'm in early development I just re-installed Cassandra, and the TimeUUID of a past time inserts just fine. -- Original Message -- From: "themana...@juno.com" To: user@cassandra.apache.or

Adding TimeUUID as Column Name using a PAST time?

2010-07-23 Thread themana...@juno.com
I apologize if this is a dumb question...but is it possible to create a column with a TimeUUID in the past? I've been trying to do so, but my records are not being saved. I have date/times from some historical MySQL data that I've converted to a TimeUUID through a modded Python uuid1() function
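The poster used a modified Python `uuid1()`; as a rough Java equivalent, a version 1 (time-based) UUID for an arbitrary past instant can be assembled by hand. This is a sketch, not the poster's function: the class name `PastTimeUuid` and its parameters are hypothetical, and the clock-sequence/node bits are simply supplied by the caller rather than derived from a MAC address.

```java
import java.util.UUID;

// Hypothetical helper: builds a version 1 UUID whose embedded timestamp
// corresponds to a given Unix time in milliseconds (past or present).
class PastTimeUuid {

    // Number of 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    static UUID fromUnixMillis(long unixMillis, long clockSeqAndNode) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET; // 60-bit timestamp

        long timeLow = ts & 0xFFFFFFFFL;         // low 32 bits
        long timeMid = (ts >>> 32) & 0xFFFFL;    // middle 16 bits
        long timeHi  = (ts >>> 48) & 0x0FFFL;    // high 12 bits

        // Layout: time_low | time_mid | version (1) | time_hi
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;
        // Top two bits "10" mark the standard (RFC 4122) variant.
        long lsb = (clockSeqAndNode & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        // 946684800000 ms = 2000-01-01T00:00:00Z
        UUID u = fromUnixMillis(946_684_800_000L, 42L);
        long millis = (u.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
        System.out.println(u + " -> version " + u.version() + ", millis " + millis);
    }
}
```

`UUID.timestamp()` from the standard library reads the same fields back, which makes the round trip easy to check. If several historical rows share the same millisecond, the caller would need to vary `clockSeqAndNode` to keep the UUIDs distinct.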