Reconfiguring nodes - getting bootstrap error

2010-06-22 Thread Anthony Ikeda
I had to reconfigure my Cassandra nodes today to allow us to use Lucandra and made the following changes: * Shut down ALL Cassandra instances * For each node: o Added in Lucandra Keyspace o Changed the Partitioner to OrderPreservingPartitioner o Deleted the folders in my D

Deletion and batch_mutate

2010-06-22 Thread Ron
Hi everyone, I'm a new user of Cassandra, and during my tests, I've encountered a problem with deleting rows from CFs. I use Cassandra 0.6.2 and code in Java, using the native Java Thrift API. The way my application works, I need to delete multiple rows at a time (just like reads and writes). Ob

Re: java.lang.OutOfMemoryError: Map failed

2010-06-22 Thread Oleg Anastasjev
> Daniel: >   > Thanks. That thread helped me solve my problem. >   > I was able to run a 700k MySQL record import without a single memory error. >   > I changed the following sections in storage-conf.xml to fix the OutofMemory errors: >   >  standard > batch  >  1 Going to standard mode is not

Re: Deletion and batch_mutate

2010-06-22 Thread Mishail
Take a look at https://issues.apache.org/jira/browse/CASSANDRA-494 https://issues.apache.org/jira/browse/CASSANDRA-1027 On 22.06.2010 19:00, Ron wrote: > Hi everyone, > I'm a new user of Cassandra, and during my tests, I've encountered a > problem with deleting rows from CFs. > I use Cassandra

unsubscribe

2010-06-22 Thread Dean Steele
unsubscribe d...@dintran.com Dean Steele Reason: too much mail volume, I would prefer a weekly case study review.

[OT] Re: unsubscribe

2010-06-22 Thread Torsten Curdt
Hey Dean ...and everyone else not managing to unsubscribe (and sending mails to the list instead): If you don't know how to unsubscribe you can always look at the List-Unsubscribe: header of any of the list emails. These days most of the time you will find that an "-unsubscribe" suffix is used

OrderPreservingPartitioner and manual token assignment

2010-06-22 Thread Maxim Kramarenko
Hello! I use OrderPreservingPartitioner and assign tokens manually. Questions are: 1) Why is the range sorted in alphabetical order, not numeric order? It was OK with RandomPartitioner. Address Status Load Range Ring 84 172.19.0.35

Re: New to cassandra

2010-06-22 Thread yaw
And this one is useful : https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP 2010/6/22 Shahan Khan > The wiki is a great place: > > http://wiki.apache.org/cassandra/FrontPage > > Getting Started: http://wiki.apache.org/cassandra/GettingStarted > > Cassandra interfaces with PHP

Re: OrderPreservingPartitioner and manual token assignment

2010-06-22 Thread Sylvain Lebresne
2010/6/22 Maxim Kramarenko: > Hello! > > I use OrderPreservingPartitioner and assign tokens manually. > > Questions are: > > 1) Why range sorted in alphabetical order, not numeric order ? > It was ok with RandomPartitioner With RandomPartitioner, tokens are md5 hashes, thus numbers, and the compari
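
The difference between the two orderings can be sketched in a few lines of Java. This is only an illustration, assuming that OrderPreservingPartitioner tokens are compared as plain strings; the class name `TokenOrder` and the sample tokens are made up for the example and are not part of Cassandra.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TokenOrder {
    // Sort tokens the way a plain string comparison would (lexicographic order).
    static List<String> sortAsStrings(List<String> tokens) {
        List<String> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Sort the same tokens by their numeric value.
    static List<String> sortAsNumbers(List<String> tokens) {
        List<String> sorted = new ArrayList<>(tokens);
        sorted.sort((a, b) -> new BigInteger(a).compareTo(new BigInteger(b)));
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical manually assigned tokens.
        List<String> tokens = Arrays.asList("84", "100", "9");
        System.out.println(sortAsStrings(tokens)); // [100, 84, 9]
        System.out.println(sortAsNumbers(tokens)); // [9, 84, 100]
    }
}
```

If string ordering is what the partitioner uses, zero-padding manual tokens to a fixed width (for example "009", "084", "100") makes the two orderings agree.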

Re: django or pylons

2010-06-22 Thread Jonathan Ellis
What problems did you run into? On Mon, Jun 21, 2010 at 6:32 AM, Eugenio Minardi wrote: > Hi, I had a look at django + cassandra and found the twissandra project > (a django version of twitter based on cassandra). > But since I am new to django I couldn't make it work. If you find it > interest

UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
I want to use UUIDs whose alphanumeric order is the same as their chronological order. So I'm generating Version 4 UUIDs ( http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29) as follows: public class Id { static Random random = new Random(); public static Stri
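
The code in the message above is cut off in the archive, so the following is not the poster's implementation. It is a minimal independent sketch of one way to get string IDs whose alphanumeric order follows creation time: a fixed-width hexadecimal timestamp followed by random hex digits. The class name `TimeOrderedId`, the 16+16 digit layout, and the use of `SecureRandom` are all assumptions made for the example.

```java
import java.security.SecureRandom;

final class TimeOrderedId {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Hypothetical layout: 16 zero-padded hex digits of the millisecond
    // timestamp, then 16 hex digits of randomness. Fixed width plus zero
    // padding makes string order match timestamp order for non-negative times.
    static String newId(long timestampMillis) {
        return String.format("%016x%016x", timestampMillis, RANDOM.nextLong());
    }

    public static void main(String[] args) {
        String first = newId(System.currentTimeMillis());
        String second = newId(System.currentTimeMillis() + 1);
        System.out.println(first);
        System.out.println(second);
        System.out.println(first.compareTo(second) < 0);
    }
}
```

IDs created within the same millisecond are ordered only by their random suffix, so the ordering is chronological to millisecond resolution and no finer.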

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-22 Thread Jonathan Ellis
What I would expect to have happen is for the removed node to disappear from the ring and for nodes that are supposed to get more data to start streaming it over. I would expect it to be hours before any new data started appearing anywhere when you are anticompacting 80+GB prior to the streaming p

Re: Deletion and batch_mutate

2010-06-22 Thread Jonathan Ellis
right. in other words, you can delete entire rows w/ batch_mutate in 0.6.3 or trunk, but for 0.6.2 the best workaround is to issue multiple remove commands. On Tue, Jun 22, 2010 at 5:09 AM, Mishail wrote: > Take a look at > > https://issues.apache.org/jira/browse/CASSANDRA-494 > > https://issues
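
The 0.6.2 workaround (one remove per row) amounts to a simple loop. The sketch below uses a hypothetical `RowRemover` interface as a stand-in for whatever client call removes a single row; it is not the Thrift API itself, and the names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MultiRowDelete {
    // Hypothetical stand-in for a client call that removes one whole row.
    interface RowRemover {
        void removeRow(String rowKey, long timestamp);
    }

    // Issue one remove per row key, all with the same timestamp.
    // Returns the number of remove calls made.
    static int removeRows(RowRemover remover, List<String> rowKeys, long timestamp) {
        int removed = 0;
        for (String key : rowKeys) {
            remover.removeRow(key, timestamp);
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        // A fake remover that only records what it was asked to delete.
        List<String> log = new ArrayList<>();
        RowRemover fake = (key, ts) -> log.add(key + "@" + ts);
        removeRows(fake, Arrays.asList("row1", "row2"), 1277200000000L);
        System.out.println(log);
    }
}
```

Unlike a single batch call, the loop can fail partway through, so a caller would need to decide whether to retry the remaining keys.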

Re: Reconfiguring nodes - getting bootstrap error

2010-06-22 Thread Jonathan Ellis
sounds like a problem with your seed configuration On Tue, Jun 22, 2010 at 3:06 AM, Anthony Ikeda < anthony.ik...@cardlink.com.au> wrote: > I had to reconfigure my Cassandra nodes today to allow us to use Lucandra > and made the following changes: > > · Shutdown ALL Cassandra instances >

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread Jonathan Ellis
Why not just use version 1 UUIDs and TimeUUIDType? On Tue, Jun 22, 2010 at 8:58 AM, David Boxenhorn wrote: > I want to use UUIDs whose alphanumeric order is the same as their > chronological order. So I'm generating Version 4 UUIDs ( > http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Ve

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
As I understand it, the string value of TimeUUIDType does not sort alphanumerically in chronological order. Isn't that right? I want to use these ids in Oracle as well as Cassandra, and I want them to sort in chronological order. In Oracle they will have to be varchars (I think). Even in Cassandr
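
The point about string ordering can be checked with `java.util.UUID`. In a version 1 UUID the low 32 bits of the timestamp come first in the string form, so string order and timestamp order can disagree. The sketch below builds two version 1 UUIDs by hand from chosen timestamps; the class name `TimeUuidOrder` is hypothetical, and the clock sequence and node fields are zeroed purely for the demonstration.

```java
import java.util.UUID;

final class TimeUuidOrder {
    // Build a version 1 UUID from a 60-bit timestamp using the RFC 4122 layout:
    // time_low (32 bits), time_mid (16 bits), version (4 bits), time_hi (12 bits).
    static UUID fromTimestamp(long timestamp) {
        long timeLow = timestamp & 0xFFFFFFFFL;
        long timeMid = (timestamp >>> 32) & 0xFFFFL;
        long timeHi = (timestamp >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | (1L << 12) | timeHi;
        // RFC 4122 variant bits; clock sequence and node are zero for this demo.
        long lsb = 0x8000000000000000L;
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID earlier = fromTimestamp(0xFFFFFFFFL);   // time_low is all ones
        UUID later = fromTimestamp(0x100000000L);    // time_low wraps to zero
        System.out.println(earlier + " timestamp=" + earlier.timestamp());
        System.out.println(later + " timestamp=" + later.timestamp());
        // The earlier UUID sorts AFTER the later one as a string.
        System.out.println(earlier.toString().compareTo(later.toString()) > 0);
    }
}
```

Here the earlier UUID prints as `ffffffff-0000-1000-8000-000000000000` and the later one as `00000000-0001-1000-8000-000000000000`, so a varchar sort would put them in the wrong chronological order.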

Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-22 Thread Julie
Gary Dusbabek writes: > > *Hopefully* fixed. I was never able to duplicate the problem on my > workstation, but I had a pretty good idea what was causing the > problem. Julie, if you're in a position to apply and test the fix, it > would help us make sure we've got this one nai

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-22 Thread Joost Ouwerkerk
I don't mind missing data for a few hours, it's the weird behaviour of get_range_slices that's bothering me. I added some logging to ColumnFamilyRecordReader to see what's going on: Split startToken=67160993471237854630929198835217410155, endToken=68643623863384825230116928934887817211 ... Gett

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-22 Thread Jonathan Ellis
Ah, that sounds like https://issues.apache.org/jira/browse/CASSANDRA-1198. That it happened after removetoken is just that that happened to change your ring topology enough to make your queries start hitting it. On Tue, Jun 22, 2010 at 10:39 AM, Joost Ouwerkerk wrote: > I don't mind missing data

Finding new Cassandra data

2010-06-22 Thread David Boxenhorn
In my system, I have a Cassandra front end, and an Oracle back end. Some information is created in the back end, and pushed out to the front end, and some information is created in the front end and pulled into the back end. Question: How do I locate new rows that have been created in Cassandra, fo

Re: Finding new Cassandra data

2010-06-22 Thread Gary Dusbabek
On Tue, Jun 22, 2010 at 09:59, David Boxenhorn wrote: > In my system, I have a Cassandra front end, and an Oracle back end. Some > information is created in the back end, and pushed out to the front end, and > some information is created in the front end and pulled into the back end. > > Question:

Re: Hector vs cassandra-java-client

2010-06-22 Thread Bjorn Borud
"Dop Sun" writes: > Updated. the first Cassandra client lib to make it into the Maven repositories will probably end up with a big audience. :-) -Bjørn

Re: Finding new Cassandra data

2010-06-22 Thread Phil Stanhope
I can envision two fundamentally different approaches: 1. A CF that is CompareWith LONG ... use microsecond timestamps as your keys ... then you can filter by time ranges. This implies that you are willing to do a double write (once for the original data and then again for the logging). And a t
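
The first approach (a second write into a time-ordered log, then a range query over it) can be modeled in memory with a sorted map. This is only a sketch of the idea, not Cassandra client code: `ChangeLog` and its methods are hypothetical names, and a `TreeMap` keyed by microsecond timestamp stands in for a column family sorted by long values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class ChangeLog {
    // Keys are microsecond timestamps, kept in ascending numeric order.
    private final NavigableMap<Long, String> entries = new TreeMap<>();

    // The "double write": alongside the real data, log which row changed and when.
    // Note: two writes with the same timestamp would overwrite each other here.
    void record(long timestampMicros, String rowKey) {
        entries.put(timestampMicros, rowKey);
    }

    // Row keys logged in the half-open time range [fromMicros, toMicros).
    List<String> changedBetween(long fromMicros, long toMicros) {
        return new ArrayList<>(entries.subMap(fromMicros, true, toMicros, false).values());
    }

    public static void main(String[] args) {
        ChangeLog log = new ChangeLog();
        log.record(10L, "rowA");
        log.record(20L, "rowB");
        log.record(30L, "rowC");
        System.out.println(log.changedBetween(15L, 31L)); // [rowB, rowC]
    }
}
```

A puller that remembers the upper bound of its last query can pass it as the next lower bound and see each logged change once.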

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread Tatu Saloranta
On Tue, Jun 22, 2010 at 5:58 AM, David Boxenhorn wrote: > I want to use UUIDs whose alphanumeric order is the same as their > chronological order. So I'm generating Version 4 UUIDs ( ... > Is there anything wrong with this idea? If you want to keep it completely ordered, it's probably not enough

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
A little bit of time fuzziness on the order of a few milliseconds is fine with me. This is user-generated data, so it only has to be time-ordered at the level that a user can perceive. I have no worries about my solution working - I'm sure it will work. I just wonder if TimeUUIDType isn't superior

Re: Uneven distribution using RP

2010-06-22 Thread James Golick
This node's load is now growing at a ridiculous rate. It is at 105GB, with the next most loaded node at 70.63GB. Given that RF=3, I would assume that the replicas' nodes would grow relatively quickly too? On Mon, Jun 21, 2010 at 6:44 AM, aaron morton wrote: > According to http://wiki.apache.org/

Re: Uneven distribution using RP

2010-06-22 Thread Robert Coli
On 6/22/10 10:07 AM, James Golick wrote: This node's load is now growing at a ridiculous rate. It is at 105GB, with the next most loaded node at 70.63GB. Given that RF=3, I would assume that the replicas' nodes would grow relatively quickly too? What Replica Placement Strategy are you using (R

Re: Uneven distribution using RP

2010-06-22 Thread James Golick
RackUnaware, currently On Tue, Jun 22, 2010 at 1:26 PM, Robert Coli wrote: > On 6/22/10 10:07 AM, James Golick wrote: > >> This node's load is now growing at a ridiculous rate. It is at 105GB, with >> the next most loaded node at 70.63GB. >> >> Given that RF=3, I would assume that the replicas'

forum application data model conversion

2010-06-22 Thread S Ahmed
Converting a Forum application to cassandra's data model. Tables: Posts [postID, threadID, userID, subject, body, created, lastmodified] So this table contains the actual question subject and body. When a user logs in, they want to see a list of their questions, and also order by the last-modif

Write Rate / Second

2010-06-22 Thread Mubarak Seyed
How can I find performance metrics such as write rate per second and read rate per second? I could not find them in the output of the tpstats and cfstats commands. Are there any attributes in JMX? Can someone please help me? Thanks, Mubarak

Re: Write Rate / Second

2010-06-22 Thread Jonathan Ellis
rate = operations / latency On Tue, Jun 22, 2010 at 2:50 PM, Mubarak Seyed wrote: > How to find out the performance metrics such as write rate per second, and > read rate per second. I could not find out from tpstats and cfstats command. > > Are there any attributes in JMX? Can someone please he

Re: how to implement the function similar to inbox search?

2010-06-22 Thread Jonathan Ellis
Not having an index doesn't matter if you're going to read all the subcolumns back at once, which IIANM is the idea here. On Mon, Jun 21, 2010 at 12:20 PM, hu wei wrote: > in datamodel wiki: >  You can think of each super column name as a term and the columns within as > the docids with rank info

Re: bulk loading

2010-06-22 Thread Torsten Curdt
I looked at the thrift service implementation and got it working. (Much faster import!) Thanks! On Mon, Jun 21, 2010 at 13:09, Oleg Anastasjev wrote: > Torsten Curdt writes: > >> >> First I tried with my one "cassandra -f" instance then I saw this >> requires a separate IP. (Why?) >

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread Tatu Saloranta
On Tue, Jun 22, 2010 at 9:12 AM, David Boxenhorn wrote: > A little bit of time fuzziness on the order of a few milliseconds is fine > with me. This is user-generated data, so it only has to be time-ordered at > the level that a user can perceive. Ok, so mostly ordered. :-) > I have no worries ab

SQL Server to Cassandra Schema Design - Ideas Anyone?

2010-06-22 Thread Craig Faulkner
I'm having a little block in converting an existing SQL Server schema that we have into Cassandra Keyspace(s). The whole key-value thing has just not clicked yet. Do any of you know of any good examples that are more complex than the example in the readme file? We are looking to report on web

Cassandra Health Monitoring

2010-06-22 Thread Andrew Psaltis
All, We have been working through some operations scenarios, so that we are ready to deploy our first Cassandra cluster into production in the coming months. During this process our operations folks have asked us to provide a Health Check service. I am using the word service here very liberally

Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-22 Thread Julie
Gary Dusbabek writes: > > *Hopefully* fixed. I was never able to duplicate the problem on my > workstation, but I had a pretty good idea what was causing the > problem. Julie, if you're in a position to apply and test the fix, it > would help us make sure we've got this one nai

Re: Uneven distribution using RP

2010-06-22 Thread James Golick
Turns out that this is due to a larger proportion of the wide rows in the system being located on that node. I moved its token over a little to compensate for it, but it doesn't seem to have helped at this point. What's confusing about this is that RF=3 and no other node's load is growing as quick

Re: Uneven distribution using RP

2010-06-22 Thread Jeremy Dunck
On Tue, Jun 22, 2010 at 4:08 PM, James Golick wrote: > Turns out that this is due to a larger proportion of the wide rows in the > system being located on that node. I moved its token over a little to > compensate for it, but it doesn't seem to have helped at this point. > What's confusing about t

Re: Uneven distribution using RP

2010-06-22 Thread James Golick
It's compacting at a ridiculously fast rate. The pending compactions have been growing for a while. It's also flushing memtables really quickly for a particular CF. Like, really quickly. Like, one every minute. I increased the thresholds by 10x and it's still going fast. On Tue, Jun 22, 2010 at 5

nodetool loadbalance : Streams Continue on Non Acceptance of New Token

2010-06-22 Thread Arya Goudarzi
Hi, Please confirm if this is an issue and should be reported or I am doing something wrong. I could not find anything relevant on JIRA: Playing with 0.7 nightly (today's build), I setup a 3 node cluster this way: - Added one node; - Loaded default schema with RF 1 from YAML using JMX; - Loa

Hector - Java doc

2010-06-22 Thread Mubarak Seyed
Where can I find the Javadoc for the Hector Java client? Do I need to build it from source? -- Thanks, Mubarak Seyed.

Never ending compaction

2010-06-22 Thread James Golick
We had to take a node down for an upgrade last night. When we brought it back online in the morning, it got slammed by HH data all day so badly that it was compacting near constantly, and the pending compactions pool was piling up. I shut most of the writes down to let things catch up, which they m

Re: Hector - Java doc

2010-06-22 Thread Jonathan Holloway
I couldn't find the docs online but the Ant build script here in the source: http://github.com/rantav/hector/blob/master/build.xml has a javadoc target you can run to generate them... hope that helps... Jon. On 22 June 2010 21:25, Mubarak Seyed wrote: > Where can i find the java doc for Hecto

Re: Hector - Java doc

2010-06-22 Thread Ran Tavory
There isn't an online javadoc page, but the code is online and well documented and there's a wiki and all sorts of documents and examples http://github.com/rantav/hector/blob/master/src/main/java/me/prettyprint/cassandra/service/Keyspace.java http://wiki.github.com/rantav/hector/ On Wed, Jun 23, 2

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
Having a physical location encoded in the UUID *increases* the chance of a collision, because it means fewer random bits. There definitely will be more than one UUID created in the same clock unit on the same machine! The same bits that you use to encode your few servers can be used for over 100 tr