date:20110314

Re: problems while TimeUUIDType-index-querying with two expressions

2011-03-14 Thread aaron morton

Perfectly reasonable, created https://issues.apache.org/jira/browse/CASSANDRA-2328 Aaron On 15 Mar 2011, at 16:52, Jonathan Ellis wrote: > Sounds like we should send an InvalidRequestException then. > > On Mon, Mar 14, 2011 at 8:06 PM, aaron morton wrote: >> It's failing when comparing two

running all unit tests

2011-03-14 Thread Jeffrey Wang

Hey all, We're applying some patches to our own branch of Cassandra, and we are wondering if there is a good way to run all the unit tests. Just having JUnit run all the test classes seems to result in a lot of errors that are hard to fix, so I'm hoping there's an easy way to do this. Thanks!

Re: problems while TimeUUIDType-index-querying with two expressions

2011-03-14 Thread Jonathan Ellis

Sounds like we should send an InvalidRequestException then. On Mon, Mar 14, 2011 at 8:06 PM, aaron morton wrote: > It's failing when comparing two TimeUUID values because one of them is not > properly formatted. In this case it's comparing a stored value with the > value passed in the get_index

Re: Calculate memory used for keycache

2011-03-14 Thread Robert Coli

On Mon, Mar 14, 2011 at 1:19 PM, Peter Schuller wrote: >> How is it possible to calculate this value? I think that the key size, if we use >> RandomPartitioner, will be 16 bytes, so the keycache will take 16*(num of keycache >> elements) bytes? > > The easiest way right now is probably empirical testing. The is

Re: calculating initial_token

2011-03-14 Thread aaron morton

Once the node has started once, it will not use the value for initial_token in cassandra.yaml. Use nodetool move to assign a new token to the node. nodetool loadbalance is generally a bad idea. Aaron (www.spidertracks.com) On 15 Mar 2011, at 13:04, Narendra Sharma wrote: > The %age (owns) is jus

Re: problems while TimeUUIDType-index-querying with two expressions

2011-03-14 Thread aaron morton

It's failing when comparing two TimeUUID values because one of them is not properly formatted. In this case it's comparing a stored value with the value passed in the get_indexed_slice() query expression. I'm going to assume it's the value passed for the expression. When you create the Inde
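A client-side check along the following lines could catch a badly formatted value before it is sent in a get_indexed_slice() expression. This is only a minimal Python sketch: the helper name is_time_uuid is hypothetical, and it assumes the expression value is passed as the raw 16 bytes of a version 1 (time-based) UUID.

```python
import uuid


def is_time_uuid(raw: bytes) -> bool:
    """Return True if raw is the 16-byte form of a version 1 (time-based) UUID.

    Hypothetical helper; assumes the value is sent as raw UUID bytes.
    """
    if len(raw) != 16:
        # A TimeUUID is always exactly 16 bytes long.
        return False
    return uuid.UUID(bytes=raw).version == 1
```

For example, is_time_uuid(uuid.uuid1().bytes) holds, while a random (version 4) UUID or an arbitrary string of bytes is rejected.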

Re: Seed

2011-03-14 Thread aaron morton

What page is that from? Aaron On 15 Mar 2011, at 06:20, mcasandra wrote: > > Tyler Hobbs-2 wrote: >> >> Seeds: >> Never use a node's own address as a seed if you are bootstrapping it by >> setting autobootstrap to true! >> > > I came across this on the wiki. Can someone please help me und

Re: problem with bootstrap

2011-03-14 Thread aaron morton

Thanks, will try to look into it. Aaron On 14 Mar 2011, at 20:43, Patrik Modesto wrote: > On Fri, Mar 11, 2011 at 22:31, Aaron Morton wrote: >> The assertion is interesting. Can you reproduce it with logging at debug and >> post the results? Could you try to reproduce it with a clean cluster?

Re: calculating initial_token

2011-03-14 Thread Narendra Sharma

The %age (owns) is just the arc length in terms of %age of tokens a node owns out of the total token space. It doesn't reflect the actual data. The size (load) is the real current load. -Naren On Mon, Mar 14, 2011 at 2:59 PM, Sasha Dolgy wrote: > ah, you know ... i have been reading it wrong.
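The "arc length" reading of Owns can be sketched in a few lines of Python. This assumes RandomPartitioner with a token space of 2**127 and distinct tokens, where each node owns the range from its predecessor's token (exclusive) up to its own token (inclusive); the function name ownership is hypothetical.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def ownership(tokens):
    """Map each token to the percentage of the ring its node owns.

    Assumes distinct tokens; each node owns (previous token, own token].
    """
    ordered = sorted(tokens)
    result = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps around to the last token
        # A single-node ring yields a zero difference; treat it as the full ring.
        span = (token - previous) % RING_SIZE or RING_SIZE
        result[token] = 100.0 * span / RING_SIZE
    return result
```

Two nodes at tokens 0 and 2**126 each come out at 50%, regardless of how much data (load) either one actually holds.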

Re: Write speed roughly 1/10 of expected.

2011-03-14 Thread Tyler Hobbs

> > Re: Mr. Hobbs, > > Did you mean "which has the benefit of THRIFT-638, while 0.7.a.2 does not" > (instead of 0.7.a.3)? 0.7.a.3 was the latest version of phpcassa we could > find on github. We installed 0.7.a.3 with its C extension and didn't see an > improvement. Is there a newer version with TH

Re: nodetool loadbalance

2011-03-14 Thread Sasha Dolgy

Yes, a lot of what is on the wiki makes perfect sense when read the right way. I suppose there aren't enough pictures or before/after info online to help the knowledge flow. On Mar 14, 2011 11:52 PM, "Jonathan Ellis" wrote: > You should read http://wiki.apache.org/cassandra/Operations before > runni

Re: nodetool loadbalance

2011-03-14 Thread Sasha Dolgy

Using the tokens I generated earlier, i ran nodetool move on each node and things look much better for the Owns % ...
Address   Status  State   Load       Owns    Token
                                             170141183460469231731687303715884105725
10.0.0.1  Up      Normal  234.51 KB  0.00%   0
10.0.0.2  Up      Normal

Re: nodetool loadbalance

2011-03-14 Thread Jonathan Ellis

You should read http://wiki.apache.org/cassandra/Operations before running loadbalance. On Mon, Mar 14, 2011 at 5:27 PM, Sasha Dolgy wrote: > With my six node cluster ... nodetool loadbalance should be run on one > node or all six? I run it on one and the ownership percentage gets > even more un

Re: Linux HugePages and mmap

2011-03-14 Thread mcasandra

Thanks! I think it is still a good idea to enable HugePages and use the UseLargePages option in the JVM. What do you think? -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Linux-HugePages-and-mmap-tp6170193p6171008.html Sent from the cassandra-u...@in

Re: nodetool loadbalance

2011-03-14 Thread Sasha Dolgy

With my six node cluster ... nodetool loadbalance should be run on one node or all six? I run it on one and the ownership percentage gets even more unbalanced. So... in the spirit of the evening, I run it on another node. As you see, the ownership % keeps increasing and the token numbers kee

Re: Linux HugePages and mmap

2011-03-14 Thread Jonathan Ellis

On Mon, Mar 14, 2011 at 3:01 PM, mcasandra wrote: > > Jonathan Ellis-3 wrote: >> >> Wrong. The recommendation is to leave it on auto. >> > this is where I see mmap recommended for index. > http://wiki.apache.org/cassandra/StorageConfiguration FTFY. >> HugePages has nothing to do with disk acces

Re: calculating initial_token

2011-03-14 Thread Sasha Dolgy

ah, you know ... i have been reading it wrong. the output shows a nice fancy column called "Owns" but i've only ever seen the percentage ... the amount of data or "load" is even ... doh. thanks for the reply. cheers -sd On Mon, Mar 14, 2011 at 10:47 PM, Narendra Sharma wrote: > On the same pag

Re: calculating initial_token

2011-03-14 Thread Narendra Sharma

On the same page there is a section on Load Balance that talks about a Python script to compute tokens. I believe your question is more about assigning new tokens than about computing them. 1. "nodetool loadbalance" will result in recomputation of tokens. It will pick tokens based on the load and not t

Re: Calculate memory used for keycache

2011-03-14 Thread Narendra Sharma

Some time back I looked at the code to find that out. The following is the result. There will be some additional overhead for the internal data structures of ConcurrentLinkedHashMap. * (<8 bytes for position i.e. value> + + <16 bytes for token (RP)> + <8 byte reference for DecoratedKey> + <8 bytes for descriptor ref
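As a rough illustration only, the per-entry terms that survive in the message above can be summed in Python. The key-size term and the entry count are assumptions (they are not legible in the truncated formula), the list of terms is cut off, and the ConcurrentLinkedHashMap overhead is not included, so the result should be read as a lower bound rather than a real figure. The function name is hypothetical.

```python
def estimate_key_cache_bytes(num_entries, avg_key_bytes):
    """Lower-bound estimate of key cache memory, in bytes.

    Assumption: the missing terms of the truncated formula are the entry
    count and the raw key size; map overhead is deliberately ignored.
    """
    per_entry = (
        8                # position (the cached value)
        + avg_key_bytes  # raw key bytes (assumed term)
        + 16             # token under RandomPartitioner
        + 8              # reference to the DecoratedKey
        + 8              # descriptor reference
    )
    return num_entries * per_entry
```

Empirical testing, as suggested elsewhere in this thread, remains the more reliable way to find the true number.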

Re: Calculate memory used for keycache

2011-03-14 Thread Peter Schuller

> How is it possible to calculate this value? I think that the key size, if we use > RandomPartitioner, will be 16 bytes, so the keycache will take 16*(num of keycache > elements) bytes? The easiest way right now is probably empirical testing. The issue is that the "memory use" must include overhead associated

Re: Strange behaivour

2011-03-14 Thread Peter Schuller

Can you try a 'strace -fp PID' when it's in the state of spinning with system CPU time? I'm wondering whether it's stuck in a single syscall or just spinning around one or a set of syscalls. I have very vague recollections of a discussion on the list a few months ago about triggering a kernel bug

Re: Out of Memory every 2 weeks

2011-03-14 Thread Peter Schuller

> I am going to try that. Also, you may want to augment your VM options with: -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps That way there should hopefully be some corroborating evidence as to the nature of the heap growth over time. -- / Peter Schuller

Re: Write speed roughly 1/10 of expected.

2011-03-14 Thread Steven Liu

Re: Mr. Schuller, The test documents are very small (a few lines of text each). Test data model is standard CF with each document corresponding to a row containing 9-12 columns. We are using a single client for sequential batch_insert (probably maps to batch mutate in phpcassa), so it is very possi

Calculate memory used for keycache

2011-03-14 Thread ruslan usifov

Hello, How is it possible to calculate this value? I think that the key size, if we use RandomPartitioner, will be 16 bytes, so the keycache will take 16*(num of keycache elements) bytes?

calculating initial_token

2011-03-14 Thread Sasha Dolgy

Sorry for being a bit daft ... Wanted a bit of validation or rejection ... If I have a 6 node cluster, replication factor 2 (don't think this is applicable to the token decision) is the following sufficient and correct for determining the tokens: #!/bin/bash for nodes in {0..5}; do echo "$nod
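For comparison, evenly spaced tokens can be computed as in the Python sketch below. It assumes RandomPartitioner with a token space of 2**127; the function name initial_tokens is hypothetical, and the replication factor plays no part in the calculation.

```python
def initial_tokens(node_count):
    """Return evenly spaced tokens for node_count nodes.

    Assumes RandomPartitioner, i.e. a token space of 2**127.
    """
    ring_size = 2 ** 127
    # Integer division keeps every token exact; floats would lose precision.
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(initial_tokens(6)):
        print(node, token)
```

Each value would go into initial_token in cassandra.yaml before a node's first start; for a node that has already started, nodetool move is the way to apply it.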

Re: Increase flush writer queue

2011-03-14 Thread Brandon Williams

On Mon, Mar 14, 2011 at 1:03 PM, Daniel Doubleday wrote: > I was thinking of setting the work queue in CFS.flushWriterPool to > > new LinkedBlockingQueue(3) // because 3 is my favorite number > > instead of > > new > LinkedBlockingQueue(DatabaseDescriptor.getAllDataFileLocations().length) > Alter

Re: Linux HugePages and mmap

2011-03-14 Thread mcasandra

Jonathan Ellis-3 wrote: > > Wrong. The recommendation is to leave it on auto. > this is where I see mmap recommended for index. http://wiki.apache.org/cassandra/StorageConfiguration Jonathan Ellis-3 wrote: > > HugePages has nothing to

Re: Linux HugePages and mmap

2011-03-14 Thread Jonathan Ellis

On Mon, Mar 14, 2011 at 1:59 PM, mcasandra wrote: > Currently, in cassandra.yaml disk_access_mode is set to "auto" but the > recommendation seems to be to use 'mmap_index_only'. Wrong. The recommendation is to leave it on auto. > If we use HugePages then do we still need to worry about setting

Re: Increase flush writer queue

2011-03-14 Thread Jonathan Ellis

Sure, that should be fine. On Mon, Mar 14, 2011 at 1:03 PM, Daniel Doubleday wrote: > Hi all, > > on 0.6: > > we are facing increased write latencies every now and then when an > unfortunate write command thread becomes the flush writer for a mem table > because of an already running mem table

Re: Strange behaivour

2011-03-14 Thread ruslan usifov

I found that this happened after a schema change, and it hung on the waitpid syscall. What can I do about this?

Re: reducing disk usage advice

2011-03-14 Thread Karl Hiramoto

On 03/14/11 15:33, Sylvain Lebresne wrote: > > CASSANDRA-1537 is probably also a partial but possibly sufficient > solution. That's also probably easier than CASSANDRA-1610 and I'll try > to give it a shot asap, that had been on my todo list way too long. > Thanks, eager to see CASSANDRA-1610 somed

Linux HugePages and mmap

2011-03-14 Thread mcasandra

Currently, in cassandra.yaml disk_access_mode is set to "auto" but the recommendation seems to be to use 'mmap_index_only'. If we use HugePages then do we still need to worry about setting disk_access_mode to mmap? I am planning to enable HugePages and use -XX:+UseLargePages option in JVM. I had a

Re: Map-Reduce on top of cassandra

2011-03-14 Thread Jeremy Hanna

Just for the sake of updating this thread - Orr didn't yet have task trackers on the Cassandra nodes so most of the time was likely due to copying the ~100G of data to the hadoop cluster prior to processing. They're going to try after installing task trackers on the nodes. On Mar 14, 2011, at

Increase flush writer queue

2011-03-14 Thread Daniel Doubleday

Hi all, on 0.6: we are facing increased write latencies every now and then when an unfortunate write command thread becomes the flush writer for a mem table because of an already running mem table flush. I was thinking of setting the work queue in CFS.flushWriterPool to new LinkedBlockingQu

Re: Double ColumnType and comparing

2011-03-14 Thread Norman Maurer

I will have a look at what it takes to implement it. Bye, Norman 2011/3/14 David Boxenhorn > If you do it, I'd recommend BigDecimal. It's an exact type, and usually what > you want. > > On Mon, Mar 14, 2011 at 3:40 PM, Jonathan Ellis wrote: > >> We'd be happy to commit a patch contributing a

Re: Seed

2011-03-14 Thread mcasandra

Tyler Hobbs-2 wrote: > > Seeds: > Never use a node's own address as a seed if you are bootstrapping it by > setting autobootstrap to true! > I came across this on the wiki. Can someone please help me understand this with some example? -- View this message in context: http://cassandra-user-i

Re: On 0.6.6 to 0.7.3 migration, DC-aware traffic and minimising data transfer

2011-03-14 Thread David Boxenhorn

How do you write to two versions of Cassandra from the same client? Two versions of Hector? On Mon, Mar 14, 2011 at 6:46 PM, Robert Coli wrote: > On Mon, Mar 14, 2011 at 8:39 AM, Jedd Rashbrooke > wrote: > > But more importantly for us it would mean we'd have just the > > one major outage, ra

Re: On 0.6.6 to 0.7.3 migration, DC-aware traffic and minimising data transfer

2011-03-14 Thread Robert Coli

On Mon, Mar 14, 2011 at 8:39 AM, Jedd Rashbrooke wrote: > But more importantly for us it would mean we'd have just the > one major outage, rather than two (relocation and 0.6 -> 0.7) Take zero major outages instead? :D a) Set up new cluster on new version. b) Fork application writes, so all wr

Fron scribe to cassandra

2011-03-14 Thread salidu andrea

Hi all, On the wiki pages I read an example (written in Java) of storing data from Scribe. I tried to write the same code in PHP without success. Has anyone here tried to do the same? Thanks in advance, javasilk

Re: Out of Memory every 2 weeks

2011-03-14 Thread Jean-Yves LEBLEU

Thank you, I am going to try that.

Re: Out of Memory every 2 weeks

2011-03-14 Thread Robert Coli

On Mon, Mar 14, 2011 at 8:27 AM, Jean-Yves LEBLEU wrote: > Sorry to create a new thread about Out of Memory problem, but I > checked all other threads and did not find the answer. > [...] > The question is I don't really understand the configuration problem, > if somebody has any clue of what we

Re: On 0.6.6 to 0.7.3 migration, DC-aware traffic and minimising data transfer

2011-03-14 Thread Jedd Rashbrooke

Jonathan, thank you for your answers here. To explain this bit ... On 11 March 2011 20:46, Jonathan Ellis wrote: > On Thu, Mar 10, 2011 at 6:06 AM, Jedd Rashbrooke wrote: >> Copying a cluster between AWS DC's: >> We have ~ 150-250GB per node, with a Replication Factor of 4. >> I ack that 0

Re: Map-Reduce on top of cassandra

2011-03-14 Thread Jeremy Hanna

Can you go into the #cassandra channel and ask your question? See if jeromatron or driftx are around. That way there can be a back and forth about settings and things. http://webchat.freenode.net/?channels=#cassandra On Mar 14, 2011, at 10:06 AM, Or Yanay wrote: > Hi All, > > I am trying t

Out of Memory every 2 weeks

2011-03-14 Thread Jean-Yves LEBLEU

Sorry to create a new thread about the Out of Memory problem, but I checked all other threads and did not find the answer. We have a running cluster of 2 Cassandra nodes, replication factor = 2, on Red Hat 4.8 32-bit with 4 GB of memory, where we periodically run out of memory (every 2 weeks) and both n

problems while TimeUUIDType-index-querying with two expressions

2011-03-14 Thread Johannes Hoerle

Hi all, in order to improve our queries, we started to use IndexedSliceQueries from the hector project (https://github.com/zznate/hector-examples). I followed the instructions for creating IndexedSlicesQuery with GetIndexedSlices.java. I created the corresponding CF with in a keyspace called "Ke

Map-Reduce on top of cassandra

2011-03-14 Thread Or Yanay

Hi All, I am trying to write some map-reduce tasks so I can find out stuff like - how many records have X status? I am using 0.7.0 and have 5 nodes with ~100G of data on each node. I have written the code based on the word_count example and the map-reduce is running successfully BUT is extremel

Re: reducing disk usage advice

2011-03-14 Thread Sylvain Lebresne

On Sun, Mar 13, 2011 at 7:10 PM, Karl Hiramoto wrote: > > Hi, > > I'm looking for advice on reducing disk usage. I've run out of disk space > two days in a row while running a nightly scheduled nodetool repair && > nodetool compact cronjob. > > I have 6 nodes RF=3 each with 300 GB drives at

Re: Double ColumnType and comparing

2011-03-14 Thread David Boxenhorn

If you do it, I'd recommend BigDecimal. It's an exact type, and usually what you want. On Mon, Mar 14, 2011 at 3:40 PM, Jonathan Ellis wrote: > We'd be happy to commit a patch contributing a DoubleType. > > On Sun, Mar 13, 2011 at 7:36 PM, Paul Teasdale > wrote: > > I am quite new to Cassandra a

Re: Double ColumnType and comparing

2011-03-14 Thread Jonathan Ellis

We'd be happy to commit a patch contributing a DoubleType. On Sun, Mar 13, 2011 at 7:36 PM, Paul Teasdale wrote: > I am quite new to Cassandra and am trying to model a simple Column Family > which uses Doubles as column names: > Datalines: { // ColumnFamily > dataline-1:{ // row key > 23.5: 'som

Re: secondary indexes on data imported by json2sstable

2011-03-14 Thread Jonathan Ellis

You'd need to drop and recreate the index (but see https://issues.apache.org/jira/browse/CASSANDRA-2320 when doing this). On Mon, Mar 14, 2011 at 6:07 AM, Terje Marthinussen wrote: > Hi, > Should it be expected that secondary indexes are automatically regenerated > when importing data using json2

Conflict resolution in Cassandra

2011-03-14 Thread Milind Parikh

https://docs.google.com/document/d/13Yc2t4d07290TdiRmSTchuAk9sbp4BeqOpqeYhbcDFM/edit?hl=en There was an excellent session on vector clocks and synchronous writes in cassandra. Here are my gleanings out of it. /*** sent from my android...please pardon occasional typos as I resp

Re: On 0.6.6 to 0.7.3 migration, DC-aware traffic and minimising data transfer

2011-03-14 Thread Chris Burroughs

On 03/11/2011 03:46 PM, Jonathan Ellis wrote: > Repairs is not yet WAN-optimized but is still cheap if your replicas > are close to consistent since only merkle trees + inconsistent ranges > are sent over the network. > What is the ticket number for WAN optimized repair?

Re: Double ColumnType and comparing

2011-03-14 Thread Eric Charles

Or maybe convert double to long, just as hector's DoubleSerializer does https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/serializers/DoubleSerializer.java I was happy to use it here. Tks, - Eric On 14/03/2011 02:52, aaron morton wrote: There is nothing in
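The double-to-long idea can be illustrated in Python with the standard struct module. This is only a sketch of the general technique (reinterpreting the 8 bytes of an IEEE 754 double as a signed 64-bit integer, in the manner of Java's Double.doubleToLongBits); it is an assumption that Hector's DoubleSerializer behaves this way, and both function names are hypothetical.

```python
import struct


def double_to_long_bits(value: float) -> int:
    """Reinterpret the 8 bytes of an IEEE 754 double as a signed 64-bit int."""
    return struct.unpack(">q", struct.pack(">d", value))[0]


def long_bits_to_double(bits: int) -> float:
    """Inverse of double_to_long_bits."""
    return struct.unpack(">d", struct.pack(">q", bits))[0]
```

One caveat with this encoding: for non-negative doubles the resulting longs sort in numeric order, but negative doubles sort in reverse (-2.0 maps to a larger long than -1.0), which matters if the long is used as a column name under a long-based comparator.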

Re: secondary indexes on data imported by json2sstable

2011-03-14 Thread Norman Maurer

I would expect they get created on the fly while importing. If not, I think it's a bug... Bye, Norman 2011/3/14 Terje Marthinussen > Hi, > > Should it be expected that secondary indexes are automatically regenerated > when importing data using json2sstable? > Or is there some manual procedure th

secondary indexes on data imported by json2sstable

2011-03-14 Thread Terje Marthinussen

Hi, Should it be expected that secondary indexes are automatically regenerated when importing data using json2sstable? Or is there some manual procedure that needs to be done to generate them? Regards, Terje

Re: problem with bootstrap

2011-03-14 Thread Patrik Modesto

On Fri, Mar 11, 2011 at 22:31, Aaron Morton wrote: > The assertion is interesting. Can you reproduce it with logging at debug and > post the results? Could you try to reproduce it with a clean cluster? It was on a clean cluster last time. Anyway I started clean cluster again, repeated the same s

56 matches
