date:20101027

Question regarding support of batch_mutate + delete + slice predicate

2010-10-27 Thread Dwight Smith

Investigation of this combination led to the following: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/batch-mutate-deletion-slice-range-predicate-unsupported-td5048309.html Are there plans (6.x or 7) to support this? Thanks

Re: Cassandra newbie question

2010-10-27 Thread Arijit Mukherjee

Thanks Gary. I was thinking of using range partitioning for breaking the input. Say, we could have different threads handling different ranges - (A-J) by thread1, (K-P) by thread2. This way, there probably won't be any chance of collision. But the thread which actually performs the distribution cou
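
The range partitioning idea in this message could look roughly like the sketch below. It is only an illustration: the first two ranges come from the message, while the third range (Q-Z), the function names `worker_for` and `partition`, and the use of in-memory lists instead of real threads are assumptions.

```python
from collections import defaultdict

# (A-J) and (K-P) are from the message; (Q-Z) is an assumed third range.
RANGES = [("A", "J"), ("K", "P"), ("Q", "Z")]


def worker_for(key):
    """Return the index of the worker whose letter range covers the key."""
    first = key[0].upper()
    for index, (low, high) in enumerate(RANGES):
        if low <= first <= high:
            return index
    raise ValueError("no range covers key %r" % key)


def partition(records):
    """Group (A, B, n) records by worker so that equal keys never collide."""
    buckets = defaultdict(list)
    for a, b, n in records:
        buckets[worker_for(a)].append((a, b, n))
    return dict(buckets)


records = [("A", "B", 2), ("P", "Q", 5), ("X", "Y", 3), ("A", "B", 8), ("A", "B", 2)]
buckets = partition(records)
```

Because every occurrence of the same key lands in the same bucket, only one worker ever updates a given key. The single distributing step remains a possible bottleneck, as the message goes on to note.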

Re: Config Maximum heap size for Cassandra

2010-10-27 Thread Jonathan Ellis

http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html On Wed, Oct 27, 2010 at 9:30 PM, JKnight JKnight wrote: > Could you tell me why Cassandra use memory more than needed? > > > On Thu, Oct 28, 2010 at 9:15 AM, Nicholas Knight > wrote: >> >> Presumably you're on a 32-bit archite

Re: Config Maximum heap size for Cassandra

2010-10-27 Thread Nicholas Knight

Cassandra needs all the RAM you can give it so it can cache things for optimum performance. If you need it to use less, give it less. -NK On Oct 28, 2010, at 10:30 AM, JKnight JKnight wrote: > Could you tell me why Cassandra use memory more than needed? > > > On Thu, Oct 28, 2010 at 9:15 AM,

network configurations in medium to large size installations

2010-10-27 Thread Terje Marthinussen

Hi, Just curious if anyone has any best practices/experiences/thoughts to share on network configurations for cassandra setups with tens to hundreds of nodes and high traffic (thousands of requests/sec)? For instance: - Do you just "hook it all together"? - If you have 2 interfaces, do you prefer

Re: Config Maximum heap size for Cassandra

2010-10-27 Thread JKnight JKnight

Could you tell me why Cassandra uses more memory than needed? On Thu, Oct 28, 2010 at 9:15 AM, Nicholas Knight wrote: > Presumably you're on a 32-bit architecture (or at least a 32-bit JVM). > 32-bit processes won't be able to address more than "X" amount of memory, > where X would usually be >=

Re: Config Maximum heap size for Cassandra

2010-10-27 Thread Nicholas Knight

Presumably you're on a 32-bit architecture (or at least a 32-bit JVM). 32-bit processes won't be able to address more than "X" amount of memory, where X would usually be >= 2GB, and < 4GB. The reason you can't use a full 4GB is that part of the address space is necessarily reserved by the OS ke
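
The arithmetic behind the 4GB ceiling can be checked with a short sketch. The helper name `addressable_gib` is hypothetical, and the calculation covers only the raw pointer width, not the portion reserved by the OS kernel that the message describes.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_gib(pointer_bits):
    """Size of the address space, in GiB, reachable with pointers of this width."""
    return (2 ** pointer_bits) / GIB


# A 32-bit process can name at most 2**32 distinct byte addresses.
limit_32 = addressable_gib(32)
```

Since part of that space is reserved, a heap setting such as -Xmx4G cannot be fully honoured by a 32-bit JVM.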

Config Maximum heap size for Cassandra

2010-10-27 Thread JKnight JKnight

Hi all, When I configure the maximum heap size as -Xmx4G, memory consumption grows to 3.5G. When I call Perform GC (jconsole), the used memory drops to 1G. When I configure the maximum heap size as -Xmx2G, the Cassandra system runs well. Is that a Cassandra problem? I want Cassandra to use memory more effectively. How can I do

Re: Cluster load balancing?

2010-10-27 Thread Tyler Hobbs

Not sure if this is the cause, but do all of your nodes have the same seed list? Did you bring up the seeds first? - Tyler On Wed, Oct 27, 2010 at 1:46 PM, Thibaut Britz < thibaut.br...@trendiction.com> wrote: > Depending on the range I choose, choosing manually a token will also fail. > (node

Re: cassandra + avro | python client vs java client

2010-10-27 Thread Jonathan Ellis

Then you should use Thrift from Python if you are concerned about speed. (I think the speed penalty there is only about 2x w/ the extension.) On Wed, Oct 27, 2010 at 4:15 PM, Koert Kuipers wrote: > It does not have a c extension as far as I know > > -Original Message- > From: Jonathan El

Re: 0.7 problem on cygwin

2010-10-27 Thread Chris Oei

I guess so. I tried hacking a quick work-around for the "Filename must include parent directory", but I got another error (below). So, since it appears that mixing architectures is not officially supported, I think I'll give up on this. Goodbye, Windows 7. Thanks, Chris ERROR 14:07:47,534 Fatal

RE: cassandra + avro | python client vs java client

2010-10-27 Thread Koert Kuipers

It does not have a c extension as far as I know -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Wednesday, October 27, 2010 5:01 PM To: user Subject: Re: cassandra + avro | python client vs java client Does Avro have a Python C extension yet? If not, 10x is righ

Re: 0.7 problem on cygwin

2010-10-27 Thread Jonathan Ellis

Short version: don't mix nodes on different architectures in the same cluster. On Wed, Oct 27, 2010 at 2:09 PM, Chris Oei wrote: > Hi all, > > I'm getting the following when I try to bootstrap my Cassandra cluster on a > Windows > machine. > > INFO 11:47:10,300 Joining: sleeping 3 ms for pend

Re: java.lang.OutOfMemoryError: Map failed

2010-10-27 Thread Jonathan Ellis

Sounds like either you are running on a 32bit architecture or JVM or you don't have OS level permissions to mmap large Cassandra data files. One workaround may be to switch to mmap_index_only mode. On Wed, Oct 27, 2010 at 1:49 PM, Matthew Dennis wrote: > 2 GiB is pretty small for a C* node. You

Re: cassandra + avro | python client vs java client

2010-10-27 Thread Jonathan Ellis

Does Avro have a Python C extension yet? If not, 10x is right in line with how much faster I would expect Java to be than pure Python. On Wed, Oct 27, 2010 at 11:59 AM, Koert Kuipers wrote: > Hey all, > > I have Cassandra 0.7 (nightly build from halfway September) running on one > test machine w

Re: 0.7.0beta2 spinning/wedged after aggressive overnight writing

2010-10-27 Thread Jonathan Ellis

(moving to user list.) This sounds like you are GC storming (gcinspector lines in the log could confirm/refute this) and if I were to guess it would be that the memtable thresholds picked by b2 are too high. We cut them in half for rc1 in http://issues.apache.org/jira/browse/CASSANDRA-1641, but a

Re: How to get the result from the closest node

2010-10-27 Thread Joe Alex

Thanks much, verifies what I thought it is doing when connecting to a random node. Will play with RackAware and DCQUORUM. Wanted to see if anybody else has a case where they want to connect to local Data Center always. A case where the Nodes are geographically apart like A (NY) and D (London). Wan

Re: 0.7 problem on cygwin

2010-10-27 Thread ruslan usifov

Sorry for my bad English. During bootstrap, Cassandra sends the full file path between nodes. So, for example, a Windows node decides to send the file value-e-27-Data.db to a Unix node (Cygwin in your case). The Unix node receives the full path of the file "value-e-27-Data.db" on the Windows node, i.e. F:\cassandra\7.0\data\test_1\value-e-27-
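
The separator mismatch described here can be reproduced outside Cassandra. The sketch below uses Python's standard `ntpath` and `posixpath` modules, not Cassandra's own code, and the path is a hypothetical example rather than the one from the message.

```python
import ntpath
import posixpath

# Hypothetical Windows-style path, as a Windows node might send it.
windows_path = r"F:\example\data\keyspace\cf-e-1-Data.db"

# Unix rules look for "/" only, so no parent directory is found at all.
unix_parent = posixpath.dirname(windows_path)
unix_name = posixpath.basename(windows_path)

# Windows rules split on "\" and recover the file name as intended.
win_parent = ntpath.dirname(windows_path)
win_name = ntpath.basename(windows_path)
```

Under Unix rules the whole string is treated as a single file name with an empty parent, which may be related to the "Filename must include parent directory" error reported elsewhere in this thread.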

Re: 0.7 problem on cygwin

2010-10-27 Thread Chris Oei

Sorry -- I don't quite understand: what is not supported by cassandra? The bin directory contains cassandra.bat, so I assumed cassandra works on Windows. Do you mean that cassandra works on Windows but not on cygwin? I had already checked my cassandra.yaml file to make sure that I used backslashes

Re: 0.7 problem on cygwin

2010-10-27 Thread ruslan usifov

It is caused by the difference between the path separator characters on Windows (\) and Unix or Mac OS (/), and this isn't supported by Cassandra. If you are interested, I can send you a patch which solves this problem. Why is it so? I don't know; that is a question for the developers of Cassandra. 2010/10/27 Chris Oei > H

0.7 problem on cygwin

2010-10-27 Thread Chris Oei

Hi all, I'm getting the following when I try to bootstrap my Cassandra cluster on a Windows machine. INFO 11:47:10,300 Joining: sleeping 3 ms for pending range setup INFO 11:47:40,302 Bootstrapping ERROR 11:47:40,453 Fatal exception in thread Thread[Thread-5,5,main] java.lang.AssertionError:

Re: Adding nodes wrong/data not balanced across nodes

2010-10-27 Thread Matthew Dennis

You need to specify your initial tokens. LoadBalance really doesn't do a good job of balancing the load. Take a look at "Load Balancing" in http://wiki.apache.org/cassandra/Operations There is a little python script in there to help you pick tokens for a given cluster size. If you don't want to
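
As a rough illustration of picking initial tokens, the sketch below spaces them evenly for a given cluster size. It is not the script from the wiki page: it assumes RandomPartitioner with a token space of 0 to 2**127, and the function name `initial_tokens` is hypothetical.

```python
TOKEN_SPACE = 2 ** 127  # assumed RandomPartitioner token range


def initial_tokens(node_count):
    """Return evenly spaced tokens, one per node, around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * TOKEN_SPACE // node_count for i in range(node_count)]


# Example: tokens for a four-node cluster.
tokens = initial_tokens(4)
```

Each value would be set as one node's initial token before that node first joins the ring.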

Re: java.lang.OutOfMemoryError: Map failed

2010-10-27 Thread Matthew Dennis

2 GiB is pretty small for a C* node. You can also try reducing all the caching to zero with so little memory. If you have lots of CFs you probably want to reduce the memtable throughput too. On Wed, Oct 27, 2010 at 12:43 PM, Koert Kuipers < koert.kuip...@diamondnotch.com> wrote: > While bootst

Re: Cluster load balancing?

2010-10-27 Thread Thibaut Britz

Depending on the range I choose, manually choosing a token will also fail. (node will never exit bootstrap, streams doesn't list any open streams) INFO [Thread-53] 2010-10-27 20:33:37,399 SSTableReader.java (line 120) Sampling index for /hd2/cassandra/data/table_xyz/table_xyz-3-Data.db INFO [Thr

Re: High BloomFilterFalseRation

2010-10-27 Thread Daniel Doubleday

Ah of course - question makes total sense. But no: this is not the case: I am not constantly asking the same question since the tree is deep enough. Most data nodes are level 5 from the root. So the parents getting queried will be different most of the time. Since the parent nodes are created

Re: Cluster load balancing?

2010-10-27 Thread Thibaut Britz

Hello Tyler, thanks for the quick answer. That's true, I should have noticed. I also tried kicking out one node, clearing all directories and then restarting it with the bootstrap option. It received a few files, but just sat there in bootstrapping mode (streams always printed bootstrapping witho

java.lang.OutOfMemoryError: Map failed

2010-10-27 Thread Koert Kuipers

While bootstrapping a new node, the existing node that is supposed to provide the data throws an error, and the bootstrapping hangs. The log from the existing node is below. Both nodes have little memory (only 2 Gig, windows machines). I used default configurations (Cassandra 0.7). Any suggestio

Re: Cluster load balancing?

2010-10-27 Thread Tyler Hobbs

With OrderPreservingPartitioner, you have to keep the ring balanced manually. This is why people frequently suggest that you use RandomPartitioner unless you absolutely have to do otherwise. With OPP, keys are *not* evenly distributed around the ring. Apparently you have lots of keys that are bet

Re: High BloomFilterFalseRation

2010-10-27 Thread Jonathan Ellis

Do you have a key "a/b" then? What columns does it have? On Wed, Oct 27, 2010 at 9:14 AM, Daniel Doubleday wrote: > Hm - > > not sure if I understand the random question. We are using RP. But I wouldn't > know why that should matter. > I thought that the bloom filter hash function should evenly

Re: High BloomFilterFalseRation

2010-10-27 Thread Daniel Doubleday

Hm - not sure if I understand the random question. We are using RP. But I wouldn't know why that should matter. I thought that the bloom filter hash function should evenly distribute no matter what keys come in. Keys are '/' separated strings (aka paths :-)) I do bulk inserts like: (1000 rows

Re: High BloomFilterFalseRation

2010-10-27 Thread Jonathan Ellis

This is not expected, no. How random are your queries? If you have a couple outlier rows causing the false positives that are being queried over and over then that could just be the luck of the draw. On Wed, Oct 27, 2010 at 5:24 AM, Daniel Doubleday wrote: > Hi people > > We are currently movin

Re: Time to wait for CF to be consistent after stopping writes.

2010-10-27 Thread Gary Dusbabek

On Wed, Oct 27, 2010 at 05:08, Utku Can Topçu wrote: > Hi, > > For a columnfamily in a keyspace which has RF=3, I'm issuing writes with > ConsistencyLevel.ONE. > > in the configuration I have: > - memtable_flush_after_mins : 30 > - memtable_throughput_in_mb : 32 > > I'm writing to this columnfamil

Re: Cassandra newbie question

2010-10-27 Thread Gary Dusbabek

On Wed, Oct 27, 2010 at 03:24, Arijit Mukherjee wrote: > Hi All > > I've another related question. > > I am using a stream of records of the form (A, B, n) where the pair > (A,B) can occur multiple times. For example, you could have the > following set of records - > > A, B, 2 > P, Q, 5 > X, Y, 3

High BloomFilterFalseRation

2010-10-27 Thread Daniel Doubleday

Hi people We are currently moving our second use case from mysql to cassandra. While importing the data (ongoing) I noticed that the BloomFilterFalseRation seems to be pretty high compared to another CF which is in use in production right now. It's a hierarchical data model and I cannot avoid t

Time to wait for CF to be consistent after stopping writes.

2010-10-27 Thread Utku Can Topçu

Hi, For a columnfamily in a keyspace which has RF=3, I'm issuing writes with ConsistencyLevel.ONE. in the configuration I have: - memtable_flush_after_mins : 30 - memtable_throughput_in_mb : 32 I'm writing to this columnfamily continuously for about 1 hour then stop writing. So the question is:

Re: What happens if there is a collision?

2010-10-27 Thread Jérôme Verstrynge

Peter, many thanks for all this information. On 26/10/2010 21:17, Peter Sculler wrote: It does mention that timestamps are used for conflict resolution but does not really dwell on the issue, and the remainder elides timestamps. So perhaps it's easy to miss. I also notice that the phrasing is su

Re: New nodes won't bootstrap on .66

2010-10-27 Thread Dimitry Lvovsky

Hi Aaron, Thanks for your reply. We still haven't solved this unfortunately. How did you start the bootstrap for the .18 node ? Standard way: we set "AutoBootstrap" to true and added all the servers from the working ring as seeds. > Was it the .18 or the .17 node you tried to add We first t

Re: Cassandra newbie question

2010-10-27 Thread Arijit Mukherjee

Hi All I've another related question. I am using a stream of records of the form (A, B, n) where the pair (A,B) can occur multiple times. For example, you could have the following set of records - A, B, 2 P, Q, 5 X, Y, 3 A, B, 8 A, B, 2 ... The data store has a set of columns - (key, count, s

38 matches
