So I've been thinking about the problem of how to do range queries on keys
with random partitioning. I'm new to Cassandra, and I don't know what the
plans are, but I have an idea and I thought I'd just put it out there:
Predicate Indexes.
I would like to be able to define predicate indexes in Cass
Hi all,
my env
6 servers with about 200GB data.
data structure,
64B rowkey + (5B column)*20,
rowkey and column.value are all random bytes from a-z,A-Z,0-9
problem
when I tried to iterate over the data in the storage, I always get
org::apache::cassandra::TimedOutException
(RpcTim
more info:
CL = ONE,
replica = 2,
and when I tried to monitor the disk_io with iostat I get almost 0MB/s
read & 0% CPU on the machine the scan-data app started on.
Thanks!
From: Shuai Yuan
To: user@cassandra.apache.org
Subject: problem when trying t
We didn't change partitioners.
Maybe we did some other stupid thing, but not that one.
On Wed, Jun 2, 2010 at 8:52 PM, Gary Dusbabek wrote:
> I was able to reproduce the error by starting up a node using
> RandomPartitioner, kill it, switch to OrderPreservingPartitioner,
> restart, kill, switch b
We want to try out Cassandra in the cloud. Any recommendations? Comments?
Should we use Amazon? Rackspace? Something else?
> We did indeed have a problem with our GC settings. The survivor ratio was
> too low. After changing that things are better but we are still seeing GC
> that takes 5-10 seconds, which is enough for the node to drop out of the
> cluster briefly.
This still indicates full GCs. What is your write
Hi
I think that in this case (logging heavy traffic) neither of the two ideas can
scale write operations in current Cassandra.
So wait for secondary index support.
2010/6/3 Jonathan Shook
> Insert "if you want to use long values for keys and column names"
> above paragraph 2. I forgot that part.
>
> On Wed,
> We want to try out Cassandra in the cloud. Any recommendations? Comments?
> Should we use Amazon? Rackspace? Something else?
I'm using it on Amazon, mostly with success. I'd recommend increasing Phi from 8
to 10, use the 4-core/15gb instances to start, and if you plan to be really
heavy on rea
I've had a few nodes crash (Out of heap), and when I pull the heap dump, there
are hundreds of thousands of MessageDeserializationTasks in the thread pool
executor, using up GB of the heap. I'm running 0.6.2 on sun jvm u20 and the
nodes are under heavy load. Has anyone else run into this? I have
>> So with the row cache, that first node (the primary replica) is the one that
>> has that row cached, yes?
> No, it's the closest node as determined by snitch.sortByProximity.
And with the default snitch, rack-unaware placement, random partitioner, and
all nodes up, that's the primary replica,
Hi,
I am getting OOM during load tests:
java.lang.OutOfMemoryError: Java heap space
at java.util.HashSet.<init>(HashSet.java:125)
at com.google.common.collect.Sets.newHashSetWithExpectedSize(Sets.java:181)
at com.google.common.collect.HashMultimap.createCollection(HashMultimap
Hi all,
connecting to a cluster with cassandra-cli and trying a describe command, I
obtain a "missing K_TABLE" message :
cassandra> describe Keyspace1
line 1:9 missing K_TABLE at 'Keyspace1'
Keyspace1.Super1
Column Family Type: Super
Columns Sorted By: org.apache.cassandra.db.marshal.bytest...@2
Are you running "ant test"? It defaults to setting memory to 1G. If
you're running them outside of ant, you'll need to set max memory
manually.
Gary.
On Thu, Jun 3, 2010 at 10:35, Lev Stesin wrote:
> Hi,
>
> I am getting OOM during load tests:
>
> java.lang.OutOfMemoryError: Java heap space
>
On Thu, 2010-06-03 at 11:29 +0300, David Boxenhorn wrote:
> We want to try out Cassandra in the cloud. Any recommendations?
> Comments?
>
> Should we use Amazon? Rackspace? Something else?
I personally haven't used Cassandra on EC2, but others have reported
significantly better disk IO, (and hen
Gary,
Is there a directive to set it? Or should I modify the cassandra
script itself? Thanks.
Lev.
On Thu, Jun 3, 2010 at 10:48 AM, Gary Dusbabek wrote:
> Are you running "ant test"? It defaults to setting memory to 1G. If
> you're running them outside of ant, you'll need to set max memory
>
We're using Cassandra on AWS at SimpleGeo. We software RAID 0 stripe
the ephemeral drives to achieve better I/O and have machines in
multiple Availability Zones with a custom EndPointSnitch that
replicates the data between AZs for high availability (to be
open-sourced/contributed at some point).
Ben,
do you just keep the commit log on the ephemeral drive? Or data and
commit? (I was confused by your reference to XFS and snapshots -- I
assume you keep data on the XFS drive)
-Mike
On Thu, Jun 3, 2010 at 2:29 PM, Ben Standefer wrote:
> We're using Cassandra on AWS at SimpleGeo. We softwa
I'm having difficulties setting up a 3-way Cassandra cluster. Any comments/help
would be appreciated.
My goal is that all data should be fully replicated amongst the 3 nodes. I want
to simulate the failure of one node and prove that the test column family can
still be accessed.
In a nutshell I
It's set in the build file:
But I'm not sure if you're using the build file or not. It kind of
sounds like you are not.
Gary.
On Thu, Jun 3, 2010 at 11:24, Lev Stesin wrote:
> Gary,
>
> Is there a directive to set it? Or should I modify the cassandra
> script itself? Thanks.
>
> Lev.
>
> On
Your replication factor is only set to 1, which means that each key
will only live on a single node. If you do wait for bootstrapping to
commence (takes 90s in trunk, I don't recall in 0.6), you should see
some keys moving unless your inserts were all into a small range.
Perhaps you're being impatie
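To illustrate why a replication factor of 1 leaves each key on a single node, here is a minimal Python sketch of rack-unaware placement on a token ring. It is not Cassandra's code: the function `replicas_for`, the integer tokens, and the assignment of tokens to the node names from this thread are all assumptions made for illustration.

```python
from bisect import bisect_left

def replicas_for(key_token, ring, replication_factor):
    """Return the node names holding a key under rack-unaware placement.

    ring is a list of (token, node_name) pairs sorted by token. The primary
    replica is the first node whose token is >= the key's token (wrapping
    around the ring); further replicas are the next nodes clockwise.
    """
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

# Hypothetical tokens; only the node names come from the thread.
ring = [(0, "cassandra-ca"), (100, "cassandra-or"), (200, "cassandra-az")]

one_copy = replicas_for(50, ring, 1)      # a single owner
three_copies = replicas_for(50, ring, 3)  # every node holds the key
```

With a replication factor of 3 on a 3-node ring, every node appears in the replica list, which is the "fully replicated" setup the original poster was after.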
The commit log and data directory are on the same mounted directory
structure (the 2 RAID 0 striped ephemeral disks) rather than using 1
of the ephemeral disks for the commit log and 1 of the ephemeral disks for
the data directory. While it's usually advised that for disk
utilization reasons you keep th
Ben,
thanks for that, we may try that. I did find an AWS forum tidbit from
two years ago:
"4 ephemeral stores striped together can give significantly higher
throughput for sequential writes than EBS."
http://developer.amazonwebservices.com/connect/thread.jspa?messageID=125197&#125197
-Mike
On Thu, J
Mike, yep, there are a lot of benchmarks proving it (plus it just makes sense)
http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-ec2.html
http://www.mysqlperformanceblog.com/2009/08/06/ec2ebs-single-and-raid-volumes-io-bencmark/
http://orion.heroku.com/past/2009/7/29/io_performanc
On 2010-06-03 13:07, Stephan Pfammatter wrote:
Cassandra-or
[...]
cassandra-or
Aside from the replication factor noted by Gary, this should point to
your existing node (cassandra-ca). Otherwise, how will this node know
where the existing node is and where to get the data from?
Cassandra-az
[
We're back with another public Cassandra training:
http://www.eventbrite.com/event/718755818
This will be Riptano's 6th training session (including the four we've
done that were on-site with a specific customer), and in my humble
opinion the material's really solid at this point.
The eventbrite t
It's documented that get_range_slice() supports all partitioners in 0.6
Kevin
From: Olivier Mallassi
To: user@cassandra.apache.org
Subject: Re: question about class SlicePredicate
Date: Tue, 1 Jun 2010 13:38:03 +0200
Does it work whatever the chosen p
use smaller slices and page through the data
2010/6/3 Shuai Yuan :
> Hi all,
>
> my env
>
> 6 servers with about 200GB data.
>
> data structure,
>
> 64B rowkey + (5B column)*20,
> rowkey and column.value are all random bytes from a-z,A-Z,0-9
>
> problem
>
> when I tried to iterate ove
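The advice above (smaller slices, paging through the data) can be sketched roughly as follows. This is a minimal Python illustration, not the Thrift API: `fetch_slice` and `make_fake_store` are hypothetical stand-ins for a range query such as get_range_slice(), and the fake store orders rows by key for simplicity, whereas a cluster using the random partitioner returns rows in token order.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict[str, str]]

def page_through(fetch_slice: Callable[[str, int], List[Row]],
                 page_size: int = 100) -> Iterator[Row]:
    """Yield every row by requesting small slices instead of one huge scan.

    fetch_slice(start_key, count) is assumed to return up to `count` rows
    beginning at `start_key` (inclusive), in a stable order.
    """
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start_key = ""       # assumed convention: empty key means "from the start"
    first_page = True
    while True:
        rows = fetch_slice(start_key, page_size)
        # After the first page, row 0 repeats the previous page's last row.
        for row in (rows if first_page else rows[1:]):
            yield row
        if len(rows) < page_size:
            return       # a short page means the data is exhausted
        start_key = rows[-1][0]
        first_page = False

def make_fake_store(keys: List[str]) -> Callable[[str, int], List[Row]]:
    """Build an in-memory stand-in for the cluster (hypothetical helper)."""
    ordered = sorted(keys)
    def fetch_slice(start_key: str, count: int) -> List[Row]:
        return [(k, {}) for k in ordered if k >= start_key][:count]
    return fetch_slice
```

Each request stays small, so a single slow call is less likely to exceed the RPC timeout than one scan over the whole data set.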
On Thu, Jun 3, 2010 at 10:17 AM, David King wrote:
>>> So with the row cache, that first node (the primary replica) is the one
>>> that has that row cached, yes?
>> No, it's the closest node as determined by snitch.sortByProximity.
>
> And with the default snitch, rack-unaware placement, random p
Sounds like a bug in the cli. Maybe it only knows how to describe KS
+ CF together?
Please file a bug report at https://issues.apache.org/jira/browse/CASSANDRA.
On Thu, Jun 3, 2010 at 10:37 AM, yaw wrote:
> Hi all,
> connecting to a cluster with cassandra-cli and trying a describe command, I
>
Note the describe_keyspace API method does not exhibit this behavior in 0.6.2
... seems to be a problem specific to cassandra-cli.
-phil
On Jun 3, 2010, at 10:18 PM, Jonathan Ellis wrote:
> Sounds like a bug in the cli. Maybe it only knows how to describe KS
> + CF together?
>
> Please file a
Having the write or read stage fill up will, as a secondary effect,
cause deserialization to fill up.
Moral: when you start getting timeout exceptions, have your clients
sleep for 100ms or otherwise back off (or maybe you just need to add
capacity)
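A client-side back-off along those lines might look like the Python sketch below. The 100ms starting delay comes from the message above; the doubling, the jitter, the cap, and the names `TimedOut` and `call_with_backoff` are assumptions added for illustration and are not part of any Cassandra client library.

```python
import random
import time

class TimedOut(Exception):
    """Hypothetical stand-in for the client's TimedOutException."""

def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=2.0, sleep=time.sleep):
    """Run operation(), sleeping and retrying when it times out.

    The delay starts at base_delay seconds and doubles after each timeout,
    capped at max_delay. A little random jitter keeps many clients from
    retrying in lockstep. The last timeout is re-raised to the caller.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimedOut:
            if attempt == max_attempts:
                raise
            sleep(delay + random.uniform(0, delay / 2))
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter exists only so the waiting can be replaced in tests. If timeouts persist even with back-off, that points at the capacity problem mentioned above rather than at the client.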
On Thu, Jun 3, 2010 at 10:16 AM, Daniel Kluesing
Thanks for the hint.
I found out it was a "too many open files" error, and the server side
simply stopped responding to the get_range_slice() request after throwing
an exception.
Now works with "ulimit -n 32768".
Kevin
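For anyone who wants to check the open-file limit from a script before starting a node, here is a small Python sketch using the standard `resource` module (Unix only). The helper names `open_file_limits` and `has_headroom` are made up for this example; 32768 is the value from the message above.

```python
import resource

def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def has_headroom(needed):
    """Report whether the soft limit allows at least `needed` descriptors."""
    soft, _hard = open_file_limits()
    return soft == resource.RLIM_INFINITY or soft >= needed

if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", soft, "hard limit:", hard)
    if not has_headroom(32768):
        print("soft limit is below 32768; consider raising it with ulimit -n")
```

This only reads the limit for the current process; the limit that matters is the one in effect for the user and shell that launch Cassandra.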
From: Jonathan Ellis
To: user@cassandra.apache.or
http://wiki.apache.org/cassandra/MultinodeCluster
On Thu, Jun 3, 2010 at 1:07 PM, Stephan Pfammatter
wrote:
> I’m having difficulties setting up a 3 way cassandra cluster. Any
> comments/help would be appreciated.
>
>
>
> My goal is that all data should be fully replicated amongst the 3 nodes. I
I have ten 0.5.1 Cassandra nodes in my cluster, and I upgraded them to
Cassandra 0.6.2 yesterday.
But today I find that six Cassandra nodes have CPU usage above 400% on
my 8-core servers.
The worst one is above 760%. It is very serious.
I use jvisualvm to watch the worst node, and
We're seeing this as well. We were testing with a 40+ node cluster on the
latest 0.6 branch from a few days ago.
-Chris
On Jun 3, 2010, at 9:55 PM, Lu Ming wrote:
>
> I have ten 0.5.1 Cassandra nodes in my cluster, and I upgraded them to
> Cassandra 0.6.2 yesterday.
> But today I find six cass
We have a SuperCF which may have up to 1000 super columns and 5
columns for each super column. The read latency may go up to 50ms
(or even higher), which I think is a long time to respond. How can we tune
the storage config to optimize performance? I read the wiki,
may help to do this, suppose that