Hi,
I was searching for a similar topic in the mailing list, and I think there is
still some misunderstanding about measuring a cluster. It would be nice if
someone could write down the right definitions.
What are we measuring? Ops/sec, throughput in Mbit/s, number of
clients/threads writing/reading data?
I read Jonathan sa
On Sun, Mar 20, 2011 at 10:23 AM, pob wrote:
> Hi,
> I was searching for a similar topic in the mailing list, and I think there is
> still some misunderstanding about measuring a cluster. It would be nice if
> someone could write down the right definitions.
> What are we measuring? Ops/sec, throughput in Mbit/s, number of
> c
As Vijay says, look at the "fat client" contrib. Even if the node is only
responsible for a small amount of the ring, it would normally still get data
handed to it and read from it as a replica. You would need to use a Replica
Placement Strategy that knew to ignore the "connection only" nodes.
I
Where in Oz are you ?
I'm heading to Sydney next week and then Melbourne. Happy to meet up.
Aaron
On 19 Mar 2011, at 13:13, Ashlee Saunders wrote:
> Hello Dave,
>
> I am in Australia and was wondering if this group could do a phone hookup?
>
> Ash
>
> On 19/03/2011, at 2:25 AM, Dave Gardner
On Sun, Mar 20, 2011 at 1:20 PM, aaron morton wrote:
> Even if the node is only
> responsible for a small amount of the ring, it would normally still get data
> handed to it and read from it as a replica. You would need to use a Replica
> Placement Strategy that knew to ignore the "connection only"
I'd collapse all the data for a single object into a single column; I'm not
sure about storing 100 objects in a single column, though.
Have you considered any concurrency issues? e.g. multiple threads / processes
wanting to update different objects in the same group of 100?
Don't understand your r
Aaron, thanks for chiming in.
I'm doing what you said, i.e. all data for a single object (which is quite
lean, with about 100 attributes of 10 bytes each) just goes into a single
column, as opposed to the previous version of my application, which had all
attributes of each small object mapped to indiv
ah, my flippant comments at the end.
Instead of "Single point of failure" perhaps I should have said "specialist
nodes are a bad idea as they may reduce the overall availability of the cluster
to the availability of any one sub group", e.g. a cluster of 10 nodes, where 8 are
data and 2 are connec
The API provides for Quorum CL to be used either across all replicas (ignoring
DCs), the local DC or each DC. See Consistency level here
http://wiki.apache.org/cassandra/API
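
To make the three scopes concrete, here is a minimal Python sketch of how many replica responses each level would wait for. It assumes the usual quorum rule of floor(RF / 2) + 1. The data center names, replication factors and helper functions are made up for illustration and are not part of the Cassandra API.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level, rf_per_dc, local_dc):
    """Return the replica responses needed, keyed by scope.

    rf_per_dc maps a (hypothetical) data center name to its replication factor.
    """
    if level == "QUORUM":
        # Majority of all replicas, ignoring which DC they live in.
        return {"all": quorum(sum(rf_per_dc.values()))}
    if level == "LOCAL_QUORUM":
        # Majority of the replicas in the coordinator's own DC only.
        return {local_dc: quorum(rf_per_dc[local_dc])}
    if level == "EACH_QUORUM":
        # Majority of the replicas in every DC.
        return {dc: quorum(rf) for dc, rf in rf_per_dc.items()}
    raise ValueError("unsupported level: %s" % level)


# Example layout (an assumption): two data centers, three replicas in each.
rf = {"DC1": 3, "DC2": 3}
for level in ("QUORUM", "LOCAL_QUORUM", "EACH_QUORUM"):
    print(level, required_acks(level, rf, local_dc="DC1"))
```

With this layout, QUORUM needs 4 of the 6 replicas and may have to wait on the remote DC, while LOCAL_QUORUM needs only 2 replicas in the local DC.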
Hope that helps
Aaron
On 19 Mar 2011, at 06:54, mcasandra wrote:
> When in active/active data center how to decide righ
Can we get some more information...
- What is countPendingHints showing? Are they all for the same row?
- What about listEndpointsPendingHints, are there different endpoints listed
there?
- Can you turn up the logging to DEBUG on one of the machines that has the
increasing number of hints?
0.7.1+ uses zero-copy reads in mmap'd mode so having 80k references to
the same column is essentially just the reference overhead.
On Fri, Mar 18, 2011 at 7:11 PM, Dan Retzlaff wrote:
> Dear experts, :)
> Our application triggered an OOM error in Cassandra 0.6.5 by reading the
> same 1.7MB column
Hi All,
I want to modify the values in the cassandra.yaml which comes with the
cassandra-0.7 package, based on values of hostnames, colo etc.
Does anyone know of a script I can use that reads in the default
cassandra.yaml and writes out a new cassandra.yaml
with values based on numbe
When compacting it will use the path with the greatest free space. When
compaction completes successfully the files will lose their temporary status
and that will be their new home.
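
The "greatest free space" choice can be sketched in a few lines of Python. This only illustrates the selection rule described above and is not Cassandra's own code; the function name and the candidate directories are hypothetical.

```python
import os
import shutil
import tempfile


def pick_data_directory(directories):
    """Return the directory whose filesystem currently has the most free bytes."""
    return max(directories, key=lambda d: shutil.disk_usage(d).free)


# Stand-ins for the entries of data_file_directories (an assumption).
candidates = [tempfile.gettempdir(), os.getcwd()]
print(pick_data_directory(candidates))
```

If both candidates sit on the same filesystem they report the same free space, and max() simply returns the first one.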
Aaron
On 18 Mar 2011, at 14:10, John Lewis wrote:
> | data_file_directories makes it seem as though cassandra ca
Recent discussion on the dev list
http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
Aaron
On 19 Mar 2011, at 06:46, A J wrote:
> Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
> connections are done using the public DNS (that goes like
> ec2-.compute.
Have you looked at Puppet ? This example is from 0.6* but it's still good
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/easy_street_deploying_cassandra_via
Or Chef
http://blog.darkhax.com/2010/11/05/instant-nosql-cluster-with-chef-cassandra-and-your-favorite-cloud-hosting-provider
A
Internally a multiget is just turned into a series of single-row gets. There is
no seek and partial scan such as you may see when reading from the clustered
index in a RDBMS.
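
As a toy illustration of that point, the Python sketch below models a multiget as one independent lookup per key. The in-memory dictionary and the function names are assumptions made for the example and do not reflect the real client API or storage engine.

```python
# Toy in-memory "column family": row key -> columns (made-up data).
ROWS = {
    "user1": {"name": "a"},
    "user2": {"name": "b"},
}


def get(key):
    """Single-row read; a missing row yields an empty column map."""
    return ROWS.get(key, {})


def multiget(keys):
    """One independent single-row get per key; no range scan between keys."""
    return {key: get(key) for key in keys}


print(multiget(["user1", "user2", "missing"]))
```

In this model the cost grows with the number of keys requested, whether or not the keys are adjacent.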
Unless you have a performance problem and you've tried other things, I'd put
this idea on the back burner. There are many o
Just ran into a Java segfault on 0.7.4 when Cassandra created a new
commitlog segment. Does that point to a bug in the JVM, or in
Cassandra? My guess would be the JVM, but I wanted to check before
submitting a bug report to anyone.
Thanks!
Jason
segfaults are either a JVM bug (are you on the latest Sun version?
openjdk is very behind on fixes) or bad hardware.
On Sun, Mar 20, 2011 at 7:17 PM, Jason Harvey wrote:
> Just ran into a Java segfault on 0.7.4 when Cassandra created a new
> commitlog segment. Does that point to a bug in the JVM,
Indeed. This is likely at the JVM level (if not lower down the stack).
Do you happen to have a hs*err file for the crash? Is it reproducible?
What does 'java -version' show? What version of Linux?
thanks,
Sri
On Sun, Mar 20, 2011 at 5:48 PM, Jonathan Ellis wrote:
> segfaults are either a JVM bug (are y
I'm on OpenJDK. I'll switch over to Sun Java and see how that goes.
Thx for the info!
Jason
On Mar 20, 5:48 pm, Jonathan Ellis wrote:
> segfaults are either a JVM bug (are you on the latest Sun version?
> openjdk is very behind on fixes) or bad hardware.
>
> On Sun, Mar 20, 2011 at 7:17 PM, Jaso
The test was inconclusive because we decommissioned that cluster before
it had been running long enough to exhibit the problem.
-ryan
On Wed, Mar 16, 2011 at 7:27 PM, Zhu Han wrote:
>
>
> On Thu, Feb 3, 2011 at 1:49 AM, Ryan King wrote:
>>
>> On Wed, Feb 2, 2011 at 6:22 AM, Chris Burroughs
>> wrote
CL is just a way to satisfy consistency, but you still want the majority of your
reads (preferably) occurring in the same DC.
I don't think that answers my question at all. I understand the CL, but I
think I have a more basic and important question about active/active data
center and the replicas in that