Re: writes performance

2011-03-20 Thread pob
Hi,

I was searching for a similar topic in the mailing list; I think there is still
some misunderstanding about how to measure a cluster. It would be nice if
someone could write down the right definitions.

What are we measuring? Ops/sec? Throughput in Mbit/s? The number of
clients/threads writing/reading data?

I read that Jonathan said it doesn't matter whether you use CL.ONE or
CL.QUORUM, but, for example, writing with CL.ONE into one node of a 3-node
cluster with RF = 3 works fine, whereas writing with CL.ONE into all 3 nodes of
the same cluster in parallel (stress.py -d node1,node2,node3) ends with nodes
crashing from Java out-of-memory errors.

Another thing: it was said that if you use RF = N, the throughput of the whole
cluster is one node's throughput / 3. What is "throughput" in that case?
Bandwidth? Ops/sec? And what is "one node's throughput"? One node with RF = 1?
I'm getting completely lost trying to estimate how big a stream I can write
into the cluster, what happens if I double the number of nodes, and so on.
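
A back-of-envelope way to read the "cluster throughput = node throughput / RF"
claim, assuming "throughput" means client-visible writes/sec and that each
write at RF = N fans out into N replica writes (all numbers hypothetical):

# Simple capacity model: every client write becomes RF replica writes,
# so total per-node write capacity is shared RF ways.
def cluster_write_capacity(nodes, per_node_writes_per_sec, rf):
    return nodes * per_node_writes_per_sec / rf

# 3 nodes that can each apply ~5,000 replica writes/sec (hypothetical):
print(cluster_write_capacity(3, 5000, 3))  # ~5,000 client writes/sec
print(cluster_write_capacity(6, 5000, 3))  # doubling the nodes doubles it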


Thanks for explanation or any hints.


Best,
Peter

2011/3/20 pob 

> Hello,
>
> I set up a cluster with 3 nodes (4 GB RAM, 4 cores, RAID 0). I experimented with
> stress.py to see how fast my inserts are. The results are confusing.
>
> In each case stress.py was inserting 170KB of data:
> 1)
> stress.py was inserting directly to one node -dNode1, RF=3, CL.ONE
>
> 30 inserts in 1296 sec (30,246,246,0.01123401983,1296)
>
> 2)
> stress.py was inserting directly to one node -dNode1, RF=3, CL.QUORUM
>
> 30 inserts in 987 sec   (30,128,128,0.00894131883979,978)
>
> 3)
> stress.py was inserting randomly into all 3 nodes  -dNode1,Node2,Node3 RF=3,
> CL.QUORUM
>
> 30 inserts in 784 sec (30,157,157,0.00900169542641,784)
>
> 4)
> stress.py was inserting directly to one node -dNode1, RF=3, CL.ALL
>
> similar to case 1)
> ---
>
> I'm not surprised about cases 2) and 3), but the biggest surprise for me is why
> CL.ONE is slower than CL.QUORUM. CL.ONE has fewer "acks", a shorter wait...
> and so on.
>
> I have been looking at some blog posts about the write architecture, but the
> reason is still not clear to me.
>
> http://www.mikeperham.com/2010/03/13/cassandra-internals-writing/
> http://prettyprint.me/2010/05/02/understanding-cassandra-code-base/
>
>
> Thanks for advice.
>
>
> Best,
> Peter
>
>


Re: writes performance

2011-03-20 Thread Edward Capriolo
On Sun, Mar 20, 2011 at 10:23 AM, pob  wrote:
> Hi,
> I was searching for a similar topic in the mailing list; I think there is still
> some misunderstanding about how to measure a cluster. It would be nice if
> someone could write down the right definitions.
> What are we measuring? Ops/sec? Throughput in Mbit/s? The number of
> clients/threads writing/reading data?
> I read that Jonathan said it doesn't matter whether you use CL.ONE or
> CL.QUORUM, but, for example, writing with CL.ONE into one node of a 3-node
> cluster with RF = 3 works fine, whereas writing with CL.ONE into all 3 nodes of
> the same cluster in parallel (stress.py -d node1,node2,node3) ends with nodes
> crashing from Java out-of-memory errors.
> Another thing: it was said that if you use RF = N, the throughput of the whole
> cluster is one node's throughput / 3. What is "throughput" in that case?
> Bandwidth? Ops/sec? And what is "one node's throughput"? One node with RF = 1?
> I'm getting completely lost trying to estimate how big a stream I can write
> into the cluster, what happens if I double the number of nodes, and so on.
>
> Thanks for explanation or any hints.
>
> Best,
> Peter
> 2011/3/20 pob 
>>
>> Hello,
>> I set up a cluster with 3 nodes (4 GB RAM, 4 cores, RAID 0). I experimented with
>> stress.py to see how fast my inserts are. The results are confusing.
>> In each case stress.py was inserting 170KB of data:
>> 1)
>> stress.py was inserting directly to one node -dNode1, RF=3, CL.ONE
>> 30 inserts in 1296 sec (30,246,246,0.01123401983,1296)
>> 2)
>> stress.py was inserting directly to one node -dNode1, RF=3, CL.QUORUM
>> 30 inserts in 987 sec   (30,128,128,0.00894131883979,978)
>> 3)
>> stress.py was inserting randomly into all 3 nodes  -dNode1,Node2,Node3 RF=3,
>> CL.QUORUM
>> 30 inserts in 784 sec (30,157,157,0.00900169542641,784)
>> 4)
>> stress.py was inserting directly to one node -dNode1, RF=3, CL.ALL
>> similar to case 1)
>> ---
>> I'm not surprised about cases 2) and 3), but the biggest surprise for me is why
>> CL.ONE is slower than CL.QUORUM. CL.ONE has fewer "acks", a shorter wait...
>> and so on.
>> I have been looking at some blog posts about the write architecture, but the
>> reason is still not clear to me.
>> http://www.mikeperham.com/2010/03/13/cassandra-internals-writing/
>> http://prettyprint.me/2010/05/02/understanding-cassandra-code-base/
>>
>> Thanks for advice.
>>
>> Best,
>> Peter
>
>

Peter,

There are too many combinations of replication factor, consistency
level, node count, and workload to have extended write-ups about how
each situation performs.

The paper that does the best job of explaining this is the Yahoo! Cloud
Serving Benchmark (YCSB):

http://research.yahoo.com/files/ycsb.pdf

* This paper is old.
** It tests an older version of Cassandra.
*** YCSB seems to be fragmented across GitHub now.

Also remember that stress-test tools create fictitious workloads. You
can "game" a stress test and produce incredible results, or vice versa.
(You always get more of what you measure.)

I cannot speak for anyone else, but I imagine the stress-test tools are
used primarily by the developers to ensure there are no performance
regressions after patches.

I think one way to look at the performance is to say "it is what it
is": you have disks, you have RAM, data is sorted, and it is designed
to be as fast as it can be. Scale-out means you can grow the cluster
indefinitely, so how hard you can drive a single node becomes less of
an issue.


Re: Optimizing a few nodes to handle all client connections?

2011-03-20 Thread aaron morton
As Vijay says, look at the "fat client" contrib. Even if the node is only 
responsible for a small amount of the ring, it would normally still have data 
handed to it and read from it as a replica. You would need to use a replica 
placement strategy that knew to ignore the "connection only" nodes. 

IMHO it's a bad idea: single point of failure, wasted compute resources, and 
imbalance between "connection" and "worker" nodes. 

Aaron

On 19 Mar 2011, at 15:27, Vijay wrote:

> Are you saying you don't like the idea of the coordinator node being in the 
> same ring? If yes, have you looked at the Cassandra "fat client" in contrib?
> 
> Regards,
> 
> 
> 
> 
> On Fri, Mar 18, 2011 at 6:55 PM, Jason Harvey  wrote:
> Hola everyone,
> 
> I have been considering making a few nodes manage only one token and
> dedicating them entirely to talking to clients. My reasoning behind
> this is that I don't like the idea of a node having the dual duty of
> handling data and talking to all of the clients.
> 
> Is there any merit to this thought?
> 
> Cheers,
> Jason
> 



Re: Cassandra London UG meetup Monday

2011-03-20 Thread aaron morton
Where in Oz are you?

I'm heading to Sydney next week and then Melbourne. Happy to meet up.

Aaron

On 19 Mar 2011, at 13:13, Ashlee Saunders wrote:

> Hello Dave,
> 
> I am in Australia and was wondering if this group could do a phone hookup?
> 
> Ash
> 
> On 19/03/2011, at 2:25 AM, Dave Gardner  wrote:
> 
>> Hi all,
>> 
>> Anyone based in the UK may be interested in our user group meetup on Monday. 
>>  We will have talks on Hadoop integration and some performance data related 
>> to this.
>> 
>> Please come along if you'd like to meet other people using Cassandra or 
>> would like to learn more.
>> 
>> http://www.meetup.com/Cassandra-London/events/15490570/
>> 
>> Dave



Re: Optimizing a few nodes to handle all client connections?

2011-03-20 Thread Robert Coli
On Sun, Mar 20, 2011 at 1:20 PM, aaron morton  wrote:
> Even if the node is only
> responsible for a small amount of the ring, it would normally still have data
> handed to it and read from it as a replica. You would need to use a replica
> placement strategy that knew to ignore the "connection only" nodes.
> IMHO it's a bad idea: single point of failure, wasted compute resources, and
> imbalance between "connection" and "worker" nodes.

As I understand what is being proposed, the node would only be
responsible for a single token, and presumably would perform very well
indeed when reading or writing that token. I don't see why you would
need to avoid placing a single token's worth of data on a node, or why
it would become a single point of failure if you did... is there
something I'm missing?

=Rob


Re: Reading whole row vs a range of columns (pycassa)

2011-03-20 Thread aaron morton
I'd collapse all the data for a single object into a single column; I'm not sure 
about storing 100 objects in a single column, though. 

Have you considered any concurrency issues? E.g. multiple threads / processes 
wanting to update different objects in the same group of 100? 

I don't understand your reference to OPP in the context of reading 100 
columns from a row. 
Aaron

 
On 19 Mar 2011, at 16:22, buddhasystem wrote:

> As I'm working on this further, I want to understand this:
> 
> Is it advantageous to flatten data into blocks (strings), each containing a
> series of objects, if I know that serial object reads are likely but I don't
> want to resort to OPP? I seem to have worked out the optimal granularity.
> Is it better to read a single serialized column with 100 objects than a row
> consisting of a hundred columns, each modeling an object?
> 



Re: Reading whole row vs a range of columns (pycassa)

2011-03-20 Thread buddhasystem
Aaron, thanks for chiming in.

I'm doing what you said, i.e. all the data for a single object (which is quite
lean, with about 100 attributes of 10 bytes each) goes into a single column, as
opposed to the previous version of my application, which had all the attributes
of each small object mapped to individual columns.

So yes, I did consider having 100 objects in a single column, but that is
suboptimal for many reasons (hard to add objects later).

My reference to OPP was this: if I were sticking with the original design, it
could have been advantageous to have OPP, since statistically requests for
objects are often serial; e.g. people often don't query for just one object
with id=123, but for a series like id=[123..145]. If I bunch these into rows
containing 100 objects each, that promises some efficiency right there, as I
read one row as opposed to, say, 50.
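
For reference, with RandomPartitioner that bucketing can still be served by a
single column slice on one row in pycassa (keyspace, column family, and naming
scheme here are hypothetical); the serial read of ids 123..145 becomes:

import pycassa

pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
objects = pycassa.ColumnFamily(pool, 'Objects')

# Row 'bucket_00100' holds objects 100..199, one serialized object per column.
# Zero-padded column names keep them in serial order under the comparator.
cols = objects.get('bucket_00100',
                   column_start='obj_00123',
                   column_finish='obj_00145')
for name, blob in cols.items():
    pass  # deserialize each object's blob here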




aaron morton wrote:
> 
> I'd collapse all the data for a single object into a single column; I'm not
> sure about storing 100 objects in a single column, though. 
> 
> Have you considered any concurrency issues? E.g. multiple threads /
> processes wanting to update different objects in the same group of 100? 
> 
> I don't understand your reference to OPP in the context of reading 100
> columns from a row. 
> 
> Aaron
> 
>  
> On 19 Mar 2011, at 16:22, buddhasystem wrote:
> 
> > As I'm working on this further, I want to understand this:
> > 
> > Is it advantageous to flatten data into blocks (strings), each containing a
> > series of objects, if I know that serial object reads are likely but I
> > don't want to resort to OPP? I seem to have worked out the optimal
> > granularity. Is it better to read a single serialized column with 100
> > objects than a row consisting of a hundred columns, each modeling an
> > object?
> > 
> 




Re: Optimizing a few nodes to handle all client connections?

2011-03-20 Thread aaron morton
Ah, my flippant comments at the end. 

Instead of "single point of failure" perhaps I should have said "specialist 
nodes are a bad idea as they may reduce the overall availability of the cluster 
to the availability of any one sub-group." E.g. in a cluster of 10 nodes where 8 
are data and 2 are connections, the cluster may be down for 100% of the keys 
after the loss of just 2 nodes, if those happen to be the connection nodes.  

WRT the partitioner, I was thinking of a situation such as:

node 1 : 33.3% of the ring
node 2 : 33.3% of the ring
node 3 : 33.3% of the ring
node 4 : 0.1% of the ring 

My point was that giving the node a small token range would not be enough to 
reduce its data load. If node 4 were a functioning node in the ring, then at RF 
= 3 it would be asked to be a replica for the data from nodes 2 and 3, unless 
the replica strategy excluded the node from the list of natural endpoints for 
all but the token range it was responsible for. 
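
A toy sketch of that placement logic (a simplification of SimpleStrategy: walk
the ring clockwise from the key's token and take the next RF nodes), showing
that a node with a tiny range is still picked as a replica for its neighbours'
data:

# ring: sorted (token, node) pairs; a node owns (previous_token, its_token].
def natural_endpoints(ring, key_token, rf):
    idx = next((i for i, (t, _) in enumerate(ring) if t >= key_token), 0)
    return [ring[(idx + i) % len(ring)][1] for i in range(rf)]

ring = [(0, 'node1'), (333, 'node2'), (666, 'node3'), (667, 'node4')]
# node4 owns only (666, 667], yet for a key at token 500 and RF = 3:
print(natural_endpoints(ring, 500, 3))  # ['node3', 'node4', 'node1']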

Aaron


On 21 Mar 2011, at 10:28, Robert Coli wrote:

> On Sun, Mar 20, 2011 at 1:20 PM, aaron morton  wrote:
>> Even if the node is only
>> responsible for a small amount of the ring, it would normally still have data
>> handed to it and read from it as a replica. You would need to use a replica
>> placement strategy that knew to ignore the "connection only" nodes.
>> IMHO it's a bad idea: single point of failure, wasted compute resources, and
>> imbalance between "connection" and "worker" nodes.
> 
> As I understand what is being proposed, the node would only be
> responsible for a single token, and presumably would perform very well
> indeed when reading or writing that token. I don't see why you would
> need to avoid placing a single token's worth of data on a node, or why
> it would become a single point of failure if you did... is there
> something I'm missing?
> 
> =Rob



Re: Active / Active Data Center and RF

2011-03-20 Thread aaron morton
The API provides for the Quorum CL to be used either across all replicas 
(ignoring DCs), within the local DC, or in each DC. See the consistency level 
section here:
http://wiki.apache.org/cassandra/API
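
In client terms, that choice looks something like this pycassa sketch (pool,
keyspace, and CF names are hypothetical, and the DC-aware levels assume a
DC-aware replica placement strategy):

import pycassa
from pycassa import ConsistencyLevel

pool = pycassa.ConnectionPool('MyKeyspace', ['dc1-node1:9160'])
cf = pycassa.ColumnFamily(
    pool, 'MyCF',
    read_consistency_level=ConsistencyLevel.LOCAL_QUORUM,   # quorum in this DC only
    write_consistency_level=ConsistencyLevel.LOCAL_QUORUM)

cf.insert('key', {'col': 'val'})
# ConsistencyLevel.EACH_QUORUM would instead wait for a quorum in every DC,
# while ConsistencyLevel.QUORUM counts replicas across all DCs combined.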

Hope that helps
Aaron

On 19 Mar 2011, at 06:54, mcasandra wrote:

> With active/active data centers, how do you decide the right replication
> factor? A client may connect and request information from either data center,
> so if locally it's RF=3, should it be RF=6 across both in active/active?
> 
> Or what happens if it's RF=3 with NetworkTopologyStrategy and 2 copies are
> stored in Site A and 1 copy in Site B? Now a client for some reason is
> directed to Site B and does a write; would Site B then have 2 copies and Site
> A one (or still 2)? It's slowly getting confusing :) I have several more
> questions but will start with understanding this first. 
> 



Re: HintedHandoff increases in read?

2011-03-20 Thread aaron morton
Can we get some more information...

- What is countPendingHints showing? Are they all for the same row?
- What about listEndpointsPendingHints? Are there different endpoints listed 
there?
- Can you turn up the logging to DEBUG on one of the machines that has the 
increasing number of hints? When a node is asked to store a hint it will log a 
message with "Adding hint for". You should then see a message that includes 
"applied.  Sending response to".
- Number of nodes and RF?

ReadRepair does not use HintedHandoff as part of its processing. 
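
If you want to poll those values from a script rather than jconsole, something
like this jmxterm-based sketch can work (the jar path is hypothetical, port
8080 is the 0.7 default JMX port, and the exact MBean name is an assumption;
adjust for your install):

import subprocess

BEAN = 'org.apache.cassandra.db:type=HintedHandoffManager'  # assumed MBean name
CMDS = ('run -b %s countPendingHints\n'
        'run -b %s listEndpointsPendingHints\n') % (BEAN, BEAN)

p = subprocess.Popen(
    ['java', '-jar', 'jmxterm-1.0-uber.jar',   # hypothetical jar location
     '-l', 'localhost:8080', '-n'],            # -n = non-interactive mode
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)
print(p.communicate(CMDS)[0])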

Hope that helps. 
Aaron

On 19 Mar 2011, at 05:10, Shotaro Kamio wrote:

> Hi,
> 
> When looking at "countPendingHints" in HintedHandoffManager via JMX,
> I found that pending hints increase even when my cluster handles only
> quorum reads from clients.
> The count decreases when I watch it over a long period (e.g., an hour),
> but it can increase by several thousand in a short period (for
> instance, in a few seconds, or in some cases over minutes). My cluster
> looks healthy at that time.
> Is this normal behavior? I thought hints were created only when data is
> written while a node is down. Does the count increase with read repair?
> I'm waiting for hinted handoff to finish, but it is taking a long time.
> I'm using Cassandra 0.7.4.
> 
> 
> Best regards,
> Shotaro



Re: 0.6.5 OOM during high read load

2011-03-20 Thread Jonathan Ellis
0.7.1+ uses zero-copy reads in mmap'd mode so having 80k references to
the same column is essentially just the reference overhead.

On Fri, Mar 18, 2011 at 7:11 PM, Dan Retzlaff  wrote:
> Dear experts, :)
> Our application triggered an OOM error in Cassandra 0.6.5 by reading the
> same 1.7MB column repeatedly (~80k reads). I analyzed the heap dump, and it
> looks like the column value was queued 5400 times in an
> OutboundTcpConnection destined for the Cassandra instance that received the
> client request. Unfortunately, this intra-node connection goes across a
> 100Mb data center interconnect, so it was only a matter of time before the
> heap was exhausted.
> Is there something I can do (other than change the application behavior) to
> avoid this failure mode? I'm not the first to run into this, am I?
> Thanks,
> Dan



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


script to modify cassandra.yaml file

2011-03-20 Thread Anurag Gujral
Hi All,
  I want to modify the values in the cassandra.yaml that comes with
the cassandra-0.7 package, based on hostname, colo, etc.
Does anyone know of a script that reads in the default cassandra.yaml and
writes out a new cassandra.yaml with values based on the number of nodes in
the cluster, hostname, colo name, etc.?

Thanks
Anurag
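
Nothing Cassandra-specific is required for this; a minimal PyYAML sketch
(paths, seed hosts, and the hostname-derived values are hypothetical, and note
that safe_dump drops any comments from the template file):

import socket
import yaml  # PyYAML

def render(template_path, out_path, overrides):
    with open(template_path) as f:
        conf = yaml.safe_load(f)
    conf.update(overrides)  # apply per-host/per-colo values
    with open(out_path, 'w') as f:
        yaml.safe_dump(conf, f, default_flow_style=False)

host = socket.getfqdn()
render('conf/cassandra.yaml', 'conf/cassandra.yaml.generated', {
    'listen_address': host,
    'rpc_address': '0.0.0.0',
    'seeds': ['seed1.example.com', 'seed2.example.com'],  # per-colo seed list
})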


Re: moving data from single node cassandra

2011-03-20 Thread aaron morton
When compacting, it will use the path with the greatest free space. When 
compaction completes successfully, the files lose their temporary status 
and that path becomes their new home.
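
A quick way to see which of your data_file_directories currently has the
greatest free space, i.e. where the next big compaction would likely land
(paths here are hypothetical):

import os

def free_bytes(path):
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize  # available blocks * block size

data_dirs = ['/var/lib/cassandra/data', '/mnt/disk2/cassandra/data']
print(max(data_dirs, key=free_bytes))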

Aaron

On 18 Mar 2011, at 14:10, John Lewis wrote:

> data_file_directories makes it seem as though Cassandra can use more than 
> one location for SSTable storage. Does anyone know how it splits up the data 
> between partitions? I am trying to plan for just about every worst-case 
> scenario I can right now, and I want to know if I can change the config to 
> open up some secondary storage for a compaction if needed.
> 
> Lewis
> 
> On Mar 17, 2011, at 6:03 PM, Maki Watanabe wrote:
> 
>> Refer to:
>> http://wiki.apache.org/cassandra/StorageConfiguration
>> 
>> You can specify the data directories with the following parameters in
>> storage-config.xml (or cassandra.yaml in 0.7+; the 0.7 names are shown).
>> 
>> commitlog_directory : where the commitlog will be written
>> data_file_directories : data files
>> saved_caches_directory : saved row cache
>> 
>> maki
>> 
>> 
>> 2011/3/17 Komal Goyal :
>>> Hi,
>>> I have a single-node Cassandra setup on a Windows machine.
>>> I soon ran out of space on this machine, so I have increased its hard disk
>>> capacity.
>>> Now I want to know how to configure Cassandra to start storing data on
>>> these high-space partitions.
>>> Also, how can the existing data stored in this single-node Cassandra be
>>> moved from the C drive to the other drives?
>>> Is there any documentation on how these configurations can be done?
>>> Some supporting links would be very helpful.
>>> 
>>> 
>>> Thanks,
>>> 
>>> Komal Goyal
>>> 
> 



Re: EC2 - 2 regions

2011-03-20 Thread aaron morton
Recent discussion on the dev list 
http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html

Aaron
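
For the connectivity checks discussed in the quoted thread below, a small
probe of the storage (7000) and Thrift (9160) ports saves the manual telnet
sessions (the hostname is the thread's own placeholder):

import socket

def can_connect(host, port, timeout=5.0):
    s = socket.socket()
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return True
    except socket.error:
        return False
    finally:
        s.close()

for port in (7000, 9160):
    print('%d: %s' % (port, can_connect('ec2-1-2-3-4.compute-1.amazonaws.com', port)))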

On 19 Mar 2011, at 06:46, A J wrote:

> Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
> connections are done using the public DNS (that goes like
> ec2-.compute.amazonaws.com)
> 
> On Fri, Mar 18, 2011 at 1:37 PM, A J  wrote:
>> I am able to telnet from one region to another on port 7000 without
>> issues. (I get the expected Connected to .Escape character is
>> '^]'.)
>> 
>> I am also able to run the Cassandra client on port 9160 from one
>> region to another without issues (this is when I run Cassandra
>> separately in each region without forming a cluster).
>> 
>> So I think ports 7000 and 9160 are not the issue.
>> 
>> 
>> 
>> On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner  wrote:
>>> From the us-west instance, are you able to connect to the us-east instance
>>> using telnet on port 7000 and 9160?
>>> If not, then you need to open those ports for communication (via your
>>> Security Group)
>>> Dave Viner
>>> 
>>> On Fri, Mar 18, 2011 at 10:20 AM, A J  wrote:
 
 That's exactly what I am doing.
 
 I was able to do the first two scenarios without any issues (i.e. 2
 nodes in the same availability zone, followed by an additional node in a
 different zone but the same region).
 
 I am stuck at the third scenario of separate regions.
 
 (I did read the "Cassandra nodes on EC2 in two different regions not
 communicating" thread but it did not seem to end with a resolution.)
 
 
 On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner  wrote:
> Hi AJ,
> I'd suggest getting to a multi-region cluster step-by-step.  First, get
> 2
> nodes running in the same availability zone.  Make sure that works
> properly.
>  Second, add a node in a separate availability zone, but in the same
> region.
>  Make sure that's working properly.  Third, add a node that's in a
> separate
> region.
> Taking it step-by-step will ensure that any issues are specific to the
> region-to-region communication, rather than intra-zone connectivity or
> cassandra cluster configuration.
> Dave Viner
> 
> On Fri, Mar 18, 2011 at 8:34 AM, A J  wrote:
>> 
>> Hello,
>> 
>> I am trying to set up a Cassandra cluster across regions.
>> For testing I am keeping it simple and just having one node in US-EAST
>> (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say
>> ec2-2-2-3-4.us-west-1.compute.amazonaws.com).
>> Using Cassandra 0.7.4
>> 
>> 
>> The one in east region is the seed node and has the values as:
>> auto_bootstrap: false
>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>> listen_address: ec2-1-2-3-4.compute-1.amazonaws.com
>> rpc_address: 0.0.0.0
>> 
>> The one in west region is non seed and has the values as:
>> auto_bootstrap: true
>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>> listen_address: ec2-2-2-3-4.us-west-1.compute.amazonaws.com
>> rpc_address: 0.0.0.0
>> 
>> I first fire the seed node (east region instance) and it comes up
>> without issues.
>> When I fire the non-seed node (west region instance), it fails after
>> some time with this error:
>> 
>> DEBUG 15:09:08,844 Created HHOM instance, registered MBean.
>>  INFO 15:09:08,844 Joining: getting load information
>>  INFO 15:09:08,845 Sleeping 9 ms to wait for load information...
>> DEBUG 15:09:09,822 attempting to connect to
>> ec2-1-2-3-4.compute-1.amazonaws.com/1.2.3.4
>> DEBUG 15:09:10,825 Disseminating load info ...
>> DEBUG 15:10:10,826 Disseminating load info ...
>> DEBUG 15:10:38,845 ... got load info
>>  INFO 15:10:38,845 Joining: getting bootstrap token
>> ERROR 15:10:38,847 Exception encountered during startup.
>> java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
>>at
>> 
>> org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:164)
>>at
>> 
>> org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:146)
>>at
>> 
>> org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:141)
>>at
>> 
>> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:450)
>>at
>> 
>> org.apache.cassandra.service.StorageService.initServer(StorageService.java:404)
>>at
>> 
>> org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:192)
>>at
>> 
>> org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
>>at
>> 
>> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
>> 
>> 
>> The seed node seems to somewhat acknowledge the non-seed node:

Re: script to modify cassandra.yaml file

2011-03-20 Thread aaron morton
Have you looked at Puppet ? This example is from 0.6* but it's still good 
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/easy_street_deploying_cassandra_via

Or Chef 
http://blog.darkhax.com/2010/11/05/instant-nosql-cluster-with-chef-cassandra-and-your-favorite-cloud-hosting-provider

Aaron

On 21 Mar 2011, at 12:39, Anurag Gujral wrote:

> Hi All,
>   I want to modify the values in the cassandra.yaml that comes with 
> the cassandra-0.7 package, based on hostname, colo, etc.
> Does anyone know of a script that reads in the default cassandra.yaml and 
> writes out a new cassandra.yaml with values based on the number of nodes in 
> the cluster, hostname, colo name, etc.?
> 
> Thanks
> Anurag



Re: Reading whole row vs a range of columns (pycassa)

2011-03-20 Thread aaron morton
Internally, a multiget is just turned into a series of single-row gets. There is 
no seek-and-partial-scan such as you may see when reading from a clustered index 
in an RDBMS. 

Unless you have a performance problem and you've tried other things, I'd put 
this idea on the back burner. There are many other factors that impact read 
performance, and OPP requires a lot more care than RP.

Aaron
 
On 21 Mar 2011, at 11:36, buddhasystem wrote:

> Aaron, thanks for chiming in.
> 
> I'm doing what you said, i.e. all the data for a single object (which is quite
> lean, with about 100 attributes of 10 bytes each) goes into a single column,
> as opposed to the previous version of my application, which had all the
> attributes of each small object mapped to individual columns.
> 
> So yes, I did consider having 100 objects in a single column, but that is
> suboptimal for many reasons (hard to add objects later).
> 
> My reference to OPP was this: if I were sticking with the original design, it
> could have been advantageous to have OPP, since statistically requests for
> objects are often serial; e.g. people often don't query for just one object
> with id=123, but for a series like id=[123..145]. If I bunch these into rows
> containing 100 objects each, that promises some efficiency right there, as I
> read one row as opposed to, say, 50.
> 
> 
> 
> 
> aaron morton wrote:
>> 
>> I'd collapse all the data for a single object into a single column; I'm not
>> sure about storing 100 objects in a single column, though. 
>> 
>> Have you considered any concurrency issues? E.g. multiple threads /
>> processes wanting to update different objects in the same group of 100? 
>> 
>> I don't understand your reference to OPP in the context of reading 100
>> columns from a row. 
>> 
>> Aaron
>> 
>> 
>> On 19 Mar 2011, at 16:22, buddhasystem wrote:
>> 
>> > As I'm working on this further, I want to understand this:
>> > 
>> > Is it advantageous to flatten data into blocks (strings), each containing
>> > a series of objects, if I know that serial object reads are likely but I
>> > don't want to resort to OPP? I seem to have worked out the optimal
>> > granularity. Is it better to read a single serialized column with 100
>> > objects than a row consisting of a hundred columns, each modeling an
>> > object?
>> > 
>> 
> 
> 



Java segfault

2011-03-20 Thread Jason Harvey
Just ran into a Java segfault on 0.7.4 when Cassandra created a new
commitlog segment. Does that point to a bug in the JVM, or in
Cassandra? My guess would be the JVM, but I wanted to check before
submitting a bug report to anyone.

Thanks!
Jason


Re: Java segfault

2011-03-20 Thread Jonathan Ellis
Segfaults are either a JVM bug (are you on the latest Sun version?
OpenJDK is very behind on fixes) or bad hardware.

On Sun, Mar 20, 2011 at 7:17 PM, Jason Harvey  wrote:
> Just ran into a Java segfault on 0.7.4 when Cassandra created a new
> commitlog segment. Does that point to a bug in the JVM, or in
> Cassandra? My guess would be the JVM, but I wanted to check before
> submitting a bug report to anyone.
>
> Thanks!
> Jason
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Java segfault

2011-03-20 Thread SriSatish Ambati
Indeed. This is likely at the JVM level (if not lower down the stack).

Do you happen to have an hs_err file for the crash? Is it reproducible?
What does 'java -version' say? What version of Linux?

thanks,
Sri

On Sun, Mar 20, 2011 at 5:48 PM, Jonathan Ellis  wrote:

> Segfaults are either a JVM bug (are you on the latest Sun version?
> OpenJDK is very behind on fixes) or bad hardware.
>
> On Sun, Mar 20, 2011 at 7:17 PM, Jason Harvey  wrote:
> > Just ran into a Java segfault on 0.7.4 when Cassandra created a new
> > commitlog segment. Does that point to a bug in the JVM, or in
> > Cassandra? My guess would be the JVM, but I wanted to check before
> > submitting a bug report to anyone.
> >
> > Thanks!
> > Jason
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: Java segfault

2011-03-20 Thread Jason Harvey
I'm on OpenJDK. I'll switch over to Sun Java and see how that goes.

Thx for the info!
Jason

On Mar 20, 5:48 pm, Jonathan Ellis  wrote:
> Segfaults are either a JVM bug (are you on the latest Sun version?
> OpenJDK is very behind on fixes) or bad hardware.
>
> On Sun, Mar 20, 2011 at 7:17 PM, Jason Harvey  wrote:
> > Just ran into a Java segfault on 0.7.4 when Cassandra created a new
> > commitlog segment. Does that point to a bug in the JVM, or in
> > Cassandra? My guess would be the JVM, but I wanted to check before
> > submitting a bug report to anyone.
>
> > Thanks!
> > Jason
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com


Re: reduced cached mem; resident set size growth

2011-03-20 Thread Ryan King
The test was inconclusive because we decommissioned that cluster before
it had been running long enough to exhibit the problem.

-ryan

On Wed, Mar 16, 2011 at 7:27 PM, Zhu Han  wrote:
>
>
> On Thu, Feb 3, 2011 at 1:49 AM, Ryan King  wrote:
>>
>> On Wed, Feb 2, 2011 at 6:22 AM, Chris Burroughs
>>  wrote:
>> > On 01/28/2011 09:19 PM, Chris Burroughs wrote:
>> >> Thanks Oleg and Zhu.  I swear that wasn't a new hotspot version when I
>> >> checked, but that's obviously not the case.  I'll update one node to
>> >> the
>> >> latest as soon as I can and report back.
>> >
>> >
>> > RSS over 48 hours with java 6 update 23:
>> >
>> > http://img716.imageshack.us/img716/5202/u2348hours.png
>> >
>> > I'll continue monitoring but RSS still appears to grow without bounds.
>> > Zhu reported a similar problem with Ubuntu 10.04.  While possible, it
>> > would seem extraordinarily unlikely that there is a glibc or kernel
>> > bug affecting us both.
>>
>> We're seeing a similar problem with one of our clusters (but over a
>> longer time scale). It's possible that it's not a leak, but just
>> fragmentation. Unless you've told it otherwise, the JVM uses glibc's
>> malloc implementation for off-heap allocations. We're currently
>> running a test with jemalloc on one node to see if the problem goes
>> away.
>
> Ryan, does jemalloc solve the RSS growth problem in your test?
>
>> -ryan
>
>


Re: Active / Active Data Center and RF

2011-03-20 Thread mcasandra
CL is just a way to satisfy consistency, but you still want the majority of
your reads (preferably) occurring in the same DC.

I don't think that answers my question at all. I understand the CL, but I have
a more basic and important question about active/active data centers and the
replicas in that very specific scenario, which looks to me like an issue
somehow. Can someone please look at my question again?



