Need help in cassandra data model..

2010-09-01 Thread Manikandan R
Hello Everyone,

I am studying cassandra to see whether I can use it for the problem below.

My current system has many components. To complete any transaction (request),
it has to pass through all the components. While it passes through, there is
no clear visibility into what is happening behind each component at any
given point.

I am trying to create a new component using cassandra, which receives the
status of a transaction (request) from the other components for any given
transaction (request) id, so that it can be viewed online in real time.

Here is the model I have in mind:

Transactions: {

 100: { // transaction id as key

  t1: { // Say, at time t1
   status: "Received the request. Doing validation";
   mode: "debug";
  };

  t2: { // Say, at time t2
   status: "dropped:1000|absent:2000";
   dropped: 1000;
   absent: 2000;
   mode: "info";
  };
 };

 101: { // transaction id as key

  t1: { // Say, at time t1
   status: "Doing External Verification...";
   mode: "debug";
  };

  t2: { // Say, at time t2
   status: "valid:2000|invalid:1000";
   valid: 1000;
   invalid: 1000;
   mode: "info";
  };
 };
}

Status can be delivered to users depending on the SLA, and it is displayed
to users based on the mode. Users who have access to detailed status can see
the status for all modes, whereas users who want to see only reports can see
the status with mode "info".
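
The mode-based filtering described above could be sketched roughly as
follows. This is an in-memory illustration only, not Cassandra client code:
the class StatusView and its methods are hypothetical names, and a TreeMap
keyed by event time stands in for one transaction's row.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StatusView {
    /**
     * Returns the status texts a user may see, in time order.
     * Users with detailed access see every mode; others see only "info".
     */
    public static List<String> visibleStatuses(TreeMap<Long, Map<String, String>> events,
                                               boolean detailedAccess) {
        List<String> out = new ArrayList<>();
        for (Map<String, String> columns : events.values()) {
            boolean isInfo = "info".equals(columns.get("mode"));
            if (detailedAccess || isInfo) {
                out.add(columns.get("status"));
            }
        }
        return out;
    }

    /** Builds the column map for one status event (hypothetical helper). */
    public static Map<String, String> event(String status, String mode) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("status", status);
        columns.put("mode", mode);
        return columns;
    }

    public static void main(String[] args) {
        // Transaction 101 from the model, with times t1 = 1 and t2 = 2.
        TreeMap<Long, Map<String, String>> tx101 = new TreeMap<>();
        tx101.put(1L, event("Doing External Verification...", "debug"));
        tx101.put(2L, event("valid:2000|invalid:1000", "info"));

        System.out.println("detailed: " + visibleStatuses(tx101, true));
        System.out.println("reports:  " + visibleStatuses(tx101, false));
    }
}
```

Filtering on the client like this assumes the number of events per
transaction stays small; whether that holds depends on the workload.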

Please let me know your thoughts.

Thanks,
Mani


Re: Need help in cassandra data model..

2010-09-01 Thread aaron morton
From what you have described, that sounds OK (I am assuming there are not
millions of events per transaction). When designing the model, consider all
the ways you will want to read the data back and then denormalize the data
appropriately. Ideally, each request for data from cassandra should be handled
by a single CF.

When storing the time of the event, you could use either a Long holding the
seconds since epoch or an ASCII string with the time formatted using the ISO
format.
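
As a rough illustration of the two encodings (a sketch only: the class and
method names are invented for this example, and it uses java.time, which is
newer than this thread), a long of seconds since epoch sorts numerically,
while a fixed-width ISO-8601 string in UTC sorts the same way when compared
as text:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class EventTimeNames {
    // Fixed-width ISO-8601 in UTC, so string order matches time order.
    private static final DateTimeFormatter ISO_UTC =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

    /** Option 1: seconds since the Unix epoch as a long. */
    public static long asEpochSeconds(Instant t) {
        return t.getEpochSecond();
    }

    /** Option 2: an ASCII string in ISO format. */
    public static String asIsoString(Instant t) {
        return ISO_UTC.format(t);
    }

    public static void main(String[] args) {
        Instant t1 = Instant.parse("2010-09-01T07:09:00Z");
        Instant t2 = t1.plusSeconds(90);

        System.out.println(asEpochSeconds(t1) + "  " + asIsoString(t1));
        System.out.println(asEpochSeconds(t2) + "  " + asIsoString(t2));

        // Both encodings order t1 before t2.
        System.out.println(Long.compare(asEpochSeconds(t1), asEpochSeconds(t2)) < 0);
        System.out.println(asIsoString(t1).compareTo(asIsoString(t2)) < 0);
    }
}
```

Which one fits better depends on the comparator chosen for the column
names; the string form only sorts correctly if every value uses the same
width and time zone.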

Aaron

On 1 Sep 2010, at 19:09, Manikandan R wrote:

> Hello Everyone,
> 
> I am doing study on cassandra to make use of it for the below problem - 
> 
> My current system has many components. To complete any transaction(request), 
> it has to pass through all the components. While it passes through, there is 
> no clear visibility to know what is happening behind each component at any 
> given point.
> 
> Trying to create a new component using cassandra, which receives the status 
> of the transaction(request) from other components for any given 
> transaction(request) id, so that it can viewed online in real time.
> 
> Here is the model, I have in my mind:
> 
> Transactions: {
> 
> 100: { //transaction id as key
> 
>  t1: { //Say, at time t1
> 
>   status: "Received the request. Doing validation";
>   mode: "debug";
>  };
> 
>  t2 : { // Say, at time t2
>   
>  status: "dropped:1000|absent=2000";
>  dropped: 1000;
>  absent: 2000;
>  mode: "info";
>  };
> 
> 
> 
> 101: { //transaction id as key
> 
>  t1: { //Say, at time t1
> 
>   status: "Doing External Verification..."
>   mode: "debug";
>  };
> 
>  t2 : { // Say, at time t2
>   
>  status: "valid:2000|invalid:1000";
>  valid: 1000;
>  invalid: 1000;
>  mode: "info";
>  };
> }
> 
> Status can delivered to the users depending on the SLA. Based on the mode, it 
> can be displayed to users. Users who has access to see detailed status, can 
> see all the mode's status, where as, user who want to see only reports, can 
> see the mode with "info". 
> 
> Please let me know your thoughts.
> 
> Thanks,
> Mani



Server problem C 0.6.5

2010-09-01 Thread Thorvaldsson Justus
OK, I have two test servers; they run RH and are pretty nice. I have two
problems with one of them and none with the other. They have the same
configuration except for the seed and listen addresses, which are each
other's opposites. Nothing fancy. RF=2.

All the info I can get is also here, plus some more such as the conf (590 rows):
http://pastie.org/1131106

Problem nr 1, and the most annoying one.
I start by emptying the data folder and the commitlog folder, and then start
the servers.

I write data to both nodes, this time at CL.ONE, but it happens at CL.ALL as
well. The node that is troubling me is not writing memory to disk. As soon as
it is time to do that, it just starts to GC, keeps doing that for a long time,
and then enqueues the flush without writing; it is unresponsive during the GC
storms. The other node works just as expected: it takes the memory and writes
it down in a matter of seconds. This is not a lot of memory and there are no
reads.

Log from troubling node:
--
 INFO 10:42:26,842 GC for ParNew: 808 ms, 106688440 reclaimed leaving 7273866048 used; max is 17388929024
 INFO 10:42:31,613 GC for ParNew: 882 ms, 120705376 reclaimed leaving 7292752352 used; max is 17388929024
 INFO 10:42:32,615 GC for ParNew: 621 ms, 108181664 reclaimed leaving 7324162368 used; max is 17388929024
 INFO 10:42:35,468 GC for ParNew: 732 ms, 107646952 reclaimed leaving 7407855104 used; max is 17388929024
 INFO 10:42:36,540 GC for ParNew: 556 ms, 106819200 reclaimed leaving 7440627584 used; max is 17388929024
 INFO 10:42:38,348 GC for ParNew: 676 ms, 111891904 reclaimed leaving 7490450648 used; max is 17388929024
 INFO 10:42:39,413 GC for ParNew: 768 ms, 110205856 reclaimed leaving 7519836472 used; max is 17388929024
 INFO 10:42:40,671 GC for ParNew: 755 ms, 112034384 reclaimed leaving 7547393768 used; max is 17388929024
 INFO 10:42:41,884 GC for ParNew: 834 ms, 108972528 reclaimed leaving 7578012920 used; max is 17388929024
 INFO 10:42:43,102 GC for ParNew: 971 ms, 110778800 reclaimed leaving 7606825800 used; max is 17388929024
 INFO 10:42:44,391 GC for ParNew: 1076 ms, 109996232 reclaimed leaving 7636421248 used; max is 17388929024
--
I had trouble copy pasting all of the data running the server remotely with 
putty.

Ring
Address       Status     Load          Range                                      Ring
                                       142713423890871059377105093567732377974
x.x.x.211     Up         486 bytes     45911723912241754468195357739525604647     |<--|
x.x.x.209     Up         501.23 MB     142713423890871059377105093567732377974    |-->|

tpstats from the node that won't wake up from this state.

When doing the ParNew

Pool Name                 Active   Pending   Completed
STREAM-STAGE                   0         0           0
RESPONSE-STAGE                 0         0     1003801
ROW-READ-STAGE                 0         0           0
LB-OPERATIONS                  0         0           0
MISCELLANEOUS-POOL             0         0           0
GMFD                           0         0        1047
LB-TARGET                      0         0           0
CONSISTENCY-MANAGER            0         0           0
ROW-MUTATION-STAGE            32    183026     1035233
MESSAGE-STREAMING-POOL         0         0           0
LOAD-BALANCER-STAGE            0         0           0
FLUSH-SORTER-POOL              0         0           0
MEMTABLE-POST-FLUSHER          1         2           1
FLUSH-WRITER-POOL              1         2           1
AE-SERVICE-STAGE               0         0           0
HINTED-HANDOFF-POOL            0         0           2

When done with ParNew

Pool Name                 Active   Pending   Completed
STREAM-STAGE                   0         0           0
RESPONSE-STAGE                 0         0     1003801
ROW-READ-STAGE                 0         0           0
LB-OPERATIONS                  0         0           0
MISCELLANEOUS-POOL             0         0           0
GMFD                           0         0       17617
LB-TARGET                      0         0           0
CONSISTENCY-MANAGER            0         0           0
ROW-MUTATION-STAGE             0         0     1218212
MESSAGE-STREAMING-POOL         0         0           0
LOAD-BALANCER-STAGE            0         0           0
FLUSH-SORTER-POOL              0         0           0
MEMTABLE-POST-FLUSHER          1         2           2
FLUSH-WRITER-POOL              1         2           2
AE-SERVICE-STAGE               0         0           0
HINTED-HANDOFF-POOL            1         1           3

It is not that it is writing slowly; it is not writing at all, ever, or only
extremely slowly. I think what it does write comes from gossip, not from
connections to the node.
And not any amount and it has nothing to do with swapping or the 16gb it 

LongString

2010-09-01 Thread Kevin Irwig
Hi,

I came across this presentation (link below) by Sarkissian (no first name 
given) at Digg about their use of Cassandra. On page 27 he says "Custom 
comparators turn out to be key" and mentions in the next few slides a 
LongString (actually once a LongString the other times a LongSting, but I'm 
assuming that's just a typo). Most of my CFs use some long strings (urls) 
either as rows or column names, and I'm keen to know more about what they may 
have learned. Does anyone know if they contributed this class back to
Cassandra or can anyone guess at how long strings might need to be handled 
differently to what the standard string comparator does?

https://nosqleast.com/2009/slides/sarkissian-cassandra.pdf

Thanks in advance,
Kevin.


dont mind my last letter (server problem)

2010-09-01 Thread Thorvaldsson Justus
Or this one =)
The server error was indeed an error, but on my part.

If you try to memlock more memory than is available on the server, the kernel
will crash.
Also, if you use swap as RAM, it will have a lot of trouble.

/J

AB SVENSKA SPEL
106 10 Stockholm
Sturegatan 11, Sundbyberg
Växel +46 8 757 77 00
http://svenskaspel.se


Re: PHP/avro possibility

2010-09-01 Thread Jeff Hammerbacher
To follow up on this post: the PHP implementation of Avro has been committed
to trunk (see https://issues.apache.org/jira/browse/AVRO-627) and will be
available in the 1.4.0 release, which is being voted on currently.

On Wed, Aug 25, 2010 at 7:39 PM, Jeremy Hanna wrote:

> For those interested in PHP support with avro, there was an interesting
> email on the avro list.
>
> http://www.apacheserver.net/Avro-PHP-library-at226970.htm


Re: Client developer mailing list

2010-09-01 Thread Guilherme Defreitas
Hi guys,

I'm new to cassandra development and I would like to know which is the best
(stable) Ruby client to use with Cassandra. It will be used in a rails
project, but it doesn't need to be "Active Record"-like.

Thanks
Guilherme

On Tue, Aug 31, 2010 at 11:58 PM, Gasol Wu  wrote:

> great to see, subscribed.
>


Re: JConsole/SSH tunneling tip

2010-09-01 Thread Edward Capriolo
On Wed, Sep 1, 2010 at 9:03 AM, Matthew Conway  wrote:
> If you need to tunnel jconsole to a remote cassandra instance, the SSH socks 
> proxy (ssh -D)is the easiest, least intrusive way.  More details:
>
> http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
>
> Matt
>
>
Matt's approach makes good sense for an interactive JMX session.

RMI is fubar. All my cacti templates use JMX to pull data from hadoop,
cassandra, etc. Though it is easy enough to tunnel your own system,
tunneling your monitoring station is harder.

I converted all my scripts that originally ran remotely with JMX to
run locally on the host, and then I call them with the Nagios Remote
Plugin Executor (NRPE).

# tail -1 /etc/nagios/nrpe.cfg
command[run_caches]=/usr/lib64/nagios/plugins/run_caches.sh $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$

# /usr/lib/nagios/plugins/check_nrpe -H cdbsd01.xx -c run_caches
-a cdbsd01.hadoop.pvt 8585 dummyUser dummyPass  RowCache Data
Size:103 Capacity:103 Hits:20770366 Requests:26687574
RecentHitRate:0.10300684727597499

This lets me collect all the performance data over NRPE and avoids all
the hostname issues, nat, and port issues.
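
For anyone post-processing plugin output of this shape, here is a small
sketch of turning the key:value pairs into numbers. It assumes the output
is whitespace-separated Key:number tokens as in the sample above; the class
CacheStatsLine is a made-up name and is not part of Nagios, NRPE, or the
run_caches.sh script.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheStatsLine {
    /** Parses whitespace-separated Key:number tokens; other tokens are skipped. */
    public static Map<String, Double> parse(String line) {
        Map<String, Double> stats = new LinkedHashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int colon = token.indexOf(':');
            if (colon <= 0) {
                continue; // not a key:value token
            }
            try {
                stats.put(token.substring(0, colon),
                          Double.parseDouble(token.substring(colon + 1)));
            } catch (NumberFormatException e) {
                // value is not numeric; ignore this token
            }
        }
        return stats;
    }

    /** Overall hit rate since startup: Hits divided by Requests. */
    public static double hitRate(Map<String, Double> stats) {
        return stats.get("Hits") / stats.get("Requests");
    }

    public static void main(String[] args) {
        String line = "Size:103 Capacity:103 Hits:20770366 Requests:26687574 "
                + "RecentHitRate:0.10300684727597499";
        Map<String, Double> stats = parse(line);
        System.out.println(stats);
        System.out.println("overall hit rate: " + hitRate(stats));
    }
}
```

Note that the overall rate computed from Hits and Requests is a lifetime
figure and will generally differ from RecentHitRate.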

Edward


Re: JConsole/SSH tunneling tip

2010-09-01 Thread Jonathan Ellis
Thanks for writing this up!

On Wed, Sep 1, 2010 at 6:03 AM, Matthew Conway  wrote:
> If you need to tunnel jconsole to a remote cassandra instance, the SSH socks 
> proxy (ssh -D)is the easiest, least intrusive way.  More details:
>
> http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
>
> Matt
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: TTransportException intermittently in 0.7

2010-09-01 Thread Carl Bruecken
I believe the problem here is with Pelops. The batch_mutate already does a
flush, so the pelops flush is not required. Removing the flush from
pelops fixes the issue.


By chance, is anyone who has seen this error using Pelops?


On 8/31/10 6:54 PM, Carl Bruecken wrote:
I've made some progress on narrowing this down and am able to
reproduce it easily. I am using pelops as a client and I configured the
policy in pelops to only establish 1 connection to a cassandra node.
I'm able to step through the pelops code line by line and see the
resulting thrift transport logging in cassandra. It seems that flushing
the transport causes the unwanted TTransportException in the server
and the subsequent closing of the connection. The connection should stay
open after flushing. When there are many connections established, the
behaviour seems intermittent and many operations succeed.




Here are the details

1) The trigger from the client side is when the framed transport is 
flushed.

   conn.getAPI().batch_mutate(convertedBatch, cLevel);
// Flush connection
conn.flush();

2) In CustomTThreadPoolServer.java in Cassandra I modified the code to 
log TTransportExceptions.


catch (TTransportException ttx) {
    LOGGER.error("Transport exception", ttx);
} catch (TException tx) {
    LOGGER.error("Thrift error occurred during processing of message.", tx);
} catch (Exception x) {
    LOGGER.error("Error occurred during processing of message.", x);
}


3) Here is the exception that is ignored in cassandra.   Flushing the 
transport causes the server to believe the client has closed the 
connection.


org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 4 bytes, but only got 0 bytes.
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2487)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:637)

4) The next batch mutate to this connection caused the exception in 
the client


 WARN [main] 2010-08-31 18:40:06,749 Operand.java (line 72) Operation failed as result of network exception. Connection must be destroyed.  See cause for details...
org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:905)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:889)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:38)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:53)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:49)
    at com.aol.data.c7.App.doWork(App.java:41)
    at com.aol.data.c7.App.main(App.java:77)
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

... 15 more
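
To make the server-side message in step 3 a little more concrete: a
blocking "read exactly N bytes" loop that meets end-of-stream before the
4-byte message header arrives has to fail in roughly this way. The sketch
below is a stand-alone illustration with made-up names (ReadAllSketch,
readAll); it is not Thrift's actual code and says nothing about why the
stream ended.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAllSketch {
    /** Reads exactly len bytes, or throws if the stream ends first. */
    public static byte[] readAll(InputStream in, int len) throws IOException {
        byte[] buf = new byte[len];
        int got = 0;
        while (got < len) {
            int n = in.read(buf, got, len - got);
            if (n <= 0) {
                // End of stream: the peer closed before sending everything.
                throw new IOException("Cannot read. Remote side has closed. Tried to read "
                        + len + " bytes, but only got " + got + " bytes.");
            }
            got += n;
        }
        return buf;
    }

    public static void main(String[] args) {
        // An empty stream plays the role of a connection closed by the peer.
        InputStream closedByPeer = new ByteArrayInputStream(new byte[0]);
        try {
            readAll(closedByPeer, 4); // 4 bytes: the size of an i32 header field
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```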




On 8/31/10 4:04 PM, Jonathan Ellis wrote:


No, I don't know that anyone has reproduced that.  TTransportException
always means "something went wrong on the thrift side" in my
experience, it shouldn't be cassandra-version specific.

On Tue, Aug 31, 2010 at 12:53 PM, Carl Bruecken
 wrote:
>
>  Are there any estimates as to when a fix for this will be checked into
> trunk?
>
> Coincidentally, has any

Re: TTransportException intermittently in 0.7

2010-09-01 Thread Andres March
 I saw this with pelops and only with batch mutate.  Other calls worked 
fine.


On 09/01/2010 08:16 AM, Carl Bruecken wrote:
I believe the problem here is with Pelops.   The batch_mutate does a 
flush and the pelops flush is not required.   Removing the flush from 
pelops fixes the issue.


By chance is anyone that has seen this error using Pelops?



Re: LongString

2010-09-01 Thread Tyler Hobbs
Hey Kevin,

It looks to me like they use LongString to sort primarily by the Long, then
secondarily by the String portion of a "LongString".  I believe all you
would need to do to implement this is to add a class to
src/java/org/apache/cassandra/db/marshal .  I would recommend copying
LongType to start, then adjust the compare() and getString() methods.  As
far as using this for column names goes, I don't think there's anything else
to it.  I'm not sure about row keys.
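
A minimal sketch of that ordering, for illustration only: it assumes a
"LongString" is an 8-byte big-endian long followed by UTF-8 text, which is
a guess at the layout and not something the Digg slides confirm. The class
below is a plain java.util.Comparator with invented names; wiring it into
a class in the marshal package as suggested above is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

public class LongStringOrder implements Comparator<byte[]> {
    /** Encodes a (long, string) pair as 8 big-endian bytes plus UTF-8 text. */
    public static byte[] encode(long number, String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + utf8.length).putLong(number).put(utf8).array();
    }

    /** Human-readable form, e.g. for logs. */
    public static String getString(byte[] bytes) {
        long number = ByteBuffer.wrap(bytes).getLong();
        String text = new String(bytes, 8, bytes.length - 8, StandardCharsets.UTF_8);
        return number + ":" + text;
    }

    /** Sorts by the long first, then by the string part. Inputs need at least 8 bytes. */
    @Override
    public int compare(byte[] a, byte[] b) {
        int byLong = Long.compare(ByteBuffer.wrap(a).getLong(), ByteBuffer.wrap(b).getLong());
        if (byLong != 0) {
            return byLong;
        }
        String sa = new String(a, 8, a.length - 8, StandardCharsets.UTF_8);
        String sb = new String(b, 8, b.length - 8, StandardCharsets.UTF_8);
        return sa.compareTo(sb);
    }

    public static void main(String[] args) {
        LongStringOrder order = new LongStringOrder();
        byte[] x = encode(1L, "http://example.com/b");
        byte[] y = encode(2L, "http://example.com/a");
        byte[] z = encode(2L, "http://example.com/b");
        System.out.println(getString(x) + " vs " + getString(y) + " -> " + order.compare(x, y));
        System.out.println(getString(y) + " vs " + getString(z) + " -> " + order.compare(y, z));
    }
}
```

The long is compared as a signed value here; if the long is a timestamp
or an unsigned id, that choice would need revisiting.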

- Tyler

On Wed, Sep 1, 2010 at 5:27 AM, Kevin Irwig  wrote:

>  Hi,
>
> I came across this presentation (link below) by Sarkissian (no first name
> given) at Digg about their use of Cassandra. On page 27 he says "Custom
> comparators turn out to be key" and mentions in the next few slides a
> LongString (actually once a LongString the other times a LongSting, but I'm
> assuming that's just a typo). Most of my CFs use some long strings (urls)
> either as rows or column names, and I'm keen to know more about what they
> may have learned. Does anyone know if they contributed this class to back to
> Cassandra or can anyone guess at how long strings might need to be handled
> differently to what the standard string comparator does?
>
> https://nosqleast.com/2009/slides/sarkissian-cassandra.pdf
>
> Thanks in advance,
> Kevin.
>
>


Re: Client developer mailing list

2010-09-01 Thread Ryan King
On Wed, Sep 1, 2010 at 4:40 AM, Guilherme Defreitas
 wrote:
> Hi guys,
> I'm new in cassandra development and I would like to know witch is the best
> (stable) client in Ruby to use with Cassandra? It will be use in a rails
> project, but it don't need to be "Active Record" like.

Try http://github.com/fauna/cassandra or
http://github.com/nzkoz/cassandra_object

-ryan


Re: Cassandra on AWS across Regions

2010-09-01 Thread Peter Fales
A few months ago, there was a thread on this list about using Cassandra
across multiple EC2 regions.   I was interested in doing the same thing,
and managed to make it work.

To implement this, there are basically two things that need to change.
First, in storage-conf.xml, I used the "external" IP addresses for
 and  - these external address are needed for 
the machines in different regions to talk to each other.   However, they
also work within regions.  

However, that doesn't quite work with the stock Cassandra, as it will
try to bind and listen on those addresses and give up because they
don't appear to be valid network addresses.  This patch causes 
Cassandra to listen on the local network, rather than the 
defined in the config file.   (This is not a completely general
solution.  It assumes that there is only one local network, and that the
default network is the one to use, but - at least for EC2 - that assumption
should be OK)

Part of my motivation for posting here is to solicit feedback on the 
third part of the patch.   I was able to get my two-region cluster 
up and running by patching just the first two files.   The third
change may be needed under certain conditions, but I never seemed to
hit that code.

Here's the source patch:


diff -ur orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java
--- orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java 2010-08-16 17:48:02.0 -0500
+++ apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java 2010-09-01 10:05:34.0 -0500
@@ -147,7 +147,16 @@
 ServerSocketChannel serverChannel = ServerSocketChannel.open();
 final ServerSocket ss = serverChannel.socket();
 ss.setReuseAddress(true);
+
+/* OLD
 ss.bind(new InetSocketAddress(localEp, DatabaseDescriptor.getStoragePort()));
+*/
+   /* In order to allow using Amazon EC2 across regions, we listen
+* on our local address, rather than the "public" IP address
+* defined in storage-conf.xml
+*/
+ss.bind(new InetSocketAddress(InetAddress.getLocalHost(), DatabaseDescriptor.getStoragePort()));
+
 socketThread = new SocketThread(ss, "ACCEPT-" + localEp);
 socketThread.start();
 listenGate.signalAll();
diff -ur orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
--- orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java 2010-07-27 16:09:18.0 -0500
+++ apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java 2010-09-01 10:09:31.0 -0500
@@ -149,7 +149,16 @@
 try
 {
 // zero means 'bind on any available port.'
+
+   /* In order to allow using Amazon EC2 across regions, we
+* listen on our local address, rather than the
+* "public" IP address defined in storage-conf.xml
+*/
+
+/* OLD
 socket = new Socket(endpoint, DatabaseDescriptor.getStoragePort(), FBUtilities.getLocalAddress(), 0);
+*/
+socket = new Socket(endpoint, DatabaseDescriptor.getStoragePort(), InetAddress.getLocalHost(), 0);
 socket.setTcpNoDelay(true);
 output = new DataOutputStream(socket.getOutputStream());
 return true;
diff -ur orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java
--- orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java 2010-05-28 11:23:04.0 -0500
+++ apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java 2010-09-01 10:07:43.0 -0500
@@ -122,6 +122,14 @@
 {
 SocketChannel channel = SocketChannel.open();
 // force local binding on correctly specified interface.
+
+   /* When using Amazon EC2 "public" IP addresses, we probably
+* won't be able to bind to the address.  However, I don't see
+* this code getting hit, and I'm not sure under what circumstances
+* it would get run.
+*/
+System.out.println("FIXME - probably can't bind to this address: "+FBUtilities.getLocalAddress()+"\n");
+
 channel.socket().bind(new InetSocketAddress(FBUtilities.getLocalAddress(), 0));
 int attempts = 0;
 while (true)


-- 
Peter Fales
Alcatel-Lucent
Member of Technical Staff
1960 Lucent Lane
Room: 9H-505
Naperville, IL 60566-7033
Email: peter.fa...@alcatel-lucent.com
Phone: 630 979 8031


Re: JConsole/SSH tunneling tip

2010-09-01 Thread Janne Jalkanen

On Sep 1, 2010, at 16:03 , Matthew Conway wrote:

If you need to tunnel jconsole to a remote cassandra instance, the  
SSH socks proxy (ssh -D)is the easiest, least intrusive way.  More  
details:


http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html


Totally awesome. I've lost several hours of my precious life trying to  
figure out how to do this, and now you solved it!  Me and whatever is  
left of my sanity thank you :-D


/Janne


Re: What is K-table ?

2010-09-01 Thread Andrew Garman
yaw  gmail.com> writes:

> 
> Hi all, connecting to a cluster with cassandra-cli and trying a 
> describe command,
> I obtain a  "missing K_TABLE" message



> Is this a real issue?

I would chalk this one up to a mix of user error and a cryptic
CLI message.

describe command is describe keyspace 

--
[defa...@unknown] describe keyspace alerts
Keyspace: alerts

Column Family Name: byLocation
Column Family Type: Super
Column Sorted By: org.apache.cassandra.db.marshal.UTF8Type
flush period: null minutes
--

The CLI seems to be nice enough to guess that what you want
is describe keyspace.  If the CLI responded with something 
informative, users would have a d'oh moment and type it right 
next time.

--
[defa...@unknown] describe alerts
line 1:9 missing K_TABLE at 'alerts'
Keyspace: alerts

Column Family Name: byLocation
Column Family Type: Super
Column Sorted By: org.apache.cassandra.db.marshal.UTF8Type
flush period: null minutes
--

I'm guessing that, as describe  gives a message, the developers
plan to describe other things in the future. At that point, it'll
become a worse bet that the user intends describe keyspace, as
more things can be described.




Re: Cassandra on AWS across Regions

2010-09-01 Thread Andres March

 Could you explain this point further?  Was there an exception?

On 09/01/2010 09:26 AM, Peter Fales wrote:

that doesn't quite work with the stock Cassandra, as it will
try to bind and listen on those addresses and give up because they
don't appear to be valid network addresses.


--
*Andres March*
ama...@qualcomm.com 
Qualcomm Internet Services


Riptano Cassandra training in Denver

2010-09-01 Thread Jonathan Ellis
Riptano is going to be in Denver next Friday (Sept 10) for a full-day
Cassandra training (taught by yours truly).  The training is broken
into two parts: the first covers application design and modeling in
Cassandra, with exercises using the Pycassa library; the second covers
operations, troubleshooting, and performance tuning.

For more details or to register for the training, see
http://www.eventbrite.com/event/756085472

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra on AWS across Regions

2010-09-01 Thread Benjamin Black
The issue is this:

The IP address by which an EC2 instance is known _externally_ is not
actually on the instance itself (the address being translated), and
the _internal_ address is not accessible across regions.  Since you
can't bind a specific address that is not on one of your local
interfaces, and Cassandra nodes don't have a notion of internal vs
external you need a mechanism by which a node is told to bind one IP
(the internal one), while it gossips another (the external one).
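
The bind-one-address, advertise-another idea can be sketched in a few
lines. This is a toy illustration with invented names (BindVsBroadcast,
chooseBindAddress), not how Cassandra is configured: 203.0.113.10 is a
documentation-only address standing in for an EC2 public IP, and the
loopback address stands in for the instance's internal IP.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.ServerSocket;

public class BindVsBroadcast {
    /** True if some local network interface carries this address, so bind() can work. */
    public static boolean isOnLocalInterface(InetAddress addr) throws IOException {
        return NetworkInterface.getByInetAddress(addr) != null;
    }

    /** Listens on the advertised address if it is local, otherwise on the internal one. */
    public static InetAddress chooseBindAddress(InetAddress advertised, InetAddress internal)
            throws IOException {
        return isOnLocalInterface(advertised) ? advertised : internal;
    }

    public static void main(String[] args) throws IOException {
        InetAddress external = InetAddress.getByName("203.0.113.10"); // placeholder public IP
        InetAddress internal = InetAddress.getLoopbackAddress();      // placeholder private IP

        InetAddress bindTo = chooseBindAddress(external, internal);
        try (ServerSocket ss = new ServerSocket()) {
            ss.setReuseAddress(true);
            ss.bind(new InetSocketAddress(bindTo, 0)); // port 0: any free port
            System.out.println("listening on " + ss.getLocalSocketAddress()
                    + ", advertising " + external.getHostAddress());
        }
    }
}
```

Other nodes would be told the advertised address, while the socket itself
is bound to an address the machine actually owns.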

I like what this patch does conceptually, but would prefer
configuration options to cause it to happen (obviously a much larger
patch).  Very cool, Peter!


b

On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:
> Could you explain this point further?  Was there an exception?
>
> On 09/01/2010 09:26 AM, Peter Fales wrote:
>
> that doesn't quite work with the stock Cassandra, as it will
> try to bind and listen on those addresses and give up because they
> don't appear to be valid network addresses.
>
> --
> Andres March
> ama...@qualcomm.com
> Qualcomm Internet Services


Re: Cassandra on AWS across Regions

2010-09-01 Thread Peter Fales
I probably should have made it clear that I wasn't proposing this as
an official patch (as you point out, it's not general enough for 
production use).   I'm just looking for feedback on the concept (thanks!)
and thought it might possibly be useful to other folks trying to
do the same thing.


On Wed, Sep 01, 2010 at 03:24:44PM -0500, Benjamin Black wrote:
> The issue is this:
> 
> The IP address by which an EC2 instance is known _externally_ is not
> actually on the instance itself (the address being translated), and
> the _internal_ address is not accessible across regions.  Since you
> can't bind a specific address that is not on one of your local
> interfaces, and Cassandra nodes don't have a notion of internal vs
> external you need a mechanism by which a node is told to bind one IP
> (the internal one), while it gossips another (the external one).
> 
> I like what this patch does conceptually, but would prefer
> configuration options to cause it to happen (obviously a much larger
> patch).  Very cool, Peter!
> 
> 
> b
> 
> On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:
> > Could you explain this point further?  Was there an exception?
> >
> > On 09/01/2010 09:26 AM, Peter Fales wrote:
> >
> > that doesn't quite work with the stock Cassandra, as it will
> > try to bind and listen on those addresses and give up because they
> > don't appear to be valid network addresses.
> >
> > --
> > Andres March
> > ama...@qualcomm.com
> > Qualcomm Internet Services

-- 
Peter Fales
Alcatel-Lucent
Member of Technical Staff
1960 Lucent Lane
Room: 9H-505
Naperville, IL 60566-7033
Email: peter.fa...@alcatel-lucent.com
Phone: 630 979 8031


Re: Cassandra on AWS across Regions

2010-09-01 Thread Edward Capriolo
On Wed, Sep 1, 2010 at 4:42 PM, Peter Fales
 wrote:
> I probably should have made it clear that I wasn't proposing this as
> an official patch (as you point out, it's not general enough for
> production use).   I'm just looking for feedback on the concept (thanks!)
> and thought it might possibly be useful to other folks trying to
> do the same thing.
>
>
> On Wed, Sep 01, 2010 at 03:24:44PM -0500, Benjamin Black wrote:
>> The issue is this:
>>
>> The IP address by which an EC2 instance is known _externally_ is not
>> actually on the instance itself (the address being translated), and
>> the _internal_ address is not accessible across regions.  Since you
>> can't bind a specific address that is not on one of your local
>> interfaces, and Cassandra nodes don't have a notion of internal vs
>> external you need a mechanism by which a node is told to bind one IP
>> (the internal one), while it gossips another (the external one).
>>
>> I like what this patch does conceptually, but would prefer
>> configuration options to cause it to happen (obviously a much larger
>> patch).  Very cool, Peter!
>>
>>
>> b
>>
>> On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:
>> > Could you explain this point further?  Was there an exception?
>> >
>> > On 09/01/2010 09:26 AM, Peter Fales wrote:
>> >
>> > that doesn't quite work with the stock Cassandra, as it will
>> > try to bind and listen on those addresses and give up because they
>> > don't appear to be valid network addresses.
>> >
>> > --
>> > Andres March
>> > ama...@qualcomm.com
>> > Qualcomm Internet Services
>
> --
> Peter Fales
> Alcatel-Lucent
> Member of Technical Staff
> 1960 Lucent Lane
> Room: 9H-505
> Naperville, IL 60566-7033
> Email: peter.fa...@alcatel-lucent.com
> Phone: 630 979 8031
>

Even though performance will be impacted, this essentially allows
Cassandra to run over network-address-translated IPs. Not a
bad thing.


Re: Cassandra on AWS across Regions

2010-09-01 Thread Andres March
 Is it not possible to put the external host name in cassandra.yaml and 
add a host entry in /etc/hosts for that name to resolve to the local 
interface?


On 09/01/2010 01:24 PM, Benjamin Black wrote:

The issue is this:

The IP address by which an EC2 instance is known _externally_ is not
actually on the instance itself (the address being translated), and
the _internal_ address is not accessible across regions.  Since you
can't bind a specific address that is not on one of your local
interfaces, and Cassandra nodes don't have a notion of internal vs
external you need a mechanism by which a node is told to bind one IP
(the internal one), while it gossips another (the external one).

I like what this patch does conceptually, but would prefer
configuration options to cause it to happen (obviously a much larger
patch).  Very cool, Peter!


b

On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:

Could you explain this point further?  Was there an exception?

On 09/01/2010 09:26 AM, Peter Fales wrote:

that doesn't quite work with the stock Cassandra, as it will
try to bind and listen on those addresses and give up because they
don't appear to be valid network addresses.

--
Andres March
ama...@qualcomm.com
Qualcomm Internet Services


--
*Andres March*
ama...@qualcomm.com 
Qualcomm Internet Services


Re: Cassandra on AWS across Regions

2010-09-01 Thread Benjamin Black
It's not gossiping hostnames, it's gossiping IP addresses.  The
purpose of Peter's patch is to have the system gossip its external
address (so other nodes can connect), but bind its internal address.
As Edward notes, it helps with NAT in general, not just EC2.  Not
perfect, but a great start.


b

On Wed, Sep 1, 2010 at 2:57 PM, Andres March  wrote:
> Is it not possible to put the external host name in cassandra.yaml and add a
> host entry in /etc/hosts for that name to resolve to the local interface?
>
> On 09/01/2010 01:24 PM, Benjamin Black wrote:
>
> The issue is this:
>
> The IP address by which an EC2 instance is known _externally_ is not
> actually on the instance itself (the address being translated), and
> the _internal_ address is not accessible across regions.  Since you
> can't bind a specific address that is not on one of your local
> interfaces, and Cassandra nodes don't have a notion of internal vs
> external you need a mechanism by which a node is told to bind one IP
> (the internal one), while it gossips another (the external one).
>
> I like what this patch does conceptually, but would prefer
> configuration options to cause it to happen (obviously a much larger
> patch).  Very cool, Peter!
>
>
> b
>
> On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:
>
> Could you explain this point further?  Was there an exception?
>
> On 09/01/2010 09:26 AM, Peter Fales wrote:
>
> that doesn't quite work with the stock Cassandra, as it will
> try to bind and listen on those addresses and give up because they
> don't appear to be valid network addresses.
>
> --
> Andres March
> ama...@qualcomm.com
> Qualcomm Internet Services
>
> --
> Andres March
> ama...@qualcomm.com
> Qualcomm Internet Services


Re: Cassandra on AWS across Regions

2010-09-01 Thread Jonathan Ellis
+1

On Wed, Sep 1, 2010 at 1:24 PM, Benjamin Black  wrote:
> The issue is this:
>
> The IP address by which an EC2 instance is known _externally_ is not
> actually on the instance itself (the address being translated), and
> the _internal_ address is not accessible across regions.  Since you
> can't bind a specific address that is not on one of your local
> interfaces, and Cassandra nodes don't have a notion of internal vs
> external you need a mechanism by which a node is told to bind one IP
> (the internal one), while it gossips another (the external one).
>
> I like what this patch does conceptually, but would prefer
> configuration options to cause it to happen (obviously a much larger
> patch).  Very cool, Peter!
>
>
> b
>
> On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:
>> Could you explain this point further?  Was there an exception?
>>
>> On 09/01/2010 09:26 AM, Peter Fales wrote:
>>
>> that doesn't quite work with the stock Cassandra, as it will
>> try to bind and listen on those addresses and give up because they
>> don't appear to be valid network addresses.
>>
>> --
>> Andres March
>> ama...@qualcomm.com
>> Qualcomm Internet Services
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra on AWS across Regions

2010-09-01 Thread Andres March
 I thought you might say that.  Is there some reason to gossip IP 
addresses vs hostnames?  I thought that layer of indirection could be 
useful in more than just this use case.


I still think it is a good idea to have a separate bind vs gossip config 
param.


On 09/01/2010 03:10 PM, Benjamin Black wrote:

It's not gossiping hostnames, it's gossiping IP addresses.  The
purpose of Peter's patch is to have the system gossip its external
address (so other nodes can connect), but bind its internal address.
As Edward notes, it helps with NAT in general, not just EC2.  Not
perfect, but a great start.


b

On Wed, Sep 1, 2010 at 2:57 PM, Andres March  wrote:

Is it not possible to put the external host name in cassandra.yaml and add a
host entry in /etc/hosts for that name to resolve to the local interface?

On 09/01/2010 01:24 PM, Benjamin Black wrote:

The issue is this:

The IP address by which an EC2 instance is known _externally_ is not
actually on the instance itself (the address being translated), and
the _internal_ address is not accessible across regions.  Since you
can't bind a specific address that is not on one of your local
interfaces, and Cassandra nodes don't have a notion of internal vs
external you need a mechanism by which a node is told to bind one IP
(the internal one), while it gossips another (the external one).

I like what this patch does conceptually, but would prefer
configuration options to cause it to happen (obviously a much larger
patch).  Very cool, Peter!


b

On Wed, Sep 1, 2010 at 1:10 PM, Andres March  wrote:

Could you explain this point further?  Was there an exception?

On 09/01/2010 09:26 AM, Peter Fales wrote:

that doesn't quite work with the stock Cassandra, as it will
try to bind and listen on those addresses and give up because they
don't appear to be valid network addresses.

--
Andres March
ama...@qualcomm.com
Qualcomm Internet Services

--
Andres March
ama...@qualcomm.com
Qualcomm Internet Services


--
*Andres March*
ama...@qualcomm.com 
Qualcomm Internet Services


Re: Cassandra on AWS across Regions

2010-09-01 Thread Benjamin Black
On Wed, Sep 1, 2010 at 3:18 PM, Andres March  wrote:
> I thought you might say that.  Is there some reason to gossip IP addresses
> vs hostnames?  I thought that layer of indirection could be useful in more
> than just this use case.
>

The trade-off for that flexibility is that nodes are now dependent on
name resolution during normal operation, rather than only at startup.
The opportunities for horribly confusing failure scenarios are
numerous and frightening.  Other than NAT (which can clearly be dealt
with without gossiping hostnames), what do you think this would
enable?
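
To make the trade-off concrete, here is a minimal sketch of the "resolve only at startup" behaviour. It is an assumption-laden illustration, not Cassandra's gossip code: `resolve_at_startup` and `PeerTable` are hypothetical names, and the resolver is injectable so the example runs without real DNS.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_at_startup(
    hostnames: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, str]:
    """Resolve every configured hostname once and keep only the resulting IPs."""
    return {name: resolver(name) for name in hostnames}


class PeerTable:
    """Peers are tracked by IP, so no resolver is consulted after construction."""

    def __init__(self, resolved: Dict[str, str]) -> None:
        self._ips = sorted(set(resolved.values()))

    def endpoints(self) -> List[str]:
        return list(self._ips)


if __name__ == "__main__":
    # A dict stands in for DNS; the names and addresses are placeholders.
    fake_dns = {"seed1.example": "10.0.0.1", "seed2.example": "10.0.0.2"}
    table = PeerTable(resolve_at_startup(list(fake_dns), resolver=fake_dns.__getitem__))
    fake_dns.clear()  # simulate name resolution breaking after startup
    print(table.endpoints())
```

With IPs in the table, a later resolver outage does not affect the node; if hostnames were gossiped instead, every lookup after startup would sit on that failure path.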


b


Re: Cassandra on AWS across Regions

2010-09-01 Thread Joe Stump

On Sep 1, 2010, at 1:42 PM, Peter Fales wrote:

> I probably should have made it clear that I wasn't proposing this as
> an official patch (as you point out, it's not general enough for 
> production use).   I'm just looking for feedback on the concept (thanks!)
> and thought it might possibly be useful to other folks trying to
> do the same thing.

We're extremely interested in this patch and helping out. Let me know if you 
need resources. SimpleGeo is ready, willing, and able to help as we are close 
to undertaking a similar endeavor. 

--Joe



Re: Cassandra on AWS across Regions

2010-09-01 Thread Andres March
I didn't have anything specific in mind. I understand all the issues 
around DNS and am not advocating supporting only hostnames (I just thought 
it would be a nice option).  I also wouldn't expect name resolution to be 
done all the time, only when the node is first being started or during 
initial discovery.


One use case might be when nodes are spread out over multiple networks, 
as the poster describes: nodes on the same network could talk over a 
private interface and incur less network overhead than if they went out 
through the public interface.  I'm not sure that this is even possible, 
given that Cassandra binds to only one interface.



On 09/01/2010 03:23 PM, Benjamin Black wrote:

On Wed, Sep 1, 2010 at 3:18 PM, Andres March  wrote:

I thought you might say that.  Is there some reason to gossip IP addresses
vs hostnames?  I thought that layer of indirection could be useful in more
than just this use case.


The trade-off for that flexibility is that nodes are now dependent on
name resolution during normal operation, rather than only at startup.
The opportunities for horribly confusing failure scenarios are
numerous and frightening.  Other than NAT (which can clearly be dealt
with without gossiping hostnames), what do you think this would
enable?


b


--
*Andres March*
ama...@qualcomm.com 
Qualcomm Internet Services


Re: Cassandra on AWS across Regions

2010-09-01 Thread Benjamin Black
On Wed, Sep 1, 2010 at 4:16 PM, Andres March  wrote:
> I didn't have anything specific in mind. I understand all the issues around
> DNS and not advocating only supporting hostnames (just thought it would be a
> nice option).  I also wouldn't expect name resolution to be done all the
> time, only when the node is first being started or during initial discovery.
>

All nodes would have to resolve whenever topology changed.

> One use case might be when nodes are spread out over multiple networks as
> the poster describes, nodes on the same network on a private interface could
> incur less network overhead than if they go out through the public
> interface.  I'm not sure that this is even possible given that cassandra
> binds to only one interface.
>

This case is not actually solved more simply by gossiping hostnames.
It requires much more in-depth understanding of infrastructure
topology.


b


order of mutations in batch_mutate

2010-09-01 Thread Terje Marthinussen
Hi,

Just a curiosity. I should probably read some code and write a test to make
sure, but not important enough right now for that :)

   void batch_mutate(string keyspace,
       map<string, map<string, list<Mutation>>> mutation_map,
       ConsistencyLevel consistency_level)

Will performance of a batch_mutate be affected by the order of mutations in
the list?
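
For readers unfamiliar with the argument's shape: as I understand the Thrift interface of that era, `mutation_map` nests row key, then column family name, then a list of mutations. The sketch below builds that nesting with plain Python dicts. It is not a client call; `build_mutation_map` and the flat tuple form are hypothetical, and the inner dicts merely stand in for Thrift `Mutation` structs.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical flat form: (row_key, column_family, column_name, value)
FlatMutation = Tuple[str, str, str, str]


def build_mutation_map(flat: List[FlatMutation]) -> Dict[str, Dict[str, List[dict]]]:
    """Group flat mutations into row key -> column family -> list of mutations."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row_key, column_family, name, value in flat:
        # Mutations keep their arrival order within each (row, column family) list.
        grouped[row_key][column_family].append({"name": name, "value": value})
    # Convert to plain dicts so the result prints and compares cleanly.
    return {key: dict(cfs) for key, cfs in grouped.items()}


if __name__ == "__main__":
    flat = [
        ("100", "Transactions", "status", "received"),
        ("101", "Transactions", "status", "verifying"),
        ("100", "Transactions", "mode", "debug"),
    ]
    print(build_mutation_map(flat))
```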

Terje


Re: order of mutations in batch_mutate

2010-09-01 Thread Jonathan Ellis
no

On Wed, Sep 1, 2010 at 7:34 PM, Terje Marthinussen
 wrote:
> Hi,
>
> Just a curiosity. I should probably read some code and write a test to make
> sure, but not important enough right now for that :)
>
> void batch_mutate(string keyspace, map<string, map<string, list<Mutation>>> mutation_map, ConsistencyLevel consistency_level)
>
> Will performance of a batch_mutate be affected by the order of mutations in
> the list?
>
> Terje
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Riptano Cassandra training in Denver

2010-09-01 Thread vineet daniel
Hi Jonathan

Any plans to come to India in the future?

___
Regards
Vineet Daniel
+918106217121
___

Let your email find you


On Thu, Sep 2, 2010 at 1:52 AM, Jonathan Ellis  wrote:

> Riptano is going to be in Denver next Friday (Sept 10) for a full-day
> Cassandra training (taught by yours truly).  The training is broken
> into two parts: the first covers application design and modeling in
> Cassandra, with exercises using the Pycassa library; the second covers
> operations, troubleshooting, and performance tuning.
>
> For more details or to register for the training, see
> http://www.eventbrite.com/event/756085472
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Re: Riptano Cassandra training in Denver

2010-09-01 Thread samal gorai
That would be great.

Samal Gorai

On Thu, Sep 2, 2010 at 10:46 AM, vineet daniel wrote:

> Hi Jonathan
>
> Any plans of coming to India in future ?
>
> ___
> Regards
> Vineet Daniel
> +918106217121
> ___
>
> Let your email find you
>
>
> On Thu, Sep 2, 2010 at 1:52 AM, Jonathan Ellis  wrote:
>
>> Riptano is going to be in Denver next Friday (Sept 10) for a full-day
>> Cassandra training (taught by yours truly).  The training is broken
>> into two parts: the first covers application design and modeling in
>> Cassandra, with exercises using the Pycassa library; the second covers
>> operations, troubleshooting, and performance tuning.
>>
>> For more details or to register for the training, see
>> http://www.eventbrite.com/event/756085472
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>
>


about insert benchmark

2010-09-01 Thread ChingShen
Hi all,

  I ran a benchmark with my own code and found that the 100000-insert
run performs better than the others. Why?
Can anyone explain it?

Thanks.

Partitioner = OPP
CL = ONE
==
1000 records
insert one:201 ms
insert per:0.201 ms
insert thput:4975.1245 ops/sec
==
10000 records
insert one:1950 ms
insert per:0.195 ms
insert thput:5128.205 ops/sec
==
100000 records
insert one:15576 ms
insert per:0.15576 ms
insert thput:6420.134 ops/sec
==
500000 records
insert one:82177 ms
insert per:0.164354 ms
insert thput:6084.4272 ops/sec
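
The per-insert and throughput figures above follow directly from the total time and the record count. The sketch below redoes that arithmetic; the record counts it uses (1000, 10000, 100000, 500000) are the ones implied by dividing each reported total by its per-insert time, and the function names are hypothetical.

```python
def per_insert_ms(total_ms: float, records: int) -> float:
    """Average latency of one insert, in milliseconds."""
    return total_ms / records


def throughput_ops_per_sec(total_ms: float, records: int) -> float:
    """Inserts completed per second over the whole run."""
    return records / (total_ms / 1000.0)


# (records, total elapsed milliseconds) taken from the reported runs
RUNS = [(1000, 201), (10000, 1950), (100000, 15576), (500000, 82177)]

if __name__ == "__main__":
    for records, total_ms in RUNS:
        print(
            f"{records} records: "
            f"{per_insert_ms(total_ms, records):.6f} ms/insert, "
            f"{throughput_ops_per_sec(total_ms, records):.1f} ops/sec"
        )
```

On these numbers the fastest run (100000 records) is roughly 29% faster per insert than the slowest (1000 records).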

Shen


Re: about insert benchmark

2010-09-01 Thread vineet daniel
Hi Ching

Are you inserting using PHP, Perl, Python, Java, or something else? Is
Cassandra installed locally or on a networked system, and is it a single
system or a cluster of nodes? I know I've asked many questions, but the
answers will help immensely in assessing the results.

Anyway, congrats on getting better results :-)
___
Regards
Vineet Daniel
+918106217121
___

Let your email find you


On Thu, Sep 2, 2010 at 11:39 AM, ChingShen  wrote:

> Hi all,
>
>   I ran a benchmark with my own code and found that the 100000-insert
> run performs better than the others. Why?
>  Can anyone explain it?
>
> Thanks.
>
> Partitioner = OPP
> CL = ONE
> ==
> 1000 records
> insert one:201 ms
> insert per:0.201 ms
> insert thput:4975.1245 ops/sec
> ==
> 10000 records
> insert one:1950 ms
> insert per:0.195 ms
> insert thput:5128.205 ops/sec
> ==
> 100000 records
> insert one:15576 ms
> insert per:0.15576 ms
> insert thput:6420.134 ops/sec
> ==
> 500000 records
> insert one:82177 ms
> insert per:0.164354 ms
> insert thput:6084.4272 ops/sec
>
> Shen
>


Re: about insert benchmark

2010-09-01 Thread ChingShen
Hi Daniel,

   I have 4 nodes in my cluster, and I run the benchmark in Java on node A.
  P.S. Replication factor = 3

Shen

On Thu, Sep 2, 2010 at 2:49 PM, vineet daniel wrote:

> Hi Ching
>
> You are inserting using php,perl,python,java or ? and is cassandra
> installed locally or on a network system and is it a single system or you
> have a cluster of nodes. I know I've asked you many questions but the
> answers will help immensely to assess the results.
>
> Anyways congrats on getting better results :-) .
>
> ___
> Regards
> Vineet Daniel
> +918106217121
> ___
>
> Let your email find you
>
>
> On Thu, Sep 2, 2010 at 11:39 AM, ChingShen wrote:
>
>> Hi all,
>>
>>   I ran a benchmark with my own code and found that the 100000-insert
>> run performs better than the others. Why?
>>  Can anyone explain it?
>>
>> Thanks.
>>
>> Partitioner = OPP
>> CL = ONE
>> ==
>> 1000 records
>> insert one:201 ms
>> insert per:0.201 ms
>> insert thput:4975.1245 ops/sec
>> ==
>> 10000 records
>> insert one:1950 ms
>> insert per:0.195 ms
>> insert thput:5128.205 ops/sec
>> ==
>> 100000 records
>> insert one:15576 ms
>> insert per:0.15576 ms
>> insert thput:6420.134 ops/sec
>> ==
>> 500000 records
>> insert one:82177 ms
>> insert per:0.164354 ms
>> insert thput:6084.4272 ops/sec
>>
>> Shen
>>
>
>