Re: Performance regression tests

2010-05-12 Thread Johan Oskarsson
I've started looking into this issue. My current thinking is as follows.

Add support for Cassandra in Whirr: 
http://wiki.apache.org/incubator/WhirrProposal
This would allow us to start a short-lived Cassandra cluster on one of the 
cloud services (EC2, Rackspace, etc.) for testing.
Real hardware would of course be better, but this is a good starting point.

For running the actual tests I have been looking at YCSB: 
http://github.com/brianfrankcooper/YCSB
I've added support for Cassandra trunk as of last week and am now working, off 
and on, on a measurements export function so we can get the results as 
a JSON file. It's fairly straightforward.

The best way to expose these results as graphs etc and raise an error if they 
are unexpected would be a plugin to Hudson. That way all our test results are 
in one place.
Other projects such as HBase might be interested in contributing to a 
Hudson-YCSB plugin. This would probably be best done as a separate project, on 
GitHub for example.
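
As a rough illustration of the "raise an error if they are unexpected" step, 
here is a minimal sketch that compares one run's JSON results against a stored 
baseline. The metric names, the flat JSON layout and the 10% tolerance are all 
assumptions made for the example; the actual YCSB export may look different.

```python
import json

# Hypothetical metric names; the real YCSB export may use another layout.
# True means higher is better (throughput), False means lower is better (latency).
HIGHER_IS_BETTER = {
    "throughput_ops_sec": True,
    "read_latency_ms": False,
    "update_latency_ms": False,
}


def find_regressions(baseline, current, tolerance=0.10):
    """Return the metrics that got worse than the baseline by more than `tolerance`."""
    regressions = []
    for name, higher_is_better in HIGHER_IS_BETTER.items():
        if name not in baseline or name not in current:
            continue  # skip metrics missing from either run
        base, cur = baseline[name], current[name]
        if higher_is_better:
            worse = cur < base * (1 - tolerance)
        else:
            worse = cur > base * (1 + tolerance)
        if worse:
            regressions.append(name)
    return regressions


def check_files(baseline_path, current_path, tolerance=0.10):
    """Load two JSON result files and return the list of regressed metrics."""
    with open(baseline_path) as f:
        baseline = json.load(f)
    with open(current_path) as f:
        current = json.load(f)
    return find_regressions(baseline, current, tolerance)
```

A build step could call check_files() and exit non-zero when the returned list 
is not empty, which Hudson would then report as a failed build.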

If we want further results on how performance is affected by failures we could 
run with
http://github.com/toddlipcon/gremlins
or
https://issues.apache.org/jira/browse/CASSANDRA-561


Thoughts?

/Johan

On 11 May 2010, at 20:38, Kushal Pisavadia wrote:

> Hi,
> 
> Due to conflicting schedules, I was unable to take part in the GSoC this
> year. However, I'm still very interested in helping out the community for
> this specific case.
> 
> Rather than just coding up a solution that would suit my own needs, I'm
> here asking for some help.
> 
> What short-term goals do you have in mind? What long-term goals do you have
> in mind?
> 
> I've had a look at the respective ticket —
> https://issues.apache.org/jira/browse/CASSANDRA-875 — but rather than just
> refactoring the py_stress utility I'd like to make something that fulfils
> whatever needs the current utility fails to meet.
> 
> I'm also curious about how you'd like me to commit/expose my code.
> Originally I was thinking of creating a separate git repo, specific to this
> utility, but have no issues working from a fork on Github either.
> 
> Kind Regards,
> 
> Kushal Pisavadia



Re: Performance regression tests

2010-05-12 Thread Todd Lipcon
Hey Johan,

A Hudson plugin would be great. A short-term solution, though, would
be to simply use the existing support for Hudson graphing from
properties files: http://wiki.hudson-ci.org/display/HUDSON/Plot+Plugin

At a previous job we used this plugin to plot web page response
times, and it served its purpose well.
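
For illustration, a minimal sketch of what the test job could write out for the
plugin to pick up. It assumes the plugin reads one data point per properties
file from a YVALUE key, with an optional URL key; please check the plugin page
for the exact format. The function name and file path are made up for the
example.

```python
def write_plot_properties(path, value, url=None):
    """Write one data point as a Java-style properties file.

    Assumes the Plot plugin reads the value from a YVALUE key and an
    optional link from a URL key.
    """
    lines = ["YVALUE={}".format(value)]
    if url is not None:
        lines.append("URL={}".format(url))
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

Each build would write one such file per metric, for example
write_plot_properties("read_latency.properties", 2.4), and the plugin would
plot the values across builds.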

-Todd

On Wed, May 12, 2010 at 1:13 AM, Johan Oskarsson wrote:
> I've started looking into this issue. My current thinking is as follows.
>
> Add support for Cassandra in Whirr: 
> http://wiki.apache.org/incubator/WhirrProposal
> This would allow us to start a short-lived Cassandra cluster on one of the 
> cloud services (EC2, Rackspace, etc.) for testing.
> Real hardware would of course be better, but this is a good starting point.
>
> For running the actual tests I have been looking at YCSB: 
> http://github.com/brianfrankcooper/YCSB
> I've added support for Cassandra trunk as of last week and am now working, 
> off and on, on a measurements export function so we can get the results 
> as a JSON file. It's fairly straightforward.
>
> The best way to expose these results as graphs etc and raise an error if they 
> are unexpected would be a plugin to Hudson. That way all our test results are 
> in one place.
> Other projects such as HBase might be interested in contributing to a 
> Hudson-YCSB plugin. This would probably be best done as a separate project, 
> on GitHub for example.
>
> If we want further results on how performance is affected by failures we 
> could run with
> http://github.com/toddlipcon/gremlins
> or
> https://issues.apache.org/jira/browse/CASSANDRA-561
>
>
> Thoughts?
>
> /Johan
>
> On 11 May 2010, at 20:38, Kushal Pisavadia wrote:
>
>> Hi,
>>
>> Due to conflicting schedules, I was unable to take part in the GSoC this
>> year. However, I'm still very interested in helping out the community for
>> this specific case.
>>
>> Rather than just coding up a solution that would suit my own needs, I'm
>> here asking for some help.
>>
>> What short-term goals do you have in mind? What long-term goals do you have
>> in mind?
>>
>> I've had a look at the respective ticket —
>> https://issues.apache.org/jira/browse/CASSANDRA-875 — but rather than just
>> refactoring the py_stress utility I'd like to make something that fulfils
>> whatever needs the current utility fails to meet.
>>
>> I'm also curious about how you'd like me to commit/expose my code.
>> Originally I was thinking of creating a separate git repo, specific to this
>> utility, but have no issues working from a fork on Github either.
>>
>> Kind Regards,
>>
>> Kushal Pisavadia
>
>



-- 
Todd Lipcon
Software Engineer, Cloudera