On 08/10/10 22:06, Konstantin Boudnik wrote:
All,
I want to start a discussion about future approaches to Hadoop system
testing (and potentially other types of testing) in 0.22 and later.
As many of you know, a recent development effort by a number of Hadoop
developers has produced a new system test framework, codenamed Herriot.
If you have not heard of it, please see HADOOP-6332 and
http://wiki.apache.org/hadoop/HowToUseSystemTestFramework
Now, Herriot is a great tool that allows much wider and more powerful
inspection of, and intervention into, remote Hadoop daemons (aka
observability and controllability). There's a catch, however: such powers
come at the cost of build instrumentation.
On the other hand, there's a fairly large number of cases where no
introspection into the daemons' internals is required. These can be carried
out through simple interaction with the Hadoop CLI. To name a few: testing
ACL refreshes, basic file ops, etc.
-stuff we aren't testing properly today, you mean.
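
As a rough illustration of the CLI-only style of check mentioned above, here is a minimal sketch in plain Java. It assumes a `hadoop` launcher is on the PATH of the machine running the check; the class and method names (CliSmokeCheck, fsCommand, run) are hypothetical and are not part of Herriot or of Hadoop itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: drives the Hadoop CLI as an external process.
class CliSmokeCheck {

    /** Builds the argument list for a `hadoop fs ...` invocation. */
    static List<String> fsCommand(String... args) {
        List<String> cmd = new ArrayList<>(Arrays.asList("hadoop", "fs"));
        cmd.addAll(Arrays.asList(args));
        return cmd;
    }

    /** Runs the command, appends its combined stdout/stderr to output, returns the exit code. */
    static int run(List<String> cmd, StringBuilder output)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        output.append(new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8));
        return p.waitFor();
    }

    public static void main(String[] argv) throws Exception {
        // Basic file op: list the filesystem root and fail on a non-zero exit code.
        StringBuilder out = new StringBuilder();
        int exit = run(fsCommand("-ls", "/"), out);
        System.out.println(out);
        if (exit != 0) {
            throw new AssertionError("hadoop fs -ls / exited with " + exit);
        }
    }
}
```

The same run-and-check-exit-code pattern could be wrapped in a JUnit test method, so that CLI checks run under the same driver as the other test types.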
My ultimate goal is, essentially, to have a single uniform test
driver/framework (such as JUnit) that controls the execution of all or most
types of tests, from TUT (true unit tests) at one end up to system and,
potentially, load tests.
One benefit of such an approach is that it will make it easier to integrate
other types of testing into the CI infrastructure (read: Hudson). It will
also provide a well-supported test development environment that is familiar
to many, lowering the learning curve for potential contributors who might
want to join the Hadoop community and help us make Hadoop an even better
product.
I'd like some JARs containing tests that could be deployed against a
cluster to QA it, to say "this cluster works", to stress test things,
and do all the multi-host, multi JVM regression testing that we
currently don't have formal test suites for. That will include HtmlUnit
tests against every web page, as well as command line stuff.
I'd also like this stuff to be somewhat independent of how the cluster
gets deployed: you just point the test runner at a list of machines or a
cluster, and it works things out and runs the tests. That way, whatever
CM tooling you have, you can test a cluster
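
To make the "point the runner at a list of machines" idea a bit more concrete, here is a minimal sketch of one possible shape. It assumes the machine list is a plain text file of host:port entries, one per line; that file format and the names ClusterProbe, parseTargets and isReachable are all hypothetical. It only checks that each endpoint accepts a TCP connection, which is far less than a real QA suite would do.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical runner: knows nothing about how the cluster was deployed,
// only which endpoints it is supposed to expose.
class ClusterProbe {

    /** Parses "host:port" entries, one per line; blank lines and '#' comments are skipped. */
    static List<InetSocketAddress> parseTargets(String text) {
        List<InetSocketAddress> targets = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String entry = line.trim();
            if (entry.isEmpty() || entry.startsWith("#")) {
                continue;
            }
            int colon = entry.lastIndexOf(':');
            if (colon < 1) {
                throw new IllegalArgumentException("expected host:port, got: " + entry);
            }
            String host = entry.substring(0, colon);
            int port = Integer.parseInt(entry.substring(colon + 1));
            // Unresolved on purpose: parsing must not depend on DNS.
            targets.add(InetSocketAddress.createUnresolved(host, port));
        }
        return targets;
    }

    /** Returns true if a TCP connection to the target succeeds within the timeout. */
    static boolean isReachable(InetSocketAddress target, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(
                new InetSocketAddress(target.getHostString(), target.getPort()),
                timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ClusterProbe <targets-file>");
            System.exit(2);
        }
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        int failures = 0;
        for (InetSocketAddress target : parseTargets(text)) {
            boolean ok = isReachable(target, 2000);
            System.out.println((ok ? "OK   " : "FAIL ") + target.getHostString() + ":" + target.getPort());
            if (!ok) {
                failures++;
            }
        }
        System.exit(failures == 0 ? 0 : 1);
    }
}
```

Because the only input is the list of endpoints, the same runner works whatever tooling brought the cluster up; richer checks (web pages, CLI operations) would slot in where isReachable is called.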