Hi all -

I’ve spent many weeks on a series of patches whose primary goal is to provide 
very efficient patterns for tests that use databases and schemas within those 
databases, including compatibility with parallel tests, transactional testing, 
and scenario-driven testing (e.g. a test that runs multiple times against 
different databases).

To that end, the two current patches that achieve this behavior in a 
rudimentary fashion are part of oslo.db and are at: 
https://review.openstack.org/#/c/110486/ and 
https://review.openstack.org/#/c/113153/.  They have been in the queue for 
about four weeks now.

The general theory of operation is that within a particular Python process, a 
fixed database identifier is established (currently via an environment 
variable).  As tests request the services of databases, such as a PostgreSQL 
database or a MySQL database, the system will provision a database within that 
backend using that fixed identifier and return it.  The test can then request 
that it make use of a particular “schema” - for example, Nova’s tests may 
request that they are using the “nova schema”, which means that the schema for 
Nova’s model will be created within this database, and will then remain in 
place across the span of many tests which use this same schema.  Only when a 
test requests a different schema, or no schema, will the tables be dropped.  
To ensure the schema is “clean” for every test, the provisioning system runs 
each test within a transaction, which is rolled back at test end.  In order to 
accommodate tests that themselves need to roll back, the test additionally 
runs within the context of a SAVEPOINT.  This system is entirely working, and 
for those who are wondering, yes, it works with SQLite as well (see 
https://review.openstack.org/#/c/113152/).
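
For those curious, here is a minimal sketch of that transaction-plus-SAVEPOINT 
pattern; the engine setup and fixture names are illustrative only, not the 
actual oslo.db API:

    import sqlalchemy as sa
    from sqlalchemy import orm

    # illustrative engine; the real system provisions this per-process
    engine = sa.create_engine("postgresql://scott:tiger@localhost/test_abc")

    class TransactionalTestMixin(object):
        """Illustrative sketch only - not the actual oslo.db fixture."""

        def setUp(self):
            super(TransactionalTestMixin, self).setUp()
            self.connection = engine.connect()
            # outer transaction: rolled back at test end, leaving the
            # schema's tables in place but empty of this test's data
            self.transaction = self.connection.begin()
            # nested SAVEPOINT, so that a test which itself calls
            # rollback() only rolls back to this point
            self.connection.begin_nested()
            self.session = orm.Session(bind=self.connection)

        def tearDown(self):
            self.session.close()
            self.transaction.rollback()
            self.connection.close()
            super(TransactionalTestMixin, self).tearDown()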

And as implied earlier, to ensure the operations upon this schema don’t 
conflict with parallel test runs, the whole thing is running within a database 
that is specific to the Python process.

So instead of the current behavior of generating the entire Nova schema for 
every test, hardcoded to SQLite, a particular test will be able to run itself 
against any specific backend, or all available backends in series, without 
needing to CREATE the whole schema on every test.  This will greatly expand 
database coverage as well as allow database tests to run dramatically faster, 
using entirely consistent systems for setting up schemas and database 
connectivity.

The “transactional test” system is one I’ve used extensively in other 
projects.  SQLAlchemy itself now runs its tests with a py.test-specific 
variant which runs under parallel testing and generates ad-hoc schemas per 
Python process.  The patches above achieve these patterns successfully and 
transparently in the context of OpenStack tests; only the “scenarios” support, 
allowing a single test to run repeatedly against multiple backends, remains a 
todo.

However, the first patch has just been -1’ed by Robert Collins, the author of 
many of the “testtools” family of libraries that are prevalent within 
OpenStack projects.

Robert suggests that the approach integrate with the testresources library: 
https://pypi.python.org/pypi/testresources.  I’ve evaluated this system, and 
after some initial resistance I can see that it would in fact work very nicely 
with the system I have, in that it provides OptimisingTestSuite - a special 
unittest test suite that takes tests like the above which are marked as 
needing particular resources, and sorts them such that individual resources 
are set up and torn down a minimal number of times.  It contains substantial 
algorithmic logic to accomplish this, certainly far beyond what would be 
appropriate to home-roll within oslo.db.
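
In rough terms, usage looks like the following; the database-provisioning 
calls here are hypothetical placeholders, but TestResourceManager, 
ResourcedTestCase and OptimisingTestSuite are real testresources names:

    import testresources

    class DatabaseResource(testresources.TestResourceManager):
        """Hypothetical manager for a provisioned database."""

        def make(self, dependency_resources):
            # invoked only when a test actually needs the resource
            return provision_database()    # hypothetical oslo.db call

        def clean(self, resource):
            drop_database(resource)        # hypothetical oslo.db call

    class SomeDBTest(testresources.ResourcedTestCase):
        # tests declaring the same resource are grouped together by
        # OptimisingTestSuite, so make()/clean() run a minimal number
        # of times
        resources = [("db", DatabaseResource())]

        def test_something(self):
            # the provisioned database arrives as an attribute
            self.assertIsNotNone(self.db)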

I like the idea of integrating this optimization a lot; however, it runs into 
a particular issue which I also hit upon with my more simplistic approach.

The issue is that using a resource like a database schema across many tests 
requires that some kind of logic have access to the test run as a whole.  At 
the very least, a hook that indicates “the tests are done, let’s tear down 
these ad-hoc databases” is needed.

For my first iteration, I observed that OpenStack tests are generally run 
either via testr or via a shell script.  To that end I expanded upon an 
approach that was already present in oslo.db: scripts which provision the 
names of databases to create, and then drop those databases once all tests 
have run.  For testr, I used the “instance_execute”, “instance_dispose”, and 
“instance_provision” hooks in .testr.conf to call upon these sub-scripts:

    instance_provision=${PYTHON:-python} -m oslo.db.sqlalchemy.provision echo $INSTANCE_COUNT
    instance_dispose=${PYTHON:-python} -m oslo.db.sqlalchemy.provision drop --conditional $INSTANCE_IDS
    instance_execute=OSLO_SCHEMA_TOKEN=$INSTANCE_ID $COMMAND

That is, the provisioning system within the tests looks only at 
OSLO_SCHEMA_TOKEN to determine what “name” it is running under.  The final 
teardown is given by instance_dispose, which emits a DROP for any databases 
that were created.  The “echo” command does *not* create a database; it only 
generates identifiers - the databases themselves are created lazily on an 
as-needed basis.
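
A rough sketch of that lazy behavior (these names are hypothetical, not the 
actual provision module):

    import os

    _databases = {}

    def database_for(backend_name):
        # hypothetical lookup keyed on the per-process token
        token = os.environ["OSLO_SCHEMA_TOKEN"]
        key = (backend_name, token)
        if key not in _databases:
            # created lazily on first use; the "echo" step only
            # generated the token itself
            _databases[key] = create_database(    # hypothetical call
                backend_name, "test_%s" % token)
        return _databases[key]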

For systems that use shell scripts, the approach is the same.  Those systems 
would integrate the above three commands into the shell script directly, 
because again, all that’s needed is the OSLO_SCHEMA_TOKEN environment variable 
and then the “drop” step at the end.

Lots of people have complained that I put those hooks into .testr.conf.  Even 
though they do not preclude the use of other systems, people don’t like them 
there, so OK, let’s take them out.  Which leaves us with the question: what 
system *should* we use?

Robert’s suggestion of OptimisingTestSuite sounds great, so let’s see how that 
works.  We have to in fact use the unittest “load_tests()” hook: 
https://docs.python.org/2/library/unittest.html#load-tests-protocol.  It says 
“new in Python 2.7”; I’m not sure whether testrepository honors this protocol 
under Python 2.6 as well (which is my first question).  The hook does not 
appear to be supported by nose, and py.test is already out of the picture - it 
doesn’t integrate with testscenarios and probably not with testresources 
either.

It also means, unless I’m totally misunderstanding (please clarify for me!), 
that integrating the transactional test provisioning system requires projects 
to add a load_tests() function to all of their test modules, or at least to 
the ones that include classes which subclass DbTestCase.  I grepped around and 
found the python-keystoneclient project using this method - there are 
load_tests() functions in many modules, each of which specifies 
OptimisingTestSuite separately.  This already seems less than ideal, in that 
it is no longer enough for a test case to subclass a particular base like 
DbTestCase; the whole thing still won’t work unless this magic load_tests() 
function is also present in the module, with explicit callouts to 
OptimisingTestSuite or some oslo.db hook that does something similar.
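
The per-module boilerplate looks roughly like this (a sketch of the pattern, 
not verbatim code from that project):

    import testresources

    def load_tests(loader, found_tests, pattern):
        # unittest invokes this hook when loading the module; wrapping
        # the discovered tests in an OptimisingTestSuite reorders them
        # so that shared resources are set up and torn down a minimal
        # number of times - but only within this one module
        suite = testresources.OptimisingTestSuite()
        suite.addTests(found_tests)
        return suite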

Additionally, in order to get the optimization to work across multiple test 
modules, the load_tests() functions would have to coordinate such that the 
same OptimisingTestSuite is used across all of them - I have not seen an 
example of this, though again, perhaps an oslo.db helper could coordinate it.  
But it does mean that thousands of tests would now be reordered and controlled 
by OptimisingTestSuite, even those tests within a module that don’t actually 
need it, unless the system can be made more selective, so that only certain 
kinds of test cases are placed into that particular suite (see the sketch 
below).
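
As a thought experiment, such a helper might look like the following; 
everything here besides OptimisingTestSuite itself is made up for 
illustration, including the DbTestCase import path:

    import unittest

    import testresources
    from oslo.db.sqlalchemy import test_base    # import path assumed

    def _iterate(suite_or_test):
        # flatten a possibly-nested TestSuite into individual tests
        if isinstance(suite_or_test, unittest.TestSuite):
            for item in suite_or_test:
                for test in _iterate(item):
                    yield test
        else:
            yield suite_or_test

    def optimize_db_tests(loader, found_tests, pattern):
        """Hypothetical helper a module's load_tests() could delegate to."""
        db_suite = testresources.OptimisingTestSuite()
        remainder = unittest.TestSuite()
        for test in _iterate(found_tests):
            # only database-backed tests go into the optimized suite;
            # everything else keeps its natural order
            if isinstance(test, test_base.DbTestCase):
                db_suite.addTest(test)
            else:
                remainder.addTest(test)
        remainder.addTest(db_suite)
        return remainder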

Anyway, this seems like an invasive and verbose change to be made across 
hundreds of modules, unless there is a better way I’m not familiar with (which 
is my second question).  My approach of using .testr.conf / shell scripts 
allowed the system to work transparently for any test case that subclassed 
DbTestCase, but again, I need an approach the community will approve of, or 
else the effort is obviously wasted.  If you are all OK with putting lots of 
load_tests() functions throughout your test modules, just let me know and I’ll 
go with that.

So, to generalize over questions one and two above, the overarching question I 
have for the OpenStack developer community is this: please tell me what 
system(s) are acceptable here for providing per-test-run fixtures, or for 
otherwise instrumenting the collection of tests!  Would you all be amenable to 
OptimisingTestSuite being injected, with explicit code, across all of your 
test modules that include database-backed tests, or should some other system 
be devised - and if so, what is the nature of that system?  Are there more 
examples I should be looking at?  To be clear, the system I am working on here 
will be the official “oslo.db” system for writing database-enabled tests, 
which can run very efficiently in parallel using per-process schemas and 
transaction-rollback tests.  Ideally we’re going to want this system to be 
available anywhere a test decides to subclass the DB test fixture.

I can make this approach work in many ways, but at this point I’m not willing 
to spend another four weeks on an approach only to have it -1’ed.  Can the 
community please advise on what methodology is acceptable, so that I may spend 
my efforts on a system that can be approved?  Thanks!

- mike