On 13/01/2021 18:02, Marek Marczykowski-Górecki wrote:
> On Wed, Jan 13, 2021 at 03:45:10PM +0000, Andrew Cooper wrote:
>> On 12/01/2021 17:35, Ian Jackson wrote:
>>>  * When appropriately configured, the xen.git build system
>>>    will ship them into dist/install/usr/local/....
>>>
>>>  * There will be a specific directory containing tests which
>>>    are suitable for running automatically:
>>>       dist/install/usr/local/lib/xen/autotests
>>>
>>>  * The pass/success reporting API will be encoded into the
>>>    filename.  For now we define only one API:
>>>      dist/install/usr/local/lib/xen/autotests/simple/*
>> Whatever we go for, it ought to be an easy path, and honestly - that's a
>> mouthful to get to.  These tests need to be just as easy for developers
>> to use.
>>
>> How about opt/xen-autotest/* ?  Much easier in a deb/rpm build to filter
>> the things which clearly shouldn't be in a production system.  It's also
>> invariant of libdir which is an advantage for downstreams who test on
>> more than one distro.
> I can see them being packaged as a separate sub-package instead of not
> being packaged at all. And many distributions have rules forbidding
> native packages from putting stuff into /opt. So a location somewhere
> in /usr would be preferable.
> I'm fine with dist/install/usr/local/lib/xen/autotests.
> For running manually, there could be a trivial test runner (shell
> script) that could be installed into dist/install/usr/local/sbin - to
> make it even more convenient to access.

Whether it's "local" or not is down to --prefix, but I do like the idea
of having a trivial test runner put into sbin for humans to use.

Automation such as OSSTest and XenRT will want to track tests
individually for reporting and/or regression purposes.  But humans
wanting to test their change want an easy "run everything and confirm ok".

This is precisely why `./xtf-runner -aqq` exists.
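
For concreteness, the trivial runner Marek describes could be little
more than the sketch below.  It's in Python here (in the same vein as
xtf-runner) purely for illustration rather than the shell script he
suggests, and it assumes the directory layout and exit statuses from
Ian's proposal:

#!/usr/bin/env python3
# Illustrative "run everything and confirm ok" helper.  Assumes
# --prefix=/usr/local and the exit statuses from the proposal below.
import os, subprocess, sys

AUTOTESTS = "/usr/local/lib/xen/autotests/simple"
NAMES = {0: "SUCCESS", 3: "SKIP", 4: "ERROR", 5: "FAILURE", 6: "CRASH"}

def main():
    failed = False
    for name in sorted(os.listdir(AUTOTESTS)):
        res = subprocess.run([os.path.join(AUTOTESTS, name)],
                             stdout=subprocess.DEVNULL,  # stdout is ignored
                             stderr=subprocess.PIPE)     # stderr gets logged
        status = NAMES.get(res.returncode, "CRASH")   # anything else => CRASH
        print("%-40s %s" % (name, status))
        if res.returncode not in (0, 3):              # 0 and 3 count as pass
            sys.stderr.write(res.stderr.decode(errors="replace"))
            failed = True
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main())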

>>>  * A test may exit with one of the XTF exit statuses:
>>>           0 # XTF SUCCESS
>>>           3 # XTF SKIP
>>>           4 # XTF ERROR
>>>           5 # XTF FAILURE
>>>           6 # XTF CRASH
>>>    (osstest will treat anything other than 0 and 3 as "fail")
>>>    Any other termination should be treated as CRASH.
>>>
>>>  * If a test exits nonzero it should print some information about
>>>    the situation to stderr.
>>>
>>>  * Tests may print other information to stderr, which will be captured
>>>    and logged.
>>>
>>>  * Tests *must not* print anything to stdout.  Test runners *must*
>>>    ignore stdout (though they may log it, possibly mixed in with
>>>    stderr).  stdout is reserved for future extension.
>> I disagree.  It is far more important that the tests are pleasant for
>> humans to use, both in terms of developing them to begin with, and
>> maintaining them if they go wrong.
> I can see why such a restriction would be useful for future extensions,
> but I don't think it's strictly necessary. An extension can, for
> example, require stdout lines with a specific prefix and ignore others.

If it comes down to it, we can define a new ABI if we need to make a
change and can't find a compatible way of doing it.

But I don't see this plausibly happening for "simple".
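
To make the contract concrete, a minimal "simple" test might look like
the sketch below (again Python purely for illustration; the exit values
and the stderr-only rule are the ones from the proposal, and the
/sys/hypervisor check is just a stand-in for whatever the test really
wants to verify):

#!/usr/bin/env python3
# Sketch of a minimal "simple" test: diagnostics go to stderr only, and
# the exit status carries the result.
import sys

def main():
    try:
        with open("/sys/hypervisor/type") as f:
            hyp = f.read().strip()
    except FileNotFoundError:
        print("Not running under a hypervisor, skipping", file=sys.stderr)
        return 3                                      # XTF SKIP

    if hyp != "xen":
        print("Hypervisor is '%s', not Xen" % hyp, file=sys.stderr)
        return 5                                      # XTF FAILURE

    return 0                                          # XTF SUCCESS

if __name__ == "__main__":
    sys.exit(main())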

>>>  * Tests should exit with XTF ERROR if they are passed any arguments.
>> ./test-foo --debug-this-harder would be a perfectly reasonable thing for
>> a human to use to investigate problems, and it is what we use in the
>> fuzzing harnesses for investigating crashes.
> A data point: the libvirt test suite has the very same requirement -
> test binaries need to be run without arguments and error out when they
> get one. In addition to the error, you get a help message listing the
> env variables to use instead (VIR_TEST_DEBUG=1 for example). And I do
> run it with some dummy argument, to get this help message, because I
> can't remember how those variables are named. Every single time.
> But TBH, if the parameters weren't very obvious (like --verbose or
> --debug), I'd run --help first too.
>
> In any case, unknown command line arguments should result in an error.

It's somewhat rude to lumber every test with needing to use getopt, but
I suppose we ought to do something.
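
Something along these lines would probably suffice - no getopt, reject
any argument with XTF ERROR, and take debug knobs from the environment
instead (the XEN_AUTOTEST_DEBUG name is made up here, in the style of
libvirt's VIR_TEST_DEBUG):

import os, sys

# No per-test argument parsing: any argument is an error (XTF ERROR),
# and optional verbosity comes from an environment variable instead.
if len(sys.argv) > 1:
    print("This test takes no arguments.", file=sys.stderr)
    print("Set XEN_AUTOTEST_DEBUG=1 for more verbose output.",
          file=sys.stderr)
    sys.exit(4)                                       # XTF ERROR

DEBUG = os.environ.get("XEN_AUTOTEST_DEBUG", "0") != "0"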

>> The problem is "for these configurations".  Even at the totally basic level,
>>
>> * PV and Shadow are conditional on Xen's Kconfig.
>> * PV32 is further conditional on command line settings, and/or whether
>> the hardware supports CET-SS.
>> * HVM is dependent on Kconfig, hardware, firmware and command line
>> settings.  HAP similarly.
>>
>> `xl create` doesn't cleanly handle missing CONFIG_SHADOW, or PV32
>> being disabled, despite having suitable information available in
>> `xl info`.  While this can (and should) be fixed, it's not helpful for
>> the more general testing case.
> It is indeed an issue, but in practice this can be worked out
> incrementally - when a test fails, see whether the reason is a
> configuration unsupported by the test. If so, add a condition for that
> specific case (to return SKIP or choose another configuration). Rinse
> and repeat.

That isn't robust.  The examples given all yield EOPNOTSUPP, and from
that alone you can't distinguish "Xen really can't do $X" from "someone
regressed Xen's logic", nor can you distinguish "shadow unavailable"
from "HVM unavailable".

Really, what you want is:

if ( xen claims feature $X )
    success = ("test $X" == SUCCESS)
else
    success = ("test $X" == EOPNOTSUPP)

Because that will allow you to spot "someone broke the domain
construction logic around $X".
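
In runner terms that comes out roughly as below (illustrative Python;
xen_claims() and run_test() are hypothetical helpers standing in for an
`xl info` / hypfs query and for actually constructing the guest):

import errno

def check_feature(feature, xen_claims, run_test):
    # xen_claims(feature) -> bool; run_test(feature) -> 0 on success, or
    # an errno value.  The point is the decision logic, not the plumbing.
    rc = run_test(feature)
    if xen_claims(feature):
        return rc == 0                    # claimed => must actually work
    return rc == errno.EOPNOTSUPP         # not claimed => cleanly rejected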


Combined with the orchestration framework's idea of "what should Xen be
able to do", you also have:

success = ("xen claims $X" == "xen ought to claim $X")

which allows you to spot "someone broke Xen's ability to spot/enable $X
on boot".


Other similar testable areas here are:

foreach interesting CPUID bit:
    if ( bit supported in guest )
        success = ("vm configured with $X" == "$X visible in guest") &&
                   ("vm configured without $X" == "$X not visible in guest")
    else
        success = "vm configured with $X" rejected during construction


There is absolutely a time and a place for the simple tests, and there
is a lot of value in having them, but a small piece of C isn't capable
of doing this higher level reasoning.

Reasoning this fine-grained *is* important given the sheer complexity
of virtualisation, and I fully intend that the "simple" and the
"not-quite-so-simple" sets of autotests will be quick enough to amount
to minutes' worth of testing, and will live in the GitLab/patchew
pre-commit testing.

~Andrew

P.S. if it's not already clear by now, I've spent a lot of time trying to
figure out how to turn "something, somewhere, went wrong" bug reports
into something more useful for a test robot to report, at the point that
it has figured out that something went wrong.
