On Tue, 2023-02-14 at 17:53 +0100, Alexis Lothoré via
lists.openembedded.org wrote:
> From: Alexis Lothoré <alexis.loth...@bootlin.com>
> 
> This v2 does not contain any change in patches content, it only sets the From:
> field correctly. Sorry for the noise.
> 
> This patch series is a proposal linked to the discussion initiated here:
> https://lists.yoctoproject.org/g/automated-testing/topic/96652823#1219
> 
> After integration of some improvements on regression reporting, it has been
> observed that the regression report of version 4.2_M2 is way too big. When
> checking it, it appears that a big part of the report is composed of "missing
> tests" (regression detected because test status changed from "PASS" to "None").
> It is mostly due to oeselftest results, since oeselftest is run multiple times
> for a single build, but not with the same parameters (so not the same test
> "sets"), so those test sets are not comparable.
> 
> The proposed series introduces OESELFTEST_METADATA, appended to test results
> when the TEST_TYPE is "oeselftest". An oeselftest result with this metadata
> looks like this:
>       [...]
>       "configuration": {
>               "HOST_DISTRO": "fedora-36",
>               "HOST_NAME": "fedora36-ty-3",
>               "LAYERS": {
>                       [...]
>               },
>               "MACHINE": "qemux86",
>               "STARTTIME": "20230126235248",
>               "TESTSERIES": "qemux86",
>               "TEST_TYPE": "oeselftest",
>               "OESELFTEST_METADATA": {
>                   "run_all_tests": true,
>                   "run_tests": null,
>                   "skips": null,
>                   "machine": null,
>                   "select_tags": ["toolchain-user", "toolchain-system"],
>                   "exclude_tags": null
>               } 
>       }
>       [...]
> 
> Additionally, the series now makes resulttool look at a METADATA_MATCH_TABLE,
> which tells it that when the compared test results have a specific TEST_TYPE,
> it should look for some specific metadata to know whether the tests can be
> compared or not. This removes all the false positives in regression reports
> due to tests present in base results but not found in target results because
> of skipped tests/excluded tags.
> 
> * this series prioritizes backward compatibility: if the base test is older
> (i.e. it does not have the needed metadata), tests are considered "comparable"
> * in addition to the tests added in oeqa test cases, some "best effort" manual
> testing has been done, with the following cases:
>   - run a basic test (e.g. `oeselftest -r tinfoils`), collect test result,
>     break test, collect result, ensure tests are compared. Change oeselftest
>     parameters, ensure tests are not compared
>   - collect base and target test results from 4.2_M2 regression report,
>     manually add new metadata to some tests, replay regression report, ensure
>     that regressions are kept or discarded depending on the metadata

I think this is heading in the right direction. Firstly, can we put
some kind of test script into OE-Core for making debugging/testing this
easier?

I'm wondering if we can take some of the code from qa_send_email and
move it into OE-Core such that I could do something like:

show-regression-report 4.2_M1 4.2_M2

which would then resolve those two tags to commits, find the
testresults repo, fetch the data with depth 1, then call resulttool
regression on them.
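
As a rough sketch only, the steps of such a wrapper could be laid out as
below. The helper names are hypothetical, and how a commit maps to a ref or
directory in the testresults repo is deliberately left to the caller, since
that is not spelled out here. It assumes "resulttool regression" takes a base
and a target result path, and that "origin" in the local testresults checkout
points at the testresults repo.

```python
import subprocess


def resolve_tag(tag, repo_dir="."):
    """Resolve a tag such as 4.2_M1 to the commit it points at."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "-n", "1", tag],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()


def fetch_commands(results_dir, refs):
    """Build one shallow fetch per ref, so only the needed data is pulled.

    Which refs hold the results for a given commit is an assumption left
    to the caller.
    """
    return [
        ["git", "-C", results_dir, "fetch", "--depth", "1", "origin", ref]
        for ref in refs
    ]


def regression_command(base_path, target_path):
    """Build the resulttool call comparing base results to target results."""
    return ["resulttool", "regression", base_path, target_path]
```

Building the commands as lists keeps them easy to print for debugging before
handing each one to subprocess.run().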

I did that manually to experiment. I realised that if we do something
like:

    if "MACHINE" in base_configuration and "MACHINE" in target_configuration:
        if base_configuration["MACHINE"] != target_configuration["MACHINE"]:
            print("Skipping")
            return False

in metadata_matches() we can skip a lot of mismatched combinations even
with the older test results. I think we also should be able to use some
pattern matching to generate a dummy OESELFTEST_METADATA section for
older data which doesn't have it. For example, the presence of meta_ide
tests indicates one particular type of test. Combined with the MACHINE
match, this should let us compare old and new data? That would mean
metadata_matches() would need to see into the actual results too.
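
To make the idea concrete, here is a minimal sketch, not the code from the
series. It assumes each test result is a dict with a "configuration" section
and a "result" section keyed by test name. guess_oeselftest_metadata() and the
dummy values it returns for meta_ide runs are purely illustrative assumptions.

```python
def guess_oeselftest_metadata(results):
    """Return dummy metadata for older results, or None if no guess applies.

    The values returned for meta_ide runs are placeholders, not the real
    parameters used for those runs.
    """
    if any(name.startswith("meta_ide") for name in results):
        return {
            "run_all_tests": False,
            "run_tests": ["meta_ide"],
            "skips": None,
            "machine": None,
            "select_tags": None,
            "exclude_tags": None,
        }
    return None


def metadata_matches(base, target):
    """Decide whether two test results are worth comparing."""
    base_cfg = base["configuration"]
    target_cfg = target["configuration"]

    # Different machines are never comparable, even for older results.
    if "MACHINE" in base_cfg and "MACHINE" in target_cfg:
        if base_cfg["MACHINE"] != target_cfg["MACHINE"]:
            return False

    if (base_cfg.get("TEST_TYPE") == "oeselftest"
            and target_cfg.get("TEST_TYPE") == "oeselftest"):
        # Use recorded metadata when present, else guess from the results.
        base_md = (base_cfg.get("OESELFTEST_METADATA")
                   or guess_oeselftest_metadata(base.get("result", {})))
        target_md = (target_cfg.get("OESELFTEST_METADATA")
                     or guess_oeselftest_metadata(target.get("result", {})))
        if base_md is not None and target_md is not None:
            return base_md == target_md

    # Unknown metadata on either side: stay comparable, as for older results.
    return True
```

When no guess can be made, the sketch keeps the backward-compatible
behaviour and treats the results as comparable.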

Does that make sense?

Cheers,

Richard
