Hi again Lucas,

maybe my mail slipped through, but basically the difference between the
two importers is that you are just parsing

   https://ci.debian.net/data/status/unstable/amd64/packages.json

while the Python script is parsing

   https://ci.debian.net/data/status/*/*/packages.json

You even have in line 5

   # FIXME there might be more suites at some point

so you were aware of that issue. Would you mind solving that FIXME?
Sorry, I do not speak Ruby.

Kind regards

      Andreas.

On Tue, Apr 14, 2020 at 06:12:39AM +0200, Andreas Tille wrote:
> Hi Lucas,
>
> On Mon, Apr 13, 2020 at 10:37:57PM +0200, Lucas Nussbaum wrote:
> > I'm sorry if I haven't paid enough attention. But what is the difference
> > with the 'ci' importer?
> >
> > https://salsa.debian.org/qa/udd/-/blob/master/rimporters/ci.rb
>
> I think the problem is that UDD is not documented and I simply did not
> know about the ci table. :-(
>
> However, when looking at it there is a difference:
>
> udd=# select status, arch, count(*) from ci group by status, arch order by
> status, arch;
>  status  | arch  | count
> ---------+-------+-------
>  fail    | amd64 |   925
>  neutral | amd64 |  1593
>  pass    | amd64 | 10805
>  tmpfail | amd64 |    14
> (4 rows)
>
>
> udd=# select status, architecture, count(*) from autopkgtest group by status,
> architecture order by status, architecture;
>  status  | architecture | count
> ---------+--------------+-------
>  fail    | amd64        |  1561
>  fail    | arm64        |  1471
>  fail    | ppc64el      |   711
>  neutral | amd64        |  2879
>  neutral | arm64        |  1322
>  neutral | ppc64el      |   298
>  pass    | amd64        | 21373
>  pass    | arm64        | 11114
>  pass    | ppc64el      |  2458
>  tmpfail | amd64        |    11
>  tmpfail | arm64        |    85
>  tmpfail | ppc64el      |     7
> (12 rows)
>
> I guess we should merge both and make sure that all data is imported.
> I would never have written a new importer if I had been aware
> of the existing one - but I do not speak Ruby to fix the existing one.
>
> Kind regards
>       Andreas.
>
> > On 11/04/20 at 07:12 +0200, Andreas Tille wrote:
> > > Hi Paul,
> > >
> > > thanks for the clarification.
> > > This commit
> > >
> > >    https://salsa.debian.org/qa/udd/-/commit/6a874a89365671dd37a14a9bca25290dc55a1fc9
> > >
> > > imports the current data. I will test this a bit more and then activate
> > > it in a cron job as importer.
> > >
> > > Thanks a lot for your contribution
> > >
> > >       Andreas.
> > >
> > > On Fri, Apr 10, 2020 at 09:05:31PM +0200, Paul Gevers wrote:
> > > > Hi Andreas,
> > > >
> > > > On 09-04-2020 22:53, Andreas Tille wrote:
> > > > > valid_keys = ( 'run_id',
> > > > > #               'created_at',         # Paul Gevers: should be ignored
> > > > > #               'updated_at',         # Paul Gevers: should be ignored
> > > > >                 'suite',
> > > > >                 'arch',
> > > > >                 'package',            # ----> should be renamed to 'source'
> > > > >                 'version',
> > > > >                 'trigger',            # usually package.*version
> > > >
> > > > I expected you to mostly see "" or "migration-reference/0" here, with
> > > > some hand crafted text from random DD's.
> > > >
> > > > >                 'status',
> > > > >                 'requestor',          # 'britney', 'debci' or e-mail
> > > >
> > > > Debian login to be precise, not e-mail.
> > > >
> > > > >                 'pin_packages',       # []
> > > >
> > > > Since a couple of months this json and other pages only show "pure"
> > > > suite runs, so pin_packages is always empty. pin_packages contains which
> > > > packages are taken from another suite than the base suite.
> > > >
> > > > > #               'worker',             # Paul Gevers: should be ignored (is 'null' anyway)
> > > >
> > > > Oh, bug somewhere I guess.
> > > >
> > > > >                 'date',
> > > > >                 'duration_seconds',
> > > > >                 'last_pass_date',
> > > > >                 'last_pass_version',
> > > > >                 'message',            # see below
> > > > >                 'previous_status',
> > > > > #               'duration_human',     # Paul Gevers: duration_seconds and duration_human
> > > > > #                                     # feel redundant and the former is leaner in a database
> > > > > #               'blame',              # Paul Gevers: should be ignored
> > > > >              )
> > > > >
> > > > > # message can be
> > > > > #   $ grep '"message"' packages*.json | sed 's/^.*\.json: *//' | sort | uniq
> > > > > #     "message": "All tests passed"                                        -> "status": "pass"
> > > > > #     "message": "Could not run tests due to a temporary testbed failure"  -> "status": "tmpfail"
> > > > > #     "message": "elbrus"                                                  -> "status": "tmpfail"
> > > > > #     "message": "Erroneous package"                                       -> "status": "fail"
> > > > > #     "message": null                                                      -> "status": "fail"
> > > > > #     "message": "No tests in this package or all skipped"                 -> "status": "neutral"
> > > > > #     "message": "Tests failed",                                           -> "status": "fail"
> > > > > #     "message": "Tests failed, and at least one test skipped"             -> "status": "fail"
> > > > > #     "message": "Tests passed, but at least one test skipped"             -> "status": "pass"
> > > > > #     "message": "Unexpected autopkgtest exit code 20"                     -> "status": "tmpfail"
> > > > >
> > > > > I agree that leaving out worker, which is really always null, makes
> > > > > sense, but I tend to keep message since leaving this out looks like
> > > > > losing information. I tried to find a relation to status but it seems
> > > > > the same status can result in different messages. I think just one
> > > > > additional field will not blow up UDD much more than it currently is -
> > > > > maybe I will consider a normalised form, but usually UDD is not very
> > > > > normalised at all.
> > > >
> > > > In general this is the final output from autopkgtest.
> > > > But, as you see my name there, I had to clean up several times, and
> > > > to be able to find those runs again I added my nick to the message.
> > > > The list thus may change over time.
> > > >
> > > > Paul
> > > >
> > >
> > > --
> > > http://fam-tille.de
> >
>
> --
> http://fam-tille.de

--
http://fam-tille.de
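
P.S. To make the difference between the two importers concrete, here is a
minimal Python sketch of the idea, not the actual importer code. The helper
names status_urls and clean_record are hypothetical, and the suite and
architecture lists passed in are assumptions (the architectures are simply
the ones showing up in the autopkgtest query quoted above). The key list
follows the valid_keys discussion with Paul.

```python
# Sketch only: helper names and the suite/arch inputs are illustrative
# assumptions, not taken from the real UDD importer.
BASE_URL = "https://ci.debian.net/data/status"

# Keys kept from each packages.json record, per the valid_keys discussion.
VALID_KEYS = (
    "run_id", "suite", "arch", "package", "version", "trigger", "status",
    "requestor", "pin_packages", "date", "duration_seconds",
    "last_pass_date", "last_pass_version", "message", "previous_status",
)


def status_urls(suites, arches):
    """Build one packages.json URL per (suite, arch) pair."""
    return [
        f"{BASE_URL}/{suite}/{arch}/packages.json"
        for suite in suites
        for arch in arches
    ]


def clean_record(record):
    """Keep only VALID_KEYS and rename 'package' to 'source'."""
    row = {key: record.get(key) for key in VALID_KEYS}
    row["source"] = row.pop("package")
    return row
```

Calling status_urls(["unstable"], ["amd64", "arm64", "ppc64el"]) yields three
URLs instead of the single unstable/amd64 one, which is the gap the FIXME
refers to.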