Hi Simon,

* Simon Josefsson <si...@josefsson.org> [2025-03-07 18:17]:
> Is it possible from your data sources to filter these two cases apart?

It is not explicitly recorded, but I can deduce it from the data, as I have the name of the .changes file and can take everything before the first underscore (_) as the source package name.
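A minimal sketch of the deduction described above, assuming the usual `source_version_arch.changes` naming; the function name and the example file name are made up for illustration and are not taken from the actual analysis script:

```python
from pathlib import PurePosixPath


def source_package_from_changes(changes_name: str) -> str:
    """Return everything before the first underscore of a .changes file name."""
    # Drop any directory part so only the file name itself is inspected.
    base = PurePosixPath(changes_name).name
    source, sep, _rest = base.partition("_")
    if not sep:
        # No underscore at all: this does not look like a .changes file name.
        raise ValueError(f"no underscore in {changes_name!r}")
    return source


# Hypothetical file name, for illustration only.
print(source_package_from_changes("hello_2.10-3_amd64.changes"))  # hello
```

Since Debian package names cannot contain an underscore, splitting on the first one is unambiguous.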
For the sake of simplicity, I did not split the dataset into monthly chunks. Instead, I binned the processing times by four mutually exclusive outcomes. So, without further ado, these are the percentiles for all uploads to NEW from September 2012 [1] until January 2025:
33743 ACCEPTs
    50% - 4 days, 18:10:30
    90% - 42 days, 3:26:44
    98% - 106 days, 12:47:56

24443 ACCEPTs (binNEW)
    50% - 2 days, 1:25:25
    90% - 13 days, 23:44:49
    98% - 67 days, 23:07:27

6318 REJECTs
    50% - 8 days, 4:03:34
    90% - 98 days, 16:03:15
    98% - 267 days, 4:23:37

1712 REJECTs (binNEW)
    50% - 21:28:34
    90% - 43 days, 0:35:03
    98% - 173 days, 1:30:30

I'm pretty sure that you can fit exponential probability distributions on these, but that is work for another day.
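For readers who want to reproduce this kind of summary, here is a rough sketch of binning processing times by outcome and taking percentiles. The nearest-rank method, the function names, and the sample records are all assumptions; the message does not say how the percentiles were actually computed. The last helper gives the quantiles an exponential distribution with a given median would have, as a possible starting point for the fit mentioned above.

```python
import math
from datetime import timedelta


def percentile(durations, pct):
    """Nearest-rank percentile of a non-empty list of timedeltas."""
    ordered = sorted(durations)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records, pcts=(50, 90, 98)):
    """Group (outcome, duration) pairs by outcome and compute percentiles."""
    bins = {}
    for outcome, duration in records:
        bins.setdefault(outcome, []).append(duration)
    return {
        outcome: {p: percentile(ds, p) for p in pcts}
        for outcome, ds in bins.items()
    }


def exponential_quantile(median, p):
    """Quantile p (0 <= p < 1) of an exponential distribution with this median."""
    rate = math.log(2) / median.total_seconds()
    return timedelta(seconds=-math.log(1 - p) / rate)


# Invented sample data, NOT the real NEW queue records.
records = [
    ("ACCEPT", timedelta(days=1)),
    ("ACCEPT", timedelta(days=2)),
    ("ACCEPT", timedelta(days=10)),
    ("REJECT", timedelta(hours=5)),
]
summary = summarize(records)
for outcome, stats in summary.items():
    for pct, value in stats.items():
        print(f"{outcome} {pct}% - {value}")
```

For an exponential distribution the 90% quantile is ln(10)/ln(2), roughly 3.32, times the median, so comparing `exponential_quantile(median, 0.9)` with the observed 90% value gives a quick plausibility check before attempting a proper fit.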
Cheers
Timo

[1] In case you are wondering what the significance of that date is, it is when the dak log files changed to the current format, and I was too lazy to implement parsing support for the older ones. It also means there are a few false negatives for my detection of binNEW uploads, but I doubt it changes the results by much.
-- 
⢀⣴⠾⠻⢶⣦⠀   ╭────────────────────────────────────────────────────╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling                                       │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1 23BF CC8C 6BDD 1403 F4CA  │
⠈⠳⣄⠀⠀⠀⠀   ╰────────────────────────────────────────────────────╯