On Wed, Nov 17, 2021 at 08:05:38AM +0800, Paul Wise wrote:
> On Tue, 2021-11-16 at 13:38 +0100, Bill Allombert wrote:
>
> > What is the idea exactly ?
>
> Bálint's idea was to ship popcon data in a popcon-stats-data package in
> the Debian archive. I suggested to instead ship that in the apt
> metadata present in the Packages files.
>
> > How often the popcon data are going to be refreshed ?
>
> I would assume with the same frequency as the existing data on the
> popcon.d.o website is refreshed. Anything faster than that would just
> be refreshing unchanged data. Anything slower than that would be
> providing outdated data. Outdated data is fine though, so maybe weekly.
>
> > Which exact set of data are going to be used ?
>
> Initially I thought similar to the QA per-package popcon data:
>
> https://qa.debian.org/popcon.php?package=iotop
>
> Package: iotop
> Popcon: 30314 7962 21197 1143 12
>
> If I massage the by_inst file into the same format as this, I calculate
> that the extra Popcon fields would add 3.7 MB to the Packages files and
> that data would change often, making the apt updating process slower.
> So probably the data should go into new files instead and there should
> be a config file snippet to enable downloading them, a tool to query
> and index them and a way for apt clients to get that data.
>
> Since the Debian repository splits the metadata by suite and component,
> these new statistics should probably do the same. So the raw popcon
> submissions would need to be individually mapped to a suite based on
> the popcon version in the submission, and then each item in the
> submission attributed to that suite/component. For popcon versions that
> don't match a suite, if they match a known Debian version, attribute
> them to the next highest suite and discard submissions with popcon
> versions that were never in Debian, or maybe attribute them to the
> relevant vendor separately. popcon submissions that don't have Debian
> as the vendor probably should be discarded, or maybe attribute them to
> the relevant vendor separately.

So the idea is to have a Popcon file for each suite?

So let's say bookworm is released today. What will bookworm/Popcon
contain?
We release a new popularity-contest package. What will sid/Popcon
contain?
The package migrates to testing. What will testing/Popcon contain?

As I understand it, the metadata for stable is only updated with point
releases. Would that be the same for stable/Popcon?

I still do not quite see how this would work...

We do not want to provide data generated from a very small subset of
reports, for accuracy and privacy reasons. The current
all-popcon-result.gz/stable-popcon-result.gz split is a middle ground
between competing constraints.

Why not instead write a tool to download all-popcon-result.gz or
stable-popcon-result.gz when needed, and cache them? This can then be
processed by a tool that makes suggestions.

Cheers,
-- 
Bill. <ballo...@debian.org>

Imagine a large red swirl here.
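
A minimal sketch of the download-and-cache tool suggested above, in
Python. Everything named here is an assumption for illustration: the
function names fetch_cached and read_counts, the weekly refresh interval,
and the line layout "Package: <name> <vote> <old> <recent> <no-files>"
assumed for the result file. The URL is left to the caller.

```python
import gzip
import os
import time
import urllib.request

MAX_AGE = 7 * 24 * 3600  # assumption: a weekly refresh is good enough


def fetch_cached(url, cache_dir, max_age=MAX_AGE, download=None):
    """Return a local path for url, downloading only when the cached
    copy is missing or older than max_age seconds."""
    if download is None:
        download = urllib.request.urlretrieve
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, os.path.basename(url))
    fresh = (os.path.exists(path)
             and time.time() - os.path.getmtime(path) < max_age)
    if not fresh:
        tmp = path + ".tmp"
        download(url, tmp)      # fetch to a temporary name first
        os.replace(tmp, path)   # then swap it in atomically
    return path


def read_counts(path):
    """Map package name -> (vote, old, recent, no_files).

    Assumes lines of the form
    'Package: <name> <vote> <old> <recent> <no-files>';
    every other line is skipped."""
    counts = {}
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) == 6 and fields[0] == "Package:":
                try:
                    counts[fields[1]] = tuple(int(x) for x in fields[2:])
                except ValueError:
                    continue  # not numeric, so not a data line
    return counts
```

A suggestion tool would call fetch_cached() with the URL of
all-popcon-result.gz or stable-popcon-result.gz and a per-user cache
directory, then rank candidate packages using read_counts(). Nothing is
downloaded again until the cached copy is older than max_age.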