Hi,

The previous thread covered a few topics; in this one I'd like to focus on the data collected. So far people have indicated a few different kinds of data they'd find useful. However, I don't think enough attention has been paid to explaining why they need the data and how they'd use it.
I think we shouldn't collect any data unless we have a good plan for how we'd use it. In this thread, I'd like to collect ideas on what data to collect and how it could realistically be used.

I'm going to start with the data and uses I can think of. Please reply with other things you can think of.

1) List of selected packages (@world)

We would use this to determine the popularity of individual packages. By also scanning their dependencies, we would be able to produce combined statistics covering direct usage plus usage as a dependency of other selected packages.

This would allow us to judge which packages need more of our attention. For example, as we port Python packages to Python 3.8, the packages with more declared users would be ported first.

2) USE flags on installed packages (disabled/default/enabled)

This would allow us to determine which flags users are most likely to actually rely on. This could inform which flag combinations we test, which defaults we set, and what level of support individual flags require.

For example, if OCaml bindings on some package are broken and require a lot of work, I would find it useful to know how likely it is that anyone is using them. Or if a lot of people are enabling the 'frobnicate' flag, I could consider employing USE defaults.

3) System profile

This would primarily allow us to establish how the transition to new profiles is proceeding, and could influence the decision on prolonging support for old ones. As a side effect, we'd have stats on how popular different architectures are.

For example, it would help us see whether people are moving away from amd64 17.0 to 17.1.

4) Arch - installed package correlation

This one could be considered a bit invasive, but it would help us determine how important it is to keep particular arch keywords on a package.

For example, package A breaks on SPARC. Fixing it would require significant effort. If we know it has users on SPARC, we're more likely to put in that effort; otherwise, we may just drop the SPARC keywords and move on.
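To make the combined statistics from item 1 a bit more concrete, here is a rough sketch of how direct usage and usage-via-dependency could be counted. Everything in it is an assumption for illustration: the function name `combined_usage`, the shape of the submitted reports (one list of selected packages per system), and the simplified dependency map, which ignores USE conditionals, slots and any-of groups.

```python
from collections import Counter


def combined_usage(world_sets, deps):
    """Return (direct, total) counters over a set of submitted reports.

    direct[pkg] = number of systems that list pkg in @world
    total[pkg]  = number of systems that need pkg at all, either
                  directly or as a dependency of a selected package
    """
    direct = Counter()
    total = Counter()
    for world in world_sets:
        selected = set(world)
        direct.update(selected)

        # Walk the (hypothetical, simplified) dependency graph starting
        # from the selected packages, visiting each package once.
        needed = set()
        stack = list(selected)
        while stack:
            pkg = stack.pop()
            if pkg in needed:
                continue
            needed.add(pkg)
            stack.extend(deps.get(pkg, ()))
        total.update(needed)
    return direct, total


# Made-up example data, not real statistics.
deps = {
    "dev-python/requests": ["dev-python/urllib3", "dev-lang/python"],
    "dev-python/urllib3": ["dev-lang/python"],
    "app-editors/vim": [],
}
reports = [
    ["dev-python/requests", "app-editors/vim"],
    ["dev-python/urllib3"],
    ["app-editors/vim"],
]

direct, total = combined_usage(reports, deps)
for pkg in sorted(total):
    print(f"{pkg}: direct={direct[pkg]} total={total[pkg]}")
```

Counting each package at most once per system keeps a package that is both selected and pulled in as a dependency from being counted twice.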
That's all the really useful stuff I can think of right now. What's your angle?

-- 
Best regards,
Michał Górny