[gentoo-dev] [gentoostats continued] Collected data and justification for it

Michał Górny Thu, 07 May 2020 00:30:03 -0700

Hi,

The previous thread covered a few topics; in this one I'd like to focus
on the data collected.  So far people have indicated a few different
kinds of data they'd find useful.  However, I don't think enough
attention has been paid to explaining why they need the data and how
they'd use it.


I think we shouldn't collect any data unless we have a good plan for how
we'd actually use it.  In this thread, I'd like to collect ideas
on what data to collect and how it could realistically be used.

I'm going to start with the data and uses I can think of.  Please reply
with other things you can think of.


1) list of selected packages (@world)

We would use this to determine the popularity of individual packages.
By scanning their dependencies, we could also produce combined
statistics covering both direct selection and use as a dependency of
other selected packages.  This would allow us to judge which packages
need more of our attention.

For example, as we port Python packages to Python 3.8, the packages with
more declared users would be ported first.
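
A minimal sketch of how such combined statistics could be computed,
assuming each submission is simply a list of package names from @world
and that a simplified dependency map is available.  The function name,
the sample submissions, and the dependency map are all hypothetical;
real dependencies are conditional on USE flags and would need proper
resolution.

```python
from collections import Counter


def package_popularity(world_sets, deps):
    """Return (direct, total) counters over all submissions.

    direct: number of systems that list the package in @world.
    total:  number of systems that need the package, either directly
            or as a (transitive) dependency of a selected package.
    """
    direct = Counter()
    total = Counter()
    for world in world_sets:
        direct.update(set(world))
        # Walk the dependency graph from every selected package,
        # counting each package at most once per system.
        seen = set()
        stack = list(world)
        while stack:
            pkg = stack.pop()
            if pkg in seen:
                continue
            seen.add(pkg)
            stack.extend(deps.get(pkg, ()))
        total.update(seen)
    return direct, total


# Hypothetical submissions: one list per reporting system.
world_sets = [
    ["dev-python/requests", "app-editors/vim"],
    ["dev-python/requests"],
    ["dev-python/urllib3"],
]
# Hypothetical, simplified dependency map.
deps = {"dev-python/requests": ["dev-python/urllib3"]}

direct, total = package_popularity(world_sets, deps)
```

Here a package selected directly on only one system can still turn out
to be needed on all three, which is the kind of difference the combined
statistic is meant to expose.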


2) USE flags on installed packages (disabled/default/enabled)

This would allow us to determine which flags users are most likely to
actually rely on.  This could inform which flag combinations we test,
which defaults we set, and what level of support individual flags need.

For example, if OCaml bindings on some package are broken and require
a lot of work, I would find it useful to know how likely it is that
anyone is using them.  Or if a lot of people are enabling the
'frobnicate' flag, I could consider enabling it by default.
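
As a rough sketch of the tally this implies, assume each submission maps
a package to its flags, with each flag reported in one of the three
states named above.  The report layout, the function names, and the
package "dev-libs/foo" are hypothetical.

```python
from collections import Counter, defaultdict


def tally_use_flags(reports):
    """Count, per (package, flag), how many systems report each state."""
    tally = defaultdict(Counter)
    for report in reports:
        for package, flags in report.items():
            for flag, state in flags.items():
                tally[(package, flag)][state] += 1
    return tally


def enabled_share(tally, package, flag):
    """Fraction of reporting systems that explicitly enabled the flag."""
    counts = tally.get((package, flag), Counter())
    reported = sum(counts.values())
    return counts["enabled"] / reported if reported else 0.0


# Hypothetical submissions from four systems.
reports = [
    {"dev-libs/foo": {"ocaml": "default", "frobnicate": "enabled"}},
    {"dev-libs/foo": {"ocaml": "disabled", "frobnicate": "enabled"}},
    {"dev-libs/foo": {"ocaml": "default", "frobnicate": "enabled"}},
    {"dev-libs/foo": {"ocaml": "default", "frobnicate": "default"}},
]

tally = tally_use_flags(reports)
frobnicate_share = enabled_share(tally, "dev-libs/foo", "frobnicate")
ocaml_share = enabled_share(tally, "dev-libs/foo", "ocaml")
```

With this sample, three of four systems enable 'frobnicate' explicitly
while nobody enables 'ocaml', so the first is a candidate for a USE
default and the second for reduced support.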


3) System profile

This would primarily allow us to establish how the transition to new
profiles proceeds and could influence the decision on prolonging
the support for old ones.  As a side effect, we'd have stats on how
popular different architectures are.

For example, it would help us see whether people are moving from
amd64 17.0 to 17.1.


4) Arch - installed package correlation

This one could be considered a bit invasive, but it would help us
determine how important it is to keep particular arch keywords
on a package.

For example, package A breaks on SPARC.  Fixing it would require
significant effort.  If we know it has users on SPARC, we're more likely
to put in that effort; otherwise, we may just drop SPARC keywords and move
on.
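
One possible shape for that lookup, assuming each submission carries the
system's arch together with its installed package list.  The report
layout, the function name, and the package names are hypothetical.

```python
from collections import Counter


def users_per_arch(reports):
    """Count reporting systems per (package, arch) pair."""
    counts = Counter()
    for arch, installed in reports:
        # Use a set so a package counts once per system.
        for package in set(installed):
            counts[(package, arch)] += 1
    return counts


# Hypothetical submissions: (arch, installed packages).
reports = [
    ("sparc", ["app-misc/a", "app-misc/b"]),
    ("amd64", ["app-misc/a"]),
    ("amd64", ["app-misc/a", "app-misc/b"]),
]

counts = users_per_arch(reports)
sparc_users_of_a = counts[("app-misc/a", "sparc")]
```

A count of zero for a (package, arch) pair would support dropping the
keyword, keeping in mind that only systems that submit data are visible.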


That's all the really useful stuff I can think of right now.  What's your
angle?

-- 
Best regards,
Michał Górny
