Hi,
I would be interested in per-package-and-version download statistics and
trends as well.
On 2025-05-03 09:28, Philipp Kern wrote:
The problem is that we currently do not want to retain this data.
You're absolutely right here, there is no point in retaining the raw
data, it gets sta
Memory usage approximations:
per tuple:
ipv6 = 16
package pointer = 3 (assuming <16777216 packages)
version pointer = 2 (assuming <65536 distinct version names)
+ some overhead
=> ~ 40 B seems fair?
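
To make the arithmetic above concrete, here is a rough sketch in Python. The field sizes and the ~40 B per-tuple guess come from the estimate above; the function names and the tuple count in the usage example are hypothetical and only there for illustration.

```python
# Back-of-the-envelope check of the per-tuple estimate quoted above.
IPV6_BYTES = 16           # a full IPv6 address
PACKAGE_PTR_BYTES = 3     # 3 bytes can index up to 2**24 = 16777216 packages
VERSION_PTR_BYTES = 2     # 2 bytes can index up to 2**16 = 65536 version names
ASSUMED_TUPLE_BYTES = 40  # the "~40 B" guess, i.e. payload plus some overhead


def payload_bytes() -> int:
    """Bytes per tuple before any container or allocator overhead."""
    return IPV6_BYTES + PACKAGE_PTR_BYTES + VERSION_PTR_BYTES


def estimate_mib(tuple_count: int, bytes_per_tuple: int = ASSUMED_TUPLE_BYTES) -> float:
    """Total memory in MiB for tuple_count tuples at bytes_per_tuple each."""
    return tuple_count * bytes_per_tuple / (1024 * 1024)


if __name__ == "__main__":
    hypothetical_count = 1_000_000  # made-up number, not taken from the thread
    print("payload per tuple:", payload_bytes(), "B")
    print("estimated total:", round(estimate_mib(hypothetical_count), 1), "MiB")
```

The raw payload is 21 B, so the 40 B figure leaves roughly as much again for overhead. Whether that is enough depends on the data structure actually used.
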
But you could also just write to disk. It'll wear out an SSD though,
and random r/w on a harddrive i
On Sat, 2025-05-03 at 11:16 +0200, Erik Schulz wrote:
> I suspect that compliance with GDPR would require the data to be
> stored minimally.
> It seems reasonable to me that a 24-hour window would reduce most
> repeat-downloads.
> If you stream the request log and reduce to (ip,package,version), it
On 03/05/2025 02:35, Otto Kekäläinen wrote:
I am also interested in usage statistics. I feel it is much more
meaningful to work on packages that I know have a lot of users.
+1
While neither popcon nor download stats are accurate, they still show
trends and relative numbers which can be used
I suspect that compliance with GDPR would require the data to be
stored minimally.
It seems reasonable to me that a 24-hour window would reduce most
repeat-downloads.
If you stream the request log and reduce to (ip,package,version), it
will be minimal.
I think it would fit into memory, e.g. 10 mill
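
One way to picture the reduction described above is the following Python sketch. It assumes the request log has already been parsed into (timestamp, ip, package, version) events that arrive in time order; the function name, the event layout, and the way the 24-hour window is applied are assumptions for illustration, not an existing tool.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str, str]              # (ip, package, version)
Event = Tuple[datetime, str, str, str]  # (timestamp, ip, package, version)


def count_downloads(events: Iterable[Event],
                    window: timedelta = timedelta(hours=24)) -> Counter:
    """Count downloads per (package, version), ignoring repeats of the same
    (ip, package, version) within `window` of the last counted download.

    Assumes events arrive in timestamp order. Only the reduced key and one
    timestamp are kept per tuple; the raw log line is never stored.
    """
    last_counted: Dict[Key, datetime] = {}
    counts: Counter = Counter()
    for ts, ip, package, version in events:
        key = (ip, package, version)
        previous = last_counted.get(key)
        if previous is None or ts - previous >= window:
            counts[(package, version)] += 1
            last_counted[key] = ts
    # A long-running service would also evict keys older than `window`
    # to bound memory; that step is left out of this sketch.
    return counts
```

For example, three requests for the same package and version from one address at 09:00, 10:00, and 10:00 the next day would be counted twice, since only the second request falls inside the 24-hour window.
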
On 2025-05-03 03:35, Otto Kekäläinen wrote:
I'm interested in package popularity. I'm aware of popcon
(https://popcon.debian.org/), but I'm more interested in actual
downloads.
I am also interested in usage statistics. I feel it is much more
meaningful to work on packages that I know have a
> I'm interested in package popularity. I'm aware of popcon
> (https://popcon.debian.org/), but I'm more interested in actual
> downloads.
I am also interested in usage statistics. I feel it is much more
meaningful to work on packages that I know have a lot of users.
While neither popcon nor d
> misguided popularity
I would argue a more objective description is that the measurement has bias.
I.e.
- repeat-download bias.
- external-download bias, when using mirrors.
- false-download bias, when malicious actors try to manipulate the
value, for example using many IPs.
I agree that install
I presume to do some misguided popularity ranking like PyPI does, by counting the
number of downloads.
It works terribly because large organizations that actually download it many
times will set up internal mirrors, so there is no chance for the value to
have any meaning.
Also on pypi and similar
On 2025-04-23 10:08, Erik Schulz wrote:
I'm interested in package popularity. I'm aware of popcon
(https://popcon.debian.org/), but I'm more interested in actual
downloads.
What would this be useful for? You only described technical details, not
why we would want to do this.
Kind regards
Phi