Re: Community renewal and project obsolescence

Rafael Laboissière Thu, 28 Dec 2023 07:58:36 -0800

* M. Zhou <lu...@debian.org> [2023-12-27 19:00]:

Thanks for sharing the figure. The data seems correlated with the number of new Debian accounts. See the figure below:

Python code for this figure:


```
# modified from ChatGPT.
# XXX: members.csv is copy-pasted from https://nm.debian.org/members/
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('members.csv', sep='\t')
df = df[df['Since'] != '(unknown)']  # filter out invalid data
df['Since'] = pd.to_datetime(df['Since'])
df['Year'] = df['Since'].dt.year

# Count accounts per year, then smooth with a 3-year rolling mean.
account_counts = df['Year'].value_counts().sort_index()
smoothed_counts = account_counts.rolling(window=3).mean()

plt.figure(figsize=(10, 6))
plt.bar(account_counts.index, account_counts.values, color='skyblue')
plt.plot(smoothed_counts.index, smoothed_counts.values, color='orange',
         label='Smoothed (Window=3)')
plt.xlabel('Year')
plt.ylabel('Number of Accounts Created')
plt.title('Number of Accounts Created Each Year')
plt.legend()
plt.savefig('nm-year.png')
```

Thanks for the code and the figure. Indeed, the trend is confirmed by fitting a linear model count ~ year to the new members list. The coefficient is -1.39 member/year, which is significantly different from zero (F[1,22] = 11.8, p < 0.01). Even when we take out the data from year 2001, which could be interpreted as an outlier, the trend is still significant, with a drop of 0.98 member/year (F[1,21] = 8.48, p < 0.01).
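For readers who want to reproduce this kind of fit, here is a minimal sketch of an ordinary least-squares fit of count ~ year together with its F statistic. It is not the code behind the numbers above: the function name `fit_trend` is hypothetical, and the counts are made-up toy values, not the real nm.debian.org data.

```python
def fit_trend(years, counts):
    """Least-squares fit of counts ~ years.

    Returns (slope, intercept, F statistic, residual degrees of freedom).
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in years]
    # Split the variation into the part explained by the line and the rest.
    ss_model = sum((f - mean_y) ** 2 for f in fitted)
    ss_resid = sum((y - f) ** 2 for y, f in zip(counts, fitted))
    df_resid = n - 2  # one slope and one intercept are estimated
    f_stat = ss_model / (ss_resid / df_resid)
    return slope, intercept, f_stat, df_resid


# Made-up toy counts for illustration only (NOT the real member data).
years = list(range(2000, 2010))
counts = [60, 55, 58, 50, 52, 45, 47, 40, 43, 38]

slope, intercept, f_stat, df_resid = fit_trend(years, counts)
print(f"slope = {slope:.2f} member/year, F[1,{df_resid}] = {f_stat:.2f}")
```

With 24 yearly counts the residual degrees of freedom would be 22, which matches the F[1,22] notation above. Turning F into a p-value needs the F distribution, which is left out of this sketch.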


Best,

Rafael Laboissière

P.S.1: The correct way to do the analysis above is by using a generalized linear model, treating the counts as coming from a Poisson distribution (or, perhaps, allowing for overdispersed data). I will eventually add this to my code in Git.
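As a rough illustration of what such a model involves, the sketch below fits log(E[count]) = b0 + b1 * (year - mean year) by Newton-Raphson in plain Python and reports a Pearson dispersion estimate. It is only a sketch under stated assumptions: the function name `poisson_trend` is hypothetical, the counts are made-up toy values rather than the real member data, and a real analysis would more likely use a statistics package.

```python
import math


def poisson_trend(years, counts, iterations=25):
    """Poisson regression with log link, fitted by Newton-Raphson.

    Returns (b0, b1, dispersion), where b1 is the yearly change in
    log(expected count) and dispersion is Pearson chi-square / residual df.
    """
    center = sum(years) / len(years)
    x = [y - center for y in years]  # centering keeps the numbers small
    b0, b1 = math.log(sum(counts) / len(counts)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(c - m for c, m in zip(counts, mu))
        g1 = sum((c - m) * xi for c, m, xi in zip(counts, mu, x))
        # 2x2 Fisher information matrix, solved directly for the step.
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    pearson = sum((c - m) ** 2 / m for c, m in zip(counts, mu))
    return b0, b1, pearson / (len(x) - 2)


# Made-up toy counts for illustration only (NOT the real member data).
toy_years = list(range(2000, 2010))
toy_counts = [60, 55, 58, 50, 52, 45, 47, 40, 43, 38]

log_level, log_slope, dispersion = poisson_trend(toy_years, toy_counts)
print(f"yearly rate ratio = {math.exp(log_slope):.3f}, "
      f"dispersion = {dispersion:.2f}")
```

A dispersion well above 1 would point to overdispersed data, in which case a quasi-Poisson or negative binomial model is the usual next step.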

P.S.2: In your Python code, it is possible to get the data frame directly from the web page, without copying and pasting. Just replace the line:


    df = pd.read_csv('members.csv', sep='\t')

by:

    df = pd.read_html("https://nm.debian.org/members/")[0]

I am wondering whether ChatGPT could have figured this out…
