Package: lintian
Version: 2.5.50.3
Severity: wishlist

Many contributors have multiple emails (common case being "non-DD"
email and a "DD" email).  However, lintian.d.o generates a maintainer
page per unique email and not per contributor.

It would be better to have one page per contributor.  We can do this
by exporting the following data from contributors.debian.org[1].  Using
this dataset we can merge multiple emails into one contributor by
comparing the "user.email" value.  If two entries have the same
"user.email" value, then they belong to the same contributor.
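
A minimal sketch of that grouping step, in Python purely for
illustration (lintian itself is not written in Python).  It assumes the
export has the shape of the JSON example in this report; the function
name group_emails_by_contributor is hypothetical.

```python
from collections import defaultdict


def group_emails_by_contributor(payload):
    """Map each "user.email" value to the set of emails listed for it.

    `payload` is the decoded JSON export; only entries of type "email"
    are considered.
    """
    groups = defaultdict(set)
    for entry in payload.get("results", []):
        if entry.get("type") != "email":
            continue
        # Entries sharing the same user.email belong to one contributor.
        groups[entry["user"]["email"]].add(entry["name"])
    return dict(groups)
```

Each key of the result identifies one contributor, and the value is
every email the dataset associates with that contributor.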

Access to the dataset is a privacy concern, so:
 * The dataset (and related log files) should preferably be readable at
   most by the lintian maintainers on lindsay.d.o.
   - Technically, lindsay is DD-only, but I would still feel better with
     0750 over a 0755 permission.
 * We should not present any data from the dataset except where it is
   already public and exposed by the current implementation.
   - Example: Currently, I don't use my nthyk...@debian.org email for
     packaging, so that email must not appear on lintian.d.o even though
     the dataset lists it as an email associated with me.
   - On the flip side, lintian.d.o would expose ni...@thykier.net as I
     use that for packaging (like it did previously).
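
The second rule could be enforced along these lines.  This is only a
sketch in Python: it assumes `groups` maps each contributor (keyed by
"user.email") to the set of emails the dataset lists for them, and that
`packaging_emails` is the set of emails lintian.d.o already exposes.
Both names, and the function name, are hypothetical.

```python
def merge_public_emails(groups, packaging_emails):
    """Return one sorted list of emails per maintainer page.

    Only emails already present in `packaging_emails` are ever returned;
    the dataset is used solely to decide which of them belong together.
    """
    pages = []
    for emails in groups.values():
        # Drop every email that is not already public via packaging.
        visible = emails & packaging_emails
        if visible:
            pages.append(sorted(visible))
    # Packaging emails unknown to the dataset keep their own page.
    known = set().union(*groups.values())
    for email in packaging_emails - known:
        pages.append([email])
    return sorted(pages)
```

With this approach an email that appears only in the dataset never
reaches the output, because it is not in `packaging_emails`.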

In the short term, we can do manual exports of the data (for
prototyping/testing).  Long term, we should set up some sort of batch
job with a service account to pull this data.  The latter probably needs
help from DSA and/or the maintainers of contributors.d.o.
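
For the prototyping phase, a manual export could be scripted roughly as
below.  This is a sketch using only the Python standard library; it
assumes the sso.d.o client certificate and key are available as PEM
files, and the function names and file-path parameters are hypothetical.
How a service account would authenticate is an open question and is not
covered here.

```python
import json
import ssl
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://contributors.debian.org/api/identifiers/"


def build_export_url(full_export=True):
    """Build the export URL; "limit=none" requests the whole dataset."""
    params = {"format": "json", "type": "email"}
    if full_export:
        params["limit"] = "none"
    return BASE_URL + "?" + urlencode(params)


def fetch_export(cert_file, key_file, full_export=True):
    """Download and decode the export using an SSL client certificate."""
    context = ssl.create_default_context()
    # Assumption: certificate and key are PEM files issued via sso.d.o.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    url = build_export_url(full_export)
    with urllib.request.urlopen(url, context=context) as response:
        return json.load(response)
```

The downloaded file should then be stored with restrictive permissions,
in line with the privacy notes above.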

Thanks,
~Niels

[1] https://contributors.debian.org/api/identifiers/?format=json&type=email&limit=none

 * Warning: large dataset - your browser/editor might not like it (omit
   "limit=none" if you are going to click on the link)
 * Access restrictions: DD-only (Privacy)
 * Authentication: SSL certificate (from sso.d.o)
 * Code: https://anonscm.debian.org/cgit/nm/dc.git/commit/?id=7c58b6ce8092fea3c6902dfe8a10428f7c4d1795
   - Plus some later commits
 * The example below uses "?user__email=nthykier%40debian.org&type=email"
   as filter
   - Data is about me, declassified by me, so its exposure is a non-concern.

JSON example:

{
    "count": 2,
    "next": null,
    "previous": null,
    "results": [
        {
            "type": "email",
            "name": "ni...@thykier.net",
            "user": {
                "email": "nthyk...@debian.org",
                "full_name": "Niels Thykier"
            }
        },
        {
            "type": "email",
            "name": "nthyk...@debian.org",
            "user": {
                "email": "nthyk...@debian.org",
                "full_name": "Niels Thykier"
            }
        }
    ]
}
