On Wed, 5 Jan 2022 at 04:17, Nathan Hartman <hartman.nat...@gmail.com> wrote:

> On Tue, Jan 4, 2022 at 9:17 PM Daniel Shahaf <d...@daniel.shahaf.name>
> wrote:
> >
> > Could someone take over, please?
> >
> > Looking at previous dev@ discussions of this ticket, we should also
> > double check a copy of the mboxes lives on some box that's owned by
> > Infra and backed up by Infra.  (svn-qavm doesn't qualify)
>
>
> > Thanks
> >
> > Daniel
> >
> >
> > Drew Foulks (Jira) wrote on Tue, 04 Jan 2022 13:24 +00:00:
> > > > [ https://issues.apache.org/jira/browse/INFRA-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> > >
> > > Drew Foulks updated INFRA-20213:
> > > --------------------------------
> > >     Review Date:   (was: 18/Nov/21)
> > >          Status: Waiting for user  (was: Waiting for Infra)
> > >
> > > The archives from svn-qavm have been uploaded to lists. Please verify
> > > that everything looks as expected.
> > >
> > >> Backfill mailing list archives with the pre-Apache list archives
> > >> ----------------------------------------------------------------
> > >>
> > >>                 Key: INFRA-20213
> > >>                 URL: https://issues.apache.org/jira/browse/INFRA-20213
> > >>             Project: Infrastructure
> > >>          Issue Type: Task
> > >>          Components: Backups, Mail Archives
> > >>            Reporter: Daniel Shahaf
> > >>            Assignee: Drew Foulks
> > >>            Priority: Major
> > >>              Labels: ReviewedByInfra
> > >>
> > >> The Subversion project would like the archives of its pre-Apache
> > >> public lists to be backed up on Apache hardware.  That covers the
> > >> period between April 2000 and the creation of the
> > >> *@subversion.apache.org mailing lists in November 2009.
> > >> The data is on svn-qavm.apache.org:/x1/svn-haxx-se-mirror/ in the
> > >> form of monthly gzipped mboxes (the same format as mod_mbox's
> > >> backend).  The overall size is about 240MB compressed.  The data in
> > >> 200004.gz through 200910.gz (inclusive) is entirely new to ASF.  The
> > >> data in 200911.gz and later contains messages both from the
> > >> @subversion.apache.org lists (which are already in ASF's archives,
> > >> of course) and from the pre-Apache lists.
> > >> Ideally, we'd like the dev/ and users/ archives to be merged with
> > >> the archives of the dev@subversion.a.o and users@subversion.a.o
> > >> mailing lists.  Would that be possible?
> > >> Please ignore the directory org/ for now.  (The archives of org/
> > >> don't correspond to an *@subversion.a.o mailing list, so they'll
> > >> want to be handled separately.)
> > >> Thanks!
> > >
> > >
> > >
> > > --
> > > This message was sent by Atlassian Jira
> > > (v8.20.1#820001)
>
>
> Did they really do it? Woohoo!
>

It sure looks like that :-) Thanks to everyone involved in making this
happen!


> For context, some time ago we asked Infra to backfill the
> lists.apache.org archives with our pre-ASF list archives and it
> appears that they have just done that for dev@ (from April 2000) and
> users@ (from July 2003).
>
> I browsed and manually compared a few samples to the same ones in our
> svn.haxx.se archives and it looks reasonable to me. The very first
> message in each list is the same in both archives. The only
> discrepancy I see is that lists.a.o seems to have *more* emails than
> svn.haxx.se for November 2009; I think there might have been a period
> of overlap when lists existed at both apache.org and tigris.org; if
> that's correct then it would account for that.
>
> In my testdrive, I don't see any problems. I'll wait a while before
> replying on INFRA-20213 to give others time to check it.
>

Well... I immediately spotted a problem.

The e-mail https://lists.apache.org/thread/km3dkh2b7cqddyp45n6n7x13n2wmxwcy
contains another e-mail as an attachment and I can't figure out how to view
it in lists.a.o. Compare to
https://svn.haxx.se/dev/archive-2005-01/1170.shtml. Now, the attachment is
actually there since I can select the raw view (
https://lists.apache.org/api/source.lua?id=km3dkh2b7cqddyp45n6n7x13n2wmxwcy)
to see it, so it looks like a presentation issue rather than an issue with
backfilling the old mboxes. I'm going to report this to users@ponymail.a.o
[1].
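
For anyone who wants to check other messages the same way, here is a
minimal sketch of how an attached e-mail can be pulled out of the raw
source with Python's standard email module. It assumes the raw text has
already been saved from the raw view; the function name and the sample
message are made up for illustration and are not taken from the archive.

```python
from email import message_from_string, policy


def attached_messages(raw_source):
    """Return the e-mails nested as message/rfc822 parts in raw_source."""
    msg = message_from_string(raw_source, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries exactly one nested message
            # as the first (and only) element of its payload list.
            found.append(part.get_payload(0))
    return found


# Hypothetical sample: an outer mail carrying one attached mail.
SAMPLE = (
    "From: outer@example.org\n"
    "Subject: outer message\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "See the attached mail.\n"
    "--XYZ\n"
    "Content-Type: message/rfc822\n"
    "\n"
    "From: inner@example.org\n"
    "Subject: inner message\n"
    "\n"
    "Body of the attached mail.\n"
    "--XYZ--\n"
)

if __name__ == "__main__":
    for inner in attached_messages(SAMPLE):
        print(inner["Subject"])
```

If the function returns a non-empty list for a raw source, the attachment
survived the backfill and only the web view is failing to show it.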

Apart from that I don't see any immediate problems. It is a bit tricky to
compare straight away since there is a slight difference between lists.a.o
and svn.haxx.se on how to divide messages into months.
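
I have not checked why the month boundaries differ. One possible cause,
which is purely an assumption on my part, is that one archive files a
message under the month of its Date header as written (sender's local
time) while the other converts to UTC first. The sketch below only
illustrates that effect; the function name and the sample date are
hypothetical.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def month_keys(date_header):
    """Return (month as written in the header, month after converting to UTC)."""
    sent = parsedate_to_datetime(date_header)
    # Month according to the sender's own UTC offset.
    local_key = sent.strftime("%Y%m")
    # Month after normalising the same instant to UTC.
    utc_key = sent.astimezone(timezone.utc).strftime("%Y%m")
    return local_key, utc_key


if __name__ == "__main__":
    # A mail sent late on 31 January in a UTC-8 zone is already 1 February in UTC.
    print(month_keys("Mon, 31 Jan 2005 20:30:00 -0800"))
```

A message like the sample would land in 200501 under one rule and in
200502 under the other, which would be enough to shift per-month counts
without any message being lost.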

> What I can't verify is whether this is stored and backed up the same
> way as the 2009-present archives. I guess only Infra can tell us that.
>

Greg Stein has added this to the Jira ticket (as a question to Drew
Foulks). I'd suggest that we wait for Drew's response and then close the
ticket.

Kind regards,
Daniel


[1] https://lists.apache.org/thread/74swv7fyx2t540z89nvwv7qzclzpc8oc
