On Wed, 5 Jan 2022 at 04:17, Nathan Hartman <hartman.nat...@gmail.com> wrote:
> On Tue, Jan 4, 2022 at 9:17 PM Daniel Shahaf <d...@daniel.shahaf.name> wrote:
> >
> > Could someone take over, please?
> >
> > Looking at previous dev@ discussions of this ticket, we should also
> > double check a copy of the mboxes lives on some box that's owned by
> > Infra and backed up by Infra. (svn-qavm doesn't qualify)
> >
> > Thanks
> >
> > Daniel
> >
> >
> > Drew Foulks (Jira) wrote on Tue, 04 Jan 2022 13:24 +00:00:
> > > [ https://issues.apache.org/jira/browse/INFRA-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> > >
> > > Drew Foulks updated INFRA-20213:
> > > --------------------------------
> > > Review Date: (was: 18/Nov/21)
> > > Status: Waiting for user (was: Waiting for Infra)
> > >
> > > The archives from svn-qavm have been uploaded to lists. Please verify
> > > that everything looks as expected.
> > >
> > >> Backfill mailing list archives with the pre-Apache list archives
> > >> ----------------------------------------------------------------
> > >>
> > >> Key: INFRA-20213
> > >> URL: https://issues.apache.org/jira/browse/INFRA-20213
> > >> Project: Infrastructure
> > >> Issue Type: Task
> > >> Components: Backups, Mail Archives
> > >> Reporter: Daniel Shahaf
> > >> Assignee: Drew Foulks
> > >> Priority: Major
> > >> Labels: ReviewedByInfra
> > >>
> > >> The Subversion project would like the archives of its pre-Apache
> > >> public lists to be backed up on Apache hardware. That covers the
> > >> period between April 2000 and the creation of the
> > >> *@subversion.apache.org mailing lists in November 2009.
> > >> The data is on svn-qavm.apache.org:/x1/svn-haxx-se-mirror/ in the
> > >> form of monthly gzipped mboxes (the same format as mod_mbox's
> > >> backend). The overall size is about 240MB compressed. The data in
> > >> 200004.gz through 200910.gz (inclusive) is entirely new to ASF.
> > >> The data in 200911.gz and later contains messages both from the
> > >> @subversion.apache.org lists (which are already in ASF's archives,
> > >> of course) and from the pre-Apache lists.
> > >> Ideally, we'd like the dev/ and users/ archives to be merged with
> > >> the archives of the dev@subversion.a.o and users@subversion.a.o
> > >> mailing lists. Would that be possible?
> > >> Please ignore the directory org/ for now. (The archives of org/
> > >> don't correspond to an *@subversion.a.o mailing list, so they'll
> > >> want to be handled separately.)
> > >> Thanks!
> > >
> > >
> > > --
> > > This message was sent by Atlassian Jira
> > > (v8.20.1#820001)
>
> Did they really do it? Woohoo!

It sure looks like that :-) Thanks to everyone involved in making this
happen!

> For context, some time ago we asked Infra to backfill the
> lists.apache.org archives with our pre-ASF list archives and it
> appears that they have just done that for dev@ (from April 2000) and
> users@ (from July 2003).
>
> I browsed and manually compared a few samples to the same ones in our
> svn.haxx.se archives and it looks reasonable to me. The very first
> message in each list is the same in both archives. The only
> discrepancy I see is that lists.a.o seems to have *more* emails than
> svn.haxx.se for November 2009; I think there might have been a period
> of overlap when lists existed at both apache.org and tigris.org; if
> that's correct then it would account for that.
>
> In my testdrive, I don't see any problems. I'll wait a while before
> replying on INFRA-20213 to give others time to check it.

Well... I immediately spotted a problem. The e-mail
https://lists.apache.org/thread/km3dkh2b7cqddyp45n6n7x13n2wmxwcy
contains another e-mail as an attachment and I can't figure out how to
view it in lists.a.o. Compare to
https://svn.haxx.se/dev/archive-2005-01/1170.shtml.
Now, the attachment is actually there, since I can select the raw view
(https://lists.apache.org/api/source.lua?id=km3dkh2b7cqddyp45n6n7x13n2wmxwcy)
to see it, so it looks more like a presentation issue than an issue
with backfilling the old mboxes. I'm going to report this to
users@ponymail.a.o [1].

Apart from that I don't see any immediate problems. It is a bit tricky
to compare directly, since lists.a.o and svn.haxx.se differ slightly in
how they divide messages into months.

> What I can't verify is whether this is stored and backed up the same
> way as the 2009-present archives. I guess only Infra can tell us that.

Greg Stein has added this to the Jira ticket (as a question to Drew
Foulks). I'd suggest that we wait for Drew's response and then close
the ticket.

Kind regards,
Daniel

[1] https://lists.apache.org/thread/74swv7fyx2t540z89nvwv7qzclzpc8oc