Re: svn.haxx.se is going away

Daniel Shahaf Mon, 21 Dec 2020 02:04:03 -0800

Daniel Sahlberg wrote on Mon, 21 Dec 2020 08:55 +0100:
> On Fri, 27 Nov 2020 at 19:26, Daniel Shahaf <d...@daniel.shahaf.name> wrote:
> 
> > Sounds good.  Nathan, Daniel Sahlberg — could you work with Infra on
> > getting the data over to ASF hardware?
> >  
> 
> I have been given access to svn-qavm and uploaded a tarball of the website
> (including mboxes). I'm a bit reluctant to unpack it since it takes almost
> 7GB, and there is only 14 GB disk space remaining. Is it ok to unpack or
> should we ask Infra for more disk space?
>


I vote to ask for more disk space, especially considering that some
percentage is reserved for uid=0's use.
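
To put a number on the reserved part, here is a minimal sketch that compares
the blocks free for root with the blocks free for everyone else, using
os.statvfs().  The function names and the 10% safety margin are my own
assumptions for illustration, not anything that exists on svn-qavm.

```python
import os


def disk_headroom(path="/"):
    """Return (free_for_root, free_for_users, reserved) in bytes."""
    st = os.statvfs(path)
    # f_bfree counts every free block; f_bavail excludes the blocks
    # that the filesystem holds back for uid=0.
    free_for_root = st.f_bfree * st.f_frsize
    free_for_users = st.f_bavail * st.f_frsize
    return free_for_root, free_for_users, free_for_root - free_for_users


def fits(required_bytes, path="/", margin=0.10):
    """True if an unprivileged user can write required_bytes plus a margin.

    The 10% margin is an arbitrary illustrative choice.
    """
    _, free_for_users, _ = disk_headroom(path)
    return free_for_users >= required_bytes * (1 + margin)


if __name__ == "__main__":
    root_free, user_free, reserved = disk_headroom("/")
    print("free for root: ", root_free)
    print("free for users:", user_free)
    print("reserved:      ", reserved)
    # Roughly 7 GB for the unpacked tarball, per the figure quoted above.
    print("7 GB fits:     ", fits(7 * 1000**3))
```

Running it as a non-root user on the target filesystem shows how much of the
"14 GB remaining" is actually usable before unpacking.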

> > Note that svn-org@ doesn't have an equivalent @s.a.o list, and that, as
> > mentioned upthread, the post-migration (from tigris.org to apache.org)
> > mboxes may be in a different order than the official ones, and shouldn't
> > be "deduplicated".
> >  
> 
> The mboxes will be preserved but I don't plan to make them available for
> download (since they are not available from lists.a.o or mail-archives.a.o).
> 

Please do make them available for download.  Being able to download the
raw data is useful for both backup and perusal purposes, and I doubt
the bandwidth requirements would be a problem.  (Might want
a robots.txt entry, though?)
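
As a sketch of what such an entry would do, the snippet below feeds
a candidate robots.txt to Python's urllib.robotparser and checks which paths
a crawler may fetch.  The /mbox/ directory and the "ExampleBot" agent name
are hypothetical; the real layout would be whatever the DocumentRoot ends up
looking like.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of the raw mbox downloads
# while leaving the static HTML archive crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mbox/
"""


def allowed(url_path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if `agent` may fetch `url_path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url_path)


if __name__ == "__main__":
    for path in ("/mbox/dev/2020-12.mbox", "/dev/index.shtml"):
        print(path, allowed(path))
```

This only models well-behaved crawlers, of course; anyone who wants the raw
data can still download it, which is the point.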

Regarding the behaviour of the existing archives, see
<https://mail-archives.apache.org/mod_mbox/subversion-dev/202012.mbox>
(which used to also be available via
https://subversion.apache.org/mail/, but nowadays that just redirects
to a landing page ☹).  I don't know whether lists.a.o has equivalent
functionality, but then again, lists.a.o has had vendor lock-in baked
into it from day one, so a lack of a "download raw rfc822 data" feature
might simply be another form of that.

The mod_mbox product is owned by dev@httpd.

> > You indicate a desire to maintain URLs. Do you have some ideas on that?
> >
> > Each individual message .shtml file contains the message-id in
> > a comment.  We can extract the comments and build a redirector around
> > them.  (By the way, this is basically the same exercise that Infra must
> > have solved back when Sebb received that CSV file from the lists.a.o
> > vendor, so there may be an opportunity for code reuse.)  Of course, the
> > full rsync likely has the same info available less scrapily.
> >
> > Or, as mentioned above, the .shtml files could just be preserved
> > statically (plus or minus an appropriate message in the list of years on
> > the /${listname}/ page).  In fact, I'm having trouble coming up with
> > a reason _not_ to serve a static snapshot of the pages, even if we do
> > build a redirector.
> >  
> 
> No redirector as of now, only the static [s]html pages.
> 

<glass type="half-full">Yay!</glass>
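
Should a redirector get built later, the extraction step described in the
quoted text could look roughly like the sketch below.  It assumes the
message-id sits in a comment of the form <!-- id="..." -->; I have not
checked that against the actual .shtml files, so the regex will likely need
adjusting.  The target URL prefix is a placeholder, not a real endpoint.

```python
import re
from urllib.parse import quote

# ASSUMPTION: the comment looks like <!-- id="message-id" -->.
MSGID_COMMENT = re.compile(r'<!--\s*id="([^"]+)"\s*-->')


def extract_message_id(html):
    """Return the message-id found in the page's comment, or None."""
    match = MSGID_COMMENT.search(html)
    return match.group(1) if match else None


def build_redirect_map(pages,
                       target_prefix="https://lists.example.invalid/msgid/"):
    """Map each old URL path to a new URL keyed by message-id.

    `pages` maps an old URL path to that page's HTML text.  Pages
    without a recognisable comment (index pages, for instance) are
    skipped, so the static copies keep serving them.
    """
    redirects = {}
    for path, html in pages.items():
        msgid = extract_message_id(html)
        if msgid is not None:
            redirects[path] = target_prefix + quote(msgid, safe="")
    return redirects


if __name__ == "__main__":
    sample = {
        "/dev/0001.shtml": '<html><!-- id="abc@example.org" --><body>',
        "/dev/index.shtml": "<html><body>no comment here</body></html>",
    }
    for old, new in build_redirect_map(sample).items():
        print(old, "->", new)
```

Since pages without a message-id are simply left out of the map, this
composes with serving the static snapshot rather than replacing it.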

> I will need some help from root to:

Not me, I'm afraid; ENOTIME.

> 1. Install a web server. nginx? (just kidding)

Apache HTTP Server would probably be a better choice since more dev@svn
and Infra people are familiar with it, but it's a fair question to ask.
(Cf. INFRA-7524)

> 2. Setup httpd.conf
> 3. Configure a DocumentRoot where I can put the files. Doesn't seem right
> to store them in /home

Hmm.  These things should all be done via puppet.  I'm not sure what's
best practice nowadays regarding writing puppet PRs and testing them,
though.

Cheers,

Daniel
