Jeff King <p...@peff.net> wrote:
> On Fri, Aug 12, 2016 at 10:42:55PM +0000, Eric Wong wrote:
> > Junio C Hamano <gits...@pobox.com> wrote:
> > > is still available.  An alternative
> > > 
> > >         nntp://news.public-inbox.org/inbox.comp.version-control.git
> > > 
> > > will become usable once it catches up with old messages.
> > 
> > Mostly caught up, I injected 33 more today which were
> > cross-posted (which tripped up some of my anti-spam rules) or
> > simply missed by gmane.
> > 
> > There may be more in some personal archives gmane doesn't
> > have...
> 
> Is there an easy way to get _just_ the list of message-ids you are
> storing (I know I can download the whole archive, but it's big)?

XHDR (or HDR) over NNTP should do it (that's how I checked
against gmane):
--------8<-----
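use strict;
use warnings;
use Net::NNTP;

my $host = $ENV{NNTPSERVER} || 'news.public-inbox.org';
my $nntp = Net::NNTP->new($host) or die "failed to connect to $host\n";
my $group = 'inbox.comp.version-control.git';
my ($num, $first, $last) = $nntp->group($group)
        or die "failed to select $group\n";
my $batch = 10000; # article numbers requested per XHDR call

# "<=" so the final article is still fetched when a batch
# starts exactly at $last
for (my $i = $first; $i <= $last; $i += $batch) {
        my $j = $i + $batch - 1;
        $j = $last if $j > $last;

        # hash ref mapping article number => Message-ID
        my $num2mid = $nntp->xhdr('Message-ID', "$i-$j")
                or die "XHDR $i-$j failed\n";
        for my $n ($i..$j) {
                defined(my $mid = $num2mid->{$n}) or next;
                print "$mid\n";
        }
}
$nntp->quit;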

# and I forgot to optimize XHDR/HDR further in public-inbox-nntpd.
# Oh well, it seems to work, at least.

> Then I can cross-reference with my archive. I doubt I'll have anything
> significant that you don't. My archive of the early days was pulled from
> gmane, though I have been collecting steadily via mailing list delivery
> since 2007 or so.
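
For the cross-referencing step, here is a minimal sketch, assuming both
sides have been dumped to plain files with one Message-ID per line.
The sub names (norm_mid, missing_mids) and the two file arguments are
made up for illustration and are not part of public-inbox:

```perl
use strict;
use warnings;

# Normalize a Message-ID for comparison: trim whitespace and
# strip the surrounding angle brackets if present.
sub norm_mid {
        my ($mid) = @_;
        $mid =~ s/\A\s+|\s+\z//g;
        $mid =~ s/\A<(.*)>\z/$1/s;
        return $mid;
}

# Return the entries of @$mine which do not appear in @$theirs,
# keeping the order of @$mine and dropping repeats and empty lines.
sub missing_mids {
        my ($mine, $theirs) = @_;
        my %seen = map { norm_mid($_) => 1 } @$theirs;
        return grep { length($_) && !$seen{$_}++ }
                map { norm_mid($_) } @$mine;
}

# Hypothetical usage: perl missing.pl mine.txt theirs.txt
if (@ARGV == 2) {
        my @lists;
        for my $path (@ARGV) {
                open(my $fh, '<', $path) or die "open $path: $!\n";
                chomp(my @lines = <$fh>);
                push @lists, \@lines;
        }
        print "<$_>\n" for missing_mids(@lists);
}
```

Stripping the angle brackets first means it should not matter whether
either list was written with or without them.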

What's odd is that there are some messages with two Message-ID
fields from gmane from the old days, too.  I'll dig a bit another time.
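
A rough way to spot such messages would be to count the Message-ID
fields in each article's raw header.  This is only a sketch:
count_field is a made-up helper, and whether a given server passes
both fields through unchanged in its HEAD response is an assumption
that would need checking:

```perl
use strict;
use warnings;

# Count how many header fields with the given name appear in a
# list of raw header lines.  Folded continuation lines start with
# whitespace, so they never match.
sub count_field {
        my ($lines, $name) = @_;
        return scalar grep { /\A\Q$name\E\s*:/i } @$lines;
}

# Hypothetical usage against a live server (not run here):
#   use Net::NNTP;
#   my $nntp = Net::NNTP->new('news.public-inbox.org') or die;
#   $nntp->group('inbox.comp.version-control.git');
#   my $hdr = $nntp->head($n);   # array ref of header lines
#   print "$n\n" if $hdr && count_field($hdr, 'Message-ID') > 1;
```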