dueff...@uwe-dueffert.de wrote:
> Hi,
>
> On Fri, 3 May 2013, Bruce Dubbs wrote:
>
>> I'm going to write a program to automatically identify out of date
>> packages for LFS.  Has anyone already done such a beast?

> I've kind of been doing that for a couple of years now (including some BLFS and
> even Windows stuff as well ;-]). I started with a bunch of bash scripts
> that basically parsed certain maintainer websites with certain regexps.
> This was quite hard to read, neither fast nor flexible and always out of
> date.
>
> Current solution (which I've been happy with for quite some years):
> All the parsing is now done by a single, simple C(++?) program.
> It basically follows _all_ links and handles general stuff like stripping
> common extensions (*.tgz etc.) or an appended "/download" and replacing
> "/from/a/mirror" with "/from/this/mirror".

I'm using PHP.  It is generally easier to maintain than C/C++.  PHP can 
do anything C can, and computation time is not an issue for this type 
of application.  Most of the time will be spent fetching directory 
listings from remote sites.
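
To make the link clean-up quoted above concrete, here is a minimal sketch of the "strip common extensions or an appended /download" step. It is written in Python purely for illustration (not the PHP or C discussed here); the function name and the extension list are assumptions, since the original only says "*.tgz etc", and the mirror rewriting is left out.

```python
# Hypothetical extension list; the original only mentions "*.tgz etc".
ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tar.xz", ".tgz")


def normalize_link(url):
    """Reduce a download link to a bare package name (illustrative only)."""
    # Drop an appended "/download", as seen on SourceForge file links.
    if url.endswith("/download"):
        url = url[: -len("/download")]
    # Keep only the last path component.
    name = url.rstrip("/").rsplit("/", 1)[-1]
    # Strip one known archive extension, if present.
    for ext in ARCHIVE_EXTENSIONS:
        if name.endswith(ext):
            return name[: -len(ext)]
    return name
```

With this, a link ending in "gtk+-3.9.0.tar.xz" and one ending in "gtk+-3.9.0.tar.gz/download" would both reduce to "gtk+-3.9.0", which is what makes a "packages done" list independent of the compression format.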

> As basic input it gets a list of simple rules to look for:
> $packagename $starturl $pattern, e.g.:
> mpc http://www.multiprecision.org/?prog=mpc&page=download tar.gz
> check http://sourceforge.net/projects/check/files/check/ /tar.gz/download
>
> $pattern in most cases only specifies the (sub/parent)directory depth to
> search in (number of leading slashes) and the extension (or better: end)
> of the links to look for there. It usually does not filter for any kind of
> naming or versioning scheme. As a result I get a list of
> directories/websites searched in and a list of URLs to potentially
> download.
>
> This would include following uninteresting links (such as parent dirs or
> adverts or subdirs of outdated versions or subdirs of packages I'm not
> interested in). Therefore I keep a list of fully qualified
> directories/websites not to be searched by above C program again, e.g:
> ftp://ftp.funet.fi:21/pub/mirrors/ftp.easysw.com/pub/cups/1.1.19/
> ftp://ftp.funet.fi:21/pub/mirrors/ftp.easysw.com/pub/cups/1.1.20/
> http://apache.osuosl.org/
> http://creativecommons.org/licenses/by-sa/3.0/
> http://jobs.sourceforge.net/

Yes, I may use a variation of that.
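
A rough sketch of what such a variation might look like, in Python for illustration only. The names (Rule, parse_rule, split_pattern, should_search) are made up here, and two points are assumptions rather than anything stated above: that the leading slashes of $pattern give the directory depth and the rest is the link ending, and that the skip list is matched exactly rather than by prefix.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    package: str
    start_url: str
    pattern: str


def parse_rule(line):
    """Parse one '$packagename $starturl $pattern' rule line."""
    package, start_url, pattern = line.split()
    return Rule(package, start_url, pattern)


def split_pattern(pattern):
    """Assumed reading: leading slashes = depth, remainder = link ending."""
    ending = pattern.lstrip("/")
    depth = len(pattern) - len(ending)
    return depth, ending


def should_search(url, skip_dirs):
    """Assumed exact match against the 'do not search again' list."""
    return url not in skip_dirs
```

For example, the "check" rule above would parse into package "check" with a pattern of depth 1 and ending "tar.gz/download", and http://apache.osuosl.org/ would be skipped once it is in the skip set.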

> This would give me a list of package URLs, but include stuff that I'm not
> interested in (which just happens to come from the same directory/site) or
> stuff that I already have. Therefore I keep a list of such done packages
> with certain extensions stripped (to avoid getting a tar.gz as tar.xz
> again), e.g.:
> autoconf-2.52
> autoconf-2.53
> autoconf-2.54
> linux-2.6.16.18-utf8_input-1.patch
> linux-2.6.16.19
> linux-2.6.16.19-utf8_input-1.patch

Actually, I want to know if an xz version exists.  My order of preference 
is xz, bz2, gz.  All the packages in LFS are one of those. I haven't 
looked at BLFS yet.
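
As a sketch of that preference order (Python for illustration only; the function name is made up, and it assumes the candidates are tarballs named *.tar.xz, *.tar.bz2 or *.tar.gz):

```python
# Preference order as stated above: xz first, then bz2, then gz.
PREFERENCE = (".tar.xz", ".tar.bz2", ".tar.gz")


def pick_preferred(urls):
    """Return the candidate with the most preferred compression, or None."""
    for ext in PREFERENCE:
        for url in urls:
            if url.endswith(ext):
                return url
    return None
```

Given candidates for the same version in gz and xz form, this returns the xz one; if none of the three extensions matches, it returns None so the caller can flag the package for a manual look.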

> The C program has those 3 lists (currently 24KB commented rules, 120KB
> dirs done, 230KB packages done) in memory and can therefore filter results
> rapidly.

I agree that the memory requirement is not particularly large and few 
items, if any, beyond the final results need to be written out.

> [You can add further sanity checks like remembering when a certain rule
> resulted in package URLs at all or in new package URLs for the last time
> to hint at taking a look whether the maintainer changed website, extension
> or subdir structure.]
>
> So I automatically get a list of subdirs currently searched (and may
> exclude older versions or new uninteresting packages or new adverts from
> further search) and I automatically get a list of new package URLs that I
> may either want to download or just mark as done (for skipping missed
> intermediate versions or by-catch of packages I'm not interested in).
>
> Example: current list of new package URLs that I might potentially be
> interested in downloading:
> http://ftp.gnome.org/pub/gnome/sources/gtk+/3.9/gtk+-3.9.0.tar.xz
> http://icedtea.wildebeest.org/download/source/icedtea-2.1.8.tar.gz
> http://sourceforge.net/projects/libpng/files/libpng15/1.5.16beta02/libpng-1.5.16beta04.tar.xz/download
> http://www.linuxfromscratch.org/blfs/downloads/svn/blfs-book-svn-html-2013-05-03.tar.bz2
> http://www.linuxfromscratch.org/lfs/downloads/development/LFS-BOOK-SVN-20130501.tar.bz2

This does give me a couple of ideas to play with.  Thanks.

   -- Bruce


-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
