Richard Stallman commented on Jacob Bachmeyer's idea:
> > > Another related check that /would/ have caught this attempt would be
> > > comparing the aclocal m4 files in a release against their (meta)upstream
> > > sources before building a package. This is something distribution
> > > maintainers could do without cooperation from upstream. If
> > > m4/build-to-host.m4 had been recognized as coming from gnulib and
> > > compared to the copy in gnulib, the nonempty diff would have been
> > > suspicious.
>
> I have a hunch that some effort is needed to do that comparison, but
> that it is feasible to write a script to do it could make it easy.
> Is that so?

Yes, the technical side of such a comparison is relatively easy to
implement:
  - There are fewer than about 2000 or 5000 *.m4 files that are shared
    between projects. Downloading and storing all historical versions
    of these files will take ca. 0.1 to 1 GB.
  - They would be stored in a content-based index, i.e. indexed by
    sha256 hash code.
  - A distribution could then quickly test whether a *.m4 file found
    in a distribution tarball is "known".

The recurring time-consuming part is, whenever an "unknown" *.m4 file
appears, to
  - manually review it,
  - update the list of upstream git repositories (e.g. when a project
    has been forked) or the list of releases to consider (e.g. snapshots
    of GNU Autoconf or GNU libtool, or distribution-specific
    modifications).

I agree with Jacob that a distro can put this in place, without needing
to bother upstream developers.

Bruno
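
PS: For illustration, here is a minimal sketch in Python of the lookup described above. It is not an existing tool. The directory layout (all known upstream versions collected under one directory, an unpacked tarball under another) and the function names are assumptions made for the example. A real setup would also have to walk the history of each upstream git repository to collect the historical versions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_index(upstream_root: Path) -> dict[str, str]:
    """Map digest -> relative path of one known upstream *.m4 file.

    upstream_root is a hypothetical directory holding every collected
    upstream version of the shared *.m4 files.
    """
    index: dict[str, str] = {}
    for m4 in sorted(upstream_root.rglob("*.m4")):
        # Identical content is stored once, under its first path.
        index.setdefault(sha256_of(m4), str(m4.relative_to(upstream_root)))
    return index


def unknown_m4_files(tarball_root: Path, index: dict[str, str]) -> list[Path]:
    """List *.m4 files in an unpacked tarball whose digest is not indexed."""
    return [m4 for m4 in sorted(tarball_root.rglob("*.m4"))
            if sha256_of(m4) not in index]
```

Every path returned by unknown_m4_files() would then go to manual review, and the index would be extended once the file's origin is understood.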