Karan, Cem F CIV USARMY CCDC ARL (USA) via License-discuss dixit:

>If it were to be done seriously, then a great deal more thought would
>need to go into it. In one of the messages I sent out (see

What I’m doing at $dayjob (which mostly involves Java™) is to do
the following at build time:

1. check if the source tree is clean (git status)
2. create, in a temp subdirectory, a clean copy of the source tree
   (git archive HEAD | tar xf -, basically) with no files added/removed
3. create a tarball from that
4. ask Maven to download me the source JARs¹ of all dependencies
   (recursively) into another directory hierarchy
5. create a PKZIP archive from that
6. place the two archives so that they will be included in the WAR
   file (basically a Java web executable, which the application
   server serves as an HTTP file tree) and thus shipped inside the
   binary
7. at the end of build, check the committed list of dependencies
   against the current list and update² if necessary

The web code already has links to the two archives and the global
LICENCE.txt file (see below) plus extra licence files (things like
the LGPL, Apache licence, CPL, CC-BY, etc. get kept separate). See
https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=veraweb/veraweb.git;a=tree
if you wish to see it in action.

① Source JARs are the best we can get from Maven. They normally
  contain only a copy of the .java files that are the direct source
  for the .class files in the binary JAR, plus some minor metadata.
  Some projects don’t even have that, so I recently amended this
  mechanism to account for those: manually downloaded distfiles
  placed into the tree are copied there as well. So source JARs
  are *not* Complete Corresponding Source, but we don’t deal with
  GPL/AGPL in the Java™ world (too much CDDL/EPL), and there’s
  often no (automatable) way to get the distfiles.

  Torsten Werner had this idea of rebootstrapping a new Maven
  repository with explicit source requirements; it never went
  anywhere, though.

  Oh, and don’t ever trust metadata like <license/> tags in the
  POMs (or, worse, package.json in the JavaScript world, which is
  even more wild-west): they lie. Inspect every single file… or
  at least do some overall grepping plus some random sampling.
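
The “overall grepping plus random sampling” could look roughly like
the sketch below. The marker list is hypothetical and far too short
for a real review, and the function names are invented here; it only
shows the shape of the check, not a substitute for reading the files.

```python
import random
import re
from pathlib import Path
from typing import Dict, List

# Hypothetical marker list; a real review needs a much longer one.
LICENCE_MARKERS = re.compile(
    r"GNU (Lesser |Affero )?General Public License|Apache License|"
    r"Mozilla Public License|SPDX-License-Identifier",
    re.IGNORECASE,
)


def grep_licences(root: Path) -> Dict[str, List[str]]:
    """Map each file below root to the licence markers found in it."""
    hits = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        found = sorted({m.group(0) for m in LICENCE_MARKERS.finditer(text)})
        if found:
            hits[path.relative_to(root).as_posix()] = found
    return hits


def sample_for_review(root: Path, count: int, seed: int = 0) -> List[str]:
    """Pick up to `count` files at random for manual inspection."""
    files = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*") if p.is_file()
    )
    return random.Random(seed).sample(files, min(count, len(files)))
```

Files without any hit are exactly the ones the random sample is meant
to catch, since a missing marker says nothing about the real licence.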

② The update works by changing the file, showing a diff, telling
  the user to commit the changed file, and breaking the build
  (since this list is used in the earlier step). The dependencies
  in the list also carry a flag saying whether licence review has
  been done, which of course must be redone each time a dependency
  version changes (I painstakingly copy all licence grants and
  Apache NOTICE files into a central LICENCE.txt file, also in
  the webroot of the project).
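
A rough sketch of that check, assuming the committed list can be
modelled as (coordinate, version, reviewed) entries. The Dep type,
the text rendering and all function names are hypothetical; the
actual file format is not shown in this mail.

```python
import difflib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Dep:
    coordinate: str   # e.g. "org.example:somelib" (hypothetical format)
    version: str
    reviewed: bool    # has licence review been done for this version?


def reconcile(committed: List[Dep],
              current: Dict[str, str]) -> Tuple[List[Dep], bool]:
    """Return (updated list, changed?) for the current dependency set.

    An entry keeps its review flag only if coordinate and version are
    unchanged; new or bumped dependencies need a fresh review.
    """
    known = {(d.coordinate, d.version): d for d in committed}
    updated = [
        known.get((coord, ver), Dep(coord, ver, reviewed=False))
        for coord, ver in sorted(current.items())
    ]
    changed = updated != sorted(committed, key=lambda d: d.coordinate)
    return updated, changed


def render(deps: List[Dep]) -> List[str]:
    """One line per dependency, as it might appear in the list file."""
    return [
        f"{d.coordinate} {d.version} "
        f"{'reviewed' if d.reviewed else 'UNREVIEWED'}"
        for d in deps
    ]


def check_or_break_build(committed: List[Dep],
                         current: Dict[str, str]) -> List[Dep]:
    """Print a diff and abort if the committed list is out of date."""
    updated, changed = reconcile(committed, current)
    if changed:
        print("\n".join(difflib.unified_diff(
            render(committed), render(updated),
            "committed", "current", lineterm="")))
        raise SystemExit("dependency list changed: commit it and "
                         "redo the licence review")
    return updated
```

Writing the updated list back to the file is left out; the point is
only that a version bump resets the review flag and stops the build.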

This is not perfect for multiple reasons, mostly the inadequate
source JAR mechanism (but if I get the binary JARs from an
intermediary, the Maven Central repository, and source JARs are
all I get, then if someone complains I’d argue that I couldn’t
have done better and send them to the Maven Central operators).
Our software also creates source JARs for the individual
components, nothing better, as there’s no standardised mechanism
in the Maven world; we could³ do better, but at least naming
would have to be agreed on. The JAR is like a library (even
if these can sometimes be executable, which ELF shared libraries
can also be), the WAR is more like a finished binary, with some
JARs embedded and others expected to be provided by the runtime
environment, but we produce source on the WAR level.

③ https://repo1.maven.org/maven2/org/evolvis/tartools/maven-parent/1.17/
  I can build a -source.tgz in these projects and upload that,
  but as I said, naming is not standardised.

Worse, the source code is on project level, not library level.
Look at the output of (on Debian)
        apt-cache showsrc libxcb | fgrep ' deb libs '
which counts 24 packages shipping a library currently; there
is but one source tree for all of them combined.

I’d argue that the -source.tgz should be built by the parent
project or global Makefile and shipped under the name of the
source package, not of those libraries built, but then we’d,
in addition to naming, need a discovery mechanism.

This also applies to ELF (and other) binaries:

$ dpkg -L coreutils | fgrep bin/ | wc -l
105

I doubt I’d want 105 copies of the same source, or of the
contents of coreutils-8.30/lib/, or even of the GPL.

I’d argue that, in the presence of a packaging system, the
complete corresponding source should be an artefact, named
like the source package, with its own dependencies… and in
Debian we could just make source packages installable and
able to be depended on (there’s an open issue about that
somewhere); that plus a mechanism to install them all would
suffice.
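
To illustrate the naming idea, here is a small sketch that derives
one source artefact per source package instead of one per binary.
The mapping would come from the packaging system; here it is given
by hand, and the function names are invented for this example. The
"-source.tgz" suffix is only the placeholder name used earlier in
this mail, not an agreed convention.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_source(binary_to_source: Dict[str, str]) -> Dict[str, List[str]]:
    """Collect the binary packages built from each source package."""
    grouped = defaultdict(list)
    for binary, source in sorted(binary_to_source.items()):
        grouped[source].append(binary)
    return dict(grouped)


def source_artefacts(binary_to_source: Dict[str, str],
                     suffix: str = "-source.tgz") -> List[str]:
    """One artefact name per source package, however many binaries."""
    return sorted({source + suffix for source in binary_to_source.values()})


# Hand-written example mapping (binary package -> source package).
EXAMPLE = {
    "libxcb1": "libxcb",
    "libxcb-shm0": "libxcb",
    "libxcb-render0": "libxcb",
    "coreutils": "coreutils",
}
```

With the example mapping, three libxcb binaries collapse into a
single source artefact, which is the deduplication argued for above.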

For the Java™ world, we have a (bad) packaging system. I’d keep
source JARs around (they are used for the benefit of the IDE,
not to provide actual buildable sources, so they have a
different scope anyway) but add discoverable source archives
(could even be JARs, to not break the ecosystem) as a new
requirement. That, plus code (which I already have ☻) to
download them all and embed them into the finished web
containers, or create a bundle otherwise… most Java™ projects
consist of more than just one WAR (multiple WARs or EARs, a
tarball with files needed to install it or documentation
perhaps), so I tend to create a “shipment” archive which could
just bundle all sources as well if they’re not needed inside
the webroot. (I used to have that but moved to web.)

And then there’s this project where I was lazy; all
dependencies are Apache/BSD/similar, so all I *needed* to bundle
were the licence notices. The system-inherent problems plus
the ever-small budget dictated that. Sure, I could bundle the
dependencies’ sources, but nobody’s going to look at them, as
that project is not a web application but an executable fat JAR.

tl;dr: Mechanisms are there, many even, but agreement is
       lacking.

bye,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
