Question about proper archive area for packages that require big data for operation

Laszlo Kajan Tue, 23 Apr 2013 03:06:15 -0700

Dear Russ, Debian Med Team, Charles!

(Please keep Tobias Hamp in replies.)


@Russ: Please allow me to include you in a discussion about a few
bioinformatics packages that depend on big, but free data [2]. I have
cited your opinion [3] in this discussion before. You are on the
technical committee and on the policy team, so you, together with
Charles, can help substantially here.

[2] http://lists.alioth.debian.org/pipermail/debian-med-packaging/2013-April/thread.html
[3] https://lists.debian.org/debian-vote/2013/03/msg00279.html

This email is to continue the discussion about free packages that
depend on big (e.g. >400MB) free data outside 'main'. These packages
apparently violate policy 2.2.1 [0] for inclusion in 'main' because
they require software outside the 'main' area to function. They do not
violate point #1 of the social contract [1], which requires
non-dependency on non-free components. For these big data packages,
policy seems to be overly restrictive compared to the social contract,
leading to seemingly unfounded rejection from 'main'.

[0] http://www.debian.org/doc/debian-policy/ch-archive.html
[1] http://www.debian.org/social_contract

* In case the social contract indeed allows such packages to be in
  'main' (and policy is overly restrictive), how could it be ensured
  that the packages are accepted?

* What is the procedure within Debian to elicit a decision about the
  handling of such packages in terms of archive area? Discussion on
  d-devel, followed by policy change? Asking the policy team to clarify
  policy for such packages? Technical committee?

 + Charles suggested such packages could go into 'main' [4], with a
   clear indication of the large data dependency of the package in the
   long description and, when possible, the scripts for generating the
   large data as well.

 [4] http://lists.alioth.debian.org/pipermail/debian-med-packaging/2013-April/019292.html

My goal as a Debian Developer and a packager is to get packages that
are allowed there into Debian (that is, into 'main') in a reasonably
short time. I would like to resolve this issue properly, because I
believe it may come up more often in bioinformatics software. For
example, imagine a protein folding tool that requires a very large
database to search for homologues for contact prediction, and then uses
the predicted contacts to predict the protein's three-dimensional
structure. This has been done before [5], and such a tool would be (is)
immensely useful for bioinformatics. This tool would depend on
gigabytes of data we would not package. Yet, by all means, I would want
the tool to be part of the distribution.

[5] http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028766

Thank you for your opinion and advice.

Best regards,
Laszlo


