On 23.05.21 at 00:02, Nilesh Patra wrote:
>
> On 5/23/21 2:54 AM, Andreas Tille wrote:
>> On Sat, May 22, 2021 at 09:10:46AM +0200, Andreas Tille wrote:
>>> On Fri, May 21, 2021 at 09:26:48PM +0200, Steffen Möller wrote:
>>>> If someone needs a stimulus to package something - cuteSV
>>>> (https://github.com/tjiangHIT/cuteSV), please.
>>> I gave it a kickstart while sitting in the train (which will be
>>> offline soon). Everybody can feel free to add own ID to Uploaders
>>> and finalise. There is no build time test running now and no
>>> autopkgtest. Data to test / benchmark are included - so this
>>> should be feasible.
>> I just packaged the precondition python3-cigar and uploaded to new.
> I wrote a sample autopkgtest for cigar (basically used the same thingy in the
> readme)
> and did a few minor changes.
>
> I have no idea about autopkgtests for cutesv - I lack the pre-requistites
> here and probably only Steffen can help here.
>
> PS: Please check and upload vbz-compression whenever you have time (after two
> days as you wrote would be fine anyway)
> I'll be inactive/be away for a couple of days (wish to take a break :-))

Thank you both, you are amazing!

cuteSV is part of
https://github.com/nanoporetech/pipeline-structural-variation, which I
plan to run when the first Nanopore reads surface in my inbox next week.
Running it means comparing against a reference genome, which we do not
have in Debian. So yes, we should think of some tests for cuteSV, but we
should also find a way to perform such tests for other packages.

This leads to a follow-up question: we could have a "test package" that
offers a fraction of the human genome, like the Y chromosome and a
second one, chromosome 22 maybe. That would not be too big and we could
test with it. It would also be a bit meaningless, though, and for
testing we do not need anything to be human (or real) in the first
place. We could generate our own mini-genome or instead (which I would
prefer) go for something small that is real, like yeast (for
eukaryotes) and E. coli (for bacteria); we ignore archaea. And then
there is https://www.ncbi.nlm.nih.gov/nuccore/CP014940 , i.e. the data
from C. Venter's
https://www.jcvi.org/research/first-minimal-synthetic-bacterial-cell,
which may be interesting to distribute with an Open Source
distribution.

Even for genomes whose genomic DNA has long been known, something novel
is still found from time to time, but we do not do much harm by
distributing such genomes. Professional researchers will update them
anyway. The same holds for the human genome, but it is a bit larger and
we should probably gain experience with the smaller genomes first.

I'll let this sink in for a while and then likely extend getData to
deal with these genomes and auto-generate native Debian packages with
it.

Ok - back to some real work, and I'll have a closer look at that
pipeline.

Best,
Steffen