https://salsa.debian.org/med-team/catfishq is ready for review+sponsoring.

Many thanks!
Steffen

On 23.05.21 at 16:18, Steffen Möller wrote:
> On 23.05.21 at 14:26, Steffen Möller wrote:
>> On 23.05.21 at 00:02, Nilesh Patra wrote:
>>> On 5/23/21 2:54 AM, Andreas Tille wrote:
>>>> On Sat, May 22, 2021 at 09:10:46AM +0200, Andreas Tille wrote:
>>>>> On Fri, May 21, 2021 at 09:26:48PM +0200, Steffen Möller wrote:
>>>>>> If someone needs a stimulus to package something - cuteSV
>>>>>> (https://github.com/tjiangHIT/cuteSV), please.
>>>>> I gave it a kickstart while sitting in the train (which will be
>>>>> offline soon). Everybody can feel free to add their own ID to
>>>>> Uploaders and finalise. There is no build-time test running now
>>>>> and no autopkgtest. Data to test / benchmark are included - so
>>>>> this should be feasible.
>>>> I just packaged the precondition python3-cigar and uploaded to NEW.
>>> I wrote a sample autopkgtest for cigar (basically used the same
>>> thingy in the readme) and did a few minor changes.
>>>
>>> I have no idea about autopkgtests for cutesv - I lack the
>>> prerequisites here and probably only Steffen can help here.
>>>
>>> PS: Please check and upload vbz-compression whenever you have time
>>> (after two days as you wrote would be fine anyway).
>>> I'll be inactive/away for a couple of days (wish to take a break :-))
>> Thank you both, you are amazing!
>>
>> CuteSV is part of the
>> https://github.com/nanoporetech/pipeline-structural-variation that I
>> plan to run when the first Nanopore reads surface in my inbox next
>> week. You compare against a reference genome to run this, which we do
>> not have in Debian, so, yes, we should think of some tests, but we
>> should also find a way to perform such tests for other packages.
>>
>> This kind of leads to a follow-up question - we could have a "test
>> package" that offers a fraction of the human genome, like the Y
>> chromosome and a second one - chromosome 22 maybe. That would not be
>> too big and we can test with it. It would also be a bit meaningless,
>> though. And for testing we do not need anything to be human (or real)
>> in the first place. We could generate our own mini-genome or instead
>> (which I would prefer) go for something small that is real, like
>> yeast (for eukaryotes) and E. coli (for bacteria); we ignore archaea.
>> And then there is https://www.ncbi.nlm.nih.gov/nuccore/CP014940 ,
>> i.e. the data from C. Venter's
>> https://www.jcvi.org/research/first-minimal-synthetic-bacterial-cell,
>> which may be interesting to distribute with an Open Source
>> distribution.
>>
>> While something novel is always found even for these genomes, whose
>> genomic DNA has long been known, we do not do much harm by
>> distributing such genomes. Professional researchers will update them
>> anyway. The same holds for the human genome, but it is a bit larger
>> and we should possibly gain experience with the smaller genomes
>> first.
>>
>> I'll let this sink in for another while and then likely extend
>> getData to deal with these genomes and auto-generate native Debian
>> packages with it.
>>
>> Ok - back to some real work and I'll have a closer look at that
>> pipeline.
>
> I just went through their snakemakefile. To get this running, we need
>
> * catfishq https://github.com/philres/catfishq
> * lra (long read aligner) https://github.com/ChaissonLab/LRA
> * truvari https://github.com/spiralgenetics/truvari/
> * add the scripts to libvcflib1/new package vcflib-scripts
>
> Catfishq looks straightforward, I'll just go and address that. LRA is
> a meson build with "subprojects" that wrap other bits. Truvari drags
> in a few python packages that in part we do not have yet. Have added
> that info to the Nanopore tab on
> https://docs.google.com/spreadsheets/d/1tApLhVqxRZ2VOuMH_aPUgFENQJfbLlB_PFH_Ah_q7hM/edit#gid=1806578173
>
> Best,
> Steffen
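
For the "generate our own mini-genome" option mentioned in the quoted thread, a minimal sketch could look like the following. It is only an illustration under stated assumptions: the function names `make_mini_genome` and `to_fasta`, the contig names, and the GC parameter are all hypothetical and not part of getData or of any package discussed here. The sequences are pseudo-random and carry no biological meaning, so they would only serve as small, reproducible smoke-test input.

```python
import random


def make_mini_genome(contigs, seed=0, gc=0.5):
    """Return {contig_name: sequence} of pseudo-random DNA.

    contigs -- dict mapping a (hypothetical) contig name to its length
    seed    -- fixed seed so every build sees the same test data
    gc      -- target GC fraction, an illustrative knob only
    """
    rng = random.Random(seed)
    # Weights in the order A, C, G, T; C and G share the GC fraction.
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]
    return {
        name: "".join(rng.choices("ACGT", weights=weights, k=length))
        for name, length in contigs.items()
    }


def to_fasta(genome, width=60):
    """Format {name: sequence} as FASTA text with fixed-width lines."""
    lines = []
    for name, seq in genome.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file, for example `open("mini.fa", "w").write(to_fasta(make_mini_genome({"chrA": 5000})))`, and used as a stand-in reference. A real small genome such as yeast or E. coli, as preferred in the thread, would exercise the tools more meaningfully.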