Hi Nilesh,

On 22.03.21 at 12:41, Nilesh Patra wrote:
>
>> I'm mostly addressing you specifically here for the new "workflow based"
>> packages we should start working on -- as you mentioned at the sprints.
>> Since freeze work should _mostly_ be done by now, we could focus on new
>> packages :-)

Yeah!

>> Would you have any workflow package that you'd like help with?

In short: nextflow - but that is a tricky one, and it is blocking many
workflows.

Slightly longer, since I wish to encourage everyone to find their own
preferences:

 * If you have a research group near you (preferably at your own
   university, if you work at one) that is working on anything SARS
   related, ask them what they are doing, try to understand it, see
   what software there is and start a project with them. Mostly forget
   about Debian in the meantime / fix it as you need it.

 * If anything among the spreadsheet's keywords interests you, read up
   on the biology behind a few of the packages listed as "workflow
   packages" (meaning packages that produce something biologists would
   like to put into the results section of their paper). Look at the
   respective documentation, see if it builds, and follow a tutorial if
   one exists. We then still need to learn how to make some noise about
   this, so that biologists find the tutorial for self-education and/or
   find you as someone who can help to get this running on their data
   (or who can help find someone who then helps).

 * There are different kinds of packages that may be important for
   Debian, and also for Debian's acceptance in the bioinformatics world:

   A) Housekeeping packages (I just made this name up as a pun on
      housekeeping genes) that are simply expected to be available. I
      have probably marked these in red in the leftmost column of the
      spreadsheet. It is the kind of package I go for when I am feeling
      a bit down and want a quick success.

        MEME (Others) - a classic
        bbtools (Others and bulk RNA-seq) - we may already have part of
          that in the distribution - I was/am still a bit confused - is
          this redundant with bbmap?

   B) The "columns". These are representatives of the software that
      biologists are likely to need to go from raw data to a
      publication without anything missing.

      My priorities here are

        virus tab: first and foremost artic fieldbioinformatics - this
          uses the nanopore to tell which Ebola/SARS strain you have -
          this may be as close to the pandemic as we can possibly get.
          Since I work at a University Hospital I think I am allowed to
          feel positive about finding someone to field-test our
          fieldtest package once this is completed. There is the
          original artic implementation and a reimplementation as a
          nextflow workflow. Whichever we get to first, I tend to
          think. Confusing? That is why we need the bio.tools folks -
          it is too much for our tasks list (and for bio.tools, still
          :-) ).

        Single-Cell RNA-seq - all of them, preferably

        bulk RNA-seq - BioConductor, pigx-rnaseq - is mostly there

        nanopore - it is the sequencing technology that is closest to
          us - I actually own half of one, Jun has a complete one :)
          It is used in the field to genotype viruses - today - but it
          is too young to have a perfect pipeline yet, I tend to think.
          And the device is used very heterogeneously. Things get
          updated very frequently everywhere, so this is more of a
          "let's see what is going to be used" kind of situation for me
          at the moment.

      There are some tools that block many columns from being
      completed. The workflow engines deserve particular mention here,
      and among them it is nextflow that seems to be a beast to
      package. So, yes, Nilesh, please, getting nextflow out of the way
      would be a big help.

   A^B) The packages that have a direct application to virology/drug
      development and are mostly singular applications - look at what
      OpenPandemics' Forli lab and colleagues are giving us
      https://forlilab.org/ . My picks are

        AutoDock-CrankPep (Docking/Structures)

      since oligopeptides are a common tool to fish for antibodies, so
      you want to have something to model that. And sometimes it is
      "community forming" and "technical curiosity" that trigger me, as
      for

        cmdock (Docking/Structures)
        autodock-gpu (Docking/Structures)

      which would be seen by all the BOINC people.

      But who would not go through their website and dream a bit?

There are other sheets that are a bit like anti-A: packages that nobody
expects, yet. "Synthetic Biology" (the next thing for a while already)
or "Molecular Tumor Boards" (the next thing for even longer, like 25
years since microarrays came around, and now emerging). I think I put
these up mostly to have a place to put them, not really thinking that
this is something that needs to go into the distribution asap.

And there are sheets that do not exist yet - ones I would like to care
about if days had just a few more hours - such as for proteomics or
mass spec. We are completely blank on ontologies and how these could be
maintained - more Java, mostly. Feel free to add them.

>> And also two questions remain open:
>> * Do we have a tutorial explaining the spreadsheet, or do you think we could
>> find an alternative to the spreadsheet -- a salsa wiki or so? (Mostly for
>> free software documentation perspective)

To say it with Fawlty Towers: I did not start it, but I added "Immune
repertoire". The first "Info" page should be something close to a
tutorial. Please, everyone, improve on it. I find these columns
superior to our task list and also to what bio.tools/SciCrunch have
come up with so far - only OMICtools had the chance to outshine it in
its days. Find something better - I do not mind, quite the contrary,
really. We want packages, their dependencies and some biologically
relevant structuring. A dependency graph with tags may be an
alternative.

>> * What do you think of the "package of the week plan" that Andreas proposed?

I admit to having forgotten about it. But how many people do we then
want working on the same package? Andreas did a good job with the
videoconferences, such that we got to know each other. So, I suggest
keeping the Excel sheet as a synchronisation tool - we write our name
somewhere in the package line so we know who is active on it. And
whoever wants extra input then just says so.

scanpy, I think, would have been on my list from the last few months,
and I am very happy to see this addressed. My next picks would be

   A) MEME
   B) pyomo
   A^B) autodock-gpu

- can we have three packages of the week?

> Hi Steffen,
>
> Apologies for ping, but any comment on this?

Thank you for the ping, really. I need another few days on non-Debian
issues, I am afraid.

Best,
Steffen