Re: python3-scanpy 1.6.0 patched, could you take a look?

2021-03-23 Thread Nilesh Patra
On 2021-03-23 12:18, Andreas Tille wrote:
> I was advised in some mails (which I'm too busy to look up) by members
> of the Python team to prefix source package names with 'python-' and I'm
> following this recommendation.

I see

> I know there are counter-examples but
> it also fits my personal taste that it is convenient to have source and
> (main) binary package name the same - well, I know the binary is now
> python3-* instead of python-* but you know what I mean.

Sure

> IMHO in the Debian Med team we should be extra picky about this.  We
> have lots of other applications.  I consider it very convenient to guess
> from the repository name what type of package can be expected.

Makes sense. Since you think we should be extra picky about it, do you
think we should add it to our policy somehow, or maybe add it to the FAQ
or some page that is easily accessible and/or handy?

Nilesh



Re: python3-scanpy 1.6.0 patched, could you take a look?

2021-03-23 Thread Andreas Tille
Hi Robbi,

On Tue, Mar 23, 2021 at 11:49:34AM +0800, Robbi Nespu wrote:
> > thanks for looking into scanpy.
> You are welcome. I chose this package during the Debian Med online sprint,
> and it led me to package its dependencies, python-stdlib-list[1] and
> python-sinfo[2]. I am happy and glad to have packaged two packages in my
> first Debian packaging contribution, so I hope to help get scanpy fixed
> and uploaded too :)

I also think that scanpy was a nice pick since it had quite some
educational advantages.
 
> > > * blocker 1.6.0-1 removed
> > > * add missing dependencies
> > > * fixed (patched) test file failure during execution
> > 
> > Your changes are sensible - but please do not create another changelog
> > entry for this.  The file debian/changelog is to record changes compared
> > to the previous Debian package upload.  If there was no previous package
> > we stick to
> > 
> > scanpy (1.6.0-1) UNRELEASED; urgency=medium
> > 
> > and for the first upload the only entry here should be
> > 
> >   * Initial release (Closes: #970887)
>
> diff --git a/debian/changelog b/debian/changelog
> index 9afc630..a346380 100644
> --- a/debian/changelog
> +++ b/debian/changelog
> @@ -1,4 +1,4 @@
> -scanpy (1.6.0-2) UNRELEASED; urgency=medium
> +scanpy (1.7.1-1) UNRELEASED; urgency=medium

Yes!
 
> > (nothing else).  Speaking about this:  We should not leave anything that
> > has target distribution UNRELEASED inside the changelog.  Please always
> > edit inside the UNRELEASED changelog entry - and it will be set to
> > unstable once the package will be uploaded.  Users might read only the
> > latest changelog entry and will miss changes "hidden" under UNRELEASED.
> > In other words: The UNRELEASED "target distribution" is just for the
> > team members to know that a package is unreleased - not for the users.
> 
> It should be like this:
> 
> change log A
> 
> scanpy (1.6.0-1) unstable; urgency=medium
> 
>   [ Robbi Nespu ]
>   * add pythons anndata, sinfo, setuptools-scm, legacy-api-wrap as
> dependencies
>   * fixing test module which failed during execution
> 
>  -- Robbi Nespu   Fri, 19 Mar 2021 23:05:47 +0800
> 
>   [ Steffen Moeller ]
>   * Initial release (Closes: #970887)
> 
> BLOCKER: ModuleNotFoundError: No module named 'sinfo'
>  -- Steffen Moeller   Fri, 25 Sep 2020 01:33:40 +0200
> 

Definitely not - it's way too much text and it's even syntactically
wrong.  A changelog paragraph can have only a single

  -- Author   Date

line and this has to be the last line of such a paragraph.
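
The trailer rule above can be sketched as a small check.  This is only an
illustration, not how dpkg validates changelogs: the function name
trailer_problems and the sample entries (including the example.org
addresses) are made up, and the function looks at a single changelog
paragraph passed in as a string.

```python
import re

# A trailer line starts with one space, two dashes and one space: " -- ".
TRAILER = re.compile(r"^ -- ")


def trailer_problems(entry: str) -> list:
    """Return the trailer problems found in ONE changelog paragraph."""
    # Blank lines carry no information for this check, so drop them.
    lines = [ln for ln in entry.splitlines() if ln.strip()]
    trailers = [i for i, ln in enumerate(lines) if TRAILER.match(ln)]
    problems = []
    if len(trailers) != 1:
        problems.append(
            "expected exactly 1 trailer line, found %d" % len(trailers))
    if not trailers or trailers[-1] != len(lines) - 1:
        problems.append("the trailer is not the last line of the paragraph")
    return problems


# Hypothetical sample paragraphs (names and addresses are placeholders).
GOOD = """\
scanpy (1.7.1-1) UNRELEASED; urgency=medium

  * Team upload
  * Initial release (Closes: #970887)

 -- Jane Doe <jane@example.org>  Fri, 19 Mar 2021 23:05:47 +0800
"""

TWO_TRAILERS = """\
scanpy (1.6.0-1) unstable; urgency=medium

  * Initial release (Closes: #970887)

 -- John Doe <john@example.org>  Fri, 25 Sep 2020 01:33:40 +0200
 -- Jane Doe <jane@example.org>  Fri, 19 Mar 2021 23:05:47 +0800
"""

print(trailer_problems(GOOD))
print(trailer_problems(TWO_TRAILERS))
```

For real packages the check is better left to the existing tools, for
example by running dpkg-parsechangelog on the file.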
 
> or
> 
> change log B
> 
> scanpy (1.6.0-1) unstable; urgency=medium
> 
>   [ Robbi Nespu ]
>   * Initial release (Closes: #970887)
>   * add pythons anndata, sinfo, setuptools-scm, legacy-api-wrap as
> dependencies
>   * fixing test module which failed during execution
> 
>  -- Robbi Nespu   Fri, 19 Mar 2021 23:05:47 +0800
> 
>   [ Steffen Moeller ]
> 
> BLOCKER: ModuleNotFoundError: No module named 'sinfo'
>  -- Steffen Moeller   Fri, 25 Sep 2020 01:33:40 +0200
> 

I can't see the difference from log A.  It's wrong in the same way.
 
> or
> 
> change log C
> 
> scanpy (1.6.0-1) unstable; urgency=medium
> 
>   [ Steffen Moeller, Robbi Nespu ]
>   * Initial release (Closes: #970887)
> 
>  -- Steffen Moeller   Fri, 25 Sep 2020 01:33:40 +0200
>  -- Robbi Nespu   Fri, 19 Mar 2021 23:05:47 +0800
> 

Also here those two lines are simply wrong.
 
> I think it should be like "change log C"? Correct me if I wrong.

This would be a valid changelog (see my commit [3]):


scanpy (1.7.1-1) UNRELEASED; urgency=medium

  * Team upload
  * Initial release (Closes: #970887)

 -- Robbi Nespu   Fri, 19 Mar 2021 23:05:47 +0800


The "Team upload" is needed since you are not mentioned in
debian/control as "Uploaders".  If you intend to keep working
on this package, please add "Robbi Nespu " there
and drop the "Team upload" from d/changelog.  That's all.  Pretty simple,
isn't it? ;-)
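
As a rough illustration of that rule, here is a minimal sketch that reads
the Uploaders field from the source paragraph of a debian/control text and
tells whether a "Team upload" line is needed for a given person.  The
helper names uploaders and needs_team_upload, the sample control text and
the people in it are all hypothetical.  The parsing is simplified: it
splits on commas and does not handle comment lines.

```python
def uploaders(control_text: str) -> list:
    """Return the entries of the Uploaders field (empty list if absent)."""
    value, in_field = [], False
    for line in control_text.splitlines():
        if not line.strip():
            break  # a blank line ends the source paragraph
        if line[0] in " \t":
            # Continuation line: belongs to the field seen just before it.
            if in_field:
                value.append(line.strip())
            continue
        in_field = line.lower().startswith("uploaders:")
        if in_field:
            value.append(line.split(":", 1)[1].strip())
    return [u.strip() for u in " ".join(value).split(",") if u.strip()]


def needs_team_upload(control_text: str, person: str) -> bool:
    """True if `person` is not listed in Uploaders."""
    return not any(person in entry for entry in uploaders(control_text))


# Hypothetical control file; names and addresses are placeholders.
CONTROL = """\
Source: python-scanpy
Maintainer: Example Team <team@example.org>
Uploaders: Alice Example <alice@example.org>,
           Bob Example <bob@example.org>
Section: science

Package: python3-scanpy
Architecture: all
"""

print(needs_team_upload(CONTROL, "Robbi Nespu"))
print(needs_team_upload(CONTROL, "Alice Example"))
```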

> > I think you interpreted Diane correctly (from what I can read in the
> > attached text).
> Yes, Diane is the maintainer. I assume it is still being processed by
> ftpmaster and will take time to become available.

Hmmm, seems I should ask Diane again.  I can see it neither in NEW nor
in unstable.  This means either it was not uploaded yet or it was
rejected by ftpmaster.

Building it from Git for the moment to get scanpy building ...

> > IMHO we should always upload the

Re: python3-scanpy 1.6.0 patched, could you take a look?

2021-03-23 Thread Andreas Tille
On Tue, Mar 23, 2021 at 12:22:49AM -0700, Nilesh Patra wrote:
> 
> Makes sense. Since you think we should be extra picky about it, do you
> think we should add it to our policy somehow, or maybe add it to the FAQ
> or some page that is easily accessible and/or handy?

Yes, that makes sense.  It would be great if you could do so.

Kind regards

Andreas. 

-- 
http://fam-tille.de



Re: Which columns should we start working on?

2021-03-23 Thread Steffen Möller
Hi Nilesh,

On 22.03.21 at 12:41, Nilesh Patra wrote:
>
>> I'm mostly addressing you specifically here for the new "workflow based" 
>> packages we should start working on -- as you mentioned at the sprints.
>> Since freeze work should _mostly_ be done by now, we could focus on new 
>> packages :-)
Yeah!
>> Would you have any workflow package that you'd like help with?

In short: nextflow - but that is a tricky one, blocking many workflows,
though.

Slightly longer, I wish to encourage everyone to find their own preferences:

 * If (preferably if working at a University) you have a research group
near you that is working on anything SARS related, ask them what they
are doing, try to understand that, see what software there is and
start a project with them. Mostly forget about Debian in the meantime /
fix it as you need it.

 * If there is anything from the spreadsheet's keywords that interests
you, then read up on the biology of a few packages mentioned as "workflow
packages" (meaning packages that produce something the biologists would
like to put into the results section of their paper), look at the
respective documentation, see if this builds, and follow a tutorial if
one exists. And then we need to learn, still, how we
can make some noise about this such that biologists find the tutorial
for self-education - and/or find you as someone who can help to get this
running on their data (or help finding someone who then helps).

 * There are different kinds of packages that may be important for
Debian, also for Debian's acceptance in the bioinformatics world:

   A) housekeeping packages (I just made this name up as a pun on
housekeeping genes) that are just expected to be available. I have
probably marked these in red in the leftmost column of the
spreadsheet. It is the kind of package I go for when I am feeling a bit
down and want a quick success.

  MEME (Others) - a classic
  bbtools (Others and bulk RNA-seq) - we may already have part of
that in the distribution - I was/am a bit confused, still - is this
redundant with bbmap?

   B) the "columns". These are representatives of the software biologists
are likely to need to go from raw data to a publication without anybody
missing anything. My priorities here are

  virus tab:
    1st and foremost: artic fieldbioinformatics - this uses the
nanopore to tell what ebola/sars strain you have - this may be as close
to the pandemic as we can possibly get. Since I work at a University
Hospital I think I am allowed to feel positive about finding someone
to field-test our fieldtest package once this is completed.
There is the original artic implementation and a reimplementation with
the nextflow workflow. Whatever we get to first, I tend to think.
Confusing? That is why we need the bio.tools folks - it is too much for
our tasks list (and for bio.tools, still :-) ).

      Single-Cell RNA-seq - all of them, preferably
      bulk RNA-seq - BioConductor,  pigx-rnaseq - is mostly there
      nanopore - it is the sequencing technology that is closest to us -
I actually own half of one, Jun has a complete one :) It is used in the
field to genotype viruses - today - it is too young to have a perfect
pipeline for it, yet, I tend to think. And the device is used so very
heterogeneously. Things get updated very frequently everywhere and so
this is more like a "let's see what is going to be used"-kind of
situation for me at the moment.

There are some tools that block many columns from being completed. Worth
mentioning in particular are the workflow engines, and among these it is
nextflow that seems to be a beast to package. So, yes, Nilesh,
please, nextflow out of the way would be a big help.

   A^B) the packages that have a direct application to virology/drug
development and are mostly singular applications - look at what
OpenPandemics' Forli lab and colleagues are giving us
https://forlilab.org/ . My picks are

    AutoDock-CrankPep (Docking/Structures) since oligopeptides are a
common tool to fish for antibodies, so you want to have something to
model that.

  and sometimes it is "community forming" and "technical curiosity" that
triggers me as for
    cmdock (Docking/Structures)
    autodock-gpu (Docking/Structures)
  which would be seen by all the BOINC-people. But who would not go
through their website and dream a bit?


There are other sheets that are like

 anti-A: Packages that nobody expects, yet. "Synthetic Biology" (the
next thing for a while already) or "Molecular Tumor Boards" (the next
thing for even longer (like 25 years since microarrays came around) that
are now emerging). I think I put this up mostly to have a place to put
them, not really thinking that this is something that needs to go into
the distribution asap.

And there are sheets that do not exist yet, which I would like to take
care of if days had just a few more hours, such as proteomics or mass
spec. We are completely blank on ontologies and how these

Re: python3-scanpy 1.6.0 patched, could you take a look?

2021-03-23 Thread Steffen Möller
Hi all, hi Robbi in particular,

On 23.03.21 at 09:16, Andreas Tille wrote:
> On Tue, Mar 23, 2021 at 12:22:49AM -0700, Nilesh Patra wrote:
>> Makes sense. Since you think we should be extra picky about it, do you
>> think we should add it to our policy somehow, or maybe add it to the FAQ
>> or some page that is easily accessible and/or handy?
> Yes, that makes sense.  It would be great if you could do so.

I looked at a few tutorials and it seems like the Python shell is used
exclusively, so, I am fine either way.

The name scanpy is a bit unfortunate because sc-analysis abbreviated to
"scan" does not actually scan. At least the module name is "scanpy" so it is
python(3)-scanpy, not python(3)-scan, which would be problematic. So
funny, right? Sigh. But then again, they have a Genome Biology paper
with that and for scVelo they are in Nature Biotechnology - they improved
both the naming and the journal, just kidding, Genome Biology is
already very nice. I just added scVelo to the Excel table :) Maybe
scVelo degrades scanpy from workflow to a package ... need to think
about that, but since many scRNAseq packages do not yet offer these
"arrows in the diagram", it would seem unfair.

So, Robbi, well done. Please add yourself to the uploaders list. And I
am more than happy to have you as the active maintainer of scanpy. I
have just extended the d/u/metadata info - guix also has it as
python-scanpy, btw.  Andreas, may I ask you to sponsor scanpy, I mean
python-scanpy?

Best,

Steffen