I didn’t mean that the package is named “SingleCell”; I was referring to the 
biocViews term. Also, the BUS format is quite different from the 10x molecule 
information file: CellRanger aligns reads to the genome with STAR, whereas a 
BUS file is generated by pseudoalignment to a transcriptome index and records 
the set of transcripts each read is compatible with rather than the gene it 
aligns to.

According to the ExperimentHub vignette on creating a new ExperimentHub 
package, we should contact a Bioconductor team member to upload the data. Does 
that mean I should email one of the core team members directly?

Lambda

On 12/21/18, 3:02 AM, "Bioc-devel on behalf of 
bioc-devel-requ...@r-project.org" <bioc-devel-boun...@r-project.org on behalf 
of bioc-devel-requ...@r-project.org> wrote:

    Today's Topics:
    
       1. Re:  New ExperimentHub resource and some related questions
          (Aaron Lun)
       2. Re:  New ExperimentHub resource and some related questions
          (Shepherd, Lori)
       3. Re: Aliasing `]` breaks BiocCheck::BiocCheck() version 1.18.0
          (Martin Morgan)
       4. Re: Aliasing `]` breaks BiocCheck::BiocCheck() version 1.18.0
          (Tierney, Luke)
       5. Re: Compilation flags, CHECK errors and BiocNeighbors
          (Obenchain, Valerie)
    
    ----------------------------------------------------------------------
    
    Message: 1
    Date: Thu, 20 Dec 2018 12:00:20 +0000
    From: Aaron Lun <infinite.monkeys.with.keyboa...@gmail.com>
    To: bioc-devel <bioc-devel@r-project.org>
    Subject: Re: [Bioc-devel]  New ExperimentHub resource and some related
        questions
    Message-ID: <9bf95433-af04-431b-b71d-62425195d...@gmail.com>
    Content-Type: text/plain; charset="utf-8"
    
    I presume your package is not actually called “SingleCell” (in point 1). 
This would be pretty confusing when compared to the simpleSingleCell package, 
the SingleCellExperiment package, and the SingleCell biocViews term itself. It 
would probably make more sense to call it BUStoolsR or some other appropriate 
pun (e.g., RBUS, which is funniest when it gets to version 3.8.0.).
    
    Also, at first glance, the BUS format seems pretty similar to 10X’s 
molecule information file, for which the DropletUtils package has a series of 
reader functions. You may find some of the code there useful for your package. 
I might also add a readBUS() function to DropletUtils if this turns out to be a 
popular format for droplet data, though TBH the sparse matrix is a much more 
common starting point.
    
    -A
    
    > On 20 Dec 2018, at 01:42, Lu, Dongyi (Lambda) <d...@caltech.edu> wrote:
    > 
    > Hi everyone,
    > 
    > I’m writing a package (biocViews SingleCell) that converts files of the 
BUS format (standing for Barcode, UMI, Set, see 
https://www.biorxiv.org/content/early/2018/11/21/472571) into a sparse matrix 
in R that can be used in Seurat and SingleCellExperiment. In order to write the 
examples and the vignette, I’m also putting the data itself into a package for 
ExperimentHub. The data used here are some mixed human and mouse cells from 
10x. Here are my questions:
    > 
    > 
    >  1.  In the documentation for 
`ExperimentHubData::makeExperimentHubMetadata`, the fields `RDataClass` and 
`DispatchClass` are required. However, this accompanying dataset package is 
meant to download text files (generated by command line tools outside R) to 
disk rather than into the R session, and it’s the job of the SingleCell package 
to convert the text files into a sparse matrix. There is a website documenting 
how the command line tools were used to generate the text files. So is this 
dataset still appropriate for ExperimentHub?
    >  2.  If it is appropriate, then what shall I put in `RDataClass` and 
`DispatchClass`?
    > 
    > Thanks,
    > Lambda
    > 
    >   [[alternative HTML version deleted]]
    > 
    > _______________________________________________
    > Bioc-devel@r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/bioc-devel
    
    
    
    
    ------------------------------
    
    Message: 2
    Date: Thu, 20 Dec 2018 12:05:57 +0000
    From: "Shepherd, Lori" <lori.sheph...@roswellpark.org>
    To: "Lu, Dongyi (Lambda)" <d...@caltech.edu>,
        "bioc-devel@r-project.org" <bioc-devel@r-project.org>
    Subject: Re: [Bioc-devel]  New ExperimentHub resource and some related
        questions
    Message-ID:
        
<mw2pr12mb23645e21836b066c9e38f9ddf9...@mw2pr12mb2364.namprd12.prod.outlook.com>
        
    Content-Type: text/plain; charset="utf-8"
    
    There is a DispatchClass, FilePath, that will download the file and give 
you the path to the file in the cache location rather than loading it into the 
R session. You can then use the file path in whatever read/load/etc. method you 
deem fit.
    
    For RDataClass, I would say either character or matrix, given that there 
will be instructions on how to load the data somewhere in your package.
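
As a rough illustration of the two fields discussed above, here is a minimal 
sketch of one metadata row written to a CSV file. Every value is a placeholder: 
the title, description, and `RDataPath` are hypothetical, and the remaining 
columns that `makeExperimentHubMetadata` requires are left out for brevity.

```r
# Sketch of one metadata row for ExperimentHubData::makeExperimentHubMetadata().
# Every value below is a placeholder (hypothetical), not the real resource.
meta <- data.frame(
  Title = "Example BUS output (placeholder)",
  Description = "Text files produced outside R; placeholder description",
  RDataClass = "character",    # what the user receives: a file path
  DispatchClass = "FilePath",  # download to the cache, return the path
  RDataPath = "myDataPkg/output.sorted.txt",  # hypothetical location
  stringsAsFactors = FALSE
)

# The real metadata file needs the remaining required columns as well
# (see ?makeExperimentHubMetadata); they are omitted here for brevity.
csv <- file.path(tempdir(), "metadata.csv")
write.csv(meta, csv, row.names = FALSE)
```

With `DispatchClass` set to `FilePath`, retrieving the resource should return 
the cached file's path, which the companion package can then pass to its own 
reader.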
    
    
    
    Lori Shepherd
    
    Bioconductor Core Team
    
    Roswell Park Cancer Institute
    
    Department of Biostatistics & Bioinformatics
    
    Elm & Carlton Streets
    
    Buffalo, New York 14263
    
    This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.
    
    
    
    
    ------------------------------
    
    Message: 3
    Date: Thu, 20 Dec 2018 14:17:04 +0000
    From: Martin Morgan <mtmorgan.b...@gmail.com>
    To: "Tierney, Luke" <luke-tier...@uiowa.edu>, "Shepherd, Lori"
        <lori.sheph...@roswellpark.org>
    Cc: bioc-devel <bioc-devel@r-project.org>
    Subject: Re: [Bioc-devel] Aliasing `]` breaks BiocCheck::BiocCheck()
        version 1.18.0
    Message-ID:
        
<mwhpr05mb3582c1f459721640bde93c55f9...@mwhpr05mb3582.namprd05.prod.outlook.com>
        
    Content-Type: text/plain; charset="utf-8"
    
    this comes from `findGlobals()`
    
    > foo <- `[`
    > findGlobals(foo)
    Error in makeUsageCollector(fun, ...) : only works for closures
    > traceback()
    4: stop("only works for closures")
    3: makeUsageCollector(fun, ...)
    2: collectUsage(fun, enterGlobal = enter)
    1: findGlobals(foo)
    
    In the bigger context it is in code that looks for poor 'coding practice', 
in this particular case looking for use of T / F rather than TRUE / FALSE, 
where the logic is to parse each function for use of global variables, and then 
to search for T / F amongst those.
    
    The full traceback when run on the package at 
https://github.com/mtmorgan/PkgA/tree/BiocCheck-sbs
    
    * Checking coding practice...
    Error in makeUsageCollector(fun, ...) : only works for closures
    > traceback()
    9: stop("only works for closures")
    8: makeUsageCollector(fun, ...)
    7: collectUsage(fun, enterGlobal = enter)
    6: findGlobals(value)
    5: FUN(X[[i]], ...)
    4: lapply(objs, FUN = function(obj) {
           value = env[[obj]]
           if (is.function(value)) 
               findGlobals(value)
           else character(0)
       })
    3: findLogicalRdir(pkgname, c("T", "F"))
    2: checkCodingPractice(package_dir, parsedCode, package_name)
    1: BiocCheck::BiocCheck(".")
    
    Martin
    
    On 12/19/18, 8:32 AM, "Bioc-devel on behalf of Tierney, Luke" 
<bioc-devel-boun...@r-project.org on behalf of luke-tier...@uiowa.edu> wrote:
    
        codetools already checks only closures in checkUsageEnv and hence
        checkUsagePackage, so this is an issue on the Bioc side.
        
        Best,
        
        luke
        
        On Tue, 18 Dec 2018, Tierney, Luke wrote:
        
        > Codetools should probably be ignoring those. Will have a look
        >
        > Sent from my iPhone
        >
        >> On Dec 18, 2018, at 6:54 AM, Shepherd, Lori 
<lori.sheph...@roswellpark.org> wrote:
        >>
        >> Can you please open an issue for this so we don't lose track of it -
        >>
        >> https://github.com/Bioconductor/BiocCheck/issues
        >>
        >>
        >>
        >> Lori Shepherd
        >>
        >> Bioconductor Core Team
        >>
        >> Roswell Park Cancer Institute
        >>
        >> Department of Biostatistics & Bioinformatics
        >>
        >> Elm & Carlton Streets
        >>
        >> Buffalo, New York 14263
        >>
        >> ________________________________
        >> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of 
Shian Su <s...@wehi.edu.au>
        >> Sent: Monday, December 17, 2018 8:34:10 PM
        >> To: bioc-devel
        >> Subject: [Bioc-devel] Aliasing `]` breaks BiocCheck::BiocCheck() 
version 1.18.0
        >>
        >> Hi all,
        >>
        >> If you put
        >>
        >> foo <- `[`
        >>
        >> Somewhere in a package, it will trigger
        >>
        >> Error in makeUsageCollector(fun, ...) : only works for closures
        >>
        >> In BiocCheck::BiocCheck() (version 1.18.0). This comes from
        >>
        >> if (typeof(fun) != "closure")
        >>        stop("only works for closures")
        >>
        >> In codetools::makeUsageCollector(), but
        >>
        >>> typeof(`[`)
        >> ## "special"
        >>
        >> Not that it matters for my use-case because I had discovered 
magrittr’s extract alias, but it might be an edge case worth covering, 
especially since the error message is so cryptic.
        >>
        >> Kind regards,
        >> Shian Su
        >>
        
        -- 
        Luke Tierney
        Ralph E. Wareham Professor of Mathematical Sciences
        University of Iowa                  Phone:             319-335-3386
        Department of Statistics and        Fax:               319-335-3017
            Actuarial Science
        241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
        Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
        
        
    
    
    ------------------------------
    
    Message: 4
    Date: Thu, 20 Dec 2018 14:31:47 +0000
    From: "Tierney, Luke" <luke-tier...@uiowa.edu>
    To: Martin Morgan <mtmorgan.b...@gmail.com>
    Cc: "Shepherd, Lori" <lori.sheph...@roswellpark.org>, bioc-devel
        <bioc-devel@r-project.org>
    Subject: Re: [Bioc-devel] Aliasing `]` breaks BiocCheck::BiocCheck()
        version 1.18.0
    Message-ID: <alpine.DEB.2.21.1812200829080.3478@luke-Latitude-7480>
    Content-Type: text/plain; charset="utf-8"
    
    That's where the error is signaled, but the issue is in
    
    > 4: lapply(objs, FUN = function(obj) {
    >       value = env[[obj]]
    >       if (is.function(value))
    >           findGlobals(value)
    >       else character(0)
    >   })
    > 3: findLogicalRdir(pkgname, c("T", "F"))
    
    Change is.function(value) to typeof(value) == "closure" and you should be 
OK.
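
A minimal sketch of that change, assuming only base R and codetools. The 
helper `globals_of` is a hypothetical stand-in for the anonymous function in 
the traceback, not the actual BiocCheck code:

```r
library(codetools)  # ships with R as a recommended package

foo <- `[`                      # a "special", not a closure
bar <- function(x) if (T) x     # a closure that uses the global T

# Hypothetical stand-in for the loop in the traceback: collect globals
# only from closures, and return character(0) for everything else.
globals_of <- function(value) {
  if (typeof(value) == "closure")
    findGlobals(value)
  else
    character(0)
}

globals_of(foo)           # no error; there is nothing to inspect
"T" %in% globals_of(bar)  # the T / F check still sees closures
```

The point of the sketch is that `is.function()` is also true for specials and 
builtins such as `[`, while `findGlobals()` accepts only closures, so the guard 
has to test the type directly.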
    
    Best,
    
    luke
    
    
    -- 
    Luke Tierney
    Ralph E. Wareham Professor of Mathematical Sciences
    University of Iowa                  Phone:             319-335-3386
    Department of Statistics and        Fax:               319-335-3017
        Actuarial Science
    241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
    Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
    
    ------------------------------
    
    Message: 5
    Date: Thu, 20 Dec 2018 19:52:08 +0000
    From: "Obenchain, Valerie" <valerie.obench...@roswellpark.org>
    To: Aaron Lun <infinite.monkeys.with.keyboa...@gmail.com>,
        "bioc-devel@r-project.org" <bioc-devel@r-project.org>
    Subject: Re: [Bioc-devel] Compilation flags, CHECK errors and
        BiocNeighbors
    Message-ID:
        
<mwhpr1201mb02547c0566b9daf16450cf7bff...@mwhpr1201mb0254.namprd12.prod.outlook.com>
        
    Content-Type: text/plain; charset="utf-8"
    
    The problem is that during the nightly builds, one of the Bioconductor 
    packages writes out a .R/Makevars.win in biocbuild's HOME during R CMD 
    build.
    
    Yesterday I removed the .R/ directory before the builds started and, as 
    expected, today's NodeInfo on tokay2 and packages using C++11 show 
    the correct flags.
    
    If this .R/Makevars.win is not removed, it will (and did in the past) 
    pollute the next build cycle such that the NodeInfo and all packages 
    using C++11 would report/use the wrong flags.
    
    I think I've narrowed down which package is doing this and will contact 
    the maintainer. We'll also implement some sanitation code in the BBS to 
    prevent this from happening again.
    
    The reason HOME is writable is that many applications need to create 
    files (often hidden) such as lock files, cache, config files etc. If 
    they can't, they'll break and they will sometimes break in a subtle way 
    that is not immediately obvious.
    
    One last follow up is to explain why the previous iteration of the 
    NodeInfo on the build report reported the incorrect C++11 flags. The 
    problem there was that previously we were only picking up CXX1XFLAGS 
    instead of the individual CXX11FLAGS, CXX14FLAGS etc.
    
    Thanks for being persistent on this issue and for bringing the 
    conversation to bioc-devel.
    
    Val
    
    
    
    On 12/18/18 8:39 AM, Obenchain, Valerie wrote:
    > The devel build report hasn't posted yet but I took a look at the new
    > compiler flag output Herve implemented. The results show tokay2 is
    > indeed using
    > 
    > CXX11FLAGS: -O3 -march=native -mtune=native
    > 
    > This is inconsistent with what we have in the R/etc/<arch>/Makeconf for
    > both architectures on both tokay1 and tokay2. The Makeconf looks like 
this:
    > 
    > CXX11 = $(BINPREF)g++ $(M_ARCH)
    > CXX11FLAGS = -O2 -Wall $(DEBUGFLAG) -mtune=generic
    > CXX11PICFLAGS =
    > CXX11STD = -std=gnu++11
    > 
    > I don't know why the Makeconf is not being respected on tokay2. I can
    > confirm the inconsistency in an R session -
    > 
    > tokay2:
    > 
    > PS C:\Users\biocbuild\bbs-3.9-bioc\R> ./bin/R CMD config CXX11FLAGS
    > -O3 -march=native -mtune=native
    > 
    > tokay1:
    > 
    > PS C:\Users\biocbuild\bbs-3.8-bioc\R> ./bin/R CMD config CXX11FLAGS
    > -O2 -Wall -mtune=generic
    > 
    > I'll work with Herve to resolve this.
    > 
    > Val
    > 
    > 
    > 
    > On 12/17/18 5:05 PM, Aaron Lun wrote:
    >> Thanks Val. I don’t think it’s a BiocNeighbors thing, as it doesn’t try
    >> to customize the compilation flags or have its own Makevars. Moreover,
    >> the “-O3 -mtune=native -mtune=generic” flags seem to show up on all of
    >> my packages containing C++11 code. Some cursory checks of other packages
    >> suggest that the correct flags (“-O2 -mtune=generic”) are used for C++98
    >> code.
    >>
    >> -A
    >>
    >>> On 17 Dec 2018, at 17:47, Obenchain, Valerie 
<valerie.obench...@roswellpark.org> wrote:
    >>>
    >>> Hi Aaron,
    >>>
    >>> The only compilation flags that are different for tokay1 (release) and
    >>> tokay2 (devel) are C++14 flags. BiocNeighbors is not using C++14 but
    >>> C++11 so I think the changes we discussed previously actually don't
    >>> apply to your case.
    >>>
    >>> All compilation flags we use are listed at the top of the build report,
    >>> e.g., for tokay2:
    >>>
    >>> https://www.bioconductor.org/checkResults/devel/bioc-LATEST/tokay2-NodeInfo.html
    >>>
    >>> I can look into this further but right now I'm not sure where the '-O3
    >>> -march=native -mtune=native' is coming from in the check output for
    >>> BiocNeighbors. We don't use 'native' on the builders for build/check or
    >>> for creating binaries.
    >>>
    >>> Herve might have more insight on this.
    >>>
    >>> Val
    >>>
    >>>
    >>>
    >>>
    >>>
    >>>
    >>>
    >>> On 12/15/18 10:56 PM, Aaron Lun wrote:
    >>>> Sometime between 6-18 November, BiocNeighbors’ BioC-devel builds began 
failing on Windows 64-bit, and have continued to fail since:
    >>>>
    >>>> http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/
    >>>>
    >>>> The most interesting part is the nature of the failures. They are not 
segmentation faults but rather “incorrect” output in the unit tests:
    >>>>
    >>>> - BiocNeighbors uses the Annoy algorithm for approximate nearest 
neighbor search, which is provided as a header-only C++ library in the 
RcppAnnoy package.
    >>>>
    >>>> - I have compiled the BiocNeighbors C++ code with an “#include” for 
these libraries to use the Annoy routines. For testing, I compared the output 
of my C++ code to the output of the code in the RcppAnnoy package.
    >>>>
    >>>> - It is these tests that are failing (i.e., the output does not match 
up) during CHECK on Windows 64-bit only, despite the fact that the same library 
is being “#include”d in both the BiocNeighbors and RcppAnnoy sources!
    >>>>
    >>>> What makes this particularly intriguing is that the differences 
between BiocNeighbors and RcppAnnoy are very minor. Less than 1% of the 
neighbor identities differ, and only for some of the scenarios, so it’s not an 
obvious bug that would be changing the output en masse. Now, the package also 
uses/tests Annoy in BioC-release but builds fine on tokay1:
    >>>>
    >>>> http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/
    >>>>
    >>>> The major difference between the Bioc-release/devel builds is the 
compilation flags, which have changed from “-O2 -mtune=generic” to “-O3 
-march=native -mtune=native” in tokay2. I am told (thanks Val) that the timing 
of this change is consistent with the start of the BiocNeighbors build 
failures on tokay2. I would guess that RcppAnnoy is also compiled with 
“-O2 -mtune=generic” on the CRAN build systems, introducing differences in 
optimization levels between the BiocNeighbors and RcppAnnoy binaries. These 
could be responsible for the discrepancies in the search results.
    >>>>
    >>>> I was able to reproduce this on my Unix cluster (gcc 6.5.0) where 
setting “-march=native” with either “-O3” or “-O2” caused a difference in the 
calculations. After much trial and error, I eventually narrowed this down to 
the “-mfma” flag, which seems to change the precision of multiply-and-add 
operations and thus the search results. This occurs even when AVX support is 
turned off; I guess the compiler tries to be smart if it detects you are doing 
some kind of simultaneous multiply and addition, which is a pretty common thing 
to do when computing Euclidean distances.
    >>>>
    >>>> In summary: can we not use “-march=native” on tokay2? (Val, I know we 
discussed this, but whatever changes you made to the compilation flags don’t 
seem to have propagated to the build machines.) As the case study with 
BiocNeighbors shows, this leads to inconsistencies between the CRAN and 
BioC-devel binaries for the same code, which unnecessarily complicates 
downstream usage and unit tests. I also wonder how binaries specialized for 
tokay2’s architecture would behave on other CPUs with different instruction 
sets, if they would run at all.
    >>>>
    >>>> Cheers,
    >>>>
    >>>> Aaron
    >>>
    >>>
    >>>
    > 
    > 
    > 
    
    
    
    
    ------------------------------
    
    End of Bioc-devel Digest, Vol 177, Issue 17
    *******************************************
