Just to acknowledge that I see my name here and am working to remember why the 
change was introduced. It wasn't arbitrary, but I don't remember the full 
context.

Will report back ASAP.

Martin

On 5/11/20, 5:59 PM, "Bioc-devel on behalf of James W. MacDonald" 
<bioc-devel-boun...@r-project.org on behalf of jmac...@uw.edu> wrote:

    There is a bug in the way that OrgDb packages using the NOSCHEMA schema
    figure out which tables to use. It was introduced by some changes that
    Martin made to AnnotationForge last September:

    commit 02749e3779eb5036211d600915506bab86633ea0
    Author: Martin Morgan <martin.mor...@roswellpark.org>
    Date:   Fri Sep 27 12:18:48 2019 -0400

        support go_cc, go_cc_all, etc when making OrgDb from data.frame()s

    The bug can be shown by doing

    library(AnnotationForge)
    example(makeOrgPackage)
    library(org.Tguttata.eg.db)
    select(org.Tguttata.eg.db, head(keys(org.Tguttata.eg.db)), "GO")
    Error in FUN(X[[i]], ...) :
      Two fields in the source DB have the same name.

    This comes from the internal function .deriveTableNameFromField, which
    tries to infer the correct tables to use for the SQL query. There are
    now lots of tables with 'GO' as one of their field names:

    > con <- org.Tguttata.eg_dbconn()
    > z <- grep("go", dbListTables(con), value = TRUE)
    > sapply(z, dbListFields, con = con)
    $go
    [1] "_id"      "GO"       "EVIDENCE" "ONTOLOGY"

    $go_all
    [1] "_id"         "GOALL"       "EVIDENCEALL" "ONTOLOGYALL"

    $go_bp
    [1] "_id"      "GO"       "EVIDENCE"

    $go_bp_all
    [1] "_id"      "GO"       "EVIDENCE"

    $go_cc
    [1] "_id"      "GO"       "EVIDENCE"

    $go_cc_all
    [1] "_id"      "GO"       "EVIDENCE"

    $go_mf
    [1] "_id"      "GO"       "EVIDENCE"

    $go_mf_all
    [1] "_id"      "GO"       "EVIDENCE"

    It's no longer possible to figure out which table to use when a user wants
    data from the 'GO' column. In the past this wasn't a problem because there
    were just two tables (go and go_all) for NOSCHEMA OrgDbs, so it would pick
    the go table.
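
To make the ambiguity concrete, here is a minimal sketch in plain R that needs no database. The field lists are copied from the dbListFields() output above; the helper tables_with_field() is a hypothetical stand-in for illustration only, not the actual code of .deriveTableNameFromField.

```r
# Field layout copied from the dbListFields() output shown above.
sub_fields <- c("_id", "GO", "EVIDENCE")
tables <- list(
  go        = c("_id", "GO", "EVIDENCE", "ONTOLOGY"),
  go_all    = c("_id", "GOALL", "EVIDENCEALL", "ONTOLOGYALL"),
  go_bp     = sub_fields, go_bp_all = sub_fields,
  go_cc     = sub_fields, go_cc_all = sub_fields,
  go_mf     = sub_fields, go_mf_all = sub_fields
)

# Hypothetical helper: names of every table that has `field` as a column.
tables_with_field <- function(field, tables) {
  has_field <- vapply(tables, function(f) field %in% f, logical(1))
  names(tables)[has_field]
}

hits <- tables_with_field("GO", tables)
print(hits)  # more than one table carries a "GO" column, so the field
             # name alone no longer identifies a single table
```

With only the go and go_all tables present, the same lookup returns a single match, which is consistent with the old behaviour described above.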

    An easy fix would be to subset out any but the go and go_all tables as
    part of .deriveTableNameFromField, but then it seems weird to even have
    these other tables. I looked for any instances where anything but the go
    or go_all tables are used and can't find one, which makes me wonder why
    we even have these other tables. So maybe the real easy fix is to just
    back out the changes that Martin made, and maybe even remove the
    subsetted GO tables from the DBSCHEMA packages as well?
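
The first option (ignoring the ontology-specific tables during lookup) could look roughly like the sketch below. It is a standalone illustration under stated assumptions: resolve_table() and the regular expression are hypothetical, the field lists come from the dbListFields() output earlier in this message, and none of this is the real AnnotationForge or AnnotationDbi code.

```r
# Field layout copied from the dbListFields() output earlier in the thread.
sub_fields <- c("_id", "GO", "EVIDENCE")
tables <- list(
  go        = c("_id", "GO", "EVIDENCE", "ONTOLOGY"),
  go_all    = c("_id", "GOALL", "EVIDENCEALL", "ONTOLOGYALL"),
  go_bp     = sub_fields, go_bp_all = sub_fields,
  go_cc     = sub_fields, go_cc_all = sub_fields,
  go_mf     = sub_fields, go_mf_all = sub_fields
)

# Hypothetical resolver: drop the per-ontology tables (go_bp, go_cc, go_mf
# and their *_all variants) before looking for the table that owns `field`.
resolve_table <- function(field, tables) {
  is_subset_table <- grepl("^go_(bp|cc|mf)(_all)?$", names(tables))
  candidates <- tables[!is_subset_table]
  has_field <- vapply(candidates, function(f) field %in% f, logical(1))
  hits <- names(candidates)[has_field]
  if (length(hits) != 1L) {
    stop("cannot resolve a unique table for field: ", field)
  }
  hits
}

print(resolve_table("GO", tables))     # resolves to the go table
print(resolve_table("GOALL", tables))  # resolves to the go_all table
```

A field such as "_id" that is shared by every table still fails to resolve in this sketch, so the filter addresses only the GO column ambiguity and leaves open the question of whether the extra tables are needed at all.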

    Best,

    Jim

    -- 
    James W. MacDonald, M.S.
    Biostatistician
    University of Washington
    Environmental and Occupational Health Sciences
    4225 Roosevelt Way NE, # 100
    Seattle WA 98105-6099

    _______________________________________________
    Bioc-devel@r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/bioc-devel