Hi Thomas,

I get the following error when I try to obtain the feature types using genFeatures():


> library(systemPipeR)
> library(GenomicFeatures)
Loading required package: AnnotationDbi
> txdb <-  makeTxDbFromUCSC(genome = "hg19", tablename = "refGene")
Download the refGene table ... OK
Download the refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .extractCdsLocsFromUCSCTxTable(ucsc_txtable, exon_locs) :
  UCSC data anomaly in 359 transcript(s): the cds cumulative length is
  not a multiple of 3 for transcripts ‘NM_001037501’ ‘NM_001277444’
  ‘NM_001037675’ ‘NM_001271872’ ‘NM_001170637’ ‘NM_001300952’
  ‘NM_015326’ ‘NM_017940’ ‘NM_001271870’ ‘NM_001143962’ ‘NM_001305275’
  ‘NM_001146344’ ‘NM_001300891’ ‘NM_001010890’ ‘NM_001300891’
  ‘NM_001289974’ ‘NM_001291281’ ‘NM_001301371’ ‘NM_016178’
  ‘NM_001134939’ ‘NM_001080427’ ‘NM_001145710’ ‘NM_001291328’
  ‘NM_001271466’ ‘NM_001017915’ ‘NM_005541’ ‘NM_000348’ ‘NM_001145051’
  ‘NM_001135649’ ‘NM_001128929’ ‘NM_001080423’ ‘NM_001144382’
  ‘NM_001291661’ ‘NM_002958’ ‘NM_001005861’ ‘NM_004636’ ‘NM_001005914’
  ‘NM_001290060’ ‘NM_001290061’ ‘NM_001289930’ ‘NM_003715’
  ‘NM_001290049’ ‘NM_001286054’ ‘NM_001286053’ ‘NM_001286052’
  ‘NM_182524’ ‘NM_001075’ ‘NM_00 [... truncated]
> feat <- genFeatures(txdb, featuretype="all", reduce_ranges=TRUE, upstream=1000,
+                    downstream=0, verbose=TRUE)
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
  subscript contains NAs


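This is probably because the tx_type column is NA for every transcript, so the logical subscript built from it contains NAs: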

Browse[2]> tx
GRanges object with 54439 ranges and 3 metadata columns:
                seqnames           ranges strand   |      tx_name
                   <Rle>        <IRanges>  <Rle> |  <character>
      [1]           chr1   [11874, 14409]      +   |    NR_046018
      [2]           chr1   [30366, 30503]      +   |    NR_036051
      [3]           chr1   [30366, 30503]      +   |    NR_036266
      [4]           chr1   [30366, 30503]      +   |    NR_036267
      [5]           chr1   [30366, 30503]      +   |    NR_036268
      ...            ...              ...    ... ...          ...
  [54435] chrUn_gl000228 [112605, 114676]      +   | NM_001306068
  [54436] chrUn_gl000228 [ 29339,  32226]      -   | NM_001005217
  [54437] chrUn_gl000228 [ 29339,  32226]      -   | NM_001286820
  [54438] chrUn_gl000241 [ 14739,  36767]      -   |    NR_132315
  [54439] chrUn_gl000241 [ 16025,  36957]      -   |    NR_132320
                  gene_id     tx_type
          <CharacterList> <character>
      [1]       100287102        <NA>
      [2]       100302278        <NA>
      [3]       100422831        <NA>
      [4]       100422834        <NA>
      [5]       100422919        <NA>
      ...             ...         ...
  [54435]       100288687        <NA>
  [54436]          448831        <NA>
  [54437]          448831        <NA>
  [54438]       100289097        <NA>
  [54439]       102723780        <NA>
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
Browse[2]>  unique(mcols(tx)$tx_type)
[1] NA
debug: tmp <- tx[mcols(tx)$tx_type == tx_type[i]]
Browse[2]>
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
  subscript contains NAs
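
A minimal sketch of the suspected cause in base R. The short vector is a toy stand-in for mcols(tx)$tx_type, and the which() and is.na() guards are only possible workarounds that I am assuming here, not the actual code or fix in genFeatures():

```r
# Toy stand-in for mcols(tx)$tx_type (assumption: the real column is all NA)
tx_type_col <- c(NA_character_, NA_character_, NA_character_)
tx_type <- unique(tx_type_col)   # a single NA, as in the debug output above

# Comparing against NA gives NA for every element, never TRUE or FALSE,
# so a subscript built this way consists only of NAs
idx <- tx_type_col == tx_type[1]
print(idx)

# Hypothetical guard 1: which() drops NA entries and returns integer positions
safe_idx <- which(tx_type_col == tx_type[1])

# Hypothetical guard 2: handle missing transcript types explicitly
na_idx <- is.na(tx_type_col)
```

With a guard like one of these, the subscript passed to tx[...] would no longer contain NAs, though whether transcripts without a tx_type should be kept or skipped is a design decision for the function.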


Here is my sessionInfo():

> sessionInfo()
R Under development (unstable) (2015-10-15 r69519)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] GenomicFeatures_1.23.3     AnnotationDbi_1.33.0
 [3] systemPipeR_1.5.1          RSQLite_1.0.0
 [5] DBI_0.3.1                  ShortRead_1.25.10
 [7] GenomicAlignments_1.7.1    SummarizedExperiment_1.1.0
 [9] Biobase_2.31.0             BiocParallel_1.5.0
[11] Rsamtools_1.23.0           Biostrings_2.39.0
[13] XVector_0.11.0             GenomicRanges_1.21.32
[15] GenomeInfoDb_1.7.1         IRanges_2.5.3
[17] S4Vectors_0.9.5            BiocGenerics_0.17.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.1            lattice_0.20-33        GO.db_3.2.2
 [4] digest_0.6.8           plyr_1.8.3             futile.options_1.0.0
 [7] BatchJobs_1.6          ggplot2_1.0.1          zlibbioc_1.17.0
[10] annotate_1.49.0        Matrix_1.2-2           checkmate_1.6.2
[13] proto_0.3-10           GOstats_2.37.0         splines_3.3.0
[16] stringr_1.0.0          pheatmap_1.0.7         RCurl_1.95-4.7
[19] biomaRt_2.27.0         munsell_0.4.2          sendmailR_1.2-1
[22] rtracklayer_1.31.1     base64enc_0.1-3        BBmisc_1.9
[25] fail_1.3               edgeR_3.13.0           XML_3.98-1.3
[28] AnnotationForge_1.13.0 MASS_7.3-44            bitops_1.0-6
[31] grid_3.3.0             RBGL_1.47.0            xtable_1.7-4
[34] GSEABase_1.33.0        gtable_0.1.2           magrittr_1.5
[37] scales_0.3.0           graph_1.49.1           stringi_1.0-1
[40] hwriter_1.3.2          reshape2_1.4.1         genefilter_1.53.0
[43] limma_3.27.0           latticeExtra_0.6-26    futile.logger_1.4.1
[46] brew_1.0-6             rjson_0.2.15           lambda.r_1.1.7
[49] RColorBrewer_1.1-2     tools_3.3.0            Category_2.37.0
[52] survival_2.38-3        colorspace_1.2-6




--
Thanks and Regards,
Sonali

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
