Hi Stephanie,

The error is thrown from SeqArray:::.info at line 216 in the file and is related to the handling of NA values.

  x[x == ""] <- NA

Output from the == comparison can contain NAs and therefore can't be used (consistently) in subsetting operations.

'x' is a NumericList.

Browse[2]> x
NumericList of length 5
[[1]] 0.5
[[2]] 0.017000000923872
[[3]] 0.333000004291534 0.666999995708466
[[4]] <NA>
[[5]] <NA> <NA>

Here we see NAs returned for the NA values,

Browse[2]> x == ""
LogicalList of length 5
[[1]] FALSE
[[2]] FALSE
[[3]] FALSE FALSE
[[4]] <NA>
[[5]] <NA> <NA>

which fail on subsetting.

Browse[2]> x[x == ""]
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs

One solution is to use %in%, which does not return NAs.

Browse[2]> x %in% ""
LogicalList of length 5
[[1]] FALSE
[[2]] FALSE
[[3]] FALSE FALSE
[[4]] FALSE
[[5]] FALSE FALSE
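
The difference between == and %in% can also be seen with plain base R vectors. This is only a minimal sketch, not the SeqArray code: 'v' and 's' are made-up stand-ins for a single list element. Note that base R tolerates NAs in a logical subscript when a single value is assigned, so the error itself is specific to the List subsetting code.

```r
# Made-up stand-in for one numeric list element containing an NA.
v <- c(0.5, NA, 0.333)

eq   <- v == ""     # == propagates NA for the NA element
isin <- v %in% ""   # %in% is built on match() and never returns NA

print(eq)
print(isin)

# Made-up character vector that really contains an empty string.
s <- c("a", "", NA)

# The index from %in% is NA-free, so it is safe to use in replacement.
s[s %in% ""] <- NA
print(s)
```

Applied to the line in .info, the change would presumably read x[x %in% ""] <- NA.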


Valerie


On 11/22/2013 03:11 PM, Stephanie M. Gogarten wrote:
Hi Valerie,

The asVCF method in SeqArray is failing as of today with a (to me)
mysterious error.  I get it for the test files chr22.vcf.gz, ex2.vcf,
and gl_chr1.vcf in extdata of VariantAnnotation, but not for
SeqArray/extdata/CEU_Exon.vcf.  Do you have any suggestions of where I
might look to figure out where this error is coming from?

thanks,
Stephanie

 > vcffile <- system.file("extdata", "ex2.vcf",
package="VariantAnnotation")
 > gdsfile <- tempfile()
 > seqVCF2GDS(vcffile, gdsfile)
 > gdsobj <- seqOpen(gdsfile)
 > options(error=recover)
 > vcfg <- asVCF(gdsobj)
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs

Enter a frame number, or 0 to exit

  1: asVCF(gdsobj)
  2: asVCF(gdsobj)
  3: .local(x, ...)
  4: VCF(rowData = .rowData(x), colData = .colData(x), exptData =
SimpleList(hea
  5: .info(x, info)
  6: `[<-`(`*tmp*`, x == "", value = NA)
  7: `[<-`(`*tmp*`, x == "", value = NA)
  8: lsubset_List_by_List(x, i, value)
  9: .fast_lsubset_List_by_List(x, i, value)
10: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
11: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
12: extractROWS(setNames(seq_along(x), names(x)), i)
13: extractROWS(setNames(seq_along(x), names(x)), i)
14: normalizeSingleBracketSubscript(i, x)

 > sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] VariantAnnotation_1.8.6 Rsamtools_1.14.1        Biostrings_2.30.1
[4] GenomicRanges_1.14.3    XVector_0.2.0           IRanges_1.20.6
[7] BiocGenerics_0.8.0      SeqArray_1.2.0          gdsfmt_1.0.0

loaded via a namespace (and not attached):
  [1] AnnotationDbi_1.24.0   Biobase_2.22.0         biomaRt_2.18.0
  [4] bitops_1.0-6           BSgenome_1.30.0        DBI_0.2-7
  [7] GenomicFeatures_1.14.2 RCurl_1.95-4.1         RSQLite_0.11.4
[10] rtracklayer_1.22.0     stats4_3.0.2           tools_3.0.2
[13] XML_3.95-0.2           zlibbioc_1.8.0

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
