Hi Jialin,

Thanks for the excellent report. These "show" methods, like many others in Bioconductor, rely on the low-level helper showAsCell(), which was not working properly on data-frame-like or array-like objects with a single column, or on SplitDataFrameList objects.

This should now be addressed. The fix is in S4Vectors 0.14.5 (release) and 0.15.10 (devel). Both should become available via biocLite() in about 24 hours. Let us know if you still see "show" problems after you update.

Thanks,
H.

On 09/28/2017 01:19 AM, Jialin Ma wrote:
Dear all,

I have a package under review at https://github.com/Bioconductor/Contributions/issues/487, in which I would like to use a GRanges with a nested data.frame or DataFrameList to represent the track data internally. However, the default show method does not seem to work well with such structures.

Here is an example of a GRanges in which one meta-column is a one-column data frame:

  gr <- GRanges("chr21", IRanges(1:5, width = 1))
  gr$df <- data.frame(x = 1:5)
  show(gr)
  GRanges object with 5 ranges and 1 metadata column:
  Error in .Method(..., deparse.level = deparse.level) :
    number of rows of matrices must match (see arg 3)

However, if the nested data frame has two columns, it is printed correctly:

  gr <- GRanges("chr21", IRanges(1:5, width = 1))
  gr$df <- data.frame(x = 1:5, y = 11:15)
  show(gr)
  GRanges object with 5 ranges and 1 metadata column:
        seqnames    ranges strand |           df
           <Rle> <IRanges>  <Rle> | <data.frame>
    [1]    chr21    [1, 1]      * |         1:11
    [2]    chr21    [2, 2]      * |         2:12
    [3]    chr21    [3, 3]      * |         3:13
    [4]    chr21    [4, 4]      * |         4:14
    [5]    chr21    [5, 5]      * |         5:15
    -------
    seqinfo: 1 sequence from an unspecified genome; no seqlengths

In some cases, the object is printed with a warning message, but the output is wrong:

  gr <- GRanges("chr21", IRanges(1:5, width = 1), emm = 6:10)
  gr$df <- data.frame(x = 1:5)
  show(gr)
  # The nested df is not printed in the correct format; there is only
  # one column in the nested df.
  GRanges object with 5 ranges and 2 metadata columns:
        seqnames    ranges strand |       emm           df
           <Rle> <IRanges>  <Rle> | <integer> <data.frame>
    [1]    chr21    [1, 1]      * |         6    1,2,3,...
    [2]    chr21    [2, 2]      * |         7    1,2,3,...
    [3]    chr21    [3, 3]      * |         8    1,2,3,...
    [4]    chr21    [4, 4]      * |         9    1,2,3,...
    [5]    chr21    [5, 5]      * |        10    1,2,3,...
    -------
    seqinfo: 1 sequence from an unspecified genome; no seqlengths
  Warning message:
  In (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  :
    row names were found from a short variable and have been discarded

A nested DataFrameList cannot be printed at all:

  DF <- DataFrame(x = 1:2)
  DF$split = split(DataFrame(aa = 1:4), c(1,1,2,2))
  show(DF)
  DataFrame with 2 rows and 2 columns
  Error in dim(object) <- c(nrow(object), prod(tail(dim(object), -1))) :
    invalid first argument
  class(DF$split)
  [1] "CompressedSplitDataFrameList"
  attr(,"package")
  [1] "IRanges"

In the case above, I understand that it is hard to create a short string representation of the nested structure, but I think printing the dimensions of the nested element may be sufficient.

Any comments?

Best,
Jialin

-----------
Session Info:

R version 3.4.1 (2017-06-30)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE Tumbleweed

Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] Biobase_2.37.2        GenomicRanges_1.29.14 GenomeInfoDb_1.13.4
[4] IRanges_2.11.17       S4Vectors_0.15.8      BiocGenerics_0.23.1
[7] magrittr_1.5

r$> DF$split <- DF$split %>% as.list %>% lapply(as.data.frame)
r$> DF
DataFrame with 2 rows and 2 columns
          x   split
  <integer>  <list>
1         1     1,2
2         2     3,4

loaded via a namespace (and not attached):
[1] zlibbioc_1.23.0         compiler_3.4.1          XVector_0.17.1
[4] tools_3.4.1             GenomeInfoDbData_0.99.1 RCurl_1.95-4.8
[7] ulimit_0.0-3            bitops_1.0-6

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319