[Bioc-devel] SRAdb::sraConvert returns results in arbitrary order

2016-11-17 Thread Ryan C. Thompson
Hello, I was recently bitten by an unexpected behavior in the sraConvert function from the SRAdb package. I wanted to fetch the other SRA IDs associated with the SRX numbers of 32 samples, and I used the sraConvert function to do so. However, I did not realize that sraConvert returns the res
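
A minimal base-R sketch of one way to guard against this kind of ordering problem: reorder the returned rows against the original query with match(). The data frame below is a hypothetical stand-in for a conversion result; its column names and IDs are assumptions for illustration, not actual SRAdb output.

```r
# Hypothetical query IDs, in the order the caller cares about.
query <- c("SRX003", "SRX001", "SRX002")

# Hypothetical stand-in for a result that comes back in a different order.
result <- data.frame(
  experiment = c("SRX001", "SRX002", "SRX003"),
  run        = c("SRR101", "SRR102", "SRR103"),
  stringsAsFactors = FALSE
)

# match() gives, for each query ID, the row of `result` that holds it,
# so indexing by it restores the query order.
ordered <- result[match(query, result$experiment), ]
rownames(ordered) <- NULL

# Fail loudly if the alignment is not what we expect.
stopifnot(identical(ordered$experiment, query))
```

This assumes each query ID appears in exactly one row; match() only returns the first hit, so a one-to-many mapping would need merge() or similar instead.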

[Bioc-devel] Moderating "answers" into comments on other answers/comments on the support site?

2015-10-08 Thread Ryan C. Thompson
Hello, On support.bioconductor.org, we've seen a lot of instances of people using the answer box when they should be adding a comment on an existing answer. I don't necessarily blame users, since the "Add your answer" text box is far more obvious than the "add comment" button on an existing an

[Bioc-devel] bumphunter package has unstated dependency on digest package?

2015-08-07 Thread Ryan C. Thompson
Hello, I was recently setting up the latest version of R & Bioc on a system, installing all packages from scratch, and I ran into an error while installing bumphunter. It failed to install because it couldn't load the "digest" package. After installing this package manually, bump

Re: [Bioc-devel] Hunting for the subset generic definition?

2015-07-29 Thread Ryan C. Thompson
From base, according to my R console: > subset standardGeneric for "subset" defined from package "base" function (x, ...) standardGeneric("subset") Methods may be defined for arguments: x Use showMethods("subset") for currently available ones. On 07/29/2015 10:40 AM, Steve Lianoglou wrote:

Re: [Bioc-devel] Bug in frmaTools

2015-07-07 Thread Ryan C. Thompson
t. Best, Matt On Mon, Jul 6, 2015 at 5:42 PM, Ryan C. Thompson <r...@thompsonclan.org> wrote: I also discovered another apparent bug later in the same function. The second to last line of makeVectorsAffyBatch is vers <- ifelse(!is.null(cdfname), as.cha

Re: [Bioc-devel] Bug in frmaTools

2015-07-06 Thread Ryan C. Thompson
o ifelse will have length zero and "ifelse" does NOT do lazy evaluation. On 07/06/2015 12:21 PM, Ryan C. Thompson wrote: Hello, I just encountered a bug in frmaTools that makes it impossible to use on certain array platforms. The following lines in makeVectorsAffyBatch fail on an
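
The length behaviour of ifelse mentioned here can be reproduced in base R. This is a small illustrative sketch, not the frmaTools code: the values are made up, and `pick` is a hypothetical helper showing a scalar if/else alternative.

```r
# The result of ifelse() takes its shape from `test`, not from `yes`/`no`.
cond_empty <- logical(0)                      # e.g. a test on a zero-length input
res_empty  <- ifelse(cond_empty, "yes", "no")
stopifnot(length(res_empty) == 0L)            # a zero-length test gives a zero-length result

# With a length-one test, only the first element of `yes` survives.
res_trunc <- ifelse(TRUE, c("a", "b", "c"), "none")
stopifnot(identical(res_trunc, "a"))

# Hypothetical scalar alternative: plain if/else returns the chosen value whole,
# and isTRUE() treats a zero-length or NA condition as FALSE.
pick <- function(cond, yes, no) if (isTRUE(cond)) yes else no

stopifnot(identical(pick(TRUE, c("a", "b", "c"), "none"), c("a", "b", "c")))
stopifnot(identical(pick(logical(0), "yes", "no"), "no"))
```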

[Bioc-devel] Bug in frmaTools

2015-07-06 Thread Ryan C. Thompson
Hello, I just encountered a bug in frmaTools that makes it impossible to use on certain array platforms. The following lines in makeVectorsAffyBatch fail on an AffyBatch object on the hthgu133pluspm platform: pms <- pm(object) pns <- probeNames(object) pmi <- unlist(pmindex(object

Re: [Bioc-devel] Broken pathway in graphite package?

2015-06-16 Thread Ryan C. Thompson
Ivana On 2015-06-16 21:17, Ryan C. Thompson wrote: Hello, I was attempting to run SPIA through the graphite package and ran into an odd error when running prepareSPIA on the human Reactome pathways. You can reproduce the error simply and quickly by: library(graphite) prepareSPI

Re: [Bioc-devel] Bioconductor Git/GitHub Mirrors

2015-06-16 Thread Ryan C. Thompson
This is great to hear. I sometimes want to delve into the source code of a package's internals, but doing so through the SVN web interface is clunky. Being able to use Github's repo browsing functionality for Bioc packages is great. On 06/16/2015 12:00 PM, Dan Tenenbaum wrote: Dear Bioconduct

[Bioc-devel] Broken pathway in graphite package?

2015-06-16 Thread Ryan C. Thompson
Hello, I was attempting to run SPIA through the graphite package and ran into an odd error when running prepareSPIA on the human Reactome pathways. You can reproduce the error simply and quickly by: > library(graphite) > prepareSPIA(pathways("hsapiens", "reactome")["Insulin receptor signalli

[Bioc-devel] Interoperability between DataFrame and dplyr?

2015-04-23 Thread Ryan C. Thompson
Hi all, So, dplyr is a pretty cool thing, but it currently works with data.frame and data.table, but not S4Vectors::DataFrame. I'd like to change that if possible, and I assume that this would "simply" involve writing some glue code. However, I'm not really sure where to start, and I expect t

Re: [Bioc-devel] Append/combine option for filterFastq and similar?

2015-04-22 Thread Ryan C. Thompson
tq assuming it can read from connections > rather than just files, but I have not tested it to be sure. > > On Wed, Apr 22, 2015 at 1:16 PM, Ryan C. Thompson > <r...@thompsonclan.org> wrote: > > That's not ideal because it's duplicating storag

Re: [Bioc-devel] Append/combine option for filterFastq and similar?

2015-04-22 Thread Ryan C. Thompson
That's not ideal because it's duplicating storage unnecessarily. On 04/22/2015 04:07 AM, Aedin wrote: This is one instance where a system or simple unix command is very easy system('cat *.fastq > all.fastq') --- On Apr 22, 2015, at 6:00, bioc-devel-requ...@r-project.org wrote: Re: Append/co

[Bioc-devel] Append/combine option for filterFastq and similar?

2015-04-21 Thread Ryan C. Thompson
Hello, Often when sequence data is delivered to me, I receive each sample in several input files. Generally I want to get them into a single file ASAP, and the filterFastq step would be a convenient place to do it. Is there any possibility to add some way to append to an output file, or maybe

Re: [Bioc-devel] CRAN package with Bioconductor dependencies

2015-03-02 Thread Ryan C. Thompson
I thought CRAN packages weren't allowed to depend on Bioconductor packages for exactly this reason. On 03/02/2015 03:18 PM, Laurent Gatto wrote: Dear all, I had never realised that CRAN packages that depended on Bioc packages could actually not be installed with install.packages without setti

Re: [Bioc-devel] plotPCA for BiocGenerics

2014-10-31 Thread Ryan C. Thompson
I'd just like to chime in that regardless of what approach is chosen, I definitely would appreciate a way to get the plot data without actually making the plot. I often end up reimplementing plots in ggplot so that I can easily customize some aspect of them, so in such cases I need a way to jus

Re: [Bioc-devel] droplevels method for DataFrame?

2014-10-06 Thread Ryan C. Thompson
olving via a pull request? Would social coding increase external contributions to the infrastructure? On Mon, Oct 6, 2014 at 5:13 PM, Ryan C. Thompson <r...@thompsonclan.org> wrote: Hi, I've just noticed that DataFrame doesn't have a "dropleve

[Bioc-devel] droplevels method for DataFrame?

2014-10-06 Thread Ryan C. Thompson
Hi, I've just noticed that DataFrame doesn't have a "droplevels" method, but "data.frame" does. In fact, "droplevels.data.frame" seems to work just fine on DataFrame objects. Could this be added? -Ryan > sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-unknown-linux-gnu (64-bit)
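
For reference, this is the base-R data.frame behaviour the request refers to, sketched on a made-up data frame (it does not exercise DataFrame itself):

```r
# A small made-up data frame with a three-level factor column.
df <- data.frame(group = factor(c("a", "b", "c", "a")), value = 1:4)

# Subsetting rows keeps the unused level "c" on the factor.
sub <- df[df$group != "c", ]
stopifnot(identical(levels(sub$group), c("a", "b", "c")))

# droplevels() on the data.frame removes unused levels from every factor column.
sub <- droplevels(sub)
stopifnot(identical(levels(sub$group), c("a", "b")))
```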

Re: [Bioc-devel] Please bump version number when committing changes

2014-09-05 Thread Ryan C. Thompson
Hi all, Just to throw in a suggestion here, I know that many people use a tool like git-svn in this kind of situation. They want the ability to make multiple small commits in order to save their progress, but they don't want those commits visible until they are ready to push all at once. This

Re: [Bioc-devel] Additional summarizeOverlaps counting modes for ChIP-Seq

2014-08-05 Thread Ryan C. Thompson
ads function should be required to only take one argument, or else the method of passing through additional arguments to it should be documented. -Ryan On Tue 05 Aug 2014 05:12:41 PM PDT, Ryan C. Thompson wrote: Hi Valerie, I got really busy around May and never got a chance to thank you f

Re: [Bioc-devel] Additional summarizeOverlaps counting modes for ChIP-Seq

2014-08-05 Thread Ryan C. Thompson
fix, use.names=use.names, ignore.strand=ignore.strand)) ov <- findOverlaps(features, reads, type=type, ignore.strand=ignore.strand, maxgap=maxgap, minoverlap=minoverlap) if (inter.feature) { ## Remove reads that overlap multiple features. reads_to_keep <- which(countSubjectHits(ov) == 1L) ov

[Bioc-devel] Possible bug in edgeR::aveLogCPM.default?

2014-07-11 Thread Ryan C. Thompson
Hello, I think I may have found a bug in the code for aveLogCPM.default in the edgeR package. Near the end of the function, the variable "prior.count.scale" is conditionally assigned to, and never subsequently used. I assume this is a typo and the variable name is supposed to be "prior.count.

Re: [Bioc-devel] "nearest" & related methods for GRangesList & friends?

2014-05-23 Thread Ryan C. Thompson
M PDT, Hervé Pagès wrote: Hi Ryan, On 05/22/2014 03:38 PM, Ryan C. Thompson wrote: Hello, I recently found myself in want of a nearest method that handles GRangesList objects. Is there any plan to add one? Not that I know of. I guess most of the times it's probably good enough to call range

[Bioc-devel] "nearest" & related methods for GRangesList & friends?

2014-05-22 Thread Ryan C. Thompson
Hello, I recently found myself in want of a nearest method that handles GRangesList objects. Is there any plan to add one? I just want to define "nearest" for elements of a GRangesList by the shortest distance between any query range and any subject range. Obviously I can do this by unlisting

Re: [Bioc-devel] Additional summarizeOverlaps counting modes for ChIP-Seq

2014-04-30 Thread Ryan C. Thompson
04/30/2014 01:06 PM, Ryan C. Thompson wrote: Hi all, I recently asked about ways to do non-standard read counting in summarizeOverlaps, and Martin Morgan directed me toward writing a custom function to pass as the "mode" parameter. I have now written the custom modes that I require for c

[Bioc-devel] Additional summarizeOverlaps counting modes for ChIP-Seq

2014-04-30 Thread Ryan C. Thompson
Hi all, I recently asked about ways to do non-standard read counting in summarizeOverlaps, and Martin Morgan directed me toward writing a custom function to pass as the "mode" parameter. I have now written the custom modes that I require for counting my ChIP-Seq reads, and I figured I would c

[Bioc-devel] Missing seqinfo method for BamFileList?

2014-04-25 Thread Ryan C. Thompson
Hi all, I noticed that seqinfo works on BamFile objects, but not on BamFileList objects. For BamFileList, it does not throw an error, but rather uses the inherited method for "List", which does not return a useful result for BamFileList. I suggest the following implementation of a useful

[Bioc-devel] Bug in les:::cdfDuplicates

2014-03-24 Thread Ryan C. Thompson
Hello, I have discovered a bug in the cdfDuplicates function in the les package. This function is used indirectly by the GSRI package, and I was attempting to use this package when I encountered an error. The error appears to occur because both rle and table are used to deduplicate a (sorted)

Re: [Bioc-devel] 'droplevels' argument in `[` method for SummarizedExperiment?

2014-03-12 Thread Ryan C. Thompson
I would prefer the droplevels method for SummarizedExperiment, since this is consistent with the use of droplevels on data.frame objects. On Wed 12 Mar 2014 03:02:37 PM PDT, Wolfgang Huber wrote: Hi Martin, Mike a DESeq2 user brought up the observation that when he subsets a ‘DESeqDataSet’ ob

Re: [Bioc-devel] edgeR crashes when xlsxjars is loaded

2013-12-16 Thread Ryan C. Thompson
Indeed, loading rJava and calling .jinit() also triggers the bug. I have updated my script (same URL as before) to demonstrate this. I run the bad code before and after calling .jinit(), and it only crashes the second time. On Mon 16 Dec 2013 02:30:34 PM PST, Simon Urbanek wrote: On Dec 16, 2

Re: [Bioc-devel] edgeR crashes when xlsxjars is loaded

2013-12-16 Thread Ryan C. Thompson
By the way, here is my sessionInfo after a successful run of the bug script (without loading xlsxjars): sessionInfo() R version 3.0.2 (2013-09-25) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UT

[Bioc-devel] edgeR crashes when xlsxjars is loaded

2013-12-16 Thread Ryan C. Thompson
Hello, I have found an issue where having the xlsxjars package loaded kills the entire R session with a segfault when "edgeR::estimateDisp" is called on my dataset. The issue seems to be specific to my data, since a random integer matrix of identical dimension does not trigger the bug. Other

Re: [Bioc-devel] subread-buildindex fails on genome with many scaffolds?

2013-11-26 Thread Ryan C. Thompson
Actually, scratch that. I just tried running subread-buildindex on a file with only the 20-ish chromosome sequences, and it didn't give the message about 5 sections, but it still crashed with "Killed" and exit code 137. On Tue 26 Nov 2013 05:01:48 PM PST, Ryan C. Thompso

[Bioc-devel] subread-buildindex fails on genome with many scaffolds?

2013-11-26 Thread Ryan C. Thompson
Hello, I'm trying to test out subjunc for mapping my RNA-seq data to the cynomolgus monkey genome, but when I try to build the index with subread-buildindex, I get the error: "There are too many sections in the chromosome data files (more than 5 sections)." and then after "Building the

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Ryan C. Thompson
Just a note: the foreach package has solved this by providing a "nesting" operator, which effectively converts multiple nested foreach loops into one big one: http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf On Thu 14 Nov 2013 09:24:29 AM PST, Michael Lawrence wrote: I like

Re: [Bioc-devel] Splitting design matrix contrast pivoting logic into a separate function?

2013-09-01 Thread Ryan C. Thompson
atics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3052, Australia. http://www.statsci.org/smyth On Thu, 29 Aug 2013, Ryan C. Thompson wrote: Hi Gordon, I am currently working on some code that would benefit from having direct access to the co

[Bioc-devel] Splitting design matrix contrast pivoting logic into a separate function?

2013-08-29 Thread Ryan C. Thompson
Hi Gordon, I am currently working on some code that would benefit from having direct access to the code in edgeR::glmLRT that handles reparametrizing the design matrix so that the N linearly independent components of the specified contrasts are represented in the first N columns of the design

[Bioc-devel] Bug in voom/cbind for EList

2013-06-04 Thread Ryan C. Thompson
Hi Gordon, I'm implementing the "voom-by-group" procedure that you suggested previously, and I'm running into a small snag: when I call voom several times to get several EList objects, I then need to cbind them together. However, the design and lib.size components of the ELists are not merged

Re: [Bioc-devel] Combining Ordinary List of GRanges Optimisation

2013-01-07 Thread Ryan C. Thompson
With such a huge difference, I would wonder if the "c" method for GRanges objects is doing N-1 pairwise merges instead of a single N-way merge. On Mon 07 Jan 2013 09:08:28 AM PST, Michael Lawrence wrote: Would be interesting to do some profiling. Could be the merging of the sequence info, or t
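
The distinction between N-1 pairwise merges and one N-way merge can be sketched with plain vectors standing in for GRanges objects. This only illustrates the two call patterns; it says nothing about how the GRanges "c" method is actually implemented.

```r
# A list of N small pieces (plain integer vectors as a stand-in).
pieces <- lapply(1:1000, function(i) seq_len(10) + i)

# N-1 pairwise merges: each step copies everything accumulated so far.
pairwise <- Reduce(c, pieces)

# One N-way merge: a single call that sees all the pieces at once.
nway <- do.call(c, pieces)

# Both routes give the same result; only the amount of copying differs.
stopifnot(identical(pairwise, nway))

# Timings are machine-dependent, so they are printed rather than asserted.
print(system.time(Reduce(c, pieces)))
print(system.time(do.call(c, pieces)))
```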

Re: [Bioc-devel] Combining Ordinary List of GRanges Optimisation

2013-01-06 Thread Ryan C. Thompson
Hi Dario Strbenac, Are you asking if you can rewrite your code to work faster, or are you asking if the BioC devs need to improve the code to be faster? As a first test, I would try a few alternatives to see if they are significantly faster. One would be "unlist(GRangesList(blockRanges))". An

Re: [Bioc-devel] Making hypothesis testing easier with design matrices?

2012-12-11 Thread Ryan C. Thompson
Dear Gordon, On 12/10/2012 11:06 PM, Gordon K Smyth wrote: I don't see a proposal below, only a question. Yes, I ended up not really proposing anything because I realized that I didn't really have anything that improves on linear modeling. But see below for what I was trying to get at. Wha

Re: [Bioc-devel] Making hypothesis testing easier with design matrices?

2012-12-10 Thread Ryan C. Thompson
Dear Gordon, After a bit of pen-and-paper work, I see what you mean about additive models. I constructed a simple 2x2 additive model (i.e. "~a+b" where a and b each have 2 levels) and tried to solve for all 4 groups, and found that it was impossible. The best that can be done is solving for tw
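
The pen-and-paper observation can be checked with a small base-R sketch: a 2x2 additive design has only three coefficients for four groups, whereas the model with an interaction term has four. The factor level names below are made up.

```r
# One observation per cell of a 2x2 layout (level names are illustrative).
a <- factor(c("a1", "a1", "a2", "a2"))
b <- factor(c("b1", "b2", "b1", "b2"))

# Additive model: intercept, one a effect, one b effect.
additive <- model.matrix(~ a + b)
stopifnot(ncol(additive) == 3L, qr(additive)$rank == 3L)

# Adding the interaction gives one coefficient per group.
full <- model.matrix(~ a * b)
stopifnot(ncol(full) == 4L, qr(full)$rank == 4L)
```

With three coefficients, the four group means are tied together by one constraint, which is why they cannot all be solved for independently under the additive model.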

[Bioc-devel] Making hypothesis testing easier with design matrices?

2012-12-10 Thread Ryan C. Thompson
Hi Gordon and list, I've been thinking about how to make it easier to specify what hypotheses one wants to test in microarray or RNA-seq differential expression data sets, and I think one of the major stumbling blocks that confuses people is the way in which design matrices must have one coef

[Bioc-devel] Does it matter whether I pool technical replicates for edgeR

2012-12-06 Thread Ryan C. Thompson
Hi all, I'm working with an RNA-seq dataset where every biological sample has two technical replicates. Is it important for me to merge the technical replicates so that my count matrix has exactly one column per biological sample, or is it ok to leave them separate? My worry would be that leavin

Re: [Bioc-devel] Reading FASTQ/BAM from open file handle?

2012-12-04 Thread Ryan C. Thompson
Perfect, that's just what I wanted for Fastq files. Is there no R facility for reading unindexed bam? On Tue 04 Dec 2012 02:47:56 PM PST, Martin Morgan wrote: On 12/04/2012 01:27 PM, Ryan C. Thompson wrote: Hi all, I'm currently experimenting with using quip (https://github.com/dc

Re: [Bioc-devel] BiocParallel -- update

2012-12-04 Thread Ryan C. Thompson
ction that is to mapply as mclapply is to lapply. I plan to implement a param-generic version called bpmapply, which may become the backend for bpvectorize. On Tue 04 Dec 2012 01:15:24 PM PST, Michael Lawrence wrote: On Tue, Dec 4, 2012 at 12:47 PM, Ryan C. Thompson mailto:r...@thompsonclan.o

[Bioc-devel] Reading FASTQ/BAM from open file handle?

2012-12-04 Thread Ryan C. Thompson
Hi all, I'm currently experimenting with using quip (https://github.com/dcjones/quip#readme) to save disk space when storing FASTQ and BAM files. One thing that would be nice is to read quip-compressed FASTQ or BAM files directly into R. Obviously direct support for reading quip compression w

Re: [Bioc-devel] BiocParallel -- update

2012-12-04 Thread Ryan C. Thompson
One issue that I see is that for some kinds of parallel backends, there may not be any way for "bpworkers" to return something meaningful. For example, a backend that submits jobs to a large cluster may not know exactly how many nodes are in the cluster, and in any case returning the total numb

Re: [Bioc-devel] BiocParallel -- update

2012-12-04 Thread Ryan C. Thompson
On Tue 04 Dec 2012 11:31:59 AM PST, Michael Lawrence wrote: The name "pvec" is not very intuitive. What about "bpchunk"? And since the function passed to bpvectorize is already vectorized, maybe bpvectorize should be bparallelize? I know everyone has different intuitions/preferences when it comes

[Bioc-devel] Utility functions and bug fixes for edgeR/DESeq

2012-11-19 Thread Ryan C. Thompson
In my work, I've developed a few useful functions relating to edgeR and DESeq. First, I wrote a version of glmQLFTest that does not throw errors on zero-count genes. See the comment for details on what exactly it does: ## This version of glmQLFTest excludes genes with zero counts in all ## samp

Re: [Bioc-devel] BiocParallel

2012-11-18 Thread Ryan C. Thompson
On 11/17/2012 08:38 PM, Michael Lawrence wrote: You can use mclapply via parLapply using the fork backend. Oh, well, that's new to me. I guess that was added when multicore and snow were merged into the parallel package? If that's so, then parLapply is perfectly fine to use. However, I will not

Re: [Bioc-devel] Use DataFrame's printing format for data.frames

2012-11-17 Thread Ryan C. Thompson
Actually, my previous post had a small bug in it: it would throw an error when printing a zero-column data frame. The following code fixes this: print.data.frame <- function(df) { if (ncol(df) > 0 && require("IRanges")) { prev.max.print <- getOption("max.print") on.exit(options(max.pri

[Bioc-devel] Use DataFrame's printing format for data.frames

2012-11-17 Thread Ryan C. Thompson
Hi all, I noticed that DataFrame objects have a much faster and more practical printing format than base R's data.frame class. So I wrote a replacement for "print.data.frame" that prints data.frames in the same style as DataFrames. Just stick it in your ~/.Rprofile and your data.frames will m

Re: [Bioc-devel] BiocParallel

2012-11-17 Thread Ryan C. Thompson
On 11/17/2012 02:39 AM, Ramon Diaz-Uriarte wrote: In addition to Steve's comment, is it really a good thing that "all code stays the same."? I mean, multiple machines vs. multiple cores are, often, _very_ different things: for instance, shared vs. distributed memory, communication overhead diff

Re: [Bioc-devel] BiocParallel

2012-11-17 Thread Ryan C. Thompson
In reply to: On 11/16/2012 09:45 PM, Steve Lianoglou wrote: But then you have the situation of multi-machines w/ multiple cores -- is this (2) or (3) here? How do you explicitly write code for that w/ foreach mojo? I guess the answer to that is that you let your "grid engine" (or whatever your

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Ryan C. Thompson
you have to do is register a different backend, which is one line of code to load the new backend and a second one to register it, and the rest of your code stays the same. On Fri 16 Nov 2012 03:24:56 PM PST, Michael Lawrence wrote: On Fri, Nov 16, 2012 at 11:44 AM, Ryan C. Thompson mailto:r

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Ryan C. Thompson
To be more specific, instead of: library(parallel) cl <- ... # Make a cluster parLapply(cl, X, fun, ...) you can do: library(parallel) library(doParallel) library(plyr) cl <- ... registerDoParallel(cl) llply(X, fun, ..., .parallel=TRUE) On Fri 16 Nov 2012 11:44:06 AM PST, Ryan C. Th

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Ryan C. Thompson
I'm not sure I understand the appeal of foreach. Why not do this within the functional paradigm, i.e, parLapply? Michael On Fri, Nov 16, 2012 at 9:41 AM, Ryan C. Thompson <r...@thompsonclan.org> wrote: You could write a %dopar% backend for the foreach package, which would

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Ryan C. Thompson
You could write a %dopar% backend for the foreach package, which would allow any code using foreach (or plyr which uses foreach) to parallelize using your code. On a related note, it might be nice to add Bioconductor-compatible versions of foreach and the plyr functions to BiocParallel if they

Re: [Bioc-devel] BiocParallel

2012-11-15 Thread Ryan C. Thompson
You can probably parallelize the findOverlaps function, but you'd have to write the code yourself, and that code would be mostly bookkeeping code to get the indices right. Maybe there's a case for adding a parallelized findOverlaps function to BiocParallel? You can't parallelize the disjoin op

Re: [Bioc-devel] BiocParallel

2012-11-14 Thread Ryan C. Thompson
I just submitted a pull request. I'll add tests shortly if I can figure out how to write them. On Wed 14 Nov 2012 03:50:36 PM PST, Martin Morgan wrote: On 11/14/2012 03:43 PM, Ryan C. Thompson wrote: Here are two alternative implementations of pvec. pvec2 is just a simple rewrite of pv

Re: [Bioc-devel] BiocParallel

2012-11-14 Thread Ryan C. Thompson
Here are two alternative implementations of pvec. pvec2 is just a simple rewrite of pvec to use mclapply. pvec3 then extends pvec2 to accept a specified chunk size or a specified number of chunks. If the number of chunks exceeds the number of cores, then multiple chunks will get run sequentiall
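
The chunking idea described here can be sketched in base R. This is not the pvec2/pvec3 code from the message; `chunked_apply` and its arguments are hypothetical names, and lapply stands in for mclapply so the sketch runs anywhere.

```r
# Hypothetical helper: apply a vectorized FUN to v in contiguous chunks,
# given either a number of chunks or a maximum chunk size.
chunked_apply <- function(v, FUN, n_chunks = NULL, chunk_size = NULL,
                          apply_fun = lapply) {
  if (length(v) == 0L) return(FUN(v))
  if (!is.null(chunk_size)) n_chunks <- ceiling(length(v) / chunk_size)
  if (is.null(n_chunks)) n_chunks <- 1L
  # Never use fewer than one chunk or more chunks than elements.
  n_chunks <- max(1L, min(as.integer(n_chunks), length(v)))
  # Assign each element to a contiguous chunk of near-equal size.
  chunk_id <- ((seq_along(v) - 1L) * n_chunks) %/% length(v) + 1L
  chunks <- split(v, chunk_id)
  # Process the chunks (sequentially here) and stitch the results together.
  unlist(apply_fun(chunks, FUN), use.names = FALSE)
}

x <- 1:10
res <- chunked_apply(x, sqrt, chunk_size = 3)
stopifnot(identical(res, sqrt(x)))
```

To run the chunks on several cores on a platform that supports forking, one could pass something like `apply_fun = function(X, FUN) parallel::mclapply(X, FUN, mc.cores = 2)`; when there are more chunks than cores, the extra chunks simply wait their turn.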