Thanks Val! I've been playing some more with that function and I was wondering if it would make sense to introduce an argument to filter out synthetic exons that are smaller than a given size. For example, in my data I often get 1 bp exons, which are obviously of no interest in the downstream analysis. I'm currently post-filtering, but since others could benefit from this, it might be better to have it directly as a function argument. The only issue is that I can't think of a decent default value: it depends very much on the aligner used and on the kind of sequencing data, so it might have to be set to NULL by default.
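For reference, the post-filtering I have in mind looks roughly like the sketch below. The toy GRanges only stands in for the output of disjointExons() (the coordinates are made up), and minWidth is a hypothetical name for the proposed argument, not something disjointExons() currently accepts.

```r
library(GenomicRanges)

# Toy stand-in for the output of disjointExons(); coordinates are made up.
exons <- GRanges(
  seqnames = "Chr03",
  ranges   = IRanges(start = c(100, 200, 300), end = c(150, 200, 420)),
  strand   = "-"
)

# Hypothetical threshold in bp; NULL would mean "no filtering".
minWidth <- 2L

# Keep only the exonic parts that are at least minWidth bp wide.
filtered <- if (is.null(minWidth)) exons else exons[width(exons) >= minWidth]
```

With minWidth set to NULL the object is returned unchanged, which is why NULL seems like the only safe default.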

Alejandro, I've seen that in DEXSeq you keep only the exons you can test (what you call testable exons). Does that include a size filter? What's your take?

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delho...@embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------

On Aug 2, 2013, at 7:29 AM, Valerie Obenchain wrote:

> These changes are implemented in GenomicFeatures 1.13.26.
>
> Valerie
>
>
> On 08/01/2013 08:45 AM, Nicolas Delhomme wrote:
>> Fantastic!
>>
>> Cheers,
>>
>> Nico
>>
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> Genome Biology Computational Support
>>
>> European Molecular Biology Laboratory
>>
>> Tel: +49 6221 387 8310
>> Email: nicolas.delho...@embl.de
>> Meyerhofstrasse 1 - Postfach 10.2209
>> 69102 Heidelberg, Germany
>> ---------------------------------------------------------------
>>
>>
>>
>>
>>
>> On Jul 31, 2013, at 10:41 PM, Alejandro Reyes wrote:
>>
>>> Dear all,
>>>
>>> No problem from my side, I can adapt DEXSeq to those changes.
>>>
>>> Best regards,
>>> Alejandro Reyes
>>>
>>>> Mike, Alejandro,
>>>>
>>>> I also wonder about getting rid of the 'exonID' metadata column. This is
>>>> redundant with 'exonic_part_number'. Do you have a preference for keeping
>>>> one or the other?
>>>>
>>>> Valerie
>>>>
>>>>
>>>> On 07/31/2013 10:04 AM, Valerie Obenchain wrote:
>>>>> Hi Nico,
>>>>>
>>>>> (Adding Mike and Alejandro.)
>>>>>
>>>>> Because disjointExons() came from DEXSeq I wanted to preserve the
>>>>> behavior for backwards compatibility and familiarity to DEXSeq users.
>>>>> There are a couple of changes I'd like to make so disjointExons() is
>>>>> consistent with the other extractors in GenomicFeatures.
>>>>>
>>>>> (1) Change metadata column names from 'geneNames' and 'transcripts' to
>>>>> 'gene_id' and 'tx_name'.
>>>>>
>>>>> (2) Instead of '+' or ';' to separate gene id's or transcript names,
>>>>> these columns would each be a CharacterList.
>>>>>
>>>>> If Mike and Alejandro are ok with these I'll go ahead and implement them.
>>>>>
>>>>> Valerie
>>>>>
>>>>>
>>>>>
>>>>> On 07/31/2013 06:29 AM, Nicolas Delhomme wrote:
>>>>>> Hej Val, I believe that one is for you :-)
>>>>>>
>>>>>> When using the aggregateGenes=TRUE parameter of the disjointExons
>>>>>> function, the gene names are separated by a "+" character. Is there a
>>>>>> particular reason for that? The reason I'm asking is that in the
>>>>>> "transcripts" column the transcripts ID are separated by a semi-colon
>>>>>> and I was wondering if the "separator" could not be unified - i.e.
>>>>>> using semi-colon for both the geneNames and transcripts column. Here a
>>>>>> visual example of what I mean:
>>>>>>
>>>>>> GRanges with 1 range and 4 metadata columns:
>>>>>>       seqnames             ranges strand |
>>>>>>          <Rle>          <IRanges>  <Rle> |
>>>>>>   [1]    Chr03 [4541747, 4541782]      - |
>>>>>>                                                geneNames
>>>>>>                                              <character>
>>>>>>   [1] Potri.003G035500+Potri.003G035600+Potri.003G035700
>>>>>>                                                            transcripts
>>>>>>                                                            <character>
>>>>>>   [1] PAC:26999771;PAC:26999331;PAC:26999330;PAC:26999332;PAC:26999333
>>>>>>       exonic_part_number      exonID
>>>>>>                <integer> <character>
>>>>>>   [1]                  1        E001
>>>>>>   ---
>>>>>>   seqlengths:
>>>>>>           Chr01        Chr02        Chr03 ...  scaffold_99
>>>>>> scaffold_991
>>>>>>              NA           NA           NA ...
>>>>>> NA           NA
>>>>>>
>>>>>>
>>>>>> What do you say?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Nico
>>>>>>
>>>>>> ---------------------------------------------------------------
>>>>>> Nicolas Delhomme
>>>>>>
>>>>>> Genome Biology Computational Support
>>>>>>
>>>>>> European Molecular Biology Laboratory
>>>>>>
>>>>>> Tel: +49 6221 387 8310
>>>>>> Email: nicolas.delho...@embl.de
>>>>>> Meyerhofstrasse 1 - Postfach 10.2209
>>>>>> 69102 Heidelberg, Germany
>>>>>> ---------------------------------------------------------------
>>>>>>
>>>>>> My sessionInfo()
>>>>>> R version 3.0.1 (2013-05-16)
>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>
>>>>>> locale:
>>>>>>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>>>>>>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>>>>>>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>>>>>>  [7] LC_PAPER=C                 LC_NAME=C
>>>>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>>>>> [8] base
>>>>>>
>>>>>> other attached packages:
>>>>>>  [1] Rsamtools_1.13.26       Biostrings_2.29.14      DEXSeq_1.7.6
>>>>>>  [4] GenomicFeatures_1.13.21 AnnotationDbi_1.23.18   Biobase_2.21.6
>>>>>>  [7] GenomicRanges_1.13.35   XVector_0.1.0           IRanges_1.19.19
>>>>>> [10] BiocGenerics_0.7.3      BiocInstaller_1.11.4
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>>  [1] biomaRt_2.17.2  bitops_1.0-5    BSgenome_1.29.1 DBI_0.2-7
>>>>>>  [5] hwriter_1.3     RCurl_1.95-4.1  RSQLite_0.11.4
>>>>>> rtracklayer_1.21.9
>>>>>>  [9] statmod_1.4.17  stats4_3.0.1    stringr_0.6.2
>>>>>> tools_3.0.1
>>>>>> [13] XML_3.98-1.1    zlibbioc_1.7.0
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioc-devel@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioc-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>
>>
>

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel