Thanks, Martin.

This was meant as a feature request/discussion for Biostrings, which is why I 
posted it here. I thought the Biostrings I/O capabilities were behind most other 
FASTA readers on Bioconductor...
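
For anyone finding this thread later, here is a minimal sketch of the route Martin points to in the quoted reply: index each FASTA file once with Rsamtools, then pull out only the wanted records with getSeq. The helper name read_records, the toy file and the index vectors are made up for illustration and are not part of any package. The sketch assumes DNA sequences and that Rsamtools is installed.

```r
library(Rsamtools)  # attaches Biostrings and GenomicRanges as dependencies

# Toy FASTA file standing in for a real genome file (made-up data).
fasta <- tempfile(fileext = ".fa")
writeLines(c(">seq1", "ACGTACGT", ">seq2", "GGGGCCCC", ">seq3", "TTTTAAAA"), fasta)

# Hypothetical helper: read only the records at positions `rec` from one file.
read_records <- function(file, rec) {
  # Build the .fai index once; later calls reuse it.
  if (!file.exists(paste0(file, ".fai"))) indexFa(file)
  fa <- FaFile(file)
  idx <- scanFaIndex(fa)  # GRanges with one full-length range per record
  getSeq(fa, idx[rec])    # fetch just the requested records
}

# Records 1 and 3 from a single file.
seqs <- read_records(fasta, c(1, 3))

# The list form of `rec`: one index vector per file.
files <- c(fasta)
rec <- list(c(2, 3))
per_file <- Map(read_records, files, rec)
```

This still loops over files at the R level, so it is not the single vectorised readXStringSet(files, rec) call asked for, but each file is only read at the indexed positions rather than in full.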

/Thomas


> On 29/01/2015 at 15:45, Martin Morgan <mtmor...@fredhutch.org> wrote:
> 
>> On 01/29/2015 06:41 AM, Thomas Lin Pedersen wrote:
>> Hi
>> 
>> I’m asking whether there are any plans to support random-access reading 
>> of FASTA files, in the sense that the indexes of the sequences to be read 
>> in can be specified up front.
>> 
>> I’m working on a package for comparative microbial genomics, and it would be 
>> a huge speed improvement if it were possible to quickly read in thousands of 
>> sequences distributed across as many files. Currently the proper, vectorised 
>> approach requires all files to be read in at once and then subsetted, but 
>> this can result in XStringSets in the GB range just to access a few 
>> sequences. The slow, un-R way would be to loop through each file (or each 
>> sequence, using skip and nrec to read in only the relevant sequences). 
>> Ideally, I’m looking for an interface like:
>> 
>> readXStringSet(files, rec)
>> 
>> Where rec is either a vector that would index into the XStringSet as if 
>> everything from files had been read in, or a list with the same length as 
>> files, containing the indexes of interest for each file.
> 
> Hi Thomas -- this should really be posted to support.bioconductor.org, but 
> see Rsamtools::FaFile and rtracklayer::TwoBitFile access through getSeq.
> 
> Martin
> 
>> with best wishes
>> 
>> Thomas
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 
> 
> -- 
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> 
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793

