Re: [Bioc-devel] oligonucleotideFrequency Performance Enhancement

2013-02-07 Thread Kasper Daniel Hansen
For reference, Jellyfish is supposed to be state of the art for fast k-mer counting:
http://www.cbcb.umd.edu/software/jellyfish/
Kasper
On Thu, Feb 7, 2013 at 6:51 PM, Hervé Pagès wrote:
> Hi Dario,
>
> On 02/05/2013 05:00 PM, Dario Strbenac wrote:
>> Hello,
>>
>> Would it be possible to i

Re: [Bioc-devel] oligonucleotideFrequency Performance Enhancement

2013-02-07 Thread Hervé Pagès
Hi Dario,
On 02/05/2013 05:00 PM, Dario Strbenac wrote:
> Hello,
>
> Would it be possible to include an option that firstly goes through all of the strings and runs a sliding window along them, to find all the unique k-mers present in the dataset?
Finding the unique k-mers in the dataset can eas

[Bioc-devel] oligonucleotideFrequency Performance Enhancement

2013-02-05 Thread Dario Strbenac
Hello,
Would it be possible to include an option that firstly goes through all of the strings and runs a sliding window along them, to find all the unique k-mers present in the dataset? This would avoid having a sparse matrix with many columns of all zero counts, when a larger value of width i
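
The request above can be illustrated with a minimal sketch, written in Python rather than R for brevity. It is not the Biostrings implementation or API: the function name `observed_kmer_counts` and its arguments are hypothetical, chosen only for this illustration. It slides a window of length `width` along each string, collects the k-mers that actually occur, and builds a count matrix whose columns are only those observed k-mers.

```python
from collections import Counter


def observed_kmer_counts(sequences, width):
    """Hypothetical helper: per-sequence k-mer counts over observed k-mers only."""
    per_sequence = []
    for seq in sequences:
        # Slide a window of length `width` along the sequence, one position at a time.
        # Sequences shorter than `width` yield no k-mers.
        counts = Counter(seq[i:i + width] for i in range(len(seq) - width + 1))
        per_sequence.append(counts)

    # Columns are the union of k-mers seen anywhere in the dataset, in sorted order.
    kmers = sorted(set().union(*per_sequence))

    # One row per sequence; a k-mer absent from a sequence counts as 0.
    matrix = [[counts[k] for k in kmers] for counts in per_sequence]
    return kmers, matrix


kmers, matrix = observed_kmer_counts(["ACGTAC", "ACACAC"], width=2)
print(kmers)   # ['AC', 'CA', 'CG', 'GT', 'TA']
print(matrix)  # [[2, 0, 1, 1, 1], [3, 2, 0, 0, 0]]
```

In this toy input only 5 of the 16 possible dinucleotides occur, so the matrix has 5 columns instead of 16. The saving grows with `width`, since the number of possible k-mers is 4^width while the number observed is bounded by the total length of the input.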