Hi Marek,
Sorry for the time it took for me to get back to you.
I am at the point that I have the kmc package ready to be submitted.
Just doing final tidy ups and automating man page creation for the two
binaries (kmc and kmc_dump).
To do this I use help2man, which has proved great for turning any kind
of usage output into man pages.
I should be done with this by lunch time today.
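A rough sketch of how the man page generation could be scripted for both binaries. It only builds and prints the help2man command lines. The one-line descriptions and the output file names are placeholders I made up for illustration, and it assumes both binaries print usable text for --help and --version, which may not hold without extra help2man options.

```python
import os
import shlex


def help2man_command(binary, description, output):
    """Build the help2man argument list for one binary.

    --no-info, --name and --output are standard help2man options;
    the description and output path are supplied by the caller.
    """
    return [
        "help2man",
        "--no-info",            # omit the pointer to a Texinfo manual
        "--name", description,  # text for the NAME section
        "--output", output,     # write the man page to this file
        binary,
    ]


if __name__ == "__main__":
    # Hypothetical descriptions, not taken from the KMC documentation.
    binaries = [
        ("./kmc", "placeholder description for kmc"),
        ("./kmc_dump", "placeholder description for kmc_dump"),
    ]
    for binary, description in binaries:
        output = os.path.basename(binary) + ".1"
        print(shlex.join(help2man_command(binary, description, output)))
```

Printing the commands rather than running them makes it easy to check them by hand before wiring them into the package build.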
At the moment, for the creation of the kmc Debian package I'm using my
GitHub repository.
I hope that's ok for you for the moment.
Once you guys are done with writing the new version of kmc we can point the
package to look at your own VCS repository.
Let me know if this is OK for now.
On Wed, Nov 12, 2014 at 6:14 PM, Marek Kokot wrote:
> Hello Jorge,
>
> Hope all is well.
>
> It's fine, thanks. How are you? ;-)
>
>
Great!
I'm very well as well. :)
> Thank you for your tests. I have made a couple of tests on my own and
> the results were pretty similar to yours. As I said, it was only a couple of
> tests, but on relatively big data (human genome, about 100 GB gzipped).
>
Cool!
>
> My question is: Have you used HDD or SSD? I have used HDD, for SSD it is
> possible that differences in time may be bigger. I will try to test it for
> SSD.
>
I was running it on HDD.
Did you run it on SSD already? Did you get any significant improvement?
>
> I spoke with Sebastian today and we agreed that it is a good option to
> make compilation without asmlib possible. I will deal with this and let
> you know when it is ready.
>
Awesome!
I'm so glad to hear this.
This will also mean that KMC can be built for chips other than the
Intel ones.
Will this feature be available only with the new version of KMC?
>
> KMC in its current version also has a "boost" dependency, but we are
> thinking of going native, because g++ supports C++11 threads. Other
> functionalities from "boost" that we use are easy to replace with C++11.
> It should make compilation time shorter for those who don't have "boost"
> installed.
>
I see. It does take a while to compile, but dependency on Boost is
definitely not an issue from where I'm sitting.
The package builds properly with the Debian boost, zlib and libbz2
dependencies.
I do appreciate that it does take a while to compile.
>
> In one of earlier e-mails you asked about Agner (asmlib author):
>
> Do you think it would make sense for me to approach him through the Debian
> Med team?
>
> We would prefer to deal with it on our own.
>
> No problem at all.
> If there is something I omitted or I didn't answer any of your
> questions, please let me know.
>
I think you answered everything. I do apologise for stressing you guys out
with these requests, especially at a time when you're concentrating on
writing a new release.
Kind regards,
Jorge Soares
--
*From:* Jorge Sebastião Soares [j.s.soa...@gmail.com]
*Sent:* 11 November 2014 18:00
*To:* Marek Kokot; Sebastian Deorowicz
*Cc:* Debian Med Project List
*Subject:* [KMC + asmlib] KMC Debian package progress
Hi guys,
I had to send this again as I used my Sanger email, but that is not
subscribed to the Debian Med Mailing list.
So here it goes again (please ignore the previous one and respond to
this one).
Hope all is well.
Have you given any thought to my proposal of a compile time option that
won't use asmlib?
I have included the Debian Med team on this email as they are aware of the
packaging of KMC and the whole issue with asmlib.
I have been doing some benchmarking on KMC for the past couple of days.
I have compiled KMC in three ways:
kmc_original - kmc code compiled against the version of asmlib distributed
with KMC (alibelf64.a)
kmc_native - kmc code compiled against the native OS libraries
kmc_js21 - kmc code compiled against the new version of asmlib, compiled on
my machine with my Unix makefile (libaelf64.a)
kmc_exe - the executables provided on your website, which I have also
included in the benchmark
The machine I used for this is a Debian Virtual Machine running on Vagrant.
Here are the architecture details:
vagrant@debian:~$ cat /proc/cpuinfo
processor: 0
vendor_id: GenuineIntel
cpu family: 6
model: 23
model name: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
stepping: 10
microcode: 0x60b
cpu MHz: 1426.514
cache size: 6144 KB
physical id: 0
siblings: 1
core id: 0
cpu cores: 1
apicid: 0
initial apicid: 0
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc
rep_good nopl pni monitor ssse3 lahf_lm
bogomips: 2853.02
clflush size: 64
cache_alignment: 64
address sizes: 36 bits physical, 48 bits virtual
To do the benchmark I used a fastq file that has a fair bit of
contamination (many different kmers). The file is about 227M in size.
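A minimal sketch of how a comparison like this could be timed. It is not the script behind the numbers here. The make_argv helper is hypothetical: in a real run it would return the KMC command line for each build, and the stand-in below only starts a Python interpreter so that the sketch runs anywhere.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def compare(builds, make_argv, repeats=3):
    """Return {build name: fastest wall-clock time in seconds}."""
    return {name: min(time_command(make_argv(name), repeats)) for name in builds}


if __name__ == "__main__":
    builds = ["kmc_original", "kmc_native", "kmc_js21", "kmc_exe"]

    def make_argv(build):
        # Stand-in workload (assumption): replace with the real command
        # line for the given build and the fastq input file.
        return [sys.executable, "-c", "pass"]

    for name, best in compare(builds, make_argv).items():
        print(f"{name}: {best:.3f} s")
```

Taking the fastest of several runs reduces the effect of disk caching and other background noise, which matters on a virtual machine backed by an HDD.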
Here are some of the results:
For a