From: Eric Auer <e.a...@jpberlin.de>
Hi Mateusz, > BSUM (by Mateusz Viste) : 6.0s (100%) > CRC32 (by Joe Forster) : 8.5s (70%) .... > MD5 (by Colin Plumb) : 52.9s (11%) > SHA1 (by Colin Plumb) : 85.7s (7%) Entertaining :-) Still you need to find a good balance between speed and collision risk. If you want to find duplicate files, you can first check simply the sizes. For the remaining candidates, I would say BSUM can be useful if your disk is fast and your CPU is slow. If it is the other way round, you feel the extra cost to read the file as second time for a stronger checksum after the quick BSUM says "possible" duplicate. For checking if downloads worked without noise, I would already want something "stronger" than BSUM, such as https://en.wikipedia.org/wiki/Fletcher%27s_checksum or Adler-32, CRC-32 or -64, but for prevention of faked downloads even MD5 and SHA1 are actually too weak today. You could check http://skein-hash.info/sha3-engineering for candidates like groestl.info for fast and quite okay hash/checksum tasks or use the official choice for that, Keccak, which got selected as SHA-3 algorithm... https://en.wikipedia.org/wiki/Skein_(hash_function) is even faster than Groestl but only on modern 64-bit CPU. > BSUM is the fastest, which is no surprise since the algorithm is > extremely simple (4 CPU instructions). The CRC32 computation by Joe > Forster is surprisingly fast as well... If you feel like trying a new DOS project: It would be a very fancy thing to have a disk-backed TEA encrypted disk image based "disk" or a disk-backed COMPRESSED disk image based "disk" driver with some very minimalistic compression algorithm. Example abstraction layer: You could have some array of CLUSTER offsets into the disk image, with units in the order of SECTORS. The image could be pre-compressed with a tool and the disk driver could open it read-only, or you could store all changed clusters in a new offset as soon as they no longer fit into their allocated compressed file, using some offline re-compression process to "defrag" that growth away later. Advantage of using sectors as units of cluster offsets would be extremely fast seeking and the ease of having "small" 16, 24 or 32 bit int as array items. Which you could even store instead of one of the FAT and then hide the change by showing users a copy of the other when they try to access it ;-) https://en.wikipedia.org/wiki/Tiny_Encryption_Algorithm A tiny-amount-of-RAM compression algorithm would be for example run length encoding. LZ variants such as LZO can decompress without needing extra RAM outside the unpack buffer itself. https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Oberhumer Classic exe-packers used algorithms similar to LZ4, which focus on easy data formats for easiest decompression: https://en.wikipedia.org/wiki/LZ4_(compression_algorithm) Nibble-based always feels better than having to scrape up individual bits of some Huffman coded stream even though a decompressor for those is still reasonably small and fast. Various harddisk compressors also used methods in the same style as LZ4 today, so you could say LZ is a real classic. LZO and LZ4 are simple enough to even be used in Linux zram which can swap out RAM to a compresed RAM disk on the fly. > Currently bsum uses an 8K memory buffer to optimize disk reads. Using a > buffer of 64KB increases the overall speed by 10%. Not that much, for a > 700% increase of memory usage. Interesting! Cheers, Eric :-) ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Freedos-user mailing list Freedos-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/freedos-user --- Internet Rex 2.29 * Origin: capcity2.synchro.net - 502/875-8938 (276:10/901) --- Synchronet 3.15a-Linux ListGate 1.3 * Capitol City Online - Frankfort, KY - telnet://capitolcityonline.net ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Freedos-user mailing list Freedos-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/freedos-user