> On Sat, 7 Nov 2009, Dennis Clarke wrote:
>>
>> Now the first test I did was to write 26^2 files [a-z][a-z].dat in 26^2
>> directories named [a-z][a-z] where each file is 64K of random
>> non-compressible data and then some English text.
>
> What method did you use to produce this "random" data?

I'm using the tt800 method from Makoto Matsumoto, described here:

    http://random.mat.sbg.ac.at/generators/

and then the buffer is filled like this:

    /*
     * Generate the random text before we need it and also
     * outside of the area that measures the IO time.
     * We could have just read bytes from /dev/urandom but
     * you would be *amazed* how slow that is.
     */
    random_buffer_start_hrt = gethrtime();
    if ( random_buffer_start_hrt == -1 ) {
        perror("Could not get random_buffer high res start time");
        exit(EXIT_FAILURE);
    }
    for ( char_count = 0; char_count < 65535; ++char_count ) {
        k_index = (int) ( genrand() * (double) 62 );
        buffer_64k_rand_text[char_count]=alph[k_index];
    }
    /* would be nice to break this into 0x40 char lines */
    for ( p = 0x03fu; p < 65535; p = p + 0x040u )
        buffer_64k_rand_text[p]='\n';
    buffer_64k_rand_text[65535]='\n';
    buffer_64k_rand_text[65536]='\0';
    random_buffer_end_hrt = gethrtime();

That works well.

You know what ... I'm a schmuck. I didn't grab a time-based seed first, so
the generator starts from the same state on every run. All those files with
random text .. have identical twins on the filesystem somewhere. :-P

Damn. I'll go fix that.

>> The dedupe ratio has climbed to 1.95x with all those unique files that
>> are less than %recordsize% bytes.
>
> Perhaps there are other types of blocks besides user data blocks (e.g.
> metadata blocks) which become subject to deduplication? Presumably
> 'dedupratio' is based on a count of blocks rather than percentage of
> total data.

I have no idea .. yet. I figure I'll try a few more experiments to see
what it does and maybe, dare I say it, look at the source :-)

--
Dennis Clarke
dcla...@opensolaris.ca  <- Email related to the open source Solaris
dcla...@blastwave.org   <- Email related to open source for Solaris
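
P.S. A minimal sketch of what the seeding fix might look like, under some
loud assumptions. This is NOT the tt800 code: a small xorshift32 generator
is swapped in so the example is self-contained. The names seed_rng,
next_u32, fill_random_text and time_based_seed are hypothetical, and the
contents of alph are a guess (only its 62 entries are implied by the loop
above). It also picks the index with an integer modulo, which sidesteps
the case where genrand() returns exactly 1.0 and k_index lands on 62, if
that can happen in the real generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Assumed alphabet: 62 alphanumerics. The real alph[] is not shown above. */
static const char alph[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

/* Stand-in generator state (xorshift32, not tt800). Must never be zero. */
static uint32_t rng_state = 2463534242u;

/* Seed the stand-in generator; zero is remapped because xorshift sticks at 0. */
static void seed_rng(uint32_t seed)
{
    rng_state = (seed != 0u) ? seed : 1u;
}

/* One xorshift32 step. */
static uint32_t next_u32(void)
{
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng_state = x;
    return x;
}

/* Wall-clock seed. Two runs within the same second still share a seed. */
static uint32_t time_based_seed(void)
{
    return (uint32_t) time(NULL);
}

/*
 * Fill buf[0..len-1] with random alphanumerics, ending every 0x40-char
 * line with '\n', and put the terminating NUL at buf[len].
 * The caller must supply at least len + 1 bytes.
 * The modulo has a slight bias, which should not matter for test data.
 */
static void fill_random_text(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    for (size_t i = 0; i < len; ++i) {
        if ((i & 0x3fu) == 0x3fu)
            buf[i] = '\n';
        else
            buf[i] = alph[next_u32() % 62u];
    }
    buf[len] = '\0';
}
```

The idea would be to call seed_rng(time_based_seed()) once at startup and
then fill_random_text(buffer_64k_rand_text, 65536) once per file, so each
file continues the sequence instead of restarting it. On Solaris the low
bits of gethrtime() would probably make a finer-grained seed than time().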