Michael Alipio wrote: > Hi, > >> ----- Original Message ---- From: Igor Sutton <[EMAIL PROTECTED]> To: >> Mumia W. <[EMAIL PROTECTED]> Cc: Beginners List >> <beginners@perl.org> Sent: Friday, January 19, 2007 4:42:09 PM Subject: Re: >> Searching hash if a given value exists > >> [...] > >> New benchmarks about the subject: > >> foreach_hash_keys: 4 wallclock secs ( 4.40 usr + 0.00 sys = 4.40 CPU) @ >> 227272.73/s (n=1000000) foreach_hash_values: 4 wallclock secs ( 3.46 usr + >> 0.01 sys = 3.47 CPU) @ 288184.44/s (n=1000000) reverse_hash: 6 wallclock >> secs ( 6.85 usr + 0.01 sys = 6.86 CPU) @ 145772.59/s (n=1000000) > Rate reverse_hash foreach_hash_keys foreach_hash_values >> reverse_hash 145773/s -- -36% >> -49% foreach_hash_keys 227273/s 56% -- >> -21% foreach_hash_values 288184/s 98% 27% >> -- > > > Cool! so the winner is foreach_hash_values??
No. I thought this was going to mislead somebody. foreach (values %hash) { return 1 if $_ eq 'house'; } is quicker than foreach (keys %hash) { return 1 if $hash{$_} eq 'house'; } as you would expect, I hope. But using a reverse hash is only slower if you rebuild it in its entirety every time you need to do a lookup. If you use it properly it is many many times faster than the very blunt linear search tool. Look at these benchmarks: use strict; use warnings; use Benchmark qw/cmpthese/; my %hash = ( dog => 'house', pig => 'barn', bird => 'cage' ); my %rhash = reverse %hash; cmpthese(5_000_000, { 'reverse hash' => sub { return exists $rhash{house}; }, 'foreach hash keys' => sub { foreach (keys %hash) { return 1 if $hash{$_} eq 'house'; } return ''; }, 'foreach hash values' => sub { foreach (values %hash) { return 1 if $_ eq 'house'; } return ''; }, } ); **OUTPUT** Rate foreach hash keys foreach hash values reverse hash foreach hash keys 503931/s -- -51% -86% foreach hash values 1022495/s 103% -- -71% reverse hash 3516174/s 598% 244% -- So looping over the values is about twice as fast as looping over the keys and extracting the corresponding values, but accessing a prebuilt reversed hash is about seven times faster. And this is for an unrepresentatively tiny set of data; if there was some real content in the sample the reverse hash would be streets ahead. In summary, my rule is to start by writing a program so that it does things in the most obvious way so as to minimise the risk of errors and bugs. Once it is working, if it also runs quickly enough then the job is done; great. If not, then we must start to optimise the nice clean code and make it dirtier but faster. I think it is clear which of the above is the most transparent solution to checking whether a hash contains a particular value, and in practise it is actually going to be the fastest, so lucky us. > Never knew I could do a foreach (values %hash), I only use "foreach keys" for > a very long time now. I tried searching the documentation for "foreach" but no > avail. Take a look at perldoc -f values and, while you're at it, perldoc -f each HTH, Rob Oh, and a PS. If you don't need the full reversal of a hash, but just a derivative hash containing all the first hash's values as keys, then the hash slice I put in my original solution my %rhash; @rhash{values %hash} = (); is actually faster than my %rhash = reverse %hash; as it doesn't have to duplicate all the original hash keys as well: they are irrelevant. R -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] http://learn.perl.org/