Is there something wrong with this algo ??
#################################################### my @start = (3,4, 15, 23, 29, 34, 37); my @stop = (5,10,20, 29, 33, 36, 37); # Assuming that the start and the stop array have the # same number of elements and each start corresponds with # each stop my $i = 0; my $hash{$start[0]} = $stop[0]; my $refStart = $start[0]; my $refStop = $stop[0]; for ($i = 1; $i <= $#start; $i++) { if ($refStop >= $start[$i]) { $refStop = $stop[$i]; $hash{$refStart} = $stop[$i]; } else { $refStart = $start[$i]; $refStop = $stop[$i]; $hash{$refStart} = $stop[$i]; } } foreach (sort { $a <=> $b } keys %hash) { print $_, " ", $hash{$_}, "\n"; } ###################################################### -----Original Message----- From: Jonathan E. Paton [mailto:[EMAIL PROTECTED]] Sent: Friday, May 31, 2002 1:58 PM To: [EMAIL PROTECTED] Subject: RE: union of times algorithm --- [EMAIL PROTECTED] wrote: > What we have is as stated a union of times. So all I do is take the > timeslices and generate a hash element for each time. I don't check to see > if it is already there, but just set to one. Then I sort the hash down > numerically and total where current minus previous equal 1. Here is a shot: > > use Data::Dumper; > > my @timeslices = ( > [4, 9], > [3, 6], > [2, 4], > [11,12], > [14,14], > [17,19] > ); > > my %MyHashCounts = (); > > for(my $MyId=0;$MyId<=$#timeslices;$MyId++){ > for($MyId1=$timeslices[$MyId][0];$MyId1<=$timeslices[$MyId][1];$MyId1++) > { > $MyHashCounts{$MyId1}=1; > } > } > print Dumper(\@timeslices); > print Dumper(\%MyHashCounts); > > my $MyPrev = -10; > my $MyTotal = 0; > foreach my $MyKeys (sort {$a <=> $b} keys %MyHashCounts) { > printf "%3d ", $MyKeys; > $MyTotal++ if ( ($MyKeys - $MyPrev) == 1 ); > $MyPrev = $MyKeys; > } > printf "\n"; > printf "Total: %4d\n", $MyTotal; > > Output: > 2 3 4 5 6 7 8 9 11 12 14 17 18 19 > Total: 10 Which is wrong, since the 14 sneaked in even though the contribution of [14,14] should be zero. Notice: 0 1 2 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 > | [4, | 9], | | > | [3, 6], | | | | | > |[2,| 4], | | | | | | | > | | | | | | | | | |[11, > | | | | | | | | | | | 12], > | | | | | | | | | | | | |[14, > | | | | | | | | | | | | | 14], > | | | | | | | | | | | | | | | |[17, | | | | | | | | | | | | | | | | | |19] X X X X X X X X X X X = 11 Looking at my algorithm, I can remove the use of the hash. The algorithm you are using is inferiour to the approach I tried (even though BOTH our algorithms are wrong). If I start using numbers in the region of Unix timestamps, (i.e. large) it is an impossible to satisfy memory/CPU hog. The algorithm I use is pictorally: XXXXXXXXXXXXXXXXXXX YYYYYYYYYYYYYYYYYYYYYYY ZZZZZZZZZZZZZZZZZZZZZZ ^ ^ ^ start X ^ start Y end Z ^ start Z end Y ^ end X Sorting results in the following events: ^start X ^start Z ^end X ^start Y ^end Z ^ end Y And we can find the number of overlapping sections now: ACTIVE V 1 ^start X 12 ^start Z 12 ^end X 23 ^start Y 2 ^end Z 3 ^ end Y So, by sorting we can loop through, starting and stopping "active" timeslots. We need to do this in order, otherwise we will get incorrect results. I sort the second column to always get a start event before a stop event. Jonathan Paton ===== $_=q|.,&@$$. ,.@$&@$. .&$$@. ,,$ ....!$_=$p.'&$@.',y'&$@' ..,';for(/\S+/g){ !|.q| .$ .,@, ,$, .,.. @, ,$ ,,@ ..,,.!++$.<22?${'y'.$_}=chr$.+64:[$$=${'y' !|.q| ,@$@&.,. $$$&, ..@&&$,,, $., ...!.$_},$y.=($.=~/22\|26\|3(3\|7)/x?' ' !|.q|. @ ., ,.&,,, , .$..&. .,$ .,,!.$$:"\l$$")]};$y=~/ (.*)/;warn"$1\n" !|.q|. $ .,. .,$$&&$...&., @.,.&@$@ ..|,map{-$|--?$r:$p.=$_}split'!';eval$r __________________________________________________ Do You Yahoo!? Everything you'll ever need on one web page from News and Sport to Email and Music Charts http://uk.my.yahoo.com -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]