Vincent Snijders wrote:
> Jeff Pohlmeyer wrote:
>>>> this kludge is about 25% faster than your perl script
>>>> on my machine....
>>
>>> Nope. It's still more or less twice slower. :-D
>>
>>
>> I guess it depends on the hardware:
>>
>>   % time koleksi.pl   # perl
>>   Word count: 126944
>>   Unique word count: 11793
>>
>>   real 0m1.019s
>>   user 0m0.992s
>>   sys 0m0.028s
>>
>>
>>   % time koleksi   # fpc
>>   Word count:126944
>>   Unique word count:11793
>>
>>   real 0m0.817s
>>   user 0m0.784s
>>   sys 0m0.020s
>>
>>
>> AMD-K6-700 / SuSE-10.3 / Linux-2.6.22 / perl-5.8.8 / fpc-2.2.0
>>
>>
>
> Thanks Jeff, for writing that parser code, I am not good at doing that.
>
> I made it three times as fast on my computer (windows 2000, fpc 2.3.1,
> P4 1.5 GHz) using a hashlist for the unique word count. Using a larger
> textbuf gave an additional 10% speed up:
>
> program project1;
> {$MODE OBJFPC} {$H+}
>
> uses classes, strings, contnrs;
>
> const
>   bufsize = $1FFF;
>
> var
>   f: text;
>   s:ansistring;
>   wc:longint=0;
>   wl:TStringList;
>   uhl: TFPStringHashTable;
>   i,n:LongInt;
>   textbuf: array[0..bufsize-1] of byte;
>
> begin
>   assign(f, 'Koleksi.dat');
>   reset(f);
>   SetTextBuf(f, textbuf, sizeof(textbuf));
>   wl:=TStringList.Create();
>   uhl:=TFPStringHashTable.Create;
>   while not eof(f) do begin
>     readln(f,s);
>     n:=length(s);
>     if (n>0) then begin
>       StrLower(@s[1]);
>       if (s[1]='<') then begin
>         if StrLComp(@s[1], '<title>',7) = 0 then begin
>           delete(s,1,7);
>         end else continue;
>       end;
>       for i:=1 to n do if not (s[i] in ['a'..'z','0'..'9']) then begin
>         if ( s[i] <> '<' ) then begin
>           s[i]:=#10
>         end else begin
>           s[i]:=#0;
>           SetLength(s,StrLen(@s[1]));

Why not SetLength(s,i)? StrLen is _very_ expensive, because it rescans the
string from the start. I don't see how another #0 could occur earlier in
the string.
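
To make the suggestion concrete, here is a minimal sketch comparing the two ways of truncating at the first '<'. The program name, the variable names and the sample string are made up for illustration and are not from the quoted program. One caveat, assuming no #0 occurs before position i: StrLen stops at the #0 written to s[i], so it returns i-1, and the index-based call that yields the same string is SetLength(s,i-1). SetLength(s,i) would keep the #0 as the last character.

```pascal
program truncdemo;
{$MODE OBJFPC} {$H+}

uses strings;

var
  viaStrLen, viaIndex: ansistring;   // hypothetical names, for illustration only
  i: LongInt;

begin
  viaStrLen := 'hello world<b>';     // made-up sample line
  viaIndex := viaStrLen;
  i := Pos('<', viaStrLen);          // position of the first '<'

  // Approach from the quoted program: write a terminator at position i,
  // then let StrLen rescan the string from the start to find it.
  viaStrLen[i] := #0;
  SetLength(viaStrLen, StrLen(@viaStrLen[1]));

  // Index-based alternative: i is already known, so no rescan is needed.
  // i - 1 drops the character at position i, matching the StrLen result.
  SetLength(viaIndex, i - 1);

  writeln('same result: ', viaStrLen = viaIndex);
  writeln('length: ', length(viaIndex));
end.
```

Whether this saving is measurable in the word-count program depends on how long the lines are and how often a '<' appears in them, so it is worth timing both variants on the real data.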