So now you've had Marco's ideas + my 0.1 euro's worth (there are some differences in our approaches + some common ideas) re what the problem is & how you should test what's happening, please let us know what the results of your experiments are...

J

-----------

Now that the list seems to be working again... I'll try posting this again:

> I am wondering why our new PC is not executing our fpc-compiled program
> very much faster than the old one. It was really quite a disappointment:
>
> Old PC: Laptop, Intel PII, 300 MHz, 64 MB. Execution times: 8:30, 2:30 (min:sec)
>
> New PC: Desktop, AMD Duron, 1.6 GHz, 128 MB. Execution times: 5:15, 1:15
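
As a quick sanity check on the quoted numbers, the arithmetic can be sketched in a few lines of Python (used here only for brevity; the program under discussion is Pascal). The sketch assumes the two times in each pair are consecutive intervals of a single run, and the names `to_seconds`, `old_pc`, `new_pc` and `speedups` are made up for illustration.

```python
def to_seconds(stamp: str) -> int:
    """Convert a 'min:sec' string such as '8:30' to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


# Execution times as quoted in the post (min:sec).
old_pc = {"first interval": "8:30", "second interval": "2:30"}
new_pc = {"first interval": "5:15", "second interval": "1:15"}

# Ratio of the quoted clock speeds: 1.6 GHz versus 300 MHz.
clock_ratio = 1600 / 300

# Observed speedup per interval: old time divided by new time.
speedups = {
    name: to_seconds(old_pc[name]) / to_seconds(new_pc[name])
    for name in old_pc
}

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.2f}x faster, versus a {clock_ratio:.2f}x clock ratio")
```

With the quoted times this gives roughly 1.6x for the first interval and exactly 2x for the second, against a clock ratio of about 5.3x.
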

I'm not the processor-crack of the FPC team, but I'll give it a shot. (Jonas and Florian will probably correct/comment on this heavily :-)

I'm afraid you have fallen for MHz-itis, i.e. the idea that the throughput of a processor depends purely on the clock speed of the CPU (in MHz).

Some important things I noticed immediately from your msg:

- There is still a nearly twofold increase (less for the first, exactly twofold for the second).
- You use 4 MB of memory, and I assume from the story that access to it is rather random.
- The Duron has less cache than an Athlon, and the Duron's is probably of about the same magnitude as the P-II's.
- The 4 MB doesn't fit in the cache -> the processor is waiting for memory all the time.

> The new PC ought to be 5 times faster (1600 MHz / 300 MHz, right?

Depends on the job. The memory interface is probably only two times faster (66 MHz <-> 133 MHz) or so, and the cache (which can in some cases "hide" the slower memory) is also hardly larger.

> Of course the speed of the memory is also a factor) but it's not even twice
> as fast.

Which indeed suggests that it is memory bound (together with the problem not being OS dependent; I assume you tried some *nix).

I went from a K6-2 500 to an Athlon 1666 (XP2000+), which is a step of a good factor 3 in clock speed, but the compiler compiles itself more than 3 times as fast.

> The execution time pairs are determined from three time stamps that
> occur during one run of the program. The sequence is as follows:
>
> * Stamp 1
> - Initialize (5-10 secs reading/processing from HD)
> - Process 1 (5-9 mins)
> * Stamp 2
> - Process 2 (1-3 mins)
> * Stamp 3

Since the second process scales better, I assume it approaches memory in a way that can be better

> Both machines are running Win98 Second Edition (could Windows 98 be
> preventing the faster machine from running at full capacity?

Not for pure calculation I think.
Maybe for heavily IO-bound or threaded programs 98 makes a huge difference, but if there is a difference in calculation speed under 98, it won't be more than a few percent (and since NT and unix have more to do in the background, the difference could even be in 98's favour).

> Or perhaps it's because fpc runs in a DOS window, and the DOS mode is forcing it to
> run slow?)
>
> The program is very processor intensive. Only about 4MB of memory space
> is used.

You could try to change the memory usage so that subsequent memory accesses are adjacent in memory, and play with alignments.

You could also try to find/borrow a processor with a large cache (e.g. a P-III Xeon with 2 MB cache would be ideal, but an Athlon MP or even a simple Athlon would be interesting), and do the test on such a machine.

> During runtime, we are doing less than 400 kb of read/write combined to
> the HD. We put about 10 lines of text on the DOS screen to show
> progress. So I can't imagine the I/O could be slowing us down.

Not likely, no.

> I tried compiling with the two different target platforms, but it didn't
> make a difference. Stackchecking is on, but it was on on both computers.

Did you use the same amount of optimization? Maybe you have -OG3p3r or so in the ppc386.cfg on the P-II (which automatically adds the heaviest optimizations), and not on the Duron.

> I also tried a few different bios settings (the computer has ready-made
> bios configurations for "Optimal" and "Best Performance" (?) as well as
> the factory default I started with.) But the compile times were the same
> regardless of the bios settings.

Usually this is a few percent max, not magnitudes.

Action list (in the order that I would do them, from first to last resort):

1. Verify that you use the same degree of optimization.
2. Try on a machine with more cache.
3. Try to rewrite the program to do more accesses to the same block of memory.
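
The effect behind item 3 can be illustrated with a toy model. The following Python sketch (Python used only for brevity) counts misses in a simulated direct-mapped cache for three access patterns over a 4 MB working set: sequential, scattered, and scattered but confined to one block at a time. The cache size (64 KB), line size (64 bytes), element size, block size and all function names are assumptions picked for illustration, not the real parameters of the Duron or the P-II, and a real cache hierarchy is more complicated than this.

```python
import random

LINE_SIZE = 64                   # bytes per cache line (assumed)
CACHE_SIZE = 64 * 1024           # bytes of cache (assumed, hypothetical)
WORKING_SET = 4 * 1024 * 1024    # the "about 4MB" mentioned in the post
ELEM_SIZE = 4                    # bytes per data element (assumed)


def count_misses(addresses, cache_size=CACHE_SIZE, line_size=LINE_SIZE):
    """Count misses of a direct-mapped cache for a stream of byte addresses."""
    n_slots = cache_size // line_size
    slots = [-1] * n_slots       # which memory line each cache slot holds
    misses = 0
    for addr in addresses:
        line = addr // line_size
        slot = line % n_slots
        if slots[slot] != line:  # line not cached: fetch it from memory
            slots[slot] = line
            misses += 1
    return misses


def sequential(n_accesses):
    """Walk through memory element by element."""
    return ((i * ELEM_SIZE) % WORKING_SET for i in range(n_accesses))


def scattered(n_accesses, seed=1):
    """Jump to a random element anywhere in the working set."""
    rng = random.Random(seed)
    n_elems = WORKING_SET // ELEM_SIZE
    return (rng.randrange(n_elems) * ELEM_SIZE for _ in range(n_accesses))


def blocked(n_accesses, block_size=32 * 1024, per_block=2000, seed=1):
    """Random accesses, but finish with one block before moving to the next."""
    rng = random.Random(seed)
    elems_per_block = block_size // ELEM_SIZE
    n_blocks = WORKING_SET // block_size
    for i in range(n_accesses):
        block = (i // per_block) % n_blocks
        yield block * block_size + rng.randrange(elems_per_block) * ELEM_SIZE


N = 200_000
seq_misses = count_misses(sequential(N))
rnd_misses = count_misses(scattered(N))
blk_misses = count_misses(blocked(N))

print(f"sequential: {seq_misses} misses in {N} accesses")
print(f"scattered:  {rnd_misses} misses in {N} accesses")
print(f"blocked:    {blk_misses} misses in {N} accesses")
```

In this model the scattered pattern misses on the large majority of accesses, while the blocked pattern misses at most once per cache line per block visit, because each block fits in the simulated cache. Whether the real program behaves like this would have to be measured.
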
_______________________________________________
fpc-pascal maillist - [EMAIL PROTECTED]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal