Another thing of note: the destructor Value_P::~Value_P() was called 395899364 times and accounted for 16.20% of the time.
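
For anyone who does not read APL fluently, here is a rough Python transliteration of the expression from my previous mail, Z←((¯1+⍴Z)⌊+/∧\'0'=Z)↓Z←D[⌽Z]. It is only a sketch of what the expression computes, not of how GNU APL evaluates it. The body of SHOW was not included in the thread, so I am assuming that D is the digit string '0123456789' and that Z holds the digits least significant first (with ⎕IO←0). The function name show_digits is made up for illustration.

```python
from itertools import accumulate

DIGITS = "0123456789"  # assumption: this is what D holds inside SHOW


def show_digits(z):
    """Convert a digit vector (least significant digit first) to text,
    dropping leading zeros but always keeping at least one character."""
    chars = [DIGITS[d] for d in reversed(z)]                   # Z←D[⌽Z]
    is_zero = [c == "0" for c in chars]                        # '0'=Z
    leading = list(accumulate(is_zero, lambda a, b: a and b))  # ∧\ (and-scan)
    n_leading = sum(leading)                                   # +/ (the reduction)
    drop = min(len(chars) - 1, n_leading)                      # (¯1+⍴Z)⌊
    return "".join(chars[drop:])                               # ↓


print(show_digits([0, 2, 1, 0, 0]))  # digits of 120 with two padding zeros
```

The +/ step here only adds up a boolean vector with one element per digit, which is why it was surprising to see so much time attributed to the reduction.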

On 23 August 2015 at 19:04, Elias Mårtenson <loke...@gmail.com> wrote:

> Well, I've run the test and I have some results. They were somewhat
> unexpected as the time spent in Value::clone() is much less than it was
> for other tests. That said, the clone issue is mostly visible when
> manipulating very large arrays, which this test case does not. In this
> case, 9.26% of the time was spent in Value::clone() and its descendants.
>
> The main consumer of CPU time in this test case is the reduction on + in
> the following command:
>
> Z←((¯1+⍴Z)⌊+/∧\'0'=Z)↓Z←D[⌽Z]
>
> The +/ reduction uses 70% of the CPU time. This includes 28% performing
> the addition operation (Bif_F12_PLUS::eval_AB()). Another huge
> contributor was the call to Cell::to_value() which contributed 29.21%.
> Note that the 28% time spent in the addition and the almost 30% in
> to_value() are separate.
>
> In other words, the addition and the value conversion consume 60% of the
> total time, which is part of the reduction operation (70%).
>
> Regards,
> Elias
>
> On 23 August 2015 at 05:21, fred <fred_wei...@hotmail.com> wrote:
>
>> Mike Duvos
>>
>> Thank you for the correction. I have timed your code:
>>
>> ⎕IO←0
>>
>> ∇TIME X;TS
>>
>> ∇Z←SHOW X;I
>>
>> ∇Z←X TIMES Y;D;I;C
>>
>> ∇Z←FACTORIAL N;I
>>
>> TIME 'SHOW FACTORIAL 300'
>> 30605751221644063603537046129726862938858880417357
>> 69994167767412594765331767168674655152914224775733
>> 49939147888701726368864263907759003154226842927906
>> 97455984122547693027195460400801221577625217685425
>> 59653569035067887252643218962642993652045764488303
>> 88909753943489625436053225980776521270822437639449
>> 12012867867536830571229368194364995646049816645022
>> 77165001851765464693401122260347297240663332585835
>> 06870150169794168850353752137554910289126407157154
>> 83028228493795263658014523523315693648223343679925
>> 45940952768206080622328123873838808170496000000000
>> 00000000000000000000000000000000000000000000000000
>> 000000000000000
>> 22.977 Seconds.
>>
>> Now, the code under GNU APL runs in comparable time to the
>> implementation in SNOBOL4, at least.
>>
>> This is still not good. Maybe not horrible, though.
>>
>> I rather suspect that the data copying that Elias Mårtenson alludes to
>> is dominant in execution. The SNOBOL4 code has (probably) considerably
>> more interpretation overhead, and is forced to copy the numeric string
>> on each modification (strings are immutable). It hashes each string
>> into a global hash on each such modification. If the APL code is forced
>> into the same contortions (essentially, copying each vector), it should
>> perform at a similar speed. Given that it does execute at the "speed of
>> SNOBOL", I suspect that is what is going on.
>>
>> Eagerly awaiting Elias' results.
>>
>> FredW
>>
>> On Sat, 2015-08-22 at 12:20 -0400, fred wrote:
>> > Ok, so infinite precision integer arithmetic takes over 50 seconds
>> > with GNU APL to compute 300!
>> >
>> > Um... not good. Actually, this is horrific.
>> >
>> > I will attempt to put this into perspective. I use the interpretive
>> > SNOBOL4 implementation from Griswold. This is code that implements a
>> > SNOBOL4 interpreter. The implementation is that the code implementing
>> > the interpreter (which was written in the 1960's) is macro-expanded
>> > into a C program, which is then compiled and run to actually interpret
>> > the SNOBOL4 language source.
>> >
>> > Ok? This is the SLOWEST SNOBOL4 implementation that I know...
>> >
>> > I used an infinite precision arithmetic package written 40 years ago
>> > (specifically, for education -- not for performance). Now, one of the
>> > reasons this package is slow is that it REDEFINES '+', '-', '*', '/'
>> > operators AT RUN TIME... Not only are the data types dynamic, the
>> > actual functions are also dynamic, and have been redefined.
>> >
>> > Now, if you are still with me -- an interpreter that is macro expanded
>> > to C running a run-time binding operator redefinition program in a
>> > language where strings are immutable, and must be completely
>> > hashed/copied on each change... and has complete mark/sweep garbage
>> > collection -- again, implemented in the macro expanded interpreter.
>> >
>> > Let us look at the code:
>> >
>> > -include 'INFINIP.INC'
>> >         infinip_start() ;* redefine basic math functions to work on strings
>> >         x = '1'
>> >         i = 1
>> >         l = 300
>> >         t = time()
>> > top     gt(i, l) :s(btm)
>> >         x = x * i
>> >         i = i + 1 :(top)
>> > btm     t = time() - t
>> >         output = x
>> >         output = t ' milliseconds'
>> > end
>> >
>> > Now, I had a problem running the Duvos code (but I haven't attempted
>> > debugging - 52 seconds seemed extreme) -- but I assume 52 seconds is..
>> > um.. normal. I will run this code on a 1.5 GHz Intel i5 (this is my
>> > Linux tablet, a three year old Acer Iconia tablet):
>> >
>> > $ snobol4 -s ifact
>> > 30605751221644063603537046129726862938858880417357699941677674125947653
>> > 31767168674655152914224775733499391478887017263688642639077590031542268
>> > 42927906974559841225476930271954604008012215776252176854255965356903506
>> > 78872526432189626429936520457644883038890975394348962543605322598077652
>> > 12708224376394491201286786753683057122936819436499564604981664502277165
>> > 00185176546469340112226034729724066333258583506870150169794168850353752
>> > 13755491028912640715715483028228493795263658014523523315693648223343679
>> > 92545940952768206080622328123873838808170496000000000000000000000000000
>> > 00000000000000000000000000000000000000000000000
>> > 20315.096404 milliseconds
>> > SNOBOL4 statistics summary-
>> > 1.080 ms. Compilation time
>> > 20315.416 ms. Execution time
>> > 36453943 Statements executed, 19353232 failed
>> > 1834289 Arithmetic operations performed
>> > 7899675 Pattern matches performed
>> > 338 Regenerations of dynamic storage
>> > 1680.975 ms. Execution time in GC
>> > 0 Reads performed
>> > 2 Writes performed
>> > 557.290 ns. Average per statement executed
>> > 1794.398 Thousand statements per second
>> > $
>> >
>> > 20.3 seconds total. Since this was running with ONLY 8MB memory
>> > (default), and EVERY string change needed a new copy, 338 garbage
>> > collections were needed. That is mark&sweep. GC took 1.7 seconds (of
>> > that 20.3 seconds total)
>> >
>> > FredW
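
As a cross-check on the 300! output quoted above, the value can be recomputed independently with Python's built-in arbitrary-precision integers. This is only a sanity check on the digits; it says nothing about APL or SNOBOL4 performance. The helper name factorial_summary is made up for illustration.

```python
import math


def factorial_summary(n):
    """Return the decimal digits of n!, the digit count, and the number
    of trailing zeros."""
    digits = str(math.factorial(n))
    trailing_zeros = len(digits) - len(digits.rstrip("0"))
    return digits, len(digits), trailing_zeros


digits, n_digits, n_zeros = factorial_summary(300)
print(n_digits, "digits,", n_zeros, "trailing zeros")
print(digits[:50])  # compare against the first line of the APL output
```

The first 50 digits and the run of trailing zeros can be compared directly with the GNU APL and SNOBOL4 listings.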