Hardware:
Jürgen and others make a very good point about the speed of the memory
interface between the CPUs and memory as a factor that software cannot alter.
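To make the point concrete, here is a minimal back-of-the-envelope sketch of the bandwidth bound. The element count and the 20 GB/s channel speed are assumed numbers, not measurements from the thread:

```python
# Rough lower bound on runtime for a memory-bound vector add  c <- a + b.
# Assumptions (hypothetical): 8-byte floats, 20 GB/s memory channel.
n = 100_000_000          # elements (assumed)
bytes_per_elem = 3 * 8   # read a, read b, write c
bandwidth = 20e9         # bytes/second (assumed)
t_min = n * bytes_per_elem / bandwidth
print(f"{t_min:.2f} s lower bound")  # -> 0.12 s lower bound
```

No amount of instruction-level tuning in the interpreter can push the operation below this bound; only moving less data can.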
Software speedups:
Back in the “old” days, when I worked at I.P. Sharp, we made some
hand-crafted speedups in the interpreter to handle special cases.
The CPU cache is the reason for JIT compilation/conversion of APL expressions:
it lets you fuse adjacent operations together so intermediate results stay in
cache instead of being materialized in memory.
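A minimal sketch of what fusion buys you. The function names are made up for illustration; an evaluator working primitive-by-primitive behaves like `unfused`, while a JIT that fuses the expression behaves like `fused`:

```python
# APL expression  (a + b) x c  evaluated two ways.

def unfused(a, b, c):
    # Primitive-at-a-time: materializes a temporary the size of the data,
    # doubling the memory traffic for the intermediate result.
    tmp = [x + y for x, y in zip(a, b)]
    return [t * z for t, z in zip(tmp, c)]

def fused(a, b, c):
    # One pass over the data: the intermediate (x + y) lives in a register,
    # so each element is read once and written once.
    return [(x + y) * z for x, y, z in zip(a, b, c)]

print(fused([1, 2], [3, 4], [5, 6]))  # -> [20, 36]
```

Both return the same values; the difference is purely in how many times the data crosses the memory interface.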
If you are into the CPU-cache aspects of HPC, the BLIS papers would be interesting
to you. See the citations at https://github.com/flame/blis
> On Oct 17, 2019, at 10:03 AM, Ro
Peter:
I am new to APL (~9 months). Most of my day-to-day work is SQL/shell,
but I use APL for a couple of things: 1) as an ad-hoc calculator, and 2) as a
symbolic notation that greatly simplifies complex mathematical calculations
in how I think about, remember, and approach them.
As for things I've
Hi Blake,
as a matter of fact, the loops in my benchmarks are small, but
the data on which these small loops operate is not. Practically, this
means that all instructions run from the instruction cache (with an
instruction cache hit rate of 100%).
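This is the key distinction: the code fits in cache, the data does not. A quick sketch with assumed, typical cache sizes shows why such a benchmark ends up memory-bound rather than compute-bound:

```python
# Whether the working set fits in cache decides if a small hot loop is
# compute-bound or memory-bound. All sizes below are assumptions.
l1_dcache = 32 * 1024            # 32 KiB L1 data cache (typical, assumed)
llc = 8 * 1024 * 1024            # 8 MiB last-level cache (assumed)
working_set = 100_000_000 * 8    # 100M doubles (assumed benchmark size)

loop_code = 200                  # bytes of hot-loop instructions (assumed)
print(loop_code < l1_dcache)     # True: instructions always hit in cache
print(working_set > llc)         # True: the data streams from DRAM
```

So the loop body executes entirely from the instruction cache, while every data element must still come over the memory interface.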
On Wed, Oct 16, 2019 at 7:06 AM Dr. Jürgen Sauermann <
mail@jürgen-sauermann.de> wrote:
> ...
>
> My current interpretation of various benchmarks that Elias Mårtenson and
> myself did some years ago is that the bandwidth of the memory interface
> between the CPUs (or cores) and the memory is the limiting factor.