>
>
> Those are the sort of cheap checks that I had in mind in the very first post
> when I talked about "I envisaged a call to CPUID and then some bool tests
> along the way to utilise SSE[2-4]/AVX[2] (or NEON on ARM) if available. All
> in a static, portable package." Thanks for a good example of t
On Thursday, 3 November 2016 01:40:29 UTC, Nigel Tao wrote:
>
>
> Another ignorant question from me, but what do you mean exactly by
> universal binary?
>
Apologies for the confusing and nonsensical term. What I meant was a binary
that works for a number of CPUs within an architecture, with or
On Tue, Nov 1, 2016 at 8:58 PM, Ondrej wrote:
> It seems that a universal binary, as Go requires it, would be slow on
> dispatch, because there would be too much checking for individual intrinsics
> support. Do I understand it correctly, that to overcome this, people either
> compile natively (whi
Klaus, that's a great thread, completely missed it.
It seems that a universal binary, as Go requires it, would be slow on
dispatch, because there would be too much checking for individual
intrinsics support. Do I understand it correctly, that to overcome this,
people either compile natively (wh
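
One way to keep dispatch cheap in a single portable binary is to do the feature tests once and store the chosen implementation in a function value, so later calls pay only an indirect call. The sketch below is only an illustration under assumptions: cpuFeatures, selectSum and the sum* functions are hypothetical names, the "SIMD" variants are plain Go stand-ins for assembly routines, and the flags are set by hand rather than read from CPUID (a real program could fill them from a package such as golang.org/x/sys/cpu).

```go
package main

import "fmt"

// cpuFeatures is a hypothetical set of flags. In a real program they would
// be filled in once from CPUID; here they are plain fields so that the
// sketch runs on any machine.
type cpuFeatures struct {
	hasSSE2 bool
	hasAVX2 bool
}

// sumGeneric is the portable fallback.
func sumGeneric(xs []int32) int32 {
	var s int32
	for _, x := range xs {
		s += x
	}
	return s
}

// sumSSE2 and sumAVX2 stand in for assembly routines. They are written in
// plain Go here so that the example stays self-contained.
func sumSSE2(xs []int32) int32 { return sumGeneric(xs) }
func sumAVX2(xs []int32) int32 { return sumGeneric(xs) }

// selectSum does the bool tests once and returns the chosen implementation
// together with its name. Callers keep the returned function and never
// repeat the checks.
func selectSum(f cpuFeatures) (string, func([]int32) int32) {
	switch {
	case f.hasAVX2:
		return "avx2", sumAVX2
	case f.hasSSE2:
		return "sse2", sumSSE2
	}
	return "generic", sumGeneric
}

func main() {
	// Assumed flags for illustration: SSE2 present, AVX2 absent.
	name, sum := selectSum(cpuFeatures{hasSSE2: true})
	fmt.Println(name, sum([]int32{1, 2, 3, 4}))
}
```

With this shape the per-call cost is the same whichever variant was picked; the checking happens at start-up only.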
> Yes, speeding up an accumulation step, described at
>
> https://medium.com/@raphlinus/inside-the-fastest-font-renderer-in-the-world-75ae5270c445#.qz8jram0o
>
>
> The generated code consists of SIMD implementations of very simple Go functions.
>
> For example, the fixedAccumulateOpSrcSIMD function i
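
To give a feel for what a "very simple Go function" for an accumulation step can look like, here is a rough scalar sketch. It is not copied from golang.org/x/image/vector: the name accumulateOpSrc, the float32 input and the scaling by 255 are assumptions made for illustration. The idea shown is a running sum of per-pixel area deltas, clamped to [0, 1] and converted to 8-bit coverage.

```go
package main

import "fmt"

// accumulateOpSrc is a hypothetical scalar accumulation step. src holds
// signed per-pixel deltas; dst receives the running total as 8-bit
// coverage. dst must be at least as long as src.
func accumulateOpSrc(dst []uint8, src []float32) {
	acc := float32(0)
	for i, v := range src {
		acc += v // running sum of the deltas seen so far
		a := acc
		if a < 0 {
			a = -a // coverage is the magnitude of the sum
		}
		if a > 1 {
			a = 1 // clamp to full coverage
		}
		dst[i] = uint8(a * 255) // assumed scaling; truncates toward zero
	}
}

func main() {
	src := []float32{0.5, 0.5, 0, -1}
	dst := make([]uint8, len(src))
	accumulateOpSrc(dst, src)
	fmt.Println(dst)
}
```

Each output depends on the sum of all earlier inputs, which is why a SIMD version needs a prefix-sum trick rather than a plain element-wise loop.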
>
> Take for instance the PSHUFB instruction, which allows a very fast
> [16]byte lookup in SSSE3 capable machines. This is helpful in various ways,
> but if it isn't available, it will have to commit the XMM register to
> memory and do 16 lookups, which is at least an order of magnitude slower
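
For reference, the scalar fallback described above can be sketched in plain Go. shuffleBytes is a hypothetical helper name; it mirrors the 128-bit PSHUFB behaviour on ordinary arrays: each output byte is the table entry selected by the low four bits of the index byte, or zero when the high bit of the index byte is set.

```go
package main

import "fmt"

// shuffleBytes mimics the 128-bit PSHUFB on plain Go arrays, one lookup
// per byte. This is the slow path that would run when SSSE3 is missing.
func shuffleBytes(table, idx [16]byte) [16]byte {
	var out [16]byte
	for i, ix := range idx {
		if ix&0x80 == 0 {
			out[i] = table[ix&0x0f] // select by the low four bits
		}
		// high bit set: the output byte stays zero
	}
	return out
}

func main() {
	var table [16]byte
	for i := range table {
		table[i] = byte('a' + i) // 'a' through 'p'
	}
	// Remaining index bytes are zero, so they select table[0].
	idx := [16]byte{3, 0, 15, 0x80}
	fmt.Println(shuffleBytes(table, idx))
}
```

The loop does sixteen dependent memory reads where the instruction does one register operation, which is the gap the message refers to.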
just Machine Code; it may be a less common term now.
strictly you might say the 'text' being generated here is assembly, and it
becomes m/c 'numbers' after the assembler, but since it's just a one-to-one
relationship, there isn't really much of a conceptual difference.
On Friday, 28 October 2016 01:
On Friday, 28 October 2016 02:37:38 UTC+2, Erwin Driessens wrote:
> I'd love to see SIMD intrinsics in the Go compiler(s), even if it would
> mean separate packages for all the architectures. I'm not experienced
> enough to tell how far one could get with designing a cross-platform set of
> intrinsi
I'd love to see SIMD intrinsics in the Go compiler(s), even if it would
mean separate packages for all the architectures. I'm not experienced
enough to tell how far one could get with designing a cross-platform set of
intrinsics instructions? Using the hardware when it is available, falling
bac
On Thu, Oct 27, 2016 at 9:24 AM, 'simon place' via golang-nuts
wrote:
> the approach i took was to try to minimise the M/C, so;
Sorry for the ignorant question, but what does M/C stand for?
On Fri, Oct 28, 2016 at 6:54 AM, 'simon place' via golang-nuts
wrote:
> however, from looking at it, couldn’t find documentation, that code is
> specific to speeding up graphics overlays? maybe? (accumulate)
Yes, speeding up an accumulation step, described at
https://medium.com/@raphlinus/inside-
> Something like that?
short answer, Yes.
however, from looking at it, couldn’t find documentation, that code is
specific to speeding up graphics overlays? maybe? (accumulate)
but it’s confusing me that it’s using templates, when there seems to be only
one template.
i was thinking of one, ve
Something like that?
https://github.com/golang/image/blob/master/vector/gen.go
-s
On Oct 27, 2016 12:24 AM, "'simon place' via golang-nuts" <
golang-nuts@googlegroups.com> wrote:
> i was playing with SIMD last year,
>
> the approach i took was to try to minimise the M/C, so;