Looks like something is wrong with immediate loading for the 1 << ... 
operation.  Could you open a bug with repro instructions?  I can look at it 
when 1.8 opens.

On Thursday, June 16, 2016 at 5:30:12 PM UTC-7, gordo...@gmail.com wrote:
>
> > Modern x86 CPUs don't work like that. 
>
> > In general, optimally scheduled assembly code which uses more registers 
> has higher performance than optimally scheduled assembly code which uses 
> smaller number of registers. Assuming both assembly codes correspond to the 
> same source code. 
>
> > Register renaming: since Intel Pentium Pro and AMD K5. 
>
> > Suggestion for reading: 
> http://www.agner.org/optimize/microarchitecture.pdf 
>
> > An excerpt from the above PDF document (Section 10 about Haswell and 
> Broadwell pipeline): "... the register file has 168 integer registers and 
> 168 vector registers ..." 
>
> I am aware of all of the above and have already read Agner Fog's 
> publications.  In addition, modern CPUs do Out of Order Execution (OOE), so 
> they rearrange the instructions to best reduce instruction latencies and 
> increase throughput given that there are parallel execution pipelines and 
> ahead-of-time execution; the actual execution order is therefore almost 
> certainly not as per the assembly listing. 
>
> Yes, both assembly listings are from the same tight loop code, but the 
> "C/C++" one has been converted from another assembly format to the golang 
> assembly format. 
>
> Daniel Bernstein, the author of "primegen" wrote for the Pentium 3 in x86 
> (32-bit) code, as the Pentium Pro processor wasn't commonly available at 
> that time and 64-bit code didn't exist.  His hand optimized C code for the 
> Sieve of Eratosthenes ("eratspeed.c" in the "doit()" function for the 
> "while k < B loop") uses six registers for this inner culling loop being 
> discussed, and takes about 3.5 CPU clock cycles per loop on a modern CPU 
> (Haswell). 
>
> The number of internal CPU registers actually used by the CPU to effect 
> OOE is beside the point, as they have to do with the CPU's internal 
> optimizations and not compiler optimizations; my point is that the 
> compiler's incorrect use of registers still costs time. 
>
> While I don't expect golang, with its philosophy of preserving "safe" 
> paradigms in doing array bounds checks by default, to run as fast as C/C++ 
> code that doesn't have that philosophy, I do expect it to run at least as 
> fast as C#/Java code, which is Just In Time (JIT) compiled and does have the 
> "safe" philosophy.  The golang compiler version 1.7beta1 is not quite there 
> yet for the indicated reasons:  inconsistent use of registers, using one 
> too many in one place in order to avoid an immediate load which doesn't 
> cost any execution time, and saving one register by use of the "trick" 
> which does cost execution time as compared to the use of a single register. 
>
> However, there is hope, as version 1.7 has made great advances since 
> version 1.6; surely version 1.8, which is said to be intended to improve 
> this further, will be faster yet.  At any rate, version 1.7 speed is 
> "adequate" for many purposes, as at least it comes close to C#/Java speed 
> (probably within about 15% to 20% or less) in many of the most demanding 
> tight loop algorithms, and thus is quite usable as compared to previous 
> versions.  But even the most avid golang proponents must admit that it 
> isn't the language to re-write Kim Walisch's "primesieve" with its extreme loop 
> unrolling that takes an average of about 1.4 CPU clock cycles per composite 
> number cull for small ranges of primes, as even with array bounds checking 
> turned off, golang would still take at least twice and more likely three 
> times as long. 
>
> That is also why I first started posting to this thread:  the only reason 
> the golang version of "primegen" is reasonably comparable in speed to C/C++ 
> "primegen" is that it uses multi-threading on multi-core processors, which 
> weren't available to Daniel Bernstein when he wrote "primegen".  My point 
> was that one should compare like with like. 
>
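
For anyone following along, the kind of inner culling loop being discussed can be sketched roughly as below. This is only an assumed illustration, not the code from this thread or from "primegen": the names countPrimes and buf, and the odd-only bit layout, are hypothetical choices. The line of interest is the one that sets a bit with 1 << (k & 31), which is where the immediate load and the register usage come into play.

```go
package main

import "fmt"

// countPrimes is a hypothetical helper: it counts the primes <= limit
// with a bit-packed, odd-only Sieve of Eratosthenes.
// Bit i of buf stands for the odd number 2*i + 3; a set bit means composite.
func countPrimes(limit int) int {
	if limit < 2 {
		return 0
	}
	nbits := (limit - 1) / 2 // number of odd candidates in 3..limit
	buf := make([]uint32, nbits/32+1)
	for i := 0; ; i++ {
		p := 2*i + 3
		start := (p*p - 3) / 2 // bit index of p*p
		if start >= nbits {
			break
		}
		if buf[i>>5]&(1<<uint(i&31)) != 0 {
			continue // p is already marked composite
		}
		// Inner culling loop: one read-modify-write per composite.
		for k := start; k < nbits; k += p {
			buf[k>>5] |= 1 << uint(k&31)
		}
	}
	count := 1 // the prime 2 is not represented in buf
	for i := 0; i < nbits; i++ {
		if buf[i>>5]&(1<<uint(i&31)) == 0 {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println(countPrimes(1000000))
}
```

Building this with "go build -gcflags=-S" prints the generated assembly, which is one way to see how the compiler treats the shift in the inner loop.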
