For #1, that is https://github.com/golang/go/issues/15808

For #2, the Go compiler is generally not great at optimizing arrays, 
particularly copies of them (in your case, the copy from ret to the return 
slot). I would recommend using either slices ([]int) or structs 
(struct {x, y int}), or even a pointer to an array (*[2]int).
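
As a rough sketch of those alternatives (the names P, addStruct, addInto and addSlice are made up for illustration and are not from the original code), the same addition could be written as below. Whether each variant actually compiles to tighter code depends on the Go version and target, so it is worth checking the generated assembly rather than assuming.

```go
package main

import "fmt"

// P is a hypothetical struct stand-in for the original type C ([2]int).
type P struct{ x, y int }

// addStruct returns the field-wise sum of a and b.
func addStruct(a, b P) P {
	return P{a.x + b.x, a.y + b.y}
}

// addInto writes the element-wise sum through a pointer to an array,
// so no array value has to be copied into a return slot.
func addInto(dst, a, b *[2]int) {
	for i := range dst {
		dst[i] = a[i] + b[i]
	}
}

// addSlice does the same for slices. It assumes a and b are at least
// as long as dst; shorter inputs would panic with an index error.
func addSlice(dst, a, b []int) {
	for i := range dst {
		dst[i] = a[i] + b[i]
	}
}

func main() {
	fmt.Println(addStruct(P{1, 2}, P{3, 4}))

	var out [2]int
	addInto(&out, &[2]int{1, 2}, &[2]int{3, 4})
	fmt.Println(out)

	res := make([]int, 2)
	addSlice(res, []int{1, 2}, []int{3, 4})
	fmt.Println(res)
}
```

To compare the generated code for each variant, go build -gcflags=-S main.go prints the compiler's assembly output during the build.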

On Friday, August 26, 2022 at 1:02:03 AM UTC-7 sydnash wrote:

> I have some golang code like this:
>
> ```golang
> type C [2]int
>
> //go:noinline
> func t1(a, b C) C {
>     var ret C
>     for i := 0; i < len(a); i++ {
>         ret[i] = a[i] + b[i]
>     }
>     return ret
> }
>
> //go:noinline
> func t2(a, b C) C {
>     return t1(a, b)
> }
> ```
>
> and build it with the command: go build main.go
>
> Then I use objdump -d main to check the output assembly code, which looks like this:
>
> ```
> 0000000100086b40 <_main.t1>:
> 100086b40: fe 0f 1e f8  str     x30, [sp, #-32]!
> 100086b44: fd 83 1f f8  stur    x29, [sp, #-8]
> 100086b48: fd 23 00 d1  sub     x29, sp, #8
> 100086b4c: ff ff 04 a9  stp     xzr, xzr, [sp, #72]
> 100086b50: ff ff 00 a9  stp     xzr, xzr, [sp, #8]
> 100086b54: 00 00 80 d2  mov     x0, #0
> 100086b58: 09 00 00 14  b       0x100086b7c <_main.t1+0x3c>
> 100086b5c: e1 a3 00 91  add     x1, sp, #40
> 100086b60: 22 78 60 f8  ldr     x2, [x1, x0, lsl #3]
> 100086b64: e3 e3 00 91  add     x3, sp, #56
> 100086b68: 64 78 60 f8  ldr     x4, [x3, x0, lsl #3]
> 100086b6c: 42 00 04 8b  add     x2, x2, x4
> 100086b70: e4 23 00 91  add     x4, sp, #8
> 100086b74: 82 78 20 f8  str     x2, [x4, x0, lsl #3]
> 100086b78: 00 04 00 91  add     x0, x0, #1
> 100086b7c: 1f 08 00 f1  cmp     x0, #2
> 100086b80: eb fe ff 54  b.lt    0x100086b5c <_main.t1+0x1c>
> 100086b84: e0 07 40 f9  ldr     x0, [sp, #8]
> 100086b88: e1 0b 40 f9  ldr     x1, [sp, #16]
> 100086b8c: e0 27 00 f9  str     x0, [sp, #72]
> 100086b90: e1 2b 00 f9  str     x1, [sp, #80]
> 100086b94: ff 83 00 91  add     sp, sp, #32
> 100086b98: fd 23 00 d1  sub     x29, sp, #8
> 100086b9c: c0 03 5f d6  ret
> ```
> I think the above assembly code is not optimal.
>
> 1. The values of registers x1 and x3 could be computed outside of the loop.
> 2. The result could be stored directly to [sp, #72] and [sp, #80]; the 
> temporary data in [sp, #8] and [sp, #16] is not necessary.
>
> I want to know why the compiler doesn't perform these optimizations, or 
> which command can produce more optimized code.
>
