On Thu, 2014-05-22 at 09:54 -0400, Ilia Mirkin wrote: > On Thu, May 22, 2014 at 2:56 AM, Timothy Arceri <t_arc...@yahoo.com.au> wrote: > > On Wed, 2014-05-21 at 18:05 -0400, Ilia Mirkin wrote: > > > > On Wed, May 21, 2014 at 6:02 PM, Timothy Arceri <t_arc...@yahoo.com.au> > > wrote: > >> Unless we run out of memory the old if statement would always fail so > >> reflect the more likely outcome. Should be be faster most of the time and > >> slightly cleaner looking code. > >> > >> Signed-off-by: Timothy Arceri <t_arc...@yahoo.com.au> > >> --- > >> src/gallium/auxiliary/util/u_upload_mgr.c | 7 +++---- > >> 1 file changed, 3 insertions(+), 4 deletions(-) > >> > >> diff --git a/src/gallium/auxiliary/util/u_upload_mgr.c > >> b/src/gallium/auxiliary/util/u_upload_mgr.c > >> index 744ea2e..99f9a08 100644 > >> --- a/src/gallium/auxiliary/util/u_upload_mgr.c > >> +++ b/src/gallium/auxiliary/util/u_upload_mgr.c > >> @@ -247,11 +247,10 @@ enum pipe_error u_upload_data( struct u_upload_mgr > >> *upload, > >> enum pipe_error ret = u_upload_alloc(upload, min_out_offset, size, > >> out_offset, outbuf, > >> (void**)&ptr); > >> - if (ret != PIPE_OK) > >> - return ret; > >> + if (ret == PIPE_OK) > >> + memcpy(ptr, data, size); > >> > >> - memcpy(ptr, data, size); > >> - return PIPE_OK; > >> + return ret; > >> } > > > > Have you actually compared the generated code? I find it can be fairly > > instructive to do so... btw, there's a likely() (and unlikely() ) > > primitive (in p_compiler.h) which will indicate to gcc whether a > > particular condition is likely or unlikely, so that it can adjust its > > code generation accordingly. > > > > -ilia > > > > > > Hi Ilia, > > > > I've taken a look at the results (see below) and read up on likely/unlikely > > (I was aware of these but wasn't confident in when they should be used). > > > > The result is my patch avoids a jump which I assume should help with branch > > prediction in the cpu? Something like this is what I was expecting with the > > change. > > The interesting thing is that in this case using likely seems to have added > > an extra 'test' instruction for no added value (please correct me if I'm > > wrong). However there is probably other places in u_upload_alloc(), etc > > where likely might be more helpful. > > > > Tim > > > > Common code: > > > > 444fce: 55 push %rbp > > 444fcf: 48 89 e5 mov %rsp,%rbp > > 444fd2: 48 83 ec 40 sub $0x40,%rsp > > 444fd6: 48 89 7d e8 mov %rdi,-0x18(%rbp) > > 444fda: 89 75 e4 mov %esi,-0x1c(%rbp) > > 444fdd: 89 55 e0 mov %edx,-0x20(%rbp) > > 444fe0: 48 89 4d d8 mov %rcx,-0x28(%rbp) > > 444fe4: 4c 89 45 d0 mov %r8,-0x30(%rbp) > > 444fe8: 4c 89 4d c8 mov %r9,-0x38(%rbp) > > 444fec: 4c 8d 45 f0 lea -0x10(%rbp),%r8 > > 444ff0: 48 8b 7d c8 mov -0x38(%rbp),%rdi > > 444ff4: 48 8b 4d d0 mov -0x30(%rbp),%rcx > > 444ff8: 8b 55 e0 mov -0x20(%rbp),%edx > > 444ffb: 8b 75 e4 mov -0x1c(%rbp),%esi > > 444ffe: 48 8b 45 e8 mov -0x18(%rbp),%rax > > 445002: 4d 89 c1 mov %r8,%r9 > > 445005: 49 89 f8 mov %rdi,%r8 > > This is some seriously disgusting code... > > 444fe8: 4c 89 4d c8 mov %r9,-0x38(%rbp) > 444ff0: 48 8b 7d c8 mov -0x38(%rbp),%rdi > 445005: 49 89 f8 mov %rdi,%r8 > > gcc should be able to do better. Do you perhaps have --enable-debug in > your mesa build? That disables -O2... If that is the case, might want > to rerun your analysis without --enable-debug. > > -ilia
Yes I did have --enable-debug set whoops. With -O2 enabled gcc produces the same output from the current code as my patch does without optimisations, so looks like there is no performance gains at all. I also tried using unlikely() in u_upload_alloc() and while this does produce different code it doesn't seem to have any noticeable impact. Guess I'll keep looking. > > > 445008: 48 89 c7 mov %rax,%rdi > > 44500b: e8 78 fd ff ff callq 444d88 <u_upload_alloc> > > 445010: 89 45 fc mov %eax,-0x4(%rbp) > > 445013: 83 7d fc 00 cmpl $0x0,-0x4(%rbp) > > > > Pre change: > > > > 0000000000444fce <u_upload_data>: > > 445017: 74 05 je 44501e <u_upload_data+0x50> > > 445019: 8b 45 fc mov -0x4(%rbp),%eax > > 44501c: eb 1b jmp 445039 <u_upload_data+0x6b> > > 44501e: 8b 55 e0 mov -0x20(%rbp),%edx > > 445021: 48 8b 45 f0 mov -0x10(%rbp),%rax > > 445025: 48 8b 4d d8 mov -0x28(%rbp),%rcx > > 445029: 48 89 ce mov %rcx,%rsi > > 44502c: 48 89 c7 mov %rax,%rdi > > 44502f: e8 2c 40 c0 ff callq 49060 <memcpy@plt> > > 445034: b8 00 00 00 00 mov $0x0,%eax > > 445039: c9 leaveq > > 44503a: c3 retq > > > > > > Post change: > > > > 0000000000444fce <u_upload_data>: > > 445017: 75 16 jne 44502f <u_upload_data+0x61> > > 445019: 8b 55 e0 mov -0x20(%rbp),%edx > > 44501c: 48 8b 45 f0 mov -0x10(%rbp),%rax > > 445020: 48 8b 4d d8 mov -0x28(%rbp),%rcx > > 445024: 48 89 ce mov %rcx,%rsi > > 445027: 48 89 c7 mov %rax,%rdi > > 44502a: e8 31 40 c0 ff callq 49060 <memcpy@plt> > > 44502f: 8b 45 fc mov -0x4(%rbp),%eax > > 445032: c9 leaveq > > 445033: c3 retq > > > > > > With likely: > > > > 0000000000444fce <u_upload_data>: > > 44501d: 48 85 c0 test %rax,%rax > > 445020: 74 16 je 445038 <u_upload_data+0x6a> > > 445022: 8b 55 e0 mov -0x20(%rbp),%edx > > 445025: 48 8b 45 f0 mov -0x10(%rbp),%rax > > 445029: 48 8b 4d d8 mov -0x28(%rbp),%rcx > > 44502d: 48 89 ce mov %rcx,%rsi > > 445030: 48 89 c7 mov %rax,%rdi > > 445033: e8 28 40 c0 ff callq 49060 <memcpy@plt> > > 445038: 8b 45 fc mov -0x4(%rbp),%eax > > 44503b: c9 leaveq > > 44503c: c3 retq > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev