On Sep 29, 2020, at 9:30 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> On 2020-Sep-26, Li Japin wrote:
>
>> Thanks! How big is this overhead? Is there any way I can test it?
>
> You could also have a look at the assembly code that your compiler
> generates -- particularly examine how it changes.

Thanks for your advice!

The original assembly code for palloc0 is:

0000000000517690 <palloc0>:
  517690:  55                      push   %rbp
  517691:  53                      push   %rbx
  517692:  48 89 fb                mov    %rdi,%rbx
  517695:  48 83 ec 08             sub    $0x8,%rsp
  517699:  48 81 ff ff ff ff 3f    cmp    $0x3fffffff,%rdi
  5176a0:  48 8b 2d d9 0c 48 00    mov    0x480cd9(%rip),%rbp        # 998380 <CurrentMemoryContext>
  5176a7:  0f 87 d5 00 00 00       ja     517782 <palloc0+0xf2>
  5176ad:  48 8b 45 10             mov    0x10(%rbp),%rax
  5176b1:  48 89 fe                mov    %rdi,%rsi
  5176b4:  c6 45 04 00             movb   $0x0,0x4(%rbp)
  5176b8:  48 89 ef                mov    %rbp,%rdi
  5176bb:  ff 10                   callq  *(%rax)
  5176bd:  48 85 c0                test   %rax,%rax
  5176c0:  48 89 c1                mov    %rax,%rcx
  5176c3:  74 5b                   je     517720 <palloc0+0x90>
  5176c5:  f6 c3 07                test   $0x7,%bl
  5176c8:  75 36                   jne    517700 <palloc0+0x70>
  5176ca:  48 81 fb 00 04 00 00    cmp    $0x400,%rbx
  5176d1:  77 2d                   ja     517700 <palloc0+0x70>
  5176d3:  48 01 c3                add    %rax,%rbx
  5176d6:  48 39 d8                cmp    %rbx,%rax
  5176d9:  73 35                   jae    517710 <palloc0+0x80>
  5176db:  0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  5176e0:  48 83 c0 08             add    $0x8,%rax
  5176e4:  48 c7 40 f8 00 00 00    movq   $0x0,-0x8(%rax)
  5176eb:  00
  5176ec:  48 39 c3                cmp    %rax,%rbx
  5176ef:  77 ef                   ja     5176e0 <palloc0+0x50>
  5176f1:  48 83 c4 08             add    $0x8,%rsp
  5176f5:  48 89 c8                mov    %rcx,%rax
  5176f8:  5b                      pop    %rbx
  5176f9:  5d                      pop    %rbp
  5176fa:  c3                      retq
  5176fb:  0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  517700:  48 89 cf                mov    %rcx,%rdi
  517703:  48 89 da                mov    %rbx,%rdx
  517706:  31 f6                   xor    %esi,%esi
  517708:  e8 e3 0e ba ff          callq  b85f0 <memset@plt>
  51770d:  48 89 c1                mov    %rax,%rcx
  517710:  48 83 c4 08             add    $0x8,%rsp
  517714:  48 89 c8                mov    %rcx,%rax
  517717:  5b                      pop    %rbx
  517718:  5d                      pop    %rbp
  517719:  c3                      retq
  51771a:  66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  517720:  48 8b 3d 51 0c 48 00    mov    0x480c51(%rip),%rdi        # 998378 <TopMemoryContext>
  517727:  be 64 00 00 00          mov    $0x64,%esi
  51772c:  e8 1f f9 ff ff          callq  517050 <MemoryContextStatsDetail>
  517731:  31 f6                   xor    %esi,%esi
  517733:  bf 14 00 00 00          mov    $0x14,%edi
  517738:  e8 53 6d fd ff          callq  4ee490 <errstart>
  51773d:  bf c5 20 00 00          mov    $0x20c5,%edi
  517742:  e8 99 9b fd ff          callq  4f12e0 <errcode>
  517747:  48 8d 3d 07 54 03 00    lea    0x35407(%rip),%rdi        # 54cb55 <__func__.7554+0x45>
  51774e:  31 c0                   xor    %eax,%eax
  517750:  e8 ab 9d fd ff          callq  4f1500 <errmsg>
  517755:  48 8b 55 38             mov    0x38(%rbp),%rdx
  517759:  48 8d 3d 80 11 16 00    lea    0x161180(%rip),%rdi        # 6788e0 <__func__.6248+0x150>
  517760:  48 89 de                mov    %rbx,%rsi
  517763:  31 c0                   xor    %eax,%eax
  517765:  e8 56 a2 fd ff          callq  4f19c0 <errdetail>
  51776a:  48 8d 15 ff 11 16 00    lea    0x1611ff(%rip),%rdx        # 678970 <__func__.7326>
  517771:  48 8d 3d 20 11 16 00    lea    0x161120(%rip),%rdi        # 678898 <__func__.6248+0x108>
  517778:  be eb 03 00 00          mov    $0x3eb,%esi
  51777d:  e8 0e 95 fd ff          callq  4f0c90 <errfinish>
  517782:  31 f6                   xor    %esi,%esi
  517784:  bf 14 00 00 00          mov    $0x14,%edi
  517789:  e8 02 6d fd ff          callq  4ee490 <errstart>
  51778e:  48 8d 3d db 10 16 00    lea    0x1610db(%rip),%rdi        # 678870 <__func__.6248+0xe0>
  517795:  48 89 de                mov    %rbx,%rsi
  517798:  31 c0                   xor    %eax,%eax
  51779a:  e8 91 98 fd ff          callq  4f1030 <errmsg_internal>
  51779f:  48 8d 15 ca 11 16 00    lea    0x1611ca(%rip),%rdx        # 678970 <__func__.7326>
  5177a6:  48 8d 3d eb 10 16 00    lea    0x1610eb(%rip),%rdi        # 678898 <__func__.6248+0x108>
  5177ad:  be df 03 00 00          mov    $0x3df,%esi
  5177b2:  e8 d9 94 fd ff          callq  4f0c90 <errfinish>
  5177b7:  66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  5177be:  00 00

After the modification, the palloc0 assembly code is:

0000000000517690 <palloc0>:
  517690:  53                      push   %rbx
  517691:  48 89 fb                mov    %rdi,%rbx
  517694:  e8 17 ff ff ff          callq  5175b0 <palloc>
  517699:  f6 c3 07                test   $0x7,%bl
  51769c:  48 89 c1                mov    %rax,%rcx
  51769f:  75 2f                   jne    5176d0 <palloc0+0x40>
  5176a1:  48 81 fb 00 04 00 00    cmp    $0x400,%rbx
  5176a8:  77 26                   ja     5176d0 <palloc0+0x40>
  5176aa:  48 01 c3                add    %rax,%rbx
  5176ad:  48 39 d8                cmp    %rbx,%rax
  5176b0:  73 2e                   jae    5176e0 <palloc0+0x50>
  5176b2:  66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  5176b8:  48 83 c0 08             add    $0x8,%rax
  5176bc:  48 c7 40 f8 00 00 00    movq   $0x0,-0x8(%rax)
  5176c3:  00
  5176c4:  48 39 c3                cmp    %rax,%rbx
  5176c7:  77 ef                   ja     5176b8 <palloc0+0x28>
  5176c9:  48 89 c8                mov    %rcx,%rax
  5176cc:  5b                      pop    %rbx
  5176cd:  c3                      retq
  5176ce:  66 90                   xchg   %ax,%ax
  5176d0:  48 89 cf                mov    %rcx,%rdi
  5176d3:  48 89 da                mov    %rbx,%rdx
  5176d6:  31 f6                   xor    %esi,%esi
  5176d8:  e8 13 0f ba ff          callq  b85f0 <memset@plt>
  5176dd:  48 89 c1                mov    %rax,%rcx
  5176e0:  48 89 c8                mov    %rcx,%rax
  5176e3:  5b                      pop    %rbx
  5176e4:  c3                      retq
  5176e5:  90                      nop
  5176e6:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  5176ed:  00 00 00

The modified version goes through an extra direct call to palloc
(callq 5175b0 <palloc>) before zeroing, whereas the original performs
the size check and the allocation call inline. Now I see why we need
the duplicated code in palloc0.

--
Best regards,
Japin Li
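
The two shapes compared in the listings can be sketched in plain C
roughly as follows. This is only an illustration, not the PostgreSQL
source: malloc stands in for the memory-context allocation call, NULL
stands in for the error reporting, and every sketch_* name is
hypothetical. Only the 0x3fffffff size limit is taken from the
disassembly above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Size limit seen in the listing (cmp $0x3fffffff,%rdi). */
#define SKETCH_MAX_ALLOC ((size_t) 0x3fffffff)

/* Hypothetical stand-in for palloc: check the size, then allocate. */
void *
sketch_alloc(size_t size)
{
    if (size > SKETCH_MAX_ALLOC)
        return NULL;            /* assumption: the real code raises an error */
    return malloc(size);
}

/* Shape of the original palloc0: check and allocation duplicated inline. */
void *
sketch_alloc0_inline(size_t size)
{
    void *p;

    if (size > SKETCH_MAX_ALLOC)
        return NULL;
    p = malloc(size);
    if (p != NULL)
        memset(p, 0, size);
    return p;
}

/* Shape of the modified palloc0: delegate to the allocator, then zero. */
void *
sketch_alloc0_call(size_t size)
{
    void *p = sketch_alloc(size);

    if (p != NULL)
        memset(p, 0, size);
    return p;
}

/* Helper for checking results: returns 1 if all bytes are zero. */
int
sketch_all_zero(const void *p, size_t size)
{
    const unsigned char *b = p;
    size_t i;

    for (i = 0; i < size; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Both variants return the same zeroed memory; they differ only in
whether the allocation path is repeated inline or reached through one
more function call. In a small file like this the compiler may well
inline sketch_alloc into sketch_alloc0_call, so the difference shows
up only when the call is actually emitted, as it is in the second
listing above.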