Hello Tristan,

02.12.2023 00:48, Tristan Partin wrote:

So it looks like the asan feature detect_stack_use_after_return implemented
in gcc allows itself to add some data on stack, that breaks our alignment
expectations. With all three such Asserts in md.c removed,
`make check-world` passes for me.

Decided to do some digging into this, and Google actually documents[0] how it works. After reading the algorithm, it is obvious why this fails. What happens if you throw an __attribute__((no_sanitize("address")) on the function? I assume the Asserts would then pass. The commit[1] which added pg_attribute_aligned() provides insight as to why the Asserts exist.

Thank you for spending your time on this!

Yes, I understand what those Asserts were added for, I removed them just
to check what else is on the way.
And I can confirm that marking that function with the no_sanitize attribute
fixes that exact failure. Then the same attribute has to be added to
_hash_alloc_buckets(), to prevent:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: 
"md.c", Line: 471, PID: 1766976

#5  0x00005594a6a0f0d0 in ExceptionalCondition (conditionName=0x5594a7454a60 "(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)", fileName=0x5594a7454880 "md.c", lineNumber=471) at assert.c:66 #6  0x00005594a61ce133 in mdextend (reln=0x625000037e48, forknum=MAIN_FORKNUM, blocknum=9, buffer=0x7fc3b3947020, skipFsync=false) at md.c:471 #7  0x00005594a61d89ab in smgrextend (reln=0x625000037e48, forknum=MAIN_FORKNUM, blocknum=9, buffer=0x7fc3b3947020, skipFsync=false) at smgr.c:501
#8  0x00005594a4a0c43d in _hash_alloc_buckets (rel=0x7fc3a89714f8, 
firstblock=6, nblocks=4) at hashpage.c:1033

And to RelationCopyStorage(), to prevent:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: 
"md.c", Line: 752, PID: 1787855

#5  0x000056081d5688bc in ExceptionalCondition (conditionName=0x56081dfaea40 "(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)", fileName=0x56081dfae860 "md.c", lineNumber=752) at assert.c:66
#6  0x000056081cd29415 in mdread (reln=0x629000043158, forknum=MAIN_FORKNUM, 
blocknum=0, buffer=0x7fe480633020) at md.c:752
#7  0x000056081cd32cb3 in smgrread (reln=0x629000043158, forknum=MAIN_FORKNUM, blocknum=0, buffer=0x7fe480633020) at smgr.c:565 #8  0x000056081b9ed5f2 in RelationCopyStorage (src=0x629000043158, dst=0x629000041248, forkNum=MAIN_FORKNUM, relpersistence=112 'p') at storage.c:487

Probably, it has to be added for all the functions where PGIOAlignedBlock
located on stack...

But I still wonder, how it works with clang, why that extra attribute is
not required?
In other words, such implementation specifics discourage me...


Possibly, but I think I would rather see upstream support running with all features with instrumentation turned off in various sections of code. Even some assistance from AddressSanitizer is better than none. Here[1][2] are all the AddressSanitizer flags for those curious.

Yeah, and you might also need to specify extra flags to successfully run
postgres with newer sanitizers' versions. Say, for clang-18 you need to
specify -fno-sanitize=function (which is not recognized by gcc 13.2), to
avoid errors like this:
running bootstrap script ... dynahash.c:1120:4: runtime error: call to function strlcpy through pointer to incorrect function type 'void *(*)(void *, const void *, unsigned long)'
.../src/port/strlcpy.c:46: note: strlcpy defined here
    #0 0x556af5e0b0a9 in hash_search_with_hash_value 
.../src/backend/utils/hash/dynahash.c:1120:4
    #1 0x556af5e08f4f in hash_search .../src/backend/utils/hash/dynahash.c:958:9

I personally would like to see Postgres have support for  AddressSanitizer. I think it already supports UndefinedBehaviorSanitizer if I am remembering the buildfarm properly. AddressSanitizer has been so helpful in past experiences writing C.

Me too. I find it very valuable for my personal usage but I'm afraid it's
still not very stable/mature.
One more example. Just adding -fsanitize=undefined for gcc 12, 13 (I tried
12.1, 13.0, 13.2) produces new warnings like:
In function 'PageGetItemId',
    inlined from 'heap_xlog_update' at heapam.c:9569:9:
../../../../src/include/storage/bufpage.h:243:16: warning: array subscript -1 is below array bounds of 'ItemIdData[]' [-Warray-bounds=]
  243 |         return &((PageHeader) page)->pd_linp[offsetNumber - 1];
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

But I don't get such warnings when I use gcc 11.3 (though it generates
other ones) or clang (15, 18). They also aren't produced with -O0, -O1...
Maybe it's another gcc bug, I'm not sure how to deal with it.
(I can research this issue, if it makes any sense.)

So I would say that cost of providing/maintaining full support for asan
(hwasan), ubsan is not near zero, unfortunately. I would estimate it to
10-20 discussions/commits on start/5-10 per year later (not including fixes
for bugs that would be found). If it's affordable for the project, I'd like
to have such support out-of-the-box.

Best regards,
Alexander


Reply via email to