Dear @misc readers,

I recently started using OpenBSD and ran into the problem shown in the subject while porting software from Linux to OpenBSD. In short: a SEGV occurs when reading a large block of data placed in the .text section, where the data is much larger than `PAGE_SIZE`.

My environment is as follows:

machine:
  1. QEMU/KVM x86_64 6.2.0
  2. Dynabook R63/J (Intel Core i5-7300U)
OS: OpenBSD 7.5
compiler: clang 16.0.6

Below is the minimal code that reproduces the problem:

https://github.com/sheinasker/data-asm/tree/main

The code copies the contents of a global string variable, defined in assembler, into a dynamically allocated buffer and prints its address, size, and leading and trailing bytes. `sample_code` itself is defined in assembler; its content is a 12289-byte string filled with 'A'. The SEGV occurs at the `memcpy` call below.

```cpp
#include <iostream>
#include <string>
#include <cstring>
#include <cstdint>   // std::uint32_t
#include <cstdlib>   // std::malloc

// Both symbols are defined in the assembler source.
extern "C" char sample_code[];
extern "C" std::uint32_t sample_code_size;

int main() {
    std::cout << "address: " << reinterpret_cast<void*>(sample_code) << std::endl;
    char* buf = (char*)std::malloc(sample_code_size);
    // SEGV occurs in this memcpy
    std::memcpy(buf, sample_code, sample_code_size);
    std::cout << "size: " << std::strlen(buf) << std::endl;
    std::cout << "head: " << std::string(buf, buf + 10) << std::endl;
    std::cout << "tail: " << std::string(buf + sample_code_size - 11, buf + sample_code_size - 1) << std::endl;
}
```

Running it with `make run1` crashes with SIGSEGV. The `lldb` session looks like this:

```
openbsd-host$ lldb sample1
(lldb) target create "sample1"
Current executable set to '/home/asker/src/data-asm/sample1' (x86_64).
(lldb) b main
Breakpoint 1: where = sample1`main, address = 0x0000000000006410
(lldb) run
Process 8967 launched: '/home/asker/src/data-asm/sample1' (x86_64)
Process 8967 stopped
* thread #1, stop reason = breakpoint 1.1
    frame #0: 0x00000befee364410 sample1`main
sample1`main:
->  0xbefee364410 <+0>:  endbr64
    0xbefee364414 <+4>:  movq   0x372d(%rip), %r11 ; __retguard_831
    0xbefee36441b <+11>: xorq   (%rsp), %r11
    0xbefee36441f <+15>: pushq  %rbp
(lldb) c
Process 8967 resuming
address: 0xbefee361400
Process 8967 stopped
* thread #1, stop reason = signal SIGSEGV
    frame #0: 0x00000bf2b0c282b0 libc.so.99.0`memcpy(dst0=0x00000bf29066c000, src0=<unavailable>, length=12289) at memcpy.c:103:2
(lldb) c
Process 8967 resuming
Process 8967 exited with status = 11 (0x0000000b)
(lldb) q
```

I also recorded the system calls of the same run with `ktrace`:

```
8967 sample1 CALL kbind(0x6fe6698ee708,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee6b8,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee628,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee608,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee628,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee5d8,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL mprotect(0xbf24ee36000,0x1000,0x3<PROT_READ|PROT_WRITE>)
8967 sample1 RET mprotect 0
8967 sample1 CALL mprotect(0xbf24ee36000,0x1000,0x1<PROT_READ>)
8967 sample1 RET mprotect 0
8967 sample1 CALL fstat(1,0x6fe6698ee500)
8967 sample1 STRU struct stat { dev=0, ino=104192, mode=crw--w---- , nlink=1, uid=1000<"asker">, gid=4<"tty">, rdev=1283, atime=1722062206<"Jul 27 15:36:46 2024">.276320559, mtime=1722062206<"Jul 27 15:36:46 2024">.276320559, ctime=1722062206<"Jul 27 15:36:46 2024">.276320559, size=0, blocks=0, blksize=65536, flags=0x0, gen=0x0 }
8967 sample1 RET fstat 0
8967 sample1 CALL mmap(0,0x10000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
8967 sample1 RET mmap 13137847422976/0xbf2e4ba9000
8967 sample1 CALL fcntl(1,F_ISATTY)
8967 sample1 RET fcntl 1
8967 sample1 CALL kbind(0x6fe6698ee6b8,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee798,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee738,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee738,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee668,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee568,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee738,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee738,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL write(1,0xbf2e4ba9000,0x17)
8967 sample1 GIO fd 1 wrote 23 bytes
       "address: 0xbefee361400
       "
8967 sample1 RET write 23/0x17
8967 sample1 CALL kbind(0x6fe6698ee738,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee6a8,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL kbind(0x6fe6698ee798,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 CALL mmap(0,0x4000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
8967 sample1 RET mmap 13136432644096/0xbf29066c000
8967 sample1 CALL kbind(0x6fe6698ee798,24,0x3e7ebd77b6a5befb)
8967 sample1 RET kbind 0
8967 sample1 PSIG SIGSEGV SIG_DFL code=SEGV_ACCERR addr=0xbefee362000 trapno=6
8967 sample1 NAMI "sample1.core"
```

As these logs show, `sample_code` starts at 0xbefee361400. The first 3072 bytes, up to the end of that page, can be read, but a SEGV (`SEGV_ACCERR`) is raised on the read at 0xbefee362000, which is exactly the start of the next page. So the fault appears to happen at a page boundary.

Oddly enough, if I run the same executable directly right after debugging it with lldb, it succeeds, but once I recompile, the symptom returns.

I would like to know two things:

- Whether this behavior is caused by one of OpenBSD's protection features (i.e., whether it is intended OpenBSD behavior).
- How to resolve it.

For reference, the exact same code runs without problems on Ubuntu and FreeBSD.

Best regards,
Shein