Issue: 132736
Summary: [RISCV] llvm-objdump disassembly ends prematurely using --disassemble-symbols when generating object files directly from assembly
Labels: new issue
Assignees: (none)
Reporter: thebigclub
I'm getting different `llvm-objdump --disassemble-symbols` output for an object file generated directly from a C file vs. one generated from the corresponding assembly file:

```bash
$ cat << EOF > test.c
int testfn(int n) {

        for (int i = 0; i < 5; i++) {
                n += i;
        }

        return n;
}
EOF
$
$ # First generate directly from C file
$
$ clang -c test.c
$ llvm-nm -S test.o
00000000 00000048 T testfn
$
$ llvm-objdump -t test.o | grep testfn
00000000 g     F .text	00000048 testfn
$
$ llvm-readobj --elf-output-style=GNU -s test.o | grep testfn
     9: 00000000    72 FUNC    GLOBAL DEFAULT     2 testfn
$
$ llvm-objdump --disassemble-symbols=testfn test.o

test.o:	file format elf32-littleriscv

Disassembly of section .text:

00000000 <testfn>:
       0: 1141         	addi	sp, sp, -0x10
       2: c606         	sw	ra, 0xc(sp)
       4: c422         	sw	s0, 0x8(sp)
       6: 0800         	addi	s0, sp, 0x10
       8: fea42a23     	sw	a0, -0xc(s0)
       c: 4501         	li	a0, 0x0
       e: fea42823     	sw	a0, -0x10(s0)
      12: a001         	j	0x12 <testfn+0x12>
      14: ff042583     	lw	a1, -0x10(s0)
      18: 4511         	li	a0, 0x4
      1a: 00b54063     	blt	a0, a1, 0x1a <testfn+0x1a>
      1e: a001         	j	0x1e <testfn+0x1e>
      20: ff042583     	lw	a1, -0x10(s0)
      24: ff442503     	lw	a0, -0xc(s0)
      28: 952e         	add	a0, a0, a1
      2a: fea42a23     	sw	a0, -0xc(s0)
      2e: a001         	j	0x2e <testfn+0x2e>
      30: ff042503     	lw	a0, -0x10(s0)
      34: 0505         	addi	a0, a0, 0x1
      36: fea42823     	sw	a0, -0x10(s0)
      3a: a001         	j	0x3a <testfn+0x3a>
      3c: ff442503     	lw	a0, -0xc(s0)
      40: 40b2         	lw	ra, 0xc(sp)
      42: 4422         	lw	s0, 0x8(sp)
      44: 0141         	addi	sp, sp, 0x10
      46: 8082         	ret
$
$ # Now generate from assembly file
$
$ clang -S test.c
$ clang -c test.s
$ llvm-nm -S test.o | grep testfn
00000000 00000048 T testfn
$
$ llvm-objdump -t test.o | grep testfn
00000000 g     F .text	00000048 testfn
$
$ llvm-readobj --elf-output-style=GNU -s test.o | grep testfn
     9: 00000000    72 FUNC    GLOBAL DEFAULT     2 testfn
$
$ llvm-objdump --disassemble-symbols=testfn test.o

test.o:	file format elf32-littleriscv

Disassembly of section .text:

00000000 <testfn>:
       0: 1141         	addi	sp, sp, -0x10
       2: c606         	sw	ra, 0xc(sp)
       4: c422         	sw	s0, 0x8(sp)
       6: 0800         	addi	s0, sp, 0x10
       8: fea42a23     	sw	a0, -0xc(s0)
       c: 4501         	li	a0, 0x0
       e: fea42823     	sw	a0, -0x10(s0)
      12: a001         	j	0x12 <testfn+0x12>
```
You can see that the disassembly output ends prematurely when going from C => assembly => object file compared to C => object file. The size of the `testfn()` function is 72 (0x48) bytes in both cases. If I use the `-d` option instead of `--disassemble-symbols`, the entire file is disassembled properly for the assembly version.
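
As a quick sanity check on the numbers above, here is a minimal Python sketch (not part of the original repro) that computes how many bytes of `testfn` each listing covers. It takes the last line of each listing, with tabs replaced by spaces, and assumes each encoding is printed as one contiguous hex word, as in the output shown here. The names `end_offset`, `FULL_TAIL`, and `TRUNCATED_TAIL` are made up for this illustration.

```python
import re

# Last line of each listing in this report (tabs replaced by spaces).
FULL_TAIL = "      46: 8082          ret"
TRUNCATED_TAIL = "      12: a001          j 0x12 <testfn+0x12>"

# "<offset>: <encoding> ..." as printed by llvm-objdump in the listings above.
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]+)\s")


def end_offset(line):
    """Return the offset just past the instruction on one disassembly line."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"not a disassembly line: {line!r}")
    addr = int(m.group(1), 16)
    size = len(m.group(2)) // 2  # two hex digits per encoded byte
    return addr + size


SYMBOL_SIZE = 0x48  # size of testfn as reported by llvm-nm -S

full_end = end_offset(FULL_TAIL)
truncated_end = end_offset(TRUNCATED_TAIL)
print(f"C => object:             0x{full_end:x} of 0x{SYMBOL_SIZE:x} bytes")
print(f"C => assembly => object: 0x{truncated_end:x} of 0x{SYMBOL_SIZE:x} bytes")
```

So the first listing covers the whole symbol, while the second stops after 0x14 of 0x48 bytes.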

The local labels are different for C => object file compared to C => assembly => object file:
```bash
# When compiled from C file
$ llvm-readobj --elf-output-style=GNU -s test.o

Symbol table '.symtab' contains 10 entries:
   Num:    Value  Size Type    Bind   Vis       Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND 
     1: 00000000     0 FILE    LOCAL  DEFAULT   ABS test.c
     2: 00000000     0 NOTYPE  LOCAL  DEFAULT     2 $x
     3: 00000014     0 NOTYPE  LOCAL  DEFAULT     2 .L0 
     4: 0000003c     0 NOTYPE  LOCAL  DEFAULT     2 .L0 
     5: 00000020     0 NOTYPE  LOCAL  DEFAULT     2 .L0 
     6: 00000030     0 NOTYPE  LOCAL  DEFAULT     2 .L0 
     7: 00000000     0 NOTYPE  LOCAL  DEFAULT     4 $d
     8: 00000000     0 NOTYPE  LOCAL  DEFAULT     6 $d
     9: 00000000    72 FUNC    GLOBAL DEFAULT     2 testfn
$
# When compiled from assembly file
$ llvm-readobj --elf-output-style=GNU -s test.o

Symbol table '.symtab' contains 10 entries:
   Num:    Value  Size Type    Bind   Vis       Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND 
     1: 00000000     0 FILE    LOCAL  DEFAULT   ABS test.c
     2: 00000000     0 NOTYPE  LOCAL  DEFAULT     2 $x
     3: 00000014     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_1
     4: 0000003c     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_4
     5: 00000020     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_2
     6: 00000030     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_3
     7: 00000000     0 NOTYPE  LOCAL  DEFAULT     4 $d
     8: 00000000     0 NOTYPE  LOCAL  DEFAULT     6 $d
     9: 00000000    72 FUNC    GLOBAL DEFAULT     2 testfn
```
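
To make the difference between the two symbol tables explicit, the following Python sketch (an illustration, not part of the original repro) compares the `.text` label rows and the `testfn` row from both listings. The helper `parse_rows` and the constant names are hypothetical, and the parsing assumes the GNU-style column layout shown above.

```python
# Rows copied from the two llvm-readobj listings above (.text labels and testfn only).
FROM_C = """\
     3: 00000014     0 NOTYPE  LOCAL  DEFAULT     2 .L0
     4: 0000003c     0 NOTYPE  LOCAL  DEFAULT     2 .L0
     5: 00000020     0 NOTYPE  LOCAL  DEFAULT     2 .L0
     6: 00000030     0 NOTYPE  LOCAL  DEFAULT     2 .L0
     9: 00000000    72 FUNC    GLOBAL DEFAULT     2 testfn
"""

FROM_ASM = """\
     3: 00000014     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_1
     4: 0000003c     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_4
     5: 00000020     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_2
     6: 00000030     0 NOTYPE  LOCAL  DEFAULT     2 .LBB0_3
     9: 00000000    72 FUNC    GLOBAL DEFAULT     2 testfn
"""


def parse_rows(text):
    """Parse rows into (value, size, type, bind, ndx, name) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7 or not fields[0].endswith(":"):
            continue  # skip anything that is not a symbol row
        name = fields[7] if len(fields) > 7 else ""
        rows.append((int(fields[1], 16), int(fields[2]),
                     fields[3], fields[4], fields[6], name))
    return rows


c_rows = parse_rows(FROM_C)
asm_rows = parse_rows(FROM_ASM)

# Everything except the name (last field) is compared row for row.
attrs_match = all(c[:-1] == a[:-1] for c, a in zip(c_rows, asm_rows))
renamed = [(c[0], c[-1], a[-1]) for c, a in zip(c_rows, asm_rows) if c[-1] != a[-1]]

print("non-name fields identical:", attrs_match)
for value, c_name, asm_name in renamed:
    print(f"0x{value:02x}: {c_name!r} -> {asm_name!r}")
```

For these rows, value, size, type, binding, and section index are identical in both objects; only the names of the four local labels differ.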

I noticed that if I edit the assembly file and change the jump references to `.LBB0_1` to another local label, the disassembly output advances and stops at the next local label (`.LBB0_2`):
```bash
$ # Edit test.s
$ clang -c test.s
$ llvm-objdump --disassemble-symbols=testfn test.o

test.o:	file format elf32-littleriscv

Disassembly of section .text:

00000000 <testfn>:
       0: 1141         	addi	sp, sp, -0x10
       2: c606         	sw	ra, 0xc(sp)
       4: c422         	sw	s0, 0x8(sp)
       6: 0800         	addi	s0, sp, 0x10
       8: fea42a23     	sw	a0, -0xc(s0)
       c: 4501         	li	a0, 0x0
       e: fea42823     	sw	a0, -0x10(s0)
      12: a001         	j	0x12 <testfn+0x12>
      14: ff042583     	lw	a1, -0x10(s0)
      18: 4511         	li	a0, 0x4
      1a: 00b54063     	blt	a0, a1, 0x1a <testfn+0x1a>
      1e: a001         	j	0x1e <testfn+0x1e>
```
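
The stop points in both assembly-built runs can be checked against the label addresses with a small Python sketch (illustrative only). It assumes the edited `test.s` keeps the label addresses from the assembly-built symbol table in this report; the symbol table of the edited object is not shown here, so that part is unverified. `describe_stop` and the constant names are hypothetical.

```python
# Label addresses from the assembly-built symbol table in this report.
# Assumption: the edited test.s keeps the same label addresses; the report
# does not show the symbol table of the edited object.
LABELS = {0x14: ".LBB0_1", 0x20: ".LBB0_2", 0x30: ".LBB0_3", 0x3C: ".LBB0_4"}
FUNC_END = 0x48  # size of testfn

# (address, encoded byte count) of the last instruction printed in each run.
LAST_PRINTED = {
    "original test.s": (0x12, 2),  # 12: a001  j
    "edited test.s": (0x1E, 2),    # 1e: a001  j
}


def describe_stop(addr, nbytes):
    """Return the offset where output stopped and what sits at that offset."""
    stop = addr + nbytes
    if stop == FUNC_END:
        return stop, "end of testfn"
    return stop, LABELS.get(stop, "no label at this offset")


for run, (addr, nbytes) in LAST_PRINTED.items():
    stop, where = describe_stop(addr, nbytes)
    print(f"{run}: output stops at 0x{stop:x} ({where})")
```

Under that assumption, both truncated listings end exactly at the address of a local label rather than at the end of the 0x48-byte symbol. This is only a correlation drawn from the output above, not a confirmed root cause.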

Tool versions:
```bash
$ clang -v
clang version 21.0.0git (https://github.com/llvm/llvm-project.git 30ff508614c90311509adc0890e32e7f86ec4fb8)
Target: riscv32-unknown-unknown-elf
$
$ llvm-objdump -v
LLVM (http://llvm.org/):
  LLVM version 21.0.0git
  Optimized build.


  Registered Targets:
    riscv32 - 32-bit RISC-V
    riscv64 - 64-bit RISC-V
```
