I submitted this to bugzilla last week, but I didn't know who to assign it to, and I wanted it to get more exposure. If I'm posting this to the wrong place, I apologize:

I suspect there might be an overflow bug related to the R_ALPHA_LITERAL relocation type in binutils, but I cannot be certain. This is a summary of what led me to this conjecture:

For the past few days I've been attempting to compile the openafs-1.3.87 client kernel module, libafs, against linux-2.6.12.5 on an alpha. Specifically, this machine is running:

    Debian GNU/Linux 3.1 (sarge)
    linux-2.6.12.5
    gcc-3.3.5-13
    binutils-2.15-6

I have successfully compiled libafs before, most recently version 1.3.81. However, while each of the versions 1.3.82-1.3.87 appears to compile properly, inserting the libafs.ko module always fails with the same error:

    # insmod libafs-2.6.12.5.ko
    insmod: error inserting 'libafs-2.6.12.5.ko': -1 Invalid module format

which produces the dmesg output:

    # dmesg
    ...
    module libafs: Relocation overflow vs __divqu

I then recompiled both libafs 1.3.81 and libafs 1.3.87 with a cvs build of binutils (2.16.91 20050827). Again, the former loaded successfully and the latter failed.

Here is some cursory analysis of what's going on. Checking both modules for relocations against __divqu:

1.3.81 (success):

    # readelf -r libafs-2.6.12.5.ko

    Relocation section '.rela.text' at offset 0x94d60 contains 28885 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000000000  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    000000000060  02ce00000004 R_ALPHA_LITERAL   0000000000000000 __divqu + 0
    000000000070  000100000005 R_ALPHA_LITUSE    0000000000000000 .text + 3
    000000000070  02ce00000008 R_ALPHA_HINT      0000000000000000 __divqu + 0
    0000000000a0  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    000000000100  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    000000000154  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    ...

1.3.87 (failure):

    # readelf -r libafs-2.6.12.5.ko

    Relocation section '.rela.text' at offset 0xa6ae0 contains 35987 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000000000  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    000000000060  02f000000004 R_ALPHA_LITERAL   0000000000000000 __divqu + 0
    000000000070  000100000005 R_ALPHA_LITUSE    0000000000000000 .text + 3
    000000000070  02f000000008 R_ALPHA_HINT      0000000000000000 __divqu + 0
    0000000000a0  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    000000000100  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    000000000154  000100000006 R_ALPHA_GPDISP    0000000000000000 .text + 4
    ...

So, in both modules there appear to be two relocation entries for __divqu: one R_ALPHA_LITERAL and one R_ALPHA_HINT.

I then looked at the kernel source code to see how linux handles these relocations. The R_ALPHA_HINT case in linux-2.6.12.5/arch/alpha/kernel/module.c showed nothing that would result in that relocation error being printed to the console. Looking at the R_ALPHA_LITERAL case:

    case R_ALPHA_LITERAL:
            hi = got + r_got_offset;
            lo = hi - gp;
            if ((short)lo != lo)
                    goto reloc_overflow;
            *(u16 *)location = lo;
            *(u64 *)hi = value;
            break;

I'm not entirely sure what those variables are, but as far as I can tell, got is the address of the global offset table, gp is the value of the gp register, and lo is the gp-relative offset of the GOT entry, which has to fit in a signed 16-bit field.

I added some debugging output to the above routine and reran insmod on both the working 1.3.81 libafs.ko and the broken 1.3.87 libafs.ko:

1.3.81 (success):

    # insmod libafs-2.6.12.5.ko
    # dmesg
    ...
    module libafs: Relocation R_ALPHA_LITERAL: __divqu
        got: fffffffc0051efc8
        r_got_offset: 0
        gp: fffffffc005261a8
        hi: fffffffc0051efc8
        lo: ffffffffffff8e20
        (short)lo: 8e20
    Found system call table at 0xfffffc00006c0748 (scan: close+wait4)

Here lo is a negative offset (-0x71e0, or -29152) that fits in a signed short, so it truncates to the same negative value. Thus ((short)lo != lo) evaluates false, and the module continues to load.

Meanwhile:

1.3.87 (failure):

    # insmod libafs-2.6.12.5.ko
    insmod: error inserting 'libafs-2.6.12.5.ko': -1 Invalid module format
    # dmesg
    ...
    module libafs: Relocation R_ALPHA_LITERAL: __divqu
        got: fffffffc0056f3f0
        r_got_offset: 0
        gp: fffffffc0057a358
        hi: fffffffc0056f3f0
        lo: ffffffffffff5098
        (short)lo: 5098
    module libafs: Relocation overflow vs __divqu

Again lo is a negative offset, but this time it is -0xaf68 (-44904), which is below the signed 16-bit minimum of -32768. Truncating it to a signed short overflows and yields the positive value 0x5098. Thus ((short)lo != lo) evaluates true, and the kernel error results.

I assumed that this overflow comes directly from the relocation section in the module itself, which leads me to believe that there is a bug somewhere in binutils that improperly determines the relocation type for the given boundaries. I started to look through the binutils/bfd source itself, but lacking familiarity with it I can't make sense of much.

A glance at the changelog between openafs 1.3.81 and 1.3.82 (the first version that failed in this way) showed nothing obvious that would suggest another reason for this error. However, 1.3.82+ does contain some additional code in the module that increases its size from 1.4M to 1.6M. A search on lkml suggested that relocation overflows on alpha linux are often associated with larger modules. I believe that these recent versions are the largest libafs modules on alpha to date.

If you would like more information to help analyze this problem (build output, object modules, source trees), please let me know and I'll try to provide whatever I can.

Thanks.

_______________________________________________
bug-binutils mailing list
bug-binutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-binutils
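
P.S. For anyone who wants to check the arithmetic without loading a module, here is a minimal user-space sketch of the range test from the R_ALPHA_LITERAL case, fed with the hi and gp values from the debug output above. The function names literal_lo and literal_overflows are my own and do not exist in the kernel. I also wrote the test as an explicit range comparison instead of the kernel's ((short)lo != lo), which I assume gives the same result with gcc but does not depend on how an out-of-range conversion to short behaves.

```c
#include <stdint.h>

/* Hypothetical helper (not a kernel function): the gp-relative
 * displacement of a GOT entry, i.e. "lo = hi - gp" from the
 * R_ALPHA_LITERAL case. The subtraction is done on unsigned values
 * and the result is reinterpreted as signed, which assumes the usual
 * two's-complement wraparound (as gcc provides). */
static int64_t literal_lo(uint64_t hi, uint64_t gp)
{
    return (int64_t)(hi - gp);
}

/* Hypothetical helper: returns 1 when the displacement does not fit
 * in the signed 16-bit field of the instruction, 0 when it fits.
 * This is intended to match the kernel's ((short)lo != lo) test. */
static int literal_overflows(uint64_t hi, uint64_t gp)
{
    int64_t lo = literal_lo(hi, gp);
    return lo < INT16_MIN || lo > INT16_MAX;
}
```

With the 1.3.81 values (hi = 0xfffffffc0051efc8, gp = 0xfffffffc005261a8) literal_lo gives -29152 and literal_overflows returns 0. With the 1.3.87 values (hi = 0xfffffffc0056f3f0, gp = 0xfffffffc0057a358) literal_lo gives -44904 and literal_overflows returns 1, which agrees with what the kernel reported.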