Re: [PATCH 1/8] unwind: build kernel with sframe info

2025-02-04 Thread Indu Bhagat

On 1/27/25 1:33 PM, Weinan Liu wrote:

Use the -Wa,--gsframe flags to build the code, so GAS will generate
a new .sframe section for the stack trace information.
Currently, the sframe format only supports arm64 and x86_64
architectures. Add this configuration on arm64 to enable sframe
unwinder in the future.

Signed-off-by: Weinan Liu 
---
  Makefile  |  6 ++
  arch/Kconfig  |  8 
  arch/arm64/Kconfig.debug  | 10 ++
  include/asm-generic/vmlinux.lds.h | 12 
  4 files changed, 36 insertions(+)

diff --git a/Makefile b/Makefile
index b9464c88ac72..35200c39b98d 100644
--- a/Makefile
+++ b/Makefile
@@ -1064,6 +1064,12 @@ ifdef CONFIG_CC_IS_GCC
  KBUILD_CFLAGS   += -fconserve-stack
  endif
  
+# build with sframe table

+ifdef CONFIG_SFRAME_UNWIND_TABLE
+KBUILD_CFLAGS  += -Wa,--gsframe
+KBUILD_AFLAGS  += -Wa,--gsframe
+endif
+
  # change __FILE__ to the relative path to the source directory
  ifdef building_out_of_srctree
  KBUILD_CPPFLAGS += $(call cc-option,-fmacro-prefix-map=$(srcroot)/=)
diff --git a/arch/Kconfig b/arch/Kconfig
index 6682b2a53e34..ae70f7dbe326 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1736,4 +1736,12 @@ config ARCH_WANTS_PRE_LINK_VMLINUX
  An architecture can select this if it provides 
arch//tools/Makefile
  with .arch.vmlinux.o target to be linked into vmlinux.
  
+config AS_HAS_SFRAME_SUPPORT

+   # Detect availability of the AS option -Wa,--gsframe for generating
+   # sframe unwind table.
+   def_bool $(cc-option,-Wa$(comma)--gsframe)
+


Since the version of an admissible SFrame section needs to be atleast 
SFRAME_VERSION_2, it will make sense to include SFrame version check 
when detecting compatible toolchain.



+config SFRAME_UNWIND_TABLE
+   bool
+
  endmenu
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 265c4461031f..ed619fcb18b3 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -20,4 +20,14 @@ config ARM64_RELOC_TEST
depends on m
tristate "Relocation testing module"
  
+config SFRAME_UNWINDER

+   bool "Sframe unwinder"
+   depends on AS_HAS_SFRAME_SUPPORT
+   depends on 64BIT
+   select SFRAME_UNWIND_TABLE
+   help
+ This option enables the sframe (Simple Frame) unwinder for unwinding
+ kernel stack traces. It uses unwind table that is direclty generated
+ by toolchain based on DWARF CFI information
+
  source "drivers/hwtracing/coresight/Kconfig"
diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 54504013c749..6a437bd084c7 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -469,6 +469,8 @@ defined(CONFIG_AUTOFDO_CLANG) || 
defined(CONFIG_PROPELLER_CLANG)
*(.rodata1) \
}   \
\
+   SFRAME  \
+   \
/* PCI quirks */\
.pci_fixup: AT(ADDR(.pci_fixup) - LOAD_OFFSET) {\
BOUNDED_SECTION_PRE_LABEL(.pci_fixup_early,  _pci_fixups_early, 
 __start, __end) \
@@ -886,6 +888,16 @@ defined(CONFIG_AUTOFDO_CLANG) || 
defined(CONFIG_PROPELLER_CLANG)
  #define TRACEDATA
  #endif
  
+#ifdef CONFIG_SFRAME_UNWIND_TABLE

+#define SFRAME \
+   /* sframe */\
+   .sframe: AT(ADDR(.sframe) - LOAD_OFFSET) {  \
+   BOUNDED_SECTION_BY(.sframe, _sframe_header) \
+   }
+#else
+#define SFRAME
+#endif
+
  #ifdef CONFIG_PRINTK_INDEX
  #define PRINTK_INDEX  \
.printk_index : AT(ADDR(.printk_index) - LOAD_OFFSET) { \





Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-01-28 Thread Indu Bhagat

On 1/27/25 1:33 PM, Weinan Liu wrote:

This patchset implements a generic kernel sframe-based [1] unwinder.
The main goal is to support reliable stacktraces on arm64.

On x86 orc unwinder provides reliable stacktraces. But arm64 misses the
required support from objtool: it cannot generate orc unwind tables for
arm64.

Currently, there's already a sframe unwinder proposed for userspace: [2].
Since the sframe unwind table algorithm is similar, these two proposal
could integrate common functionality in the future.

There are some incomplete features or challenges:
   - The unwinder doesn't yet work with kernel modules. The `start_addr` of
 FRE from kernel modules doesn't appear correct, preventing us from
 unwinding functions from kernel modules.


I did file https://sourceware.org/bugzilla/show_bug.cgi?id=32589 
earlier.  It is misleading (and inconvenient) to see all 0s in the dump 
of non-relocated SFrame section.


I get the sense that while issue 32589 may be causing confusion that the 
SFrame data is not correct, the problem is elsewhere ?


I can share a fix for 32589 so atleast we can verify that the starting 
point is sane.



   - Currently, only GCC supports sframe.

Ref:
[1]: https://sourceware.org/binutils/docs/sframe-spec.html
[2]: https://lore.kernel.org/lkml/cover.1730150953.git.jpoim...@kernel.org/

Madhavan T. Venkataraman (1):
   arm64: Define TIF_PATCH_PENDING for livepatch

Weinan Liu (7):
   unwind: build kernel with sframe info
   arm64: entry: add unwind info for various kernel entries
   unwind: add sframe v2 header
   unwind: Implement generic sframe unwinder library
   unwind: arm64: Add sframe unwinder on arm64
   unwind: arm64: add reliable stacktrace support for arm64
   arm64: Enable livepatch for ARM64

  Makefile   |   6 +
  arch/Kconfig   |   8 +
  arch/arm64/Kconfig |   3 +
  arch/arm64/Kconfig.debug   |  10 +
  arch/arm64/include/asm/stacktrace/common.h |   6 +
  arch/arm64/include/asm/thread_info.h   |   4 +-
  arch/arm64/kernel/entry-common.c   |   4 +
  arch/arm64/kernel/entry.S  |  10 +
  arch/arm64/kernel/setup.c  |   2 +
  arch/arm64/kernel/stacktrace.c | 102 ++
  include/asm-generic/vmlinux.lds.h  |  12 ++
  include/linux/sframe_lookup.h  |  43 +
  kernel/Makefile|   1 +
  kernel/sframe.h| 215 +
  kernel/sframe_lookup.c | 196 +++
  15 files changed, 621 insertions(+), 1 deletion(-)
  create mode 100644 include/linux/sframe_lookup.h
  create mode 100644 kernel/sframe.h
  create mode 100644 kernel/sframe_lookup.c






Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-13 Thread Indu Bhagat

On 2/12/25 11:25 PM, Song Liu wrote:

On Wed, Feb 12, 2025 at 6:45 PM Josh Poimboeuf  wrote:


On Wed, Feb 12, 2025 at 06:36:04PM -0800, Song Liu wrote:

[   81.261748]  copy_process+0xfdc/0xfd58 [livepatch_special_static]


Does that copy_process+0xfdc/0xfd58 resolve to this line in
copy_process()?

 refcount_inc(¤t->signal->sigcnt);

Maybe the klp rela reference to 'current' is bogus, or resolving to the
wrong address somehow?


It resolves the following line.

p->signal->tty = tty_kref_get(current->signal->tty);

I am not quite sure how 'current' should be resolved.


Hm, on arm64 it looks like the value of 'current' is stored in the
SP_EL0 register.  So I guess that shouldn't need any relocations.


The size of copy_process (0xfd58) is wrong. It is only about
5.5kB in size. Also, the copy_process function in the .ko file
looks very broken. I will try a few more things.


When I try each step of kpatch-build, the copy_process function
looks reasonable (according to gdb-disassemble) in fork.o and
output.o. However, copy_process looks weird in livepatch-special-static.o,
which is generated by ld:

ld -EL  -maarch64linux -z norelro -z noexecstack
--no-warn-rwx-segments -T ././kpatch.lds  -r -o
livepatch-special-static.o ./patch-hook.o ./output.o

I have attached these files to the email. I am not sure whether
the email server will let them through.

Indu, does this look like an issue with ld?



Sorry for the delay.

Looks like there has been progress since and issue may be elsewhere, but:

FWIW, I looked at the .sframe and .rela.sframe sections here, the data 
does look OK.  I noted that there is no .sframe for copy_process () in 
output.o... I will take a look into it.




Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-12 Thread Indu Bhagat

On 2/12/25 3:32 PM, Song Liu wrote:

I run some tests with this set and my RFC set [1]. Most of
the test is done with kpatch-build. I tested both Puranjay's
version [3] and my version [4].

For gcc 14.2.1, I have seen the following issue with this
test [2]. This happens with both upstream and 6.13.2.
The livepatch loaded fine, but the system spilled out the
following warning quickly.



In presence of the issue 
https://sourceware.org/bugzilla/show_bug.cgi?id=32666, I'd expect bad 
data in SFrame section.  Which may be causing this symptom?


To be clear, the issue affects loaded kernel modules.  I cannot tell for 
certain - is there module loading involved in your test ?



On the other hand, the same test works with LLVM and
my RFC set (LLVM doesn't support SFRAME, and thus
doesn't work with this set yet).

Thanks,
Song


[   81.250437] [ cut here ]
[   81.250818] refcount_t: saturated; leaking memory.
[   81.251201] WARNING: CPU: 0 PID: 95 at lib/refcount.c:22
refcount_warn_saturate+0x6c/0x140
[   81.251841] Modules linked in: livepatch_special_static(OEK)
[   81.252277] CPU: 0 UID: 0 PID: 95 Comm: bash Tainted: G
OE K6.13.2-00321-g52d2813b4b07 #49
[   81.253003] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE, [K]=LIVEPATCH
[   81.253503] Hardware name: linux,dummy-virt (DT)
[   81.253856] pstate: 634000c5 (nZCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[   81.254383] pc : refcount_warn_saturate+0x6c/0x140
[   81.254748] lr : refcount_warn_saturate+0x6c/0x140
[   81.255114] sp : 800085a6fc00
[   81.255371] x29: 800085a6fc00 x28: 0120 x27: c2966180
[   81.255918] x26:  x25: 8000829c x24: c2e9b608
[   81.256462] x23: 800083351000 x22: c2e9af80 x21: c062e140
[   81.257006] x20: c1c10c00 x19: 800085a6fd80 x18: 
[   81.257544] x17: 0001 x16:  x15: 0006
[   81.258083] x14:  x13: 2e79726f6d656d20 x12: 676e696b61656c20
[   81.258625] x11: 8000829f7d70 x10: 0147 x9 : 8000801546b4
[   81.259165] x8 : fffe x7 :  x6 : 800082f77d70
[   81.259709] x5 : 8000 x4 :  x3 : 0001
[   81.260257] x2 : 8000829f7a88 x1 : 8000829f7a88 x0 : 0026
[   81.260824] Call trace:
[   81.261015]  refcount_warn_saturate+0x6c/0x140 (P)
[   81.261387]  __refcount_add.constprop.0+0x60/0x70
[   81.261748]  copy_process+0xfdc/0xfd58 [livepatch_special_static]
[   81.262217]  kernel_clone+0x80/0x3e0
[   81.262499]  __do_sys_clone+0x5c/0x88
[   81.262786]  __arm64_sys_clone+0x24/0x38
[   81.263085]  invoke_syscall+0x4c/0x108
[   81.263378]  el0_svc_common.constprop.0+0x44/0xe8
[   81.263734]  do_el0_svc+0x20/0x30
[   81.263993]  el0_svc+0x34/0xf8
[   81.264231]  el0t_64_sync_handler+0x104/0x130
[   81.264561]  el0t_64_sync+0x184/0x188
[   81.264846] ---[ end trace  ]---
[   82.335559] [ cut here ]
[   82.335931] refcount_t: underflow; use-after-free.
[   82.336307] WARNING: CPU: 1 PID: 0 at lib/refcount.c:28
refcount_warn_saturate+0xec/0x140
[   82.336949] Modules linked in: livepatch_special_static(OEK)
[   82.337389] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G
W  OE K6.13.2-00321-g52d2813b4b07 #49
[   82.338148] Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE,
[K]=LIVEPATCH
[   82.338721] Hardware name: linux,dummy-virt (DT)
[   82.339083] pstate: 6345 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[   82.339617] pc : refcount_warn_saturate+0xec/0x140
[   82.340007] lr : refcount_warn_saturate+0xec/0x140
[   82.340378] sp : 80008370fe40
[   82.340637] x29: 80008370fe40 x28:  x27: 
[   82.341188] x26: 000a x25: fdaf7ab8 x24: 0014
[   82.341737] x23: 8000829c8d30 x22: 80008370ff28 x21: fe02
[   82.342286] x20: c062e140 x19: c2e9af80 x18: 
[   82.342839] x17: 80007b7a x16: 80008370 x15: 0006
[   82.343389] x14:  x13: 2e656572662d7265 x12: 7466612d65737520
[   82.343944] x11: 8000829f7d70 x10: 016a x9 : 8000801546b4
[   82.344499] x8 : fffe x7 :  x6 : 800082f77d70
[   82.345051] x5 : 8000 x4 :  x3 : 0001
[   82.345604] x2 : 8000829f7a88 x1 : 8000829f7a88 x0 : 0026
[   82.346163] Call trace:
[   82.346359]  refcount_warn_saturate+0xec/0x140 (P)
[   82.346736]  __put_task_struct+0x130/0x170
[   82.347063]  delayed_put_task_struct+0xbc/0xe8
[   82.347411]  rcu_core+0x20c/0x5f8
[   82.347680]  rcu_core_si+0x14/0x28
[   82.347952]  handle_softirqs+0x124/0x308
[   82.348260]  __do_softirq+0x18/0x20
[   82.348536]  do_softirq+0x14/0x28
[   82.348828]  call_on_irq_stack+0x24/0x30
[   82.349137]  do_softirq_own_stack+0x20/0x

Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-14 Thread Indu Bhagat

On 2/13/25 11:57 PM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/12/25 11:25 PM, Song Liu wrote:

On Wed, Feb 12, 2025 at 6:45 PM Josh Poimboeuf  wrote:


On Wed, Feb 12, 2025 at 06:36:04PM -0800, Song Liu wrote:

[   81.261748]  copy_process+0xfdc/0xfd58 [livepatch_special_static]


Does that copy_process+0xfdc/0xfd58 resolve to this line in
copy_process()?

  refcount_inc(¤t->signal->sigcnt);

Maybe the klp rela reference to 'current' is bogus, or resolving to the
wrong address somehow?


It resolves the following line.

p->signal->tty = tty_kref_get(current->signal->tty);

I am not quite sure how 'current' should be resolved.


Hm, on arm64 it looks like the value of 'current' is stored in the
SP_EL0 register.  So I guess that shouldn't need any relocations.


The size of copy_process (0xfd58) is wrong. It is only about
5.5kB in size. Also, the copy_process function in the .ko file
looks very broken. I will try a few more things.


When I try each step of kpatch-build, the copy_process function
looks reasonable (according to gdb-disassemble) in fork.o and
output.o. However, copy_process looks weird in livepatch-special-static.o,
which is generated by ld:

ld -EL  -maarch64linux -z norelro -z noexecstack
--no-warn-rwx-segments -T ././kpatch.lds  -r -o
livepatch-special-static.o ./patch-hook.o ./output.o

I have attached these files to the email. I am not sure whether
the email server will let them through.

Indu, does this look like an issue with ld?



Sorry for the delay.

Looks like there has been progress since and issue may be elsewhere, but:

FWIW, I looked at the .sframe and .rela.sframe sections here, the data
does look OK.  I noted that there is no .sframe for copy_process () in
output.o... I will take a look into it.


Hi Indu,

I saw another issue in my kernel build with sframes enabled (-Wa,--gsframe):

ld: warning: orphan section `.init.sframe' from 
`arch/arm64/kernel/pi/lib-fdt.pi.o' being placed in section `.init.sframe'
[... Many more similar warnings (.init.sframe) ...]

So, this orphan sections is generated in the build process.

I am using GNU ld version 2.41-50 and gcc (GCC) 11.4.1

Is this section needed for sframes to work? or can we discard


No this should not be discarded.  This looks like a wrongly-named but 
valid SFrame section.


Once correctly named as .sframe, the linker should do the right thing. 
Let me check whats going on..



.init.sframe section with a patch like following to the linker script:

-- 8< --

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 6a437bd08..8e704c0a6 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1044,9 +1044,16 @@ defined(CONFIG_AUTOFDO_CLANG) || 
defined(CONFIG_PROPELLER_CLANG)
  # define SANITIZER_DISCARDS
  #endif

+#if defined(CONFIG_SFRAME_UNWIND_TABLE)
+#define DISCARD_INIT_SFRAME *(.init.sframe)
+#else
+#define DISCARD_INIT_SFRAME
+#endif
+
  #define COMMON_DISCARDS   
 \
 SANITIZER_DISCARDS  \
 PATCHABLE_DISCARDS  \
+   DISCARD_INIT_SFRAME \
 *(.discard) \
 *(.discard.*)   \
 *(.export_symbol)   \

-- >8 --

Thanks,
Puranjay





Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-14 Thread Indu Bhagat

On 2/14/25 9:39 AM, Indu Bhagat wrote:

On 2/13/25 11:57 PM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/12/25 11:25 PM, Song Liu wrote:
On Wed, Feb 12, 2025 at 6:45 PM Josh Poimboeuf  
wrote:


On Wed, Feb 12, 2025 at 06:36:04PM -0800, Song Liu wrote:
[   81.261748]  copy_process+0xfdc/0xfd58 
[livepatch_special_static]


Does that copy_process+0xfdc/0xfd58 resolve to this line in
copy_process()?

  refcount_inc(¤t->signal->sigcnt);

Maybe the klp rela reference to 'current' is bogus, or resolving 
to the

wrong address somehow?


It resolves the following line.

p->signal->tty = tty_kref_get(current->signal->tty);

I am not quite sure how 'current' should be resolved.


Hm, on arm64 it looks like the value of 'current' is stored in the
SP_EL0 register.  So I guess that shouldn't need any relocations.


The size of copy_process (0xfd58) is wrong. It is only about
5.5kB in size. Also, the copy_process function in the .ko file
looks very broken. I will try a few more things.


When I try each step of kpatch-build, the copy_process function
looks reasonable (according to gdb-disassemble) in fork.o and
output.o. However, copy_process looks weird in livepatch-special- 
static.o,

which is generated by ld:

ld -EL  -maarch64linux -z norelro -z noexecstack
--no-warn-rwx-segments -T ././kpatch.lds  -r -o
livepatch-special-static.o ./patch-hook.o ./output.o

I have attached these files to the email. I am not sure whether
the email server will let them through.

Indu, does this look like an issue with ld?



Sorry for the delay.

Looks like there has been progress since and issue may be elsewhere, 
but:


FWIW, I looked at the .sframe and .rela.sframe sections here, the data
does look OK.  I noted that there is no .sframe for copy_process () in
output.o... I will take a look into it.


Hi Indu,

I saw another issue in my kernel build with sframes enabled (-Wa,-- 
gsframe):


ld: warning: orphan section `.init.sframe' from `arch/arm64/kernel/pi/ 
lib-fdt.pi.o' being placed in section `.init.sframe'

[... Many more similar warnings (.init.sframe) ...]

So, this orphan sections is generated in the build process.

I am using GNU ld version 2.41-50 and gcc (GCC) 11.4.1

Is this section needed for sframes to work? or can we discard


No this should not be discarded.  This looks like a wrongly-named but 
valid SFrame section.




Not wrongly named. --prefix-alloc-sections=.init is expected to do that 
as .sframe is an allocated section.


Once correctly named as .sframe, the linker should do the right thing. 
Let me check whats going on..



.init.sframe section with a patch like following to the linker script:

-- 8< --

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/ 
vmlinux.lds.h

index 6a437bd08..8e704c0a6 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1044,9 +1044,16 @@ defined(CONFIG_AUTOFDO_CLANG) || 
defined(CONFIG_PROPELLER_CLANG)

  # define SANITIZER_DISCARDS
  #endif

+#if defined(CONFIG_SFRAME_UNWIND_TABLE)
+#define DISCARD_INIT_SFRAME *(.init.sframe)
+#else
+#define DISCARD_INIT_SFRAME
+#endif
+
  #define 
COMMON_DISCARDS    \
 
SANITIZER_DISCARDS  \
 
PATCHABLE_DISCARDS  \

+   DISCARD_INIT_SFRAME \
 
*(.discard) \
 
*(.discard.*)   \
 
*(.export_symbol)   \


-- >8 --

Thanks,
Puranjay







Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-26 Thread Indu Bhagat

On 2/26/25 2:23 AM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/25/25 3:54 PM, Weinan Liu wrote:

On Tue, Feb 25, 2025 at 11:38 AM Indu Bhagat  wrote:


On Mon, Feb 10, 2025 at 12:30 AM Weinan Liu  wrote:

I already have a WIP patch to add sframe support to the kernel module.
However, it is not yet working. I had trouble unwinding frames for the
kernel module using the current algorithm.

Indu has likely identified the issue and will be addressing it from the
toolchain side.

https://sourceware.org/bugzilla/show_bug.cgi?id=32666


I have a working in progress patch that adds sframe support for kernel
module.
https://github.com/heuza/linux/tree/sframe_unwinder.rfc

According to the sframe table values I got during runtime testing, looks
like the offsets are not correct .



I hope to sanitize the fix for 32666 and post upstream soon (I had to
address other related issues).  Unless fixed, relocating .sframe
sections using the .rela.sframe is expected to generate incorrect output.


When unwind symbols init_module(0x80007b155048) from the kernel
module(livepatch-sample.ko), the start_address of the FDE entries in the
sframe table of the kernel modules appear incorrect.


init_module will apply the relocations on the .sframe section, isnt it ?


For instance, the first FDE's start_addr is reported as -20564. Adding
this offset to the module's sframe section address (0x80007b15a040)
yields 0x80007b154fec, which is not within the livepatch-sample.ko
memory region(It should be larger than 0x80007b155000).



Hmm..something seems off here.  Having tested a potential fix for 32666
locally, I do not expect the first FDE to show this symptom.





Hi,

Sorry for not responding in the past few days.  I was on PTO and was
trying to improve my snowboarding technique, I am back now!!

I think what we are seeing is expected behaviour:

  | For instance, the first FDE's start_addr is reported as -20564. Adding
  | this offset to the module's sframe section address (0x80007b15a040)
  | yields 0x80007b154fec, which is not within the livepatch-sample.ko
  | memory region(It should be larger than 0x80007b155000).


Let me explain using a __dummy__ example.

Assume Memory layout before relocation:

  | Address | Element | Relocation
  |     | |
  |   60| init_module (start address) |
  |   72| init_module (end address)   |
  |     | .   |
  |   100   | Sframe section header start address |
  |   128   | First FDE's start address   | RELOC_OP_PREL -> Put 
init_module address (60) - current address (128)

So, after relocation First FDE's start address has value 60 - 128 = -68



For SFrame FDE function start address is :

"Signed 32-bit integral field denoting the virtual memory address of the 
described function, for which the SFrame FDE applies.  The value encoded 
in the ‘sfde_func_start_address’ field is the offset in bytes of the 
function’s start address, from the SFrame section."


So, in your case, after applying the relocations, you will get:
S + A - P = 60 - 128 = -68

This is the distance of the function start address (60) from the current 
location in SFrame section (128)


But what we intend to store is the distance of the function start 
address from the start of the SFrame section.  So we need to do an 
additional step for SFrame FDE:  Value += r_offset


-68 + 28 = -40
Where 28 is the r_offset in the RELA.

So we really expect a -40 in the relocated SFrame section instead of -68 
above.  IOW, the RELAs of SFrame section will need an additional step 
after relocation.



Now, while doing unwinding we Try to add this value to the sframe
section header's start address which is in this example 100,

so 100 + (-68) = 32

So, 32 is not within [60, 72], i.e. within init_module.

You can see that it is possible for this value to be less than the start
address of the module's memory region when this function's address is
very close to the start of the memory region.

The crux is that the offset in the FDE's start address is calculated
based on the address of the FDE's start_address and not based on the
address of the sframe section.


Thanks,
Puranjay





Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-25 Thread Indu Bhagat

On 2/25/25 3:54 PM, Weinan Liu wrote:

On Tue, Feb 25, 2025 at 11:38 AM Indu Bhagat  wrote:


On Mon, Feb 10, 2025 at 12:30 AM Weinan Liu  wrote:

I already have a WIP patch to add sframe support to the kernel module.
However, it is not yet working. I had trouble unwinding frames for the
kernel module using the current algorithm.

Indu has likely identified the issue and will be addressing it from the
toolchain side.

https://sourceware.org/bugzilla/show_bug.cgi?id=32666


I have a working in progress patch that adds sframe support for kernel
module.
https://github.com/heuza/linux/tree/sframe_unwinder.rfc

According to the sframe table values I got during runtime testing, looks
like the offsets are not correct .



I hope to sanitize the fix for 32666 and post upstream soon (I had to
address other related issues).  Unless fixed, relocating .sframe
sections using the .rela.sframe is expected to generate incorrect output.


When unwind symbols init_module(0x80007b155048) from the kernel
module(livepatch-sample.ko), the start_address of the FDE entries in the
sframe table of the kernel modules appear incorrect.


init_module will apply the relocations on the .sframe section, isnt it ?


For instance, the first FDE's start_addr is reported as -20564. Adding
this offset to the module's sframe section address (0x80007b15a040)
yields 0x80007b154fec, which is not within the livepatch-sample.ko
memory region(It should be larger than 0x80007b155000).



Hmm..something seems off here.  Having tested a potential fix for 32666
locally, I do not expect the first FDE to show this symptom.



Yes, I think init_module will apply the relocation as well.
To further investigate, here's the relevant relocation and symbol table
information for the kernel module:

Relocation section '.rela.sframe' at offset 0x28350 contains 3 entries:
   Offset          Info           Type           Sym. Value    Sym. Name + 
Addend
001c  00010105 R_AARCH64_PREL32   .text + 8
0030  00010105 R_AARCH64_PREL32   .text + 28
0044  00010105 R_AARCH64_PREL32   .text + 68



The offsets look OK..


Symbol table '.symtab' contains 68 entries:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
      0:      0 NOTYPE  LOCAL  DEFAULT  UND
      1:      0 SECTION LOCAL  DEFAULT    1 .text
...
     32: 0008    12 FUNC    LOCAL  DEFAULT    1 livepatch_exit
     33: 0008     0 NOTYPE  LOCAL  DEFAULT    3 $d
     34: 0028    44 FUNC    LOCAL  DEFAULT    1 livepatch_init
     35:      0 NOTYPE  LOCAL  DEFAULT    9 $d
     36: 0010     0 NOTYPE  LOCAL  DEFAULT    3 $d
     37: 0068    56 FUNC    LOCAL  DEFAULT    1 
livepatch_cmdlin[...]
...
     63: 0008    12 FUNC    GLOBAL DEFAULT    1 cleanup_module
     64:      0 NOTYPE  GLOBAL DEFAULT  UND klp_enable_patch
     65: 0028    44 FUNC    GLOBAL DEFAULT    1 init_module





Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-25 Thread Indu Bhagat

On Mon, Feb 10, 2025 at 12:30 AM Weinan Liu  wrote:

I already have a WIP patch to add sframe support to the kernel module.
However, it is not yet working. I had trouble unwinding frames for the
kernel module using the current algorithm.

Indu has likely identified the issue and will be addressing it from the
toolchain side.

https://sourceware.org/bugzilla/show_bug.cgi?id=32666


I have a working in progress patch that adds sframe support for kernel
module.
https://github.com/heuza/linux/tree/sframe_unwinder.rfc

According to the sframe table values I got during runtime testing, looks
like the offsets are not correct .



I hope to sanitize the fix for 32666 and post upstream soon (I had to 
address other related issues).  Unless fixed, relocating .sframe 
sections using the .rela.sframe is expected to generate incorrect output.



When unwind symbols init_module(0x80007b155048) from the kernel
module(livepatch-sample.ko), the start_address of the FDE entries in the
sframe table of the kernel modules appear incorrect.


init_module will apply the relocations on the .sframe section, isnt it ?


For instance, the first FDE's start_addr is reported as -20564. Adding
this offset to the module's sframe section address (0x80007b15a040)
yields 0x80007b154fec, which is not within the livepatch-sample.ko
memory region(It should be larger than 0x80007b155000).



Hmm..something seems off here.  Having tested a potential fix for 32666 
locally, I do not expect the first FDE to show this symptom.



Here are the sframe table values of the livepatch-samples.ko that I print
by qemu + gdb.

```
$ /usr/bin/aarch64-linux-gnu-objdump -L --sframe=.sframe 
./samples/livepatch/livepatch-sample.ko
./samples/livepatch/livepatch-sample.ko: file format elf64-littleaarch64

Contents of the SFrame section .sframe:
  Header :

Version: SFRAME_VERSION_2
Flags: SFRAME_F_FDE_SORTED
Num FDEs: 3
Num FREs: 11

  Function Index :

func idx [0]: pc = 0x0, size = 12 bytes
STARTPC CFA   FPRA
  sp+0  u u

func idx [1]: pc = 0x0, size = 44 bytes
STARTPC CFA   FPRA
  sp+0  u u
000c  sp+0  u u[s]
0010  sp+16 c-16  c-8[s]
0024  sp+0  u u[s]
0028  sp+0  u u

func idx [2]: pc = 0x0, size = 56 bytes
STARTPC CFA   FPRA
  sp+0  u u
000c  sp+0  u u[s]
0010  sp+16 c-16  c-8[s]
0030  sp+0  u u[s]
0034  sp+0  u u



(gdb) bt
#0  find_fde (tbl=0x80007b157708, pc=18446603338286190664) at 
kernel/sframe_lookup.c:75
#1  0x80008031e260 in sframe_find_pc (pc=18446603338286190664, 
entry=0x800086f83800) at kernel/sframe_lookup.c:175
#2  0x800080035a48 in unwind_next_frame_sframe (state=0x800086f83828) 
at arch/arm64/kernel/stacktrace.c:270
#3  kunwind_next (state=0x800086f83828) at 
arch/arm64/kernel/stacktrace.c:332
...

(gdb) lx-symbols
loading vmlinux
scanning for modules in /home/wnliu/kernel
loading @0x80007b155000: 
/home/wnliu/kernel/samples/livepatch/livepatch-sample.ko
loading @0x80007b14d000: /home/wnliu/kernel/fs/fat/vfat.ko
loading @0x80007b13: /home/wnliu/kernel/fs/fat/fat.ko

(gdb) p/x *tbl->sfhdr_p
$5 = {preamble = {magic = 0xdee2, version = 0x2, flags = 0x1}, abi_arch = 0x2, 
cfa_fixed_fp_offset = 0x0, cfa_fixed_ra_offset = 0x0, auxhdr_len = 0x0, 
num_fdes = 0x3, num_fres = 0xb, fre_len = 0x25, fdes_off = 0x0, fres_off = 0x3c}

(gdb) p/x tbl->sfhdr_p
$6 = 0x80007b15a040

(gdb) p *tbl->fde_p
$7 = {start_addr = -20564, size = 12, fres_off = 0, fres_num = 1, info = 0 
'\000', rep_size = 0 '\000', padding = 0}

(gdb) p *(tbl->fde_p + 1)
$11 = {start_addr = -20552, size = 44, fres_off = 3, fres_num = 5, info = 0 
'\000', rep_size = 0 '\000', padding = 0}

(gdb) p *(tbl->fde_p + 2)
$12 = {start_addr = -20508, size = 56, fres_off = 20, fres_num = 5, info = 0 
'\000', rep_size = 0 '\000', padding = 0}

/* -20564 + 0x80007b15a040 = 0x80007b154fec */
(gdb) info symbol 0x80007b154fec
No symbol matches 0x80007b154fec
```




Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-02-27 Thread Indu Bhagat

On 2/27/25 1:38 AM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/26/25 2:23 AM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/25/25 3:54 PM, Weinan Liu wrote:

On Tue, Feb 25, 2025 at 11:38 AM Indu Bhagat  wrote:


On Mon, Feb 10, 2025 at 12:30 AM Weinan Liu  wrote:

I already have a WIP patch to add sframe support to the kernel module.
However, it is not yet working. I had trouble unwinding frames for the
kernel module using the current algorithm.

Indu has likely identified the issue and will be addressing it from the
toolchain side.

https://sourceware.org/bugzilla/show_bug.cgi?id=32666


I have a working in progress patch that adds sframe support for kernel
module.
https://github.com/heuza/linux/tree/sframe_unwinder.rfc

According to the sframe table values I got during runtime testing, looks
like the offsets are not correct .



I hope to sanitize the fix for 32666 and post upstream soon (I had to
address other related issues).  Unless fixed, relocating .sframe
sections using the .rela.sframe is expected to generate incorrect output.


When unwind symbols init_module(0x80007b155048) from the kernel
module(livepatch-sample.ko), the start_address of the FDE entries in the
sframe table of the kernel modules appear incorrect.


init_module will apply the relocations on the .sframe section, isnt it ?


For instance, the first FDE's start_addr is reported as -20564. Adding
this offset to the module's sframe section address (0x80007b15a040)
yields 0x80007b154fec, which is not within the livepatch-sample.ko
memory region(It should be larger than 0x80007b155000).



Hmm..something seems off here.  Having tested a potential fix for 32666
locally, I do not expect the first FDE to show this symptom.





Hi,

Sorry for not responding in the past few days.  I was on PTO and was
trying to improve my snowboarding technique, I am back now!!

I think what we are seeing is expected behaviour:

   | For instance, the first FDE's start_addr is reported as -20564. Adding
   | this offset to the module's sframe section address (0x80007b15a040)
   | yields 0x80007b154fec, which is not within the livepatch-sample.ko
   | memory region(It should be larger than 0x80007b155000).


Let me explain using a __dummy__ example.

Assume Memory layout before relocation:

   | Address | Element | Relocation
   |     | |
   |   60| init_module (start address) |
   |   72| init_module (end address)   |
   |     | .   |
   |   100   | Sframe section header start address |
   |   128   | First FDE's start address   | RELOC_OP_PREL -> Put 
init_module address (60) - current address (128)

So, after relocation First FDE's start address has value 60 - 128 = -68



For SFrame FDE function start address is :

"Signed 32-bit integral field denoting the virtual memory address of the
described function, for which the SFrame FDE applies.  The value encoded
in the ‘sfde_func_start_address’ field is the offset in bytes of the
function’s start address, from the SFrame section."

So, in your case, after applying the relocations, you will get:
S + A - P = 60 - 128 = -68

This is the distance of the function start address (60) from the current
location in SFrame section (128)

But what we intend to store is the distance of the function start
address from the start of the SFrame section.  So we need to do an
additional step for SFrame FDE:  Value += r_offset


Thanks for the explaination, now it makes sense.

But I couldn't find a relocation type in AARCH64 that does this extra +=
r_offset along with PREL32.

The kernel's module loader is only doing the R_AARCH64_PREL32 which is
why we see this issue.

How is this working even for the kernel itself? or for that matter, any
other binary compiled with sframe?



For the usual executables or shared objects, the calculations are 
applied by ld.bfd at this time.  Hence, the issue manifests in 
relocatable files.



 From my limited undestanding, the way to fix this would be to hack the
relocator to do this additional step while relocating .sframe sections.
Or the 'addend' values in .rela.sframe should already have the +r_offset
added to it, then no change to the relocator would be needed.



Of the two, adjusting the addend values in .rela.sframe may be a 
reasonable way to go about it.  Let me try it out in GAS and ld.bfd.



-68 + 28 = -40
Where 28 is the r_offset in the RELA.

So we really expect a -40 in the relocated SFrame section instead of -68
above.  IOW, the RELAs of SFrame section will need an additional step
after relocation.



Thanks,
Puranjay





Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

2025-03-09 Thread Indu Bhagat

On 2/27/25 10:47 PM, Indu Bhagat wrote:

On 2/27/25 1:38 AM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/26/25 2:23 AM, Puranjay Mohan wrote:

Indu Bhagat  writes:


On 2/25/25 3:54 PM, Weinan Liu wrote:
On Tue, Feb 25, 2025 at 11:38 AM Indu Bhagat 
 wrote:


On Mon, Feb 10, 2025 at 12:30 AM Weinan Liu  
wrote:
I already have a WIP patch to add sframe support to the kernel 
module.
However, it is not yet working. I had trouble unwinding frames 
for the

kernel module using the current algorithm.

Indu has likely identified the issue and will be addressing it 
from the

toolchain side.

https://sourceware.org/bugzilla/show_bug.cgi?id=32666


I have a working in progress patch that adds sframe support for 
kernel

module.
https://github.com/heuza/linux/tree/sframe_unwinder.rfc

According to the sframe table values I got during runtime 
testing, looks

like the offsets are not correct .



I hope to sanitize the fix for 32666 and post upstream soon (I 
had to

address other related issues).  Unless fixed, relocating .sframe
sections using the .rela.sframe is expected to generate incorrect 
output.



When unwind symbols init_module(0x80007b155048) from the kernel
module(livepatch-sample.ko), the start_address of the FDE 
entries in the

sframe table of the kernel modules appear incorrect.


init_module will apply the relocations on the .sframe section, 
isnt it ?


For instance, the first FDE's start_addr is reported as -20564. 
Adding
this offset to the module's sframe section address 
(0x80007b15a040)
yields 0x80007b154fec, which is not within the livepatch- 
sample.ko

memory region(It should be larger than 0x80007b155000).



Hmm..something seems off here.  Having tested a potential fix for 
32666

locally, I do not expect the first FDE to show this symptom.





Hi,

Sorry for not responding in the past few days.  I was on PTO and was
trying to improve my snowboarding technique, I am back now!!

I think what we are seeing is expected behaviour:

   | For instance, the first FDE's start_addr is reported as -20564. 
Adding
   | this offset to the module's sframe section address 
(0x80007b15a040)
   | yields 0x80007b154fec, which is not within the livepatch- 
sample.ko

   | memory region(It should be larger than 0x80007b155000).


Let me explain using a __dummy__ example.

Assume Memory layout before relocation:

   | Address | Element | Relocation
   |     |     |
   |   60    | init_module (start address) |
   |   72    | init_module (end address)   |
   |     | .   |
   |   100   | Sframe section header start address |
   |   128   | First FDE's start address   | 
RELOC_OP_PREL -> Put init_module address (60) - current address (128)


So, after relocation First FDE's start address has value 60 - 128 = -68



For SFrame FDE function start address is :

"Signed 32-bit integral field denoting the virtual memory address of the
described function, for which the SFrame FDE applies.  The value encoded
in the ‘sfde_func_start_address’ field is the offset in bytes of the
function’s start address, from the SFrame section."

So, in your case, after applying the relocations, you will get:
S + A - P = 60 - 128 = -68

This is the distance of the function start address (60) from the current
location in SFrame section (128)

But what we intend to store is the distance of the function start
address from the start of the SFrame section.  So we need to do an
additional step for SFrame FDE:  Value += r_offset


Thanks for the explaination, now it makes sense.

But I couldn't find a relocation type in AARCH64 that does this extra +=
r_offset along with PREL32.

The kernel's module loader is only doing the R_AARCH64_PREL32 which is
why we see this issue.

How is this working even for the kernel itself? or for that matter, any
other binary compiled with sframe?



For the usual executables or shared objects, the calculations are 
applied by ld.bfd at this time.  Hence, the issue manifests in 
relocatable files.



 From my limited undestanding, the way to fix this would be to hack the
relocator to do this additional step while relocating .sframe sections.
Or the 'addend' values in .rela.sframe should already have the +r_offset
added to it, then no change to the relocator would be needed.



Of the two, adjusting the addend values in .rela.sframe may be a 
reasonable way to go about it.  Let me try it out in GAS and ld.bfd.




A fix for this is in the works (being discussed on the 
binutils@sourceware list).  I will keep you posted.


Thanks
Indu