On Wed, Dec 11, 2024 at 02:14:39AM +0100, nick wrote:
> Thank you! I just ran a quick test and encountered the same crash at the
> same line. I’ll have more time to investigate tomorrow and can provide
> additional details then. Do you have any other ideas I could try in the
> meantime?

A part of my previous explanation was wrong, the issue is not related to
net_copy(), as the crash happened even before.

For some reason, struct bgp_prefix is u64-aligned. That makes sense in
the original code, where u64 alignment is forced by struct net_addr, but
i have no idea why this is true even after the patch removing the forced
alignment.

The slab allocator in sl_alloc() just returns u32-aligned memory, because
it is configured for sizeof(struct bgp_prefix) + sizeof(net_addr_ip6) and
the second is 20 as net_addr_ip6 is u32-aligned. The sl_alloc() internally
only enforces word-size alignment, wich is u32 in your case.

Please disregard the previous patch. Can you try the first attached patch
(log-alignment.patch) to log alignment of different structures with or
without the second patch (net-addr-u32-align.patch) and report me results?


> On 12/11/24 1:41 AM, Ondrej Zajicek wrote:
> > On Tue, Dec 10, 2024 at 09:15:46PM +0100, nick via Bird-users wrote:
> > > I also uploaded the coredumpfile:
> > > https://github.com/PolynomialDivision/coredumpupload/blob/main/bird_coredump
> > Thanks. This seems like an interesting issue. In BIRD, generic net_addr
> > structure is explicitly u64-aligned (to accomodate VPN variants), while
> > specific net_addr_ip4 and net_addr_ip6 are just u32-aligned. In this case
> > net_addr_ip6 is allocated with u32 alignment, but then copied with
> > net_copy(), which assumes generic net_addr for arguments, and compiler
> > probably used some u64-optimized copying, which required 64-bit alignment
> > despite being on 32-bit platform,
> > 
> > For starters, try the attached patch. But it is preliminary, we will revisit
> > alignment of these structures.
> > 
> > 
> > > > > > The root cause appears to be insufficient alignment of memory
> > > > > > allocated for
> > > > > > structures, specifically in this line:
> > > > > > 
> > > > > > ```c
> > > > > > px = mb_alloc(c->pool, sizeof(struct bgp_prefix) + net->length);
> > > > > > ```
> > Note that it is really allocated two lines above, here:
> > 
> >    px = sl_alloc(c->prefix_slab);
> > 

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org)
"To err is human -- to blame it on a computer is even more so."
diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c
index c18a73fe..3c18f80a 100644
--- a/proto/bgp/bgp.c
+++ b/proto/bgp/bgp.c
@@ -2771,5 +2771,12 @@ struct protocol proto_bgp = {
 
 void bgp_build(void)
 {
+  log(L_WARN "ALIGN: %d %d %d %d %d",
+      (int) _Alignof(net_addr),
+      (int) _Alignof(net_addr_ip4),
+      (int) _Alignof(net_addr_ip6),
+      (int) _Alignof(node),
+      (int) _Alignof(struct bgp_prefix));
+
   proto_build(&proto_bgp);
 }
diff --git a/lib/net.h b/lib/net.h
index 61ed37ba..eee59de5 100644
--- a/lib/net.h
+++ b/lib/net.h
@@ -51,7 +51,7 @@ typedef struct net_addr {
   u8 pxlen;
   u16 length;
   u8 data[20];
-  u64 align[0];
+  u32 align[0];
 } net_addr;
 
 typedef struct net_addr_ip4 {

Reply via email to