Can you give me a quick guide on how to enable the debug output?
For example, I tried setting this in the config file:
log "/tmp/birddebug.log" all;
debug protocols all;
I am not able to see any output from ALIGN.
On 12/11/24 4:40 AM, Ondrej Zajicek wrote:
On Wed, Dec 11, 2024 at 02:14:39AM +0100, nick wrote:
Thank you! I just ran a quick test and encountered the same crash at the
same line. I’ll have more time to investigate tomorrow and can provide
additional details then. Do you have any other ideas I could try in the
meantime?
Part of my previous explanation was wrong: the issue is not related to
net_copy(), as the crash happened even before that call.
For some reason, struct bgp_prefix is u64-aligned. That makes sense in
the original code, where u64 alignment is forced by struct net_addr, but
I have no idea why this is true even after the patch removing the forced
alignment.
The slab allocator in sl_alloc() just returns u32-aligned memory, because
it is configured for sizeof(struct bgp_prefix) + sizeof(net_addr_ip6) and
the second is 20 as net_addr_ip6 is u32-aligned. The sl_alloc() internally
only enforces word-size alignment, which is u32 in your case.
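
To illustrate the arithmetic above, here is a minimal sketch. It does not use BIRD's real definitions: the demo_* structures and the helper functions are hypothetical stand-ins, and the slab is modelled simply as an array of objects whose stride is the object size rounded up to the enforced alignment. With a 16-byte u64-aligned header and a 20-byte u32-aligned address, the object size is 36, so with word-size (4-byte) rounding the second object lands at offset 36, which is not a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for net_addr_ip6: u32-aligned, 20 bytes. */
struct demo_net_addr_ip6 {
  uint8_t type;
  uint8_t pxlen;
  uint16_t length;
  uint32_t prefix[4];
};

/* Hypothetical stand-in for a header the compiler treats as u64-aligned. */
struct demo_prefix_hdr {
  _Alignas(8) uint64_t key;
  uint32_t data[2];
};

/* Round size up to the next multiple of align. */
static size_t round_up(size_t size, size_t align)
{
  return (size + align - 1) / align * align;
}

/* Offset of the index-th object in a slab whose stride is the
 * object size rounded up to the enforced alignment (an assumed model). */
static size_t slab_obj_offset(size_t index, size_t obj_size, size_t align)
{
  return index * round_up(obj_size, align);
}

/* True if offset is a multiple of align. */
static int is_aligned(size_t offset, size_t align)
{
  return offset % align == 0;
}
```

Calling slab_obj_offset(1, 36, 4) gives 36, which fails is_aligned(..., 8), while slab_obj_offset(1, 36, 8) gives 40. In this model, every second object is misaligned for u64 access unless the stride is rounded to 8.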
Please disregard the previous patch. Can you try the first attached patch
(log-alignment.patch) to log alignment of different structures with or
without the second patch (net-addr-u32-align.patch) and report the results?
On 12/11/24 1:41 AM, Ondrej Zajicek wrote:
On Tue, Dec 10, 2024 at 09:15:46PM +0100, nick via Bird-users wrote:
I also uploaded the coredumpfile:
https://github.com/PolynomialDivision/coredumpupload/blob/main/bird_coredump
Thanks. This seems like an interesting issue. In BIRD, the generic net_addr
structure is explicitly u64-aligned (to accommodate VPN variants), while the
specific net_addr_ip4 and net_addr_ip6 are just u32-aligned. In this case
net_addr_ip6 is allocated with u32 alignment, but then copied with
net_copy(), which assumes generic net_addr for its arguments, and the compiler
probably used some u64-optimized copying, which required 64-bit alignment
despite being on a 32-bit platform.
For starters, try the attached patch. But it is preliminary, we will revisit
alignment of these structures.
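
As a rough illustration of the alignment mismatch described above (not BIRD's actual code), the following sketch defines a hypothetical u32-aligned address structure and a hypothetical generic union forced to u64 alignment. The names sketch_addr_ip6, sketch_addr, ptr_is_aligned() and sketch_addr_copy() are made up for this example. The copy helper works byte-wise via memcpy(), so it makes no assumption about the alignment of its arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical IPv6-specific address: naturally u32-aligned, 20 bytes. */
struct sketch_addr_ip6 {
  uint8_t type;
  uint8_t pxlen;
  uint16_t length;      /* total size of this structure in bytes */
  uint32_t prefix[4];
};

/* Hypothetical generic address: forced to u64 alignment. */
union sketch_addr {
  _Alignas(8) uint8_t raw[40];
  struct sketch_addr_ip6 ip6;
};

/* True if p satisfies the given power-of-two alignment. */
static int ptr_is_aligned(const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Byte-wise copy that reads the length field without assuming alignment. */
static void sketch_addr_copy(void *dst, const void *src)
{
  uint16_t len;
  memcpy(&len, (const uint8_t *) src + offsetof(struct sketch_addr_ip6, length),
         sizeof(len));
  memcpy(dst, src, len);
}
```

A pointer at an 8-aligned address plus 4 passes ptr_is_aligned(p, 4) but fails ptr_is_aligned(p, 8). That is the situation in which code compiled for the u64-aligned generic type could fault, while the memcpy()-based copy still works.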
The root cause appears to be insufficient alignment of memory allocated for
structures, specifically in this line:
```c
px = mb_alloc(c->pool, sizeof(struct bgp_prefix) + net->length);
```
Note that it is really allocated two lines above, here:
px = sl_alloc(c->prefix_slab);