OK, for the archives:

Someone wrote to me off-list:

> > Other ramdisk-based systems (flashboot, flashrd) have needed to increase
> > NKPTP above the default of 4, and disable isadma (and associated devices).
> > I don't know if it might be relevant here, but easy enough to do that
> > it's probably worth trying.

Very interesting tip; we tried it, but it made no difference.

(Our kernel has always been far below 16 MB, apparently too small to
ever hit those two limits.)


--
However, it set us on the path to the eventual solution.

It did seem a good idea to try & compare 'flashrd'. Outcome: the
MULTIPROCESSOR variant of 'flashrd' did work normally.

This was followed by countless frustrating hours of banging our heads
against the wall: comparing & matching 'option' files, transplanting
ramdisk images back and forth. No experiment worked; it almost drove us
to desperation.


--
Eventually found the cause: it has nothing to do with 'option's or with
the ramdisk image.

The 'src/distrib/i386/common/Makefile.inc' file adds
    COPTS+= -mtune=i486
to the kernel make/gcc command (since OpenBSD 4.8).
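
For anyone who wants to check their own tree for similar surprises, here
is a minimal Python sketch. It is not part of the OpenBSD tree; the
function name find_arch_flags and the choice of flags to look for
(-march, -mtune, -mcpu) are our own assumptions. It scans Makefile text
and reports any gcc CPU/tuning options it finds:

```python
import re

# gcc options that select a target CPU or tuning model.
# (Assumption: these three are the ones worth flagging in this sketch.)
ARCH_FLAG = re.compile(r"-m(?:arch|tune|cpu)=\S+")


def find_arch_flags(makefile_text):
    """Return (line_number, flag) pairs for each CPU/tuning flag found."""
    hits = []
    for lineno, line in enumerate(makefile_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore anything after a make comment
        for flag in ARCH_FLAG.findall(code):
            hits.append((lineno, flag))
    return hits


# Hypothetical sample input, modelled on the line quoted above.
sample = "FOO=bar\nCOPTS+= -mtune=i486\n# COPTS+= -march=i686\n"
print(find_arch_flags(sample))
```

To use it on a real tree, read each Makefile (for example the
Makefile.inc named above) into a string and pass it to
find_arch_flags(); an empty list means no such flag was found in that
file.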

Rather miffed by this discovery! The OpenBSD project does not
allow/support use of non-default gcc arch options. Not even when
compiling userland apps*, let alone when compiling the kernel.

Reasonable policy, as long as you stick to it! Don't make a
nigh-on-unnoticeable deviation in such a canonical place as distrib/i386!


= =
*) "No support for non-default gcc arch options": we don't know whether
that's actually properly documented anywhere, but we learned that when
reporting that 'ntohs16()' miscompiled under -march=i686, back in
OpenBSD 3.8 days:
    http://www.mail-archive.com/misc@openbsd.org/msg19810.html


Thanks to all for the replies/ideas/etc!

+++chefren


On 18-11-10 12:08, chefren wrote:
> We use a custom i386 RAMDISK_CD kernel: basically we add most options
> from GENERIC and GENERIC.MP.
> 
> Upgrading from 4.6 to 4.8, this kernel hangs forever after:
>     root on rd0a swap on rd0b dump on rd0b
> 
> The problem turns out to be MP; activation of the secondary processors.
> 
> The custom kernel works fine on a single-core machine, and a recompiled
> kernel without config lines
>     option MULTIPROCESSOR
>     cpu* at mainbus?
> also works fine everywhere.
> 
> 
> --
> The problem can be reproduced by simply adding those two MP config lines
> to the standard RAMDISK_CD kernel config.
> 
> 
> --
> Experiments with adding printf()s on a Dell 1950 (2 CPUs, 8 cores)
> suggest that the hang happens during:
>     cpu_boot_secondary(&cpu_info[2])
>       pmap_tlb_shootrange()
>         i386_fast_ipi()
> 
> But treat that as an inconclusive hint: we don't know whether the
> printf()s are 100% reliable, and VirtualBox (2 CPU, IOAPIC) seems to make
> it past that point and hang somewhere after init_main() has entered its
> intentional infinite waiting loop, and another computer (Core 2 Duo)
> doesn't hang but reboots immediately around that point.
> 
> 
> --
> Are we overlooking an option/driver that's needed for MP on i386?
> 
> Or is this a kernel regression from 4.6 --> 4.8?
> 
> 
> +++chefren
> 

-- 
http://idd.nl/
Chefren Hagens
