I agree with (1): vmx is not ready for NIX yet.
As for (2), I'd make a slight change: the goal is to remove NIX as an
independent entity and subsume it into other Plan 9 kernels (I prefer 9front
at this point).

Big picture, NIX could be a build option for 9front, or a branch in 9front;
something like that.

It makes no sense to keep NIX as its own thing; so much has improved in
Plan 9 in 14 years. I looked into it, and keeping it separate would turn
into a lot of duplicate work, to no good effect.


On Wed, Jan 29, 2025 at 5:51 AM <tlaro...@kergis.com> wrote:

> It seems to me that the best course, for now, is the following:
>
> 1) [Using qemu or booting the kernel on baremetal] Try and correct
> Nix, in its present state, to achieve a running Nix, with 9front, for
> objtype==amd64;
>
> 2) Once 1) is achieved, start cleaning (at which point the mps
> stuff could be revised) and reorganizing the code to clearly segregate
> Machine Independent (M.I.) and Machine Dependent (M.D.) parts, so that
> porting Nix to other archs becomes possible.
>
> And concurrently, during either step, document...
>
> On Tue, Jan 28, 2025 at 04:02:11PM -0800, Ron Minnich wrote:
> > btw, if you
> > acid 9pc64
> > you can paste this right into acid
> > src(0xfffffffff011cdee); // dumpstack+0x10
> > src(0xfffffffff013d50f); // panic+0x133
> > src(0xfffffffff0116a3b); // KADDR+0x55
> > src(0xfffffffff012fe55); // sigsearch+0xc8
> > src(0xfffffffff012fec9); // mpsinit+0x14
> > src(0xfffffffff011622a); // main+0x30b
> > src(0xfffffffff0110204); // ndnr
> > and see the source.
> >
> > Also, the ndnr is a jmk-ism: it means "no deposit, no return"
> >
> > so, let's see, I can't tell if we went over this before.
> > What is KADDR and KADDR2? They relate to TMFM, another jmk-ism I
> > believe: Too Many F-ing Megabytes, where too many is "more than 2G" --
> > why 2G? well ...
> >
> > basically, amd64, like lots of things (risc-v) uses this one simple
> > trick: if you sign-extend a 32-bit pointer, you get something anchored
> > either at the top 2G (kernel va) or the low 2G (user code).
> >
> > i.e. 0x80000000 -> 0xffffffff_80000000 -> this is convenient. You can
> > use 32-bit pointers for lots of things, and, since the amd64 is a
> > pretty half-way 64-bit CPU (lots of 64-bit instructions only
> > completely work with RAX), this is helpful.
> >
> > And it works great until you get CPUs with TMFM. Then you need to
> > split memory up:
> > physical 0 -> 2 GiB becomes virtual 0xffffffff_80000000
> > physical 2 GiB and up becomes ... fffffe0000000000
> > Why fffffe0000000000? the first amd64 only had something like 41(?)
> > bits of virtual address: there's this giant hole in the middle, and
> > kernel virtual HAD to start at that address -- 64 bits - whatever gets
> > you to 23 bits. [I can't find the actual documents on this, I am out
> > of time to look, so you'll need to fill in my likely errors here]
> > It's a hardware mandate from opteron land.
> >
> > OK, so KADDR2 is fffffe0000000000, and that error is saying code
> > called KADDR2 with something that's not in KADDR2. That va
> > fffffffffffffc00 is in the low 2GiB physical, which is KADDR.
> >
> > This is a side effect of the real problem: you don't have the table it
> > wants. So you need to fix that, OR, start using qemu for your testing.
> >
> > I hope I did not mess the details up too much here....
> >
> > On Tue, Jan 28, 2025 at 1:24 PM ron minnich <rminn...@gmail.com> wrote:
> > >
> > > I'd be happier to remove the mps dependency actually. The mps is long
> > > dead. But that's a bigger story.
> > >
> > >
> > > On Tue, Jan 28, 2025 at 11:24 AM Paul Lalonde <paul.a.lalo...@gmail.com> wrote:
> > >>
> > >> Ah, that's the code path that sent me to QEMU.
> > >> Vmx doesn't have any MP tables, which leads to this fault in mpsinit.
> > >> Ron provided this minimal one for me, which I think we could learn from
> > >> to adapt into vmx.  The hacky version of pointing the code directly at
> > >> something like this baked in didn't excite me.
> > >>
> > >> 50 43 4D 50                        ; "PCMP"
> > >> 00 00                              ; Table Length (placeholder)
> > >> 04                                 ; Spec Revision
> > >> 00                                 ; Checksum (placeholder)
> > >> 42 4F 43 48 53 43 50 55            ; "BOCHSCPU"
> > >> 30 2E 31 20 20 20 20 20 20 20 20   ; "0.1         "
> > >> 00 00 00 00                        ; OEM Table Pointer
> > >> 00 00                              ; OEM Table Size
> > >> 14 00                              ; Entry Count (2 CPUs + 18 = 20, little-endian)
> > >> 00 00 E0 FE                        ; Local APIC Address (0xfee00000)
> > >> 00 00                              ; Ext Table Length
> > >> 00                                 ; Ext Table Checksum
> > >> 00                                 ; Reserved
> > >>
> > >> On Tue, Jan 28, 2025 at 11:15 AM <tlaro...@kergis.com> wrote:
> > >>>
> > >>> On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> > >>> > ktrace can generate a stack for you from that dump.  The line starting
> > >>> > with "ktrace" is the command line (you might change 9k8cpu to the path
> > >>> > to the kernel file if you're not in the directory where you built it).
> > >>> > Then the following lines up to but not including the "cpu0: exiting"
> > >>> > can be dropped into ktrace's stdin to have it generate a stack trace.
> > >>> > You'll need to add the ^d at the end if you're cut-and-pasting.
> > >>> >
> > >>> > Though it looks like it's just triggering the page fault trap on that
> > >>> > 0xfffffffffffffc00 address, which itself looks like a victim of
> > >>> > sign-extension.  So back up to the fault and find the source of that
> > >>> > address?
> > >>>
> > >>> Yes:
> > >>>
> > >>> src(0xfffffffff011cdee); // dumpstack+0x10
> > >>> src(0xfffffffff013d50f); // panic+0x133
> > >>> src(0xfffffffff0116a3b); // KADDR+0x55
> > >>> src(0xfffffffff012fe55); // sigsearch+0xc8
> > >>> src(0xfffffffff012fec9); // mpsinit+0x14
> > >>> src(0xfffffffff011622a); // main+0x30b
> > >>> src(0xfffffffff0110204); // ndnr
> > >>>
> > >>> this doesn't tell me much more than what I knew already: it panics in
> > >>> mpsinit, calling KADDR in map.c.
> > >>>
> > >>> During my next wandering under Nix, I will try to track back from
> > >>> where the offending address is taken or with what it is constructed.
> > >>>
> > >>> >
> > >>> > On Tue, Jan 28, 2025 at 9:09 AM <tlaro...@kergis.com> wrote:
> > >>> >
> > >>> > > On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> > >>> > > > Do you have a stack for the assert, from the ktrace?
> > >>> > > >
> > >>> > >
> > >>> > > Yes, and I was wrong: it fails relatively "late" in main.c: at
> > >>> > > mpsinit.
> > >>> > >
> > >>> > > Here is the info (I added a bunch of print() before each function
> > >>> > > call to know where it stumbled upon an incorrect address):
> > >>> > >
> > >>> > > term% nix/test_vmx
> > >>> > >
> > >>> > > NIX
> > >>> > >  mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused 0xfffffffff023d000 vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
> > >>> > > sys->pd 0x108003 0x108023
> > >>> > > cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
> > >>> > > cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
> > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > >>> > >  ioinit... multibootmemassert... kbdinit... meminit...asm: addr 0x0000000004000000 end 0x0000000004000000 type 1 size 0
> > >>> > > cm 0: addr 0x4000000 npage 0
> > >>> > > 0 0 0
> > >>> > > npage 0 upage 0 kpage 16384
> > >>> > >  confinit... archinit... mallocinit...base 0xfffffffff023d000 ptr 0xfffffffff023d000 nunits 4047617
> > >>> > >  acpiinit... umeminit... trapinit... printinit... i8259init... procinit... mpsinit...panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >= fffffe0000000000
> > >>> > > panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >= fffffe0000000000
> > >>> > >
> > >>> > > dumpstack
> > >>> > >         ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
> > >>> > >         estackx 0xfffffffff0106000
> > >>> > >         0xfffffffff0105c70=0xfffffffff0105da8
> > >>> > > 0xfffffffff0105c78=0xfffffffff011cb91
> > >>> > >         0xfffffffff0105c80=0xfffffffff0105c98
> > >>> > > 0xfffffffff0105c98=0xfffffffff013cff7
> > >>> > >         0xfffffffff0105cb0=0xfffffffff0105cd0
> > >>> > > 0xfffffffff0105cc0=0xfffffffff0105ea7
> > >>> > >         0xfffffffff0105cc8=0xfffffffff0105df3
> > >>> > > 0xfffffffff0105ce0=0xfffffffff013d14d
> > >>> > >         0xfffffffff0105d08=0xfffffffff0105d90
> > >>> > > 0xfffffffff0105d28=0xfffffffff011cdee
> > >>> > >         0xfffffffff0105d30=0xfffffffff0105da8
> > >>> > > 0xfffffffff0105d40=0xfffffffff0105d58
> > >>> > >         0xfffffffff0105d48=0xfffffffff0105da8
> > >>> > > 0xfffffffff0105d50=0xfffffffff011cdee
> > >>> > >         0xfffffffff0105d58=0xfffffffff011cb99
> > >>> > > 0xfffffffff0105d68=0xfffffffff013d50f
> > >>> > >         0xfffffffff0105d88=0xfffffffff0105ed0
> > >>> > > 0xfffffffff0105d90=0xfffffffff013cff7
> > >>> > >         0xfffffffff0105d98=0xfffffffff0105db5
> > >>> > > 0xfffffffff0105e08=0xfffffffff013d1b8
> > >>> > >         0xfffffffff0105e10=0xfffffffff0105e00
> > >>> > > 0xfffffffff0105e20=0xfffffffff0105ea3
> > >>> > >         0xfffffffff0105e28=0xfffffffff0105e98
> > >>> > > 0xfffffffff0105e38=0xfffffffff013d1b8
> > >>> > >         0xfffffffff0105e40=0xfffffffff0105e98
> > >>> > > 0xfffffffff0105e60=0xfffffffff013d217
> > >>> > >         0xfffffffff0105e68=0xfffffffff015d9c9
> > >>> > > 0xfffffffff0105e80=0xfffffffff0105fb8
> > >>> > >         0xfffffffff0105e90=0xfffffffff015d5d9
> > >>> > > 0xfffffffff0105ea8=0xfffffffff0105ed0
> > >>> > >         0xfffffffff0105ec0=0xfffffffff0116a3b
> > >>> > > 0xfffffffff0105ef8=0xfffffffff012fe55
> > >>> > >         0xfffffffff0105f08=0xfffffffff01a1afa
> > >>> > > 0xfffffffff0105f10=0x0000000000000004
> > >>> > >         0xfffffffff0105f18=0x0000000000000046
> > >>> > > 0xfffffffff0105f20=0xfffffffff00fffd9
> > >>> > >         0xfffffffff0105f28=0x0000000000000006
> > >>> > > 0xfffffffff0105f30=0xfffffffff015d5d9
> > >>> > >         0xfffffffff0105f38=0xfffffffff0000400
> > >>> > > 0xfffffffff0105f40=0x0000000000000000
> > >>> > >         0xfffffffff0105f48=0xfffffffff012fec9
> > >>> > > 0xfffffffff0105f50=0xfffffffff01a1aff
> > >>> > >         0xfffffffff0105f58=0x0000000000000208
> > >>> > > 0xfffffffff0105f60=0x0000000000000124
> > >>> > >         0xfffffffff0105f68=0xfffffffff01149d0
> > >>> > > 0xfffffffff0105f70=0x0000000000000006
> > >>> > >         0xfffffffff0105f78=0xfffffffff0114ba7
> > >>> > > 0xfffffffff0105f80=0xfffffffff0227510
> > >>> > >         0xfffffffff0105f88=0xffffffff00000000
> > >>> > > 0xfffffffff0105f90=0x0000000000000000
> > >>> > >         0xfffffffff0105f98=0xfffffffff0105fb8
> > >>> > > 0xfffffffff0105fa0=0x0000000bf0116b0d
> > >>> > >         0xfffffffff0105fa8=0xfffffffff011622a
> > >>> > > 0xfffffffff0105fb0=0xffffffff00000400
> > >>> > >         0xfffffffff0105fb8=0xffffffff00000000
> > >>> > > 0xfffffffff0105fc0=0x0000000000000000
> > >>> > >         0xfffffffff0105fc8=0x0000000000000000
> > >>> > > 0xfffffffff0105fd0=0x0000000000000000
> > >>> > >         0xfffffffff0105fd8=0x0000000000000000
> > >>> > > 0xfffffffff0105fe0=0x0000000000000000
> > >>> > >         0xfffffffff0105fe8=0xfffffffff0110204
> > >>> > > 0xfffffffff0105ff0=0x000000002badb002
> > >>> > >         0xfffffffff0105ff8=0x000000000023b000
> > >>> > >         cpu0: exiting
> > >>> > >
> > >>> > > >
> > >>> > > >
> > >>> > > > On Tue, Jan 28, 2025 at 6:09 AM <tlaro...@kergis.com> wrote:
> > >>> > > >
> > >>> > > > > After fixing the problems that led to compiler warnings
> > >>> > > > > (legitimate warnings, although even the too-short bitwise-negated
> > >>> > > > > unsigned 32-bit values, promoted to 64 bits and hence with
> > >>> > > > > zeroed leading bits when used as masks, were harmless), I now
> > >>> > > > > want to look at the stumbling block.
> > >>> > > > >
> > >>> > > > > For me, under vmx, this is the assert in map.c:17:
> > >>> > > > >
> > >>> > > > > assert(pa < KSEG2);
> > >>> > > > >
> > >>> > > > > that triggers, and it should come from a call from multiboot.
> > >>> > > > >
> > >>> > > > > My first reflex is to start adding printf() instructions to track
> > >>> > > > > the problem, but is there a better way when dealing with the kernel?
> > >>> > > > >
> > >>> > > > > Second question: since, if I'm not mistaken, 9front doesn't use
> > >>> > > > > multiboot, is vmx usable with (i.e. agnostic about) the multiboot
> > >>> > > > > stuff? The embedded boot stuff should handle the thing by itself
> > >>> > > > > without load addresses having to be adjusted because of vmx?
> > >>> > > > > --
> > >>> > > > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > >>> > > > >              http://www.kergis.com/
> > >>> > > > >             http://kertex.kergis.com/
> > >>> > > > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
> > >>> > >
> 

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Mecbe892c7c26bc685f4e5f37
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
