On Tue, Jul 20, 2010 at 9:49 PM, Isaku Yamahata <yamah...@valinux.co.jp> wrote: > Added Cc: seab...@seabios.org > > On Wed, Jul 21, 2010 at 06:31:01AM +0300, Michael S. Tsirkin wrote: >> On Tue, Jul 20, 2010 at 06:52:23PM +0900, Isaku Yamahata wrote: >> > On Wed, Jul 14, 2010 at 09:10:28AM -0600, Cam Macdonell wrote: >> > > On Tue, Jul 13, 2010 at 8:52 PM, Isaku Yamahata <yamah...@valinux.co.jp> >> > > wrote: >> > > > On Tue, Jul 13, 2010 at 04:48:19PM -0600, Cam Macdonell wrote: >> > > >> On Tue, Jul 13, 2010 at 2:41 PM, Isaku Yamahata >> > > >> <yamah...@valinux.co.jp> wrote: >> > > >> > On Tue, Jul 13, 2010 at 02:05:51PM -0600, Cam Macdonell wrote: >> > > >> >> >> > Seabios completely ignore the 64-bitness of the BAR. ?Looks >> > > >> >> >> > like it also >> > > >> >> >> > thinks the second half of the BAR is an I/O region instead of >> > > >> >> >> > memory (hence >> > > >> >> >> > the c200, that's part of the pci portio region. >> > > >> >> > >> > > >> >> > I've sent the patches to address it. But they haven't been >> > > >> >> > merged yet. >> > > >> >> > seabios doesn't map BARs beyond 4GB. >> > > >> >> > If bar is mapped beyond 4GB, guest BIOS does it. >> > > >> >> >> > > >> >> Have those patches been merged yet? >> > > >> > >> > > >> > They have been merged into seabios upstream now. >> > > >> > qemu seabios fork hasn't pulled for a while, though. >> > > >> > >> > > >> > >> > > >> >> > To see how seabios works, it would help to increase >> > > >> >> > CONFIG_DEBUG_LEVEL >> > > >> >> > in config.h of seabios >> > > >> >> >> > > >> >> Where does the output from seabios end up? ?Inside dmesg? >> > > >> > >> > > >> > It outputs them to the serial console which qemu emulates. >> > > >> > seabios is out of kernel control, so dmesg doesn't show it. >> > > >> > >> > > >> > >> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr) >> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> >> >> pci_read_config: (val) 0xffffffff <- 0x1c (addr) >> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr) >> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> >> > >> > > >> >> > seabios BAR3. Not sure how it is mapped from this >> > > >> >> > message. >> > > >> >> >> > > >> >> Isn't the BAR3 from the fact that a 64-bit BAR would use both BAR2 >> > > >> >> and >> > > >> >> BAR3 to store all 64-bits? >> > > >> > >> > > >> > Yes. Seabios misbehaves. 64bit bar is(was) a missing feature. >> > > >> > -- >> > > >> > yamahata >> > > >> > >> > > >> > >> > > >> >> > > >> With the latest seabios git passed via -bios, I no longer see the >> > > >> 48-bit address, but instead a 32-bit address and then >> > > >> ffffffff00000000. ?This guest has 1gb of RAM so the address isn't be >> > > >> mapped beyond 4g. >> > > > >> > > > Can I see the debug log like before? >> > > > (hopefully seabios with CONFIG_DEBUG_LEVEL enabled.) >> > > >> > > Here's the dump from SeaBIOS in the region related to the PCI devices. >> > > The SeaBIOS output is identical whether the BAR is 32-bit or 64-bit. >> > > >> > > PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8 >> > > region 0: 0xf0000000 >> > > region 1: 0xf2000000 >> > > region 6: 0xf2010000 >> > > PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000 >> > > region 0: 0x0000c020 >> > > region 1: 0xf2020000 >> > > region 6: 0xf2030000 >> > > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110 >> > > region 0: 0xf2040000 >> > > region 1: 0xf2041000 >> > > region 2: 0x00000000 >> > >> > Is this region (region 2 of devfn=0x20: vendor_id=0x1af4 device_id=0x1110) >> > the BAR in quistion? >> > The value 0 seems odd. Probably BAR address calculation overflowed. >> > Currently seabios doesn't check overflow. I attached the patch. >> > >> > >> > > > Do you know who sets the BAR to ffffffff00000000? >> > > >> > > Here are the config reads/writes related to the 0x18/1c, the 'IVSHMEM' >> > > lines are from the map function passed to pci_register_bar(). It >> > > looks like SeaBIOS sets the address to 0 and then the potentially >> > > useful e0000000 address gets mangled into ffffffff000000. >> > >> > There is something wrong with the debug message of write case, I suppose. >> > All written value are 0, but the resulted effect doesn't seems so. >> > >> > > >> > > IVSHMEM: guest pci addr = 0, guest h/w addr = 1090912256, size = >> > > 536870912 >> > > >> > > ...snip... >> > > >> > > pci_read_config: (val) 0x4 <- 0x18 (addr) >> > > pci_write_config: (val) 0x0 -> 0x18 (addr) >> > > IVSHMEM: guest pci addr = e0000000, guest h/w addr = 1090912256, size = >> > > 20000000 >> > >> > If 0 is written to 0x18, the bar address should be 0, but it says e0000000. >> > >> > > pci_read_config: (val) 0xe0000004 <- 0x18 (addr) >> > >> > The read value isn't 0. and so on... >> > >> > > pci_write_config: (val) 0x0 -> 0x18 (addr) >> > > pci_read_config: (val) 0x0 <- 0x1c (addr) >> > > pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr = >> > > 1090912256, size = 20000000 >> > > pci_read_config: (val) 0xffffffff <- 0x1c (addr) >> > > pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> > > and with the 64-bit guest I get this error as well (recall the guest >> > > fails to boot on 64-bit) >> > > >> > > BUG: kvm_dirty_pages_log_change: invalid parameters >> > > 00000000f0000000-00000000f0ffffff >> > >> > >> > diff --git a/src/pciinit.c b/src/pciinit.c >> > index b110531..6eca2ce 100644 >> > --- a/src/pciinit.c >> > +++ b/src/pciinit.c >> > @@ -90,7 +90,8 @@ static int pci_bios_allocate_region(u16 bdf, int >> > region_num) >> > /* If pci_bios_prefmem_addr == 0, keep old behaviour */ >> > pci_bios_prefmem_addr != 0) { >> > paddr = &pci_bios_prefmem_addr; >> > - if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) { >> > + if (ALIGN(*paddr, size) + size < *paddr || >> > + ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) { >> > dprintf(1, >> > "prefmem region of (bdf 0x%x bar %d) can't be >> > mapped. " >> > "decrease BUILD_PCIMEM_SIZE and recompile. size >> > %x\n", >> > @@ -99,7 +100,8 @@ static int pci_bios_allocate_region(u16 bdf, int >> > region_num) >> > } >> > } else { >> > paddr = &pci_bios_mem_addr; >> > - if (ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) { >> > + if (ALIGN(*paddr, size) + size < *paddr || >> > + ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) { >> > dprintf(1, >> > "mem region of (bdf 0x%x bar %d) can't be mapped. >> > " >> > "increase BUILD_PCIMEM_SIZE and recompile. size >> > %x\n", >> >> Looking at the source, all of the values like pci_bios_prefmem_addr seem to >> be >> 32 bit. Since in the spec prefetcheable memory is up to 64 bit, >> can't the math overflow, here and elsewhere? >> Maybe we should switch to 64 bit values all over ... > > Make sense. I'll create a patch to convert them into u64. > >> >> > @@ -116,12 +118,8 @@ static int pci_bios_allocate_region(u16 bdf, int >> > region_num) >> > >> > int is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) && >> > (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == >> > PCI_BASE_ADDRESS_MEM_TYPE_64; >> > - if (is_64bit) { >> > - if (size > 0) { >> > - pci_config_writel(bdf, ofs + 4, 0); >> > - } else { >> > - pci_config_writel(bdf, ofs + 4, ~0); >> > - } >> > + if (is_64bit && size > 0) { >> > + pci_config_writel(bdf, ofs + 4, 0); >> > } >> > return is_64bit; >> > } >> >> >> Was there any reason we wrote all-ones there on size 0? >> BAR sizing? > > No reason. It's just left over from debugging. > So I'd like to remove it. > > -- > yamahata > >
Hi, 64-bit BARs still do not seem to be working. When using the latest seabios the guest does not hit a "BUG:" statement, but booting still fails HPET: 1 timers in total, 0 timers will be used for per-cpu timer divide error: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs RIP: 0010:[<ffffffff812a9b5b>] [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b RSP: 0018:ffff88007d7b3d80 EFLAGS: 00010246 RAX: 00038d7ea4c68000 RBX: ffff88007d062cc0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817bb9b0 RBP: ffff88007d7b3dc0 R08: 00000000000080d0 R09: ffffc90000000000 R10: ffff88007d72b5a0 R11: 0000000000000000 R12: ffff88007d7b3dd0 R13: ffffc90000000000 R14: 0000000000000000 R15: ffffffff817a41c3 FS: 0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 1, threadinfo ffff88007d7b2000, task ffff88007d7b8000) Stack: ffff88007f43ab90 ffff88007f43ab90 ffffffff81ca6174 ffffffff81b1f5e1 <0> 0000000000000000 0000000000000100 0000000000000100 0000000000000000 <0> ffff88007d7b3e80 ffffffff810294ea 00000000fed00000 ffffc90000000000 Call Trace: [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea [<ffffffff81002069>] do_one_initcall+0x5e/0x159 [<ffffffff81b0d72a>] kernel_init+0x19a/0x228 [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10 [<ffffffff81b0d590>] ? kernel_init+0x0/0x228 [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10 Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48 8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48> f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43 RIP [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b RSP <ffff88007d7b3d80> ---[ end trace a7919e7f17c0a725 ]--- Kernel panic - not syncing: Attempted to kill init! Pid: 1, comm: swapper Tainted: G D 2.6.35+ #299 Call Trace: [<ffffffff81459a85>] panic+0x8b/0x10b [<ffffffff81056a83>] ? exit_ptrace+0x38/0x121 [<ffffffff8104f9e8>] do_exit+0x7a/0x722 [<ffffffff8104c3bd>] ? spin_unlock_irqrestore+0xe/0x10 [<ffffffff8104cfd6>] ? kmsg_dump+0x12b/0x145 [<ffffffff8145ccc8>] oops_end+0xbf/0xc7 [<ffffffff8100d299>] die+0x5a/0x63 [<ffffffff8145c6d2>] do_trap+0x121/0x130 [<ffffffff8100b560>] do_divide_error+0x96/0x9f [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b [<ffffffff8120cf80>] ? radix_tree_preload+0x34/0x88 [<ffffffff8100a83b>] divide_error+0x1b/0x20 [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea [<ffffffff81002069>] do_one_initcall+0x5e/0x159 [<ffffffff81b0d72a>] kernel_init+0x19a/0x228 [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10 [<ffffffff81b0d590>] ? kernel_init+0x0/0x228 [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10 seabios output for the device: PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110 region 0: 0xf1020000 region 2: 0x00000000 init smm Running the latest seabios, the debug output only remaps the BAR twice, once with a potentially correct address of e00000000 pci_read_config: (val) 0xe0000004 <- 0x18 (addr) ...snip... pci_default_write_config: (val) 0x0 -> 0x18 (addr) IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000 pci_read_config: (val) 0xe0000004 <- 0x18 (addr) pci_default_write_config: (val) 0x0 -> 0x18 (addr) pci_read_config: (val) 0x0 <- 0x1c (addr) pci_default_write_config: (val) 0x0 -> 0x1c (addr) IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr = 2164588544, size = 20000000 pci_read_config: (val) 0xffffffff <- 0x1c (addr) pci_default_write_config: (val) 0x0 -> 0x1c (addr) pci_read_config: (val) 0x0 <- 0x20 (addr) the pci writes are all still 0, I can't see how my debug statements are incorrect though. Below is my trivial pci config debugging patch. diff --git a/hw/pci.c b/hw/pci.c index 70dbace..01087b1 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -1159,6 +1159,8 @@ static uint32_t pci_read_config(PCIDevice *d, len = MIN(len, pci_config_size(d) - address); memcpy(&val, d->config + address, len); + if (strncmp(d->name, "ivshmem", 7) == 0) + printf("pci_read_config: (val) 0x%x <- 0x%x (addr)\n", val, address); return le32_to_cpu(val); } @@ -1219,6 +1221,8 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l) d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask); } + if (strncmp(d->name, "ivshmem", 7) == 0) + printf("pci_write_config: (val) 0x%x -> 0x%x (addr)\n", val, addr); #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT if (kvm_enabled() && kvm_irqchip_in_kernel() && addr >= PIIX_CONFIG_IRQ_ROUTE && Cam