Hi Alex,

On 11/20/2015 12:44 AM, Alex Williamson wrote:
> On Thu, 2015-11-19 at 15:22 +0000, Eric Auger wrote:
>> I am resending this RFC from Oct 12, after kernel 4.4-rc1 and
>> QEMU 2.5-rc1, hoping things have calmed down a little bit.
>>
>> This RFC allows to set up AMD XGBE passthrough. This was tested on AMD
>> Seattle.
>>
>> The first upstreamed device supporting KVM platform passthrough was the
>> Calxeda Midway XGMAC. Compared to this latter, the XGBE XGMAC exposes a
>> much more complex device tree node. Generating the device tree node for
>> the guest is the challenging and controversary part of this series.
>>
>> - First There are 2 device tree node formats:
>> one where XGBE and PHY are described in separate nodes and another one
>> that combines both description in a single node (only supported by 4.2
>> onwards kernels). Only the combined description is supported for passthrough,
>> meaning the host must be >= 4.2 and must feature a device tree with a
>> combined description. The guest will also be exposed with a combined
>> description, meaning only >= 4.2 guest are supported. It is not planned
>> to support separate node representation since assignment of the PHY is
>> less straigtforward.
>>
>> - the XGMAC/PHY node depends on 2 clock nodes (DMA and PTP).
>> The code checks those clocks are fixed to make sure they cannot be
>> switched off at some point after the native driver gets unbound.
>>
>> - there are many property values to populate on guest side. Most of them
>> cannot be hardcoded. That series proposes a way to parse the host device
>> tree blob and retrieve host values to feed guest representation. Current
>> approach relies on dtc binary availability plus libfdt usage.
>> Other alternatives were discussed in:
>> http://www.spinics.net/lists/kvm-arm/msg16648.html.
>>
>> - Currently host booted with ACPI is not supported.
>
> I won't pretend to know all the politics in the ARM space, but doesn't
> this last bullet sort of imply that this is dead-on-arrival code? Maybe
> not in the embedded space, but certainly in the server space, I thought
> ACPI was declared the winner. Thanks,

When the code was written, no ACPI description, including one for the
IOMMU, was available yet. I think there is now a specification that
enables describing the system (IORT/DSDT tables), and I will investigate
whether a mainlined one is ready for the HW I am using.

Nevertheless, we had a discussion at the end of September with Marc
Zyngier, Christoffer, Will Deacon and other people working in the ARM
ecosystem, and we decided that starting with an FDT host description was
a reasonable choice at this moment. To be fully honest, Peter also
thought we should consider the ACPI case from day 0. But I think nobody,
AFAIK, has an ideal solution for addressing ACPI/DT in a unified way,
and people seem to be against introducing an IOCTL API.

Among the solutions I foresee, the first one below was considered the
simplest and was chosen:
1) rely on external applications to decode/parse the dt/ACPI tables
2) build a unified fs representation for dt/ACPI
3) create a unified IOCTL API to retrieve dt/ACPI info (attempted by
   VOSYS, but at VFIO level)

The idea of this series is to:
- rely on the external dtc binary to build the blob from
  /sys/firmware/devicetree/base
- introduce some helpers using libfdt that manipulate the host dt blob
- use those helpers in sysbus-fdt.c to build the clock and xgbe nodes
  for the guest

Assuming we have an ACPI description, I guess we would/could use a
similar approach to parse/decode the ACPI tables from QEMU (relying on a
combination of acpidump, acpixtract, iasl, etc.).

So the current proposal brings a solution for the embedded world and can
easily be reused for other devices. The next step is to propose a
similar approach for ACPI. For now I would like to make sure the open
approach is accepted (with an external dependency on the dtc binary, as
we would have an external dependency on ACPI utilities).

Best Regards

Eric
>
> Alex
>
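
For illustration only, the first step of the series (building the host blob with the external dtc binary) could be sketched roughly as below. This is not the code from the patches: `load_host_dtb` and `fdt_magic_ok` are hypothetical names, the sketch assumes dtc is in PATH and that the path contains no single quote, and it only checks the standard flattened device tree magic (0xd00dfeed) instead of using libfdt.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen()/pclose() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define FDT_MAGIC 0xd00dfeedu  /* big-endian magic at offset 0 of a dtb */

/* Return 1 if buf starts with the flattened device tree magic number. */
int fdt_magic_ok(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 4) {
        return 0;
    }
    uint32_t magic = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                     ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    return magic == FDT_MAGIC;
}

/*
 * Hypothetical helper: run "dtc -I fs -O dtb <path>" and return the blob
 * it writes to stdout in a malloc'ed buffer (caller frees), or NULL on
 * any failure. Assumption: path contains no single quote character.
 */
unsigned char *load_host_dtb(const char *path, size_t *size)
{
    char cmd[512];
    unsigned char *buf = NULL;
    size_t len = 0, cap = 0;

    assert(path != NULL && size != NULL);

    int n = snprintf(cmd, sizeof(cmd),
                     "dtc -I fs -O dtb '%s' 2>/dev/null", path);
    if (n < 0 || (size_t)n >= sizeof(cmd)) {
        return NULL;
    }

    FILE *p = popen(cmd, "r");
    if (p == NULL) {
        return NULL;
    }

    for (;;) {
        if (len == cap) {
            /* Grow the buffer geometrically while dtc keeps producing. */
            size_t ncap = cap ? cap * 2 : 65536;
            unsigned char *tmp = realloc(buf, ncap);
            if (tmp == NULL) {
                free(buf);
                pclose(p);
                return NULL;
            }
            buf = tmp;
            cap = ncap;
        }
        size_t got = fread(buf + len, 1, cap - len, p);
        if (got == 0) {
            break;
        }
        len += got;
    }

    /* Reject the result if dtc failed or the output is not a dtb. */
    if (pclose(p) != 0 || !fdt_magic_ok(buf, len)) {
        free(buf);
        return NULL;
    }
    *size = len;
    return buf;
}
```

A caller would pass /sys/firmware/devicetree/base as the path and hand the returned buffer to the libfdt-based helpers. An ACPI variant would presumably replace the dtc command line with the ACPI utilities, which is the external-dependency question raised above.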