On 11/6/2023 12:10, Lukas Wunner wrote:
On Fri, Nov 03, 2023 at 02:07:57PM -0500, Mario Limonciello wrote:
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
to program the device will always find the PCIe ports used for
tunneling as a limiting factor potentially leading to incorrect
performance decisions.

To prevent problems in downstream drivers check explicitly for ports
being used for PCIe tunneling and skip them when looking for bandwidth
limitations of the hierarchy. If the only device connected is a root port
used for tunneling then report that device.

I think a better approach would be to define three new bandwidths for
Thunderbolt in enum pci_bus_speed and add appropriate descriptions in
pci_speed_string().  Those three bandwidths would be 10 GBit/s for
Thunderbolt 1, 20 GBit/s for Thunderbolt 2, 40 GBit/s for Thunderbolt 3
and 4.

It's an interesting idea, but there's a few short comings I can think of.

1) The USB4 specification doesn't actually require 40GB/s support, this is only a Thunderbolt 4 requirement.

https://www.tomsguide.com/features/thunderbolt-4-vs-usb4-whats-the-difference

The TBT/USB4 link speed can be discovered, but it's not a property of the *switch* not of the PCIe tunneling port.

Tangentially related; the link speed is currently symmetric but there are two sysfs files. Mika left a comment in drivers/thunderbolt/switch.c it may be asymmetric in the future. So we may need to keep that in mind on any design that builds on top of them.

On an AMD Phoenix system connected to a TBT3 Alpine Ridge based eGPU enclosure I can see:

$ cat /sys/bus/thunderbolt/devices/1-0/generation
4
$ cat /sys/bus/thunderbolt/devices/1-2/generation
3
$ cat /sys/bus/thunderbolt/devices/1-2/tx_speed
20.0 Gb/s
$ cat /sys/bus/thunderbolt/devices/1-2/rx_speed
20.0 Gb/s

2) This works until you end up with USB4v2 which supports 80GBit/s.

So this might mean an extra 80GB/s enum and porting some variation of usb4_switch_version() outside of the thunderbolt driver so that PCI core can use it too.


Code to determine the Thunderbolt generation from the PCI ID already exists
in tb_switch_get_generation().

As 'thunderbolt' can be a module or built in, we need to bring code into PCI core so that it works in early boot before it loads.

On the presumption that no more "new" TBT3 devices will be released to the market I suppose that *a lot* of that code could come over to PCI core especially if we can bring some variation of usb4_switch_version() over too.

The other option is that we can stop allowing thunderbolt as a module and require it to be built-in. If we do this we can export symbols from it and use some of them in PCI core too.


This will not only address the amdgpu issue you're trying to solve,
but also emit an accurate speed from __pcie_print_link_status().

The speed you're reporting with your approach is not necessarily
accurate because the next non-tunneled device in the hierarchy might
be connected with a far higher PCIe speed than what the Thunderbolt
fabric allows.
 > Thanks,

Lukas

Reply via email to