For PCIe devices the right policy is not round robin but to use the PCIe device closest to the node submitting the I/O. I did a prototype for that long ago and the concept can work. Can you look into that, and also make that policy apply automatically for PCIe devices?
I think active/active makes sense for fabrics (link throughput aggregation), but also for dual-ported PCIe devices (assuming this is a real use case). I agree that the default can be home-node path selection.
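
To make the home-node idea concrete, here is a rough userspace sketch of picking the path whose controller sits closest to the submitting node. It is not the prototype mentioned above and not kernel code: `struct path`, `node_dist`, `NR_NODES` and `select_home_node_path()` are hypothetical names, and the distance table is made up in the style of a SLIT (10 means local).

```c
#include <limits.h>
#include <stddef.h>

#define NR_NODES 2 /* assumption: a two-socket machine */

/* Hypothetical NUMA distance table, SLIT style: 10 = local, larger = farther. */
static const int node_dist[NR_NODES][NR_NODES] = {
    { 10, 21 },
    { 21, 10 },
};

/* Hypothetical per-path state; a real driver tracks far more than this. */
struct path {
    int numa_node; /* node the PCIe controller is attached to */
    int live;      /* nonzero if the path is currently usable */
};

/*
 * Return the index of the live path with the smallest distance to
 * home_node, or -1 if no path is live. On equal distance the
 * lowest index wins.
 */
static int select_home_node_path(const struct path *paths, size_t n,
                                 int home_node)
{
    int best = -1;
    int best_dist = INT_MAX;

    for (size_t i = 0; i < n; i++) {
        if (!paths[i].live)
            continue;
        int d = node_dist[home_node][paths[i].numa_node];
        if (d < best_dist) {
            best_dist = d;
            best = (int)i;
        }
    }
    return best;
}
```

With one controller per node, I/O submitted on node 1 picks the node-1 path and falls back to the node-0 path only when the local one is not live. An active/active policy for the dual-ported case would instead rotate across all live paths.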