Excerpts from Daniel Henrique Barboza's message of February 1, 2022 5:10 am:
>
>
> On 1/29/22 03:50, Nicholas Piggin wrote:
>> The behaviour of the Address Translation Mode on Interrupt resource is
>> not consistently supported by all CPU versions or all KVM versions. In
>> particular KVM HV only supports mode 0 on POWER7 processors, and does
>> not support mode 2 on any processors. KVM PR only supports mode 0. TCG
>> can support all modes (0,2,3).
>>
>> This leads to inconsistencies in guest behaviour and could cause
>> problems migrating guests.
>>
>> This was not too noticeable for Linux guests for a long time because the
>> kernel only used mode 0 or 3, and it used to consider AIL to be somewhat
>> advisory (KVM would not always honor it either) and it kept both sets of
>> interrupt vectors around.
>>
>> Recent Linux guests depend on the AIL mode working as defined by the ISA
>> to support the SCV facility interrupt. If AIL mode 3 can not be provided,
>> then Linux must be given an error so it can disable the SCV facility.
>
> Is this the scenario where migration failures can occur? I don't understand
> what are the migration problems you cited that were possible to happen.

Maybe I'm overly concerned and nothing would practically use it (beyond
testing, which we could just hack around). I was thinking of the case where
we implement AIL=2 in KVM HV, or AIL=3 in PR.

>
>>
>> Add the ail-modes capability which is a bitmap of the supported values
>> for the H_SET_MODE Address Translation Mode on Interrupt resource. Add
>> a new KVM CAP that exports the same thing, and provide defaults for PR
>> and HV KVM that predate the cap.
>
> Why add a new machine cap in this case? Isn't this something that the KVM
> capability should be able to handle by itself, where we always assume that
> we should have the best AIL value possible?
>
> Besides, the way it is coded here, we're adding a user-visible capability
> that mimics the exact behavior we want from
> h_set_mode_resource_addr_trans_mode(), meaning that only bits 0,1,2 and 3
> of cap-ail-modes can be set, but:
>
> - bit 0 must always be set
> - bit 1 must always be cleared
> - if kvm_enabled():
>   * bit 2 must always be cleared
>   * bit 3 can be cleared or not depending on kvmppc_has_cap_ail_3(), which
>     translates to not allowed if running with KVM_PR and allowing it if
>     we're running with Power8 and newer
>
> i.e. bit 0 is always set, bit 1 is always cleared, bit 2 can be set or not
> for TCG but always cleared for KVM, and bit 3 can be set depending on the
> circumstances.
>
> Note that this would allow a user to set this guest in a Power9/10 machine:
>
> -machine pseries,accel=kvm,cap-ail-modes=1
>
> And the guest will end up having degraded performance because AIL=3 is being
> disabled.
>
> If we want to avoid this and force AIL=3 to be used in this case, then this
> capability would be used just to set or clear AIL=2 when running with TCG.

I was thinking how it could be more flexible with possible future AIL modes
and things we don't foresee. In theory AIL=0 could go away (although unlikely
in practice).

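As a rough illustration of the bitmap rules summarised above, here is a
minimal standalone sketch. It is not the code from the patch: the enum, the
macro and both function names are hypothetical, and the POWER8-or-newer flag
is an assumption standing in for whatever kvmppc_has_cap_ail_3() would
report. For TCG the sketch simply ignores the CPU flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only (not taken from the patch). */
enum accel { ACCEL_TCG, ACCEL_KVM_HV, ACCEL_KVM_PR };

/* Bit n of the bitmap means "AIL mode n is supported". */
#define AIL_MODE_BIT(n) (1u << (n))

/* Bitmap of AIL modes an accelerator can provide, per the rules above. */
static uint8_t ail_modes_supported(enum accel accel, bool power8_or_newer)
{
    uint8_t modes = AIL_MODE_BIT(0);    /* AIL=0 is always available */

    switch (accel) {
    case ACCEL_TCG:
        /* TCG can support modes 0, 2 and 3. */
        modes |= AIL_MODE_BIT(2) | AIL_MODE_BIT(3);
        break;
    case ACCEL_KVM_HV:
        /* KVM HV never supports mode 2; mode 3 only on POWER8 and newer. */
        if (power8_or_newer) {
            modes |= AIL_MODE_BIT(3);
        }
        break;
    case ACCEL_KVM_PR:
        /* KVM PR only supports mode 0. */
        break;
    }
    return modes;
}

/* Check a user-requested bitmap against what the accelerator supports. */
static bool ail_modes_valid(uint8_t requested, uint8_t supported)
{
    if (!(requested & AIL_MODE_BIT(0))) {
        return false;                   /* bit 0 must always be set */
    }
    if (requested & AIL_MODE_BIT(1)) {
        return false;                   /* bit 1 must always be cleared */
    }
    /* Reject any requested mode the accelerator cannot provide. */
    return (requested & ~supported) == 0;
}
```

Under these assumptions, a request of 0x1 on KVM HV with POWER8 or newer
passes validation even though the accelerator could offer 0x9, which is the
cap-ail-modes=1 case described above where AIL=3 ends up disabled.
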
> I believe the chunks in which we check for kvm_pr and allow only AIL=0 are
> improvements of h_set_mode_resource_addr_trans_mode(), but other than that
> I'm afraid that exposing this cap to users is a bit overkill.

That said, maybe you are right and it's overkill until a real need comes up.
I will split and submit the KVM cap part of it, at least.

Thanks,
Nick