On 8/26/22 14:29, Daniel P. Berrangé wrote:
> On Fri, Aug 26, 2022 at 01:50:40PM +0200, Claudio Fontana wrote:
>> On 8/26/22 13:39, Daniel P. Berrangé wrote:
>>> The 'qemu64' CPU model implements the least featureful x86_64 CPU
>>> that's possible. Historically this hasn't been an issue, since it was
>>> rare for OS distros to build with a higher mandatory CPU baseline.
>>>
>>> With RHEL-9, however, the entire distro is built for the x86_64-v2 ABI
>>> baseline:
>>>
>>>   https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level
>>>
>>> It is likely that other distros may take similar steps in the not too
>>> distant future. For example, it has been suggested for Fedora on a
>>> number of occasions.
>>>
>>> This new baseline is not compatible with the qemu64 CPU model, though.
>>> While it is possible to pass a '-cpu xxx' flag to qemu-x86_64, the
>>> usage of QEMU doesn't always allow for this. For example, the args
>>> are typically controlled via binfmt rules that the user has no ability
>>> to change. This impacts users who are trying to use podman on aarch64
>>> platforms to run containers with x86_64 content. There's no arg to
>>> podman that can be used to change the qemu-x86_64 args, and a non-root
>>> user of podman cannot change binfmt rules without elevating privileges:
>>>
>>>   https://github.com/containers/podman/issues/15456#issuecomment-1228210973
>>>
>>> Changing to the 'max' CPU model gives 'qemu-x86_64' maximum
>>> compatibility with binaries it is likely to encounter in the wild,
>>> and is not likely to have a significant downside for existing usage.
>>
>> How do we know for sure? Do we have a base of binaries to test across
>> qemu versions?
>
> There are never any perfect guarantees, but this assertion is based on
> the view that x86 instruction set changes are considered backwards
> compatible.
> Existing applications from years (even decades) ago can
> generally run on arbitrarily newer CPUs with orders of magnitude more
> features, as apps have to intentionally opt in to use of new CPU
> instructions.
>
> So the risk here would be an existing application which is able to
> dynamically opt in to optimized code paths if certain CPUID features
> exist, and which in turn tickles a bug in QEMU's implementation of said
> feature that it would not previously hit. That's certainly possible,
> but I don't think it would be common, as we would already have seen
> that in system emulators. The la57 feature issue Richard mentions
> is one example, but I believe that doesn't impact user emulators.
>
> Weigh that risk against the fact that we have users frequently
> hitting problems with the existing qemu64 default because it is
> too old. Users have already been making this change in the context
> of Docker for this reason, e.g.:
>
>   https://github.com/tonistiigi/binfmt/blob/master/patches/cpu-max/0001-default-to-cpu-max-on-x86-and-arm.patch
>
>>
>>>
>>> Most other architectures already use an 'any' CPU model, which is
>>> often mapped to 'max' (or similar), rather than the oldest
>>> possible CPU model.
>>>
>>> For the sake of consistency, the 'i386' architecture is also changed
>>> from using 'qemu32' to 'max'.
>>>
>>> Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
>>> ---
>>>  linux-user/i386/target_elf.h   | 2 +-
>>>  linux-user/x86_64/target_elf.h | 2 +-
>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/linux-user/i386/target_elf.h b/linux-user/i386/target_elf.h
>>> index 1c6142e7da..238a9aba73 100644
>>> --- a/linux-user/i386/target_elf.h
>>> +++ b/linux-user/i386/target_elf.h
>>> @@ -9,6 +9,6 @@
>>>  #define I386_TARGET_ELF_H
>>>  static inline const char *cpu_get_model(uint32_t eflags)
>>>  {
>>> -    return "qemu32";
>>> +    return "max";
>>>  }
>>>  #endif
>>> diff --git a/linux-user/x86_64/target_elf.h b/linux-user/x86_64/target_elf.h
>>> index 7b76a90de8..3f628f8d66 100644
>>> --- a/linux-user/x86_64/target_elf.h
>>> +++ b/linux-user/x86_64/target_elf.h
>>> @@ -9,6 +9,6 @@
>>>  #define X86_64_TARGET_ELF_H
>>>  static inline const char *cpu_get_model(uint32_t eflags)
>>>  {
>>> -    return "qemu64";
>>> +    return "max";
>>>  }
>>>  #endif
>>
>> Just seems an abrupt change to me if we don't have a mechanism in
>> place to ensure we don't break existing workloads.
>
> There are no absolutes here. We have the risk of an unknown problem
> possibly breaking some existing apps, vs a known problem currently
> breaking users of CentOS 9 / RHEL 9, which podman and docker need to
> work around.
I wonder how bad the workarounds are, when they allow both old and new
users to enjoy their running workloads.

> The question is which benefits more users, and which is the better
> long term option. I think using a modern CPU is better long term, and
> if we find bugs in QEMU's TCG impl we just need to fix them regardless.
>
> If we find bugs in applications, however, then the apps need to fix
> them.

Hmm... I wonder what the project stance on this is. "Fix the app" does
not seem to be a great way to provide emulation.

Just my 2c of course,

C