On Fri, 1 Nov 2024 at 01:36, Gustavo Romero <gustavo.rom...@linaro.org> wrote:
>
> FEAT_CMOW introduces support for controlling cache maintenance
> instructions executed in EL0/1 and is mandatory from Armv8.8.
>
> On real hardware, the main use for this feature is to prevent
> processes from invalidating or flushing cache lines for addresses to
> which they only have read permission, which can impact the
> performance of other processes.
>
> QEMU implements all cache instructions as NOPs. According to rule [1],
> generating any Permission fault when a cache instruction is
> implemented as a NOP is implementation-defined, so no Permission
> fault is generated for any cache instruction, even when the process
> lacks read and write permissions for the address.
>
> QEMU does not model any cache topology, so the PoU and PoC are before
> any cache, and rules [2] apply. These rules state that generating any
> MMU fault for cache instructions in this topology is also
> implementation-defined. Therefore, for FEAT_CMOW, we do not generate
> any MMU faults either; instead, we only advertise it in the feature
> register.
>
> [1] Rule R_HGLYG of section D8.14.3, Arm ARM K.a.
> [2] Rules R_MZTNR and R_DNZYL of section D8.14.3, Arm ARM K.a.
>
> Signed-off-by: Gustavo Romero <gustavo.rom...@linaro.org>
> ---
>  docs/system/arm/emulation.rst | 1 +
>  target/arm/cpu-features.h     | 5 +++++
>  target/arm/cpu.h              | 1 +
>  target/arm/tcg/cpu64.c        | 1 +
>  4 files changed, 8 insertions(+)
>
> diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
> index 35f52a54b1..a2a388f091 100644
> --- a/docs/system/arm/emulation.rst
> +++ b/docs/system/arm/emulation.rst
> @@ -26,6 +26,7 @@ the following architecture extensions:
>  - FEAT_BF16 (AArch64 BFloat16 instructions)
>  - FEAT_BTI (Branch Target Identification)
>  - FEAT_CCIDX (Extended cache index)
> +- FEAT_CMOW (Control for cache maintenance permission)
>  - FEAT_CRC32 (CRC32 instructions)
>  - FEAT_Crypto (Cryptographic Extension)
>  - FEAT_CSV2 (Cache speculation variant 2)
> diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
> index 04ce281826..e806f138b8 100644
> --- a/target/arm/cpu-features.h
> +++ b/target/arm/cpu-features.h
> @@ -802,6 +802,11 @@ static inline bool isar_feature_aa64_tidcp1(const ARMISARegisters *id)
>      return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, TIDCP1) != 0;
>  }
>
> +static inline bool isar_feature_aa64_cmow(const ARMISARegisters *id)
> +{
> +    return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, CMOW) != 0;
> +}
> +
>  static inline bool isar_feature_aa64_hafs(const ARMISARegisters *id)
>  {
>      return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, HAFDBS) != 0;
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 8fc8b6398f..1ea4c545e0 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1367,6 +1367,7 @@ void pmu_init(ARMCPU *cpu);
>  #define SCTLR_EnIB    (1U << 30) /* v8.3, AArch64 only */
>  #define SCTLR_EnIA    (1U << 31) /* v8.3, AArch64 only */
>  #define SCTLR_DSSBS_32 (1U << 31) /* v8.5, AArch32 only */
> +#define SCTLR_CMOW    (1ULL << 32) /* FEAT_CMOW */
>  #define SCTLR_MSCEN   (1ULL << 33) /* FEAT_MOPS */
>  #define SCTLR_BT0     (1ULL << 35) /* v8.5-BTI */
>  #define SCTLR_BT1     (1ULL << 36) /* v8.5-BTI */
> diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
> index 0168920828..2963d7510f 100644
> --- a/target/arm/tcg/cpu64.c
> +++ b/target/arm/tcg/cpu64.c
> @@ -1218,6 +1218,7 @@ void aarch64_max_tcg_initfn(Object *obj)
>      t = FIELD_DP64(t, ID_AA64MMFR1, ETS, 2);      /* FEAT_ETS2 */
>      t = FIELD_DP64(t, ID_AA64MMFR1, HCX, 1);      /* FEAT_HCX */
>      t = FIELD_DP64(t, ID_AA64MMFR1, TIDCP1, 1);   /* FEAT_TIDCP1 */
> +    t = FIELD_DP64(t, ID_AA64MMFR1, CMOW, 1);     /* FEAT_CMOW */
>      cpu->isar.id_aa64mmfr1 = t;
>
>      t = cpu->isar.id_aa64mmfr2;

We don't need to do anything for the actual cache operations,
but we do need to make sure that the SCTLR_ELx and HCRX_EL2
control bits for it can be set and read back. Our sctlr_write()
doesn't impose a mask, so no change needed there, but
our hcrx_write() does set up a valid_mask and doesn't allow
the guest to write bits that aren't in that mask. So we
need to add an
   if (cpu_isar_feature(aa64_cmow, cpu)) {
       valid_mask |= HCRX_CMOW;
   }
in there.
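
For context, roughly how that would sit in hcrx_write() in
target/arm/helper.c; the surrounding valid_mask setup here is
abbreviated and paraphrased rather than an exact quote of the
upstream code, so treat it as a sketch of the shape of the change:

    static void hcrx_write(CPUARMState *env, const ARMCPRegInfo *ri,
                           uint64_t value)
    {
        ARMCPU *cpu = env_archcpu(env);
        uint64_t valid_mask = 0;

        /* FEAT_MOPS adds the MSCEn and MCE2 controls */
        if (cpu_isar_feature(aa64_mops, cpu)) {
            valid_mask |= HCRX_MSCEN | HCRX_MCE2;
        }

        /* FEAT_CMOW: let the guest set and read back HCRX_EL2.CMOW */
        if (cpu_isar_feature(aa64_cmow, cpu)) {
            valid_mask |= HCRX_CMOW;
        }

        /* Clear RES0 bits */
        env->cp15.hcrx_el2 = value & valid_mask;
    }

With that in place, the SCTLR_ELx.CMOW bits (no mask in sctlr_write())
and HCRX_EL2.CMOW can all be set and read back by the guest, which is
all we need given that the cache operations themselves stay NOPs.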

thanks
-- PMM
