On Wed, Jan 22, 2025 at 02:39:43PM +0100, Jan Beulich wrote:
> On 22.01.2025 13:00, Roger Pau Monné wrote:
> > On Tue, Oct 01, 2024 at 10:49:10AM +0200, Jan Beulich wrote:
> >> Both caches may need higher capacity, and the upper bound will need to
> >> be determined dynamically based on CPUID policy (for AMX'es TILELOAD /
> >> TILESTORE at least).
> >>
> >> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> > 
> > Just a couple of comments below.
> > 
> >> ---
> >> This is a patch taken from the AMX series, but wasn't part of the v3
> >> submission. All I did is strip out the actual AMX bits (from
> >> hvmemul_cache_init()), plus of course change the description. As a
> >> result some local variables there may look unnecessary, but this way
> >> it's going to be less churn when the AMX bits are added. The next patch
> >> pretty strongly depends on the changed approach (contextually, not so
> >> much functionally), and I'd really like to avoid rebasing that one ahead
> >> of this one, and then this one on top of that.
> > 
> > Oh, I was just going to ask about the weirdness of nents compared to
> > what it was previously.
> 
> And then you did ask; I'll comment on that below.
> 
> >> --- a/xen/arch/x86/hvm/emulate.c
> >> +++ b/xen/arch/x86/hvm/emulate.c
> >> @@ -26,6 +26,18 @@
> >>  #include <asm/iocap.h>
> >>  #include <asm/vm_event.h>
> >>  
> >> +/*
> >> + * We may read or write up to m512 or up to a tile row as a number of
> >> + * device-model transactions.
> >> + */
> >> +struct hvm_mmio_cache {
> >> +    unsigned long gla;
> >> +    unsigned int size;
> >> +    unsigned int space:31;
> > 
> > Having size and space is kind of confusing, would you mind adding a
> > comment that size is the runtime consumed buffer space, while space is
> > the total allocated buffer size (and hence not supposed to change
> > during usage)?
> 
> Sure; I thought the two names would be clear enough when sitting side by
> side, but here you go:
> 
>     unsigned int size;     /* Amount of buffer[] actually used. */
>     unsigned int space:31; /* Allocated size of buffer[]. */
> 
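For clarity, a minimal sketch of how the struct would then read (any
fields beyond those visible in the hunk above are omitted, and the
buffer[] element type is an assumption on my part):

    struct hvm_mmio_cache {
        unsigned long gla;
        unsigned int size;     /* Amount of buffer[] actually used. */
        unsigned int space:31; /* Allocated size of buffer[]. */
        uint8_t buffer[];      /* Data read from / to be written at gla. */
    };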
> 
> >> @@ -2978,16 +2991,21 @@ void hvm_dump_emulation_state(const char
> >>  int hvmemul_cache_init(struct vcpu *v)
> >>  {
> >>      /*
> >> -     * No insn can access more than 16 independent linear addresses (AVX512F
> >> -     * scatters/gathers being the worst). Each such linear range can span a
> >> -     * page boundary, i.e. may require two page walks. Account for each insn
> >> -     * byte individually, for simplicity.
> >> +     * AVX512F scatter/gather insns can access up to 16 independent linear
> >> +     * addresses, up to 8 bytes size. Each such linear range can span a page
> >> +     * boundary, i.e. may require two page walks.
> >> +     */
> >> +    unsigned int nents = 16 * 2 * (CONFIG_PAGING_LEVELS + 1);
> >> +    unsigned int i, max_bytes = 64;
> >> +    struct hvmemul_cache *cache;
> >> +
> >> +    /*
> >> +     * Account for each insn byte individually, both for simplicity and to
> >> +     * leave some slack space.
> >>       */
> >> -    const unsigned int nents = (CONFIG_PAGING_LEVELS + 1) *
> >> -                               (MAX_INST_LEN + 16 * 2);
> >> -    struct hvmemul_cache *cache = xmalloc_flex_struct(struct hvmemul_cache,
> >> -                                                      ents, nents);
> >> +    nents += MAX_INST_LEN * (CONFIG_PAGING_LEVELS + 1);
> >>  
> >> +    cache = xvmalloc_flex_struct(struct hvmemul_cache, ents, nents);
> > 
> > Change here seems completely unrelated, but I guess this is what you
> > refer to in the post-commit remark.  IOW: the split of the nents
> > variable setup, plus the change of xmalloc_flex_struct() ->
> > xvmalloc_flex_struct() don't seem to be related to the change at
> > hand.
> 
> See the post-commit-message remark that you commented on. To repeat:
> It'll be quite a bit easier for me if the seemingly unrelated adjustments
> could be kept like that. Unless of course there's something wrong with
> them.
> 
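As a purely illustrative aside, plugging in what I believe are the usual
x86 values (CONFIG_PAGING_LEVELS = 4 and MAX_INST_LEN = 15, my
assumptions, not spelled out in the patch), the two contributions come
to:

    nents = 16 * 2 * (CONFIG_PAGING_LEVELS + 1)        /* scatter/gather */
          + MAX_INST_LEN * (CONFIG_PAGING_LEVELS + 1)  /* insn bytes */
          = 32 * 5 + 15 * 5
          = 235 entries
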
> >> @@ -2997,6 +3015,15 @@ int hvmemul_cache_init(struct vcpu *v)
> >>  
> >>      v->arch.hvm.hvm_io.cache = cache;
> >>  
> >> +    for ( i = 0; i < ARRAY_SIZE(v->arch.hvm.hvm_io.mmio_cache); ++i )
> >> +    {
> >> +        v->arch.hvm.hvm_io.mmio_cache[i] =
> >> +            xmalloc_flex_struct(struct hvm_mmio_cache, buffer, max_bytes);
> > 
> > TBH I would be tempted to just use xvmalloc here also, even if the
> > structure is never going to be > PAGE_SIZE, it's more consistent IMO.
> 
> Oh, absolutely under the current rules (which weren't in effect yet back
> when all of this was written).
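
Right, i.e. something along the lines of the below (the error path and
the setting of ->space are my guesses at the surrounding code, not
something quoted above):

    for ( i = 0; i < ARRAY_SIZE(v->arch.hvm.hvm_io.mmio_cache); ++i )
    {
        v->arch.hvm.hvm_io.mmio_cache[i] =
            xvmalloc_flex_struct(struct hvm_mmio_cache, buffer, max_bytes);
        if ( !v->arch.hvm.hvm_io.mmio_cache[i] )
            return -ENOMEM;

        /* Record the allocated buffer[] size in the "space" field. */
        v->arch.hvm.hvm_io.mmio_cache[i]->space = max_bytes;
    }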

With the two items above fixed (not the nents-related change, which is
fine as-is):

Reviewed-by: Roger Pau Monné <roger....@citrix.com>

Thanks, Roger.
