On Wed, 21 Aug 2024 15:25:18 +0200
Igor Mammedov <imamm...@redhat.com> wrote:

> On Thu, 27 Apr 2023 17:10:06 -0400
> Alexander Bulekov <alx...@bu.edu> wrote:
> 
> > Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
> > This flag is set/checked prior to calling a device's MemoryRegion
> > handlers, and set when device code initiates DMA.  The purpose of this
> > flag is to prevent two types of DMA-based reentrancy issues:
> > 
> > 1.) mmio -> dma -> mmio case
> > 2.) bh -> dma write -> mmio case  
> 
> Alexander, with 9.0
> we are getting 
> 
>    warning: Blocked re-entrant IO on MemoryRegion: acpi-cpu-hotplug at addr: 
> 0x0
> 
> during CPU hot-unplug, to my knowledge there shouldn't be any DMA involved
> there.
> The only access should be either from guest kernel or firmware(this one is 
> under SMM mode)).
> 
> Question is how this could happen on MMIO access which should be guarded by 
> BQL?

For prosperity, reproducer (RHEL9.4 Haswell host + upstream QEMU/edk 
https://issues.redhat.com/browse/RHEL-56154):

./qemu-system-x86_64 --enable-kvm -smp 
2,maxcpus=24,cores=12,threads=1,dies=1,sockets=2 \
    -cpu host \
    -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": 
"/tmp/qemu_build/pc-bios/edk2-x86_64-secure-code.fd", "auto-read-only": true, 
"discard": "unmap"}' \
    -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": 
true, "file": "file_ovmf_code"}'  \
    -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": 
"/tmp/edk_VARS.raw", "auto-read-only": true, "discard": "unmap"}'   \
    -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": 
false, "file": "file_ovmf_vars"}'  \
    -M q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
    -m 4G \
    -monitor stdio \
    rhel9.5.raw

once booted 

(qemu) device_add 
host-x86_64-cpu,id=vcpu1,socket-id=1,core-id=10,die-id=0,thread-id=0
(qemu) device_add 
host-x86_64-cpu,id=vcpu2,socket-id=1,core-id=11,die-id=0,thread-id=0
(qemu) device_del vcpu1
(qemu) qemu-system-x86_64: warning: Blocked re-entrant IO on MemoryRegion: 
acpi-cpu-hotplug at addr: 0x0
 
> And where to start digging to find out if it's a genuine issue,
> or whether it's safe to use big hammer and disable reentrancy guard?
I'm hesitant to use hammer so far (though it would make problem nop).
What happens is that cpu_remove_sync() temporarily releases BQL
which lets conflicting access to happen.

But I think unexpected access shouldn't be there in the 1st place,
so guard looks pretty legit at this point.
Lets see what Gerd finds out from edk2 point of view.

> 
> 
> > These issues have led to problems such as stack-exhaustion and
> > use-after-frees.
> > 
> > Summary of the problem from Peter Maydell:
> > https://lore.kernel.org/qemu-devel/cafeaca_23vc7he3iam-jva6w38lk4hjowae5kcknhprd5fp...@mail.gmail.com
> > 
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
> > Resolves: CVE-2023-0330
> > 
> > Signed-off-by: Alexander Bulekov <alx...@bu.edu>
> > Reviewed-by: Thomas Huth <th...@redhat.com>
> > ---
> >  include/exec/memory.h  |  5 +++++
> >  include/hw/qdev-core.h |  7 +++++++
> >  softmmu/memory.c       | 16 ++++++++++++++++
> >  3 files changed, 28 insertions(+)
> > 
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 15ade918ba..e45ce6061f 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -767,6 +767,8 @@ struct MemoryRegion {
> >      bool is_iommu;
> >      RAMBlock *ram_block;
> >      Object *owner;
> > +    /* owner as TYPE_DEVICE. Used for re-entrancy checks in MR access 
> > hotpath */
> > +    DeviceState *dev;
> >  
> >      const MemoryRegionOps *ops;
> >      void *opaque;
> > @@ -791,6 +793,9 @@ struct MemoryRegion {
> >      unsigned ioeventfd_nb;
> >      MemoryRegionIoeventfd *ioeventfds;
> >      RamDiscardManager *rdm; /* Only for RAM */
> > +
> > +    /* For devices designed to perform re-entrant IO into their own IO MRs 
> > */
> > +    bool disable_reentrancy_guard;
> >  };
> >  
> >  struct IOMMUMemoryRegion {
> > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> > index bd50ad5ee1..7623703943 100644
> > --- a/include/hw/qdev-core.h
> > +++ b/include/hw/qdev-core.h
> > @@ -162,6 +162,10 @@ struct NamedClockList {
> >      QLIST_ENTRY(NamedClockList) node;
> >  };
> >  
> > +typedef struct {
> > +    bool engaged_in_io;
> > +} MemReentrancyGuard;
> > +
> >  /**
> >   * DeviceState:
> >   * @realized: Indicates whether the device has been fully constructed.
> > @@ -194,6 +198,9 @@ struct DeviceState {
> >      int alias_required_for_version;
> >      ResettableState reset;
> >      GSList *unplug_blockers;
> > +
> > +    /* Is the device currently in mmio/pio/dma? Used to prevent 
> > re-entrancy */
> > +    MemReentrancyGuard mem_reentrancy_guard;
> >  };
> >  
> >  struct DeviceListener {
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index b1a6cae6f5..fe23f0e5ce 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -542,6 +542,18 @@ static MemTxResult access_with_adjusted_size(hwaddr 
> > addr,
> >          access_size_max = 4;
> >      }
> >  
> > +    /* Do not allow more than one simultaneous access to a device's IO 
> > Regions */
> > +    if (mr->dev && !mr->disable_reentrancy_guard &&
> > +        !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
> > +        if (mr->dev->mem_reentrancy_guard.engaged_in_io) {
> > +            warn_report("Blocked re-entrant IO on "
> > +                    "MemoryRegion: %s at addr: 0x%" HWADDR_PRIX,
> > +                    memory_region_name(mr), addr);
> > +            return MEMTX_ACCESS_ERROR;
> > +        }
> > +        mr->dev->mem_reentrancy_guard.engaged_in_io = true;
> > +    }
> > +
> >      /* FIXME: support unaligned access? */
> >      access_size = MAX(MIN(size, access_size_max), access_size_min);
> >      access_mask = MAKE_64BIT_MASK(0, access_size * 8);
> > @@ -556,6 +568,9 @@ static MemTxResult access_with_adjusted_size(hwaddr 
> > addr,
> >                          access_mask, attrs);
> >          }
> >      }
> > +    if (mr->dev) {
> > +        mr->dev->mem_reentrancy_guard.engaged_in_io = false;
> > +    }
> >      return r;
> >  }
> >  
> > @@ -1170,6 +1185,7 @@ static void memory_region_do_init(MemoryRegion *mr,
> >      }
> >      mr->name = g_strdup(name);
> >      mr->owner = owner;
> > +    mr->dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
> >      mr->ram_block = NULL;
> >  
> >      if (name) {  
> 


Reply via email to