Hi Christian,

Sorry about the delayed response.
On Tue, Oct 27, 2015 at 3:30 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:

> On 25/10/2015 22:38, Peter Crosthwaite wrote:
>
> On Thu, Oct 22, 2015 at 2:21 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:
>
> Hello Peter,
>
> On 07/10/2015 17:48, Peter Crosthwaite wrote:
>
> On Mon, Oct 5, 2015 at 8:50 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:
>
> Hello Peter,
>
> thanks for your comments
>
> On 01/10/2015 18:26, Peter Crosthwaite wrote:
>
> On Tue, Sep 29, 2015 at 6:57 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:
>
> Hi all,
>
> This RFC patch-series introduces the set of changes enabling the architectural elements to model the architecture presented in a previous RFC letter: "[Qemu-devel][RFC] Towards an Heterogeneous QEMU".

> and the OS binary image needs to be placed in memory at model startup.

> I don't see what this limitation is exactly. Can you explain more? I do see a need to work on the ARM bootloader for AMP flows; it is a pure SMP bootloader that assumes total control.

> The problem here, to me, was that when we launch QEMU a binary needs to be provided and put in memory in order to be executed. In this patch series the slave doesn't have a proper memory allocated when first launched.

> But it could though, couldn't it? Can't the slave guest just have full access to its own address space (probably very similar to the master's address space) from machine init time? This seems more realistic than setting up the hardware based on guest-level information.

> Actually the address space for a slave is built at init time; the thing that is not completely configured is the memory region modeling the RAM. Such a region is configured in terms of size, but there is no pointer to the actual memory. The pointer is mmap-ed later, before the slave boots.

> Based on what information? Is the master guest controlling this? If so, what is the real-hardware analogue for this concept where the address map of the slave can change (i.e. be configured) at runtime?

> Hello Peter,
>
> The memory map of a slave is not controlled by the master guest, since it depends on the machine model used for the slave. The only thing the master controls is the subset of the main memory that is assigned to a slave. When I say that the memory pointer is sent to the slave later, before the boot, it is like setting the boot address for that specific slave within the whole platform memory. So essentially the offset passed for the mmap is from the beginning of master memory up to the beginning of the memory carved out for the specific slave. I see this as a way to protect the master memory from malicious accesses from the slave side, so this way the slave will only "see" the part of the memory that it got assigned.

That does sound like memory map control though. Is it simpler to just give the slave full access and implement such protections as a specific feature (probably some sort of IOMMU)?

> The information about memory (fd + offset for mmap) is sent only later, when the boot is triggered. This is also safe since the slave will be waiting in the incoming state, and thus no corruption or errors can happen before the boot is triggered.
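Just to make sure I am reading the fd + offset handoff the same way you are: this is only my own sketch of the slave side, not code from your series, and the struct and function names are invented.

#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/mman.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

struct ram_info {
    uint64_t offset;   /* start of the slave's window in the master memdev */
    uint64_t size;     /* size of the window */
};

/* slave side: receive one fd plus {offset, size}, then map just that window */
static void *map_slave_ram(int sock, uint64_t *size_out)
{
    struct ram_info info;
    struct iovec iov = { .iov_base = &info, .iov_len = sizeof(info) };
    char ctrl[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = ctrl,
        .msg_controllen = sizeof(ctrl),
    };
    struct cmsghdr *cmsg;
    int fd;

    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(info)) {
        return NULL;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_type != SCM_RIGHTS) {
        return NULL;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));

    *size_out = info.size;
    /* offset must be page aligned; only this window becomes the slave's RAM */
    return mmap(NULL, info.size, PROT_READ | PROT_WRITE, MAP_SHARED,
                fd, (off_t)info.offset);
}

If that is the shape of it, then whoever owns the memdev fd is effectively choosing the slave's RAM window, which is what prompted the IOMMU comment above.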
> I was thinking more about your comment about slave-to-slave interrupts. This would just trivially be a local software-generated interrupt of some form within the slave cluster.

> Sorry, I did not catch your comment the first time. You are right, if cores are in the same cluster a software-generated interrupt is going to be enough. Of course the eventfd-based interrupts make sense for a remote QEMU.

> Is eventfd a better implementation of remote-port GPIOs as in the Xilinx work?

> Functionally I think they provide the same behavior. We went for eventfd since, when designing the code of the IDM, we based it on what is available in upstream QEMU to signal events between processes (e.g., eventfd).

> Re the terminology, I don't like the idea of thinking of inter-qemu "interrupts", as whatever system we decide on should be able to support arbitrary signals going from one QEMU to another. I think the Xilinx work already has reset signals going between the QEMU peers.

> We used the inter-qemu interrupt term since such a signal was triggered from the IDM and is an interrupt. But I see your point and agree that such an interrupt could be a generic inter-qemu signaling mechanism, which can be used as an interrupt for this specific purpose.

> The multi client-socket is used for the master to trigger the boot of a slave, and also for each master-slave couple to exchange the eventfd file descriptors. The IDM device can be instantiated either as a PCI or sysbus device.

> So if everything is in one QEMU, IPIs can be implemented with just a

> of registers makes the master in "control" of each of the slaves. The IDM device is already seen as a regular device by each of the QEMU instances involved.

> I'm starting to think this series is two things that should be decoupled. One is the abstract device(s) to facilitate your AMP, the other is the inter-qemu communication. For the abstract device, I guess this would be a new virtio-idm device. We should try and involve virtio people perhaps. I can see the value in it quite separate from modelling the real sysctrl hardware.

> Interesting, which other value/usage do you see in it? For me the IDM was meant to

> It has value in prototyping with your abstract toolkit even with homogeneous hardware. E.g. I should be able to just use a single-QEMU ARM virt machine with -smp 2 and create one of these virtio-AMP setups. Homogeneous hardware with heterogeneous software using your new pieces of abstract hardware.

> It is also more practical for getting a merge of your work, as you are targeting two different audiences with it. People interested in virtio can handle the new devices you create, while the core maintainers can handle your multi-QEMU work. It is two rather big new features.

> This is true, too much meat on the fire for the same patch makes it difficult to get merged. Thanks. We could split it into the multi-client socket work, the inter-qemu communication and virtio-idm.

OK.

> work as an abstract system controller to centralize the management of the slaves (boot_regs and interrupts).
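Side note on the eventfd discussion further up, mostly to check we picture the same thing. This is my own sketch, and the placeholder struct and its fields are invented, not from your patches:

#include "qemu/osdep.h"
#include "qemu/main-loop.h"   /* qemu_set_fd_handler() */
#include "hw/irq.h"           /* qemu_irq_pulse() */

typedef struct {
    int kick_fd;     /* eventfd exchanged over the multi-client socket */
    qemu_irq irq;    /* line towards the local interrupt controller */
} IDMRemoteLeg;      /* placeholder struct, not from the series */

/* master side: kicking the remote peer is a single 8-byte write */
static void idm_kick_remote(IDMRemoteLeg *leg)
{
    uint64_t one = 1;

    if (write(leg->kick_fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        /* nothing sensible to do in a sketch */
    }
}

/* slave side: the fd sits in the main loop; the handler just turns the
 * event into an interrupt pulse towards the guest */
static void idm_remote_kick(void *opaque)
{
    IDMRemoteLeg *leg = opaque;
    uint64_t cnt;

    if (read(leg->kick_fd, &cnt, sizeof(cnt)) == (ssize_t)sizeof(cnt)) {
        qemu_irq_pulse(leg->irq);
    }
}

/* registered once at realize time:
 *     qemu_set_fd_handler(leg->kick_fd, idm_remote_kick, NULL, leg);
 */

From the IDM's point of view all of this is just "raise a line towards client N"; whether an eventfd, a socket or a plain qemu_irq carries it would then be a property of the transport, which leads into the point below.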
> But I think the implementation should be free of any inter-QEMU awareness. E.g. from P4 of this series:
>
> +static void send_shmem_fd(IDMState *s, MSClient *c)
> +{
> +    int fd, len;
> +    uint32_t *message;
> +    HostMemoryBackend *backend = MEMORY_BACKEND(s->hostmem);
> +
> +    len = strlen(SEND_MEM_FD_CMD)/4 + 3;
> +    message = malloc(len * sizeof(uint32_t));
> +    strcpy((char *) message, SEND_MEM_FD_CMD);
> +    message[len - 2] = s->pboot_size;
> +    message[len - 1] = s->pboot_offset;
> +
> +    fd = memory_region_get_fd(&backend->mr);
> +
> +    multi_socket_send_fds_to(c, &fd, 1, (char *) message, len * sizeof(uint32_t));
> +}
>
> The device itself is aware of shared-memory and multi-sockets. Using the device for single-QEMU AMP would require neither - can the IDM device be used in a homogeneous AMP flow in one of our existing SMP machine models (e.g. on a dual-core A9 with one core being master and the other slave)?

> Can this be architected in two phases for greater utility, with the AMP devices as just normal devices, and the inter-qemu communication as a separate feature?

> I see your point, and it is an interesting proposal.
>
> What I can think of here, to remove the awareness of how the IDM communicates with the slaves, is to define a kind of AMP Slave interface. So there will be an instance of the interface for each of the slaves, encapsulating the communication part (being either local or based on sockets). The AMP Slave interfaces would be what you called the AMP devices, with one device per slave.

> Do we need this hard definition of master and slave in the hardware? Can the virtio-device be more peer-to-peer, with the master-slave relationship purely implemented by the guest?

> I think we can architect it in a way that the virtio-idm simply connects two or more peers and, depending on the usage done by the software, behaves as master on one side and slave on the other. I used the term slave AMP interface, but I should have used AMP client interface, to indicate the cores/processors the IDM has to interconnect (being local or on another QEMU instance). So there would be an implementation of the AMP client interface that is based on the assumption that all the processors are on the same instance, and one based on sockets for the remote instances.

Do you need this dual mode? Can the IDM just have GPIOs which are then either directly connected to the local CPUs, or sent out over an inter-qemu connectivity mechanism? Then the inter-qemu mechanism can be used for any GPIO communication.

> To make an example, for a single qemu instance with -smp 2 you would add something like:
>
>   -smp 2
>   -device amp-local-client, core_id=0, id=client0
>   -device amp-local-client, core_id=1, id=client1
>   -device virtio-idm, clients=2, id=idm
>
> while for remote qemu instances something like (the opposite to be instantiated on the other remote instance):
>
>   -device amp-local-client, id=client0
>   -device amp-remote-client, chardev=chdev_id, id=client1
>   -device virtio-idm, clients=2, id=idm-dev
>
> This way the idm only knows about clients (all clients are the same for the IDM). The software running on the processors will enable the interaction between the clients by writing into the IDM device registers.
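To make the GPIO suggestion above a bit more concrete: a sketch only, where the register offsets and the stand-in struct with its boot_out/irq_out/num_clients fields are invented, not taken from your series. The write handler never touches sockets or shared memory; it just pulses a per-client GPIO line.

#include "qemu/osdep.h"
#include "exec/memory.h"   /* MemoryRegionOps, hwaddr */
#include "hw/irq.h"        /* qemu_irq_pulse() */

#define IDM_REG_BOOT 0x0   /* write a client ID here to start that client's boot */
#define IDM_REG_IRQ  0x4   /* write a client ID here to interrupt that client */

typedef struct {
    uint32_t num_clients;
    qemu_irq *boot_out;    /* one "boot" line per client */
    qemu_irq *irq_out;     /* one "interrupt" line per client */
} IDMGpios;                /* stand-in for whatever ends up in IDMState */

static uint64_t idm_mmio_read(void *opaque, hwaddr addr, unsigned size)
{
    return 0; /* nothing interesting to read back in this sketch */
}

static void idm_mmio_write(void *opaque, hwaddr addr, uint64_t val,
                           unsigned size)
{
    IDMGpios *s = opaque;
    uint32_t client = val;

    if (client >= s->num_clients) {
        return;
    }
    switch (addr) {
    case IDM_REG_BOOT:
        qemu_irq_pulse(s->boot_out[client]);
        break;
    case IDM_REG_IRQ:
        qemu_irq_pulse(s->irq_out[client]);
        break;
    }
}

static const MemoryRegionOps idm_mmio_ops = {
    .read = idm_mmio_read,
    .write = idm_mmio_write,
    .endianness = DEVICE_NATIVE_ENDIAN,
};

The amp-local-client/amp-remote-client devices from your example would then just be different sinks for the same GPIO lines, and the IDM never needs to know which kind it is talking to.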
> At a first glance, and according to my current proposal, I see such AMP client interfaces exporting the following methods:
>
> - raise_interrupt() function: called by the IDM to trigger an interrupt towards the destination client
>
> - boot_trigger() function: called by the IDM to trigger the boot of the client
>
> If the clients are remote, socket communication will be used and hidden in the AMP client interface implementation.

> Do you foresee a different type of interface for the use-case you have in mind? I ask because if, for example, the clients are cores of the same cluster (and same instance), interrupts could simply be software-generated from the linux-kernel/firmware running on top of the processors, and there is theoretically no need to go through the IDM; same, I guess, for the boot.

True. But if you are developing code for the IDM, you can do a crawl-before-walk test with an SMP test case.

Regards,
Peter

> Another thing that needs to be defined clearly is the interface between the IDM and the software running on the cores. At the moment I am using a set of registers, namely the boot and the interrupt registers. By writing the ID of a client into such registers it is possible to forward an interrupt or trigger its boot.

> Thanks,
>
> Christian

> Regards,
> Peter

> At the master side, besides the IDM, one would instantiate as many interface devices as slaves. During the initialization the IDM will link with all those interfaces, and only call functions like send_interrupt() or boot_slave() to interact with the slaves. The interface will be the same for both local and remote slaves, while the implementation of the methods will differ and reside in the specific AMP Slave Interface device. On the slave side, if the slave is remote, another instance of the interface is instantiated so as to connect to the socket/eventfd.

> So as an example the send_shmem_fd function you pointed at could be hidden in the slave interface, and invoked only when the IDM invokes the slave_boot() function of a remote slave interface.

> This would raise the level of abstraction and open the door to potentially any communication mechanism between master and slave, without the need to adapt the IDM device to the specific case. Or, eventually, to mix between local and remote instances.

> Thanks,
>
> Christian

> Regards,
> Peter
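P.S. In case it is useful when you respin, this is roughly the shape I picture for the client interface you describe above. All names are invented; it is a sketch of the idea, not a proposal for the actual API.

#include "qom/object.h"

#define TYPE_AMP_CLIENT "amp-client"

#define AMP_CLIENT(obj) \
    INTERFACE_CHECK(AMPClient, (obj), TYPE_AMP_CLIENT)
#define AMP_CLIENT_CLASS(klass) \
    OBJECT_CLASS_CHECK(AMPClientClass, (klass), TYPE_AMP_CLIENT)
#define AMP_CLIENT_GET_CLASS(obj) \
    OBJECT_GET_CLASS(AMPClientClass, (obj), TYPE_AMP_CLIENT)

typedef struct AMPClient AMPClient;   /* opaque handle to an implementation */

typedef struct AMPClientClass {
    InterfaceClass parent_class;

    /* forward a signal/interrupt to this client */
    void (*raise_interrupt)(AMPClient *client);
    /* start the client's boot (pass memory fd/offset, release reset, ...) */
    void (*boot_trigger)(AMPClient *client);
} AMPClientClass;

static const TypeInfo amp_client_info = {
    .name       = TYPE_AMP_CLIENT,
    .parent     = TYPE_INTERFACE,
    .class_size = sizeof(AMPClientClass),
};

/* the IDM never needs to know what is behind the interface: */
static void idm_boot_client(AMPClient *client)
{
    AMPClientClass *k = AMP_CLIENT_GET_CLASS(client);

    k->boot_trigger(client);
}

An amp-local-client would implement boot_trigger() by poking the local CPU, and an amp-remote-client by doing the socket/eventfd work -- which is where something like send_shmem_fd() would naturally end up living, as you say.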