Hi Christian,

Sorry about the delayed response.
On Tue, Oct 27, 2015 at 3:30 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:

> On 25/10/2015 22:38, Peter Crosthwaite wrote:
>
> On Thu, Oct 22, 2015 at 2:21 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:
>
> Hello Peter,
>
> On 07/10/2015 17:48, Peter Crosthwaite wrote:
>
> On Mon, Oct 5, 2015 at 8:50 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:
>
> Hello Peter,
>
> thanks for your comments
>
> On 01/10/2015 18:26, Peter Crosthwaite wrote:
>
> On Tue, Sep 29, 2015 at 6:57 AM, Christian Pinto <c.pi...@virtualopensystems.com> wrote:
>
> Hi all,
>
> This RFC patch-series introduces the set of changes enabling the architectural elements to model the architecture presented in a previous RFC letter: "[Qemu-devel][RFC] Towards an Heterogeneous QEMU".

> and the OS binary image needs to be placed in memory at model startup.

> I don't see what this limitation is exactly. Can you explain more? I do see a need to work on the ARM bootloader for AMP flows; it is a pure SMP bootloader that assumes total control.

> The problem here, to me, was that when we launch QEMU a binary needs to be provided and put in memory in order to be executed. In this patch series the slave doesn't have a proper memory allocated when first launched.

> But it could though, couldn't it? Can't the slave guest just have full access to its own address space (probably very similar to the master's address space) from machine init time? This seems more realistic than setting up the hardware based on guest-level information.

> Actually the address space for a slave is built at init time; the thing that is not completely configured is the memory region modeling the RAM. Such a region is configured in terms of size, but there is no pointer to the actual memory. The pointer is mmap-ed later, before the slave boots.

> Based on what information? Is the master guest controlling this? If so, what is the real-hardware analogue for this concept where the address map of the slave can change (i.e. be configured) at runtime?

> Hello Peter,
>
> The memory map of a slave is not controlled by the master guest, since it depends on the machine model used for the slave. The only thing the master controls is the subset of the main memory that is assigned to a slave. When I say that the memory pointer is sent to the slave later, before the boot, it is like setting the boot address for that specific slave within the whole platform memory. So essentially the offset passed for the mmap is from the beginning of master memory up to the beginning of the memory carved out for the specific slave. I see this as a way to protect the master memory from malicious accesses from the slave side, so this way the slave will only "see" the part of the memory that it got assigned.

That does sound like memory map control though. Is it simpler to just give the slave full access and implement such protections as a specific feature (probably some sort of IOMMU)?

> The information about memory (fd + offset for mmap) is sent only later, when the boot is triggered. This is also safe since the slave will be waiting in the incoming state, and thus no corruption or errors can happen before the boot is triggered.
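Just to make sure I am reading the fd + offset handoff the same way you are: this is only my own sketch of the slave side, not code from your series, and the struct and function names are invented.

#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/mman.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

struct ram_info {
    uint64_t offset;   /* start of the slave's window in the master memdev */
    uint64_t size;     /* size of the window */
};

/* slave side: receive one fd plus {offset, size}, then map just that window */
static void *map_slave_ram(int sock, uint64_t *size_out)
{
    struct ram_info info;
    struct iovec iov = { .iov_base = &info, .iov_len = sizeof(info) };
    char ctrl[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = ctrl,
        .msg_controllen = sizeof(ctrl),
    };
    struct cmsghdr *cmsg;
    int fd;

    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(info)) {
        return NULL;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_type != SCM_RIGHTS) {
        return NULL;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));

    *size_out = info.size;
    /* offset must be page aligned; only this window becomes the slave's RAM */
    return mmap(NULL, info.size, PROT_READ | PROT_WRITE, MAP_SHARED,
                fd, (off_t)info.offset);
}

If that is the shape of it, then whoever owns the memdev fd is effectively choosing the slave's RAM window, which is what prompted the IOMMU comment above.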
> I was thinking more about your comment about slave-to-slave interrupts. This would just trivially be a local software-generated interrupt of some form within the slave cluster.

> Sorry, I did not catch your comment the first time. You are right, if cores are in the same cluster a software-generated interrupt is going to be enough. Of course the eventfd-based interrupts make sense for a remote QEMU.

> Is eventfd a better implementation of remote-port GPIOs as in the Xilinx work?

> Functionally I think they provide the same behavior. We went for eventfd since, when designing the code of the IDM, we based it on what is available in upstream QEMU to signal events between processes (e.g., eventfd).

> Re the terminology, I don't like the idea of thinking of inter-qemu "interrupts", as whatever system we decide on should be able to support arbitrary signals going from one QEMU to another. I think the Xilinx work already has reset signals going between the QEMU peers.

> We used the inter-qemu interrupt term since such a signal was triggered from the IDM and is an interrupt. But I see your point and agree that such an interrupt could be a generic inter-qemu signaling mechanism, which can be used as an interrupt for this specific purpose.

> The multi client-socket is used for the master to trigger the boot of a slave, and also for each master-slave couple to exchange the eventfd file descriptors. The IDM device can be instantiated either as a PCI or sysbus device.

> So if everything is in one QEMU, IPIs can be implemented with just a

> of registers makes the master in "control" of each of the slaves. The IDM device is already seen as a regular device by each of the QEMU instances involved.

> I'm starting to think this series is two things that should be decoupled. One is the abstract device(s) to facilitate your AMP, the other is the inter-qemu communication. For the abstract device, I guess this would be a new virtio-idm device. We should try and involve virtio people perhaps. I can see the value in it quite separate from modelling the real sysctrl hardware.

> Interesting, which other value/usage do you see in it? For me the IDM was meant to

> It has value in prototyping with your abstract toolkit even with homogeneous hardware. E.g. I should be able to just use a single-QEMU ARM virt machine with -smp 2 and create one of these virtio-AMP setups. Homogeneous hardware with heterogeneous software using your new pieces of abstract hardware.

> It is also more practical for getting a merge of your work, as you are targeting two different audiences with it. People interested in virtio can handle the new devices you create, while the core maintainers can handle your multi-QEMU work. It is two rather big new features.

> This is true, too much meat on the fire for the same patch makes it difficult to get merged. Thanks. We could split it into the multi-client socket work, the inter-qemu communication and virtio-idm.

OK.

> work as an abstract system controller to centralize the management of the slaves (boot_regs and interrupts).
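Side note on the eventfd discussion further up, mostly to check we picture the same thing. This is my own sketch, and the placeholder struct and its fields are invented, not from your patches:

#include "qemu/osdep.h"
#include "qemu/main-loop.h"   /* qemu_set_fd_handler() */
#include "hw/irq.h"           /* qemu_irq_pulse() */

typedef struct {
    int kick_fd;     /* eventfd exchanged over the multi-client socket */
    qemu_irq irq;    /* line towards the local interrupt controller */
} IDMRemoteLeg;      /* placeholder struct, not from the series */

/* master side: kicking the remote peer is a single 8-byte write */
static void idm_kick_remote(IDMRemoteLeg *leg)
{
    uint64_t one = 1;

    if (write(leg->kick_fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
        /* nothing sensible to do in a sketch */
    }
}

/* slave side: the fd sits in the main loop; the handler just turns the
 * event into an interrupt pulse towards the guest */
static void idm_remote_kick(void *opaque)
{
    IDMRemoteLeg *leg = opaque;
    uint64_t cnt;

    if (read(leg->kick_fd, &cnt, sizeof(cnt)) == (ssize_t)sizeof(cnt)) {
        qemu_irq_pulse(leg->irq);
    }
}

/* registered once at realize time:
 *     qemu_set_fd_handler(leg->kick_fd, idm_remote_kick, NULL, leg);
 */

From the IDM's point of view all of this is just "raise a line towards client N"; whether an eventfd, a socket or a plain qemu_irq carries it would then be a property of the transport, which leads into the point below.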
> But I think the implementation should be free of any inter-QEMU awareness. E.g. from P4 of this series:
>
> +static void send_shmem_fd(IDMState *s, MSClient *c)
> +{
> +    int fd, len;
> +    uint32_t *message;
> +    HostMemoryBackend *backend = MEMORY_BACKEND(s->hostmem);
> +
> +    len = strlen(SEND_MEM_FD_CMD)/4 + 3;
> +    message = malloc(len * sizeof(uint32_t));
> +    strcpy((char *) message, SEND_MEM_FD_CMD);
> +    message[len - 2] = s->pboot_size;
> +    message[len - 1] = s->pboot_offset;
> +
> +    fd = memory_region_get_fd(&backend->mr);
> +
> +    multi_socket_send_fds_to(c, &fd, 1, (char *) message, len * sizeof(uint32_t));
> +}
>
> The device itself is aware of shared-memory and multi-sockets. Using the device for single-QEMU AMP would require neither - can the IDM device be used in a homogeneous AMP flow in one of our existing SMP machine models (e.g. on a dual-core A9 with one core being master and the other slave)?

> Can this be architected in two phases for greater utility, with the AMP devices as just normal devices, and the inter-qemu communication as a separate feature?

> I see your point, and it is an interesting proposal.
>
> What I can think of here, to remove the awareness of how the IDM communicates with the slaves, is to define a kind of AMP Slave interface. So there will be an instance of the interface for each of the slaves, encapsulating the communication part (being either local or based on sockets). The AMP Slave interfaces would be what you called the AMP devices, with one device per slave.

> Do we need this hard definition of master and slave in the hardware? Can the virtio-device be more peer-to-peer, with the master-slave relationship purely implemented by the guest?

> I think we can architect it in a way that the virtio-idm simply connects two or more peers and, depending on the usage done by the software, behaves as master on one side and slave on the other. I used the term slave AMP interface, but I should have used AMP client interface, to indicate the cores/processors the IDM has to interconnect (being local or on another QEMU instance). So there would be an implementation of the AMP client interface that is based on the assumption that all the processors are on the same instance, and one based on sockets for the remote instances.

Do you need this dual mode? Can the IDM just have GPIOs which are then either directly connected to the local CPUs, or sent out over an inter-qemu connectivity mechanism? Then the inter-qemu mechanism can be used for any GPIO communication.

> To make an example, for a single qemu instance with -smp 2 you would add something like:
>
>   -smp 2
>   -device amp-local-client, core_id=0, id=client0
>   -device amp-local-client, core_id=1, id=client1
>   -device virtio-idm, clients=2, id=idm
>
> while for remote qemu instances something like (the opposite to be instantiated on the other remote instance):
>
>   -device amp-local-client, id=client0
>   -device amp-remote-client, chardev=chdev_id, id=client1
>   -device virtio-idm, clients=2, id=idm-dev
>
> This way the idm only knows about clients (all clients are the same for the IDM). The software running on the processors will enable the interaction between the clients by writing into the IDM device registers.
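To make the GPIO suggestion above a bit more concrete: a sketch only, where the register offsets and the stand-in struct with its boot_out/irq_out/num_clients fields are invented, not taken from your series. The write handler never touches sockets or shared memory; it just pulses a per-client GPIO line.

#include "qemu/osdep.h"
#include "exec/memory.h"   /* MemoryRegionOps, hwaddr */
#include "hw/irq.h"        /* qemu_irq_pulse() */

#define IDM_REG_BOOT 0x0   /* write a client ID here to start that client's boot */
#define IDM_REG_IRQ  0x4   /* write a client ID here to interrupt that client */

typedef struct {
    uint32_t num_clients;
    qemu_irq *boot_out;    /* one "boot" line per client */
    qemu_irq *irq_out;     /* one "interrupt" line per client */
} IDMGpios;                /* stand-in for whatever ends up in IDMState */

static uint64_t idm_mmio_read(void *opaque, hwaddr addr, unsigned size)
{
    return 0; /* nothing interesting to read back in this sketch */
}

static void idm_mmio_write(void *opaque, hwaddr addr, uint64_t val,
                           unsigned size)
{
    IDMGpios *s = opaque;
    uint32_t client = val;

    if (client >= s->num_clients) {
        return;
    }
    switch (addr) {
    case IDM_REG_BOOT:
        qemu_irq_pulse(s->boot_out[client]);
        break;
    case IDM_REG_IRQ:
        qemu_irq_pulse(s->irq_out[client]);
        break;
    }
}

static const MemoryRegionOps idm_mmio_ops = {
    .read = idm_mmio_read,
    .write = idm_mmio_write,
    .endianness = DEVICE_NATIVE_ENDIAN,
};

The amp-local-client/amp-remote-client devices from your example would then just be different sinks for the same GPIO lines, and the IDM never needs to know which kind it is talking to.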
> At a first glance, and according to my current proposal, I see such AMP client interfaces exporting the following methods:
>
> - raise_interrupt() function: called by the IDM to trigger an interrupt towards the destination client
>
> - boot_trigger() function: called by the IDM to trigger the boot of the client
>
> If the clients are remote, socket communication will be used and hidden in the AMP client interface implementation.

> Do you foresee a different type of interface for the use-case you have in mind? I ask because if, for example, the clients are cores of the same cluster (and same instance), interrupts could simply be software-generated from the linux-kernel/firmware running on top of the processors, and there is theoretically no need to go through the IDM; same, I guess, for the boot.

True. But if you are developing code for the IDM, you can do a crawl-before-walk test with an SMP test case.

Regards,
Peter

> Another thing that needs to be defined clearly is the interface between the IDM and the software running on the cores. At the moment I am using a set of registers, namely the boot and the interrupt registers. By writing the ID of a client into such registers it is possible to forward an interrupt or trigger its boot.

> Thanks,
>
> Christian

> Regards,
> Peter

> At the master side, besides the IDM, one would instantiate as many interface devices as slaves. During the initialization the IDM will link with all those interfaces, and only call functions like send_interrupt() or boot_slave() to interact with the slaves. The interface will be the same for both local and remote slaves, while the implementation of the methods will differ and reside in the specific AMP Slave Interface device. On the slave side, if the slave is remote, another instance of the interface is instantiated so as to connect to the socket/eventfd.

> So as an example the send_shmem_fd function you pointed at could be hidden in the slave interface, and invoked only when the IDM invokes the slave_boot() function of a remote slave interface.

> This would raise the level of abstraction and open the door to potentially any communication mechanism between master and slave, without the need to adapt the IDM device to the specific case. Or, eventually, to mix between local and remote instances.

> Thanks,
>
> Christian

> Regards,
> Peter
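P.S. In case it is useful when you respin, this is roughly the shape I picture for the client interface you describe above. All names are invented; it is a sketch of the idea, not a proposal for the actual API.

#include "qom/object.h"

#define TYPE_AMP_CLIENT "amp-client"

#define AMP_CLIENT(obj) \
    INTERFACE_CHECK(AMPClient, (obj), TYPE_AMP_CLIENT)
#define AMP_CLIENT_CLASS(klass) \
    OBJECT_CLASS_CHECK(AMPClientClass, (klass), TYPE_AMP_CLIENT)
#define AMP_CLIENT_GET_CLASS(obj) \
    OBJECT_GET_CLASS(AMPClientClass, (obj), TYPE_AMP_CLIENT)

typedef struct AMPClient AMPClient;   /* opaque handle to an implementation */

typedef struct AMPClientClass {
    InterfaceClass parent_class;

    /* forward a signal/interrupt to this client */
    void (*raise_interrupt)(AMPClient *client);
    /* start the client's boot (pass memory fd/offset, release reset, ...) */
    void (*boot_trigger)(AMPClient *client);
} AMPClientClass;

static const TypeInfo amp_client_info = {
    .name       = TYPE_AMP_CLIENT,
    .parent     = TYPE_INTERFACE,
    .class_size = sizeof(AMPClientClass),
};

/* the IDM never needs to know what is behind the interface: */
static void idm_boot_client(AMPClient *client)
{
    AMPClientClass *k = AMP_CLIENT_GET_CLASS(client);

    k->boot_trigger(client);
}

An amp-local-client would implement boot_trigger() by poking the local CPU, and an amp-remote-client by doing the socket/eventfd work -- which is where something like send_shmem_fd() would naturally end up living, as you say.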