On 21/01/2015 17:48, Jiri Slaby wrote:
> I am using qemu for teaching the Linux kernel at our university. I
> wrote a simple PCI device that can answer to writes/reads, generate
> interrupts and perform DMA. As I am dragging it locally over 2 years,
> I am sending it to you now.
>
> Signed-off-by: Jiri Slaby <jsl...@suse.cz>
> ---
>  MAINTAINERS             |   5 +
>  default-configs/pci.mak |   1 +
>  docs/specs/edu.txt      | 106 +++++++++++++
>  hw/misc/Makefile.objs   |   1 +
>  hw/misc/edu.c           | 409 ++++++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 522 insertions(+)
>  create mode 100644 docs/specs/edu.txt
>  create mode 100644 hw/misc/edu.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 430688dcab57..fd335a47bf5c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -599,6 +599,11 @@ F: hw/net/opencores_eth.c
>
>  Devices
>  -------
> +EDU
> +M: Jiri Slaby <jsl...@suse.cz>
> +S: Maintained
> +F: hw/misc/edu.c
> +
>  IDE
>  M: Kevin Wolf <kw...@redhat.com>
>  M: Stefan Hajnoczi <stefa...@redhat.com>
> diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> index a186c39c0e32..030cdc7d3dd0 100644
> --- a/default-configs/pci.mak
> +++ b/default-configs/pci.mak
> @@ -32,3 +32,4 @@ CONFIG_PCI_TESTDEV=y
>  CONFIG_NVME_PCI=y
>  CONFIG_SD=y
>  CONFIG_SDHCI=y
> +CONFIG_EDU=y
> diff --git a/docs/specs/edu.txt b/docs/specs/edu.txt
> new file mode 100644
> index 000000000000..360af27ec8b1
> --- /dev/null
> +++ b/docs/specs/edu.txt
> @@ -0,0 +1,106 @@
> +
> +EDU device
> +==========
> +
> +This is an educational device for writing (kernel) drivers. Its original
> +intention was to support the Linux kernel lectures taught at the Masaryk
> +University. Students are given this virtual device and are expected to write a
> +driver with I/Os, IRQs, DMAs and such.
> +
> +The devices behaves very similar to the PCI bridge present in the COMBO6 cards
> +developed under the Liberouter wings. Both PCI device ID and PCI space is
> +inherited from that device.
> +
> +Command line switches:
> +    -device edu[,dma_mask=mask]
> +
> +    dma_mask makes the virtual device work with DMA addresses with the given
> +    mask. For educational purposes, the device supports only 28 bits (256 MiB)
> +    by default. Students shall set dma_mask for the device in the OS driver
> +    properly.
> +
> +PCI specs
> +---------
> +
> +PCI ID: 1234:11e8
> +
> +PCI Region 0:
> +   I/O memory, 1 MB in size. Users are supposed to communicate with the card
> +   through this memory.
> +
> +MMIO area spec
> +--------------
> +
> +Only size == 4 accesses are allowed for addresses < 0x80. size == 4 or
> +size == 8 for the rest.
> +
> +0x00 (RO) : identification (0xRRrr00edu)
> +            RR -- major version
> +            rr -- minor version
> +
> +0x04 (RW) : card liveness check
> +            It is a simple value inversion (~ C operator).
> +
> +0x08 (RW) : factorial computation
> +            The stored value is taken and factorial of it is put back here.
> +            This happens only after factorial bit in the status register (0x20
> +            below) is cleared.
> +
> +0x20 (RW) : status register, bitwise OR
> +            0x01 -- computing factorial (RO)
> +            0x80 -- raise interrupt 0x01 after finishing factorial computation
> +
> +0x24 (RO) : interrupt status register
> +            It contains values which raised the interrupt (see interrupt raise
> +            register below).
> +
> +0x60 (WO) : interrupt raise register
> +            Raise an interrupt. The value will be put to the interrupt status
> +            register (using bitwise OR).
> +
> +0x64 (WO) : interrupt acknowledge register
> +            Clear an interrupt. The value will be cleared from the interrupt
> +            status register. This needs to be done from the ISR to stop
> +            generating interrupts.
> +
> +0x80 (RW) : DMA source address
> +            Where to perform the DMA from.
> +
> +0x88 (RW) : DMA destination address
> +            Where to perform the DMA to.
> +
> +0x90 (RW) : DMA transfer count
> +            The size of the area to perform the DMA on.
> +
> +0x98 (RW) : DMA command register, bitwise OR
> +            0x01 -- start transfer
> +            0x02 -- direction (0: from RAM to EDU, 1: from EDU to RAM)
> +            0x04 -- raise interrupt 0x100 after finishing the DMA
> +
> +IRQ controller
> +--------------
> +An IRQ is generated when written to the interrupt raise register. The value
> +appears in interrupt status register when the interrupt is raised and has to
> +be written to the interrupt acknowledge register to lower it.
> +
> +DMA controller
> +--------------
> +One has to specify, source, destination, size, and start the transfer. One
> +4096 bytes long buffer at offset 0x40000 is available in the EDU device. I.e.
> +one can perform DMA to/from this space when programmed properly.
> +
> +Example of transferring a 100 byte block to and from the buffer using a given
> +PCI address 'addr':
> +addr     -> DMA source address
> +0x40000  -> DMA destination address
> +100      -> DMA transfer count
> +1        -> DMA command register
> +while (DMA command register & 1)
> +    ;
> +
> +0x40000  -> DMA source address
> +addr+100 -> DMA destination address
> +100      -> DMA transfer count
> +3        -> DMA command register
> +while (DMA command register & 1)
> +    ;
> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
> index e47fea853065..029a56f279f1 100644
> --- a/hw/misc/Makefile.objs
> +++ b/hw/misc/Makefile.objs
> @@ -40,3 +40,4 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o
>  obj-$(CONFIG_ZYNQ) += zynq_slcr.o
>
>  obj-$(CONFIG_PVPANIC) += pvpanic.o
> +obj-$(CONFIG_EDU) += edu.o
> diff --git a/hw/misc/edu.c b/hw/misc/edu.c
> new file mode 100644
> index 000000000000..c74f9b64540d
> --- /dev/null
> +++ b/hw/misc/edu.c
> @@ -0,0 +1,409 @@
> +/*
> + * QEMU educational PCI device
> + *
> + * Copyright (c) 2012-2014 Jiri Slaby
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "hw/pci/pci.h"
> +#include "qemu/timer.h"
> +#include "qemu/main-loop.h" /* iothread mutex */
> +#include "qapi/visitor.h"
> +
> +#define EDU(obj)        OBJECT_CHECK(EduState, obj, "edu")
> +
> +#define FACT_IRQ        0x00000001
> +#define DMA_IRQ         0x00000100
> +
> +#define DMA_START       0x40000
> +#define DMA_SIZE        4096
> +
> +typedef struct {
> +    PCIDevice pdev;
> +    MemoryRegion mmio;
> +
> +    QemuThread thread;
> +    QemuMutex thr_mutex;
> +    QemuCond thr_cond;
> +    bool stopping;
> +
> +    uint32_t addr4;
> +    uint32_t fact;
> +#define EDU_STATUS_COMPUTING    0x01
> +#define EDU_STATUS_IRQFACT      0x80
> +    uint32_t status;
> +
> +    uint32_t irq_status;
> +
> +#define EDU_DMA_RUN             0x1
> +#define EDU_DMA_DIR(cmd)        (((cmd) & 0x2) >> 1)
> +# define EDU_DMA_FROM_PCI       0
> +# define EDU_DMA_TO_PCI         1
> +#define EDU_DMA_IRQ             0x4
> +    struct dma_state {
> +        dma_addr_t src;
> +        dma_addr_t dst;
> +        dma_addr_t cnt;
> +        dma_addr_t cmd;
> +    } dma;
> +    QEMUTimer dma_timer;
> +    char dma_buf[DMA_SIZE];
> +    uint64_t dma_mask;
> +} EduState;
> +
> +static void edu_raise_irq(EduState *edu, uint32_t val)
> +{
> +    edu->irq_status |= val;
> +    if (edu->irq_status) {
> +        pci_set_irq(&edu->pdev, 1);
> +    }
> +}
> +
> +static void edu_lower_irq(EduState *edu, uint32_t val)
> +{
> +    edu->irq_status &= ~val;
> +
> +    if (!edu->irq_status) {
> +        pci_set_irq(&edu->pdev, 0);
> +    }
> +}
> +
> +static bool within(uint32_t addr, uint32_t start, uint32_t end)
> +{
> +    return start <= addr && addr < end;
> +}
> +
> +static void edu_check_range(uint32_t addr, uint32_t size1, uint32_t start,
> +                uint32_t size2)
> +{
> +    uint32_t end1 = addr + size1;
> +    uint32_t end2 = start + size2;
> +
> +    if (within(addr, start, end2) &&
> +            end1 > addr && within(end1, start, end2)) {
> +        return;
> +    }
> +
> +    hw_error("EDU: DMA range 0x%.8x-0x%.8x out of bounds (0x%.8x-0x%.8x)!",
> +            addr, end1 - 1, start, end2 - 1);
> +}
> +
> +static dma_addr_t edu_clamp_addr(const EduState *edu, dma_addr_t addr)
> +{
> +    dma_addr_t res = addr & edu->dma_mask;
> +
> +    if (addr != res) {
> +        printf("EDU: clamping DMA 0x%.16lx to 0x%.16lx!\n", addr, res);
> +    }
> +
> +    return res;
> +}
> +
> +static void edu_dma_timer(void *opaque)
> +{
> +    EduState *edu = opaque;
> +    bool raise_irq = false;
> +
> +    if (!(edu->dma.cmd & EDU_DMA_RUN)) {
> +        return;
> +    }
> +
> +    if (EDU_DMA_DIR(edu->dma.cmd) == EDU_DMA_FROM_PCI) {
> +        uint32_t dst = edu->dma.dst;
> +        edu_check_range(dst, edu->dma.cnt, DMA_START, DMA_SIZE);
> +        dst -= DMA_START;
> +        pci_dma_read(&edu->pdev, edu_clamp_addr(edu, edu->dma.src),
> +                edu->dma_buf + dst, edu->dma.cnt);
> +    } else {
> +        uint32_t src = edu->dma.src;
> +        edu_check_range(src, edu->dma.cnt, DMA_START, DMA_SIZE);
> +        src -= DMA_START;
> +        pci_dma_write(&edu->pdev, edu_clamp_addr(edu, edu->dma.dst),
> +                edu->dma_buf + src, edu->dma.cnt);
> +    }
> +
> +    edu->dma.cmd &= ~EDU_DMA_RUN;
> +    if (edu->dma.cmd & EDU_DMA_IRQ) {
> +        raise_irq = true;
> +    }
> +
> +    if (raise_irq) {
> +        edu_raise_irq(edu, DMA_IRQ);
> +    }
> +}
> +
> +static void dma_rw(EduState *edu, bool write, dma_addr_t *val, dma_addr_t *dma,
> +                bool timer)
> +{
> +    if (write && (edu->dma.cmd & EDU_DMA_RUN)) {
> +        return;
> +    }
> +
> +    if (write) {
> +        *dma = *val;
> +    } else {
> +        *val = *dma;
> +    }
> +
> +    if (timer) {
> +        timer_mod(&edu->dma_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 100);
> +    }
> +}
> +
> +static uint64_t edu_mmio_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    EduState *edu = opaque;
> +    uint64_t val = ~0ULL;
> +
> +    if (size != 4) {
> +        return val;
> +    }
> +
> +    switch (addr) {
> +    case 0x00:
> +        val = 0x010000edu;
> +        break;
> +    case 0x04:
> +        val = edu->addr4;
> +        break;
> +    case 0x08:
> +        qemu_mutex_lock(&edu->thr_mutex);
> +        val = edu->fact;
> +        qemu_mutex_unlock(&edu->thr_mutex);
> +        break;
> +    case 0x20:
> +        val = atomic_read(&edu->status);
> +        break;
> +    case 0x24:
> +        val = edu->irq_status;
> +        break;
> +    case 0x80:
> +        dma_rw(edu, false, &val, &edu->dma.src, false);
> +        break;
> +    case 0x88:
> +        dma_rw(edu, false, &val, &edu->dma.dst, false);
> +        break;
> +    case 0x90:
> +        dma_rw(edu, false, &val, &edu->dma.cnt, false);
> +        break;
> +    case 0x98:
> +        dma_rw(edu, false, &val, &edu->dma.cmd, false);
> +        break;
> +    }
> +
> +    return val;
> +}
> +
> +static void edu_mmio_write(void *opaque, hwaddr addr, uint64_t val,
> +                unsigned size)
> +{
> +    EduState *edu = opaque;
> +
> +    if (addr < 0x80 && size != 4) {
> +        return;
> +    }
> +
> +    if (addr >= 0x80 && size != 4 && size != 8) {
> +        return;
> +    }
> +
> +    switch (addr) {
> +    case 0x04:
> +        edu->addr4 = ~val;
> +        break;
> +    case 0x08:
> +        if (atomic_read(&edu->status) & EDU_STATUS_COMPUTING) {
> +            break;
> +        }
> +        /* EDU_STATUS_COMPUTING cannot go 0->1 concurrently, because it is only
> +         * set in this function and it is under the iothread mutex.
> +         */
> +        qemu_mutex_lock(&edu->thr_mutex);
> +        edu->fact = val;
> +        atomic_or(&edu->status, EDU_STATUS_COMPUTING);
> +        qemu_cond_signal(&edu->thr_cond);
> +        qemu_mutex_unlock(&edu->thr_mutex);
> +        break;
> +    case 0x20:
> +        if (val & EDU_STATUS_IRQFACT) {
> +            atomic_or(&edu->status, EDU_STATUS_IRQFACT);
> +        } else {
> +            atomic_and(&edu->status, ~EDU_STATUS_IRQFACT);
> +        }
> +        break;
> +    case 0x60:
> +        edu_raise_irq(edu, val);
> +        break;
> +    case 0x64:
> +        edu_lower_irq(edu, val);
> +        break;
> +    case 0x80:
> +        dma_rw(edu, true, &val, &edu->dma.src, false);
> +        break;
> +    case 0x88:
> +        dma_rw(edu, true, &val, &edu->dma.dst, false);
> +        break;
> +    case 0x90:
> +        dma_rw(edu, true, &val, &edu->dma.cnt, false);
> +        break;
> +    case 0x98:
> +        if (!(val & EDU_DMA_RUN)) {
> +            break;
> +        }
> +        dma_rw(edu, true, &val, &edu->dma.cmd, true);
> +        break;
> +    }
> +}
> +
> +static const MemoryRegionOps edu_mmio_ops = {
> +    .read = edu_mmio_read,
> +    .write = edu_mmio_write,
> +    .endianness = DEVICE_NATIVE_ENDIAN,
> +};
> +
> +/*
> + * We purposedly use a thread, so that users are forced to wait for the status
> + * register.
> + */
> +static void *edu_fact_thread(void *opaque)
> +{
> +    EduState *edu = opaque;
> +
> +    while (1) {
> +        uint32_t val, ret = 1;
> +
> +        qemu_mutex_lock(&edu->thr_mutex);
> +        while ((atomic_read(&edu->status) & EDU_STATUS_COMPUTING) == 0 &&
> +                        !edu->stopping) {
> +            qemu_cond_wait(&edu->thr_cond, &edu->thr_mutex);
> +        }
> +
> +        if (edu->stopping) {
> +            qemu_mutex_unlock(&edu->thr_mutex);
> +            break;
> +        }
> +
> +        val = edu->fact;
> +        qemu_mutex_unlock(&edu->thr_mutex);
> +
> +        while (val > 0) {
> +            ret *= val--;
> +        }
> +
> +        /*
> +         * We should sleep for a random period here, so that students are
> +         * forced to check the status properly.
> +         */
> +
> +        qemu_mutex_lock(&edu->thr_mutex);
> +        edu->fact = ret;
> +        qemu_mutex_unlock(&edu->thr_mutex);
> +        atomic_and(&edu->status, ~EDU_STATUS_COMPUTING);
> +
> +        if (atomic_read(&edu->status) & EDU_STATUS_IRQFACT) {
> +            qemu_mutex_lock_iothread();
> +            edu_raise_irq(edu, FACT_IRQ);
> +            qemu_mutex_unlock_iothread();
> +        }
> +    }
> +
> +    return NULL;
> +}
> +
> +static int pci_edu_init(PCIDevice *pdev)
> +{
> +    EduState *edu = DO_UPCAST(EduState, pdev, pdev);
> +    uint8_t *pci_conf = pdev->config;
> +
> +    timer_init(&edu->dma_timer, main_loop_tlg.tl[QEMU_CLOCK_VIRTUAL], SCALE_MS,
> +            edu_dma_timer, edu);
> +
> +    qemu_mutex_init(&edu->thr_mutex);
> +    qemu_cond_init(&edu->thr_cond);
> +    qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
> +                       edu, QEMU_THREAD_JOINABLE);
> +
> +    pci_config_set_interrupt_pin(pci_conf, 1);
> +
> +    memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
> +                    "edu-mmio", 1 << 20);
> +    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio);
> +
> +    return 0;
> +}
> +
> +static void pci_edu_uninit(PCIDevice *pdev)
> +{
> +    EduState *edu = DO_UPCAST(EduState, pdev, pdev);
> +
> +    qemu_mutex_lock(&edu->thr_mutex);
> +    edu->stopping = true;
> +    qemu_mutex_unlock(&edu->thr_mutex);
> +    qemu_cond_signal(&edu->thr_cond);
> +    qemu_thread_join(&edu->thread);
> +
> +    qemu_cond_destroy(&edu->thr_cond);
> +    qemu_mutex_destroy(&edu->thr_mutex);
> +
> +    timer_del(&edu->dma_timer);
> +}
> +
> +static void edu_obj_uint64(Object *obj, struct Visitor *v, void *opaque,
> +                           const char *name, Error **errp)
> +{
> +    uint64_t *val = opaque;
> +
> +    visit_type_uint64(v, val, name, errp);
> +}
> +
> +static void edu_instance_init(Object *obj)
> +{
> +    EduState *edu = EDU(obj);
> +
> +    edu->dma_mask = (1UL << 28) - 1;
> +    object_property_add(obj, "dma_mask", "uint64", edu_obj_uint64,
> +                    edu_obj_uint64, NULL, &edu->dma_mask, NULL);
> +}
> +
> +static void edu_class_init(ObjectClass *class, void *data)
> +{
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(class);
> +
> +    k->init = pci_edu_init;
> +    k->exit = pci_edu_uninit;
> +    k->vendor_id = PCI_VENDOR_ID_QEMU;
> +    k->device_id = 0x11e8;
> +    k->revision = 0x10;
> +    k->class_id = PCI_CLASS_OTHERS;
> +}
> +
> +static void pci_edu_register_types(void)
> +{
> +    static const TypeInfo edu_info = {
> +        .name          = "edu",
> +        .parent        = TYPE_PCI_DEVICE,
> +        .instance_size = sizeof(EduState),
> +        .instance_init = edu_instance_init,
> +        .class_init    = edu_class_init,
> +    };
> +
> +    type_register_static(&edu_info);
> +}
> +type_init(pci_edu_register_types)
>
Applied, thanks. Pull request should come later this week or early next week.

Paolo
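
For anyone reading the register description in docs/specs/edu.txt without a guest at hand, here is a rough user-space sketch in C of a few of the semantics the patch implements: the liveness inversion at 0x04, the 32-bit factorial at 0x08, the dma_mask clamp, and the bounds check on the device-side DMA buffer. It is not QEMU code and touches no QEMU API. All model_* and MODEL_* names are made up for illustration, and the range check only mirrors the logic of edu_check_range() as quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the spec text; the MODEL_ names are hypothetical. */
#define MODEL_DMA_START 0x40000u /* offset of the on-device buffer */
#define MODEL_DMA_SIZE  4096u    /* buffer length in bytes */

/* 0x04: the device stores the bitwise inversion of the written value. */
static inline uint32_t model_liveness(uint32_t written)
{
    return ~written;
}

/* 0x08: factorial in uint32_t arithmetic, so results wrap modulo 2^32. */
static inline uint32_t model_factorial(uint32_t val)
{
    uint32_t ret = 1;

    while (val > 0) {
        ret *= val--;
    }
    return ret;
}

/* Default dma_mask from the patch: 28 address bits (256 MiB). */
static inline uint64_t model_default_dma_mask(void)
{
    return (UINT64_C(1) << 28) - 1;
}

/* Bus addresses are ANDed with the mask before the transfer. */
static inline uint64_t model_clamp_addr(uint64_t addr, uint64_t mask)
{
    return addr & mask;
}

static inline bool model_within(uint32_t addr, uint32_t start, uint32_t end)
{
    return start <= addr && addr < end;
}

/*
 * Same comparisons as edu_check_range() in the patch, returning a bool
 * instead of calling hw_error(). Because the end address goes through the
 * half-open within() test, a transfer ending exactly at the end of the
 * buffer is rejected.
 */
static inline bool model_range_ok(uint32_t addr, uint32_t size)
{
    uint32_t end1 = addr + size;
    uint32_t end2 = MODEL_DMA_START + MODEL_DMA_SIZE;

    return model_within(addr, MODEL_DMA_START, end2) &&
           end1 > addr && model_within(end1, MODEL_DMA_START, end2);
}

/* Usage example: these hold for the model as written. */
static inline void model_selfcheck(void)
{
    assert(model_liveness(0x12345678u) == 0xEDCBA987u);
    assert(model_factorial(5) == 120u);
    assert(model_clamp_addr(0x12345678u, model_default_dma_mask())
           == 0x02345678u);
    assert(model_range_ok(MODEL_DMA_START, 100));
}
```

Calling model_selfcheck() from a main() is enough to try it out. Two things a driver author may want to keep in mind from this: factorials above 12! no longer fit in 32 bits, and with the check as posted a single 4096-byte transfer starting at 0x40000 would be refused, while 4095 bytes would pass.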