On Thu, 10 Dec 2015 00:49:08 +0530 Santosh Shukla <sshukla at mvista.com> wrote:
> On Thu, Dec 10, 2015 at 12:34 AM, Stephen Hemminger
> <stephen at networkplumber.org> wrote:
> > On Thu, 10 Dec 2015 00:29:30 +0530
> > Santosh Shukla <sshukla at mvista.com> wrote:
> >
> >> On Tue, Dec 8, 2015 at 6:23 PM, Santosh Shukla <sshukla at mvista.com>
> >> wrote:
> >> >
> >> > On Mon, Dec 7, 2015 at 10:38 PM, Stephen Hemminger
> >> > <stephen at networkplumber.org> wrote:
> >> >>
> >> >> On Fri, 4 Dec 2015 23:05:19 +0530
> >> >> Santosh Shukla <sshukla at mvista.com> wrote:
> >> >>
> >> >> >
> >> >> > +#ifdef RTE_EXEC_ENV_LINUXAPP
> >> >> > +/* start address of first pci_iobar slot (user-space virtual-address) */
> >> >> > +void *ioport_map;
> >> >> > +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
> >> >> > +
> >> >> > +#include <sys/mman.h>
> >> >> > +#define DEV_NAME "/dev/igb_ioport"
> >> >> > +
> >> >> > +/* Keeping pci_ioport_size = 4k.
> >> >> > + * So maximum mmaped pci_iobar supported =
> >> >> > + * (ioport_size/pci_dev->mem_resource[0].len)
> >> >> > + *
> >> >> > + * note: the kernel allows a maximum of 32 virtio-net-pci interfaces,
> >> >> > + * i.e. a maximum of 32 PCI_IOBAR(s) where each PCI_IOBAR_LEN=0x20, so
> >> >> > + * virtio_map_ioport() in theory supports 4k/0x20 ==> 128 PCI_IOBAR(s),
> >> >> > + * more than the maximum number of virtio-net-pci interfaces.
> >> >> > + */
> >> >> > +#define PAGE_SIZE 4096
> >> >> > +#define PCI_IOPORT_SIZE PAGE_SIZE
> >> >> > +#define PCI_IOPORT_MAX 128 /* 4k / 0x20 */
> >> >> > +
> >> >> > +int ioport_map_cnt;
> >> >> > +#endif /* ARM, ARM64 */
> >> >> > +#endif /* RTE_EXEC_ENV_LINUXAPP */
> >> >>
> >> >> These variables should be static.
> >> >>
> >> >
> >> > (Sorry for the delayed follow-up, was travelling..)
> >> > Right,
> >> >
> >> >>
> >> >> Also, it should be possible to extract the I/O bar stuff in a
> >> >> generic way through sysfs and not depend on a character device.
> >> >> The long-term goal for DPDK acceptance is to eliminate (or at
> >> >> least reduce to a minimum) any requirement for special kernel
> >> >> drivers.
> >> >
> >> >
> >> > I agree. The existing implementation does read pci_iobar for the
> >> > start address and size, but for non-x86 archs we need some way to
> >> > map pci_iobar, which is why I thought of adding a device file for
> >> > that purpose: archs like arm lack iopl() privileged I/O syscall
> >> > support. That said, iopl() is itself quite a native-driver design
> >> > assumption.
> >> >
> >> > I have a few ideas in mind. Right now I am writing the ioport-mapped
> >> > addr {kernel-virtual-addr-io-memory} to the
> >> > /sys/bus/pci/pci_bus_xxxx/xxx/map field; instead of mapping there,
> >> > I'll try to map via uio's pci interface and then use the existing
> >> > pci_map_resource() api to mmap the kernel-virtual-io-address to a
> >> > user-space-virtual-ioaddr. We'll come back on this.
> >> >
> >>
> >>
> >> Spent some time digging into dpdk's uio/pci source code; the intent
> >> was to map the pci ioport region the uio way. To try that out, I
> >> hacked the virtio-net-pci pmd driver. In the virtio-net-pci case,
> >> two sysfs entries are created for the pci bars: resource0/1.
> >>
> >> Resource0 is the ioport region.
> >> Resource1 is the iomem region.
> >>
> >> I appended the RTE_PCI_DRV_NEED_MAPPING flag to drv_flags and set
> >> hw->io_base = pci.mem_resource[slot].addr with slot = 1, i.e. the
> >> resource1 bar. Resource1 is of IORESOURCE_MEM type, so the uio/pci
> >> driver is able to mmap it. That way I could get a valid user-space
> >> virtual address.
> >> However, this hack did not work for me, because on the qemu side
> >> virtio-pxe.rom has the virtio headers located in the ioport pci
> >> region while the guest driver was writing to the iomem region;
> >> that's why driver init failed. Note that the default driver doesn't
> >> use resource1 memory.
> >>
> >> This made me think that I had to either add dependent code in the
> >> kernel or do something similar to what is proposed in this patch.
> >> That is because:
> >> - The uio driver and the dependent user-space pci apis in dpdk mmap
> >>   IORESOURCE_MEM type addresses only {refer igbuio_setup_bars() and
> >>   in particular the function pci_parse_sysfs_resource()}.
> >> - That user-space mmap is backed by the arch-specific
> >>   pci_mmap_page_range() in the kernel.
> >> - pci_mmap_page_range() does not support mmaping IORESOURCE_IO type
> >>   memory.
> >> - Having said that, we need a routine or some way to map the
> >>   pci_iobar region from a kernel virtual address to a user-space
> >>   virtual address.
> >
> > There are a couple of gotchas with this. It turns out the iomem
> > region is not mappable on some platforms. I think GCE was one of
> > them.
>
> afaik, in the linux kernel, if the arch supports pci_mmap_page_range()
> then the iomem region should map, right? I am confused by your reply,
> and I am not aware of GCE. Which platform is GCE? Please suggest.

I think it was Google Compute Engine that reported a memory region
which was huge and not accessible; they have their own vhost.
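
For reference, the user-space half of what the patch proposes looks
roughly like the following. This is only an untested sketch: the
kernel driver that backs the mmap is not shown, and the offset-0
feature read at the end is purely an example access; DEV_NAME,
PCI_IOPORT_SIZE and the 0x20 bar length come from the patch itself.

/* Sketch of the arm/arm64 path: mmap the 4K window exported by the
 * proposed /dev/igb_ioport character device and use volatile
 * loads/stores in place of in/out instructions.
 */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define DEV_NAME        "/dev/igb_ioport"
#define PCI_IOPORT_SIZE 4096
#define PCI_IOBAR_LEN   0x20

int main(void)
{
	int fd = open(DEV_NAME, O_RDWR);
	if (fd < 0)
		return 1;

	void *ioport_map = mmap(NULL, PCI_IOPORT_SIZE,
				PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	if (ioport_map == MAP_FAILED)
		return 1;

	/* Slot n of the window holds the nth mapped pci_iobar, each
	 * PCI_IOBAR_LEN bytes long.  Offset 0 of a legacy virtio io bar
	 * is the host features register; read it as an example. */
	volatile uint32_t *bar = (volatile uint32_t *)ioport_map;
	printf("host features: 0x%x\n", bar[0]);

	munmap(ioport_map, PCI_IOPORT_SIZE);
	return 0;
}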
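
And for comparison, extracting the I/O bar through sysfs without any
special kernel driver is straightforward on x86, where iopl() and the
port instructions exist. Again only a sketch: the 0000:00:03.0 address
below is a placeholder, and IORESOURCE_IO is the kernel's 0x100 flag
bit in the resource flags.

/* Sketch: find a device's io bar via the generic sysfs 'resource'
 * file and touch it with x86 port instructions.  This path has no
 * arm/arm64 equivalent, which is the gap the patch tries to fill.
 */
#include <stdio.h>
#include <stdint.h>
#include <sys/io.h>

#define PCI_RESOURCE  "/sys/bus/pci/devices/0000:00:03.0/resource"
#define IORESOURCE_IO 0x100	/* flag bit marking an I/O port bar */

int main(void)
{
	unsigned long long start = 0, end, flags;
	FILE *f = fopen(PCI_RESOURCE, "r");

	if (f == NULL)
		return 1;
	/* One "start end flags" line per bar; stop at the first io bar. */
	while (fscanf(f, "%llx %llx %llx", &start, &end, &flags) == 3)
		if (flags & IORESOURCE_IO)
			break;
	fclose(f);

	if (iopl(3) != 0)	/* privileged; needs CAP_SYS_RAWIO */
		return 1;

	/* Offset 0 of a legacy virtio io bar: host features (example). */
	printf("io bar @ 0x%llx, host features 0x%x\n",
	       start, inl((unsigned short)start));
	return 0;
}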