On 1/11/2017 10:42 AM, Jason Wang wrote:


On 2017年01月10日 14:11, Tan, Jianfeng wrote:

Hi Jason,


On 1/9/2017 12:39 PM, Jason Wang wrote:


On 2016年12月23日 15:14, Jianfeng Tan wrote:
This patch add support vhost kernel as the backend for virtio_user.
Three main hook functions are added:
   - vhost_kernel_setup() to open char device, each vq pair needs one
     vhostfd;
   - vhost_kernel_ioctl() to communicate control messages with vhost
     kernel module;
   - vhost_kernel_enable_queue_pair() to open tap device and set it
as the backend of corresonding vhost fd (that is to say, vq pair).

Signed-off-by: Jianfeng Tan <jianfeng....@intel.com>
---

[...]
+/* TUNSETIFF ifr flags */
+#define IFF_TAP          0x0002
+#define IFF_NO_PI        0x1000
+#define IFF_ONE_QUEUE    0x2000
+#define IFF_VNET_HDR     0x4000
+#define IFF_MULTI_QUEUE  0x0100
+#define IFF_ATTACH_QUEUE 0x0200
+#define IFF_DETACH_QUEUE 0x0400

Do we really want to duplicate those things which has been exposed by uapi here?

You mean those defined in <linux/if_tun.h>? Redefine those common macros, or include standard header file, with respective pros and cons. DPDK prefers the redefinition way as far as I understand, doesn't it?


Well, if you really want to do this, you may want to use an independent file. Then you can sync it with linux headers with a bash script.

Agreed!

[...]
+static struct vhost_memory_kernel *
+prepare_vhost_memory_kernel(void)
+{
+    uint32_t i, j, k = 0;
+    struct rte_memseg *seg;
+    struct vhost_memory_region *mr;
+    struct vhost_memory_kernel *vm;
+
+    vm = malloc(sizeof(struct vhost_memory_kernel) +
+            VHOST_KERNEL_MAX_REGIONS *
+            sizeof(struct vhost_memory_region));
+
+    for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
+        seg = &rte_eal_get_configuration()->mem_config->memseg[i];
+        if (!seg->addr)
+            break;

If we're sure the number of regions is less than 64(or the module parameter read from /sys), can we avoid the iteration here?

The "if" statement under the "for" statement can save us from all RTE_MAX_MEMSEG iteration.

But if we know the number of regions is short than the limit, there's even no need for this?

The problem is: we don't have a variable to know how many segments are there in DPDK. Anyway, we need to report error in the situation that # of segments > # of regions. Or can you give a more specific example?

[...]
+    if (dev->ifname)
+        strncpy(ifr.ifr_name, dev->ifname, IFNAMSIZ);
+    else
+        strncpy(ifr.ifr_name, "tap%d", IFNAMSIZ);
+    if (ioctl(tapfd, TUNSETIFF, (void *)&ifr) == -1) {
+        PMD_DRV_LOG(ERR, "TUNSETIFF failed: %s", strerror(errno));
+        goto error;
+    }

This requires CAP_NET_ADMIN, so we should really consider to accept a pre-created fd here.

It sounds like a future work for me. So far, all DPDK apps are running in privileged mode (including CAP_NET_ADMIN?).


That's not safe.

Yes.

Accepting a per-created fd can solve this, and can even support macvtap

So you are proposing a way to specify the virtio_user parameter like, --vdev=virtio_user,tapfd=fd1,xxx, right? As replied in your another comment, it's a great way to go. But I'm afraid we have no time to implement that in this cycle.

Thanks,
Jianfeng

Reply via email to