I posted a patch to the mailing list. http://patchwork.dpdk.org/project/dpdk/patch/1630389404-13468-1-git-send-email-lon...@linuxonhyperv.com/
Jonathan, please let me know if this has fixed the crash you are seeing. Long > Subject: Re: [dpdk-dev] [PATCH v2] bus/vmbus: Fix crash when handling packets > in secondary process > > In the 2nd part, the mapped addresses for the sub-channels in the second > process should be the same as those mapped addresses in the primary process. > > You are correct on seting br->vbr to mapaddr. (To the same addresses used by > the primary process) This can be done by recording the mapped addresses for > all > the sub-channels in the primary process to vmbus_tailq. Later when the > secondary process re-creates those channels, it can call > vmbus_uio_find_resource() to find out the addresses for the same sub-channels > mapped by the primary process. > > > From: Jonathan Erb <jonathan....@banduracyber.com> > Sent: Friday, August 13, 2021 3:10 PM > To: Long Li <lon...@microsoft.com>; Stephen Hemminger > <sthem...@microsoft.com> > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: Re: [PATCH v2] bus/vmbus: Fix crash when handling packets in > secondary process > > > The first part makes sense. There are several ways to iterate over all the > subchannels. > > I'm probably not fully understanding the second part. > vmbus_uio_map_secondary_subchan() wants to map to br->vbr but that does > not appear to be initialized anywhere. In fact, it looks like we should set > br->vbr > to mapaddr which is what vmbus_uio_map_subchan() does. > > Jonathan > On 8/11/21 6:06 PM, Long Li wrote: > I think the code is on the right track. > > Instead of using vmbus_uio_get_num_subchan() and calling > vmbus_uio_get_subchan() on each channel, you can just create a new function > vmbus_uio_get_secondary_subchan(). This function goes through all > subchannels and map ring buffers to the same addresses used by the primary > process. > > In the original code, vmbus_uio_map_secondary_subchan() has a check for "if > (mapaddr == ring_buf)". If the address is mapped to somewhere else, the > address won't work for the secondary process. So we need to keep this check. > > From: Stephen Hemminger > <sthem...@microsoft.com><mailto:sthem...@microsoft.com> > Sent: Wednesday, August 11, 2021 8:26 AM > To: Jonathan Erb > <jonathan....@banduracyber.com><mailto:jonathan....@banduracyber.com>; > Long Li <lon...@microsoft.com><mailto:lon...@microsoft.com> > Cc: dev@dpdk.org<mailto:dev@dpdk.org>; > sta...@dpdk.org<mailto:sta...@dpdk.org> > Subject: RE: [PATCH v2] bus/vmbus: Fix crash when handling packets in > secondary process > > Looks fine, only comments would be to keep to original style and add check if > primary channel not found? > > From: Jonathan Erb > <jonathan....@banduracyber.com<mailto:jonathan....@banduracyber.com>> > Sent: Monday, August 9, 2021 9:07 AM > To: Long Li <lon...@microsoft.com<mailto:lon...@microsoft.com>>; Stephen > Hemminger <sthem...@microsoft.com<mailto:sthem...@microsoft.com>> > Cc: dev@dpdk.org<mailto:dev@dpdk.org>; > sta...@dpdk.org<mailto:sta...@dpdk.org> > Subject: Re: [PATCH v2] bus/vmbus: Fix crash when handling packets in > secondary process > > You don't often get email from > jonathan....@banduracyber.com<mailto:jonathan....@banduracyber.com>. > Learn why this is important<http://aka.ms/LearnAboutSenderIdentification> > > Would it be possible to resolve the lack of subchannels in secondary processes > as follows: > > > > 1. Create the following function in vmbus_uio.c to obtain the count of > subchannels created by the primary: > int vmbus_uio_get_num_subchan(struct vmbus_channel *primary) { > > const struct rte_vmbus_device *dev = primary->device; > char chan_path[PATH_MAX]; > struct dirent *ent; > DIR *chan_dir; > int err; > int subchan_cnt = 0; > > snprintf(chan_path, sizeof(chan_path), > "%s/%s/channels", > SYSFS_VMBUS_DEVICES, dev->device.name); > > chan_dir = opendir(chan_path); > if (!chan_dir) { > VMBUS_LOG(ERR, "cannot open %s: %s", > chan_path, strerror(errno)); > return 0; > } > > while ((ent = readdir(chan_dir))) { > unsigned long relid, subid, monid; > char *endp; > > if (ent->d_name[0] == '.') > continue; > > errno = 0; > relid = strtoul(ent->d_name, &endp, 0); > if (*endp || errno != 0 || relid > UINT16_MAX) { > VMBUS_LOG(NOTICE, "not a valid channel relid: %s", > ent->d_name); > continue; > } > subchan_cnt++; > } > > closedir(chan_dir); > //Less one for primary channel > return subchan_cnt - 1; > > } > > > > 2. Then change the bottom of vmbus_uio_map_secondary() to be simply: > /* fd is not needed in secondary process, close it */ > close(fd); > > if (vmbus_chan_create(dev, dev->relid, 0, > dev->monitor_id, &dev->primary)) { > VMBUS_LOG(ERR, "cannot create primary channel"); > return -1; > } > > int subchannels_cnt = vmbus_uio_get_num_subchan(dev->primary); > for (i = 0; i < subchannels_cnt; i++) { > if(vmbus_uio_get_subchan(dev->primary, &chan)) > return -1; > STAILQ_INSERT_TAIL(&dev->primary->subchannel_list, chan, next); > } > return 0; > } > > > > In this way the secondary processes detect the number of subchannels created > by the primary, then perform their own mappings as needed. All this can occur > before rte_eal_init has completed. > > > > Jonathan > > > > > On 7/26/21 6:16 PM, Long Li wrote: > > Subject: [PATCH v2] bus/vmbus: Fix crash when handling packets in > > secondary process > > > > Have secondary processes construct their own copy of primary channel with > > own mappings. > > > > Remove vmbus_channel primary ptr from struct mapped_vmbus_resource > > as its not used. > > > > Populate virtual memory address "addr" in struct rte_mem_resource for > > secondary processes as netvsc will attempt to reference it thus causing a > > crash. It was initialized for primary processes but not for secondary. > > Cc: sta...@dpdk.org<mailto:sta...@dpdk.org> > > > > Signed-off-by: Jonathan Erb > <jonathan....@banduracyber.com><mailto:jonathan....@banduracyber.com> > > --- > > v2: > > * Remove unnecessary check for NULL pointer before call to rte_free() per > > reviwer comment. > > > > drivers/bus/vmbus/private.h | 1 - > > drivers/bus/vmbus/vmbus_channel.c | 4 +--- > > drivers/bus/vmbus/vmbus_common_uio.c | 14 +++++++++----- > > 3 files changed, 10 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/bus/vmbus/private.h b/drivers/bus/vmbus/private.h > > index 528d60a42f..746212bd5f 100644 > > --- a/drivers/bus/vmbus/private.h > > +++ b/drivers/bus/vmbus/private.h > > @@ -42,7 +42,6 @@ struct mapped_vmbus_resource { > > > > rte_uuid_t id; > > int nb_maps; > > - struct vmbus_channel *primary; > > struct vmbus_map maps[VMBUS_MAX_RESOURCE]; > > char path[PATH_MAX]; > > }; > > diff --git a/drivers/bus/vmbus/vmbus_channel.c > > b/drivers/bus/vmbus/vmbus_channel.c > > index f67f1c438a..119b9b367e 100644 > > --- a/drivers/bus/vmbus/vmbus_channel.c > > +++ b/drivers/bus/vmbus/vmbus_channel.c > > @@ -351,10 +351,8 @@ int rte_vmbus_chan_open(struct rte_vmbus_device > > *device, > > > > err = vmbus_chan_create(device, device->relid, 0, > > device->monitor_id, new_chan); > > - if (!err) { > > + if (!err) > > device->primary = *new_chan; > > - uio_res->primary = *new_chan; > > - } > > > > return err; > > } > > diff --git a/drivers/bus/vmbus/vmbus_common_uio.c > > b/drivers/bus/vmbus/vmbus_common_uio.c > > index 8582e32c1d..83c56b6fa2 100644 > > --- a/drivers/bus/vmbus/vmbus_common_uio.c > > +++ b/drivers/bus/vmbus/vmbus_common_uio.c > > @@ -69,8 +69,10 @@ vmbus_uio_map_secondary(struct rte_vmbus_device > > *dev) > > fd, offset, > > uio_res->maps[i].size, 0); > > > > - if (mapaddr == uio_res->maps[i].addr) > > + if (mapaddr == uio_res->maps[i].addr) { > > + dev->resource[i].addr = mapaddr; > > continue; /* successful map */ > > + } > > > > if (mapaddr == MAP_FAILED) > > VMBUS_LOG(ERR, > > @@ -88,9 +90,9 @@ vmbus_uio_map_secondary(struct rte_vmbus_device > > *dev) > > /* fd is not needed in secondary process, close it */ > > close(fd); > > > > - dev->primary = uio_res->primary; > > - if (!dev->primary) { > > - VMBUS_LOG(ERR, "missing primary channel"); > > + if (vmbus_chan_create(dev, dev->relid, 0, > > + dev->monitor_id, &dev->primary)) { > > + VMBUS_LOG(ERR, "cannot create primary channel"); > > return -1; > > } > > > > Looking at this closer, I don't think it will work for subchannels in the > secondary > process. > > > > The code after is: > > > > STAILQ_FOREACH(chan, &dev->primary->subchannel_list, next) { > > if (vmbus_uio_map_secondary_subchan(dev, chan) != 0) { > > VMBUS_LOG(ERR, "cannot map secondary subchan"); > > return -1; > > } > > } > > > > Because at this time, the primary channel is just created, "subchannel_list" > should be NULL. The secondary process ends up running without subchannels. In > the primary process, "subchannel_list" are populated when it calls > hn_dev_configure(), so it is good. > > > > Sorry I didn't spot this earlier. The best way to fix this is to use rte > memory > functions for allocating vmbus device. In this way they can be properly > mapped > to the secondary process. But rte memory functions are not available at the > time vmbus device is probed. I haven't been able to find a good way to fix > this. > Will keep looking. > > > > Thanks, > > Long > > > > > > > > @@ -211,8 +213,10 @@ vmbus_uio_unmap_resource(struct > > rte_vmbus_device *dev) > > return; > > > > /* secondary processes - just free maps */ > > - if (rte_eal_process_type() != RTE_PROC_PRIMARY) > > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > > + rte_free(dev->primary); > > return vmbus_uio_unmap(uio_res); > > + } > > > > TAILQ_REMOVE(uio_res_list, uio_res, next); > > > > -- > > 2.17.1 >