[PATCH] staging: lustre: remove IOC_LIBCFS_PING_TEST ioctl
From: James Simmons The ioctl IOC_LIBCFS_PING_TEST has not been used in ages. The recent nidstring changes which moved all the nidstring operations from libcfs to the LNet layer but this ioctl code was still using an nidstring operation that was causing an circular dependency loop between libcfs and LNet. Signed-off-by: James Simmons --- .../lustre/include/linux/libcfs/libcfs_ioctl.h |1 - drivers/staging/lustre/lustre/libcfs/module.c | 17 - 2 files changed, 0 insertions(+), 18 deletions(-) diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h index f5d741f..485ab26 100644 --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h @@ -110,7 +110,6 @@ struct libcfs_ioctl_handler { #define IOC_LIBCFS_CLEAR_DEBUG _IOWR('e', 31, long) #define IOC_LIBCFS_MARK_DEBUG_IOWR('e', 32, long) #define IOC_LIBCFS_MEMHOG_IOWR('e', 36, long) -#define IOC_LIBCFS_PING_TEST _IOWR('e', 37, long) /* lnet ioctls */ #define IOC_LIBCFS_GET_NI_IOWR('e', 50, long) #define IOC_LIBCFS_FAIL_NID_IOWR('e', 51, long) diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c index 50e8fd2..5e22820 100644 --- a/drivers/staging/lustre/lustre/libcfs/module.c +++ b/drivers/staging/lustre/lustre/libcfs/module.c @@ -274,23 +274,6 @@ static int libcfs_ioctl_int(struct cfs_psdev_file *pfile, unsigned long cmd, } break; - case IOC_LIBCFS_PING_TEST: { - extern void (kping_client)(struct libcfs_ioctl_data *); - void (*ping)(struct libcfs_ioctl_data *); - - CDEBUG(D_IOCTL, "doing %d pings to nid %s (%s)\n", - data->ioc_count, libcfs_nid2str(data->ioc_nid), - libcfs_nid2str(data->ioc_nid)); - ping = symbol_get(kping_client); - if (!ping) - CERROR("symbol_get failed\n"); - else { - ping(data); - symbol_put(kping_client); - } - return 0; - } - default: { struct libcfs_ioctl_handler *hand; -- 1.7.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH 3/3] tools: hv: Check return value of poll call
On Thu, Jun 27, 2013 at 5:22 PM, Tomas Hozza wrote: > Check return value of poll call and if it fails print error to the > system log. If errno is EINVAL then exit with non-zero value otherwise > continue the while loop and call poll again. > > Signed-off-by: Tomas Hozza > --- > tools/hv/hv_vss_daemon.c | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/tools/hv/hv_vss_daemon.c b/tools/hv/hv_vss_daemon.c > index 6c4d2f1..826d499 100644 > --- a/tools/hv/hv_vss_daemon.c > +++ b/tools/hv/hv_vss_daemon.c > @@ -204,7 +204,16 @@ int main(void) > socklen_t addr_l = sizeof(addr); > pfd.events = POLLIN; > pfd.revents = 0; > - poll(&pfd, 1, -1); > + > + if (poll(&pfd, 1, -1) < 0) { >From the description it seems that this is a typo. It should be 'while'? > + syslog(LOG_ERR, "poll failed; error:%d %s", errno, > strerror(errno)); > + if (errno == EINVAL) { > + close(fd); > + exit(EXIT_FAILURE); > + } > + else > + continue; > + } > > len = recvfrom(fd, vss_recv_buffer, sizeof(vss_recv_buffer), > 0, > addr_p, &addr_l); > -- > 1.8.1.4 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel -- ~Santosh ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: leaking path in android binder: set_nice
On Tue, Sep 25, 2018 at 01:52:57PM -0400, Stephen Smalley wrote: > On 09/25/2018 01:27 PM, Tong Zhang wrote: > > Kernel Version: 4.18.5 > > > > Problem Description: > > > > When setting nice value, it is checked by LSM function > > security_task_setnice(). > > see kernel/sched/core.c:3972 SYSCALL_DEFINE1(nice, int, increment) > > > > We discovered a leaking path in android binder which allows using binder’s > > interface to change > > a process’s nice value. This path is leaked from being monitored by LSM. > > see drivers/android/binder.c:1107 binder_set_nice. > > Not sure you want to invoke the LSM hook (or at least the same hook) when > binder is performing priority inheritance. There is a difference between a > userspace process switching its own priority and the kernel binder driver > performing it. IIUC, the can_nice() check is more about honoring > RLIMIT_NICE than anything else. I agree with Stephen; it doesn't make sense to subject the binder PI mechanism to the LSM hook. - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH] erofs: move erofs out of staging
On Sun, Aug 18, 2019 at 11:21:13AM +0200, Richard Weinberger wrote: > > Not to say that erofs shouldn't be worked on to fix these kinds of > > issues, just that it's not an unheard of thing to trust the disk image. > > Especially for the normal usage model of erofs, where the whole disk > > image is verfied before it is allowed to be mounted as part of the boot > > process. > > For normal use I see no problem at all. > I fear distros that try to mount anything you plug into your USB. > > At least SUSE already blacklists erofs: > https://github.com/openSUSE/suse-module-tools/blob/master/suse-module-tools.spec#L24 Note that of the mainstream file systems, ext4 and xfs don't guarantee that it's safe to blindly take maliciously provided file systems, such as those provided by a untrusted container, and mount it on a file system without problems. As I recall, one of the XFS developers described file system fuzzing reports as a denial of service attack on the developers. And on the ext4 side, while I try to address them, it is by no means considered a high priority work item, and I don't consider fixes of fuzzing reports to be worthy of coordinated disclosure or a high priority bug to fix. (I have closed more bugs in this area than XFS has, although that may be that ext4 started with more file system format bugs than XFS; however given the time to first bug in 2017 using American Fuzzy Lop[1] being 5 seconds for btrfs, 10 seconds for f2fs, 25 seconds for reiserfs, 4 minutes for ntfs, 1h45m for xfs, and 2h for ext4, that seems unlikely.) [1] https://events.static.linuxfound.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf So holding a file system like EROFS to a higher standard than say, ext4, xfs, or btrfs hardly seems fair. There seems to be a very unfortunate tendancy for us to hold new file systems to impossibly high standards, when in fact, adding a file system to Linux should not, in my opinion, be a remarkable event. We have a huge number of them already, many of which are barely maintained and probably have far worse issues than file systems trying to get into the clubhouse. If a file system is requesting core changes to the VM or block layers, sure, we should care about the interfaces. But this nitpicking about whether or not a file system can be trusted in what I consider to be COMPLETELY INSANE CONTAINER USE CASES is really not fair. Cheers, - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH] erofs: move erofs out of staging
On Sun, Aug 18, 2019 at 08:58:12AM -0700, Christoph Hellwig wrote: > On Sun, Aug 18, 2019 at 11:11:54AM -0400, Theodore Y. Ts'o wrote: > > Note that of the mainstream file systems, ext4 and xfs don't guarantee > > that it's safe to blindly take maliciously provided file systems, such > > as those provided by a untrusted container, and mount it on a file > > system without problems. As I recall, one of the XFS developers > > described file system fuzzing reports as a denial of service attack on > > the developers. > > I think this greatly misrepresents the general attitute of the XFS > developers. We take sanity checks for the modern v5 on disk format > very series, and put a lot of effort into handling corrupted file > systems as good as possible, although there are of course no guaranteeѕ. > > The quote that you've taken out of context is for the legacy v4 format > that has no checksums and other integrity features. Actually, what Prof. Kim's research group was doing was taking the latest file system formats (for ext4 and xfs) and fixing up the checksum after fuzzing the metadata blocks. The goal was to find potential security vulnerabilities, not to see if file systems would crash if fed invalid input. At least for ext4, at least one of Prof. Kim's fuzzing results was one that that I believe could have been leveraged into a stack overflow attack. I can't speak to his results with respect to XFS, since I didn't look at them. Cheers, - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH] erofs: move erofs out of staging
On Sun, Aug 18, 2019 at 07:06:40PM +0200, Richard Weinberger wrote: > > So holding a file system like EROFS to a higher standard than say, > > ext4, xfs, or btrfs hardly seems fair. > > Nobody claimed that. Pointing out that erofs has issues in this area when Gao Xiang is asking if erofs can be moved out of staging and join the "official clubhouse" of file systems could certainly be reasonable interpreted as such. Reporting such vulnerablities are a good thing, and hopefully all file system maintainers will welcome them. Doing them on a e-mail thread about promoting out of erofs is certainly going to lead to inferences of a double standard. Cheers, - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH] erofs: move erofs out of staging
On Tue, Aug 20, 2019 at 10:24:11AM +0800, Chao Yu wrote: > Out of curiosity, it looks like every mainstream filesystem has its own > fuzz/injection tool in their tool-set, if it's really such a generic > requirement, why shouldn't there be a common tool to handle that, let > specified > filesystem fill the tool's callback to seek a node/block and supported fields > can be fuzzed in inode. It can help to avoid redundant work whenever Linux > welcomes a new filesystem The reason why there needs to be at least some file system specific code for fuzz testing is because for efficiency's sake, you don't want to fuzz every single bit in the file system, but just the ones which are most interesting (e.g., the metadata blocks). For file systems which use checksum to protect against accidental corruption, the file system fuzzer needs to also fix up the checksums (since you can be sure malicious attackers will do this). What you *can* do is to make the file system specific portion of the work as small as possible. Great work in this area is Professor Kim's Janus[1][2] and Hydra[2] work. (Hydra is about to be published at SOSP 19, and was partially funded from a Google Faculty Research Work.) [1] https://taesoo.kim/pubs/2019/xu:janus.pdf [2] https://github.com/sslab-gatech/janus [3] https://github.com/sslab-gatech/hydra > > Personally speaking, debugging tool is way more important than a running > > kernel module/fuse. > > It's human trying to write the code, most of time is spent educating > > code readers, thus debugging tool is way more important than dead cold code. I personally find that having a tool like e2fsprogs' debugfs program to be really handy. It's useful for creating regression test images; it's useful for debugging results from fuzzing systems like Janus; it's useful for examining broken file systems extracted from busted Android handsets during dogfood to root cause bugs which escaped xfstests testing; etc. Cheers, - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH] erofs: move erofs out of staging
On Wed, Aug 21, 2019 at 12:35:08AM +0800, Gao Xiang wrote: > > For EROFS, it's a special case since it is a RO fs, and erofs mkfs > will generate reproducable images (which means, for one dir trees, > it only generates exact one result except for build time). Agreed, and given that, doing the fuzzing in your mkfs tool makes perfect sense. I wasn't surprised at all that you chose that path. Cheers, - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC] errno.h: Provide EFSCORRUPTED for everybody
On Wed, Oct 30, 2019 at 09:07:33PM -0400, Valdis Kletnieks wrote: > Three questions: (a) ACK/NAK on this patch, (b) should it be all in one > patch, or one to add to errno.h and 6 patches for 6 filesystems?), and > (c) if one patch, who gets to shepherd it through? Acked-by: Theodore Ts'o ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/3] Drivers: hv: hv_balloon
Some bug fixes for the balloon driver. K. Y. Srinivasan (3): Drivers: hv: hv_balloon: Make adjustments in computing the floor Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver Drivers: hv: hv_balloon: Don't post pressure status from interrupt context drivers/hv/hv_balloon.c | 89 ++ 1 files changed, 73 insertions(+), 16 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/3] Drivers: hv: hv_balloon: Make adjustments in computing the floor
Make adjustments in computing the balloon floor. Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index b958ded..9cbbb83 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -928,9 +928,8 @@ static unsigned long compute_balloon_floor(void) * 12872(1/2) * 512 168(1/4) *2048 360(1/8) -*8192 552(1/32) -* 32768 1320 -* 131072 4392 +*8192 768(1/16) +* 32768 1536(1/32) */ if (totalram_pages < MB2PAGES(128)) min_pages = MB2PAGES(8) + (totalram_pages >> 1); @@ -938,8 +937,10 @@ static unsigned long compute_balloon_floor(void) min_pages = MB2PAGES(40) + (totalram_pages >> 2); else if (totalram_pages < MB2PAGES(2048)) min_pages = MB2PAGES(104) + (totalram_pages >> 3); + else if (totalram_pages < MB2PAGES(8192)) + min_pages = MB2PAGES(256) + (totalram_pages >> 4); else - min_pages = MB2PAGES(296) + (totalram_pages >> 5); + min_pages = MB2PAGES(512) + (totalram_pages >> 5); #undef MB2PAGES return min_pages; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/3] Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver
We support memory hot-add in the Hyper-V balloon driver by hot adding an appropriately sized and aligned region and controlling the on-lining of pages within that region based on the pages that the host wants us to online. We do this because the granularity and alignment requirements in Linux are different from what Windows expects. The state to manage the onlining of pages needs to be correctly protected. Fix this bug. Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c | 69 +++--- 1 files changed, 64 insertions(+), 5 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 9cbbb83..2c610ec 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -533,6 +533,9 @@ struct hv_dynmem_device { */ struct task_struct *thread; + struct mutex ha_region_mutex; + struct completion waiter_event; + /* * A list of hot-add regions. */ @@ -549,7 +552,60 @@ struct hv_dynmem_device { static struct hv_dynmem_device dm_device; static void post_status(struct hv_dynmem_device *dm); + +static void acquire_region_mutex(bool trylock) +{ + if (trylock) { + reinit_completion(&dm_device.waiter_event); + while (!mutex_trylock(&dm_device.ha_region_mutex)) + wait_for_completion(&dm_device.waiter_event); + } else { + mutex_lock(&dm_device.ha_region_mutex); + } +} + +static void release_region_mutex(bool trylock) +{ + if (trylock) { + mutex_unlock(&dm_device.ha_region_mutex); + } else { + mutex_unlock(&dm_device.ha_region_mutex); + complete(&dm_device.waiter_event); + } +} + + #ifdef CONFIG_MEMORY_HOTPLUG +static int hv_memory_notifier(struct notifier_block *nb, unsigned long val, + void *v) +{ + switch (val) { + case MEM_GOING_ONLINE: + acquire_region_mutex(true); + break; + + case MEM_ONLINE: + case MEM_CANCEL_ONLINE: + release_region_mutex(true); + if (dm_device.ha_waiting) { + dm_device.ha_waiting = false; + complete(&dm_device.ol_waitevent); + } + break; + + case MEM_GOING_OFFLINE: + case MEM_OFFLINE: + case MEM_CANCEL_OFFLINE: + break; + } + return NOTIFY_OK; +} + +static struct notifier_block hv_memory_nb = { + .notifier_call = hv_memory_notifier, + .priority = 0 +}; + static void hv_bring_pgs_online(unsigned long start_pfn, unsigned long size) { @@ -591,6 +647,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, init_completion(&dm_device.ol_waitevent); dm_device.ha_waiting = true; + release_region_mutex(false); nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); ret = add_memory(nid, PFN_PHYS((start_pfn)), (HA_CHUNK << PAGE_SHIFT)); @@ -619,6 +676,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, * have not been "onlined" within the allowed time. */ wait_for_completion_timeout(&dm_device.ol_waitevent, 5*HZ); + acquire_region_mutex(false); post_status(&dm_device); } @@ -632,11 +690,6 @@ static void hv_online_page(struct page *pg) unsigned long cur_start_pgp; unsigned long cur_end_pgp; - if (dm_device.ha_waiting) { - dm_device.ha_waiting = false; - complete(&dm_device.ol_waitevent); - } - list_for_each(cur, &dm_device.ha_region_list) { has = list_entry(cur, struct hv_hotadd_state, list); cur_start_pgp = (unsigned long) @@ -834,6 +887,7 @@ static void hot_add_req(struct work_struct *dummy) resp.hdr.size = sizeof(struct dm_hot_add_response); #ifdef CONFIG_MEMORY_HOTPLUG + acquire_region_mutex(false); pg_start = dm->ha_wrk.ha_page_range.finfo.start_page; pfn_cnt = dm->ha_wrk.ha_page_range.finfo.page_cnt; @@ -865,6 +919,7 @@ static void hot_add_req(struct work_struct *dummy) if (do_hot_add) resp.page_count = process_hot_add(pg_start, pfn_cnt, rg_start, rg_sz); + release_region_mutex(false); #endif /* * The result field of the response structure has the @@ -1388,7 +1443,9 @@ static int balloon_probe(struct hv_device *dev, dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN7; init_completion(&dm_device.host_event); init_completion(&dm_device.config_event); + init_completion(&dm_device.waiter_event); IN
[PATCH 3/3] Drivers: hv: hv_balloon: Don't post pressure status from interrupt context
We currently release memory (balloon down) in the interrupt context and we also post memory status while releasing memory. Rather than posting the status in the interrupt context, wakeup the status posting thread to post the status. This will address the inconsistent lock state that Sitsofe Wheeler reported: http://lkml.iu.edu/hypermail/linux/kernel/1411.1/00075.html Signed-off-by: K. Y. Srinivasan Reported-by: Sitsofe Wheeler --- drivers/hv/hv_balloon.c | 11 --- 1 files changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 2c610ec..afdb0d5 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1227,7 +1227,7 @@ static void balloon_down(struct hv_dynmem_device *dm, for (i = 0; i < range_count; i++) { free_balloon_pages(dm, &range_array[i]); - post_status(&dm_device); + complete(&dm_device.config_event); } if (req->more_pages == 1) @@ -1251,19 +1251,16 @@ static void balloon_onchannelcallback(void *context); static int dm_thread_func(void *dm_dev) { struct hv_dynmem_device *dm = dm_dev; - int t; while (!kthread_should_stop()) { - t = wait_for_completion_interruptible_timeout( + wait_for_completion_interruptible_timeout( &dm_device.config_event, 1*HZ); /* * The host expects us to post information on the memory * pressure every second. */ - - if (t == 0) - post_status(dm); - + reinit_completion(&dm_device.config_event); + post_status(dm); } return 0; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] X86: Mark the Hyper-V clocksource as being continuous
The clocksource based on Hyper-V per-partition reference count MSR is continuous. Mark it accordingly. Signed-off-by: K. Y. Srinivasan cc: sta...@vger.kernel.org --- arch/x86/kernel/cpu/mshyperv.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index a450373..939155f 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -107,6 +107,7 @@ static struct clocksource hyperv_cs = { .rating = 400, /* use this when running on Hyperv*/ .read = read_hv_clock, .mask = CLOCKSOURCE_MASK(64), + .flags = CLOCK_SOURCE_IS_CONTINUOUS, }; static void __init ms_hyperv_init_platform(void) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 0/3] Drivers: hv: hv_balloon
Some bug fixes for the balloon driver. In this version, based on Dan Carpenter's comment, I have added some additional information to the change log. K. Y. Srinivasan (3): Drivers: hv: hv_balloon: Make adjustments in computing the floor Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver Drivers: hv: hv_balloon: Don't post pressure status from interrupt context drivers/hv/hv_balloon.c | 89 ++ 1 files changed, 73 insertions(+), 16 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 1/3] Drivers: hv: hv_balloon: Make adjustments in computing the floor
Make adjustments in computing the balloon floor. The current computation of the balloon floor was not appropriate for virtual machines with more than 10 GB of assigned memory - we would get into situations where the host would agressively balloon down the guest and leave the guest in an unusable state. This patch fixes the issue by raising the floor. Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index b958ded..9cbbb83 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -928,9 +928,8 @@ static unsigned long compute_balloon_floor(void) * 12872(1/2) * 512 168(1/4) *2048 360(1/8) -*8192 552(1/32) -* 32768 1320 -* 131072 4392 +*8192 768(1/16) +* 32768 1536(1/32) */ if (totalram_pages < MB2PAGES(128)) min_pages = MB2PAGES(8) + (totalram_pages >> 1); @@ -938,8 +937,10 @@ static unsigned long compute_balloon_floor(void) min_pages = MB2PAGES(40) + (totalram_pages >> 2); else if (totalram_pages < MB2PAGES(2048)) min_pages = MB2PAGES(104) + (totalram_pages >> 3); + else if (totalram_pages < MB2PAGES(8192)) + min_pages = MB2PAGES(256) + (totalram_pages >> 4); else - min_pages = MB2PAGES(296) + (totalram_pages >> 5); + min_pages = MB2PAGES(512) + (totalram_pages >> 5); #undef MB2PAGES return min_pages; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 2/3] Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver
We support memory hot-add in the Hyper-V balloon driver by hot adding an appropriately sized and aligned region and controlling the on-lining of pages within that region based on the pages that the host wants us to online. We do this because the granularity and alignment requirements in Linux are different from what Windows expects. The state to manage the onlining of pages needs to be correctly protected. Fix this bug. Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c | 69 +++--- 1 files changed, 64 insertions(+), 5 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 9cbbb83..2c610ec 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -533,6 +533,9 @@ struct hv_dynmem_device { */ struct task_struct *thread; + struct mutex ha_region_mutex; + struct completion waiter_event; + /* * A list of hot-add regions. */ @@ -549,7 +552,60 @@ struct hv_dynmem_device { static struct hv_dynmem_device dm_device; static void post_status(struct hv_dynmem_device *dm); + +static void acquire_region_mutex(bool trylock) +{ + if (trylock) { + reinit_completion(&dm_device.waiter_event); + while (!mutex_trylock(&dm_device.ha_region_mutex)) + wait_for_completion(&dm_device.waiter_event); + } else { + mutex_lock(&dm_device.ha_region_mutex); + } +} + +static void release_region_mutex(bool trylock) +{ + if (trylock) { + mutex_unlock(&dm_device.ha_region_mutex); + } else { + mutex_unlock(&dm_device.ha_region_mutex); + complete(&dm_device.waiter_event); + } +} + + #ifdef CONFIG_MEMORY_HOTPLUG +static int hv_memory_notifier(struct notifier_block *nb, unsigned long val, + void *v) +{ + switch (val) { + case MEM_GOING_ONLINE: + acquire_region_mutex(true); + break; + + case MEM_ONLINE: + case MEM_CANCEL_ONLINE: + release_region_mutex(true); + if (dm_device.ha_waiting) { + dm_device.ha_waiting = false; + complete(&dm_device.ol_waitevent); + } + break; + + case MEM_GOING_OFFLINE: + case MEM_OFFLINE: + case MEM_CANCEL_OFFLINE: + break; + } + return NOTIFY_OK; +} + +static struct notifier_block hv_memory_nb = { + .notifier_call = hv_memory_notifier, + .priority = 0 +}; + static void hv_bring_pgs_online(unsigned long start_pfn, unsigned long size) { @@ -591,6 +647,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, init_completion(&dm_device.ol_waitevent); dm_device.ha_waiting = true; + release_region_mutex(false); nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); ret = add_memory(nid, PFN_PHYS((start_pfn)), (HA_CHUNK << PAGE_SHIFT)); @@ -619,6 +676,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, * have not been "onlined" within the allowed time. */ wait_for_completion_timeout(&dm_device.ol_waitevent, 5*HZ); + acquire_region_mutex(false); post_status(&dm_device); } @@ -632,11 +690,6 @@ static void hv_online_page(struct page *pg) unsigned long cur_start_pgp; unsigned long cur_end_pgp; - if (dm_device.ha_waiting) { - dm_device.ha_waiting = false; - complete(&dm_device.ol_waitevent); - } - list_for_each(cur, &dm_device.ha_region_list) { has = list_entry(cur, struct hv_hotadd_state, list); cur_start_pgp = (unsigned long) @@ -834,6 +887,7 @@ static void hot_add_req(struct work_struct *dummy) resp.hdr.size = sizeof(struct dm_hot_add_response); #ifdef CONFIG_MEMORY_HOTPLUG + acquire_region_mutex(false); pg_start = dm->ha_wrk.ha_page_range.finfo.start_page; pfn_cnt = dm->ha_wrk.ha_page_range.finfo.page_cnt; @@ -865,6 +919,7 @@ static void hot_add_req(struct work_struct *dummy) if (do_hot_add) resp.page_count = process_hot_add(pg_start, pfn_cnt, rg_start, rg_sz); + release_region_mutex(false); #endif /* * The result field of the response structure has the @@ -1388,7 +1443,9 @@ static int balloon_probe(struct hv_device *dev, dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN7; init_completion(&dm_device.host_event); init_completion(&dm_device.config_event); + init_completion(&dm_device.waiter_event); IN
[PATCH V2 3/3] Drivers: hv: hv_balloon: Don't post pressure status from interrupt context
We currently release memory (balloon down) in the interrupt context and we also post memory status while releasing memory. Rather than posting the status in the interrupt context, wakeup the status posting thread to post the status. This will address the inconsistent lock state that Sitsofe Wheeler reported: http://lkml.iu.edu/hypermail/linux/kernel/1411.1/00075.html Signed-off-by: K. Y. Srinivasan Reported-by: Sitsofe Wheeler --- drivers/hv/hv_balloon.c | 11 --- 1 files changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 2c610ec..afdb0d5 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1227,7 +1227,7 @@ static void balloon_down(struct hv_dynmem_device *dm, for (i = 0; i < range_count; i++) { free_balloon_pages(dm, &range_array[i]); - post_status(&dm_device); + complete(&dm_device.config_event); } if (req->more_pages == 1) @@ -1251,19 +1251,16 @@ static void balloon_onchannelcallback(void *context); static int dm_thread_func(void *dm_dev) { struct hv_dynmem_device *dm = dm_dev; - int t; while (!kthread_should_stop()) { - t = wait_for_completion_interruptible_timeout( + wait_for_completion_interruptible_timeout( &dm_device.config_event, 1*HZ); /* * The host expects us to post information on the memory * pressure every second. */ - - if (t == 0) - post_status(dm); - + reinit_completion(&dm_device.config_event); + post_status(dm); } return 0; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] Drivers: hv: vmbus: Implement a clockevent device
Signed-off-by: K. Y. Srinivasan --- arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/hv.c| 78 drivers/hv/hyperv_vmbus.h | 21 ++ drivers/hv/vmbus_drv.c | 40 +- 4 files changed, 148 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index 462efe7..90c458e 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -187,6 +187,17 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT 0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT 0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT 0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT 0x40B7 #define HV_X64_MSR_HYPERCALL_ENABLE0x0001 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT12 diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 3e4235c..e2749c0 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -28,7 +28,9 @@ #include #include #include +#include #include +#include #include "hyperv_vmbus.h" /* The one and only */ @@ -37,6 +39,10 @@ struct hv_context hv_context = { .hypercall_page = NULL, }; +#define HV_TIMER_FREQUENCY (10 * 1000 * 1000) /* 100ns period */ +#define HV_MAX_MAX_DELTA_TICKS 0x +#define HV_MIN_DELTA_TICKS 1 + /* * query_hypervisor_info - Get version info of the windows hypervisor */ @@ -144,6 +150,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.clk_evt, 0, + sizeof(void *) * NR_CPUS); max_leaf = query_hypervisor_info(); @@ -258,10 +266,63 @@ u16 hv_signal_event(void *con_id) return status; } +static int hv_ce_set_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + cycle_t current_tick; + + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT); + + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); + current_tick += delta; + wrmsrl(HV_X64_MSR_STIMER0_COUNT, current_tick); + return 0; +} + +static void hv_ce_setmode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + union hv_timer_config timer_cfg; + + switch (mode) { + case CLOCK_EVT_MODE_PERIODIC: + /* unsupported */ + break; + + case CLOCK_EVT_MODE_ONESHOT: + timer_cfg.enable = 1; + timer_cfg.auto_enable = 1; + timer_cfg.sintx = VMBUS_MESSAGE_SINT; + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, timer_cfg.as_uint64); + break; + + case CLOCK_EVT_MODE_UNUSED: + case CLOCK_EVT_MODE_SHUTDOWN: + wrmsrl(HV_X64_MSR_STIMER0_COUNT, 0); + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, 0); + break; + case CLOCK_EVT_MODE_RESUME: + break; + } +} + +static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) +{ + dev->name = "Hyper-V clockevent"; + dev->features = CLOCK_EVT_FEAT_ONESHOT; + dev->cpumask = cpumask_of(cpu); + dev->rating = 1000, + dev->owner = THIS_MODULE, + + dev->set_mode = hv_ce_setmode; + dev->set_next_event = hv_ce_set_next_event; +} + int hv_synic_alloc(void) { size_t size = sizeof(struct tasklet_struct); + size_t ced_size = sizeof(struct clock_event_device); int cpu; for_each_online_cpu(cpu) { @@ -272,6 +333,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); + if (hv_context.clk_evt[cpu] == NULL) { + pr_err("Unable to allocate clock event device\n"); + goto err; + } + hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); + hv_context.synic_message_page[cpu] = (void *)get_zeroed_page(GFP_ATOMIC); @@ -305,6 +373,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dpc[cpu]); + kfree(hv_context.clk_evt[cpu]); if (hv_context.synic_event_page[cpu]) free_page((unsigned long)hv_context.synic_event_page[cpu]); if (hv_context.synic_mess
[PATCH V2 1/1] Drivers: hv: vmbus: Implement a clockevent device
Implement a clockevent device based on the timer support available on Hyper-V. Signed-off-by: K. Y. Srinivasan --- arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/hv.c| 78 drivers/hv/hyperv_vmbus.h | 21 ++ drivers/hv/vmbus_drv.c | 40 +- 4 files changed, 148 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index 462efe7..90c458e 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -187,6 +187,17 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT 0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT 0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT 0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT 0x40B7 #define HV_X64_MSR_HYPERCALL_ENABLE0x0001 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT12 diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 3e4235c..e2749c0 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -28,7 +28,9 @@ #include #include #include +#include #include +#include #include "hyperv_vmbus.h" /* The one and only */ @@ -37,6 +39,10 @@ struct hv_context hv_context = { .hypercall_page = NULL, }; +#define HV_TIMER_FREQUENCY (10 * 1000 * 1000) /* 100ns period */ +#define HV_MAX_MAX_DELTA_TICKS 0x +#define HV_MIN_DELTA_TICKS 1 + /* * query_hypervisor_info - Get version info of the windows hypervisor */ @@ -144,6 +150,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.clk_evt, 0, + sizeof(void *) * NR_CPUS); max_leaf = query_hypervisor_info(); @@ -258,10 +266,63 @@ u16 hv_signal_event(void *con_id) return status; } +static int hv_ce_set_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + cycle_t current_tick; + + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT); + + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); + current_tick += delta; + wrmsrl(HV_X64_MSR_STIMER0_COUNT, current_tick); + return 0; +} + +static void hv_ce_setmode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + union hv_timer_config timer_cfg; + + switch (mode) { + case CLOCK_EVT_MODE_PERIODIC: + /* unsupported */ + break; + + case CLOCK_EVT_MODE_ONESHOT: + timer_cfg.enable = 1; + timer_cfg.auto_enable = 1; + timer_cfg.sintx = VMBUS_MESSAGE_SINT; + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, timer_cfg.as_uint64); + break; + + case CLOCK_EVT_MODE_UNUSED: + case CLOCK_EVT_MODE_SHUTDOWN: + wrmsrl(HV_X64_MSR_STIMER0_COUNT, 0); + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, 0); + break; + case CLOCK_EVT_MODE_RESUME: + break; + } +} + +static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) +{ + dev->name = "Hyper-V clockevent"; + dev->features = CLOCK_EVT_FEAT_ONESHOT; + dev->cpumask = cpumask_of(cpu); + dev->rating = 1000, + dev->owner = THIS_MODULE, + + dev->set_mode = hv_ce_setmode; + dev->set_next_event = hv_ce_set_next_event; +} + int hv_synic_alloc(void) { size_t size = sizeof(struct tasklet_struct); + size_t ced_size = sizeof(struct clock_event_device); int cpu; for_each_online_cpu(cpu) { @@ -272,6 +333,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); + if (hv_context.clk_evt[cpu] == NULL) { + pr_err("Unable to allocate clock event device\n"); + goto err; + } + hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); + hv_context.synic_message_page[cpu] = (void *)get_zeroed_page(GFP_ATOMIC); @@ -305,6 +373,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dpc[cpu]); + kfree(hv_context.clk_evt[cpu]); if (hv_context.synic_event_page[cpu]) free_page((unsig
[PATCH V3 1/1] Drivers: hv: vmbus: Implement a clockevent device
Implement a clockevent device based on the timer support available on Hyper-V. In This version of the patch I have addressed Jason's review comments. Signed-off-by: K. Y. Srinivasan Reviewed-by: Jason Wang --- arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/hv.c| 78 drivers/hv/hyperv_vmbus.h | 21 ++ drivers/hv/vmbus_drv.c | 37 - 4 files changed, 145 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index 462efe7..90c458e 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -187,6 +187,17 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT 0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT 0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT 0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT 0x40B7 #define HV_X64_MSR_HYPERCALL_ENABLE0x0001 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT12 diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 3e4235c..e2749c0 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -28,7 +28,9 @@ #include #include #include +#include #include +#include #include "hyperv_vmbus.h" /* The one and only */ @@ -37,6 +39,10 @@ struct hv_context hv_context = { .hypercall_page = NULL, }; +#define HV_TIMER_FREQUENCY (10 * 1000 * 1000) /* 100ns period */ +#define HV_MAX_MAX_DELTA_TICKS 0x +#define HV_MIN_DELTA_TICKS 1 + /* * query_hypervisor_info - Get version info of the windows hypervisor */ @@ -144,6 +150,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.clk_evt, 0, + sizeof(void *) * NR_CPUS); max_leaf = query_hypervisor_info(); @@ -258,10 +266,63 @@ u16 hv_signal_event(void *con_id) return status; } +static int hv_ce_set_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + cycle_t current_tick; + + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT); + + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); + current_tick += delta; + wrmsrl(HV_X64_MSR_STIMER0_COUNT, current_tick); + return 0; +} + +static void hv_ce_setmode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + union hv_timer_config timer_cfg; + + switch (mode) { + case CLOCK_EVT_MODE_PERIODIC: + /* unsupported */ + break; + + case CLOCK_EVT_MODE_ONESHOT: + timer_cfg.enable = 1; + timer_cfg.auto_enable = 1; + timer_cfg.sintx = VMBUS_MESSAGE_SINT; + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, timer_cfg.as_uint64); + break; + + case CLOCK_EVT_MODE_UNUSED: + case CLOCK_EVT_MODE_SHUTDOWN: + wrmsrl(HV_X64_MSR_STIMER0_COUNT, 0); + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, 0); + break; + case CLOCK_EVT_MODE_RESUME: + break; + } +} + +static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) +{ + dev->name = "Hyper-V clockevent"; + dev->features = CLOCK_EVT_FEAT_ONESHOT; + dev->cpumask = cpumask_of(cpu); + dev->rating = 1000, + dev->owner = THIS_MODULE, + + dev->set_mode = hv_ce_setmode; + dev->set_next_event = hv_ce_set_next_event; +} + int hv_synic_alloc(void) { size_t size = sizeof(struct tasklet_struct); + size_t ced_size = sizeof(struct clock_event_device); int cpu; for_each_online_cpu(cpu) { @@ -272,6 +333,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); + if (hv_context.clk_evt[cpu] == NULL) { + pr_err("Unable to allocate clock event device\n"); + goto err; + } + hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); + hv_context.synic_message_page[cpu] = (void *)get_zeroed_page(GFP_ATOMIC); @@ -305,6 +373,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dp
[PATCH 0/2] Drivers: hv: hv_balloon: Fix a deadlock in the hot-add path.
Fix a deadlock in the hot-add path in the Hyper-V balloon driver. K. Y. Srinivasan (2): Drivers: base: core: Export functions to lock/unlock device hotplug lock Drivers: hv: balloon: Fix the deadlock issue in the memory hot-add code drivers/base/core.c |2 ++ drivers/hv/hv_balloon.c |4 2 files changed, 6 insertions(+), 0 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2] Drivers: hv: balloon: Fix the deadlock issue in the memory hot-add code
Andy Whitcroft initially saw this deadlock. We have seen this as well. Here is the original description of the problem (and a potential solution) from Andy: https://lkml.org/lkml/2014/3/14/451 Here is an excerpt from that mail: "We are seeing machines lockup with what appears to be an ABBA deadlock in the memory hotplug system. These are from the 3.13.6 based Ubuntu kernels. The hv_balloon driver is adding memory using add_memory() which takes the hotplug lock, and then emits a udev event, and then attempts to lock the sysfs device. In response to the udev event udev opens the sysfs device and locks it, then attempts to grab the hotplug lock to online the memory. This seems to be inverted nesting in the two cases, leading to the hangs below: [ 240.608612] INFO: task kworker/0:2:861 blocked for more than 120 seconds. [ 240.608705] INFO: task systemd-udevd:1906 blocked for more than 120 seconds. I note that the device hotplug locking allows complete retries (via ERESTARTSYS) and if we could detect this at the online stage it could be used to get us out. But before I go down this road I wanted to make sure I am reading this right. Or indeed if the hv_balloon driver is just doing this wrong." This patch is based on the suggestion from Yasuaki Ishimatsu Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index afdb0d5..f525a62 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -649,8 +650,11 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, release_region_mutex(false); nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); + + lock_device_hotplug(); ret = add_memory(nid, PFN_PHYS((start_pfn)), (HA_CHUNK << PAGE_SHIFT)); + unlock_device_hotplug(); if (ret) { pr_info("hot_add memory failed error is %d\n", ret); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2] Drivers: base: core: Export functions to lock/unlock device hotplug lock
The Hyper-V balloon driver does memory hot-add. The device_hotplug_lock is designed to address AB BA deadlock issues between the hot-add path and the sysfs path. Export the APIs to acquire and release the device_hotplug_lock for use by loadable modules that want to hot-add memory or CPU. Signed-off-by: K. Y. Srinivasan --- drivers/base/core.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 97e2baf..b3073af 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -55,11 +55,13 @@ void lock_device_hotplug(void) { mutex_lock(&device_hotplug_lock); } +EXPORT_SYMBOL_GPL(lock_device_hotplug); void unlock_device_hotplug(void) { mutex_unlock(&device_hotplug_lock); } +EXPORT_SYMBOL_GPL(unlock_device_hotplug); int lock_device_hotplug_sysfs(void) { -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] Drivers: hv: vmbus: Implement a clockevent device
Implement a clockevent device based on the timer support available on Hyper-V. In this version of the patch I have addressed Jason's review comments. Signed-off-by: K. Y. Srinivasan Reviewed-by: Jason Wang --- arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/hv.c| 78 drivers/hv/hyperv_vmbus.h | 21 ++ drivers/hv/vmbus_drv.c | 37 - 4 files changed, 145 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index 462efe7..90c458e 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -187,6 +187,17 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT 0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT 0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT 0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT 0x40B7 #define HV_X64_MSR_HYPERCALL_ENABLE0x0001 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT12 diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 3e4235c..50e51a5 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -28,7 +28,9 @@ #include #include #include +#include #include +#include #include "hyperv_vmbus.h" /* The one and only */ @@ -37,6 +39,10 @@ struct hv_context hv_context = { .hypercall_page = NULL, }; +#define HV_TIMER_FREQUENCY (10 * 1000 * 1000) /* 100ns period */ +#define HV_MAX_MAX_DELTA_TICKS 0x +#define HV_MIN_DELTA_TICKS 1 + /* * query_hypervisor_info - Get version info of the windows hypervisor */ @@ -144,6 +150,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.clk_evt, 0, + sizeof(void *) * NR_CPUS); max_leaf = query_hypervisor_info(); @@ -258,10 +266,63 @@ u16 hv_signal_event(void *con_id) return status; } +static int hv_ce_set_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + cycle_t current_tick; + + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT); + + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); + current_tick += delta; + wrmsrl(HV_X64_MSR_STIMER0_COUNT, current_tick); + return 0; +} + +static void hv_ce_setmode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + union hv_timer_config timer_cfg; + + switch (mode) { + case CLOCK_EVT_MODE_PERIODIC: + /* unsupported */ + break; + + case CLOCK_EVT_MODE_ONESHOT: + timer_cfg.enable = 1; + timer_cfg.auto_enable = 1; + timer_cfg.sintx = VMBUS_MESSAGE_SINT; + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, timer_cfg.as_uint64); + break; + + case CLOCK_EVT_MODE_UNUSED: + case CLOCK_EVT_MODE_SHUTDOWN: + wrmsrl(HV_X64_MSR_STIMER0_COUNT, 0); + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, 0); + break; + case CLOCK_EVT_MODE_RESUME: + break; + } +} + +static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) +{ + dev->name = "Hyper-V clockevent"; + dev->features = CLOCK_EVT_FEAT_ONESHOT; + dev->cpumask = cpumask_of(cpu); + dev->rating = 1000; + dev->owner = THIS_MODULE; + + dev->set_mode = hv_ce_setmode; + dev->set_next_event = hv_ce_set_next_event; +} + int hv_synic_alloc(void) { size_t size = sizeof(struct tasklet_struct); + size_t ced_size = sizeof(struct clock_event_device); int cpu; for_each_online_cpu(cpu) { @@ -272,6 +333,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); + if (hv_context.clk_evt[cpu] == NULL) { + pr_err("Unable to allocate clock event device\n"); + goto err; + } + hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); + hv_context.synic_message_page[cpu] = (void *)get_zeroed_page(GFP_ATOMIC); @@ -305,6 +373,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dp
[PATCH V4 1/1] Drivers: hv: vmbus: Implement a clockevent device
Implement a clockevent device based on the timer support available on Hyper-V. In this version of the patch I have addressed Jason's review comments. Signed-off-by: K. Y. Srinivasan Reviewed-by: Jason Wang --- arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/hv.c| 78 drivers/hv/hyperv_vmbus.h | 21 ++ drivers/hv/vmbus_drv.c | 37 - 4 files changed, 145 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index 462efe7..90c458e 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -187,6 +187,17 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT 0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT 0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT 0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT 0x40B7 #define HV_X64_MSR_HYPERCALL_ENABLE0x0001 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT12 diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 3e4235c..50e51a5 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -28,7 +28,9 @@ #include #include #include +#include #include +#include #include "hyperv_vmbus.h" /* The one and only */ @@ -37,6 +39,10 @@ struct hv_context hv_context = { .hypercall_page = NULL, }; +#define HV_TIMER_FREQUENCY (10 * 1000 * 1000) /* 100ns period */ +#define HV_MAX_MAX_DELTA_TICKS 0x +#define HV_MIN_DELTA_TICKS 1 + /* * query_hypervisor_info - Get version info of the windows hypervisor */ @@ -144,6 +150,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.clk_evt, 0, + sizeof(void *) * NR_CPUS); max_leaf = query_hypervisor_info(); @@ -258,10 +266,63 @@ u16 hv_signal_event(void *con_id) return status; } +static int hv_ce_set_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + cycle_t current_tick; + + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT); + + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); + current_tick += delta; + wrmsrl(HV_X64_MSR_STIMER0_COUNT, current_tick); + return 0; +} + +static void hv_ce_setmode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + union hv_timer_config timer_cfg; + + switch (mode) { + case CLOCK_EVT_MODE_PERIODIC: + /* unsupported */ + break; + + case CLOCK_EVT_MODE_ONESHOT: + timer_cfg.enable = 1; + timer_cfg.auto_enable = 1; + timer_cfg.sintx = VMBUS_MESSAGE_SINT; + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, timer_cfg.as_uint64); + break; + + case CLOCK_EVT_MODE_UNUSED: + case CLOCK_EVT_MODE_SHUTDOWN: + wrmsrl(HV_X64_MSR_STIMER0_COUNT, 0); + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, 0); + break; + case CLOCK_EVT_MODE_RESUME: + break; + } +} + +static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) +{ + dev->name = "Hyper-V clockevent"; + dev->features = CLOCK_EVT_FEAT_ONESHOT; + dev->cpumask = cpumask_of(cpu); + dev->rating = 1000; + dev->owner = THIS_MODULE; + + dev->set_mode = hv_ce_setmode; + dev->set_next_event = hv_ce_set_next_event; +} + int hv_synic_alloc(void) { size_t size = sizeof(struct tasklet_struct); + size_t ced_size = sizeof(struct clock_event_device); int cpu; for_each_online_cpu(cpu) { @@ -272,6 +333,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); + if (hv_context.clk_evt[cpu] == NULL) { + pr_err("Unable to allocate clock event device\n"); + goto err; + } + hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); + hv_context.synic_message_page[cpu] = (void *)get_zeroed_page(GFP_ATOMIC); @@ -305,6 +373,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dp
[PATCH 1/1] Drivers: hv: vmbus: Fix a bug in vmbus_establish_gpadl()
Fix a bug in vmbus_establish_gpadl(). I would like to thank Michael Brown for seeing this bug. Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 433f72a..c76ffbe 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -366,8 +366,8 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, unsigned long flags; int ret = 0; - next_gpadl_handle = atomic_read(&vmbus_connection.next_gpadl_handle); - atomic_inc(&vmbus_connection.next_gpadl_handle); + next_gpadl_handle = + (atomic_inc_return(&vmbus_connection.next_gpadl_handle) - 1); ret = create_gpadl_header(kbuffer, size, &msginfo, &msgcount); if (ret) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 1/1] Drivers: hv: vmbus: Fix a bug in vmbus_establish_gpadl()
Fix a bug in vmbus_establish_gpadl(). I would like to thank Michael Brown for seeing this bug. In this version, I have added the Reported-by tag. Signed-off-by: K. Y. Srinivasan Reported-by: Michael Brown --- drivers/hv/channel.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 433f72a..c76ffbe 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -366,8 +366,8 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, unsigned long flags; int ret = 0; - next_gpadl_handle = atomic_read(&vmbus_connection.next_gpadl_handle); - atomic_inc(&vmbus_connection.next_gpadl_handle); + next_gpadl_handle = + (atomic_inc_return(&vmbus_connection.next_gpadl_handle) - 1); ret = create_gpadl_header(kbuffer, size, &msginfo, &msgcount); if (ret) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V3 1/1] Drivers: hv: vmbus: Fix a bug in vmbus_establish_gpadl()
Correctly compute the local (gpadl) handle. I would like to thank Michael Brown for seeing this bug. Signed-off-by: K. Y. Srinivasan Reported-by: Michael Brown --- Changes in V2: Added the Reported-by tag. Changes in V3: Cleaned up the commit log. drivers/hv/channel.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 433f72a..c76ffbe 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -366,8 +366,8 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, unsigned long flags; int ret = 0; - next_gpadl_handle = atomic_read(&vmbus_connection.next_gpadl_handle); - atomic_inc(&vmbus_connection.next_gpadl_handle); + next_gpadl_handle = + (atomic_inc_return(&vmbus_connection.next_gpadl_handle) - 1); ret = create_gpadl_header(kbuffer, size, &msginfo, &msgcount); if (ret) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] Drivers: hv: vmbus: Use get_cpu() to get the current CPU
Replace calls for smp_processor_id() to get_cpu() to get the CPU ID of the current CPU. In these instances, there is no correctness issue with regards to preemption, we just need the current CPU ID. Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c |4 +++- drivers/hv/connection.c |6 -- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index d36ce68..a205246 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -813,7 +813,7 @@ cleanup: struct vmbus_channel *vmbus_get_outgoing_channel(struct vmbus_channel *primary) { struct list_head *cur, *tmp; - int cur_cpu = hv_context.vp_index[smp_processor_id()]; + int cur_cpu; struct vmbus_channel *cur_channel; struct vmbus_channel *outgoing_channel = primary; int cpu_distance, new_cpu_distance; @@ -821,6 +821,8 @@ struct vmbus_channel *vmbus_get_outgoing_channel(struct vmbus_channel *primary) if (list_empty(&primary->sc_list)) return outgoing_channel; + cur_cpu = hv_context.vp_index[get_cpu()]; + put_cpu(); list_for_each_safe(cur, tmp, &primary->sc_list) { cur_channel = list_entry(cur, struct vmbus_channel, sc_list); if (cur_channel->state != CHANNEL_OPENED_STATE) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index e206619..a63a795 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -80,8 +80,10 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo, msg->interrupt_page = virt_to_phys(vmbus_connection.int_page); msg->monitor_page1 = virt_to_phys(vmbus_connection.monitor_pages[0]); msg->monitor_page2 = virt_to_phys(vmbus_connection.monitor_pages[1]); - if (version == VERSION_WIN8_1) - msg->target_vcpu = hv_context.vp_index[smp_processor_id()]; + if (version == VERSION_WIN8_1) { + msg->target_vcpu = hv_context.vp_index[get_cpu()]; + put_cpu(); + } /* * Add to list before we send the request since we may -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] Drivers: hv: vmbus: Support a vmbus API for efficiently sending page arrays
Currently, the API for sending a multi-page buffer over VMBUS is limited to a maximum pfn array of MAX_MULTIPAGE_BUFFER_COUNT. This limitation is not imposed by the host and unnecessarily limits the maximum payload that can be sent. Implement an API that does not have this restriction. Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c | 44 include/linux/hyperv.h | 31 +++ 2 files changed, 75 insertions(+), 0 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index c76ffbe..18c4f23 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -686,6 +686,50 @@ EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer); /* * vmbus_sendpacket_multipagebuffer - Send a multi-page buffer packet * using a GPADL Direct packet type. + * The buffer includes the vmbus descriptor. + */ +int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, + struct vmbus_packet_mpb_array *desc, + u32 desc_size, + void *buffer, u32 bufferlen, u64 requestid) +{ + int ret; + u32 packetlen; + u32 packetlen_aligned; + struct kvec bufferlist[3]; + u64 aligned_data = 0; + bool signal = false; + + packetlen = desc_size + bufferlen; + packetlen_aligned = ALIGN(packetlen, sizeof(u64)); + + /* Setup the descriptor */ + desc->type = VM_PKT_DATA_USING_GPA_DIRECT; + desc->flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; + desc->dataoffset8 = desc_size >> 3; /* in 8-bytes grandularity */ + desc->length8 = (u16)(packetlen_aligned >> 3); + desc->transactionid = requestid; + desc->rangecount = 1; + + bufferlist[0].iov_base = desc; + bufferlist[0].iov_len = desc_size; + bufferlist[1].iov_base = buffer; + bufferlist[1].iov_len = bufferlen; + bufferlist[2].iov_base = &aligned_data; + bufferlist[2].iov_len = (packetlen_aligned - packetlen); + + ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 3, &signal); + + if (ret == 0 && signal) + vmbus_setevent(channel); + + return ret; +} +EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc); + +/* + * vmbus_sendpacket_multipagebuffer - Send a multi-page buffer packet + * using a GPADL Direct packet type. */ int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, struct hv_multipage_buffer *multi_pagebuffer, diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 08cfaff..8615b0d 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -57,6 +57,18 @@ struct hv_multipage_buffer { u64 pfn_array[MAX_MULTIPAGE_BUFFER_COUNT]; }; +/* + * Multiple-page buffer array; the pfn array is variable size: + * The number of entries in the PFN array is determined by + * "len" and "offset". + */ +struct hv_mpb_array { + /* Length and Offset determines the # of pfns in the array */ + u32 len; + u32 offset; + u64 pfn_array[]; +}; + /* 0x18 includes the proprietary packet header */ #define MAX_PAGE_BUFFER_PACKET (0x18 + \ (sizeof(struct hv_page_buffer) * \ @@ -812,6 +824,18 @@ struct vmbus_channel_packet_multipage_buffer { struct hv_multipage_buffer range; } __packed; +/* The format must be the same as struct vmdata_gpa_direct */ +struct vmbus_packet_mpb_array { + u16 type; + u16 dataoffset8; + u16 length8; + u16 flags; + u64 transactionid; + u32 reserved; + u32 rangecount; /* Always 1 in this case */ + struct hv_mpb_array range; +} __packed; + extern int vmbus_open(struct vmbus_channel *channel, u32 send_ringbuffersize, @@ -843,6 +867,13 @@ extern int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, u32 bufferlen, u64 requestid); +extern int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, +struct vmbus_packet_mpb_array *mpb, +u32 desc_size, +void *buffer, +u32 bufferlen, +u64 requestid); + extern int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, u32 size, -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/4] Drivers: scsi: storvsc: Fix miscellaneous issues
The first two patches in this series are a resend; these were submitted some months ago and as far as I know, there were no outstanding issues. Win8 and win8 r2 hosts do support SPC-3 features but claim SPC-2 compliance. This issue is fixed here as well. K. Y. Srinivasan (4): Drivers: scsi: storvsc: In responce to a scan event, scan the host Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed. Drivers: scsi: storvsc: Fix a bug in storvsc limits Drivers: scsi: storvsc: Force SPC-3 compliance on win8 and win8 r2 hosts drivers/scsi/storvsc_drv.c | 71 +++ 1 files changed, 57 insertions(+), 14 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 3/4] Drivers: scsi: storvsc: Fix a bug in storvsc limits
Commit 4cd83ecdac20d30725b4f96e5d7814a1e290bc7e changed the limits to reflect the values on the host. It turns out that WS2008R2 cannot correctly handle these new limits. Fix this bug by setting the limits based on the host. Signed-off-by: K. Y. Srinivasan --- drivers/scsi/storvsc_drv.c | 15 --- 1 files changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index a7163c6..fdc5164 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1782,6 +1782,9 @@ static int storvsc_probe(struct hv_device *device, bool dev_is_ide = ((dev_id->driver_data == IDE_GUID) ? true : false); int target = 0; struct storvsc_device *stor_device; + int max_luns_per_target; + int max_targets; + int max_channels; /* * Based on the windows host we are running on, @@ -1795,12 +1798,18 @@ static int storvsc_probe(struct hv_device *device, vmscsi_size_delta = sizeof(struct vmscsi_win8_extension); vmstor_current_major = VMSTOR_WIN7_MAJOR; vmstor_current_minor = VMSTOR_WIN7_MINOR; + max_luns_per_target = STORVSC_IDE_MAX_LUNS_PER_TARGET; + max_targets = STORVSC_IDE_MAX_TARGETS; + max_channels = STORVSC_IDE_MAX_CHANNELS; break; default: sense_buffer_size = POST_WIN7_STORVSC_SENSE_BUFFER_SIZE; vmscsi_size_delta = 0; vmstor_current_major = VMSTOR_WIN8_MAJOR; vmstor_current_minor = VMSTOR_WIN8_MINOR; + max_luns_per_target = STORVSC_MAX_LUNS_PER_TARGET; + max_targets = STORVSC_MAX_TARGETS; + max_channels = STORVSC_MAX_CHANNELS; break; } @@ -1848,9 +1857,9 @@ static int storvsc_probe(struct hv_device *device, break; case SCSI_GUID: - host->max_lun = STORVSC_MAX_LUNS_PER_TARGET; - host->max_id = STORVSC_MAX_TARGETS; - host->max_channel = STORVSC_MAX_CHANNELS - 1; + host->max_lun = max_luns_per_target; + host->max_id = max_targets; + host->max_channel = max_channels - 1; break; default: -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/4] Drivers: scsi: storvsc: In responce to a scan event, scan the host
The virtual HBA that storvsc implements can support multiple channels and targets. So, scan the host when the host notifies that a scan is needed. Signed-off-by: K. Y. Srinivasan --- drivers/scsi/storvsc_drv.c | 19 +++ 1 files changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index e3ba251..0a96fef 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -426,21 +426,16 @@ done: kfree(wrk); } -static void storvsc_bus_scan(struct work_struct *work) +static void storvsc_host_scan(struct work_struct *work) { struct storvsc_scan_work *wrk; - int id, order_id; + struct Scsi_Host *host; wrk = container_of(work, struct storvsc_scan_work, work); - for (id = 0; id < wrk->host->max_id; ++id) { - if (wrk->host->reverse_ordering) - order_id = wrk->host->max_id - id - 1; - else - order_id = id; - - scsi_scan_target(&wrk->host->shost_gendev, 0, - order_id, SCAN_WILD_CARD, 1); - } + host = wrk->host; + + scsi_scan_host(host); + kfree(wrk); } @@ -1198,7 +1193,7 @@ static void storvsc_on_receive(struct hv_device *device, if (!work) return; - INIT_WORK(&work->work, storvsc_bus_scan); + INIT_WORK(&work->work, storvsc_host_scan); work->host = stor_device->host; schedule_work(&work->work); break; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 4/4] Drivers: scsi: storvsc: Force SPC-3 compliance on win8 and win8 r2 hosts
On win8 and win8 r2 hosts force SPC-3 compliance for MSFT virtual disks. Ubuntu has been carrying a similar patch outside the tree for a while now. Starting with win10, the host will support SPC-3 compliance. Based on all the testing that has been done on win8 and win8 r2 hosts, we are comfortable claiming SPC-3 compliance on these hosts as well. This will enable TRIM support on these hosts. Suggested by: James Bottomley Signed-off-by: K. Y. Srinivasan --- drivers/scsi/storvsc_drv.c | 13 + 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index fdc5164..7487e07 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1468,6 +1468,19 @@ static int storvsc_device_configure(struct scsi_device *sdevice) */ sdevice->sdev_bflags |= msft_blist_flags; + /* +* If the host is WIN8 or WIN8 R2, claim conformance to SPC-3 +* if the device is a MSFT virtual device. +*/ + if (!strncmp(sdevice->vendor, "Msft", 4)) { + switch (vmbus_proto_version) { + case VERSION_WIN8: + case VERSION_WIN8_1: + sdevice->scsi_level = SCSI_SPC_3; + break; + } + } + return 0; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/4] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
The host asks the guest to scan when a LUN is removed or added. The only way a guest can identify the removed LUN is when an I/O is attempted on a removed LUN - the SRB status code indicates that the LUN is invalid. We currently handle this SRB status and remove the device. Rather than waiting for an I/O to remove the device, force the discovery of LUNs that may have been removed prior to discovering LUNs that may have been added. Signed-off-by: K. Y. Srinivasan --- drivers/scsi/storvsc_drv.c | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 0a96fef..a7163c6 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -430,10 +430,36 @@ static void storvsc_host_scan(struct work_struct *work) { struct storvsc_scan_work *wrk; struct Scsi_Host *host; + struct scsi_device *sdev; + unsigned long flags; wrk = container_of(work, struct storvsc_scan_work, work); host = wrk->host; + /* +* Before scanning the host, first check to see if any of the +* currrently known devices have been hot removed. We issue a +* "unit ready" command against all currently known devices. +* This I/O will result in an error for devices that have been +* removed. As part of handling the I/O error, we remove the device. +* +* When a LUN is added or removed, the host sends us a signal to +* scan the host. Thus we are forced to discover the LUNs that +* may have been removed this way. +*/ + mutex_lock(&host->scan_mutex); + spin_lock_irqsave(host->host_lock, flags); + list_for_each_entry(sdev, &host->__devices, siblings) { + spin_unlock_irqrestore(host->host_lock, flags); + scsi_test_unit_ready(sdev, 1, 1, NULL); + spin_lock_irqsave(host->host_lock, flags); + continue; + } + spin_unlock_irqrestore(host->host_lock, flags); + mutex_unlock(&host->scan_mutex); + /* +* Now scan the host to discover LUNs that may have been added. +*/ scsi_scan_host(host); kfree(wrk); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] Drivers: hv: vmbus: Add device and vendor ID to vmbus devices
Add vendor and device ID attributes to vmbus devices. This would allow us to support vmbus based devices that can support guest RDMA. Signed-off-by: K. Y. Srinivasan --- drivers/hv/vmbus_drv.c | 20 include/linux/hyperv.h |2 ++ 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 4d6b269..215aac9 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -385,6 +385,24 @@ static ssize_t in_write_bytes_avail_show(struct device *dev, } static DEVICE_ATTR_RO(in_write_bytes_avail); +static ssize_t vendor_show(struct device *dev, + struct device_attribute *dev_atttr, + char *buf) +{ + struct hv_device *hv_dev = device_to_hv_device(dev); + return sprintf(buf, "0x%x\n", hv_dev->vendor_id); +} +static DEVICE_ATTR_RO(vendor); + +static ssize_t device_show(struct device *dev, + struct device_attribute *dev_atttr, + char *buf) +{ + struct hv_device *hv_dev = device_to_hv_device(dev); + return sprintf(buf, "0x%x\n", hv_dev->device_id); +} +static DEVICE_ATTR_RO(device); + /* Set up per device attributes in /sys/bus/vmbus/devices/ */ static struct attribute *vmbus_attrs[] = { &dev_attr_id.attr, @@ -409,6 +427,8 @@ static struct attribute *vmbus_attrs[] = { &dev_attr_in_write_index.attr, &dev_attr_in_read_bytes_avail.attr, &dev_attr_in_write_bytes_avail.attr, + &dev_attr_vendor.attr, + &dev_attr_device.attr, NULL, }; ATTRIBUTE_GROUPS(vmbus); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 476c685..6fe4dfe 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -891,6 +891,8 @@ struct hv_device { /* the device instance id of this device */ uuid_le dev_instance; + u16 vendor_id; + u16 device_id; struct device device; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Drivers/hv
Greg, Some time back I had sent a buch of patches for Hyper-V drivers. Are they still in the queue or should I resend them. Regards, K. Y ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 0/5] Tools: hv: fix compiler warnings and do minor cleanup
When someone does 'make' in tools/hv/ issues appear: - hv_fcopy_daemon is not being built; - lots of compiler warnings. This is just a cleanup. Compile-tested by myself on top of linux-next/master. Piggyback this series and send "[PATCH 5/5] Tools: hv: do not add redundant '/' in hv_start_fcopy()" Vitaly Kuznetsov (5): Tools: hv: add mising fcopyd to the Makefile Tools: hv: remove unused bytes_written from kvp_update_file() Tools: hv: address compiler warnings for hv_kvp_daemon.c Tools: hv: address compiler warnings for hv_fcopy_daemon.c Tools: hv: do not add redundant '/' in hv_start_fcopy() tools/hv/Makefile |4 ++-- tools/hv/hv_fcopy_daemon.c | 10 ++ tools/hv/hv_kvp_daemon.c | 29 + 3 files changed, 17 insertions(+), 26 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 3/5] Tools: hv: address compiler warnings for hv_kvp_daemon.c
From: Vitaly Kuznetsov This patch addresses two types of compiler warnings: ... warning: comparison between signed and unsigned integer expressions [-Wsign-compare] and ... warning: pointer targets in passing argument N of .kvp_ differ in signedness [-Wpointer-sign] Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- tools/hv/hv_kvp_daemon.c | 25 - 1 files changed, 12 insertions(+), 13 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index 5a274ca..48a95f9 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -308,7 +308,7 @@ static int kvp_file_init(void) return 0; } -static int kvp_key_delete(int pool, const char *key, int key_size) +static int kvp_key_delete(int pool, const __u8 *key, int key_size) { int i; int j, k; @@ -351,8 +351,8 @@ static int kvp_key_delete(int pool, const char *key, int key_size) return 1; } -static int kvp_key_add_or_modify(int pool, const char *key, int key_size, const char *value, - int value_size) +static int kvp_key_add_or_modify(int pool, const __u8 *key, int key_size, +const __u8 *value, int value_size) { int i; int num_records; @@ -405,7 +405,7 @@ static int kvp_key_add_or_modify(int pool, const char *key, int key_size, const return 0; } -static int kvp_get_value(int pool, const char *key, int key_size, char *value, +static int kvp_get_value(int pool, const __u8 *key, int key_size, __u8 *value, int value_size) { int i; @@ -437,8 +437,8 @@ static int kvp_get_value(int pool, const char *key, int key_size, char *value, return 1; } -static int kvp_pool_enumerate(int pool, int index, char *key, int key_size, - char *value, int value_size) +static int kvp_pool_enumerate(int pool, int index, __u8 *key, int key_size, + __u8 *value, int value_size) { struct kvp_record *record; @@ -659,7 +659,7 @@ static char *kvp_if_name_to_mac(char *if_name) char*p, *x; charbuf[256]; char addr_file[256]; - int i; + unsigned int i; char *mac_addr = NULL; snprintf(addr_file, sizeof(addr_file), "%s%s%s", "/sys/class/net/", @@ -698,7 +698,7 @@ static char *kvp_mac_to_if_name(char *mac) charbuf[256]; char *kvp_net_dir = "/sys/class/net/"; char dev_id[256]; - int i; + unsigned int i; dir = opendir(kvp_net_dir); if (dir == NULL) @@ -748,7 +748,7 @@ static char *kvp_mac_to_if_name(char *mac) static void kvp_process_ipconfig_file(char *cmd, - char *config_buf, int len, + char *config_buf, unsigned int len, int element_size, int offset) { char buf[256]; @@ -766,7 +766,7 @@ static void kvp_process_ipconfig_file(char *cmd, if (offset == 0) memset(config_buf, 0, len); while ((p = fgets(buf, sizeof(buf), file)) != NULL) { - if ((len - strlen(config_buf)) < (element_size + 1)) + if (len < strlen(config_buf) + element_size + 1) break; x = strchr(p, '\n'); @@ -914,7 +914,7 @@ static int kvp_process_ip_address(void *addrp, static int kvp_get_ip_info(int family, char *if_name, int op, -void *out_buffer, int length) +void *out_buffer, unsigned int length) { struct ifaddrs *ifap; struct ifaddrs *curp; @@ -1017,8 +1017,7 @@ kvp_get_ip_info(int family, char *if_name, int op, weight += hweight32(&w[i]); sprintf(cidr_mask, "/%d", weight); - if ((length - sn_offset) < - (strlen(cidr_mask) + 1)) + if (length < sn_offset + strlen(cidr_mask) + 1) goto gather_ipaddr; if (sn_offset == 0) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 1/5] Tools: hv: add mising fcopyd to the Makefile
From: Vitaly Kuznetsov fcopyd in missing in the Makefile, add it there. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- tools/hv/Makefile |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/hv/Makefile b/tools/hv/Makefile index bd22f78..99ffe61 100644 --- a/tools/hv/Makefile +++ b/tools/hv/Makefile @@ -5,9 +5,9 @@ PTHREAD_LIBS = -lpthread WARNINGS = -Wall -Wextra CFLAGS = $(WARNINGS) -g $(PTHREAD_LIBS) -all: hv_kvp_daemon hv_vss_daemon +all: hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon %: %.c $(CC) $(CFLAGS) -o $@ $^ clean: - $(RM) hv_kvp_daemon hv_vss_daemon + $(RM) hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 4/5] Tools: hv: address compiler warnings for hv_fcopy_daemon.c
From: Vitaly Kuznetsov This patch addresses two types of compiler warnings: ... warning: unused variable .fd. [-Wunused-variable] and ... warning: format .%s. expects argument of type .char *., but argument 5 has type .__u16 *. [-Wformat=] Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- tools/hv/hv_fcopy_daemon.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c index f437d73..1a23872 100644 --- a/tools/hv/hv_fcopy_daemon.c +++ b/tools/hv/hv_fcopy_daemon.c @@ -51,7 +51,7 @@ static int hv_start_fcopy(struct hv_start_fcopy *smsg) p = (char *)smsg->path_name; snprintf(target_fname, sizeof(target_fname), "%s/%s", - (char *)smsg->path_name, smsg->file_name); +(char *)smsg->path_name, (char *)smsg->file_name); syslog(LOG_INFO, "Target file name: %s", target_fname); /* @@ -137,7 +137,7 @@ void print_usage(char *argv[]) int main(int argc, char *argv[]) { - int fd, fcopy_fd, len; + int fcopy_fd, len; int error; int daemonize = 1, long_index = 0, opt; int version = FCOPY_CURRENT_VERSION; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 5/5] Tools: hv: do not add redundant '/' in hv_start_fcopy()
From: Vitaly Kuznetsov We don't need to add additional '/' to smsg->path_name as snprintf("%s/%s") does the right thing. Without the patch we get doubled '//' in the log message. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- tools/hv/hv_fcopy_daemon.c |6 -- 1 files changed, 0 insertions(+), 6 deletions(-) diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c index 1a23872..9445d8f 100644 --- a/tools/hv/hv_fcopy_daemon.c +++ b/tools/hv/hv_fcopy_daemon.c @@ -43,12 +43,6 @@ static int hv_start_fcopy(struct hv_start_fcopy *smsg) int error = HV_E_FAIL; char *q, *p; - /* -* If possile append a path seperator to the path. -*/ - if (strlen((char *)smsg->path_name) < (W_MAX_PATH - 2)) - strcat((char *)smsg->path_name, "/"); - p = (char *)smsg->path_name; snprintf(target_fname, sizeof(target_fname), "%s/%s", (char *)smsg->path_name, (char *)smsg->file_name); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 2/5] Tools: hv: remove unused bytes_written from kvp_update_file()
From: Vitaly Kuznetsov fwrite() does not actually return the number of bytes written and this value is being ignored anyway and ferror() is being called to check for an error. As we assign to this variable and never use it we get the following compile-time warning: hv_kvp_daemon.c:149:9: warning: variable .bytes_written. set but not used [-Wunused-but-set-variable] Remove bytes_written completely. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- tools/hv/hv_kvp_daemon.c |4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index 6a6432a..5a274ca 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -147,7 +147,6 @@ static void kvp_release_lock(int pool) static void kvp_update_file(int pool) { FILE *filep; - size_t bytes_written; /* * We are going to write our in-memory registry out to @@ -163,8 +162,7 @@ static void kvp_update_file(int pool) exit(EXIT_FAILURE); } - bytes_written = fwrite(kvp_file_info[pool].records, - sizeof(struct kvp_record), + fwrite(kvp_file_info[pool].records, sizeof(struct kvp_record), kvp_file_info[pool].num_records, filep); if (ferror(filep) || fclose(filep)) { -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 0/6] Drivers: hv: Miscellaneous fixes and enhancements
In addition to some miscellaneous bug fixes and enhancements implement a clockevent device based on the functionality supported by Hyper-V. K. Y. Srinivasan (6): Drivers: hv: hv_balloon: Make adjustments in computing the floor Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver Drivers: hv: hv_balloon: Don't post pressure status from interrupt context Drivers: hv: vmbus: Implement a clockevent device Drivers: hv: vmbus: Fix a bug in vmbus_establish_gpadl() Drivers: hv: vmbus: Support a vmbus API for efficiently sending page arrays arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/channel.c | 48 +++- drivers/hv/hv.c| 78 +++ drivers/hv/hv_balloon.c| 88 +--- drivers/hv/hyperv_vmbus.h | 21 + drivers/hv/vmbus_drv.c | 40 +++- include/linux/hyperv.h | 31 + 7 files changed, 297 insertions(+), 20 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 3/6] Drivers: hv: hv_balloon: Don't post pressure status from interrupt context
We currently release memory (balloon down) in the interrupt context and we also post memory status while releasing memory. Rather than posting the status in the interrupt context, wakeup the status posting thread to post the status. This will address the inconsistent lock state that Sitsofe Wheeler reported: http://lkml.iu.edu/hypermail/linux/kernel/1411.1/00075.html Signed-off-by: K. Y. Srinivasan Reported-by: Sitsofe Wheeler --- drivers/hv/hv_balloon.c | 11 --- 1 files changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 8e30415..ff16938 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1226,7 +1226,7 @@ static void balloon_down(struct hv_dynmem_device *dm, for (i = 0; i < range_count; i++) { free_balloon_pages(dm, &range_array[i]); - post_status(&dm_device); + complete(&dm_device.config_event); } if (req->more_pages == 1) @@ -1250,19 +1250,16 @@ static void balloon_onchannelcallback(void *context); static int dm_thread_func(void *dm_dev) { struct hv_dynmem_device *dm = dm_dev; - int t; while (!kthread_should_stop()) { - t = wait_for_completion_interruptible_timeout( + wait_for_completion_interruptible_timeout( &dm_device.config_event, 1*HZ); /* * The host expects us to post information on the memory * pressure every second. */ - - if (t == 0) - post_status(dm); - + reinit_completion(&dm_device.config_event); + post_status(dm); } return 0; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 6/6] Drivers: hv: vmbus: Support a vmbus API for efficiently sending page arrays
Currently, the API for sending a multi-page buffer over VMBUS is limited to a maximum pfn array of MAX_MULTIPAGE_BUFFER_COUNT. This limitation is not imposed by the host and unnecessarily limits the maximum payload that can be sent. Implement an API that does not have this restriction. Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c | 44 include/linux/hyperv.h | 31 +++ 2 files changed, 75 insertions(+), 0 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index c76ffbe..18c4f23 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -686,6 +686,50 @@ EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer); /* * vmbus_sendpacket_multipagebuffer - Send a multi-page buffer packet * using a GPADL Direct packet type. + * The buffer includes the vmbus descriptor. + */ +int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, + struct vmbus_packet_mpb_array *desc, + u32 desc_size, + void *buffer, u32 bufferlen, u64 requestid) +{ + int ret; + u32 packetlen; + u32 packetlen_aligned; + struct kvec bufferlist[3]; + u64 aligned_data = 0; + bool signal = false; + + packetlen = desc_size + bufferlen; + packetlen_aligned = ALIGN(packetlen, sizeof(u64)); + + /* Setup the descriptor */ + desc->type = VM_PKT_DATA_USING_GPA_DIRECT; + desc->flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; + desc->dataoffset8 = desc_size >> 3; /* in 8-bytes grandularity */ + desc->length8 = (u16)(packetlen_aligned >> 3); + desc->transactionid = requestid; + desc->rangecount = 1; + + bufferlist[0].iov_base = desc; + bufferlist[0].iov_len = desc_size; + bufferlist[1].iov_base = buffer; + bufferlist[1].iov_len = bufferlen; + bufferlist[2].iov_base = &aligned_data; + bufferlist[2].iov_len = (packetlen_aligned - packetlen); + + ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 3, &signal); + + if (ret == 0 && signal) + vmbus_setevent(channel); + + return ret; +} +EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc); + +/* + * vmbus_sendpacket_multipagebuffer - Send a multi-page buffer packet + * using a GPADL Direct packet type. */ int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, struct hv_multipage_buffer *multi_pagebuffer, diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 476c685..259023a 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -57,6 +57,18 @@ struct hv_multipage_buffer { u64 pfn_array[MAX_MULTIPAGE_BUFFER_COUNT]; }; +/* + * Multiple-page buffer array; the pfn array is variable size: + * The number of entries in the PFN array is determined by + * "len" and "offset". + */ +struct hv_mpb_array { + /* Length and Offset determines the # of pfns in the array */ + u32 len; + u32 offset; + u64 pfn_array[]; +}; + /* 0x18 includes the proprietary packet header */ #define MAX_PAGE_BUFFER_PACKET (0x18 + \ (sizeof(struct hv_page_buffer) * \ @@ -814,6 +826,18 @@ struct vmbus_channel_packet_multipage_buffer { struct hv_multipage_buffer range; } __packed; +/* The format must be the same as struct vmdata_gpa_direct */ +struct vmbus_packet_mpb_array { + u16 type; + u16 dataoffset8; + u16 length8; + u16 flags; + u64 transactionid; + u32 reserved; + u32 rangecount; /* Always 1 in this case */ + struct hv_mpb_array range; +} __packed; + extern int vmbus_open(struct vmbus_channel *channel, u32 send_ringbuffersize, @@ -845,6 +869,13 @@ extern int vmbus_sendpacket_multipagebuffer(struct vmbus_channel *channel, u32 bufferlen, u64 requestid); +extern int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, +struct vmbus_packet_mpb_array *mpb, +u32 desc_size, +void *buffer, +u32 bufferlen, +u64 requestid); + extern int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, u32 size, -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 1/6] Drivers: hv: hv_balloon: Make adjustments in computing the floor
Make adjustments in computing the balloon floor. The current computation of the balloon floor was not appropriate for virtual machines with more than 10 GB of assigned memory - we would get into situations where the host would agressively balloon down the guest and leave the guest in an unusable state. This patch fixes the issue by raising the floor. Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index b958ded..9cbbb83 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -928,9 +928,8 @@ static unsigned long compute_balloon_floor(void) * 12872(1/2) * 512 168(1/4) *2048 360(1/8) -*8192 552(1/32) -* 32768 1320 -* 131072 4392 +*8192 768(1/16) +* 32768 1536(1/32) */ if (totalram_pages < MB2PAGES(128)) min_pages = MB2PAGES(8) + (totalram_pages >> 1); @@ -938,8 +937,10 @@ static unsigned long compute_balloon_floor(void) min_pages = MB2PAGES(40) + (totalram_pages >> 2); else if (totalram_pages < MB2PAGES(2048)) min_pages = MB2PAGES(104) + (totalram_pages >> 3); + else if (totalram_pages < MB2PAGES(8192)) + min_pages = MB2PAGES(256) + (totalram_pages >> 4); else - min_pages = MB2PAGES(296) + (totalram_pages >> 5); + min_pages = MB2PAGES(512) + (totalram_pages >> 5); #undef MB2PAGES return min_pages; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 2/6] Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver
We support memory hot-add in the Hyper-V balloon driver by hot adding an appropriately sized and aligned region and controlling the on-lining of pages within that region based on the pages that the host wants us to online. We do this because the granularity and alignment requirements in Linux are different from what Windows expects. The state to manage the onlining of pages needs to be correctly protected. Fix this bug. Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c | 68 +++--- 1 files changed, 63 insertions(+), 5 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 9cbbb83..8e30415 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -533,6 +533,9 @@ struct hv_dynmem_device { */ struct task_struct *thread; + struct mutex ha_region_mutex; + struct completion waiter_event; + /* * A list of hot-add regions. */ @@ -549,7 +552,59 @@ struct hv_dynmem_device { static struct hv_dynmem_device dm_device; static void post_status(struct hv_dynmem_device *dm); + #ifdef CONFIG_MEMORY_HOTPLUG +static void acquire_region_mutex(bool trylock) +{ + if (trylock) { + reinit_completion(&dm_device.waiter_event); + while (!mutex_trylock(&dm_device.ha_region_mutex)) + wait_for_completion(&dm_device.waiter_event); + } else { + mutex_lock(&dm_device.ha_region_mutex); + } +} + +static void release_region_mutex(bool trylock) +{ + if (trylock) { + mutex_unlock(&dm_device.ha_region_mutex); + } else { + mutex_unlock(&dm_device.ha_region_mutex); + complete(&dm_device.waiter_event); + } +} + +static int hv_memory_notifier(struct notifier_block *nb, unsigned long val, + void *v) +{ + switch (val) { + case MEM_GOING_ONLINE: + acquire_region_mutex(true); + break; + + case MEM_ONLINE: + case MEM_CANCEL_ONLINE: + release_region_mutex(true); + if (dm_device.ha_waiting) { + dm_device.ha_waiting = false; + complete(&dm_device.ol_waitevent); + } + break; + + case MEM_GOING_OFFLINE: + case MEM_OFFLINE: + case MEM_CANCEL_OFFLINE: + break; + } + return NOTIFY_OK; +} + +static struct notifier_block hv_memory_nb = { + .notifier_call = hv_memory_notifier, + .priority = 0 +}; + static void hv_bring_pgs_online(unsigned long start_pfn, unsigned long size) { @@ -591,6 +646,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, init_completion(&dm_device.ol_waitevent); dm_device.ha_waiting = true; + release_region_mutex(false); nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); ret = add_memory(nid, PFN_PHYS((start_pfn)), (HA_CHUNK << PAGE_SHIFT)); @@ -619,6 +675,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, * have not been "onlined" within the allowed time. */ wait_for_completion_timeout(&dm_device.ol_waitevent, 5*HZ); + acquire_region_mutex(false); post_status(&dm_device); } @@ -632,11 +689,6 @@ static void hv_online_page(struct page *pg) unsigned long cur_start_pgp; unsigned long cur_end_pgp; - if (dm_device.ha_waiting) { - dm_device.ha_waiting = false; - complete(&dm_device.ol_waitevent); - } - list_for_each(cur, &dm_device.ha_region_list) { has = list_entry(cur, struct hv_hotadd_state, list); cur_start_pgp = (unsigned long) @@ -834,6 +886,7 @@ static void hot_add_req(struct work_struct *dummy) resp.hdr.size = sizeof(struct dm_hot_add_response); #ifdef CONFIG_MEMORY_HOTPLUG + acquire_region_mutex(false); pg_start = dm->ha_wrk.ha_page_range.finfo.start_page; pfn_cnt = dm->ha_wrk.ha_page_range.finfo.page_cnt; @@ -865,6 +918,7 @@ static void hot_add_req(struct work_struct *dummy) if (do_hot_add) resp.page_count = process_hot_add(pg_start, pfn_cnt, rg_start, rg_sz); + release_region_mutex(false); #endif /* * The result field of the response structure has the @@ -1388,7 +1442,9 @@ static int balloon_probe(struct hv_device *dev, dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN7; init_completion(&dm_device.host_event); init_completion(&dm_device.config_event); + init_completion(&dm_device.waiter_event); IN
[PATCH RESEND 5/6] Drivers: hv: vmbus: Fix a bug in vmbus_establish_gpadl()
Correctly compute the local (gpadl) handle. I would like to thank Michael Brown for seeing this bug. Signed-off-by: K. Y. Srinivasan Reported-by: Michael Brown --- drivers/hv/channel.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 433f72a..c76ffbe 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -366,8 +366,8 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, unsigned long flags; int ret = 0; - next_gpadl_handle = atomic_read(&vmbus_connection.next_gpadl_handle); - atomic_inc(&vmbus_connection.next_gpadl_handle); + next_gpadl_handle = + (atomic_inc_return(&vmbus_connection.next_gpadl_handle) - 1); ret = create_gpadl_header(kbuffer, size, &msginfo, &msgcount); if (ret) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RESEND 4/6] Drivers: hv: vmbus: Implement a clockevent device
Implement a clockevent device based on the timer support available on Hyper-V. In this version of the patch I have addressed Jason's review comments. Signed-off-by: K. Y. Srinivasan Reviewed-by: Jason Wang --- arch/x86/include/uapi/asm/hyperv.h | 11 + drivers/hv/hv.c| 78 drivers/hv/hyperv_vmbus.h | 21 ++ drivers/hv/vmbus_drv.c | 37 - 4 files changed, 145 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index 462efe7..90c458e 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -187,6 +187,17 @@ #define HV_X64_MSR_SINT14 0x409E #define HV_X64_MSR_SINT15 0x409F +/* + * Synthetic Timer MSRs. Four timers per vcpu. + */ +#define HV_X64_MSR_STIMER0_CONFIG 0x40B0 +#define HV_X64_MSR_STIMER0_COUNT 0x40B1 +#define HV_X64_MSR_STIMER1_CONFIG 0x40B2 +#define HV_X64_MSR_STIMER1_COUNT 0x40B3 +#define HV_X64_MSR_STIMER2_CONFIG 0x40B4 +#define HV_X64_MSR_STIMER2_COUNT 0x40B5 +#define HV_X64_MSR_STIMER3_CONFIG 0x40B6 +#define HV_X64_MSR_STIMER3_COUNT 0x40B7 #define HV_X64_MSR_HYPERCALL_ENABLE0x0001 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT12 diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 3e4235c..50e51a5 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -28,7 +28,9 @@ #include #include #include +#include #include +#include #include "hyperv_vmbus.h" /* The one and only */ @@ -37,6 +39,10 @@ struct hv_context hv_context = { .hypercall_page = NULL, }; +#define HV_TIMER_FREQUENCY (10 * 1000 * 1000) /* 100ns period */ +#define HV_MAX_MAX_DELTA_TICKS 0x +#define HV_MIN_DELTA_TICKS 1 + /* * query_hypervisor_info - Get version info of the windows hypervisor */ @@ -144,6 +150,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.clk_evt, 0, + sizeof(void *) * NR_CPUS); max_leaf = query_hypervisor_info(); @@ -258,10 +266,63 @@ u16 hv_signal_event(void *con_id) return status; } +static int hv_ce_set_next_event(unsigned long delta, + struct clock_event_device *evt) +{ + cycle_t current_tick; + + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT); + + rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); + current_tick += delta; + wrmsrl(HV_X64_MSR_STIMER0_COUNT, current_tick); + return 0; +} + +static void hv_ce_setmode(enum clock_event_mode mode, + struct clock_event_device *evt) +{ + union hv_timer_config timer_cfg; + + switch (mode) { + case CLOCK_EVT_MODE_PERIODIC: + /* unsupported */ + break; + + case CLOCK_EVT_MODE_ONESHOT: + timer_cfg.enable = 1; + timer_cfg.auto_enable = 1; + timer_cfg.sintx = VMBUS_MESSAGE_SINT; + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, timer_cfg.as_uint64); + break; + + case CLOCK_EVT_MODE_UNUSED: + case CLOCK_EVT_MODE_SHUTDOWN: + wrmsrl(HV_X64_MSR_STIMER0_COUNT, 0); + wrmsrl(HV_X64_MSR_STIMER0_CONFIG, 0); + break; + case CLOCK_EVT_MODE_RESUME: + break; + } +} + +static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu) +{ + dev->name = "Hyper-V clockevent"; + dev->features = CLOCK_EVT_FEAT_ONESHOT; + dev->cpumask = cpumask_of(cpu); + dev->rating = 1000; + dev->owner = THIS_MODULE; + + dev->set_mode = hv_ce_setmode; + dev->set_next_event = hv_ce_set_next_event; +} + int hv_synic_alloc(void) { size_t size = sizeof(struct tasklet_struct); + size_t ced_size = sizeof(struct clock_event_device); int cpu; for_each_online_cpu(cpu) { @@ -272,6 +333,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); + if (hv_context.clk_evt[cpu] == NULL) { + pr_err("Unable to allocate clock event device\n"); + goto err; + } + hv_init_clockevent_device(hv_context.clk_evt[cpu], cpu); + hv_context.synic_message_page[cpu] = (void *)get_zeroed_page(GFP_ATOMIC); @@ -305,6 +373,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dp
[PATCH RESEND 1/1] X86: Mark the Hyper-V clocksource as being continuous
The Hyper-V clocksource is continuous; mark it accordingly. Signed-off-by: K. Y. Srinivasan Cc: stable --- arch/x86/kernel/cpu/mshyperv.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index a450373..939155f 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -107,6 +107,7 @@ static struct clocksource hyperv_cs = { .rating = 400, /* use this when running on Hyperv*/ .read = read_hv_clock, .mask = CLOCKSOURCE_MASK(64), + .flags = CLOCK_SOURCE_IS_CONTINUOUS, }; static void __init ms_hyperv_init_platform(void) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH v3 23/24] erofs: introduce cached decompression
On Mon, Jul 22, 2019 at 06:58:59PM +0800, Gao Xiang wrote: > > The number of individual Kconfig options is quite high, are you sure you > > need them to be split like that? > > You mean the above? these are 3 cache strategies, which impact the > runtime memory consumption and performance. I tend to leave the above > as it-is... Unless cache strategies involve a huge amount of kernel code, I'd recommend always compiling all of the cache strategies, and then have a way to change the cache strategy via a mount option (and possibly remount, although that can get tricky if there is already cached information). You could also specify a default in the erofs superblock, you think that would be useful. - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [PATCH v3 23/24] erofs: introduce cached decompression
On Mon, Jul 22, 2019 at 10:16:44PM +0800, Gao Xiang wrote: > OK, I will give a try. One point I think is how to deal with the case > if there is already cached information when remounting as well as you said. > > As the first step, maybe the mount option can be defined as > allowing/forbiding caching from now on, which can be refined later. Yes; possible solutions include ignoring the issue (assuming that cached data structures that "shouldn't" be in the cache given the new cache strategy will fall out of the cache over time), forcibly flushing the cache when the caching strategy has changed, and of course, forbidding caching strategy change at remount time. Cheers, - Ted ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 10/21] Drivers: hv: vss: switch to using the hvutil_device_state state machine
From: Vitaly Kuznetsov Switch to using the hvutil_device_state state machine from using kvp_transaction.active. State transitions are: -> HVUTIL_DEVICE_INIT when driver loads or on device release -> HVUTIL_READY if the handshake was successful -> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host -> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied -> HVUTIL_READY after we respond to the host -> HVUTIL_DEVICE_DYING on driver unload In hv_vss_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when the userspace daemon is disconnected, otherwise we can make the host think we don't support VSS and disable the service completely. Unfortunately there is no good way we can figure out that the userspace daemon has died (unless we start treating all timeouts as such), add a protection against processing new VSS_OP_REGISTER messages while being in the middle of a transaction (HVUTIL_USERSPACE_REQ or HVUTIL_USERSPACE_RECV state). Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_snapshot.c | 87 ++--- 1 files changed, 58 insertions(+), 29 deletions(-) diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index 4bb9b1c..ddb1cda 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -33,16 +33,21 @@ #define VSS_USERSPACE_TIMEOUT (msecs_to_jiffies(10 * 1000)) /* - * Global state maintained for transaction that is being processed. - * Note that only one transaction can be active at any point in time. + * Global state maintained for transaction that is being processed. For a class + * of integration services, including the "VSS service", the specified protocol + * is a "request/response" protocol which means that there can only be single + * outstanding transaction from the host at any given point in time. We use + * this to simplify memory management in this driver - we cache and process + * only one message at a time. * - * This state is set when we receive a request from the host; we - * cleanup this state when the transaction is completed - when we respond - * to the host with the key value. + * While the request/response protocol is guaranteed by the host, we further + * ensure this by serializing packet processing in this driver - we do not + * read additional packets from the VMBUs until the current packet is fully + * handled. */ static struct { - bool active; /* transaction status - active or not */ + int state; /* hvutil_device_state */ int recv_len; /* number of bytes received. */ struct vmbus_channel *recv_channel; /* chn we got the request */ u64 recv_req_id; /* request ID. */ @@ -75,6 +80,10 @@ static void vss_timeout_func(struct work_struct *dummy) pr_warn("VSS: timeout waiting for daemon to reply\n"); vss_respond_to_host(HV_E_FAIL); + /* Transaction is finished, reset the state. */ + if (vss_transaction.state > HVUTIL_READY) + vss_transaction.state = HVUTIL_READY; + hv_poll_channel(vss_transaction.vss_context, hv_vss_onchannelcallback); } @@ -86,15 +95,32 @@ vss_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) vss_msg = (struct hv_vss_msg *)msg->data; - if (vss_msg->vss_hdr.operation == VSS_OP_REGISTER) { + /* +* Don't process registration messages if we're in the middle of +* a transaction processing. +*/ + if (vss_transaction.state > HVUTIL_READY && + vss_msg->vss_hdr.operation == VSS_OP_REGISTER) + return; + + if (vss_transaction.state == HVUTIL_DEVICE_INIT && + vss_msg->vss_hdr.operation == VSS_OP_REGISTER) { pr_info("VSS daemon registered\n"); - vss_transaction.active = false; + vss_transaction.state = HVUTIL_READY; + } else if (vss_transaction.state == HVUTIL_USERSPACE_REQ) { + vss_transaction.state = HVUTIL_USERSPACE_RECV; + if (cancel_delayed_work_sync(&vss_timeout_work)) { + vss_respond_to_host(vss_msg->error); + /* Transaction is finished, reset the state. */ + vss_transaction.state = HVUTIL_READY; + hv_poll_channel(vss_transaction.vss_context, + hv_vss_onchannelcallback); + } + } else { + /* This is a spurious call! */ + pr_warn("VSS: Transaction not active\n"); + return; } - if (cancel_delayed_work_sync(&vss_timeout_work)) - vss_respond_to_host(vss_msg->error); - - hv_poll_chann
[PATCH 21/21] Drivers: hv: utils: unify driver registration reporting
From: Vitaly Kuznetsov Unify driver registration reporting and move it to debug level as normally daemons write to syslog themselves and these kernel messages are useless. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c|3 +-- drivers/hv/hv_kvp.c |3 ++- drivers/hv/hv_snapshot.c |2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index b7b528c..b50dd33 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -107,8 +107,7 @@ static int fcopy_handle_handshake(u32 version) */ return -EINVAL; } - pr_info("FCP: user-mode registering done. Daemon version: %d\n", - version); + pr_debug("FCP: userspace daemon ver. %d registered\n", version); fcopy_transaction.state = HVUTIL_READY; hv_poll_channel(fcopy_transaction.fcopy_context, hv_fcopy_onchannelcallback); diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index baa1208..d85798d 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -150,7 +150,8 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) /* * We have a compatible daemon; complete the handshake. */ - pr_info("KVP: user-mode registering done.\n"); + pr_debug("KVP: userspace daemon ver. %d registered\n", +KVP_OP_REGISTER); kvp_register(dm_reg_value); kvp_transaction.state = HVUTIL_READY; diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index ee1762b..815405f 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -113,7 +113,7 @@ static int vss_handle_handshake(struct hv_vss_msg *vss_msg) return -EINVAL; } vss_transaction.state = HVUTIL_READY; - pr_info("VSS daemon registered\n"); + pr_debug("VSS: userspace daemon ver. %d registered\n", dm_reg_value); return 0; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 00/21] Drivers: hv: utils: re-implement the kernel/userspace communication layer
Changes in v3: - Removed RFC from subject, rebased on top of current char-misc-next tree. RFCv2: http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2015-March/066629.html Anatomy of the series: Patches 01 - 07 are cleanup with minor functional change. Patch 08 defines the state machine. Patches 09-11 convert all 3 drivers to using the state machine. Patch 12 fixes a bug in fcopy. This change is going away in Patch 15, I just want to highlight the fix. Patch 13 introduces a transport abstraction. Patch 14-16 convert all drivers to using the transport abstraction. Patches 17-18 switch KVP and VSS daemon to using char devices. Patches 19-20 convert FCOPY and VSS to hull handshake (the same we have in KVP). These two can be postponed till we really need to distinguish between different kernels in the daemon code. Patch 21 unifies log messages on daemons connect across all drivers and moves these messages to debug level. I smoke-tested this series with both old (netlink) and new (char devices) daemons and tested the daemon upgrade procedure. Original description: This series converts kvp/vss daemons to use misc char devices instead of netlink for userspace/kernel communication and then updates fcopy to be consistent with kvp/vss. Userspace/kernel communication via netlink has a number of issues: - It is hard for userspace to figure out if the kernel part was loaded or not and this fact can change as there is a way to enable/disable the service from host side. Racy daemon startup is also a problem. - When the userspace daemon restarts/dies kernel part doesn't receive a notification. - Netlink communication is not stable under heavy load. - ... Vitaly Kuznetsov (21): Drivers: hv: util: move kvp/vss function declarations to hyperv_vmbus.h Drivers: hv: kvp: reset kvp_context Drivers: hv: kvp: move poll_channel() to hyperv_vmbus.h Drivers: hv: fcopy: process deferred messages when we complete the transaction Drivers: hv: vss: process deferred messages when we complete the transaction Drivers: hv: kvp: rename kvp_work -> kvp_timeout_work Drivers: hv: fcopy: rename fcopy_work -> fcopy_timeout_work Drivers: hv: util: introduce state machine for util drivers Drivers: hv: kvp: switch to using the hvutil_device_state state machine Drivers: hv: vss: switch to using the hvutil_device_state state machine Drivers: hv: fcopy: switch to using the hvutil_device_state state machine Drivers: hv: fcopy: set .owner reference for file operations Drivers: hv: util: introduce hv_utils_transport abstraction Drivers: hv: vss: convert to hv_utils_transport Drivers: hv: fcopy: convert to hv_utils_transport Drivers: hv: kvp: convert to hv_utils_transport Tools: hv: kvp: use misc char device to communicate with kernel Tools: hv: vss: use misc char device to communicate with kernel Drivers: hv: vss: full handshake support Drivers: hv: fcopy: full handshake support Drivers: hv: utils: unify driver registration reporting drivers/hv/Makefile |2 +- drivers/hv/hv_fcopy.c | 287 +-- drivers/hv/hv_kvp.c | 192 +- drivers/hv/hv_snapshot.c| 168 --- drivers/hv/hv_utils_transport.c | 276 + drivers/hv/hv_utils_transport.h | 51 +++ drivers/hv/hyperv_vmbus.h | 29 include/linux/hyperv.h |7 - include/uapi/linux/hyperv.h |8 +- tools/hv/hv_fcopy_daemon.c | 15 ++ tools/hv/hv_kvp_daemon.c| 166 -- tools/hv/hv_vss_daemon.c| 149 +--- 12 files changed, 752 insertions(+), 598 deletions(-) create mode 100644 drivers/hv/hv_utils_transport.c create mode 100644 drivers/hv/hv_utils_transport.h -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 04/21] Drivers: hv: fcopy: process deferred messages when we complete the transaction
From: Vitaly Kuznetsov In theory, the host is not supposed to issue any requests before be reply to the previous one. In KVP we, however, support the following scenarios: 1) A message was received before userspace daemon registered; 2) A message was received while the previous one is still being processed. In FCOPY we support only the former. Add support for the later, use hv_poll_channel() to do the job. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c | 12 +--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index cd453e4..8bdf752 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -98,6 +98,8 @@ static void fcopy_work_func(struct work_struct *dummy) if (down_trylock(&fcopy_transaction.read_sema)) ; + hv_poll_channel(fcopy_transaction.fcopy_context, + hv_fcopy_onchannelcallback); } static int fcopy_handle_handshake(u32 version) @@ -117,8 +119,8 @@ static int fcopy_handle_handshake(u32 version) pr_info("FCP: user-mode registering done. Daemon version: %d\n", version); fcopy_transaction.active = false; - if (fcopy_transaction.fcopy_context) - hv_fcopy_onchannelcallback(fcopy_transaction.fcopy_context); + hv_poll_channel(fcopy_transaction.fcopy_context, + hv_fcopy_onchannelcallback); in_hand_shake = false; return 0; } @@ -226,6 +228,7 @@ void hv_fcopy_onchannelcallback(void *context) fcopy_transaction.fcopy_context = context; return; } + fcopy_transaction.fcopy_context = NULL; vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 2, &recvlen, &requestid); @@ -333,8 +336,11 @@ static ssize_t fcopy_write(struct file *file, const char __user *buf, * Complete the transaction by forwarding the result * to the host. But first, cancel the timeout. */ - if (cancel_delayed_work_sync(&fcopy_work)) + if (cancel_delayed_work_sync(&fcopy_work)) { fcopy_respond_to_host(response); + hv_poll_channel(fcopy_transaction.fcopy_context, + hv_fcopy_onchannelcallback); + } return sizeof(int); } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 20/21] Drivers: hv: fcopy: full handshake support
From: Vitaly Kuznetsov Introduce FCOPY_VERSION_1 to support kernel replying to the negotiation message with its own version. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c | 16 +++- include/uapi/linux/hyperv.h |3 ++- tools/hv/hv_fcopy_daemon.c | 15 +++ 3 files changed, 32 insertions(+), 2 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index 6a8ec9f..b7b528c 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -62,6 +62,10 @@ static DECLARE_WORK(fcopy_send_work, fcopy_send_data); static const char fcopy_devname[] = "vmbus/hv_fcopy"; static u8 *recv_buffer; static struct hvutil_transport *hvt; +/* + * This state maintains the version number registered by the daemon. + */ +static int dm_reg_value; static void fcopy_timeout_func(struct work_struct *dummy) { @@ -81,8 +85,18 @@ static void fcopy_timeout_func(struct work_struct *dummy) static int fcopy_handle_handshake(u32 version) { + u32 our_ver = FCOPY_CURRENT_VERSION; + switch (version) { - case FCOPY_CURRENT_VERSION: + case FCOPY_VERSION_0: + /* Daemon doesn't expect us to reply */ + dm_reg_value = version; + break; + case FCOPY_VERSION_1: + /* Daemon expects us to reply with our own version */ + if (hvutil_transport_send(hvt, &our_ver, sizeof(our_ver))) + return -EFAULT; + dm_reg_value = version; break; default: /* diff --git a/include/uapi/linux/hyperv.h b/include/uapi/linux/hyperv.h index 66c76df..e4c0a35 100644 --- a/include/uapi/linux/hyperv.h +++ b/include/uapi/linux/hyperv.h @@ -105,7 +105,8 @@ struct hv_vss_msg { */ #define FCOPY_VERSION_0 0 -#define FCOPY_CURRENT_VERSION FCOPY_VERSION_0 +#define FCOPY_VERSION_1 1 +#define FCOPY_CURRENT_VERSION FCOPY_VERSION_1 #define W_MAX_PATH 260 enum hv_fcopy_op { diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c index 9445d8f..5480e4e 100644 --- a/tools/hv/hv_fcopy_daemon.c +++ b/tools/hv/hv_fcopy_daemon.c @@ -137,6 +137,8 @@ int main(int argc, char *argv[]) int version = FCOPY_CURRENT_VERSION; char *buffer[4096 * 2]; struct hv_fcopy_hdr *in_msg; + int in_handshake = 1; + __u32 kernel_modver; static struct option long_options[] = { {"help",no_argument, 0, 'h' }, @@ -191,6 +193,19 @@ int main(int argc, char *argv[]) syslog(LOG_ERR, "pread failed: %s", strerror(errno)); exit(EXIT_FAILURE); } + + if (in_handshake) { + if (len != sizeof(kernel_modver)) { + syslog(LOG_ERR, "invalid version negotiation"); + exit(EXIT_FAILURE); + } + kernel_modver = *(__u32 *)buffer; + in_handshake = 0; + syslog(LOG_INFO, "HV_FCOPY: kernel module version: %d", + kernel_modver); + continue; + } + in_msg = (struct hv_fcopy_hdr *)buffer; switch (in_msg->operation) { -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 02/21] Drivers: hv: kvp: reset kvp_context
From: Vitaly Kuznetsov We set kvp_context when we want to postpone receiving a packet from vmbus due to the previous transaction being unfinished. We, however, never reset this state, all consequent kvp_respond_to_host() calls will result in poll_channel() calling hv_kvp_onchannelcallback(). This doesn't cause real issues as: 1) Host is supposed to serialize transactions as well 2) If no message is pending vmbus_recvpacket() will return 0 recvlen. This is just a cleanup. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index a414c83..caa467d 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -621,6 +621,7 @@ void hv_kvp_onchannelcallback(void *context) kvp_transaction.kvp_context = context; return; } + kvp_transaction.kvp_context = NULL; vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 4, &recvlen, &requestid); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 01/21] Drivers: hv: util: move kvp/vss function declarations to hyperv_vmbus.h
From: Vitaly Kuznetsov These declarations are internal to hv_util module and hv_fcopy_* declarations already reside there. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c |1 + drivers/hv/hv_snapshot.c |2 ++ drivers/hv/hyperv_vmbus.h |8 include/linux/hyperv.h|7 --- 4 files changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index beb8105..a414c83 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -28,6 +28,7 @@ #include #include +#include "hyperv_vmbus.h" /* * Pre win8 version numbers used in ws2008 and ws 2008 r2 (win7) diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index 9d5e0d1..c1a3604 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -24,6 +24,8 @@ #include #include +#include "hyperv_vmbus.h" + #define VSS_MAJOR 5 #define VSS_MINOR 0 #define VSS_VERSION(VSS_MAJOR << 16 | VSS_MINOR) diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 887287a..ddcf6c4 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -730,6 +730,14 @@ int vmbus_set_event(struct vmbus_channel *channel); void vmbus_on_event(unsigned long data); +int hv_kvp_init(struct hv_util_service *); +void hv_kvp_deinit(void); +void hv_kvp_onchannelcallback(void *); + +int hv_vss_init(struct hv_util_service *); +void hv_vss_deinit(void); +void hv_vss_onchannelcallback(void *); + int hv_fcopy_init(struct hv_util_service *); void hv_fcopy_deinit(void); void hv_fcopy_onchannelcallback(void *); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 902c37a..1744148 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1236,13 +1236,6 @@ extern bool vmbus_prep_negotiate_resp(struct icmsg_hdr *, struct icmsg_negotiate *, u8 *, int, int); -int hv_kvp_init(struct hv_util_service *); -void hv_kvp_deinit(void); -void hv_kvp_onchannelcallback(void *); - -int hv_vss_init(struct hv_util_service *); -void hv_vss_deinit(void); -void hv_vss_onchannelcallback(void *); void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid); extern struct resource hyperv_mmio; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 03/21] Drivers: hv: kvp: move poll_channel() to hyperv_vmbus.h
From: Vitaly Kuznetsov Move poll_channel() to hyperv_vmbus.h and make it inline and rename it to hv_poll_channel() so it can be reused in other hv_util modules. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c | 17 +++-- drivers/hv/hyperv_vmbus.h | 12 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index caa467d..939c3e7 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -128,17 +128,6 @@ kvp_work_func(struct work_struct *dummy) kvp_respond_to_host(NULL, HV_E_FAIL); } -static void poll_channel(struct vmbus_channel *channel) -{ - if (channel->target_cpu != smp_processor_id()) - smp_call_function_single(channel->target_cpu, -hv_kvp_onchannelcallback, -channel, true); - else - hv_kvp_onchannelcallback(channel); -} - - static int kvp_handle_handshake(struct hv_kvp_msg *msg) { int ret = 1; @@ -166,8 +155,8 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_info("KVP: user-mode registering done.\n"); kvp_register(dm_reg_value); kvp_transaction.active = false; - if (kvp_transaction.kvp_context) - poll_channel(kvp_transaction.kvp_context); + hv_poll_channel(kvp_transaction.kvp_context, + hv_kvp_onchannelcallback); } return ret; } @@ -587,7 +576,7 @@ response_done: vmbus_sendpacket(channel, recv_buffer, buf_len, req_id, VM_PKT_DATA_INBAND, 0); - poll_channel(channel); + hv_poll_channel(channel, hv_kvp_onchannelcallback); } /* diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index ddcf6c4..2f30456 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -742,5 +742,17 @@ int hv_fcopy_init(struct hv_util_service *); void hv_fcopy_deinit(void); void hv_fcopy_onchannelcallback(void *); +static inline void hv_poll_channel(struct vmbus_channel *channel, + void (*cb)(void *)) +{ + if (!channel) + return; + + if (channel->target_cpu != smp_processor_id()) + smp_call_function_single(channel->target_cpu, +cb, channel, true); + else + cb(channel); +} #endif /* _HYPERV_VMBUS_H */ -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 18/21] Tools: hv: vss: use misc char device to communicate with kernel
From: Vitaly Kuznetsov Use /dev/vmbus/hv_vss instead of netlink. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- tools/hv/hv_vss_daemon.c | 139 - 1 files changed, 25 insertions(+), 114 deletions(-) diff --git a/tools/hv/hv_vss_daemon.c b/tools/hv/hv_vss_daemon.c index 506dd01..36f1821 100644 --- a/tools/hv/hv_vss_daemon.c +++ b/tools/hv/hv_vss_daemon.c @@ -19,7 +19,6 @@ #include -#include #include #include #include @@ -30,21 +29,11 @@ #include #include #include -#include #include -#include #include -#include #include #include -static struct sockaddr_nl addr; - -#ifndef SOL_NETLINK -#define SOL_NETLINK 270 -#endif - - /* Don't use syslog() in the function since that can cause write to disk */ static int vss_do_freeze(char *dir, unsigned int cmd) { @@ -143,33 +132,6 @@ out: return error; } -static int netlink_send(int fd, struct cn_msg *msg) -{ - struct nlmsghdr nlh = { .nlmsg_type = NLMSG_DONE }; - unsigned int size; - struct msghdr message; - struct iovec iov[2]; - - size = sizeof(struct cn_msg) + msg->len; - - nlh.nlmsg_pid = getpid(); - nlh.nlmsg_len = NLMSG_LENGTH(size); - - iov[0].iov_base = &nlh; - iov[0].iov_len = sizeof(nlh); - - iov[1].iov_base = msg; - iov[1].iov_len = size; - - memset(&message, 0, sizeof(message)); - message.msg_name = &addr; - message.msg_namelen = sizeof(addr); - message.msg_iov = iov; - message.msg_iovlen = 2; - - return sendmsg(fd, &message, 0); -} - void print_usage(char *argv[]) { fprintf(stderr, "Usage: %s [options]\n" @@ -180,16 +142,11 @@ void print_usage(char *argv[]) int main(int argc, char *argv[]) { - int fd, len, nl_group; + int vss_fd, len; int error; - struct cn_msg *message; struct pollfd pfd; - struct nlmsghdr *incoming_msg; - struct cn_msg *incoming_cn_msg; int op; - struct hv_vss_msg *vss_msg; - char *vss_recv_buffer; - size_t vss_recv_buffer_len; + struct hv_vss_msg vss_msg[1]; int daemonize = 1, long_index = 0, opt; static struct option long_options[] = { @@ -217,98 +174,50 @@ int main(int argc, char *argv[]) openlog("Hyper-V VSS", 0, LOG_USER); syslog(LOG_INFO, "VSS starting; pid is:%d", getpid()); - vss_recv_buffer_len = NLMSG_LENGTH(0) + sizeof(struct cn_msg) + sizeof(struct hv_vss_msg); - vss_recv_buffer = calloc(1, vss_recv_buffer_len); - if (!vss_recv_buffer) { - syslog(LOG_ERR, "Failed to allocate netlink buffers"); - exit(EXIT_FAILURE); - } - - fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); - if (fd < 0) { - syslog(LOG_ERR, "netlink socket creation failed; error:%d %s", - errno, strerror(errno)); - exit(EXIT_FAILURE); - } - addr.nl_family = AF_NETLINK; - addr.nl_pad = 0; - addr.nl_pid = 0; - addr.nl_groups = 0; - - - error = bind(fd, (struct sockaddr *)&addr, sizeof(addr)); - if (error < 0) { - syslog(LOG_ERR, "bind failed; error:%d %s", errno, strerror(errno)); - close(fd); - exit(EXIT_FAILURE); - } - nl_group = CN_VSS_IDX; - if (setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &nl_group, sizeof(nl_group)) < 0) { - syslog(LOG_ERR, "setsockopt failed; error:%d %s", errno, strerror(errno)); - close(fd); + vss_fd = open("/dev/vmbus/hv_vss", O_RDWR); + if (vss_fd < 0) { + syslog(LOG_ERR, "open /dev/vmbus/hv_vss failed; error: %d %s", + errno, strerror(errno)); exit(EXIT_FAILURE); } /* * Register ourselves with the kernel. */ - message = (struct cn_msg *)vss_recv_buffer; - message->id.idx = CN_VSS_IDX; - message->id.val = CN_VSS_VAL; - message->ack = 0; - vss_msg = (struct hv_vss_msg *)message->data; - vss_msg->vss_hdr.operation = VSS_OP_REGISTER; - - message->len = sizeof(struct hv_vss_msg); + vss_msg->vss_hdr.operation = VSS_OP_REGISTER1; - len = netlink_send(fd, message); + len = write(vss_fd, vss_msg, sizeof(struct hv_vss_msg)); if (len < 0) { - syslog(LOG_ERR, "netlink_send failed; error:%d %s", errno, strerror(errno)); - close(fd); + syslog(LOG_ERR, "registration to kernel failed; error: %d %s", + errno, strerror(errno)); + close(vss_fd); exit(EXIT_FAILURE); } -
[PATCH 09/21] Drivers: hv: kvp: switch to using the hvutil_device_state state machine
From: Vitaly Kuznetsov Switch to using the hvutil_device_state state machine from using 2 different state variables: kvp_transaction.active and in_hand_shake. State transitions are: -> HVUTIL_DEVICE_INIT when driver loads or on device release -> HVUTIL_READY if the handshake was successful -> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host -> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied -> HVUTIL_READY after we respond to the host -> HVUTIL_DEVICE_DYING on driver unload In hv_kvp_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when the userspace daemon is disconnected, otherwise we can make the host think we don't support KVP and disable the service completely. Unfortunately there is no good way we can figure out that the userspace daemon has died (unless we start treating all timeouts as such). In case the daemon restarts we skip the negotiation procedure (so the daemon is supposed to has the same version). This behavior is unchanged from in_handshake approach. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c | 87 -- 1 files changed, 49 insertions(+), 38 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 14c62e2..a70d202 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -46,16 +46,21 @@ #define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR) /* - * Global state maintained for transaction that is being processed. - * Note that only one transaction can be active at any point in time. + * Global state maintained for transaction that is being processed. For a class + * of integration services, including the "KVP service", the specified protocol + * is a "request/response" protocol which means that there can only be single + * outstanding transaction from the host at any given point in time. We use + * this to simplify memory management in this driver - we cache and process + * only one message at a time. * - * This state is set when we receive a request from the host; we - * cleanup this state when the transaction is completed - when we respond - * to the host with the key value. + * While the request/response protocol is guaranteed by the host, we further + * ensure this by serializing packet processing in this driver - we do not + * read additional packets from the VMBUs until the current packet is fully + * handled. */ static struct { - bool active; /* transaction status - active or not */ + int state; /* hvutil_device_state */ int recv_len; /* number of bytes received. */ struct hv_kvp_msg *kvp_msg; /* current message */ struct vmbus_channel *recv_channel; /* chn we got the request */ @@ -64,13 +69,6 @@ static struct { } kvp_transaction; /* - * Before we can accept KVP messages from the host, we need - * to handshake with the user level daemon. This state tracks - * if we are in the handshake phase. - */ -static bool in_hand_shake = true; - -/* * This state maintains the version number registered by the daemon. */ static int dm_reg_value; @@ -126,6 +124,13 @@ static void kvp_timeout_func(struct work_struct *dummy) * process the pending transaction. */ kvp_respond_to_host(NULL, HV_E_FAIL); + + /* Transaction is finished, reset the state. */ + if (kvp_transaction.state > HVUTIL_READY) + kvp_transaction.state = HVUTIL_READY; + + hv_poll_channel(kvp_transaction.kvp_context, + hv_kvp_onchannelcallback); } static int kvp_handle_handshake(struct hv_kvp_msg *msg) @@ -154,9 +159,7 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) */ pr_info("KVP: user-mode registering done.\n"); kvp_register(dm_reg_value); - kvp_transaction.active = false; - hv_poll_channel(kvp_transaction.kvp_context, - hv_kvp_onchannelcallback); + kvp_transaction.state = HVUTIL_READY; } return ret; } @@ -180,12 +183,16 @@ kvp_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) * with the daemon; handle that first. */ - if (in_hand_shake) { - if (kvp_handle_handshake(message)) - in_hand_shake = false; + if (kvp_transaction.state < HVUTIL_READY) { + kvp_handle_handshake(message); return; } + /* We didn't send anything to userspace so the reply is spurious */ + if (kvp_transaction.state < HVUTIL_USERSPACE_REQ) + return; + kvp_transaction.state = HVUTIL_USERSPACE_RECV; + /* * Based on the version of the daemon, w
[PATCH 07/21] Drivers: hv: fcopy: rename fcopy_work -> fcopy_timeout_work
From: Vitaly Kuznetsov 'fcopy_work' (and fcopy_work_func) is a misnomer as it sounds like we expect this useful work to happen and in reality it is just an emergency escape when timeout happens. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c | 14 +++--- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index 8bdf752..c14f0f4 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -75,11 +75,11 @@ static bool opened; /* currently device opened */ static bool in_hand_shake = true; static void fcopy_send_data(void); static void fcopy_respond_to_host(int error); -static void fcopy_work_func(struct work_struct *dummy); -static DECLARE_DELAYED_WORK(fcopy_work, fcopy_work_func); +static void fcopy_timeout_func(struct work_struct *dummy); +static DECLARE_DELAYED_WORK(fcopy_timeout_work, fcopy_timeout_func); static u8 *recv_buffer; -static void fcopy_work_func(struct work_struct *dummy) +static void fcopy_timeout_func(struct work_struct *dummy) { /* * If the timer fires, the user-mode component has not responded; @@ -261,7 +261,7 @@ void hv_fcopy_onchannelcallback(void *context) /* * Send the information to the user-level daemon. */ - schedule_delayed_work(&fcopy_work, 5*HZ); + schedule_delayed_work(&fcopy_timeout_work, 5*HZ); fcopy_send_data(); return; } @@ -336,7 +336,7 @@ static ssize_t fcopy_write(struct file *file, const char __user *buf, * Complete the transaction by forwarding the result * to the host. But first, cancel the timeout. */ - if (cancel_delayed_work_sync(&fcopy_work)) { + if (cancel_delayed_work_sync(&fcopy_timeout_work)) fcopy_respond_to_host(response); hv_poll_channel(fcopy_transaction.fcopy_context, hv_fcopy_onchannelcallback); @@ -378,7 +378,7 @@ static int fcopy_release(struct inode *inode, struct file *f) in_hand_shake = true; opened = false; - if (cancel_delayed_work_sync(&fcopy_work)) { + if (cancel_delayed_work_sync(&fcopy_timeout_work)) { /* We haven't up()-ed the semaphore(very rare)? */ if (down_trylock(&fcopy_transaction.read_sema)) ; @@ -442,6 +442,6 @@ int hv_fcopy_init(struct hv_util_service *srv) void hv_fcopy_deinit(void) { - cancel_delayed_work_sync(&fcopy_work); + cancel_delayed_work_sync(&fcopy_timeout_work); fcopy_dev_deinit(); } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 12/21] Drivers: hv: fcopy: set .owner reference for file operations
From: Vitaly Kuznetsov Get an additional reference otherwise a crash is observed when hv_utils module is being unloaded while fcopy daemon is still running. .owner gives us an additional reference when someone holds a descriptor for the device. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index a501301..d1475e6 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -360,12 +360,9 @@ static int fcopy_open(struct inode *inode, struct file *f) } /* XXX: there are still some tricky corner cases, e.g., - * 1) In a SMP guest, when fcopy_release() runs between + * In an SMP guest, when fcopy_release() runs between * schedule_delayed_work() and fcopy_send_data(), there is * still a chance an obsolete message will be queued. - * - * 2) When the fcopy daemon is running, if we unload the driver, - * we'll notice a kernel oops when we kill the daemon later. */ static int fcopy_release(struct inode *inode, struct file *f) { @@ -385,6 +382,7 @@ static int fcopy_release(struct inode *inode, struct file *f) static const struct file_operations fcopy_fops = { + .owner = THIS_MODULE, .read = fcopy_read, .write = fcopy_write, .release= fcopy_release, -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 13/21] Drivers: hv: util: introduce hv_utils_transport abstraction
From: Vitaly Kuznetsov The intention is to make KVP/VSS drivers work through misc char devices. Introduce an abstraction for kernel/userspace communication to make the migration smoother. Transport operational mode (netlink or char device) is determined by the first received message. To support driver upgrades the switch from netlink to chardev operational mode is supported. Every hv_util daemon is supposed to register 2 callbacks: 1) on_msg() to get notified when the userspace daemon sent a message; 2) on_reset() to get notified when the userspace daemon drops the connection. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/Makefile |2 +- drivers/hv/hv_utils_transport.c | 276 +++ drivers/hv/hv_utils_transport.h | 51 +++ 3 files changed, 328 insertions(+), 1 deletions(-) create mode 100644 drivers/hv/hv_utils_transport.c create mode 100644 drivers/hv/hv_utils_transport.h diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile index 5e4dfa4..39c9b2c 100644 --- a/drivers/hv/Makefile +++ b/drivers/hv/Makefile @@ -5,4 +5,4 @@ obj-$(CONFIG_HYPERV_BALLOON)+= hv_balloon.o hv_vmbus-y := vmbus_drv.o \ hv.o connection.o channel.o \ channel_mgmt.o ring_buffer.o -hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o +hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o hv_utils_transport.o diff --git a/drivers/hv/hv_utils_transport.c b/drivers/hv/hv_utils_transport.c new file mode 100644 index 000..ea7ba5e --- /dev/null +++ b/drivers/hv/hv_utils_transport.c @@ -0,0 +1,276 @@ +/* + * Kernel/userspace transport abstraction for Hyper-V util driver. + * + * Copyright (C) 2015, Vitaly Kuznetsov + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + */ + +#include +#include +#include + +#include "hyperv_vmbus.h" +#include "hv_utils_transport.h" + +static DEFINE_SPINLOCK(hvt_list_lock); +static struct list_head hvt_list = LIST_HEAD_INIT(hvt_list); + +static void hvt_reset(struct hvutil_transport *hvt) +{ + mutex_lock(&hvt->outmsg_lock); + kfree(hvt->outmsg); + hvt->outmsg = NULL; + hvt->outmsg_len = 0; + mutex_unlock(&hvt->outmsg_lock); + if (hvt->on_reset) + hvt->on_reset(); +} + +static ssize_t hvt_op_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + struct hvutil_transport *hvt; + int ret; + + hvt = container_of(file->f_op, struct hvutil_transport, fops); + + if (wait_event_interruptible(hvt->outmsg_q, hvt->outmsg_len > 0)) + return -EINTR; + + mutex_lock(&hvt->outmsg_lock); + if (!hvt->outmsg) { + ret = -EAGAIN; + goto out_unlock; + } + + if (count < hvt->outmsg_len) { + ret = -EINVAL; + goto out_unlock; + } + + if (!copy_to_user(buf, hvt->outmsg, hvt->outmsg_len)) + ret = hvt->outmsg_len; + else + ret = -EFAULT; + + kfree(hvt->outmsg); + hvt->outmsg = NULL; + hvt->outmsg_len = 0; + +out_unlock: + mutex_unlock(&hvt->outmsg_lock); + return ret; +} + +static ssize_t hvt_op_write(struct file *file, const char __user *buf, + size_t count, loff_t *ppos) +{ + struct hvutil_transport *hvt; + u8 *inmsg; + + hvt = container_of(file->f_op, struct hvutil_transport, fops); + + inmsg = kzalloc(count, GFP_KERNEL); + if (copy_from_user(inmsg, buf, count)) { + kfree(inmsg); + return -EFAULT; + } + if (hvt->on_msg(inmsg, count)) + return -EFAULT; + kfree(inmsg); + + return count; +} + +static unsigned int hvt_op_poll(struct file *file, poll_table *wait) +{ + struct hvutil_transport *hvt; + + hvt = container_of(file->f_op, struct hvutil_transport, fops); + + poll_wait(file, &hvt->outmsg_q, wait); + if (hvt->outmsg_len > 0) + return POLLIN | POLLRDNORM; + + return 0; +} + +static int hvt_op_open(struct inode *inode, struct file *file) +{ + struct hvutil_transport *hvt; + + hvt = container_of(file->f_op, struct hvutil_transport, fops); + + /* +* Switching to CHARDEV mode. We switch bach to INIT when device +* gets released. +
[PATCH 14/21] Drivers: hv: vss: convert to hv_utils_transport
From: Vitaly Kuznetsov Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_vss communication methods. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_snapshot.c | 52 +++-- 1 files changed, 27 insertions(+), 25 deletions(-) diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index ddb1cda..2c8c246 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -25,6 +25,7 @@ #include #include "hyperv_vmbus.h" +#include "hv_utils_transport.h" #define VSS_MAJOR 5 #define VSS_MINOR 0 @@ -58,9 +59,9 @@ static struct { static void vss_respond_to_host(int error); -static struct cb_id vss_id = { CN_VSS_IDX, CN_VSS_VAL }; -static const char vss_name[] = "vss_kernel_module"; +static const char vss_devname[] = "vmbus/hv_vss"; static __u8 *recv_buffer; +static struct hvutil_transport *hvt; static void vss_send_op(struct work_struct *dummy); static void vss_timeout_func(struct work_struct *dummy); @@ -88,12 +89,12 @@ static void vss_timeout_func(struct work_struct *dummy) hv_vss_onchannelcallback); } -static void -vss_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) +static int vss_on_msg(void *msg, int len) { - struct hv_vss_msg *vss_msg; + struct hv_vss_msg *vss_msg = (struct hv_vss_msg *)msg; - vss_msg = (struct hv_vss_msg *)msg->data; + if (len != sizeof(*vss_msg)) + return -EINVAL; /* * Don't process registration messages if we're in the middle of @@ -101,7 +102,7 @@ vss_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) */ if (vss_transaction.state > HVUTIL_READY && vss_msg->vss_hdr.operation == VSS_OP_REGISTER) - return; + return -EINVAL; if (vss_transaction.state == HVUTIL_DEVICE_INIT && vss_msg->vss_hdr.operation == VSS_OP_REGISTER) { @@ -119,8 +120,9 @@ vss_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) } else { /* This is a spurious call! */ pr_warn("VSS: Transaction not active\n"); - return; + return -EINVAL; } + return 0; } @@ -128,27 +130,20 @@ static void vss_send_op(struct work_struct *dummy) { int op = vss_transaction.msg->vss_hdr.operation; int rc; - struct cn_msg *msg; struct hv_vss_msg *vss_msg; /* The transaction state is wrong. */ if (vss_transaction.state != HVUTIL_HOSTMSG_RECEIVED) return; - msg = kzalloc(sizeof(*msg) + sizeof(*vss_msg), GFP_ATOMIC); - if (!msg) + vss_msg = kzalloc(sizeof(*vss_msg), GFP_KERNEL); + if (!vss_msg) return; - vss_msg = (struct hv_vss_msg *)msg->data; - - msg->id.idx = CN_VSS_IDX; - msg->id.val = CN_VSS_VAL; - vss_msg->vss_hdr.operation = op; - msg->len = sizeof(struct hv_vss_msg); vss_transaction.state = HVUTIL_USERSPACE_REQ; - rc = cn_netlink_send(msg, 0, 0, GFP_ATOMIC); + rc = hvutil_transport_send(hvt, vss_msg, sizeof(*vss_msg)); if (rc) { pr_warn("VSS: failed to communicate to the daemon: %d\n", rc); if (cancel_delayed_work_sync(&vss_timeout_work)) { @@ -157,7 +152,7 @@ static void vss_send_op(struct work_struct *dummy) } } - kfree(msg); + kfree(vss_msg); return; } @@ -308,14 +303,16 @@ void hv_vss_onchannelcallback(void *context) } +static void vss_on_reset(void) +{ + if (cancel_delayed_work_sync(&vss_timeout_work)) + vss_respond_to_host(HV_E_FAIL); + vss_transaction.state = HVUTIL_DEVICE_INIT; +} + int hv_vss_init(struct hv_util_service *srv) { - int err; - - err = cn_add_callback(&vss_id, vss_name, vss_cn_callback); - if (err) - return err; recv_buffer = srv->recv_buffer; /* @@ -326,13 +323,18 @@ hv_vss_init(struct hv_util_service *srv) */ vss_transaction.state = HVUTIL_DEVICE_INIT; + hvt = hvutil_transport_init(vss_devname, CN_VSS_IDX, CN_VSS_VAL, + vss_on_msg, vss_on_reset); + if (!hvt) + return -EFAULT; + return 0; } void hv_vss_deinit(void) { vss_transaction.state = HVUTIL_DEVICE_DYING; - cn_del_callback(&vss_id); cancel_delayed_work_sync(&vss_timeout_work); cancel_work_sync(&vss_send_op_work); + hvutil_transport_destroy(hvt); } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 11/21] Drivers: hv: fcopy: switch to using the hvutil_device_state state machine
From: Vitaly Kuznetsov Switch to using the hvutil_device_state state machine from using 3 different state variables: fcopy_transaction.active, opened, and in_hand_shake. State transitions are: -> HVUTIL_DEVICE_INIT when driver loads or on device release -> HVUTIL_READY if the handshake was successful -> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host -> HVUTIL_USERSPACE_REQ after userspace daemon read the message -> HVUTIL_USERSPACE_RECV after/if userspace has replied -> HVUTIL_READY after we respond to the host -> HVUTIL_DEVICE_DYING on driver unload In hv_fcopy_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when the userspace daemon is disconnected, otherwise we can make the host think we don't support FCOPY and disable the service completely. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c | 70 + 1 files changed, 30 insertions(+), 40 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index c14f0f4..a501301 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -47,15 +47,10 @@ * ensure this by serializing packet processing in this driver - we do not * read additional packets from the VMBUs until the current packet is fully * handled. - * - * The transaction "active" state is set when we receive a request from the - * host and we cleanup this state when the transaction is completed - when we - * respond to the host with our response. When the transaction active state is - * set, we defer handling incoming packets. */ static struct { - bool active; /* transaction status - active or not */ + int state; /* hvutil_device_state */ int recv_len; /* number of bytes received. */ struct hv_fcopy_hdr *fcopy_msg; /* current message */ struct hv_start_fcopy message; /* sent to daemon */ @@ -65,14 +60,6 @@ static struct { struct semaphore read_sema; } fcopy_transaction; -static bool opened; /* currently device opened */ - -/* - * Before we can accept copy messages from the host, we need - * to handshake with the user level daemon. This state tracks - * if we are in the handshake phase. - */ -static bool in_hand_shake = true; static void fcopy_send_data(void); static void fcopy_respond_to_host(int error); static void fcopy_timeout_func(struct work_struct *dummy); @@ -87,6 +74,10 @@ static void fcopy_timeout_func(struct work_struct *dummy) */ fcopy_respond_to_host(HV_E_FAIL); + /* Transaction is finished, reset the state. */ + if (fcopy_transaction.state > HVUTIL_READY) + fcopy_transaction.state = HVUTIL_READY; + /* In the case the user-space daemon crashes, hangs or is killed, we * need to down the semaphore, otherwise, after the daemon starts next * time, the obsolete data in fcopy_transaction.message or @@ -118,10 +109,9 @@ static int fcopy_handle_handshake(u32 version) } pr_info("FCP: user-mode registering done. Daemon version: %d\n", version); - fcopy_transaction.active = false; + fcopy_transaction.state = HVUTIL_READY; hv_poll_channel(fcopy_transaction.fcopy_context, hv_fcopy_onchannelcallback); - in_hand_shake = false; return 0; } @@ -191,8 +181,6 @@ fcopy_respond_to_host(int error) channel = fcopy_transaction.recv_channel; req_id = fcopy_transaction.recv_req_id; - fcopy_transaction.active = false; - icmsghdr = (struct icmsg_hdr *) &recv_buffer[sizeof(struct vmbuspipe_hdr)]; @@ -220,7 +208,7 @@ void hv_fcopy_onchannelcallback(void *context) int util_fw_version; int fcopy_srv_version; - if (fcopy_transaction.active) { + if (fcopy_transaction.state > HVUTIL_READY) { /* * We will defer processing this callback once * the current transaction is complete. @@ -252,12 +240,18 @@ void hv_fcopy_onchannelcallback(void *context) * transaction; note transactions are serialized. */ - fcopy_transaction.active = true; fcopy_transaction.recv_len = recvlen; fcopy_transaction.recv_channel = channel; fcopy_transaction.recv_req_id = requestid; fcopy_transaction.fcopy_msg = fcopy_msg; + if (fcopy_transaction.state < HVUTIL_READY) { + /* Userspace is not registered yet */ + fcopy_respond_to_host(HV_E_FAIL); + return; + } + fcopy_transaction.state = HVUTIL_HOSTMSG_RECEIVED; + /* * Send the information to the user-level daemon. */ @@ -290,10 +
[PATCH 17/21] Tools: hv: kvp: use misc char device to communicate with kernel
From: Vitaly Kuznetsov Use /dev/vmbus/hv_kvp instead of netlink. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- tools/hv/hv_kvp_daemon.c | 166 +- 1 files changed, 31 insertions(+), 135 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index 408bb07..0d9f48e 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -33,7 +33,6 @@ #include #include #include -#include #include #include #include @@ -79,7 +78,6 @@ enum { DNS }; -static struct sockaddr_nl addr; static int in_hand_shake = 1; static char *os_name = ""; @@ -1387,34 +1385,6 @@ kvp_get_domain_name(char *buffer, int length) freeaddrinfo(info); } -static int -netlink_send(int fd, struct cn_msg *msg) -{ - struct nlmsghdr nlh = { .nlmsg_type = NLMSG_DONE }; - unsigned int size; - struct msghdr message; - struct iovec iov[2]; - - size = sizeof(struct cn_msg) + msg->len; - - nlh.nlmsg_pid = getpid(); - nlh.nlmsg_len = NLMSG_LENGTH(size); - - iov[0].iov_base = &nlh; - iov[0].iov_len = sizeof(nlh); - - iov[1].iov_base = msg; - iov[1].iov_len = size; - - memset(&message, 0, sizeof(message)); - message.msg_name = &addr; - message.msg_namelen = sizeof(addr); - message.msg_iov = iov; - message.msg_iovlen = 2; - - return sendmsg(fd, &message, 0); -} - void print_usage(char *argv[]) { fprintf(stderr, "Usage: %s [options]\n" @@ -1425,22 +1395,17 @@ void print_usage(char *argv[]) int main(int argc, char *argv[]) { - int fd, len, nl_group; + int kvp_fd, len; int error; - struct cn_msg *message; struct pollfd pfd; - struct nlmsghdr *incoming_msg; - struct cn_msg *incoming_cn_msg; - struct hv_kvp_msg *hv_msg; - char*p; + char*p; + struct hv_kvp_msg hv_msg[1]; char*key_value; char*key_name; int op; int pool; char*if_name; struct hv_kvp_ipaddr_value *kvp_ip_val; - char *kvp_recv_buffer; - size_t kvp_recv_buffer_len; int daemonize = 1, long_index = 0, opt; static struct option long_options[] = { @@ -1468,12 +1433,14 @@ int main(int argc, char *argv[]) openlog("KVP", 0, LOG_USER); syslog(LOG_INFO, "KVP starting; pid is:%d", getpid()); - kvp_recv_buffer_len = NLMSG_LENGTH(0) + sizeof(struct cn_msg) + sizeof(struct hv_kvp_msg); - kvp_recv_buffer = calloc(1, kvp_recv_buffer_len); - if (!kvp_recv_buffer) { - syslog(LOG_ERR, "Failed to allocate netlink buffer"); + kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR); + + if (kvp_fd < 0) { + syslog(LOG_ERR, "open /dev/vmbus/hv_kvp failed; error: %d %s", + errno, strerror(errno)); exit(EXIT_FAILURE); } + /* * Retrieve OS release information. */ @@ -1489,100 +1456,44 @@ int main(int argc, char *argv[]) exit(EXIT_FAILURE); } - fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); - if (fd < 0) { - syslog(LOG_ERR, "netlink socket creation failed; error: %d %s", errno, - strerror(errno)); - exit(EXIT_FAILURE); - } - addr.nl_family = AF_NETLINK; - addr.nl_pad = 0; - addr.nl_pid = 0; - addr.nl_groups = 0; - - - error = bind(fd, (struct sockaddr *)&addr, sizeof(addr)); - if (error < 0) { - syslog(LOG_ERR, "bind failed; error: %d %s", errno, strerror(errno)); - close(fd); - exit(EXIT_FAILURE); - } - nl_group = CN_KVP_IDX; - - if (setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &nl_group, sizeof(nl_group)) < 0) { - syslog(LOG_ERR, "setsockopt failed; error: %d %s", errno, strerror(errno)); - close(fd); - exit(EXIT_FAILURE); - } - /* * Register ourselves with the kernel. */ - message = (struct cn_msg *)kvp_recv_buffer; - message->id.idx = CN_KVP_IDX; - message->id.val = CN_KVP_VAL; - - hv_msg = (struct hv_kvp_msg *)message->data; hv_msg->kvp_hdr.operation = KVP_OP_REGISTER1; - message->ack = 0; - message->len = sizeof(struct hv_kvp_msg); - - len = netlink_send(fd, message); - if (len < 0) { - syslog(LOG_ERR, "netlink_send failed; error: %d %s", errno, strerror(errno)); - close(fd); + len = write(kvp_fd, hv_msg, sizeof(struct hv_kvp_msg)); + if (len != sizeof(struct hv_kvp_msg)) { + syslo
[PATCH 16/21] Drivers: hv: kvp: convert to hv_utils_transport
From: Vitaly Kuznetsov Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_kvp communication methods. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c | 91 +++--- 1 files changed, 42 insertions(+), 49 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index a70d202..baa1208 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -29,6 +29,7 @@ #include #include "hyperv_vmbus.h" +#include "hv_utils_transport.h" /* * Pre win8 version numbers used in ws2008 and ws 2008 r2 (win7) @@ -83,9 +84,9 @@ static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); -static struct cb_id kvp_id = { CN_KVP_IDX, CN_KVP_VAL }; -static const char kvp_name[] = "kvp_kernel_module"; +static const char kvp_devname[] = "vmbus/hv_kvp"; static u8 *recv_buffer; +static struct hvutil_transport *hvt; /* * Register the kernel component with the user-level daemon. * As part of this registration, pass the LIC version number. @@ -97,23 +98,18 @@ static void kvp_register(int reg_value) { - struct cn_msg *msg; struct hv_kvp_msg *kvp_msg; char *version; - msg = kzalloc(sizeof(*msg) + sizeof(struct hv_kvp_msg), GFP_ATOMIC); + kvp_msg = kzalloc(sizeof(*kvp_msg), GFP_KERNEL); - if (msg) { - kvp_msg = (struct hv_kvp_msg *)msg->data; + if (kvp_msg) { version = kvp_msg->body.kvp_register.version; - msg->id.idx = CN_KVP_IDX; - msg->id.val = CN_KVP_VAL; - kvp_msg->kvp_hdr.operation = reg_value; strcpy(version, HV_DRV_VERSION); - msg->len = sizeof(struct hv_kvp_msg); - cn_netlink_send(msg, 0, 0, GFP_ATOMIC); - kfree(msg); + + hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg)); + kfree(kvp_msg); } } @@ -135,8 +131,6 @@ static void kvp_timeout_func(struct work_struct *dummy) static int kvp_handle_handshake(struct hv_kvp_msg *msg) { - int ret = 1; - switch (msg->kvp_hdr.operation) { case KVP_OP_REGISTER: dm_reg_value = KVP_OP_REGISTER; @@ -150,18 +144,17 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_info("KVP: incompatible daemon\n"); pr_info("KVP: KVP version: %d, Daemon version: %d\n", KVP_OP_REGISTER1, msg->kvp_hdr.operation); - ret = 0; + return -EINVAL; } - if (ret) { - /* -* We have a compatible daemon; complete the handshake. -*/ - pr_info("KVP: user-mode registering done.\n"); - kvp_register(dm_reg_value); - kvp_transaction.state = HVUTIL_READY; - } - return ret; + /* +* We have a compatible daemon; complete the handshake. +*/ + pr_info("KVP: user-mode registering done.\n"); + kvp_register(dm_reg_value); + kvp_transaction.state = HVUTIL_READY; + + return 0; } @@ -169,14 +162,14 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) * Callback when data is received from user mode. */ -static void -kvp_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) +static int kvp_on_msg(void *msg, int len) { - struct hv_kvp_msg *message; + struct hv_kvp_msg *message = (struct hv_kvp_msg *)msg; struct hv_kvp_msg_enumerate *data; int error = 0; - message = (struct hv_kvp_msg *)msg->data; + if (len < sizeof(*message)) + return -EINVAL; /* * If we are negotiating the version information @@ -184,13 +177,13 @@ kvp_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) */ if (kvp_transaction.state < HVUTIL_READY) { - kvp_handle_handshake(message); - return; + return kvp_handle_handshake(message); } /* We didn't send anything to userspace so the reply is spurious */ if (kvp_transaction.state < HVUTIL_USERSPACE_REQ) - return; + return -EINVAL; + kvp_transaction.state = HVUTIL_USERSPACE_RECV; /* @@ -228,6 +221,8 @@ kvp_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) hv_poll_channel(kvp_transaction.kvp_context, hv_kvp_onchannelcallback); } + + return 0; } @@ -344,7 +339,6 @@ static void process_ib_ipinfo(void *in_msg, void *out_msg, int op) static void kvp_send_key(struct work_struct *dummy) { - struct cn_msg *msg
[PATCH 05/21] Drivers: hv: vss: process deferred messages when we complete the transaction
From: Vitaly Kuznetsov In theory, the host is not supposed to issue any requests before be reply to the previous one. In KVP we, however, support the following scenarios: 1) A message was received before userspace daemon registered; 2) A message was received while the previous one is still being processed. In VSS we support only the former. Add support for the later, use hv_poll_channel() to do the job. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_snapshot.c | 14 +- 1 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index c1a3604..4bb9b1c 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -47,6 +47,7 @@ static struct { struct vmbus_channel *recv_channel; /* chn we got the request */ u64 recv_req_id; /* request ID. */ struct hv_vss_msg *msg; /* current message */ + void *vss_context; /* for the channel callback */ } vss_transaction; @@ -73,6 +74,9 @@ static void vss_timeout_func(struct work_struct *dummy) */ pr_warn("VSS: timeout waiting for daemon to reply\n"); vss_respond_to_host(HV_E_FAIL); + + hv_poll_channel(vss_transaction.vss_context, + hv_vss_onchannelcallback); } static void @@ -85,13 +89,12 @@ vss_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) if (vss_msg->vss_hdr.operation == VSS_OP_REGISTER) { pr_info("VSS daemon registered\n"); vss_transaction.active = false; - if (vss_transaction.recv_channel != NULL) - hv_vss_onchannelcallback(vss_transaction.recv_channel); - return; - } if (cancel_delayed_work_sync(&vss_timeout_work)) vss_respond_to_host(vss_msg->error); + + hv_poll_channel(vss_transaction.vss_context, + hv_vss_onchannelcallback); } @@ -198,9 +201,10 @@ void hv_vss_onchannelcallback(void *context) * We will defer processing this callback once * the current transaction is complete. */ - vss_transaction.recv_channel = channel; + vss_transaction.vss_context = context; return; } + vss_transaction.vss_context = NULL; vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 2, &recvlen, &requestid); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 06/21] Drivers: hv: kvp: rename kvp_work -> kvp_timeout_work
From: Vitaly Kuznetsov 'kvp_work' (and kvp_work_func) is a misnomer as it sounds like we expect this useful work to happen and in reality it is just an emergency escape when timeout happens. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c | 16 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 939c3e7..14c62e2 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -79,10 +79,10 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); -static void kvp_work_func(struct work_struct *dummy); +static void kvp_timeout_func(struct work_struct *dummy); static void kvp_register(int); -static DECLARE_DELAYED_WORK(kvp_work, kvp_work_func); +static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static struct cb_id kvp_id = { CN_KVP_IDX, CN_KVP_VAL }; @@ -118,8 +118,8 @@ kvp_register(int reg_value) kfree(msg); } } -static void -kvp_work_func(struct work_struct *dummy) + +static void kvp_timeout_func(struct work_struct *dummy) { /* * If the timer fires, the user-mode component has not responded; @@ -215,7 +215,7 @@ kvp_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp) * Complete the transaction by forwarding the key value * to the host. But first, cancel the timeout. */ - if (cancel_delayed_work_sync(&kvp_work)) + if (cancel_delayed_work_sync(&kvp_timeout_work)) kvp_respond_to_host(message, error); } @@ -440,7 +440,7 @@ kvp_send_key(struct work_struct *dummy) rc = cn_netlink_send(msg, 0, 0, GFP_ATOMIC); if (rc) { pr_debug("KVP: failed to communicate to the daemon: %d\n", rc); - if (cancel_delayed_work_sync(&kvp_work)) + if (cancel_delayed_work_sync(&kvp_timeout_work)) kvp_respond_to_host(message, HV_E_FAIL); } @@ -668,7 +668,7 @@ void hv_kvp_onchannelcallback(void *context) * user-mode not responding. */ schedule_work(&kvp_sendkey_work); - schedule_delayed_work(&kvp_work, 5*HZ); + schedule_delayed_work(&kvp_timeout_work, 5*HZ); return; @@ -708,6 +708,6 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { cn_del_callback(&kvp_id); - cancel_delayed_work_sync(&kvp_work); + cancel_delayed_work_sync(&kvp_timeout_work); cancel_work_sync(&kvp_sendkey_work); } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 19/21] Drivers: hv: vss: full handshake support
From: Vitaly Kuznetsov Introduce VSS_OP_REGISTER1 to support kernel replying to the negotiation message with its own version. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_snapshot.c| 49 -- include/uapi/linux/hyperv.h |5 tools/hv/hv_vss_daemon.c| 14 3 files changed, 56 insertions(+), 12 deletions(-) diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index 2c8c246..ee1762b 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -59,6 +59,11 @@ static struct { static void vss_respond_to_host(int error); +/* + * This state maintains the version number registered by the daemon. + */ +static int dm_reg_value; + static const char vss_devname[] = "vmbus/hv_vss"; static __u8 *recv_buffer; static struct hvutil_transport *hvt; @@ -89,6 +94,29 @@ static void vss_timeout_func(struct work_struct *dummy) hv_vss_onchannelcallback); } +static int vss_handle_handshake(struct hv_vss_msg *vss_msg) +{ + u32 our_ver = VSS_OP_REGISTER1; + + switch (vss_msg->vss_hdr.operation) { + case VSS_OP_REGISTER: + /* Daemon doesn't expect us to reply */ + dm_reg_value = VSS_OP_REGISTER; + break; + case VSS_OP_REGISTER1: + /* Daemon expects us to reply with our own version*/ + if (hvutil_transport_send(hvt, &our_ver, sizeof(our_ver))) + return -EFAULT; + dm_reg_value = VSS_OP_REGISTER1; + break; + default: + return -EINVAL; + } + vss_transaction.state = HVUTIL_READY; + pr_info("VSS daemon registered\n"); + return 0; +} + static int vss_on_msg(void *msg, int len) { struct hv_vss_msg *vss_msg = (struct hv_vss_msg *)msg; @@ -96,18 +124,15 @@ static int vss_on_msg(void *msg, int len) if (len != sizeof(*vss_msg)) return -EINVAL; - /* -* Don't process registration messages if we're in the middle of -* a transaction processing. -*/ - if (vss_transaction.state > HVUTIL_READY && - vss_msg->vss_hdr.operation == VSS_OP_REGISTER) - return -EINVAL; - - if (vss_transaction.state == HVUTIL_DEVICE_INIT && - vss_msg->vss_hdr.operation == VSS_OP_REGISTER) { - pr_info("VSS daemon registered\n"); - vss_transaction.state = HVUTIL_READY; + if (vss_msg->vss_hdr.operation == VSS_OP_REGISTER || + vss_msg->vss_hdr.operation == VSS_OP_REGISTER1) { + /* +* Don't process registration messages if we're in the middle +* of a transaction processing. +*/ + if (vss_transaction.state > HVUTIL_READY) + return -EINVAL; + return vss_handle_handshake(vss_msg); } else if (vss_transaction.state == HVUTIL_USERSPACE_REQ) { vss_transaction.state = HVUTIL_USERSPACE_RECV; if (cancel_delayed_work_sync(&vss_timeout_work)) { diff --git a/include/uapi/linux/hyperv.h b/include/uapi/linux/hyperv.h index bb1cb73..66c76df 100644 --- a/include/uapi/linux/hyperv.h +++ b/include/uapi/linux/hyperv.h @@ -45,6 +45,11 @@ #define VSS_OP_REGISTER 128 +/* + Daemon code with full handshake support. + */ +#define VSS_OP_REGISTER1 129 + enum hv_vss_op { VSS_OP_CREATE = 0, VSS_OP_DELETE, diff --git a/tools/hv/hv_vss_daemon.c b/tools/hv/hv_vss_daemon.c index 36f1821..96234b6 100644 --- a/tools/hv/hv_vss_daemon.c +++ b/tools/hv/hv_vss_daemon.c @@ -148,6 +148,8 @@ int main(int argc, char *argv[]) int op; struct hv_vss_msg vss_msg[1]; int daemonize = 1, long_index = 0, opt; + int in_handshake = 1; + __u32 kernel_modver; static struct option long_options[] = { {"help",no_argument, 0, 'h' }, @@ -211,6 +213,18 @@ int main(int argc, char *argv[]) len = read(vss_fd, vss_msg, sizeof(struct hv_vss_msg)); + if (in_handshake) { + if (len != sizeof(kernel_modver)) { + syslog(LOG_ERR, "invalid version negotiation"); + exit(EXIT_FAILURE); + } + kernel_modver = *(__u32 *)vss_msg; + in_handshake = 0; + syslog(LOG_INFO, "VSS: kernel module version: %d", + kernel_modver); + continue; + } + if (len != sizeof(struct hv_vss_msg)) { syslog(LOG_ERR, "read failed; error:%
[PATCH 08/21] Drivers: hv: util: introduce state machine for util drivers
From: Vitaly Kuznetsov KVP/VSS/FCOPY drivers work in fully serialized mode: we wait till userspace daemon registers, wait for a message from the host, send this message to the daemon, get the reply, send it back to host, wait for another message. Introduce enum hvutil_device_state to represend this state in all 3 drivers. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hyperv_vmbus.h |9 + 1 files changed, 9 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 2f30456..138d663 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -755,4 +755,13 @@ static inline void hv_poll_channel(struct vmbus_channel *channel, cb(channel); } +enum hvutil_device_state { + HVUTIL_DEVICE_INIT = 0, /* driver is loaded, waiting for userspace */ + HVUTIL_READY,/* userspace is registered */ + HVUTIL_HOSTMSG_RECEIVED, /* message from the host was received */ + HVUTIL_USERSPACE_REQ,/* request to userspace was sent */ + HVUTIL_USERSPACE_RECV, /* reply from userspace was received */ + HVUTIL_DEVICE_DYING, /* driver unload is in progress */ +}; + #endif /* _HYPERV_VMBUS_H */ -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 15/21] Drivers: hv: fcopy: convert to hv_utils_transport
From: Vitaly Kuznetsov Unify the code with the recently introduced hv_utils_transport. Netlink communication is disabled for fcopy. Signed-off-by: Vitaly Kuznetsov Tested-by: Alex Ng Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c | 194 - 1 files changed, 46 insertions(+), 148 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index d1475e6..6a8ec9f 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -19,17 +19,13 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt -#include -#include #include #include -#include #include #include -#include -#include #include "hyperv_vmbus.h" +#include "hv_utils_transport.h" #define WIN8_SRV_MAJOR 1 #define WIN8_SRV_MINOR 1 @@ -53,18 +49,19 @@ static struct { int state; /* hvutil_device_state */ int recv_len; /* number of bytes received. */ struct hv_fcopy_hdr *fcopy_msg; /* current message */ - struct hv_start_fcopy message; /* sent to daemon */ struct vmbus_channel *recv_channel; /* chn we got the request */ u64 recv_req_id; /* request ID. */ void *fcopy_context; /* for the channel callback */ - struct semaphore read_sema; } fcopy_transaction; -static void fcopy_send_data(void); static void fcopy_respond_to_host(int error); +static void fcopy_send_data(struct work_struct *dummy); static void fcopy_timeout_func(struct work_struct *dummy); static DECLARE_DELAYED_WORK(fcopy_timeout_work, fcopy_timeout_func); +static DECLARE_WORK(fcopy_send_work, fcopy_send_data); +static const char fcopy_devname[] = "vmbus/hv_fcopy"; static u8 *recv_buffer; +static struct hvutil_transport *hvt; static void fcopy_timeout_func(struct work_struct *dummy) { @@ -78,17 +75,6 @@ static void fcopy_timeout_func(struct work_struct *dummy) if (fcopy_transaction.state > HVUTIL_READY) fcopy_transaction.state = HVUTIL_READY; - /* In the case the user-space daemon crashes, hangs or is killed, we -* need to down the semaphore, otherwise, after the daemon starts next -* time, the obsolete data in fcopy_transaction.message or -* fcopy_transaction.fcopy_msg will be used immediately. -* -* NOTE: fcopy_read() happens to get the semaphore (very rare)? We're -* still OK, because we've reported the failure to the host. -*/ - if (down_trylock(&fcopy_transaction.read_sema)) - ; - hv_poll_channel(fcopy_transaction.fcopy_context, hv_fcopy_onchannelcallback); } @@ -115,11 +101,13 @@ static int fcopy_handle_handshake(u32 version) return 0; } -static void fcopy_send_data(void) +static void fcopy_send_data(struct work_struct *dummy) { - struct hv_start_fcopy *smsg_out = &fcopy_transaction.message; + struct hv_start_fcopy smsg_out; int operation = fcopy_transaction.fcopy_msg->operation; struct hv_start_fcopy *smsg_in; + void *out_src; + int rc, out_len; /* * The strings sent from the host are encoded in @@ -134,26 +122,39 @@ static void fcopy_send_data(void) switch (operation) { case START_FILE_COPY: - memset(smsg_out, 0, sizeof(struct hv_start_fcopy)); - smsg_out->hdr.operation = operation; + out_len = sizeof(struct hv_start_fcopy); + memset(&smsg_out, 0, out_len); + smsg_out.hdr.operation = operation; smsg_in = (struct hv_start_fcopy *)fcopy_transaction.fcopy_msg; utf16s_to_utf8s((wchar_t *)smsg_in->file_name, W_MAX_PATH, UTF16_LITTLE_ENDIAN, - (__u8 *)smsg_out->file_name, W_MAX_PATH - 1); + (__u8 *)&smsg_out.file_name, W_MAX_PATH - 1); utf16s_to_utf8s((wchar_t *)smsg_in->path_name, W_MAX_PATH, UTF16_LITTLE_ENDIAN, - (__u8 *)smsg_out->path_name, W_MAX_PATH - 1); + (__u8 *)&smsg_out.path_name, W_MAX_PATH - 1); - smsg_out->copy_flags = smsg_in->copy_flags; - smsg_out->file_size = smsg_in->file_size; + smsg_out.copy_flags = smsg_in->copy_flags; + smsg_out.file_size = smsg_in->file_size; + out_src = &smsg_out; break; default: + out_src = fcopy_transaction.fcopy_msg; + out_len = fcopy_transaction.recv_len; break; } - up(&fcopy_transaction.read_sema); + + fcopy_transaction.state = HVUTIL_USERSPACE_REQ; + rc = hvutil_transport_send(hvt, out_src, out_len); + if (rc) { + pr_debug(&q
[PATCH 4/5] drivers: hv: vmbus: Get rid of some unused definitions
Get rid of some unused definitions. Signed-off-by: K. Y. Srinivasan --- include/linux/hyperv.h | 19 --- 1 files changed, 0 insertions(+), 19 deletions(-) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 1744148..e29ccdd 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -389,10 +389,6 @@ enum vmbus_channel_message_type { CHANNELMSG_INITIATE_CONTACT = 14, CHANNELMSG_VERSION_RESPONSE = 15, CHANNELMSG_UNLOAD = 16, -#ifdef VMBUS_FEATURE_PARENT_OR_PEER_MEMORY_MAPPED_INTO_A_CHILD - CHANNELMSG_VIEWRANGE_ADD= 17, - CHANNELMSG_VIEWRANGE_REMOVE = 18, -#endif CHANNELMSG_COUNT }; @@ -549,21 +545,6 @@ struct vmbus_channel_gpadl_torndown { u32 gpadl; } __packed; -#ifdef VMBUS_FEATURE_PARENT_OR_PEER_MEMORY_MAPPED_INTO_A_CHILD -struct vmbus_channel_view_range_add { - struct vmbus_channel_message_header header; - PHYSICAL_ADDRESS viewrange_base; - u64 viewrange_length; - u32 child_relid; -} __packed; - -struct vmbus_channel_view_range_remove { - struct vmbus_channel_message_header header; - PHYSICAL_ADDRESS viewrange_base; - u32 child_relid; -} __packed; -#endif - struct vmbus_channel_relid_released { struct vmbus_channel_message_header header; u32 child_relid; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/5] Drivers: hv: vmbus: unregister panic notifier on module unload
From: Vitaly Kuznetsov Commit 96c1d0581d00f7abe033350edb021a9d947d8d81 ("Drivers: hv: vmbus: Add support for VMBus panic notifier handler") introduced atomic_notifier_chain_register() call on module load. We also need to call atomic_notifier_chain_unregister() on module unload as otherwise the following crash is observed when we bring hv_vmbus back: [ 39.788877] BUG: unable to handle kernel paging request at a00078a8 [ 39.788877] IP: [] notifier_call_chain+0x3f/0x80 ... [ 39.788877] Call Trace: [ 39.788877] [] __atomic_notifier_call_chain+0x5d/0x90 ... [ 39.788877] [] ? atomic_notifier_chain_register+0x38/0x70 [ 39.788877] [] ? atomic_notifier_chain_register+0x17/0x70 [ 39.788877] [] hv_acpi_init+0x14f/0x1000 [hv_vmbus] [ 39.788877] [] do_one_initcall+0xd4/0x210 Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/vmbus_drv.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 0d8d1d7..7870a90 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1108,6 +1108,10 @@ static void __exit vmbus_exit(void) hv_synic_clockevents_cleanup(); hv_remove_vmbus_irq(); vmbus_free_channels(); + if (ms_hyperv.features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { + atomic_notifier_chain_unregister(&panic_notifier_list, +&hyperv_panic_block); + } bus_unregister(&hv_bus); hv_cleanup(); for_each_online_cpu(cpu) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 5/5] Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state
Implement the protocol for tearing down the monitor state established with the host. Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 25 + drivers/hv/connection.c |5 + drivers/hv/hyperv_vmbus.h |2 ++ include/linux/hyperv.h|1 + 4 files changed, 33 insertions(+), 0 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 865a3af..4b9d89a 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -422,6 +422,30 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui } /* + * vmbus_unload_response - Handler for the unload response. + */ +static void vmbus_unload_response(struct vmbus_channel_message_header *hdr) +{ + /* +* This is a global event; just wakeup the waiting thread. +* Once we successfully unload, we can cleanup the monitor state. +*/ + complete(&vmbus_connection.unload_event); +} + +void vmbus_initiate_unload(void) +{ + struct vmbus_channel_message_header hdr; + + init_completion(&vmbus_connection.unload_event); + memset(&hdr, 0, sizeof(struct vmbus_channel_message_header)); + hdr.msgtype = CHANNELMSG_UNLOAD; + vmbus_post_msg(&hdr, sizeof(struct vmbus_channel_message_header)); + + wait_for_completion(&vmbus_connection.unload_event); +} + +/* * vmbus_onoffer - Handler for channel offers from vmbus in parent partition. * */ @@ -717,6 +741,7 @@ struct vmbus_channel_message_table_entry {CHANNELMSG_INITIATE_CONTACT, 0, NULL}, {CHANNELMSG_VERSION_RESPONSE, 1, vmbus_onversion_response}, {CHANNELMSG_UNLOAD, 0, NULL}, + {CHANNELMSG_UNLOAD_RESPONSE,1, vmbus_unload_response}, }; /* diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index b27220a..acd50e9 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -227,6 +227,11 @@ cleanup: void vmbus_disconnect(void) { + /* +* First send the unload request to the host. +*/ + vmbus_initiate_unload(); + if (vmbus_connection.work_queue) { drain_workqueue(vmbus_connection.work_queue); destroy_workqueue(vmbus_connection.work_queue); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 138d663..cddc0c9 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -647,6 +647,7 @@ struct vmbus_connection { atomic_t next_gpadl_handle; + struct completion unload_event; /* * Represents channel interrupts. Each bit position represents a * channel. When a channel sends an interrupt via VMBUS, it finds its @@ -741,6 +742,7 @@ void hv_vss_onchannelcallback(void *); int hv_fcopy_init(struct hv_util_service *); void hv_fcopy_deinit(void); void hv_fcopy_onchannelcallback(void *); +void vmbus_initiate_unload(void); static inline void hv_poll_channel(struct vmbus_channel *channel, void (*cb)(void *)) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index e29ccdd..ea93486 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -389,6 +389,7 @@ enum vmbus_channel_message_type { CHANNELMSG_INITIATE_CONTACT = 14, CHANNELMSG_VERSION_RESPONSE = 15, CHANNELMSG_UNLOAD = 16, + CHANNELMSG_UNLOAD_RESPONSE = 17, CHANNELMSG_COUNT }; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 3/5] hv: vmbus_free_channels(): remove the redundant free_channel()
From: Dexuan Cui free_channel() has been invoked in vmbus_remove() -> hv_process_channel_removal(), or vmbus_remove() -> ... -> vmbus_close_internal() -> hv_process_channel_removal(). We also change to use list_for_each_entry_safe(), because the entry is removed in hv_process_channel_removal(). This patch fixes a bug in the vmbus unload path. Thank Dan Carpenter for finding the issue! Signed-off-by: Dexuan Cui Reported-by: Dan Carpenter Cc: K. Y. Srinivasan Cc: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 11 --- 1 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 0eeb1b3..865a3af 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -212,11 +212,16 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) void vmbus_free_channels(void) { - struct vmbus_channel *channel; + struct vmbus_channel *channel, *tmp; + + list_for_each_entry_safe(channel, tmp, &vmbus_connection.chn_list, + listentry) { + /* if we don't set rescind to true, vmbus_close_internal() +* won't invoke hv_process_channel_removal(). +*/ + channel->rescind = true; - list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) { vmbus_device_unregister(channel->device_obj); - free_channel(channel); } } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/5] Drivers: hv: vmbus: introduce vmbus_acpi_remove
From: Vitaly Kuznetsov In case we do request_resource() in vmbus_acpi_add() we need to tear it down to be able to load the driver again. Otherwise the following crash in observed when hv_vmbus unload/load sequence is performed on a Generation2 instance: [ 38.165701] BUG: unable to handle kernel paging request at a00075a0 [ 38.166315] IP: [] __request_resource+0x2f/0x50 [ 38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0 [ 38.166315] Oops: [#1] SMP [ 38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus] [ 38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486 [ 38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012 [ 38.166315] task: 88003f401cb0 ti: 88003f60c000 task.ti: 88003f60c000 [ 38.166315] RIP: 0010:[] [] __request_resource+0x2f/0x50 [ 38.166315] RSP: 0018:88003f60fb58 EFLAGS: 00010286 Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/vmbus_drv.c | 10 ++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index c85235e..0d8d1d7 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1035,6 +1035,15 @@ acpi_walk_err: return ret_val; } +static int vmbus_acpi_remove(struct acpi_device *device) +{ + int ret = 0; + + if (hyperv_mmio.start && hyperv_mmio.end) + ret = release_resource(&hyperv_mmio); + return ret; +} + static const struct acpi_device_id vmbus_acpi_device_ids[] = { {"VMBUS", 0}, {"VMBus", 0}, @@ -1047,6 +1056,7 @@ static struct acpi_driver vmbus_acpi_driver = { .ids = vmbus_acpi_device_ids, .ops = { .add = vmbus_acpi_add, + .remove = vmbus_acpi_remove, }, }; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/5] Drivers: hv: vmbus: Cleanup the vmbus unload path
This patch-set have several fixes to enable the clean unload of the vmbus. Typically, vmbus will not be unloadable when Linux is hosted on Hyper-V since the driver managing the root device needs the vmbus driver. Dexuan Cui (1): hv: vmbus_free_channels(): remove the redundant free_channel() K. Y. Srinivasan (2): drivers: hv: vmbus: Get rid of some unused definitions Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state Vitaly Kuznetsov (2): Drivers: hv: vmbus: introduce vmbus_acpi_remove Drivers: hv: vmbus: unregister panic notifier on module unload drivers/hv/channel_mgmt.c | 36 +--- drivers/hv/connection.c |5 + drivers/hv/hyperv_vmbus.h |2 ++ drivers/hv/vmbus_drv.c| 14 ++ include/linux/hyperv.h| 20 +--- 5 files changed, 55 insertions(+), 22 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 0/5] Drivers: hv: vmbus: Cleanup the vmbus unload path
This patch-set have several fixes to enable the clean unload of the vmbus. Typically, vmbus will not be unloadable when Linux is hosted on Hyper-V since the driver managing the root device needs the vmbus driver. In this version of the patch-set, the patch Drivers: hv: vmbus: Implement the protocol for tearing down vmbus was modified based on feedback from Vitaly Kuznetsov. Dexuan Cui (1): hv: vmbus_free_channels(): remove the redundant free_channel() K. Y. Srinivasan (2): drivers: hv: vmbus: Get rid of some unused definitions Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state Vitaly Kuznetsov (2): Drivers: hv: vmbus: introduce vmbus_acpi_remove Drivers: hv: vmbus: unregister panic notifier on module unload drivers/hv/channel_mgmt.c | 36 +--- drivers/hv/connection.c |5 + drivers/hv/hyperv_vmbus.h |2 ++ drivers/hv/vmbus_drv.c| 16 +++- include/linux/hyperv.h| 20 +--- 5 files changed, 56 insertions(+), 23 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 4/5] drivers: hv: vmbus: Get rid of some unused definitions
Get rid of some unused definitions. Signed-off-by: K. Y. Srinivasan --- include/linux/hyperv.h | 19 --- 1 files changed, 0 insertions(+), 19 deletions(-) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 1744148..e29ccdd 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -389,10 +389,6 @@ enum vmbus_channel_message_type { CHANNELMSG_INITIATE_CONTACT = 14, CHANNELMSG_VERSION_RESPONSE = 15, CHANNELMSG_UNLOAD = 16, -#ifdef VMBUS_FEATURE_PARENT_OR_PEER_MEMORY_MAPPED_INTO_A_CHILD - CHANNELMSG_VIEWRANGE_ADD= 17, - CHANNELMSG_VIEWRANGE_REMOVE = 18, -#endif CHANNELMSG_COUNT }; @@ -549,21 +545,6 @@ struct vmbus_channel_gpadl_torndown { u32 gpadl; } __packed; -#ifdef VMBUS_FEATURE_PARENT_OR_PEER_MEMORY_MAPPED_INTO_A_CHILD -struct vmbus_channel_view_range_add { - struct vmbus_channel_message_header header; - PHYSICAL_ADDRESS viewrange_base; - u64 viewrange_length; - u32 child_relid; -} __packed; - -struct vmbus_channel_view_range_remove { - struct vmbus_channel_message_header header; - PHYSICAL_ADDRESS viewrange_base; - u32 child_relid; -} __packed; -#endif - struct vmbus_channel_relid_released { struct vmbus_channel_message_header header; u32 child_relid; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 2/5] Drivers: hv: vmbus: unregister panic notifier on module unload
From: Vitaly Kuznetsov Commit 96c1d0581d00f7abe033350edb021a9d947d8d81 ("Drivers: hv: vmbus: Add support for VMBus panic notifier handler") introduced atomic_notifier_chain_register() call on module load. We also need to call atomic_notifier_chain_unregister() on module unload as otherwise the following crash is observed when we bring hv_vmbus back: [ 39.788877] BUG: unable to handle kernel paging request at a00078a8 [ 39.788877] IP: [] notifier_call_chain+0x3f/0x80 ... [ 39.788877] Call Trace: [ 39.788877] [] __atomic_notifier_call_chain+0x5d/0x90 ... [ 39.788877] [] ? atomic_notifier_chain_register+0x38/0x70 [ 39.788877] [] ? atomic_notifier_chain_register+0x17/0x70 [ 39.788877] [] hv_acpi_init+0x14f/0x1000 [hv_vmbus] [ 39.788877] [] do_one_initcall+0xd4/0x210 Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/vmbus_drv.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 0d8d1d7..7870a90 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1108,6 +1108,10 @@ static void __exit vmbus_exit(void) hv_synic_clockevents_cleanup(); hv_remove_vmbus_irq(); vmbus_free_channels(); + if (ms_hyperv.features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { + atomic_notifier_chain_unregister(&panic_notifier_list, +&hyperv_panic_block); + } bus_unregister(&hv_bus); hv_cleanup(); for_each_online_cpu(cpu) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 5/5] Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state
Implement the protocol for tearing down the monitor state established with the host. Signed-off-by: K. Y. Srinivasan Tested-by: Vitaly Kuznetsov --- Changes in V2: Call vmbus_disconnect earlier - Vitaly Kuznetsov drivers/hv/channel_mgmt.c | 25 + drivers/hv/connection.c |5 + drivers/hv/hyperv_vmbus.h |2 ++ drivers/hv/vmbus_drv.c|2 +- include/linux/hyperv.h|1 + 5 files changed, 34 insertions(+), 1 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 865a3af..4b9d89a 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -422,6 +422,30 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui } /* + * vmbus_unload_response - Handler for the unload response. + */ +static void vmbus_unload_response(struct vmbus_channel_message_header *hdr) +{ + /* +* This is a global event; just wakeup the waiting thread. +* Once we successfully unload, we can cleanup the monitor state. +*/ + complete(&vmbus_connection.unload_event); +} + +void vmbus_initiate_unload(void) +{ + struct vmbus_channel_message_header hdr; + + init_completion(&vmbus_connection.unload_event); + memset(&hdr, 0, sizeof(struct vmbus_channel_message_header)); + hdr.msgtype = CHANNELMSG_UNLOAD; + vmbus_post_msg(&hdr, sizeof(struct vmbus_channel_message_header)); + + wait_for_completion(&vmbus_connection.unload_event); +} + +/* * vmbus_onoffer - Handler for channel offers from vmbus in parent partition. * */ @@ -717,6 +741,7 @@ struct vmbus_channel_message_table_entry {CHANNELMSG_INITIATE_CONTACT, 0, NULL}, {CHANNELMSG_VERSION_RESPONSE, 1, vmbus_onversion_response}, {CHANNELMSG_UNLOAD, 0, NULL}, + {CHANNELMSG_UNLOAD_RESPONSE,1, vmbus_unload_response}, }; /* diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index b27220a..acd50e9 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -227,6 +227,11 @@ cleanup: void vmbus_disconnect(void) { + /* +* First send the unload request to the host. +*/ + vmbus_initiate_unload(); + if (vmbus_connection.work_queue) { drain_workqueue(vmbus_connection.work_queue); destroy_workqueue(vmbus_connection.work_queue); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 138d663..cddc0c9 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -647,6 +647,7 @@ struct vmbus_connection { atomic_t next_gpadl_handle; + struct completion unload_event; /* * Represents channel interrupts. Each bit position represents a * channel. When a channel sends an interrupt via VMBUS, it finds its @@ -741,6 +742,7 @@ void hv_vss_onchannelcallback(void *); int hv_fcopy_init(struct hv_util_service *); void hv_fcopy_deinit(void); void hv_fcopy_onchannelcallback(void *); +void vmbus_initiate_unload(void); static inline void hv_poll_channel(struct vmbus_channel *channel, void (*cb)(void *)) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 7870a90..2b56260 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1106,6 +1106,7 @@ static void __exit vmbus_exit(void) vmbus_connection.conn_state = DISCONNECTED; hv_synic_clockevents_cleanup(); + vmbus_disconnect(); hv_remove_vmbus_irq(); vmbus_free_channels(); if (ms_hyperv.features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { @@ -1118,7 +1119,6 @@ static void __exit vmbus_exit(void) smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); acpi_bus_unregister_driver(&vmbus_acpi_driver); hv_cpu_hotplug_quirk(false); - vmbus_disconnect(); } diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index e29ccdd..ea93486 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -389,6 +389,7 @@ enum vmbus_channel_message_type { CHANNELMSG_INITIATE_CONTACT = 14, CHANNELMSG_VERSION_RESPONSE = 15, CHANNELMSG_UNLOAD = 16, + CHANNELMSG_UNLOAD_RESPONSE = 17, CHANNELMSG_COUNT }; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 1/5] Drivers: hv: vmbus: introduce vmbus_acpi_remove
From: Vitaly Kuznetsov In case we do request_resource() in vmbus_acpi_add() we need to tear it down to be able to load the driver again. Otherwise the following crash in observed when hv_vmbus unload/load sequence is performed on a Generation2 instance: [ 38.165701] BUG: unable to handle kernel paging request at a00075a0 [ 38.166315] IP: [] __request_resource+0x2f/0x50 [ 38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0 [ 38.166315] Oops: [#1] SMP [ 38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus] [ 38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486 [ 38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012 [ 38.166315] task: 88003f401cb0 ti: 88003f60c000 task.ti: 88003f60c000 [ 38.166315] RIP: 0010:[] [] __request_resource+0x2f/0x50 [ 38.166315] RSP: 0018:88003f60fb58 EFLAGS: 00010286 Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/vmbus_drv.c | 10 ++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index c85235e..0d8d1d7 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1035,6 +1035,15 @@ acpi_walk_err: return ret_val; } +static int vmbus_acpi_remove(struct acpi_device *device) +{ + int ret = 0; + + if (hyperv_mmio.start && hyperv_mmio.end) + ret = release_resource(&hyperv_mmio); + return ret; +} + static const struct acpi_device_id vmbus_acpi_device_ids[] = { {"VMBUS", 0}, {"VMBus", 0}, @@ -1047,6 +1056,7 @@ static struct acpi_driver vmbus_acpi_driver = { .ids = vmbus_acpi_device_ids, .ops = { .add = vmbus_acpi_add, + .remove = vmbus_acpi_remove, }, }; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 3/5] hv: vmbus_free_channels(): remove the redundant free_channel()
From: Dexuan Cui free_channel() has been invoked in vmbus_remove() -> hv_process_channel_removal(), or vmbus_remove() -> ... -> vmbus_close_internal() -> hv_process_channel_removal(). We also change to use list_for_each_entry_safe(), because the entry is removed in hv_process_channel_removal(). This patch fixes a bug in the vmbus unload path. Thank Dan Carpenter for finding the issue! Signed-off-by: Dexuan Cui Reported-by: Dan Carpenter Cc: K. Y. Srinivasan Cc: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 11 --- 1 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 0eeb1b3..865a3af 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -212,11 +212,16 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) void vmbus_free_channels(void) { - struct vmbus_channel *channel; + struct vmbus_channel *channel, *tmp; + + list_for_each_entry_safe(channel, tmp, &vmbus_connection.chn_list, + listentry) { + /* if we don't set rescind to true, vmbus_close_internal() +* won't invoke hv_process_channel_removal(). +*/ + channel->rescind = true; - list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) { vmbus_device_unregister(channel->device_obj); - free_channel(channel); } } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] scsi: storvsc: Set the SRB flags correctly when no data transfer is needed
Set the SRB flags correctly when there is no data transfer. Cc: Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li --- drivers/scsi/storvsc_drv.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index d9dad90..3c6584f 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1600,8 +1600,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd) break; default: vm_srb->data_in = UNKNOWN_TYPE; - vm_srb->win8_extension.srb_flags |= (SRB_FLAGS_DATA_IN | -SRB_FLAGS_DATA_OUT); + vm_srb->win8_extension.srb_flags |= SRB_FLAGS_NO_DATA_TRANSFER; break; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/1] scsi: storvsc: Set the SRB flags correctly when no data transfer is needed
Set the SRB flags correctly when there is no data transfer. Cc: Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li --- drivers/scsi/storvsc_drv.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index d9dad90..3c6584f 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1600,8 +1600,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd) break; default: vm_srb->data_in = UNKNOWN_TYPE; - vm_srb->win8_extension.srb_flags |= (SRB_FLAGS_DATA_IN | -SRB_FLAGS_DATA_OUT); + vm_srb->win8_extension.srb_flags |= SRB_FLAGS_NO_DATA_TRANSFER; break; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net 1/1] hv_netvsc: Fix a bug in netvsc_start_xmit()
Commit commit b08cc79155fc26d0d112b1470d1ece5034651a4b eliminated memory allocation in the packet send path. This commit introduced a bug since it did not account for the case if the skb was cloned. Fix this bug by using the pre-reserved head room only if the skb is not cloned. Signed-off-by: K. Y. Srinivasan --- drivers/net/hyperv/netvsc_drv.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index a3a9d38..7eb0251 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -421,7 +421,7 @@ check_size: pkt_sz = sizeof(struct hv_netvsc_packet) + RNDIS_AND_PPI_SIZE; - if (head_room < pkt_sz) { + if (skb->cloned || head_room < pkt_sz) { packet = kmalloc(pkt_sz, GFP_ATOMIC); if (!packet) { /* out of memory, drop packet */ -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 net 1/1] hv_netvsc: Fix a bug in netvsc_start_xmit()
Commit b08cc79155fc26d0d112b1470d1ece5034651a4b eliminated memory allocation in the packet send path: "hv_netvsc: Eliminate memory allocation in the packet send path The network protocol used to communicate with the host is the remote ndis (rndis) protocol. We need to decorate each outgoing packet with a rndis header and additional rndis state (rndis per-packet state). To manage this state, we currently allocate memory in the transmit path. Eliminate this allocation by requesting additional head room in the skb." This commit introduced a bug since it did not account for the case if the skb was cloned. Fix this bug. Signed-off-by: K. Y. Srinivasan --- V2: Used skb_cow_head() based on Dave Miller's feedback V2: Fixed up the commit log based on feedback from Sergei Shtylyov drivers/net/hyperv/hyperv_net.h |1 - drivers/net/hyperv/netvsc.c |5 - drivers/net/hyperv/netvsc_drv.c | 27 +++ 3 files changed, 7 insertions(+), 26 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index a10b316..ae04eb5 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -128,7 +128,6 @@ struct ndis_tcp_ip_checksum_info; struct hv_netvsc_packet { /* Bookkeeping stuff */ u32 status; - bool part_of_skb; bool is_data_pkt; bool xmit_more; /* from skb */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 2e8ad06..6d0ac8f 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -889,11 +889,6 @@ int netvsc_send(struct hv_device *device, } else { packet->page_buf_cnt = 0; packet->total_data_buflen += msd_len; - if (!packet->part_of_skb) { - skb = (struct sk_buff *)(unsigned long)packet-> - send_completion_tid; - packet->send_completion_tid = 0; - } } if (msdp->pkt) diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index a3a9d38..19037a6 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -235,9 +235,6 @@ void netvsc_xmit_completion(void *context) struct sk_buff *skb = (struct sk_buff *) (unsigned long)packet->send_completion_tid; - if (!packet->part_of_skb) - kfree(packet); - if (skb) dev_kfree_skb_any(skb); } @@ -389,7 +386,6 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) u32 net_trans_info; u32 hash; u32 skb_length; - u32 head_room; u32 pkt_sz; struct hv_page_buffer page_buf[MAX_PAGE_BUFFER_COUNT]; @@ -402,7 +398,6 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) check_size: skb_length = skb->len; - head_room = skb_headroom(skb); num_data_pgs = netvsc_get_slots(skb) + 2; if (num_data_pgs > MAX_PAGE_BUFFER_COUNT && linear) { net_alert_ratelimited("packet too big: %u pages (%u bytes)\n", @@ -421,20 +416,14 @@ check_size: pkt_sz = sizeof(struct hv_netvsc_packet) + RNDIS_AND_PPI_SIZE; - if (head_room < pkt_sz) { - packet = kmalloc(pkt_sz, GFP_ATOMIC); - if (!packet) { - /* out of memory, drop packet */ - netdev_err(net, "unable to alloc hv_netvsc_packet\n"); - ret = -ENOMEM; - goto drop; - } - packet->part_of_skb = false; - } else { - /* Use the headroom for building up the packet */ - packet = (struct hv_netvsc_packet *)skb->head; - packet->part_of_skb = true; + ret = skb_cow_head(skb, pkt_sz); + if (ret) { + netdev_err(net, "unable to alloc hv_netvsc_packet\n"); + ret = -ENOMEM; + goto drop; } + /* Use the headroom for building up the packet */ + packet = (struct hv_netvsc_packet *)skb->head; packet->status = 0; packet->xmit_more = skb->xmit_more; @@ -591,8 +580,6 @@ drop: net->stats.tx_bytes += skb_length; net->stats.tx_packets++; } else { - if (packet && !packet->part_of_skb) - kfree(packet); if (ret != -EAGAIN) { dev_kfree_skb_any(skb); net->stats.tx_dropped++; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net-next 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host
Based on the information given to this driver (via the xmit_more skb flag), we can defer signaling the host if more packets are on the way. This will help make the host more efficient since it can potentially process a larger batch of packets. Implement this optimization. Signed-off-by: K. Y. Srinivasan --- drivers/net/hyperv/netvsc.c | 11 +++ 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 2e8ad06..9ca416f 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -743,6 +743,7 @@ static inline int netvsc_send_pkt( u64 req_id; int ret; struct hv_page_buffer *pgbuf; + u32 vmbus_flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT; if (packet->is_data_pkt) { @@ -772,19 +773,21 @@ static inline int netvsc_send_pkt( if (packet->page_buf_cnt) { pgbuf = packet->cp_partial ? packet->page_buf + packet->rmsg_pgcnt : packet->page_buf; - ret = vmbus_sendpacket_pagebuffer(out_channel, + ret = vmbus_sendpacket_pagebuffer_ctl(out_channel, pgbuf, packet->page_buf_cnt, &nvmsg, sizeof(struct nvsp_message), - req_id); + req_id, + vmbus_flags, + !packet->xmit_more); } else { - ret = vmbus_sendpacket( + ret = vmbus_sendpacket_ctl( out_channel, &nvmsg, sizeof(struct nvsp_message), req_id, VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); + vmbus_flags, !packet->xmit_more); } if (ret == 0) { -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 net-next 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host
Based on the information given to this driver (via the xmit_more skb flag), we can defer signaling the host if more packets are on the way. This will help make the host more efficient since it can potentially process a larger batch of packets. Implement this optimization. Signed-off-by: K. Y. Srinivasan --- v2: Fixed up indentation based on feedback from David Miller. drivers/net/hyperv/netvsc.c | 20 1 files changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 2e8ad06..5fdc5e1 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -743,6 +743,7 @@ static inline int netvsc_send_pkt( u64 req_id; int ret; struct hv_page_buffer *pgbuf; + u32 vmbus_flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT; if (packet->is_data_pkt) { @@ -772,19 +773,22 @@ static inline int netvsc_send_pkt( if (packet->page_buf_cnt) { pgbuf = packet->cp_partial ? packet->page_buf + packet->rmsg_pgcnt : packet->page_buf; - ret = vmbus_sendpacket_pagebuffer(out_channel, - pgbuf, - packet->page_buf_cnt, - &nvmsg, - sizeof(struct nvsp_message), - req_id); + ret = vmbus_sendpacket_pagebuffer_ctl(out_channel, + pgbuf, + packet->page_buf_cnt, + &nvmsg, + sizeof(struct +nvsp_message), + req_id, + vmbus_flags, + !packet->xmit_more); } else { - ret = vmbus_sendpacket( + ret = vmbus_sendpacket_ctl( out_channel, &nvmsg, sizeof(struct nvsp_message), req_id, VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); + vmbus_flags, !packet->xmit_more); } if (ret == 0) { -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V2 1/1] scsi: storvsc: Set the SRB flags correctly when no data transfer is needed
Set the SRB flags correctly when there is no data transfer. Without this change some IHV drivers will fail valid commands such as TEST_UNIT_READY. Cc: Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li --- V2: Added additional details to the commit log - Dan Carpenter drivers/scsi/storvsc_drv.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index d9dad90..3c6584f 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1600,8 +1600,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd) break; default: vm_srb->data_in = UNKNOWN_TYPE; - vm_srb->win8_extension.srb_flags |= (SRB_FLAGS_DATA_IN | -SRB_FLAGS_DATA_OUT); + vm_srb->win8_extension.srb_flags |= SRB_FLAGS_NO_DATA_TRANSFER; break; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V3 net-next 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host
Based on the information given to this driver (via the xmit_more skb flag), we can defer signaling the host if more packets are on the way. This will help make the host more efficient since it can potentially process a larger batch of packets. Implement this optimization. Signed-off-by: K. Y. Srinivasan --- v2: Fixed up indentation based on feedback from David Miller. v3: If the queue is stopped, deal with that condition: Eric Dumazet drivers/net/hyperv/netvsc.c | 20 drivers/net/hyperv/netvsc_drv.c |2 ++ 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ea091bc..36fef17 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -743,6 +743,7 @@ static inline int netvsc_send_pkt( u64 req_id; int ret; struct hv_page_buffer *pgbuf; + u32 vmbus_flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT; if (packet->is_data_pkt) { @@ -772,19 +773,22 @@ static inline int netvsc_send_pkt( if (packet->page_buf_cnt) { pgbuf = packet->cp_partial ? packet->page_buf + packet->rmsg_pgcnt : packet->page_buf; - ret = vmbus_sendpacket_pagebuffer(out_channel, - pgbuf, - packet->page_buf_cnt, - &nvmsg, - sizeof(struct nvsp_message), - req_id); + ret = vmbus_sendpacket_pagebuffer_ctl(out_channel, + pgbuf, + packet->page_buf_cnt, + &nvmsg, + sizeof(struct +nvsp_message), + req_id, + vmbus_flags, + !packet->xmit_more); } else { - ret = vmbus_sendpacket( + ret = vmbus_sendpacket_ctl( out_channel, &nvmsg, sizeof(struct nvsp_message), req_id, VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); + vmbus_flags, !packet->xmit_more); } if (ret == 0) { diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 5993c7e..4efaa6e 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -435,6 +435,8 @@ check_size: packet->page_buf = page_buf; packet->q_idx = skb_get_queue_mapping(skb); + if (netif_tx_queue_stopped(netdev_get_tx_queue(net, packet->q_idx))) + packet->xmit_more = false; packet->is_data_pkt = true; packet->total_data_buflen = skb->len; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH V4 net-next 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host
Based on the information given to this driver (via the xmit_more skb flag), we can defer signaling the host if more packets are on the way. This will help make the host more efficient since it can potentially process a larger batch of packets. Implement this optimization. Signed-off-by: K. Y. Srinivasan --- v2: Fixed up indentation based on feedback from David Miller. v3,v4: If the queue could be stopped, deal with that condition: Eric Dumazet drivers/net/hyperv/netvsc.c | 34 -- 1 files changed, 24 insertions(+), 10 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ea091bc..1ab4f9e 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -743,6 +743,8 @@ static inline int netvsc_send_pkt( u64 req_id; int ret; struct hv_page_buffer *pgbuf; + u32 vmbus_flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; + u32 ring_avail = hv_ringbuf_avail_percent(&out_channel->outbound); nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT; if (packet->is_data_pkt) { @@ -769,30 +771,42 @@ static inline int netvsc_send_pkt( if (out_channel->rescind) return -ENODEV; + /* +* It is possible that once we successfully place this packet +* on the ringbuffer, we may stop the queue. In that case, we want +* to notify the host independent of the xmit_more flag. We don't +* need to be precise here; in the worst case we may signal the host +* unnecessarily. +*/ + if (ring_avail < (RING_AVAIL_PERCENT_LOWATER + 1)) + packet->xmit_more = false; + if (packet->page_buf_cnt) { pgbuf = packet->cp_partial ? packet->page_buf + packet->rmsg_pgcnt : packet->page_buf; - ret = vmbus_sendpacket_pagebuffer(out_channel, - pgbuf, - packet->page_buf_cnt, - &nvmsg, - sizeof(struct nvsp_message), - req_id); + ret = vmbus_sendpacket_pagebuffer_ctl(out_channel, + pgbuf, + packet->page_buf_cnt, + &nvmsg, + sizeof(struct +nvsp_message), + req_id, + vmbus_flags, + !packet->xmit_more); } else { - ret = vmbus_sendpacket( + ret = vmbus_sendpacket_ctl( out_channel, &nvmsg, sizeof(struct nvsp_message), req_id, VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); + vmbus_flags, !packet->xmit_more); } if (ret == 0) { atomic_inc(&net_device->num_outstanding_sends); atomic_inc(&net_device->queue_sends[q_idx]); - if (hv_ringbuf_avail_percent(&out_channel->outbound) < - RING_AVAIL_PERCENT_LOWATER) { + if (ring_avail < RING_AVAIL_PERCENT_LOWATER) { netif_tx_stop_queue(netdev_get_tx_queue( ndev, q_idx)); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 7/7] Drivers: hv: vmbus: distribute subchannels among all vcpus
From: Vitaly Kuznetsov Primary channels are distributed evenly across all vcpus we have. When the host asks us to create subchannels it usually makes us num_cpus-1 offers and we are supposed to distribute the work evenly among the channel itself and all its subchannels. Make sure they are all assigned to different vcpus. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 29 - 1 files changed, 28 insertions(+), 1 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 655c0a0..1f1417d 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -387,6 +387,8 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui int i; bool perf_chn = false; u32 max_cpus = num_online_cpus(); + struct vmbus_channel *primary = channel->primary_channel, *prev; + unsigned long flags; for (i = IDE; i < MAX_PERF_CHN; i++) { if (!memcmp(type_guid->b, hp_devs[i].guid, @@ -407,7 +409,32 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui channel->target_vp = 0; return; } - cur_cpu = (++next_vp % max_cpus); + + /* +* Primary channels are distributed evenly across all vcpus we have. +* When the host asks us to create subchannels it usually makes us +* num_cpus-1 offers and we are supposed to distribute the work evenly +* among the channel itself and all its subchannels. Make sure they are +* all assigned to different vcpus. +*/ + if (!primary) + cur_cpu = (++next_vp % max_cpus); + else { + /* +* Let's assign the first subchannel of a channel to the +* primary->target_cpu+1 and all the subsequent channels to +* the prev->target_cpu+1. +*/ + spin_lock_irqsave(&primary->lock, flags); + if (primary->num_sc == 1) + cur_cpu = (primary->target_cpu + 1) % max_cpus; + else { + prev = list_prev_entry(channel, sc_list); + cur_cpu = (prev->target_cpu + 1) % max_cpus; + } + spin_unlock_irqrestore(&primary->lock, flags); + } + channel->target_cpu = cur_cpu; channel->target_vp = hv_context.vp_index[cur_cpu]; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 4/7] Drivers: hv: vmbus: briefly comment num_sc and next_oc
From: Vitaly Kuznetsov <[mailto:vkuzn...@redhat.com]> next_oc and num_sc fields of struct vmbus_channel deserve a description. Move them closer to sc_list as these fields are related to it. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- include/linux/hyperv.h | 12 +--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index ea93486..3932a99 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -727,6 +727,15 @@ struct vmbus_channel { */ struct list_head sc_list; /* +* Current number of sub-channels. +*/ + int num_sc; + /* +* Number of a sub-channel (position within sc_list) which is supposed +* to be used as the next outgoing channel. +*/ + int next_oc; + /* * The primary channel this sub-channel belongs to. * This will be NULL for the primary channel. */ @@ -740,9 +749,6 @@ struct vmbus_channel { * link up channels based on their CPU affinity. */ struct list_head percpu_list; - - int num_sc; - int next_oc; }; static inline void set_channel_read_state(struct vmbus_channel *c, bool state) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/7] Drivers: hv: vmbus: kill tasklets on module unload
From: Vitaly Kuznetsov Explicitly kill tasklets we create on module unload. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/vmbus_drv.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 2b56260..cf20400 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1108,6 +1108,7 @@ static void __exit vmbus_exit(void) hv_synic_clockevents_cleanup(); vmbus_disconnect(); hv_remove_vmbus_irq(); + tasklet_kill(&msg_dpc); vmbus_free_channels(); if (ms_hyperv.features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { atomic_notifier_chain_unregister(&panic_notifier_list, @@ -1115,8 +1116,10 @@ static void __exit vmbus_exit(void) } bus_unregister(&hv_bus); hv_cleanup(); - for_each_online_cpu(cpu) + for_each_online_cpu(cpu) { + tasklet_kill(hv_context.event_dpc[cpu]); smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); + } acpi_bus_unregister_driver(&vmbus_acpi_driver); hv_cpu_hotplug_quirk(false); } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel