RE: Inconsistent lock state with Hyper-V memory balloon?
Sitsofe, can you try the patch attached to see if it helps with the problem? Long -Original Message- From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Peter Zijlstra Sent: Monday, November 10, 2014 1:44 AM To: Sitsofe Wheeler Cc: KY Srinivasan; Haiyang Zhang; de...@linuxdriverproject.org; Ingo Molnar; linux-ker...@vger.kernel.org Subject: Re: Inconsistent lock state with Hyper-V memory balloon? On Sat, Nov 08, 2014 at 02:36:54PM +, Sitsofe Wheeler wrote: > I've been trying to use the Hyper-V balloon driver to allow the host > to reclaim unused memory but have been hitting issues. With a Hyper-V > 2012 > R2 guest with 4GBytes of RAM, dynamic memory on, 1GByte minimum > 10GByte maximum, 8 vcpus, running a 3.18.0-rc3 kernel with no swap > configured the following lockdep splat occurred: > > = > [ INFO: inconsistent lock state ] > 3.18.0-rc3.x86_64 #159 Not tainted > - > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes: > (bdev_lock){+.?...}, at: [] > nr_blockdev_pages+0x1c/0x80 {SOFTIRQ-ON-W} state was registered at: > [] __lock_acquire+0x87d/0x1c60 > [] lock_acquire+0xfc/0x150 > [] _raw_spin_lock+0x39/0x50 > [] nr_blockdev_pages+0x1c/0x80 > [] si_meminfo+0x47/0x70 > [] eventpoll_init+0x11/0x10a > [] do_one_initcall+0xf9/0x1a7 > [] kernel_init_freeable+0x1d4/0x268 > [] kernel_init+0xe/0x100 > [] ret_from_fork+0x7c/0xb0 irq event stamp: > 2660283708 hardirqs last enabled at (2660283708): > [] free_hot_cold_page+0x175/0x190 hardirqs last > disabled at (2660283707): [] > free_hot_cold_page+0xa5/0x190 softirqs last enabled at (2660132034): > [] _local_bh_enable+0x4a/0x50 softirqs last disabled > at (2660132035): [] irq_exit+0x58/0xc0 > > might help us debug this: > Possible unsafe locking scenario: > >CPU0 > > lock(bdev_lock); > > lock(bdev_lock); > > * > > no locks held by swapper/0/0. > > > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc3.x86_64 #159 > Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, > BIOS 090006 05/23/2012 > 8266ac90 880107403af8 816db3ef > 81c134c0 880107403b58 816d6fd3 0001 > 0001 8801 81010e6f 0046 > Call Trace: >[] dump_stack+0x4e/0x68 > [] print_usage_bug+0x1f3/0x204 [] > ? save_stack_trace+0x2f/0x50 [] ? > print_irq_inversion_bug+0x200/0x200 > [] mark_lock+0x176/0x2e0 [] > __lock_acquire+0x7c3/0x1c60 [] ? > lookup_address+0x28/0x30 [] ? > _lookup_address_cpa.isra.3+0x3b/0x40 > [] ? __debug_check_no_obj_freed+0x89/0x220 > [] lock_acquire+0xfc/0x150 [] ? > nr_blockdev_pages+0x1c/0x80 [] > _raw_spin_lock+0x39/0x50 [] ? > nr_blockdev_pages+0x1c/0x80 [] > nr_blockdev_pages+0x1c/0x80 [] si_meminfo+0x47/0x70 > [] post_status.isra.3+0x6d/0x190 > [] ? trace_hardirqs_on+0xd/0x10 > [] ? __free_pages+0x2f/0x60 [] ? > free_balloon_pages.isra.5+0x8f/0xb0 > [] balloon_onchannelcallback+0x212/0x380 > [] vmbus_on_event+0x173/0x1d0 [] > tasklet_action+0x127/0x160 [] > __do_softirq+0x18a/0x340 [] irq_exit+0x58/0xc0 > [] hyperv_vector_handler+0x45/0x60 > [] hyperv_callback_vector+0x72/0x80 > [] ? native_safe_halt+0x6/0x10 [] > ? trace_hardirqs_on+0xd/0x10 [] > default_idle+0x51/0xf0 [] arch_cpu_idle+0xf/0x20 > [] cpu_startup_entry+0x217/0x3f0 > [] rest_init+0xc9/0xd0 [] ? > rest_init+0x5/0xd0 [] start_kernel+0x438/0x445 > [] ? set_init_arg+0x57/0x57 [] ? > early_idt_handlers+0x120/0x120 [] > x86_64_start_reservations+0x2a/0x2c > [] x86_64_start_kernel+0x13e/0x14d > > Any help deciphering the above is greatly appreciated! Its fairly simple, the first trace shows where bdev_lock was taken with softirqs enabled, and the second trace shows where its taken from softirqs. Combine the two and you've got a recursive deadlock. I don't know the block layer very well, but a quick glance at the code shows its bdev_lock isn't meant to be used from softirq context, therefore the hyperv stuff is broken. So complain to the hyperv people. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ 0001-Move-unballoon-to-work-queue.patch Description: 0001-Move-unballoon-to-work-queue.patch ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
drivers:scsi:storvsc: Fix a bug in handling ring buffer failures that may result in I/O freeze
When ring buffer returns an error indicating retry, storvsc may not return a proper error code to SCSI when bounce buffer is not used. This has introduced I/O freeze on RAID running atop storvsc devices. This patch fixes it by always returning a proper error code. Signed-off-by: Long Li Reviewed-by: K. Y. Srinivasan --- drivers/scsi/storvsc_drv.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index e3ba251..4cff0dd 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1688,13 +1688,12 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd) if (ret == -EAGAIN) { /* no more space */ - if (cmd_request->bounce_sgl_count) { + if (cmd_request->bounce_sgl_count) destroy_bounce_buffer(cmd_request->bounce_sgl, cmd_request->bounce_sgl_count); - ret = SCSI_MLQUEUE_DEVICE_BUSY; - goto queue_error; - } + ret = SCSI_MLQUEUE_DEVICE_BUSY; + goto queue_error; } return 0; -- 2.1.0 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] scsi:storvsc enable reading from VPD pages on SPC-2
MSFT targets currently claim SPC-2 compliance while they implement post SPC-2 features. With this patch we can correctly handle WRITE_SAME_16 issues. This patch fixes an issue where the flag is setup too late in drive initialization process. Reviewed-by: K. Y. Srinivasan Signed-off-by: Long Li --- drivers/scsi/scsi_devinfo.c | 1 + drivers/scsi/storvsc_drv.c | 10 -- 2 files changed, 1 insertion(+), 10 deletions(-) diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c index c1d04d4..f6fe0b2 100644 --- a/drivers/scsi/scsi_devinfo.c +++ b/drivers/scsi/scsi_devinfo.c @@ -211,6 +211,7 @@ static struct { {"Medion", "Flash XL MMC/SD", "2.6D", BLIST_FORCELUN}, {"MegaRAID", "LD", NULL, BLIST_FORCELUN}, {"MICROP", "4110", NULL, BLIST_NOTQ}, + {"Msft", "Virtual Disk", "1.0", BLIST_TRY_VPD_PAGES}, {"MYLEX", "DACARMRB", "*", BLIST_REPORTLUN2}, {"nCipher", "Fastness Crypto", NULL, BLIST_FORCELUN}, {"NAKAMICH", "MJ-4.8S", NULL, BLIST_FORCELUN | BLIST_SINGLELUN}, diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index e3ba251..2452bb4 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -327,8 +327,6 @@ MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer size (bytes)"); */ static int storvsc_timeout = 180; -static int msft_blist_flags = BLIST_TRY_VPD_PAGES; - #define STORVSC_MAX_IO_REQUESTS200 static void storvsc_on_channel_callback(void *context); @@ -1439,14 +1437,6 @@ static int storvsc_device_configure(struct scsi_device *sdevice) sdevice->no_write_same = 1; - /* -* Add blist flags to permit the reading of the VPD pages even when -* the target may claim SPC-2 compliance. MSFT targets currently -* claim SPC-2 compliance while they implement post SPC-2 features. -* With this patch we can correctly handle WRITE_SAME_16 issues. -*/ - sdevice->sdev_bflags |= msft_blist_flags; - return 0; } -- 2.1.0 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] scsi:storvsc enable reading from VPD pages on SPC-2
> >>>>> "Long" == Long Li writes: > > Long> MSFT targets currently claim SPC-2 compliance while they implement > Long> post SPC-2 features. With this patch we can correctly handle > Long> WRITE_SAME_16 issues. > > Handle the issues or handle WRITE SAME(10/16)? With this patch, the SCSI layer will be able to correctly send WRITE_SAME_16 to the Hyper-V host. It will not send WRITE_SAME_10: it has been disabled in the driver template. Do you want me to send another patch with these details? > > + {"Msft", "Virtual Disk", "1.0", BLIST_TRY_VPD_PAGES}, > > Is that version field meaningful or is it safe for us to inquire about VPD > pages > without problems on older versions? This version is used in all current Hyper-V hosts: Windows 2008 R2, 2012 and 2012 R2. However it may change in future Windows releases. > > -- > Martin K. PetersenOracle Linux Engineering ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] scsi:storvsc enable reading from VPD pages on SPC-2
> -Original Message- > From: Sitsofe Wheeler [mailto:sits...@gmail.com] > Sent: Wednesday, December 10, 2014 12:58 PM > To: Long Li > Cc: KY Srinivasan; Haiyang Zhang; jbottom...@parallels.com; linux- > s...@vger.kernel.org; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH] scsi:storvsc enable reading from VPD pages on SPC-2 > > On Wed, Dec 10, 2014 at 12:38:25AM -0800, Long Li wrote: > > MSFT targets currently claim SPC-2 compliance while they implement > > post SPC-2 features. With this patch we can correctly handle > > WRITE_SAME_16 issues. > > > > This patch fixes an issue where the flag is setup too late in drive > > initialization process. > > Li, is this a fix for the issue I keep whinging about over in > https://lkml.org/lkml/2014/10/21/23 ? > > If is the same issue you may want to CC Martin on this... This is only for fixing WRITE_SAME_16. The rest of the thin provisioning (e.g. UNMAP) is still broken on SPC-2. > > -- > Sitsofe | http://sucs.org/~sits/ ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] scsi:storvsc enable reading from VPD pages on SPC-2
Thanks Martin for the explanation. I'll send out another patch. > -Original Message- > From: Martin K. Petersen [mailto:martin.peter...@oracle.com] > Sent: Thursday, December 11, 2014 7:04 PM > To: Long Li > Cc: Martin K. Petersen; KY Srinivasan; Haiyang Zhang; > jbottom...@parallels.com; linux-s...@vger.kernel.org; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org > Subject: Re: [PATCH] scsi:storvsc enable reading from VPD pages on SPC-2 > > >>>>> "Long" == Long Li writes: > > >> Handle the issues or handle WRITE SAME(10/16)? > > Long> With this patch, the SCSI layer will be able to correctly send > Long> WRITE_SAME_16 to the Hyper-V host. It will not send WRITE_SAME_10: > Long> it has been disabled in the driver template. Do you want me to > Long> send another patch with these details? > > no_write_same prevents us from attempting to use WRITE SAME(10/16) to zero > block ranges. > > This is completely orthogonal to using the WRITE SAME(10/16) commands with > the UNMAP bit set to discard block ranges. The latter is controlled by the > logical > block provisioning heuristics and is not affected by no_write_same at all. > > -- > Martin K. PetersenOracle Linux Engineering ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/4] Drivers: scsi: storvsc: In responce to a scan event, scan the host
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Tuesday, December 16, 2014 1:22 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; > jbottom...@parallels.com; h...@infradead.org; linux-s...@vger.kernel.org > Subject: [PATCH 1/4] Drivers: scsi: storvsc: In responce to a scan event, scan > the host > > The virtual HBA that storvsc implements can support multiple channels and > targets. So, scan the host when the host notifies that a scan is needed. > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > --- > drivers/scsi/storvsc_drv.c | 19 +++ > 1 files changed, 7 insertions(+), 12 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > e3ba251..0a96fef 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -426,21 +426,16 @@ done: > kfree(wrk); > } > > -static void storvsc_bus_scan(struct work_struct *work) > +static void storvsc_host_scan(struct work_struct *work) > { > struct storvsc_scan_work *wrk; > - int id, order_id; > + struct Scsi_Host *host; > > wrk = container_of(work, struct storvsc_scan_work, work); > - for (id = 0; id < wrk->host->max_id; ++id) { > - if (wrk->host->reverse_ordering) > - order_id = wrk->host->max_id - id - 1; > - else > - order_id = id; > - > - scsi_scan_target(&wrk->host->shost_gendev, 0, > - order_id, SCAN_WILD_CARD, 1); > - } > + host = wrk->host; > + > + scsi_scan_host(host); > + > kfree(wrk); > } > > @@ -1198,7 +1193,7 @@ static void storvsc_on_receive(struct hv_device > *device, > if (!work) > return; > > - INIT_WORK(&work->work, storvsc_bus_scan); > + INIT_WORK(&work->work, storvsc_host_scan); > work->host = stor_device->host; > schedule_work(&work->work); > break; > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/4] Drivers: scsi: storvsc: Force discovery of LUNs that may have been removed.
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Tuesday, December 16, 2014 1:22 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; > jbottom...@parallels.com; h...@infradead.org; linux-s...@vger.kernel.org > Subject: [PATCH 2/4] Drivers: scsi: storvsc: Force discovery of LUNs that may > have been removed. > > The host asks the guest to scan when a LUN is removed or added. > The only way a guest can identify the removed LUN is when an I/O is > attempted on a removed LUN - the SRB status code indicates that the LUN is > invalid. We currently handle this SRB status and remove the device. > > Rather than waiting for an I/O to remove the device, force the discovery of > LUNs that may have been removed prior to discovering LUNs that may have > been added. > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > --- > drivers/scsi/storvsc_drv.c | 26 ++ > 1 files changed, 26 insertions(+), 0 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 0a96fef..a7163c6 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -430,10 +430,36 @@ static void storvsc_host_scan(struct work_struct > *work) { > struct storvsc_scan_work *wrk; > struct Scsi_Host *host; > + struct scsi_device *sdev; > + unsigned long flags; > > wrk = container_of(work, struct storvsc_scan_work, work); > host = wrk->host; > > + /* > + * Before scanning the host, first check to see if any of the > + * currrently known devices have been hot removed. We issue a > + * "unit ready" command against all currently known devices. > + * This I/O will result in an error for devices that have been > + * removed. As part of handling the I/O error, we remove the device. > + * > + * When a LUN is added or removed, the host sends us a signal to > + * scan the host. Thus we are forced to discover the LUNs that > + * may have been removed this way. > + */ > + mutex_lock(&host->scan_mutex); > + spin_lock_irqsave(host->host_lock, flags); > + list_for_each_entry(sdev, &host->__devices, siblings) { > + spin_unlock_irqrestore(host->host_lock, flags); > + scsi_test_unit_ready(sdev, 1, 1, NULL); > + spin_lock_irqsave(host->host_lock, flags); > + continue; > + } > + spin_unlock_irqrestore(host->host_lock, flags); > + mutex_unlock(&host->scan_mutex); > + /* > + * Now scan the host to discover LUNs that may have been added. > + */ > scsi_scan_host(host); > > kfree(wrk); > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 3/4] Drivers: scsi: storvsc: Fix a bug in storvsc limits
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Tuesday, December 16, 2014 1:22 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; > jbottom...@parallels.com; h...@infradead.org; linux-s...@vger.kernel.org > Subject: [PATCH 3/4] Drivers: scsi: storvsc: Fix a bug in storvsc limits > > Commit 4cd83ecdac20d30725b4f96e5d7814a1e290bc7e changed the limits to > reflect the values on the host. It turns out that WS2008R2 cannot correctly > handle these new limits. Fix this bug by setting the limits based on the host. > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > --- > drivers/scsi/storvsc_drv.c | 15 --- > 1 files changed, 12 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > a7163c6..fdc5164 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -1782,6 +1782,9 @@ static int storvsc_probe(struct hv_device *device, > bool dev_is_ide = ((dev_id->driver_data == IDE_GUID) ? true : false); > int target = 0; > struct storvsc_device *stor_device; > + int max_luns_per_target; > + int max_targets; > + int max_channels; > > /* >* Based on the windows host we are running on, @@ -1795,12 > +1798,18 @@ static int storvsc_probe(struct hv_device *device, > vmscsi_size_delta = sizeof(struct vmscsi_win8_extension); > vmstor_current_major = VMSTOR_WIN7_MAJOR; > vmstor_current_minor = VMSTOR_WIN7_MINOR; > + max_luns_per_target = > STORVSC_IDE_MAX_LUNS_PER_TARGET; > + max_targets = STORVSC_IDE_MAX_TARGETS; > + max_channels = STORVSC_IDE_MAX_CHANNELS; > break; > default: > sense_buffer_size = > POST_WIN7_STORVSC_SENSE_BUFFER_SIZE; > vmscsi_size_delta = 0; > vmstor_current_major = VMSTOR_WIN8_MAJOR; > vmstor_current_minor = VMSTOR_WIN8_MINOR; > + max_luns_per_target = STORVSC_MAX_LUNS_PER_TARGET; > + max_targets = STORVSC_MAX_TARGETS; > + max_channels = STORVSC_MAX_CHANNELS; > break; > } > > @@ -1848,9 +1857,9 @@ static int storvsc_probe(struct hv_device *device, > break; > > case SCSI_GUID: > - host->max_lun = STORVSC_MAX_LUNS_PER_TARGET; > - host->max_id = STORVSC_MAX_TARGETS; > - host->max_channel = STORVSC_MAX_CHANNELS - 1; > + host->max_lun = max_luns_per_target; > + host->max_id = max_targets; > + host->max_channel = max_channels - 1; > break; > > default: > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 4/4] Drivers: scsi: storvsc: Force SPC-3 compliance on win8 and win8 r2 hosts
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Tuesday, December 16, 2014 1:22 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; > jbottom...@parallels.com; h...@infradead.org; linux-s...@vger.kernel.org > Subject: [PATCH 4/4] Drivers: scsi: storvsc: Force SPC-3 compliance on win8 > and win8 r2 hosts > > On win8 and win8 r2 hosts force SPC-3 compliance for MSFT virtual disks. > Ubuntu has been carrying a similar patch outside the tree for a while now. > Starting with win10, the host will support SPC-3 compliance. Based on all the > testing that has been done on win8 and win8 r2 hosts, we are comfortable > claiming SPC-3 compliance on these hosts as well. This will enable TRIM > support on these hosts. > > Suggested by: James Bottomley > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > --- > drivers/scsi/storvsc_drv.c | 13 + > 1 files changed, 13 insertions(+), 0 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > fdc5164..7487e07 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -1468,6 +1468,19 @@ static int storvsc_device_configure(struct > scsi_device *sdevice) >*/ > sdevice->sdev_bflags |= msft_blist_flags; > > + /* > + * If the host is WIN8 or WIN8 R2, claim conformance to SPC-3 > + * if the device is a MSFT virtual device. > + */ > + if (!strncmp(sdevice->vendor, "Msft", 4)) { > + switch (vmbus_proto_version) { > + case VERSION_WIN8: > + case VERSION_WIN8_1: > + sdevice->scsi_level = SCSI_SPC_3; > + break; > + } > + } > + > return 0; > } > > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
> On Tue, Oct 17, 2017 at 01:35:21PM -0400, Cathy Avery wrote: > > + /* > > +* Set the error handler work queue. > > +*/ > > + snprintf(host_dev->work_q_name, sizeof(host_dev- > >work_q_name), > > +"storvsc_error_wq_%d", host->host_no); > > + host_dev->handle_error_wq = > > + create_singlethread_workqueue(host_dev- > >work_q_name); > > If you use alloc_ordered_workqueue directly instead of > create_singlethread_workqueue you can pass a format string and don't need > the separate allocation. > > But I'm not sure if Tejun is fine with using __WQ_LEGACY directly.. > > Except for this nit this looks fine to me: > > Reviewed-by: Christoph Hellwig The work storvsc_host_scan (scheduled from function storvsc_on_receive) should also use this workqueue. We can do it in another patch. Reviewed-by: Long Li > ___ > devel mailing list > de...@linuxdriverproject.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdriverd > ev.linuxdriverproject.org%2Fmailman%2Flistinfo%2Fdriverdev- > devel&data=02%7C01%7Clongli%40microsoft.com%7C9c303c3630ef490cecc3 > 08d5170702a2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636440 > 241242573253&sdata=tbCBOnKxtRR38rAdsBDa7zA0Jc2XwrySTsH3uyRxHxA% > 3D&reserved=0 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] hv: kvp: Avoid reading past allocated blocks from KVP file
From: Paul Meyer While reading in more than one block (50) of KVP records, the allocation goes per block, but the reads used the total number of allocated records (without resetting the pointer/stream). This causes the records buffer to overrun when the refresh reads more than one block over the previous capacity (e.g. reading more than 100 KVP records whereas the in-memory database was empty before). Fix this by reading the correct number of KVP records from file each time. Signed-off-by: Paul Meyer --- tools/hv/hv_kvp_daemon.c | 66 1 file changed, 10 insertions(+), 56 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index eaa3bec..2094036 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -193,11 +193,13 @@ static void kvp_update_mem_state(int pool) for (;;) { readp = &record[records_read]; records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK * num_blocks, - filep); + ENTRIES_PER_BLOCK * num_blocks - records_read, + filep); if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", pool); + syslog(LOG_ERR, + "Failed to read file, pool: %d; error: %d %s", +pool, errno, strerror(errno)); exit(EXIT_FAILURE); } @@ -224,15 +226,11 @@ static void kvp_update_mem_state(int pool) fclose(filep); kvp_release_lock(pool); } + static int kvp_file_init(void) { int fd; - FILE *filep; - size_t records_read; char *fname; - struct kvp_record *record; - struct kvp_record *readp; - int num_blocks; int i; int alloc_unit = sizeof(struct kvp_record) * ENTRIES_PER_BLOCK; @@ -246,61 +244,17 @@ static int kvp_file_init(void) for (i = 0; i < KVP_POOL_COUNT; i++) { fname = kvp_file_info[i].fname; - records_read = 0; - num_blocks = 1; sprintf(fname, "%s/.kvp_pool_%d", KVP_CONFIG_LOC, i); fd = open(fname, O_RDWR | O_CREAT | O_CLOEXEC, 0644 /* rw-r--r-- */); if (fd == -1) return 1; - - filep = fopen(fname, "re"); - if (!filep) { - close(fd); - return 1; - } - - record = malloc(alloc_unit * num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - for (;;) { - readp = &record[records_read]; - records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK, - filep); - - if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", - i); - exit(EXIT_FAILURE); - } - - if (!feof(filep)) { - /* -* We have more data to read. -*/ - num_blocks++; - record = realloc(record, alloc_unit * - num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - continue; - } - break; - } kvp_file_info[i].fd = fd; - kvp_file_info[i].num_blocks = num_blocks; - kvp_file_info[i].records = record; - kvp_file_info[i].num_records = records_read; - fclose(filep); - + kvp_file_info[i].num_blocks = 1; + kvp_file_info[i].records = malloc(alloc_unit); + kvp_file_info[i].num_records = 0; + kvp_update_mem_state(i); } return 0; -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] hv: kvp: Avoid reading past allocated blocks from KVP file
> From: Greg KH [mailto:gre...@linuxfoundation.org] > Sent: Tuesday, October 31, 2017 1:43 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger ; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; Paul Meyer > > Subject: Re: [PATCH] hv: kvp: Avoid reading past allocated blocks from KVP > file > > On Mon, Oct 30, 2017 at 05:08:03PM -0700, Long Li wrote: > > From: Paul Meyer > > > > While reading in more than one block (50) of KVP records, the > > allocation goes per block, but the reads used the total number of > > allocated records (without resetting the pointer/stream). This causes > > the records buffer to overrun when the refresh reads more than one > > block over the previous capacity (e.g. reading more than 100 KVP records > whereas the in-memory database was empty before). > > Please wrap changelogs at 72 columns like your editor asked you to... I will fix it. > > > > > Fix this by reading the correct number of KVP records from file each time. > > > > Signed-off-by: Paul Meyer > > --- > > Why is your name not also on the signed-off-by chain if you are forwarding on > a > patch from someone else? > > Is this patch also needed on stable kernels? I'm sending on behalf of Paul Meyer. I will add a "Reviewed-by:" tag. Yes it should also go stable. Will send v2 to include that. > > thanks, > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] hv: kvp: Avoid reading past allocated blocks from KVP file
From: Paul Meyer While reading in more than one block (50) of KVP records, the allocation goes per block, but the reads used the total number of allocated records (without resetting the pointer/stream). This causes the records buffer to overrun when the refresh reads more than one block over the previous capacity (e.g. reading more than 100 KVP records whereas the in-memory database was empty before). Fix this by reading the correct number of KVP records from file each time. Signed-off-by: Paul Meyer Reviewed-by: Long Li --- tools/hv/hv_kvp_daemon.c | 66 1 file changed, 10 insertions(+), 56 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index eaa3bec..2094036 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -193,11 +193,13 @@ static void kvp_update_mem_state(int pool) for (;;) { readp = &record[records_read]; records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK * num_blocks, - filep); + ENTRIES_PER_BLOCK * num_blocks - records_read, + filep); if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", pool); + syslog(LOG_ERR, + "Failed to read file, pool: %d; error: %d %s", +pool, errno, strerror(errno)); exit(EXIT_FAILURE); } @@ -224,15 +226,11 @@ static void kvp_update_mem_state(int pool) fclose(filep); kvp_release_lock(pool); } + static int kvp_file_init(void) { int fd; - FILE *filep; - size_t records_read; char *fname; - struct kvp_record *record; - struct kvp_record *readp; - int num_blocks; int i; int alloc_unit = sizeof(struct kvp_record) * ENTRIES_PER_BLOCK; @@ -246,61 +244,17 @@ static int kvp_file_init(void) for (i = 0; i < KVP_POOL_COUNT; i++) { fname = kvp_file_info[i].fname; - records_read = 0; - num_blocks = 1; sprintf(fname, "%s/.kvp_pool_%d", KVP_CONFIG_LOC, i); fd = open(fname, O_RDWR | O_CREAT | O_CLOEXEC, 0644 /* rw-r--r-- */); if (fd == -1) return 1; - - filep = fopen(fname, "re"); - if (!filep) { - close(fd); - return 1; - } - - record = malloc(alloc_unit * num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - for (;;) { - readp = &record[records_read]; - records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK, - filep); - - if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", - i); - exit(EXIT_FAILURE); - } - - if (!feof(filep)) { - /* -* We have more data to read. -*/ - num_blocks++; - record = realloc(record, alloc_unit * - num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - continue; - } - break; - } kvp_file_info[i].fd = fd; - kvp_file_info[i].num_blocks = num_blocks; - kvp_file_info[i].records = record; - kvp_file_info[i].num_records = records_read; - fclose(filep); - + kvp_file_info[i].num_blocks = 1; + kvp_file_info[i].records = malloc(alloc_unit); + kvp_file_info[i].num_records = 0; + kvp_update_mem_state(i); } return 0; -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] hv: kvp: Avoid reading past allocated blocks from KVP file
> On Tue, Oct 31, 2017 at 06:10:00PM +0000, Long Li wrote: > > > From: Greg KH [mailto:gre...@linuxfoundation.org] > > > Sent: Tuesday, October 31, 2017 1:43 AM > > > To: Long Li > > > Cc: KY Srinivasan ; Haiyang Zhang > > > ; Stephen Hemminger > > > ; de...@linuxdriverproject.org; > > > linux-ker...@vger.kernel.org; Paul Meyer > > > Subject: Re: [PATCH] hv: kvp: Avoid reading past allocated blocks > > > from KVP file > > > > > > On Mon, Oct 30, 2017 at 05:08:03PM -0700, Long Li wrote: > > > > From: Paul Meyer > > > > > > > > While reading in more than one block (50) of KVP records, the > > > > allocation goes per block, but the reads used the total number of > > > > allocated records (without resetting the pointer/stream). This > > > > causes the records buffer to overrun when the refresh reads more > > > > than one block over the previous capacity (e.g. reading more than > > > > 100 KVP records > > > whereas the in-memory database was empty before). > > > > > > Please wrap changelogs at 72 columns like your editor asked you to... > > > > I will fix it. > > > > > > > > > > > > > Fix this by reading the correct number of KVP records from file each > > > > time. > > > > > > > > Signed-off-by: Paul Meyer > > > > --- > > > > > > Why is your name not also on the signed-off-by chain if you are > > > forwarding on a patch from someone else? > > > > > > Is this patch also needed on stable kernels? > > > > I'm sending on behalf of Paul Meyer. I will add a "Reviewed-by:" tag. > > Sending on behalf means you should add your signed-off-by, as it is going > through you. Thanks. I will re-send the patch. > > thanks, > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH v2] hv: kvp: Avoid reading past allocated blocks from KVP file
From: Paul Meyer While reading in more than one block (50) of KVP records, the allocation goes per block, but the reads used the total number of allocated records (without resetting the pointer/stream). This causes the records buffer to overrun when the refresh reads more than one block over the previous capacity (e.g. reading more than 100 KVP records whereas the in-memory database was empty before). Fix this by reading the correct number of KVP records from file each time. Signed-off-by: Paul Meyer Signed-off-by: Long Li --- tools/hv/hv_kvp_daemon.c | 66 1 file changed, 10 insertions(+), 56 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index eaa3bec..2094036 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -193,11 +193,13 @@ static void kvp_update_mem_state(int pool) for (;;) { readp = &record[records_read]; records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK * num_blocks, - filep); + ENTRIES_PER_BLOCK * num_blocks - records_read, + filep); if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", pool); + syslog(LOG_ERR, + "Failed to read file, pool: %d; error: %d %s", +pool, errno, strerror(errno)); exit(EXIT_FAILURE); } @@ -224,15 +226,11 @@ static void kvp_update_mem_state(int pool) fclose(filep); kvp_release_lock(pool); } + static int kvp_file_init(void) { int fd; - FILE *filep; - size_t records_read; char *fname; - struct kvp_record *record; - struct kvp_record *readp; - int num_blocks; int i; int alloc_unit = sizeof(struct kvp_record) * ENTRIES_PER_BLOCK; @@ -246,61 +244,17 @@ static int kvp_file_init(void) for (i = 0; i < KVP_POOL_COUNT; i++) { fname = kvp_file_info[i].fname; - records_read = 0; - num_blocks = 1; sprintf(fname, "%s/.kvp_pool_%d", KVP_CONFIG_LOC, i); fd = open(fname, O_RDWR | O_CREAT | O_CLOEXEC, 0644 /* rw-r--r-- */); if (fd == -1) return 1; - - filep = fopen(fname, "re"); - if (!filep) { - close(fd); - return 1; - } - - record = malloc(alloc_unit * num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - for (;;) { - readp = &record[records_read]; - records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK, - filep); - - if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", - i); - exit(EXIT_FAILURE); - } - - if (!feof(filep)) { - /* -* We have more data to read. -*/ - num_blocks++; - record = realloc(record, alloc_unit * - num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - continue; - } - break; - } kvp_file_info[i].fd = fd; - kvp_file_info[i].num_blocks = num_blocks; - kvp_file_info[i].records = record; - kvp_file_info[i].num_records = records_read; - fclose(filep); - + kvp_file_info[i].num_blocks = 1; + kvp_file_info[i].records = malloc(alloc_unit); + kvp_file_info[i].num_records = 0; + kvp_update_mem_state(i); } return 0; -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] hv: kvp: Avoid reading past allocated blocks from KVP file
> From: Paul Meyer > > While reading in more than one block (50) of KVP records, the allocation goes > per block, but the reads used the total number of allocated records (without > resetting the pointer/stream). This causes the records buffer to overrun when > the refresh reads more than one block over the previous capacity (e.g. reading > more than 100 KVP records whereas the in-memory database was empty before). > > Fix this by reading the correct number of KVP records from file each time. Please drop this patch. I have sent a v2. > > Signed-off-by: Paul Meyer > Reviewed-by: Long Li > --- > tools/hv/hv_kvp_daemon.c | 66 > > 1 file changed, 10 insertions(+), 56 deletions(-) > > diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index > eaa3bec..2094036 100644 > --- a/tools/hv/hv_kvp_daemon.c > +++ b/tools/hv/hv_kvp_daemon.c > @@ -193,11 +193,13 @@ static void kvp_update_mem_state(int pool) > for (;;) { > readp = &record[records_read]; > records_read += fread(readp, sizeof(struct kvp_record), > - ENTRIES_PER_BLOCK * num_blocks, > - filep); > + ENTRIES_PER_BLOCK * num_blocks - records_read, > + filep); > > if (ferror(filep)) { > - syslog(LOG_ERR, "Failed to read file, pool: %d", > pool); > + syslog(LOG_ERR, > + "Failed to read file, pool: %d; error: %d %s", > +pool, errno, strerror(errno)); > exit(EXIT_FAILURE); > } > > @@ -224,15 +226,11 @@ static void kvp_update_mem_state(int pool) > fclose(filep); > kvp_release_lock(pool); > } > + > static int kvp_file_init(void) > { > int fd; > - FILE *filep; > - size_t records_read; > char *fname; > - struct kvp_record *record; > - struct kvp_record *readp; > - int num_blocks; > int i; > int alloc_unit = sizeof(struct kvp_record) * ENTRIES_PER_BLOCK; > > @@ -246,61 +244,17 @@ static int kvp_file_init(void) > > for (i = 0; i < KVP_POOL_COUNT; i++) { > fname = kvp_file_info[i].fname; > - records_read = 0; > - num_blocks = 1; > sprintf(fname, "%s/.kvp_pool_%d", KVP_CONFIG_LOC, i); > fd = open(fname, O_RDWR | O_CREAT | O_CLOEXEC, 0644 /* > rw-r--r-- > */); > > if (fd == -1) > return 1; > > - > - filep = fopen(fname, "re"); > - if (!filep) { > - close(fd); > - return 1; > - } > - > - record = malloc(alloc_unit * num_blocks); > - if (record == NULL) { > - fclose(filep); > - close(fd); > - return 1; > - } > - for (;;) { > - readp = &record[records_read]; > - records_read += fread(readp, sizeof(struct > kvp_record), > - ENTRIES_PER_BLOCK, > - filep); > - > - if (ferror(filep)) { > - syslog(LOG_ERR, "Failed to read file, pool: > %d", > - i); > - exit(EXIT_FAILURE); > - } > - > - if (!feof(filep)) { > - /* > -* We have more data to read. > -*/ > - num_blocks++; > - record = realloc(record, alloc_unit * > - num_blocks); > - if (record == NULL) { > - fclose(filep); > - close(fd); > - return 1; > - } > - continue; > - } > - break; > - } > kvp_file_info[i].fd = fd; > - kvp_file_info[i].num_blocks = num_blocks; > - kvp_file_info[i].records = record; > - kvp_fil
[PATCH] storvsc: Avoid excessive host scan on controller change
From: Long Li When there are multiple disks attached to the same SCSI controller, the host may send several VSTOR_OPERATION_REMOVE_DEVICE or VSTOR_OPERATION_ENUMERATE_BUS messages in a row, to indicate there is a change on the SCSI controller. In response, storvsc rescans the SCSI host. There is no need to do multiple scans on the same host. Fix the code to do only one scan. Signed-off-by: Long Li --- drivers/scsi/storvsc_drv.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 6febcdb..b602f52 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -488,6 +488,8 @@ struct hv_host_device { unsigned char target; struct workqueue_struct *handle_error_wq; char work_q_name[20]; + struct work_struct host_scan_work; + struct Scsi_Host *host; }; struct storvsc_scan_work { @@ -516,13 +518,12 @@ static void storvsc_device_scan(struct work_struct *work) static void storvsc_host_scan(struct work_struct *work) { - struct storvsc_scan_work *wrk; struct Scsi_Host *host; struct scsi_device *sdev; + struct hv_host_device *host_device = + container_of(work, struct hv_host_device, host_scan_work); - wrk = container_of(work, struct storvsc_scan_work, work); - host = wrk->host; - + host = host_device->host; /* * Before scanning the host, first check to see if any of the * currrently known devices have been hot removed. We issue a @@ -542,8 +543,6 @@ static void storvsc_host_scan(struct work_struct *work) * Now scan the host to discover LUNs that may have been added. */ scsi_scan_host(host); - - kfree(wrk); } static void storvsc_remove_lun(struct work_struct *work) @@ -1119,8 +1118,7 @@ static void storvsc_on_receive(struct storvsc_device *stor_device, struct vstor_packet *vstor_packet, struct storvsc_cmd_request *request) { - struct storvsc_scan_work *work; - + struct hv_host_device *host_dev; switch (vstor_packet->operation) { case VSTOR_OPERATION_COMPLETE_IO: storvsc_on_io_completion(stor_device, vstor_packet, request); @@ -1128,13 +1126,9 @@ static void storvsc_on_receive(struct storvsc_device *stor_device, case VSTOR_OPERATION_REMOVE_DEVICE: case VSTOR_OPERATION_ENUMERATE_BUS: - work = kmalloc(sizeof(struct storvsc_scan_work), GFP_ATOMIC); - if (!work) - return; - - INIT_WORK(&work->work, storvsc_host_scan); - work->host = stor_device->host; - schedule_work(&work->work); + host_dev = shost_priv(stor_device->host); + queue_work( + host_dev->handle_error_wq, &host_dev->host_scan_work); break; case VSTOR_OPERATION_FCHBA_DATA: @@ -1747,6 +1741,7 @@ static int storvsc_probe(struct hv_device *device, host_dev->port = host->host_no; host_dev->dev = device; + host_dev->host = host; stor_device = kzalloc(sizeof(struct storvsc_device), GFP_KERNEL); @@ -1815,6 +1810,7 @@ static int storvsc_probe(struct hv_device *device, create_singlethread_workqueue(host_dev->work_q_name); if (!host_dev->handle_error_wq) goto err_out2; + INIT_WORK(&host_dev->host_scan_work, storvsc_host_scan); /* Register the HBA and start the scsi bus scan */ ret = scsi_add_host(host, &device->device); if (ret != 0) -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] hv: kvp: Avoid reading past allocated blocks from KVP file
> From: Greg KH [mailto:g...@kroah.com] > Sent: Tuesday, October 31, 2017 11:50 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org; sta...@vger.kernel.org; Paul Meyer > ; Long Li > Subject: Re: [PATCH v2] hv: kvp: Avoid reading past allocated blocks from > KVP file > > On Tue, Oct 31, 2017 at 01:02:35PM -0700, Long Li wrote: > > From: Paul Meyer > > > > While reading in more than one block (50) of KVP records, the > > allocation goes per block, but the reads used the total number of > > allocated records (without resetting the pointer/stream). This causes > > the records buffer to overrun when the refresh reads more than one > > block over the previous capacity (e.g. reading more than 100 KVP > > records whereas the in-memory database was empty before). > > > > Fix this by reading the correct number of KVP records from file each time. > > > > Signed-off-by: Paul Meyer > > Signed-off-by: Long Li > > --- > > tools/hv/hv_kvp_daemon.c | 66 > > > > 1 file changed, 10 insertions(+), 56 deletions(-) > > When you version a patch, you always have to say what changed below the > --- line, as the documentation states to do... Sorry it was my bad. Can I resend v2 and indicate what has changed? Long > > v3? :) > > thanks, > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[Revised PATCH v2] hv: kvp: Avoid reading past allocated blocks from KVP file
From: Paul Meyer While reading in more than one block (50) of KVP records, the allocation goes per block, but the reads used the total number of allocated records (without resetting the pointer/stream). This causes the records buffer to overrun when the refresh reads more than one block over the previous capacity (e.g. reading more than 100 KVP records whereas the in-memory database was empty before). Fix this by reading the correct number of KVP records from file each time. Changes since v1: 1. Properly wrapped comment texts. 2. Added the 2nd Signed-off-by. Signed-off-by: Paul Meyer Signed-off-by: Long Li --- tools/hv/hv_kvp_daemon.c | 66 1 file changed, 10 insertions(+), 56 deletions(-) diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c index eaa3bec..2094036 100644 --- a/tools/hv/hv_kvp_daemon.c +++ b/tools/hv/hv_kvp_daemon.c @@ -193,11 +193,13 @@ static void kvp_update_mem_state(int pool) for (;;) { readp = &record[records_read]; records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK * num_blocks, - filep); + ENTRIES_PER_BLOCK * num_blocks - records_read, + filep); if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", pool); + syslog(LOG_ERR, + "Failed to read file, pool: %d; error: %d %s", +pool, errno, strerror(errno)); exit(EXIT_FAILURE); } @@ -224,15 +226,11 @@ static void kvp_update_mem_state(int pool) fclose(filep); kvp_release_lock(pool); } + static int kvp_file_init(void) { int fd; - FILE *filep; - size_t records_read; char *fname; - struct kvp_record *record; - struct kvp_record *readp; - int num_blocks; int i; int alloc_unit = sizeof(struct kvp_record) * ENTRIES_PER_BLOCK; @@ -246,61 +244,17 @@ static int kvp_file_init(void) for (i = 0; i < KVP_POOL_COUNT; i++) { fname = kvp_file_info[i].fname; - records_read = 0; - num_blocks = 1; sprintf(fname, "%s/.kvp_pool_%d", KVP_CONFIG_LOC, i); fd = open(fname, O_RDWR | O_CREAT | O_CLOEXEC, 0644 /* rw-r--r-- */); if (fd == -1) return 1; - - filep = fopen(fname, "re"); - if (!filep) { - close(fd); - return 1; - } - - record = malloc(alloc_unit * num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - for (;;) { - readp = &record[records_read]; - records_read += fread(readp, sizeof(struct kvp_record), - ENTRIES_PER_BLOCK, - filep); - - if (ferror(filep)) { - syslog(LOG_ERR, "Failed to read file, pool: %d", - i); - exit(EXIT_FAILURE); - } - - if (!feof(filep)) { - /* -* We have more data to read. -*/ - num_blocks++; - record = realloc(record, alloc_unit * - num_blocks); - if (record == NULL) { - fclose(filep); - close(fd); - return 1; - } - continue; - } - break; - } kvp_file_info[i].fd = fd; - kvp_file_info[i].num_blocks = num_blocks; - kvp_file_info[i].records = record; - kvp_file_info[i].num_records = records_read; - fclose(filep); - + kvp_file_info[i].num_blocks = 1; + kvp_file_info[i].records = malloc(alloc_unit); + kvp_file_info[i].num_records = 0; + kvp_update_mem_state(i); } return 0; -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] hv: kvp: Avoid reading past allocated blocks from KVP file
> -Original Message- > From: Greg KH [mailto:g...@kroah.com] > Sent: Wednesday, November 1, 2017 11:54 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org; sta...@vger.kernel.org; Paul Meyer > > Subject: Re: [PATCH v2] hv: kvp: Avoid reading past allocated blocks from > KVP file > > On Wed, Nov 01, 2017 at 06:39:00PM +, Long Li wrote: > > > From: Greg KH [mailto:g...@kroah.com] > > > Sent: Tuesday, October 31, 2017 11:50 PM > > > To: Long Li > > > Cc: KY Srinivasan ; Haiyang Zhang > > > ; Stephen Hemminger > > > ; de...@linuxdriverproject.org; linux- > > > ker...@vger.kernel.org; sta...@vger.kernel.org; Paul Meyer > > > ; Long Li > > > Subject: Re: [PATCH v2] hv: kvp: Avoid reading past allocated blocks > > > from KVP file > > > > > > On Tue, Oct 31, 2017 at 01:02:35PM -0700, Long Li wrote: > > > > From: Paul Meyer > > > > > > > > While reading in more than one block (50) of KVP records, the > > > > allocation goes per block, but the reads used the total number of > > > > allocated records (without resetting the pointer/stream). This > > > > causes the records buffer to overrun when the refresh reads more > > > > than one block over the previous capacity (e.g. reading more than > > > > 100 KVP records whereas the in-memory database was empty before). > > > > > > > > Fix this by reading the correct number of KVP records from file each > time. > > > > > > > > Signed-off-by: Paul Meyer > > > > Signed-off-by: Long Li > > > > --- > > > > tools/hv/hv_kvp_daemon.c | 66 > > > > > > > > 1 file changed, 10 insertions(+), 56 deletions(-) > > > > > > When you version a patch, you always have to say what changed below > > > the > > > --- line, as the documentation states to do... > > > > Sorry it was my bad. Can I resend v2 and indicate what has changed? > > Why wouldn't you? > > But it would be v3 then :) I have sent a "revised v2". Please let me know if it is acceptable. If not I'll send a "v3". > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] storvsc: Avoid excessive host scan on controller change
> From: Martin K. Petersen [mailto:martin.peter...@oracle.com] > Sent: Monday, November 6, 2017 7:40 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger > ; James E . J . Bottomley > ; Martin K . Petersen > ; de...@linuxdriverproject.org; linux- > s...@vger.kernel.org; linux-ker...@vger.kernel.org; Long Li > > Subject: Re: [PATCH] storvsc: Avoid excessive host scan on controller change > > > Long, > > > When there are multiple disks attached to the same SCSI controller, > > the host may send several VSTOR_OPERATION_REMOVE_DEVICE or > > VSTOR_OPERATION_ENUMERATE_BUS messages in a row, to indicate > there is > > a change on the SCSI controller. In response, storvsc rescans the SCSI > > host. > > Applied to 4.15/scsi-queue with some fuzz. Please verify, thanks! Martin, thank you! All looking good. Long > > -- > Martin K. PetersenOracle Linux Engineering ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove
> -Original Message- > From: Bjorn Helgaas [mailto:bhelg...@google.com] > Sent: Saturday, February 11, 2017 9:35 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; de...@linuxdriverproject.org; linux- > p...@vger.kernel.org; linux-ker...@vger.kernel.org > Subject: Re: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus > remove > > On Fri, Feb 10, 2017 at 7:18 PM, Long Li wrote: > > Hi Bjorn, > > > > This patch and the other one in the series ([Resend PATCH 2/2 v3] pci- > hyperv: lock pci bus on device eject) have been Acked. > > > > Is there anything else should be done before it can be merged? Please let > me know. > > Sorry, I don't know what happened here. I see your Jan 23 posting in my > work email (bhelg...@google.com), but I don't see it on the linux-pci or > linux-kernel lists, and patchwork [1] doesn't have a copy > either. I suspect there was something about your email that made > vger drop it (maybe an HTML or other "fancy" stuff per > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fvger.ke > rnel.org%2Fmajordomo- > info.html&data=02%7C01%7Clongli%40microsoft.com%7Cd3d9fb666bdd4244 > 901b08d452a4692b%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63 > 6224313474452403&sdata=UcQu75mTXO3xh5ot%2FZTRDgL5GXayaXjs%2Fugt > wWe91Ko%3D&reserved=0). > > Patchwork works by subscribing to linux-pci and collecting things that look > like patches. Then I work from patchwork as a to-do list. > That's a convenient way to ensure that patches appear on the mailing list > before I apply them. It also means that if a patch doesn't appear on > linux-pci > and subsequently in patchwork, I don't know about it. > > Patchwork does have copies of previous versions, but I marked them > "changes requested". When I do that, the patch drops off the to-do list > because I'm expecting a new version, which *will* appear on the list. I don't > mark things "changes requested" if I'm only waiting for an ack, so it looks > like > the only change I was looking for was a changelog revision. Normally I just > tweak changelogs myself, so I apologize for not doing that in this case. > > Anyway, can you just post the current version, including the acks, and make > sure it shows up on the mailing list? > > I'm sorry this has languished so long. Thanks for reminding me about it so we > can sort this out. Thank you. I will fix the email and resend the patch. > > [1] > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatch > work.ozlabs.org%2Fproject%2Flinux- > pci%2Flist%2F%3Fsubmitter%3D69886%26state%3D*%26q%3D%26archive%3 > Dboth%26delegate&data=02%7C01%7Clongli%40microsoft.com%7Cd3d9fb66 > 6bdd4244901b08d452a4692b%7C72f988bf86f141af91ab2d7cd011db47%7C1% > 7C0%7C636224313474452403&sdata=ELx04yDnSbe1fxXLy7z2iFoKwazKEMlDLrl > p4CWhXbk%3D&reserved=0= > > >> -Original Message- > >> From: KY Srinivasan > >> Sent: Friday, January 27, 2017 10:42 AM > >> To: Long Li ; Haiyang Zhang > >> ; Bjorn Helgaas > >> Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > >> ker...@vger.kernel.org; Long Li > >> Subject: RE: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci > >> bus remove > >> > >> > >> > >> > -Original Message- > >> > From: Long Li [mailto:lon...@exchange.microsoft.com] > >> > Sent: Monday, January 23, 2017 9:45 PM > >> > To: KY Srinivasan ; Haiyang Zhang > >> > ; Bjorn Helgaas > >> > Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > >> > ker...@vger.kernel.org; Long Li > >> > Subject: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus > >> > remove > >> > > >> > [This sender failed our fraud detection checks and may not be who > >> > they appear to be. Learn about spoofing at > >> > > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Faka > >> > > .ms%2FLearnAboutSpoofing&data=02%7C01%7Clongli%40microsoft.com%7C > d3 > >> > > d9fb666bdd4244901b08d452a4692b%7C72f988bf86f141af91ab2d7cd011db47 > %7 > >> > > C1%7C0%7C636224313474452403&sdata=jlfhIYsJJT4HbcPGSPTk43AApcip%2F > 9m > >> > w7snnFn%2FvI74%3D&reserved=0] > >> > > >> > From: Long Li > >> > > >> > hv_pci_devices_present is called in hv_pci_remove when we remove a > >> > PCI device from host (e.g. by disabling SRIOV on a device). In >
[Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove
hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. Signed-off-by: Long Li Reported-by: Xiaofeng Wang Acked-by: K. Y. Srinivasan --- drivers/pci/host/pci-hyperv.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -348,6 +348,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch (hbus->state) { + case hv_pcibus_installed: + /* +* Tell the core to rescan bus +* because there may have been changes. +*/ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); + break; + + default: + break; } up(&hbus->enum_sem); @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } ret = hv_send_resources_released(hdev); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[Resend PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject
A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message or in create_root_hv_pci_bus(), in this case we could potentailly modify the bus from multiple places. Properly lock the bus access. Thanks Dexuan Cui for pointing out the race condition in create_root_hv_pci_bus(). Signed-off-by: Long Li Reported-by: Xiaofeng Wang Acked-by: K. Y. Srinivasan --- drivers/pci/host/pci-hyperv.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 4a37598..33c75c9 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1198,9 +1198,11 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus) hbus->pci_bus->msi = &hbus->msi_chip; hbus->pci_bus->msi->dev = &hbus->hdev->device; + pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_bus_assign_resources(hbus->pci_bus); pci_bus_add_devices(hbus->pci_bus); + pci_unlock_rescan_remove(); hbus->state = hv_pcibus_installed; return 0; } @@ -1590,8 +1592,10 @@ static void hv_eject_device_work(struct work_struct *work) pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, wslot); if (pdev) { + pci_lock_rescan_remove(); pci_stop_and_remove_bus_device(pdev); pci_dev_put(pdev); + pci_unlock_rescan_remove(); } memset(&ctxt, 0, sizeof(ctxt)); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [Resend PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject
Ok, I will resend. > -Original Message- > From: Greg KH [mailto:g...@kroah.com] > Sent: Saturday, February 25, 2017 12:02 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Bjorn Helgaas ; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > p...@vger.kernel.org > Subject: Re: [Resend PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject > > On Fri, Feb 24, 2017 at 09:49:17PM +, Long Li wrote: > > A PCI_EJECT message can arrive at the same time we are calling > pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS > message or in create_root_hv_pci_bus(), in this case we could potentailly > modify the bus from multiple places. Properly lock the bus access. > > Properly wrap your changelog comments at 72 columns like your editor is > telling you to do... ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2 v4] pci-hyperv: properly handle pci bus remove
hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. Signed-off-by: Long Li Reported-by: Xiaofeng Wang Acked-by: K. Y. Srinivasan --- drivers/pci/host/pci-hyperv.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -348,6 +348,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch (hbus->state) { + case hv_pcibus_installed: + /* +* Tell the core to rescan bus +* because there may have been changes. +*/ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); + break; + + default: + break; } up(&hbus->enum_sem); @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } ret = hv_send_resources_released(hdev); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2 v4] pci-hyperv: lock pci bus on device eject
A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message or in create_root_hv_pci_bus(), in this case we could potentially modify the bus from multiple places. Properly lock the bus access. Thanks Dexuan Cui for pointing out the race condition in create_root_hv_pci_bus(). Signed-off-by: Long Li Reported-by: Xiaofeng Wang Acked-by: K. Y. Srinivasan --- drivers/pci/host/pci-hyperv.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 4a37598..33c75c9 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1198,9 +1198,11 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus) hbus->pci_bus->msi = &hbus->msi_chip; hbus->pci_bus->msi->dev = &hbus->hdev->device; + pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_bus_assign_resources(hbus->pci_bus); pci_bus_add_devices(hbus->pci_bus); + pci_unlock_rescan_remove(); hbus->state = hv_pcibus_installed; return 0; } @@ -1590,8 +1592,10 @@ static void hv_eject_device_work(struct work_struct *work) pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, wslot); if (pdev) { + pci_lock_rescan_remove(); pci_stop_and_remove_bus_device(pdev); pci_dev_put(pdev); + pci_unlock_rescan_remove(); } memset(&ctxt, 0, sizeof(ctxt)); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] HV: properly delay KVP packets when negotiation is in progress
The host may send multiple KVP packets before the negotiation with daemon is finished. We need to keep those packets in ring buffer until the daemon is negotiated and connected. This patch is based on the work of Nick Meier Signed-off-by: Long Li --- drivers/hv/hv_kvp.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index de26371..b9f928d 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) NEGO_IN_PROGRESS, NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; - if (host_negotiatied == NEGO_NOT_STARTED && - kvp_transaction.state < HVUTIL_READY) { + if (kvp_transaction.state < HVUTIL_READY) { /* * If userspace daemon is not connected and host is asking * us to negotiate we need to delay to not lose messages. * This is important for Failover IP setting. */ - host_negotiatied = NEGO_IN_PROGRESS; - schedule_delayed_work(&kvp_host_handshake_work, + if (host_negotiatied == NEGO_NOT_STARTED) { + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(&kvp_host_handshake_work, HV_UTIL_NEGO_TIMEOUT * HZ); + } return; } if (kvp_transaction.state > HVUTIL_READY) -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/2 v4] pci-hyperv: properly handle pci bus remove
> -Original Message- > From: Bjorn Helgaas [mailto:helg...@kernel.org] > Sent: Thursday, March 16, 2017 1:07 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Bjorn Helgaas ; > de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH 1/2 v4] pci-hyperv: properly handle pci bus remove > > On Tue, Feb 28, 2017 at 02:19:45AM +, Long Li wrote: > > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI > > device from host (e.g. by disabling SRIOV on a device). In > > hv_pci_remove, the bus is already removed before the call, so we don't > > need to rescan the bus in the workqueue scheduled from > > hv_pci_devices_present. By introducing status hv_pcibus_removed, we > can avoid this situation. > > > > Signed-off-by: Long Li > > Reported-by: Xiaofeng Wang > > Acked-by: K. Y. Srinivasan > > This didn't apply for me because saving it to a file resulted in some encoded > file with "=3D" instead of "=". I see that mutt is smart enough to deal with > that in this reply, so there's probably a way for it to decode it when saving > to > a file, but I don't know it. > > I tried to apply it by hand, but the hunk in hv_pci_remove() doesn't match > the context. I think your patch is based on something previous to > 17978524a636 ("PCI: hv: Fix hv_pci_remove() for hot-remove"). Please > refresh the patch so it applies to my "master" branch (currently v4.11-rc1). > > Also, the "default: break;" case below is redundant and can be removed. > > So I'll wait for your updated versions of both these patches. Thanks, I'll address those issues and resend the patch. > > > --- > > drivers/pci/host/pci-hyperv.c | 20 +--- > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -348,6 +348,7 @@ enum hv_pcibus_state { > > hv_pcibus_init = 0, > > hv_pcibus_probed, > > hv_pcibus_installed, > > + hv_pcibus_removed, > > hv_pcibus_maximum > > }; > > > > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct > work_struct *work) > > put_pcichild(hpdev, hv_pcidev_ref_initial); > > } > > > > - /* Tell the core to rescan bus because there may have been changes. > */ > > - if (hbus->state == hv_pcibus_installed) { > > + switch (hbus->state) { > > + case hv_pcibus_installed: > > + /* > > +* Tell the core to rescan bus > > +* because there may have been changes. > > +*/ > > pci_lock_rescan_remove(); > > pci_scan_child_bus(hbus->pci_bus); > > pci_unlock_rescan_remove(); > > - } else { > > + break; > > + > > + case hv_pcibus_init: > > + case hv_pcibus_probed: > > survey_child_resources(hbus); > > + break; > > + > > + default: > > + break; > > ^ This is redundant. > > > } > > > > up(&hbus->enum_sem); > > @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, > > hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); > > if (!hbus) > > return -ENOMEM; > > + hbus->state = hv_pcibus_init; > > > > /* > > * The PCI bus "domain" is what is called "segment" in ACPI and @@ > > -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) > > pci_stop_root_bus(hbus->pci_bus); > > pci_remove_root_bus(hbus->pci_bus); > > pci_unlock_rescan_remove(); > > + hbus->state = hv_pcibus_removed; > > } > > > > ret = hv_send_resources_released(hdev); > > -- > > 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] HV: properly delay KVP packets when negotiation is in progress
> -Original Message- > From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com] > Sent: Friday, March 17, 2017 9:16 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH] HV: properly delay KVP packets when negotiation is in > progress > > Long Li writes: > > > The host may send multiple KVP packets before the negotiation with > > daemon is finished. We need to keep those packets in ring buffer until > > the daemon is negotiated and connected. > > The patch looks OK but previously we always presumed that this can't > happen for util drivers and host will never send a new request before we > answer to the previous one. If this is not true we may have more issues > which need fixing as all three drivers we have are written in a 'transaction' > fashion. > > So my question would be: can the host send multiple (KVP) packets _after_ > the negotiation with daemon is finished? Thanks Vitaly. I'm checking with Windows guys and will update soon. > > > > > > This patch is based on the work of Nick Meier > > > > > > Signed-off-by: Long Li > > --- > > drivers/hv/hv_kvp.c | 9 + > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index > > de26371..b9f928d 100644 > > --- a/drivers/hv/hv_kvp.c > > +++ b/drivers/hv/hv_kvp.c > > @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) > > NEGO_IN_PROGRESS, > > NEGO_FINISHED} host_negotiatied = > NEGO_NOT_STARTED; > > > > - if (host_negotiatied == NEGO_NOT_STARTED && > > - kvp_transaction.state < HVUTIL_READY) { > > + if (kvp_transaction.state < HVUTIL_READY) { > > /* > > * If userspace daemon is not connected and host is asking > > * us to negotiate we need to delay to not lose messages. > > * This is important for Failover IP setting. > > */ > > - host_negotiatied = NEGO_IN_PROGRESS; > > - schedule_delayed_work(&kvp_host_handshake_work, > > + if (host_negotiatied == NEGO_NOT_STARTED) { > > + host_negotiatied = NEGO_IN_PROGRESS; > > + > schedule_delayed_work(&kvp_host_handshake_work, > > HV_UTIL_NEGO_TIMEOUT * HZ); > > + } > > return; > > } > > if (kvp_transaction.state > HVUTIL_READY) > > -- > Vitaly ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] HV: properly delay KVP packets when negotiation is in progress
> -Original Message- > From: Long Li > Sent: Sunday, March 19, 2017 7:49 PM > To: 'Vitaly Kuznetsov' > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: RE: [PATCH] HV: properly delay KVP packets when negotiation is in > progress > > > > > -Original Message- > > From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com] > > Sent: Friday, March 17, 2017 9:16 AM > > To: Long Li > > Cc: KY Srinivasan ; Haiyang Zhang > > ; Stephen Hemminger > ; > > de...@linuxdriverproject.org; linux- ker...@vger.kernel.org > > Subject: Re: [PATCH] HV: properly delay KVP packets when negotiation > > is in progress > > > > Long Li writes: > > > > > The host may send multiple KVP packets before the negotiation with > > > daemon is finished. We need to keep those packets in ring buffer > > > until the daemon is negotiated and connected. > > > > The patch looks OK but previously we always presumed that this can't > > happen for util drivers and host will never send a new request before > > we answer to the previous one. If this is not true we may have more > > issues which need fixing as all three drivers we have are written in a > 'transaction' > > fashion. > > > > So my question would be: can the host send multiple (KVP) packets > > _after_ the negotiation with daemon is finished? > > Thanks Vitaly. I'm checking with Windows guys and will update soon. It's possible that hosts may send a number of staged KVP requests as soon as negotiation is done. The current KVP code can deal with a number of pending KVP requests, and respond to them one by one. To summarize the issue this patch tries to fix: 1. When host sends a negotiation request, and it times out, the host will send another negotiation request, and so on. 2. The guest can respond to all negotiation requests from the host. All subsequent response (except for the 1st response) are ignored by the host. 3. Before negotiation is done, the host may have staged a number of pending KVP requests. 4. As soon as negotiation is done, the host sends all KVP requests to guest. There is a corner case that if there is only one pending KVP request after the 2nd (or 3rd etc) negotiation, it may get lost. I'm testing the following code to address this condition: @@ -705,6 +706,7 @@ void hv_kvp_onchannelcallback(void *context) VM_PKT_DATA_INBAND, 0); host_negotiatied = NEGO_FINISHED; + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } } Please drop this patch. I'll send V2. > > > > > > > > > > > This patch is based on the work of Nick Meier > > > > > > > > > Signed-off-by: Long Li > > > --- > > > drivers/hv/hv_kvp.c | 9 + > > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index > > > de26371..b9f928d 100644 > > > --- a/drivers/hv/hv_kvp.c > > > +++ b/drivers/hv/hv_kvp.c > > > @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) > > >NEGO_IN_PROGRESS, > > >NEGO_FINISHED} host_negotiatied = > > NEGO_NOT_STARTED; > > > > > > - if (host_negotiatied == NEGO_NOT_STARTED && > > > - kvp_transaction.state < HVUTIL_READY) { > > > + if (kvp_transaction.state < HVUTIL_READY) { > > > /* > > >* If userspace daemon is not connected and host is asking > > >* us to negotiate we need to delay to not lose messages. > > >* This is important for Failover IP setting. > > >*/ > > > - host_negotiatied = NEGO_IN_PROGRESS; > > > - schedule_delayed_work(&kvp_host_handshake_work, > > > + if (host_negotiatied == NEGO_NOT_STARTED) { > > > + host_negotiatied = NEGO_IN_PROGRESS; > > > + > > schedule_delayed_work(&kvp_host_handshake_work, > > > HV_UTIL_NEGO_TIMEOUT * HZ); > > > + } > > > return; > > > } > > > if (kvp_transaction.state > HVUTIL_READY) > > > > -- > > Vitaly ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH v2] HV: properly delay KVP packets when negotiation is in progress
The host may send multiple negotiation packets (due to timeout) before the KVP user-mode daemon is connected. We need to defer processing those packets until the daemon is negotiated and connected. It's okay for guest to respond to all negotiation packets. In addition, the host may send multiple staged KVP requests as soon as negotiation is done. We need to properly process those packets using one tasklet for exclusive access to ring buffer. This patch is based on the work of Nick Meier The patch v2 has incorporated suggestion from Vitaly Kuznetsov . Signed-off-by: Long Li --- drivers/hv/hv_kvp.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index de26371..845b70b 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -113,7 +113,7 @@ static void kvp_poll_wrapper(void *channel) { /* Transaction is finished, reset the state here to avoid races. */ kvp_transaction.state = HVUTIL_READY; - hv_kvp_onchannelcallback(channel); + tasklet_schedule(&((struct vmbus_channel*)channel)->callback_event); } static void kvp_register_done(void) @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) NEGO_IN_PROGRESS, NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; - if (host_negotiatied == NEGO_NOT_STARTED && - kvp_transaction.state < HVUTIL_READY) { + if (kvp_transaction.state < HVUTIL_READY) { /* * If userspace daemon is not connected and host is asking * us to negotiate we need to delay to not lose messages. * This is important for Failover IP setting. */ - host_negotiatied = NEGO_IN_PROGRESS; - schedule_delayed_work(&kvp_host_handshake_work, + if (host_negotiatied == NEGO_NOT_STARTED) { + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(&kvp_host_handshake_work, HV_UTIL_NEGO_TIMEOUT * HZ); + } return; } if (kvp_transaction.state > HVUTIL_READY) @@ -705,6 +706,7 @@ void hv_kvp_onchannelcallback(void *context) VM_PKT_DATA_INBAND, 0); host_negotiatied = NEGO_FINISHED; + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/2 v4] pci-hyperv: properly handle pci bus remove
> -Original Message- > From: Bjorn Helgaas [mailto:helg...@kernel.org] > Sent: Thursday, March 16, 2017 1:07 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Bjorn Helgaas ; > de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH 1/2 v4] pci-hyperv: properly handle pci bus remove > > On Tue, Feb 28, 2017 at 02:19:45AM +, Long Li wrote: > > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI > > device from host (e.g. by disabling SRIOV on a device). In > > hv_pci_remove, the bus is already removed before the call, so we don't > > need to rescan the bus in the workqueue scheduled from > > hv_pci_devices_present. By introducing status hv_pcibus_removed, we > can avoid this situation. > > > > Signed-off-by: Long Li > > Reported-by: Xiaofeng Wang > > Acked-by: K. Y. Srinivasan > > This didn't apply for me because saving it to a file resulted in some encoded > file with "=3D" instead of "=". I see that mutt is smart enough to deal with > that in this reply, so there's probably a way for it to decode it when saving > to > a file, but I don't know it. > > I tried to apply it by hand, but the hunk in hv_pci_remove() doesn't match > the context. I think your patch is based on something previous to > 17978524a636 ("PCI: hv: Fix hv_pci_remove() for hot-remove"). Please > refresh the patch so it applies to my "master" branch (currently v4.11-rc1). > > Also, the "default: break;" case below is redundant and can be removed. > > So I'll wait for your updated versions of both these patches. > > > --- > > drivers/pci/host/pci-hyperv.c | 20 +--- > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -348,6 +348,7 @@ enum hv_pcibus_state { > > hv_pcibus_init = 0, > > hv_pcibus_probed, > > hv_pcibus_installed, > > + hv_pcibus_removed, > > hv_pcibus_maximum > > }; > > > > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct > work_struct *work) > > put_pcichild(hpdev, hv_pcidev_ref_initial); > > } > > > > - /* Tell the core to rescan bus because there may have been changes. > */ > > - if (hbus->state == hv_pcibus_installed) { > > + switch (hbus->state) { > > + case hv_pcibus_installed: > > + /* > > +* Tell the core to rescan bus > > +* because there may have been changes. > > +*/ > > pci_lock_rescan_remove(); > > pci_scan_child_bus(hbus->pci_bus); > > pci_unlock_rescan_remove(); > > - } else { > > + break; > > + > > + case hv_pcibus_init: > > + case hv_pcibus_probed: > > survey_child_resources(hbus); > > + break; > > + > > + default: > > + break; > > ^ This is redundant. I found it still needs "default:break", or it will give a compiler warning: drivers/pci/host/pci-hyperv.c: In function 'pci_devices_present_work': drivers/pci/host/pci-hyperv.c:1510:2: warning: enumeration value 'hv_pcibus_removed' not handled in switch [-Wswitch] switch(hbus->state) { ^ drivers/pci/host/pci-hyperv.c:1510:2: warning: enumeration value 'hv_pcibus_maximum' not handled in switch [-Wswitch] > > > } > > > > up(&hbus->enum_sem); > > @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, > > hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); > > if (!hbus) > > return -ENOMEM; > > + hbus->state = hv_pcibus_init; > > > > /* > > * The PCI bus "domain" is what is called "segment" in ACPI and @@ > > -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) > > pci_stop_root_bus(hbus->pci_bus); > > pci_remove_root_bus(hbus->pci_bus); > > pci_unlock_rescan_remove(); > > + hbus->state = hv_pcibus_removed; > > } > > > > ret = hv_send_resources_released(hdev); > > -- > > 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] HV: properly delay KVP packets when negotiation is in progress
> -Original Message- > From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com] > Sent: Thursday, March 23, 2017 9:04 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Stephen Hemminger > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org; sta...@vger.kernel.org > Subject: Re: [PATCH v2] HV: properly delay KVP packets when negotiation is > in progress > > Long Li writes: > > > The host may send multiple negotiation packets (due to timeout) before > > the KVP user-mode daemon is connected. We need to defer processing > > those packets until the daemon is negotiated and connected. It's okay > > for guest to respond to all negotiation packets. > > > > In addition, the host may send multiple staged KVP requests as soon as > > negotiation is done. We need to properly process those packets using > > one tasklet for exclusive access to ring buffer. > > > > This patch is based on the work of Nick Meier > > > > > > The patch v2 has incorporated suggestion from Vitaly Kuznetsov > > . > > > > Signed-off-by: Long Li > > --- > > drivers/hv/hv_kvp.c | 12 +++- > > 1 file changed, 7 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index > > de26371..845b70b 100644 > > --- a/drivers/hv/hv_kvp.c > > +++ b/drivers/hv/hv_kvp.c > > @@ -113,7 +113,7 @@ static void kvp_poll_wrapper(void *channel) { > > /* Transaction is finished, reset the state here to avoid races. */ > > kvp_transaction.state = HVUTIL_READY; > > - hv_kvp_onchannelcallback(channel); > > + tasklet_schedule(&((struct vmbus_channel*)channel)- > >callback_event); > > } > > There is one more function in the code which calls > hv_kvp_onchannelcallback(): > > static void kvp_host_handshake_func(struct work_struct *dummy) { > hv_poll_channel(kvp_transaction.recv_channel, > hv_kvp_onchannelcallback); } > > we can't replace hv_kvp_onchannelcallback with kvp_poll_wrapper here as > we don't want to reset kvp_transaction.state but it seems this should also > get updated, e.g. hv_poll_channel() here can be replaced with the direct > > tasklet_schedule(&((struct vmbus_channel*)channel)->callback_event); > > call. This will ensure hv_kvp_onchannelcallback() calls are always serialized. Thank you. I will send v3. > > > > > static void kvp_register_done(void) > > @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) > > NEGO_IN_PROGRESS, > > NEGO_FINISHED} host_negotiatied = > NEGO_NOT_STARTED; > > > > - if (host_negotiatied == NEGO_NOT_STARTED && > > - kvp_transaction.state < HVUTIL_READY) { > > + if (kvp_transaction.state < HVUTIL_READY) { > > /* > > * If userspace daemon is not connected and host is asking > > * us to negotiate we need to delay to not lose messages. > > * This is important for Failover IP setting. > > */ > > - host_negotiatied = NEGO_IN_PROGRESS; > > - schedule_delayed_work(&kvp_host_handshake_work, > > + if (host_negotiatied == NEGO_NOT_STARTED) { > > + host_negotiatied = NEGO_IN_PROGRESS; > > + > schedule_delayed_work(&kvp_host_handshake_work, > > HV_UTIL_NEGO_TIMEOUT * HZ); > > + } > > return; > > } > > if (kvp_transaction.state > HVUTIL_READY) @@ -705,6 +706,7 @@ > void > > hv_kvp_onchannelcallback(void *context) > >VM_PKT_DATA_INBAND, 0); > > > > host_negotiatied = NEGO_FINISHED; > > + hv_poll_channel(kvp_transaction.recv_channel, > kvp_poll_wrapper); > > } > > > > } > > -- > Vitaly ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[Please ignore this is a test] pci-hyperv: properly handle pci bus remove
From: Long Li hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. Signed-off-by: Long Li --- drivers/pci/host/pci-hyperv.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index ada9856..8a92244 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -350,6 +350,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1504,12 +1505,19 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch(hbus->state) { + case hv_pcibus_installed: + /* + * Tell the core to rescan bus + * because there may have been changes. + */ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); } @@ -2185,6 +2193,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2348,6 +2357,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } hv_pci_bus_exit(hdev); -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2 v5] pci-hyperv: lock pci bus on device eject
From: Long Li A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message or in create_root_hv_pci_bus(), in this case we could potentially modify the bus from multiple places. Properly lock the bus access. Thanks Dexuan Cui for pointing out the race condition in create_root_hv_pci_bus(). Signed-off-by: Long Li Reported-by: Xiaofeng Wang Acked-by: K. Y. Srinivasan --- drivers/pci/host/pci-hyperv.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 39fafda..a1b3c19 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1209,9 +1209,11 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus) hbus->pci_bus->msi = &hbus->msi_chip; hbus->pci_bus->msi->dev = &hbus->hdev->device; + pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_bus_assign_resources(hbus->pci_bus); pci_bus_add_devices(hbus->pci_bus); + pci_unlock_rescan_remove(); hbus->state = hv_pcibus_installed; return 0; } @@ -1612,8 +1614,10 @@ static void hv_eject_device_work(struct work_struct *work) pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, wslot); if (pdev) { + pci_lock_rescan_remove(); pci_stop_and_remove_bus_device(pdev); pci_dev_put(pdev); + pci_unlock_rescan_remove(); } spin_lock_irqsave(&hpdev->hbus->device_list_lock, flags); -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2 v5] pci-hyperv: properly handle pci bus remove
From: Long Li hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. Signed-off-by: Long Li Reported-by: Xiaofeng Wang Acked-by: K. Y. Srinivasan --- drivers/pci/host/pci-hyperv.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index ada9856..39fafda 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -350,6 +350,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1504,13 +1505,24 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch(hbus->state) { + case hv_pcibus_installed: + /* + * Tell the core to rescan bus + * because there may have been changes. + */ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); + break; + + default: + break; } up(&hbus->enum_sem); @@ -2185,6 +2197,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2348,6 +2361,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } hv_pci_bus_exit(hdev); -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH v3] HV: properly delay KVP packets when negotiation is in progress
From: Long Li The host may send multiple negotiation packets (due to timeout) before the KVP user-mode daemon is connected. We need to defer processing those packets until the daemon is negotiated and connected. It's okay for guest to respond to all negotiation packets. In addition, the host may send multiple staged KVP requests as soon as negotiation is done. We need to properly process those packets using one tasklet for exclusive access to ring buffer. This patch is based on the work of Nick Meier . The patch v3 has incorporated suggestions from Vitaly Kuznetsov . Signed-off-by: Long Li --- drivers/hv/hv_kvp.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index de26371..be7222e 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -113,7 +113,7 @@ static void kvp_poll_wrapper(void *channel) { /* Transaction is finished, reset the state here to avoid races. */ kvp_transaction.state = HVUTIL_READY; - hv_kvp_onchannelcallback(channel); + tasklet_schedule(&((struct vmbus_channel*)channel)->callback_event); } static void kvp_register_done(void) @@ -160,7 +160,7 @@ static void kvp_timeout_func(struct work_struct *dummy) static void kvp_host_handshake_func(struct work_struct *dummy) { - hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); + tasklet_schedule(&kvp_transaction.recv_channel->callback_event); } static int kvp_handle_handshake(struct hv_kvp_msg *msg) @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) NEGO_IN_PROGRESS, NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; - if (host_negotiatied == NEGO_NOT_STARTED && - kvp_transaction.state < HVUTIL_READY) { + if (kvp_transaction.state < HVUTIL_READY) { /* * If userspace daemon is not connected and host is asking * us to negotiate we need to delay to not lose messages. * This is important for Failover IP setting. */ - host_negotiatied = NEGO_IN_PROGRESS; - schedule_delayed_work(&kvp_host_handshake_work, + if (host_negotiatied == NEGO_NOT_STARTED) { + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(&kvp_host_handshake_work, HV_UTIL_NEGO_TIMEOUT * HZ); + } return; } if (kvp_transaction.state > HVUTIL_READY) @@ -705,6 +706,7 @@ void hv_kvp_onchannelcallback(void *context) VM_PKT_DATA_INBAND, 0); host_negotiatied = NEGO_FINISHED; + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH v3] HV: properly delay KVP packets when negotiation is in progress
From: Long Li The host may send multiple negotiation packets (due to timeout) before the KVP user-mode daemon is connected. We need to defer processing those packets until the daemon is negotiated and connected. It's okay for guest to respond to all negotiation packets. In addition, the host may send multiple staged KVP requests as soon as negotiation is done. We need to properly process those packets using one tasklet for exclusive access to ring buffer. This patch is based on the work of Nick Meier . The patch v3 has incorporated suggestions from Vitaly Kuznetsov . Signed-off-by: Long Li --- drivers/hv/hv_kvp.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index de26371..be7222e 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -113,7 +113,7 @@ static void kvp_poll_wrapper(void *channel) { /* Transaction is finished, reset the state here to avoid races. */ kvp_transaction.state = HVUTIL_READY; - hv_kvp_onchannelcallback(channel); + tasklet_schedule(&((struct vmbus_channel*)channel)->callback_event); } static void kvp_register_done(void) @@ -160,7 +160,7 @@ static void kvp_timeout_func(struct work_struct *dummy) static void kvp_host_handshake_func(struct work_struct *dummy) { - hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); + tasklet_schedule(&kvp_transaction.recv_channel->callback_event); } static int kvp_handle_handshake(struct hv_kvp_msg *msg) @@ -628,16 +628,17 @@ void hv_kvp_onchannelcallback(void *context) NEGO_IN_PROGRESS, NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; - if (host_negotiatied == NEGO_NOT_STARTED && - kvp_transaction.state < HVUTIL_READY) { + if (kvp_transaction.state < HVUTIL_READY) { /* * If userspace daemon is not connected and host is asking * us to negotiate we need to delay to not lose messages. * This is important for Failover IP setting. */ - host_negotiatied = NEGO_IN_PROGRESS; - schedule_delayed_work(&kvp_host_handshake_work, + if (host_negotiatied == NEGO_NOT_STARTED) { + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(&kvp_host_handshake_work, HV_UTIL_NEGO_TIMEOUT * HZ); + } return; } if (kvp_transaction.state > HVUTIL_READY) @@ -705,6 +706,7 @@ void hv_kvp_onchannelcallback(void *context) VM_PKT_DATA_INBAND, 0); host_negotiatied = NEGO_FINISHED; + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/2] pci-hyperv: Fix a bug in specifying CPU affinity
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of k...@exchange.microsoft.com > Sent: Friday, March 24, 2017 11:07 AM > To: helg...@kernel.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de; > a...@canonical.com; vkuzn...@redhat.com; jasow...@redhat.com; > leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen > Hemminger > Cc: sta...@vger.kernel.org > Subject: [PATCH 1/2] pci-hyperv: Fix a bug in specifying CPU affinity > > From: K. Y. Srinivasan > > When we have 32 or more CPUs in the affinity mask, we should use a special > constant to specify that to the host. Fix this issue. > > Signed-off-by: K. Y. Srinivasan > Cc: Reviewed-by: Long Li > --- > drivers/pci/host/pci-hyperv.c | 11 --- > 1 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c > index ada9856..32a16fb 100644 > --- a/drivers/pci/host/pci-hyperv.c > +++ b/drivers/pci/host/pci-hyperv.c > @@ -72,6 +72,7 @@ enum { > PCI_PROTOCOL_VERSION_CURRENT = PCI_PROTOCOL_VERSION_1_1 }; > > +#define CPU_AFFINITY_ALL -1ULL > #define PCI_CONFIG_MMIO_LENGTH 0x2000 > #define CFG_PAGE_OFFSET 0x1000 > #define CFG_PAGE_SIZE (PCI_CONFIG_MMIO_LENGTH - > CFG_PAGE_OFFSET) @@ -897,9 +898,13 @@ static void > hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) > * processors because Hyper-V only supports 64 in a guest. > */ > affinity = irq_data_get_affinity_mask(data); > - for_each_cpu_and(cpu, affinity, cpu_online_mask) { > - int_pkt->int_desc.cpu_mask |= > - (1ULL << vmbus_cpu_number_to_vp_number(cpu)); > + if (cpumask_weight(affinity) >= 32) { > + int_pkt->int_desc.cpu_mask = CPU_AFFINITY_ALL; > + } else { > + for_each_cpu_and(cpu, affinity, cpu_online_mask) { > + int_pkt->int_desc.cpu_mask |= > + (1ULL << vmbus_cpu_number_to_vp_number(cpu)); > + } > } > > ret = vmbus_sendpacket(hpdev->hbus->hdev->channel, int_pkt, > -- > 1.7.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/2] pci-hyperv: Fix an atomic bug
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of k...@exchange.microsoft.com > Sent: Friday, March 24, 2017 11:07 AM > To: helg...@kernel.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de; > a...@canonical.com; vkuzn...@redhat.com; jasow...@redhat.com; > leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen > Hemminger > Cc: sta...@vger.kernel.org > Subject: [PATCH 2/2] pci-hyperv: Fix an atomic bug > > From: K. Y. Srinivasan > > The memory allocation here needs to be non-blocking. > Fix the issue. > > Signed-off-by: K. Y. Srinivasan > Cc: Reviewed-by: Long Li > --- > drivers/pci/host/pci-hyperv.c |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c > index 32a16fb..85088a1 100644 > --- a/drivers/pci/host/pci-hyperv.c > +++ b/drivers/pci/host/pci-hyperv.c > @@ -877,7 +877,7 @@ static void hv_compose_msi_msg(struct irq_data > *data, struct msi_msg *msg) > hv_int_desc_free(hpdev, int_desc); > } > > - int_desc = kzalloc(sizeof(*int_desc), GFP_KERNEL); > + int_desc = kzalloc(sizeof(*int_desc), GFP_ATOMIC); > if (!int_desc) > goto drop_reference; > > -- > 1.7.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [Resend Patch 3/3] Storvsc: Select channel based on available percentage of ring buffer to write
> Subject: RE: [Resend Patch 3/3] Storvsc: Select channel based on available > percentage of ring buffer to write > > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org > > On Behalf Of Long Li > > Sent: Tuesday, March 27, 2018 5:49 PM > > To: KY Srinivasan ; Haiyang Zhang > > ; Stephen Hemminger > ; > > James E . J . Bottomley ; Martin K . Petersen > > ; de...@linuxdriverproject.org; linux- > > s...@vger.kernel.org; linux-ker...@vger.kernel.org; > > net...@vger.kernel.org > > Cc: Long Li > > Subject: [Resend Patch 3/3] Storvsc: Select channel based on available > > percentage of ring buffer to write > > > > From: Long Li > > > > This is a best effort for estimating on how busy the ring buffer is > > for that channel, based on available buffer to write in percentage. It > > is still possible that at the time of actual ring buffer write, the > > space may not be available due to other processes may be writing at the > time. > > > > Selecting a channel based on how full it is can reduce the possibility > > that a ring buffer write will fail, and avoid the situation a channel > > is over busy. > > > > Now it's possible that storvsc can use a smaller ring buffer size > > (e.g. 40k bytes) to take advantage of cache locality. > > > > Signed-off-by: Long Li > > --- > > drivers/scsi/storvsc_drv.c | 62 > > +- > > 1 file changed, 50 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index a2ec0bc9e9fa..b1a87072b3ab 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -395,6 +395,12 @@ MODULE_PARM_DESC(storvsc_ringbuffer_size, > "Ring > > buffer size (bytes)"); > > > > module_param(storvsc_vcpus_per_sub_channel, int, S_IRUGO); > > MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of VCPUs > to > > subchannels"); > > + > > +static int ring_avail_percent_lowater = 10; > > Reserving 10% of each ring buffer by default seems like more than is needed > in the storvsc driver. That would be about 4Kbytes for the 40K ring buffer > you suggest, and even more for a ring buffer of 128K. Each outgoing record is > only about 344 bytes (I'd have to check exactly). With the new channel > selection algorithm below, the only time we use a channel that is already > below the low water mark is when no channel could be found that is above > the low water mark. There could be a case of two or more threads deciding > that a channel is above the low water mark at the same time and both > choosing it, but that's likely to be rare. So it seems like we could set the It's not rare for two processes checking on the same channel at the same time, when running multiple processes I/O workload. The CPU to channel is not 1:1 mapping. > default low water mark to 5 percent or even 3 percent, which will let more of > the ring buffer be used, and let a channel be assigned according to the > algorithm, rather than falling through to the default because all channels > appear to be "full". It seems it's not about how big ring buffer is, e.g. even you have a ring buffer of infinite size, it won't help with performance if it's getting queued all the time, while other ring buffers are near empty. It's more about how multiple ring buffers are getting utilized in a reasonable and balanced way. Testing shows 10 is a good choice, while 3 is prone to return BUSY and trigger block layer retry. > > > +module_param(ring_avail_percent_lowater, int, S_IRUGO); > > +MODULE_PARM_DESC(ring_avail_percent_lowater, > > + "Select a channel if available ring size > this in percent"); > > + > > /* > > * Timeout in seconds for all devices managed by this driver. > > */ > > @@ -1285,9 +1291,9 @@ static int storvsc_do_io(struct hv_device > > *device, { > > struct storvsc_device *stor_device; > > struct vstor_packet *vstor_packet; > > - struct vmbus_channel *outgoing_channel; > > + struct vmbus_channel *outgoing_channel, *channel; > > int ret = 0; > > - struct cpumask alloced_mask; > > + struct cpumask alloced_mask, other_numa_mask; > > int tgt_cpu; > > > > vstor_packet = &request->vstor_packet; @@ -1301,22 +1307,53 @@ > > static int storvsc_do_io(struct hv_device *device, > > /* > > * Select an an appropriate channel to send the request out. > > */ &
RE: [PATCH v2] storvsc: Set up correct queue depth values for IDE devices
Hi Martin Can you take a look at the following patch? Long > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org > > On Behalf Of Long Li > > Sent: Thursday, March 22, 2018 2:47 PM > > To: KY Srinivasan ; Haiyang Zhang > > ; Stephen Hemminger > ; > > James E . J . Bottomley ; Martin K . Petersen > > ; de...@linuxdriverproject.org; linux- > > s...@vger.kernel.org; linux-ker...@vger.kernel.org > > Cc: Long Li > > Subject: [PATCH v2] storvsc: Set up correct queue depth values for IDE > > devices > > > > From: Long Li > > > > Unlike SCSI and FC, we don't use multiple channels for IDE. > > Also fix the calculation for sub-channels. > > > > Change log: > > v2: Addressed comment on incorrect number of sub-channels. > > (Michael Kelley ) > > > > Signed-off-by: Long Li > > Reviewed-by: Michael Kelley > > > --- > > drivers/scsi/storvsc_drv.c | 7 +-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index 8c51d628b52e..a2ec0bc9e9fa 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -1722,11 +1722,14 @@ static int storvsc_probe(struct hv_device > *device, > > max_targets = STORVSC_MAX_TARGETS; > > max_channels = STORVSC_MAX_CHANNELS; > > /* > > -* On Windows8 and above, we support sub-channels for > storage. > > +* On Windows8 and above, we support sub-channels for > storage > > +* on SCSI and FC controllers. > > * The number of sub-channels offerred is based on the > number of > > * VCPUs in the guest. > > */ > > - max_sub_channels = (num_cpus / > storvsc_vcpus_per_sub_channel); > > + if (!dev_is_ide) > > + max_sub_channels = > > + (num_cpus - 1) / > storvsc_vcpus_per_sub_channel; > > } > > > > scsi_driver.can_queue = (max_outstanding_req_per_channel * > > -- > > 2.14.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] storvsc: Set up correct queue depth values for IDE devices
> Subject: Re: [PATCH v2] storvsc: Set up correct queue depth values for IDE > devices > > > Long, > > > Can you take a look at the following patch? > > >> > + max_sub_channels = > >> > +(num_cpus - 1) / storvsc_vcpus_per_sub_channel; > > What happens if num_cpus = 1? If num_cpus=1, we don't have any sub channels. The host offers one sub channel for VM with 5 CPUs, after that it offers an additional sub channel every 4 CPUs. The primary channel is always offered. > > -- > Martin K. PetersenOracle Linux Engineering ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[Patch v2] Storvsc: Select channel based on available percentage of ring buffer to write
From: Long Li This is a best effort for estimating on how busy the ring buffer is for that channel, based on available buffer to write in percentage. It is still possible that at the time of actual ring buffer write, the space may not be available due to other processes may be writing at the time. Selecting a channel based on how full it is can reduce the possibility that a ring buffer write will fail, and avoid the situation a channel is over busy. Now it's possible that storvsc can use a smaller ring buffer size (e.g. 40k bytes) to take advantage of cache locality. Changes. v2: Pre-allocate struct cpumask on the heap. Struct cpumask is a big structure (1k bytes) when CONFIG_NR_CPUS=8192 (default value when CONFIG_MAXSMP=y). Don't use kernel stack for it by pre-allocating them using kmalloc when channels are first initialized. Signed-off-by: Long Li --- drivers/scsi/storvsc_drv.c | 90 -- 1 file changed, 72 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index a2ec0bc9e9fa..2a9fff94dd1a 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -395,6 +395,12 @@ MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer size (bytes)"); module_param(storvsc_vcpus_per_sub_channel, int, S_IRUGO); MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of VCPUs to subchannels"); + +static int ring_avail_percent_lowater = 10; +module_param(ring_avail_percent_lowater, int, S_IRUGO); +MODULE_PARM_DESC(ring_avail_percent_lowater, + "Select a channel if available ring size > this in percent"); + /* * Timeout in seconds for all devices managed by this driver. */ @@ -468,6 +474,13 @@ struct storvsc_device { * Mask of CPUs bound to subchannels. */ struct cpumask alloced_cpus; + /* +* Pre-allocated struct cpumask for each hardware queue. +* struct cpumask is used by selecting out-going channels. It is a +* big structure, default to 1024k bytes when CONFIG_MAXSMP=y. +* Pre-allocate it to avoid allocation on the kernel stack. +*/ + struct cpumask *cpumask_chns; /* Used for vsc/vsp channel reset process */ struct storvsc_cmd_request init_request; struct storvsc_cmd_request reset_request; @@ -872,6 +885,13 @@ static int storvsc_channel_init(struct hv_device *device, bool is_fc) if (stor_device->stor_chns == NULL) return -ENOMEM; + stor_device->cpumask_chns = kcalloc(num_possible_cpus(), + sizeof(struct cpumask), GFP_KERNEL); + if (stor_device->cpumask_chns == NULL) { + kfree(stor_device->stor_chns); + return -ENOMEM; + } + stor_device->stor_chns[device->channel->target_cpu] = device->channel; cpumask_set_cpu(device->channel->target_cpu, &stor_device->alloced_cpus); @@ -1232,6 +1252,7 @@ static int storvsc_dev_remove(struct hv_device *device) vmbus_close(device->channel); kfree(stor_device->stor_chns); + kfree(stor_device->cpumask_chns); kfree(stor_device); return 0; } @@ -1241,7 +1262,7 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device, { u16 slot = 0; u16 hash_qnum; - struct cpumask alloced_mask; + struct cpumask *alloced_mask = &stor_device->cpumask_chns[q_num]; int num_channels, tgt_cpu; if (stor_device->num_sc == 0) @@ -1257,10 +1278,10 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device, * III. Mapping is persistent. */ - cpumask_and(&alloced_mask, &stor_device->alloced_cpus, + cpumask_and(alloced_mask, &stor_device->alloced_cpus, cpumask_of_node(cpu_to_node(q_num))); - num_channels = cpumask_weight(&alloced_mask); + num_channels = cpumask_weight(alloced_mask); if (num_channels == 0) return stor_device->device->channel; @@ -1268,7 +1289,7 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device, while (hash_qnum >= num_channels) hash_qnum -= num_channels; - for_each_cpu(tgt_cpu, &alloced_mask) { + for_each_cpu(tgt_cpu, alloced_mask) { if (slot == hash_qnum) break; slot++; @@ -1285,9 +1306,9 @@ static int storvsc_do_io(struct hv_device *device, { struct storvsc_device *stor_device; struct vstor_packet *vstor_packet; - struct vmbus_channel *outgoing_channel; + struct vmbus_channel *outgoing_channel, *channel; int ret = 0; - struct cpumask alloced_mask; + struct cpumask *alloced_mask; i
RE: [PATCH 1/6] scsi: storvsc: Rather than look for sets of specific protocol versions, make decisions based on ranges.
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Friday, May 29, 2015 1:29 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: Keith Mange > Subject: [PATCH 1/6] scsi: storvsc: Rather than look for sets of specific > protocol > versions, make decisions based on ranges. > > From: keith.ma...@microsoft.com > > Rather than look for sets of specific protocol versions, make decisions based > on > ranges. This will be safer and require fewer changes going forward as we add > more storage protocol versions. > Reviewed-by: Long Li > Tested-by: Alex Ng > Signed-off-by: Keith Mange > Signed-off-by: K. Y. Srinivasan > --- > drivers/scsi/storvsc_drv.c | 11 +++ > 1 files changed, 3 insertions(+), 8 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 3c6584f..582f3b5 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -988,8 +988,7 @@ static int storvsc_channel_init(struct hv_device *device) >* support multi-channel. >*/ > max_chns = vstor_packet- > >storage_channel_properties.max_channel_cnt; > - if ((vmbus_proto_version != VERSION_WIN7) && > -(vmbus_proto_version != VERSION_WS2008)) { > + if (vmbus_proto_version >= VERSION_WIN8) { > if (vstor_packet->storage_channel_properties.flags & > STORAGE_CHANNEL_SUPPORTS_MULTI_CHANNEL) > process_sub_channels = true; > @@ -1758,9 +1757,7 @@ static int storvsc_probe(struct hv_device *device, >* set state to properly communicate with the host. >*/ > > - switch (vmbus_proto_version) { > - case VERSION_WS2008: > - case VERSION_WIN7: > + if (vmbus_proto_version < VERSION_WIN8) { > sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; > vmscsi_size_delta = sizeof(struct vmscsi_win8_extension); > vmstor_current_major = VMSTOR_WIN7_MAJOR; @@ - > 1768,8 +1765,7 @@ static int storvsc_probe(struct hv_device *device, > max_luns_per_target = > STORVSC_IDE_MAX_LUNS_PER_TARGET; > max_targets = STORVSC_IDE_MAX_TARGETS; > max_channels = STORVSC_IDE_MAX_CHANNELS; > - break; > - default: > + } else { > sense_buffer_size = > POST_WIN7_STORVSC_SENSE_BUFFER_SIZE; > vmscsi_size_delta = 0; > vmstor_current_major = VMSTOR_WIN8_MAJOR; @@ - > 1783,7 +1779,6 @@ static int storvsc_probe(struct hv_device *device, >* VCPUs in the guest. >*/ > max_sub_channels = (num_cpus / > storvsc_vcpus_per_sub_channel); > - break; > } > > scsi_driver.can_queue = (max_outstanding_req_per_channel * > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/6] scsi: storvsc: Use a single value to track protocol versions
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Friday, May 29, 2015 1:29 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: Keith Mange > Subject: [PATCH 2/6] scsi: storvsc: Use a single value to track protocol > versions > > From: keith.ma...@microsoft.com > > Use a single value to track protocol versions to simplify comparisons and to > be > consistent with vmbus version tracking. > Reviewed-by: Long Li > Tested-by: Alex Ng > Signed-off-by: Keith Mange > Signed-off-by: K. Y. Srinivasan > --- > drivers/scsi/storvsc_drv.c | 35 +-- > 1 files changed, 9 insertions(+), 26 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 582f3b5..5f9d133 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -58,12 +58,11 @@ > * Win8: 5.1 > */ > > +#define VMSTOR_PROTO_VERSION(MAJOR_, MINOR_) MAJOR_) & 0xff) > << 8) | \ > + (((MINOR_) & 0xff))) > > -#define VMSTOR_WIN7_MAJOR 4 > -#define VMSTOR_WIN7_MINOR 2 > - > -#define VMSTOR_WIN8_MAJOR 5 > -#define VMSTOR_WIN8_MINOR 1 > +#define VMSTOR_PROTO_VERSION_WIN7VMSTOR_PROTO_VERSION(4, > 2) > +#define VMSTOR_PROTO_VERSION_WIN8VMSTOR_PROTO_VERSION(5, > 1) > > > /* Packet structure describing virtual storage requests. */ @@ -161,8 +160,7 > @@ static int sense_buffer_size; > */ > > static int vmscsi_size_delta; > -static int vmstor_current_major; > -static int vmstor_current_minor; > +static int vmstor_proto_version; > > struct vmscsi_win8_extension { > /* > @@ -481,18 +479,6 @@ done: > kfree(wrk); > } > > -/* > - * Major/minor macros. Minor version is in LSB, meaning that earlier flat > - * version numbers will be interpreted as "0.x" (i.e., 1 becomes 0.1). > - */ > - > -static inline u16 storvsc_get_version(u8 major, u8 minor) -{ > - u16 version; > - > - version = ((major << 8) | minor); > - return version; > -} > > /* > * We can get incoming messages from the host that are not in response to > @@ -930,8 +916,7 @@ static int storvsc_channel_init(struct hv_device *device) > vstor_packet->operation = > VSTOR_OPERATION_QUERY_PROTOCOL_VERSION; > vstor_packet->flags = REQUEST_COMPLETION_FLAG; > > - vstor_packet->version.major_minor = > - storvsc_get_version(vmstor_current_major, > vmstor_current_minor); > + vstor_packet->version.major_minor = vmstor_proto_version; > > /* >* The revision number is only used in Windows; set it to 0. > @@ -1562,7 +1547,7 @@ static int storvsc_queuecommand(struct Scsi_Host > *host, struct scsi_cmnd *scmnd) > u32 payload_sz; > u32 length; > > - if (vmstor_current_major <= VMSTOR_WIN8_MAJOR) { > + if (vmstor_proto_version <= VMSTOR_PROTO_VERSION_WIN8) { > /* >* On legacy hosts filter unimplemented commands. >* Future hosts are expected to correctly handle @@ -1760,16 > +1745,14 @@ static int storvsc_probe(struct hv_device *device, > if (vmbus_proto_version < VERSION_WIN8) { > sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; > vmscsi_size_delta = sizeof(struct vmscsi_win8_extension); > - vmstor_current_major = VMSTOR_WIN7_MAJOR; > - vmstor_current_minor = VMSTOR_WIN7_MINOR; > + vmstor_proto_version = VMSTOR_PROTO_VERSION_WIN7; > max_luns_per_target = > STORVSC_IDE_MAX_LUNS_PER_TARGET; > max_targets = STORVSC_IDE_MAX_TARGETS; > max_channels = STORVSC_IDE_MAX_CHANNELS; > } else { > sense_buffer_size = > POST_WIN7_STORVSC_SENSE_BUFFER_SIZE; > vmscsi_size_delta = 0; > - vmstor_current_major = VMSTOR_WIN8_MAJOR; > - vmstor_current_minor = VMSTOR_WIN8_MINOR; > + vmstor_proto_version = VMSTOR_PROTO_VERSION_WIN8; > max_luns_per_target = STORVSC_MAX_LUNS_PER_TARGET; > max_targets = STORVSC_MAX_TARGETS; > max_channels = STORVSC_MAX_CHANNELS; > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 3/6] hv:scsi:Untangle the storage protocol negotiation from the vmbus protocol negotiation.
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Friday, May 29, 2015 1:29 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: Keith Mange > Subject: [PATCH 3/6] hv:scsi:Untangle the storage protocol negotiation from > the vmbus protocol negotiation. > > From: keith.ma...@microsoft.com > > Currently we are making decisions based on vmbus protocol versions that have > been negotiated; use storage potocol versions instead. > Reviewed-by: Long Li > Tested-by: Alex Ng > Signed-off-by: Keith Mange > Signed-off-by: K. Y. Srinivasan > --- > drivers/scsi/storvsc_drv.c | 109 +++-- > --- > 1 files changed, 87 insertions(+), 22 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 5f9d133..f29871e 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -56,14 +56,18 @@ > * V1 RC > 2008/1/31: 2.0 > * Win7: 4.2 > * Win8: 5.1 > + * Win8.1: 6.0 > + * Win10: 6.2 > */ > > #define VMSTOR_PROTO_VERSION(MAJOR_, MINOR_) MAJOR_) & 0xff) > << 8) | \ > (((MINOR_) & 0xff))) > > +#define VMSTOR_PROTO_VERSION_WIN6VMSTOR_PROTO_VERSION(2, > 0) > #define VMSTOR_PROTO_VERSION_WIN7VMSTOR_PROTO_VERSION(4, > 2) > #define VMSTOR_PROTO_VERSION_WIN8VMSTOR_PROTO_VERSION(5, > 1) > - > +#define VMSTOR_PROTO_VERSION_WIN8_1 VMSTOR_PROTO_VERSION(6, > 0) > +#define VMSTOR_PROTO_VERSION_WIN10 VMSTOR_PROTO_VERSION(6, > 2) > > /* Packet structure describing virtual storage requests. */ enum > vstor_packet_operation { @@ -205,6 +209,46 @@ struct vmscsi_request { > > > /* > + * The list of storage protocols in order of preference. > + */ > +struct vmstor_protocol { > + int protocol_version; > + int sense_buffer_size; > + int vmscsi_size_delta; > +}; > + > +#define VMSTOR_NUM_PROTOCOLS5 > + > +const struct vmstor_protocol vmstor_protocols[VMSTOR_NUM_PROTOCOLS] > = { > + { > + VMSTOR_PROTO_VERSION_WIN10, > + POST_WIN7_STORVSC_SENSE_BUFFER_SIZE, > + 0 > + }, > + { > + VMSTOR_PROTO_VERSION_WIN8_1, > + POST_WIN7_STORVSC_SENSE_BUFFER_SIZE, > + 0 > + }, > + { > + VMSTOR_PROTO_VERSION_WIN8, > + POST_WIN7_STORVSC_SENSE_BUFFER_SIZE, > + 0 > + }, > + { > + VMSTOR_PROTO_VERSION_WIN7, > + PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE, > + sizeof(struct vmscsi_win8_extension), > + }, > + { > + VMSTOR_PROTO_VERSION_WIN6, > + PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE, > + sizeof(struct vmscsi_win8_extension), > + } > +}; > + > + > +/* > * This structure is sent during the intialization phase to get the different > * properties of the channel. > */ > @@ -871,7 +915,7 @@ static int storvsc_channel_init(struct hv_device *device) > struct storvsc_device *stor_device; > struct storvsc_cmd_request *request; > struct vstor_packet *vstor_packet; > - int ret, t; > + int ret, t, i; > int max_chns; > bool process_sub_channels = false; > > @@ -911,36 +955,59 @@ static int storvsc_channel_init(struct hv_device > *device) > goto cleanup; > > > - /* reuse the packet for version range supported */ > - memset(vstor_packet, 0, sizeof(struct vstor_packet)); > - vstor_packet->operation = > VSTOR_OPERATION_QUERY_PROTOCOL_VERSION; > - vstor_packet->flags = REQUEST_COMPLETION_FLAG; > + for (i = 0; i < VMSTOR_NUM_PROTOCOLS; i++) { > + /* reuse the packet for version range supported */ > + memset(vstor_packet, 0, sizeof(struct vstor_packet)); > + vstor_packet->operation = > + VSTOR_OPERATION_QUERY_PROTOCOL_VERSION; > + vstor_packet->flags = REQUEST_COMPLETION_FLAG; > > - vstor_packet->version.major_minor = vmstor_proto_version; > + vstor_packet->version.major_minor = > + vmstor_protocols[i].protocol_version; > > - /* > - * The revision number is only used in Windows; set it to 0. > - */ > - vstor_packet->version.revision = 0; > +
RE: [PATCH 4/6] scsi: storvsc: use correct defaults for values determined by protocol negotiation
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Friday, May 29, 2015 1:29 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: Keith Mange > Subject: [PATCH 4/6] scsi: storvsc: use correct defaults for values determined > by protocol negotiation > > From: keith.ma...@microsoft.com > > Use correct defaults for values determined by protocol negotiation, instead of > resetting them with every scsi controller. > Reviewed-by: Long Li > Tested-by: Alex Ng > Signed-off-by: Keith Mange > Signed-off-by: K. Y. Srinivasan > --- > drivers/scsi/storvsc_drv.c | 33 +++-- > 1 files changed, 19 insertions(+), 14 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > f29871e..6f38cdf 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -151,19 +151,17 @@ struct hv_fc_wwn_packet { > > /* > * Sense buffer size changed in win8; have a run-time > - * variable to track the size we should use. > + * variable to track the size we should use. This value will > + * likely change during protocol negotiation but it is valid > + * to start by assuming pre-Win8. > */ > -static int sense_buffer_size; > +static int sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; > > /* > - * The size of the vmscsi_request has changed in win8. The > - * additional size is because of new elements added to the > - * structure. These elements are valid only when we are talking > - * to a win8 host. > - * Track the correction to size we need to apply. > - */ > - > -static int vmscsi_size_delta; > + * The storage protocol version is determined during the > + * initial exchange with the host. It will indicate which > + * storage functionality is available in the host. > +*/ > static int vmstor_proto_version; > > struct vmscsi_win8_extension { > @@ -209,6 +207,17 @@ struct vmscsi_request { > > > /* > + * The size of the vmscsi_request has changed in win8. The > + * additional size is because of new elements added to the > + * structure. These elements are valid only when we are talking > + * to a win8 host. > + * Track the correction to size we need to apply. This value > + * will likely change during protocol negotiation but it is > + * valid to start by assuming pre-Win8. > + */ > +static int vmscsi_size_delta = sizeof(struct vmscsi_win8_extension); > + > +/* > * The list of storage protocols in order of preference. > */ > struct vmstor_protocol { > @@ -1810,14 +1819,10 @@ static int storvsc_probe(struct hv_device *device, >*/ > > if (vmbus_proto_version < VERSION_WIN8) { > - sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; > - vmscsi_size_delta = sizeof(struct vmscsi_win8_extension); > max_luns_per_target = > STORVSC_IDE_MAX_LUNS_PER_TARGET; > max_targets = STORVSC_IDE_MAX_TARGETS; > max_channels = STORVSC_IDE_MAX_CHANNELS; > } else { > - sense_buffer_size = > POST_WIN7_STORVSC_SENSE_BUFFER_SIZE; > - vmscsi_size_delta = 0; > max_luns_per_target = STORVSC_MAX_LUNS_PER_TARGET; > max_targets = STORVSC_MAX_TARGETS; > max_channels = STORVSC_MAX_CHANNELS; > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 5/6] scsi: storvsc: use storage protocol version to determine storage capabilities
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Friday, May 29, 2015 1:29 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: Keith Mange > Subject: [PATCH 5/6] scsi: storvsc: use storage protocol version to determine > storage capabilities > > From: keith.ma...@microsoft.com > > Use storage protocol version instead of vmbus protocol version when > determining storage capabilities. > Reviewed-by: Long Li > Tested-by: Alex Ng > Signed-off-by: Keith Mange > Signed-off-by: K. Y. Srinivasan > --- > drivers/scsi/storvsc_drv.c |8 > 1 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 6f38cdf..58fa47a 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -1049,7 +1049,7 @@ static int storvsc_channel_init(struct hv_device > *device) >* support multi-channel. >*/ > max_chns = vstor_packet- > >storage_channel_properties.max_channel_cnt; > - if (vmbus_proto_version >= VERSION_WIN8) { > + if (vmstor_proto_version >= VMSTOR_PROTO_VERSION_WIN8) { > if (vstor_packet->storage_channel_properties.flags & > STORAGE_CHANNEL_SUPPORTS_MULTI_CHANNEL) > process_sub_channels = true; > @@ -1491,9 +1491,9 @@ static int storvsc_device_configure(struct > scsi_device *sdevice) >* if the device is a MSFT virtual device. >*/ > if (!strncmp(sdevice->vendor, "Msft", 4)) { > - switch (vmbus_proto_version) { > - case VERSION_WIN8: > - case VERSION_WIN8_1: > + switch (vmstor_proto_version) { > + case VMSTOR_PROTO_VERSION_WIN8: > + case VMSTOR_PROTO_VERSION_WIN8_1: > sdevice->scsi_level = SCSI_SPC_3; > break; > } > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 6/6] scsi: storvsc: Allow write_same when host is windows 10
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Friday, May 29, 2015 1:29 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: Keith Mange > Subject: [PATCH 6/6] scsi: storvsc: Allow write_same when host is windows 10 > > From: keith.ma...@microsoft.com > > Allow WRITE_SAME for Windows10 and above hosts. > Reviewed-by: Long Li > Tested-by: Alex Ng > Signed-off-by: Keith Mange > Signed-off-by: K. Y. Srinivasan > --- > drivers/scsi/storvsc_drv.c |6 +- > 1 files changed, 5 insertions(+), 1 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 58fa47a..021cbdf 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -1488,7 +1488,8 @@ static int storvsc_device_configure(struct > scsi_device *sdevice) > > /* >* If the host is WIN8 or WIN8 R2, claim conformance to SPC-3 > - * if the device is a MSFT virtual device. > + * if the device is a MSFT virtual device. If the host is > + * WIN10 or newer, allow write_same. >*/ > if (!strncmp(sdevice->vendor, "Msft", 4)) { > switch (vmstor_proto_version) { > @@ -1497,6 +1498,9 @@ static int storvsc_device_configure(struct > scsi_device *sdevice) > sdevice->scsi_level = SCSI_SPC_3; > break; > } > + > + if (vmstor_proto_version >= > VMSTOR_PROTO_VERSION_WIN10) > + sdevice->no_write_same = 0; > } > > return 0; > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [REGRESSION][Stable][v3.12.y][v4.4.y][v4.9.y][v4.10.y][v4.11-rc1] scsi: storvsc: properly set residual data length on errors
I think we need both going forward. They addressed different problems. > -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of Cathy Avery > Sent: Thursday, March 30, 2017 6:52 AM > To: driverdev-devel@linuxdriverproject.org; Stephen Hemminger > ; gre...@linuxfoundation.org > Subject: Re: [REGRESSION][Stable][v3.12.y][v4.4.y][v4.9.y][v4.10.y][v4.11-rc1] > scsi: storvsc: properly set residual data length on errors > > Hi, > > So which commit is moving forward and which one is not? > > f1c635b439a5c01776fe3a25b1e2dc546ea82e6f or > 40630f462824ee24bc00d692865c86c3828094e0? > > We have backported 40630f462824ee24bc00d692865c86c3828094e0 and I am > unclear if this is a regression and must be removed or it is a regression but > is > fixed by f1c635b439a5c01776fe3a25b1e2dc546ea82e6f and can remain. > > Thanks, > > Cathy > > On 03/28/2017 12:14 PM, Stephen Hemminger wrote: > > I decided not to send it to stable since problem was only observed on > > 4.11 but it is probably endemic to all GEN2 VM's > > > > -Original Message- > > From: Joseph Salisbury [mailto:joseph.salisb...@canonical.com] > > Sent: Tuesday, March 28, 2017 7:29 AM > > To: Stephen Hemminger ; Long Li > > > > Cc: KY Srinivasan ; Martin K. Petersen > > ; Haiyang Zhang > ; > > j...@linux.vnet.ibm.com; de...@linuxdriverproject.org; linux-scsi > > ; LKML ; > > sta...@vger.kernel.org; Greg KH > > Subject: Re: > > [REGRESSION][Stable][v3.12.y][v4.4.y][v4.9.y][v4.10.y][v4.11-rc1] > > scsi: storvsc: properly set residual data length on errors > > > > On 03/27/2017 06:14 PM, Stephen Hemminger wrote: > >> Are you sure the real problem is not the one fixed by this commit? > >> > >> commit f1c635b439a5c01776fe3a25b1e2dc546ea82e6f > >> Author: Stephen Hemminger > >> Date: Tue Mar 7 09:15:53 2017 -0800 > >> > >> scsi: storvsc: Workaround for virtual DVD SCSI version > >> > >> Hyper-V host emulation of SCSI for virtual DVD device reports SCSI > >> version 0 (UNKNOWN) but is still capable of supporting REPORTLUN. > >> > >> Without this patch, a GEN2 Linux guest on Hyper-V will not boot 4.11 > >> successfully with virtual DVD ROM device. What happens is that the > SCSI > >> scan process falls back to doing sequential probing by INQUIRY. But > >> the > >> storvsc driver has a previous workaround that masks/blocks all errors > >> reports from INQUIRY (or MODE_SENSE) commands. This workaround > causes > >> the scan to then populate a full set of bogus LUN's on the target and > >> then sends kernel spinning off into a death spiral doing block reads > >> on > >> the non-existent LUNs. > >> > >> By setting the correct blacklist flags, the target with the DVD device > >> is scanned with REPORTLUN and that works correctly. > >> > >> Patch needs to go in current 4.11, it is safe but not necessary in > >> older > >> kernels. > >> > >> Signed-off-by: Stephen Hemminger > >> Reviewed-by: K. Y. Srinivasan > >> Reviewed-by: Christoph Hellwig > >> Signed-off-by: Martin K. Petersen > >> > >> -Original Message----- > >> From: Joseph Salisbury [mailto:joseph.salisb...@canonical.com] > >> Sent: Monday, March 27, 2017 1:22 PM > >> To: Long Li > >> Cc: KY Srinivasan ; Martin K. Petersen > >> ; Haiyang Zhang > ; > >> Stephen Hemminger ; > j...@linux.vnet.ibm.com; > >> de...@linuxdriverproject.org; linux-scsi > >> ; LKML ; > >> sta...@vger.kernel.org; Greg KH > >> Subject: > >> [REGRESSION][Stable][v3.12.y][v4.4.y][v4.9.y][v4.10.y][v4.11-rc1] > >> scsi: storvsc: properly set residual data length on errors > >> > >> Hi Long Li, > >> > >> A kernel bug report was opened against Ubuntu [0]. After a kernel > >> bisect, it was found that reverting the following commit resolved this bug: > >> > >> commit 40630f462824ee24bc00d692865c86c3828094e0 > >> Author: Long Li > >> Date: Wed Dec 14 18:46:03 2016 -0800 > >> > >> scsi: storvsc: properly set residual data length on errors > >> > >> > >> The regression was introduced in mainline as of v4.11-rc1. It was > >> also cc'd to stable and has landed in v3.12.y, v4.4.y, v4.9.y and v4.10.y. >
[PATCH] storvsc: add more logging for error and warning messages
Introduce a logging level for storvsc to log certain error/warning messages. Those messages are helpful in some environments, e.g. Microsoft Azure, for customer support and troubleshooting purposes. Signed-off-by: Long Li --- drivers/scsi/storvsc_drv.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 40c43ae..afa1647 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -164,6 +164,21 @@ static int sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; */ static int vmstor_proto_version; +#define STORVSC_LOGGING_NONE 0 +#define STORVSC_LOGGING_ERROR 1 +#define STORVSC_LOGGING_WARN 2 + +static int logging_level = STORVSC_LOGGING_ERROR; +module_param(logging_level, int, S_IRUGO|S_IWUSR); +MODULE_PARM_DESC(logging_level, + "Logging level, 0 - None, 1 - Error (default), 2 - Warning."); + +static inline bool do_logging(int level) +{ + return (logging_level >= level) ? true : false; +} + + struct vmscsi_win8_extension { /* * The following were added in Windows 8 @@ -1183,7 +1198,7 @@ static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request) scmnd->result = vm_srb->scsi_status; - if (scmnd->result) { + if (scmnd->result && do_logging(STORVSC_LOGGING_ERROR)) { if (scsi_normalize_sense(scmnd->sense_buffer, SCSI_SENSE_BUFFERSIZE, &sense_hdr)) scsi_print_sense_hdr(scmnd->device, "storvsc", @@ -1239,12 +1254,25 @@ static void storvsc_on_io_completion(struct hv_device *device, stor_pkt->vm_srb.sense_info_length = vstor_packet->vm_srb.sense_info_length; + if (vstor_packet->vm_srb.scsi_status != 0 || + vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) + if (do_logging(STORVSC_LOGGING_WARN)) + dev_warn(&device->device, + "cmd 0x%x scsi status 0x%x srb status 0x%x\n", + stor_pkt->vm_srb.cdb[0], + vstor_packet->vm_srb.scsi_status, + vstor_packet->vm_srb.srb_status); if ((vstor_packet->vm_srb.scsi_status & 0xFF) == 0x02) { /* CHECK_CONDITION */ if (vstor_packet->vm_srb.srb_status & SRB_STATUS_AUTOSENSE_VALID) { /* autosense data available */ + if (do_logging(STORVSC_LOGGING_WARN)) + dev_warn(&device->device, + "stor pkt %p autosense data valid - len %d\n", + request, + vstor_packet->vm_srb.sense_info_length); memcpy(request->cmd->sense_buffer, vstor_packet->vm_srb.sense_data, -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] storvsc: add more logging for error and warning messages
Thanks Joe. I'll send out another patch. > -Original Message- > From: Joe Perches [mailto:j...@perches.com] > Sent: Thursday, December 3, 2015 6:28 PM > To: Long Li ; KY Srinivasan ; > Haiyang Zhang ; James E.J. Bottomley > > Cc: de...@linuxdriverproject.org; linux-s...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH] storvsc: add more logging for error and warning > messages > > On Thu, 2015-12-03 at 19:47 -0800, Long Li wrote: > > Introduce a logging level for storvsc to log certain error/warning > > messages. Those messages are helpful in some environments, e.g. > > Microsoft Azure, for customer support and troubleshooting purposes. > [] > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > [] > > +static inline bool do_logging(int level) { > > + return (logging_level >= level) ? true : false; > > The ternary is not necessary > > return logging_level >= level; > > is enough > > > +} > > + > > + > > struct vmscsi_win8_extension { > > /* > > * The following were added in Windows 8 @@ -1183,7 +1198,7 @@ > > static void storvsc_command_completion(struct storvsc_cmd_request > > *cmd_request) > > > > scmnd->result = vm_srb->scsi_status; > > > > - if (scmnd->result) { > > + if (scmnd->result && do_logging(STORVSC_LOGGING_ERROR)) { > > if (scsi_normalize_sense(scmnd->sense_buffer, > > SCSI_SENSE_BUFFERSIZE, &sense_hdr)) > > scsi_print_sense_hdr(scmnd->device, "storvsc", > > Is it appropriate to make this scsi_normalize_sense call conditional on > do_logging here? > > > @@ -1239,12 +1254,25 @@ static void storvsc_on_io_completion(struct > hv_device *device, > > stor_pkt->vm_srb.sense_info_length = > > vstor_packet->vm_srb.sense_info_length; > > > > + if (vstor_packet->vm_srb.scsi_status != 0 || > > + vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) > > + if (do_logging(STORVSC_LOGGING_WARN)) > > + dev_warn(&device->device, > > + "cmd 0x%x scsi status 0x%x srb status > 0x%x\n", > > + stor_pkt->vm_srb.cdb[0], > > + vstor_packet->vm_srb.scsi_status, > > + vstor_packet->vm_srb.srb_status); > > It might make some sense to use another macro indirection like > > #define svc_log_warn(dev, level, fmt, ...)\ > do { \ > if (do_logging(STORSVC_LOGGING_##level) \ > dev_warn(&(dev)->device, fmt, ##__VA_ARGS__); \ > } while (0) > > So a use could be: > > if (vstore_packet...) > svc_log_warn(device, WARN, ...); > > > ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH v2] storvsc: add logging for error/warning messages
Introduce a logging level for storvsc to log certain error/warning messages. Those messages are helpful in some environments, e.g. Microsoft Azure, for customer support and troubleshooting purposes. Signed-off-by: Long Li --- drivers/scsi/storvsc_drv.c | 34 +- 1 file changed, 33 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 40c43ae..f46ed2c 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -164,6 +164,26 @@ static int sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; */ static int vmstor_proto_version; +#define STORVSC_LOGGING_NONE 0 +#define STORVSC_LOGGING_ERROR 1 +#define STORVSC_LOGGING_WARN 2 + +static int logging_level = STORVSC_LOGGING_ERROR; +module_param(logging_level, int, S_IRUGO|S_IWUSR); +MODULE_PARM_DESC(logging_level, + "Logging level, 0 - None, 1 - Error (default), 2 - Warning."); + +static inline bool do_logging(int level) +{ + return logging_level >= level; +} + +#define storvsc_log(dev, level, fmt, ...) \ +do { \ + if (do_logging(level)) \ + dev_warn(&(dev)->device, fmt, ##__VA_ARGS__); \ +} while (0) + struct vmscsi_win8_extension { /* * The following were added in Windows 8 @@ -1185,7 +1205,8 @@ static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request) if (scmnd->result) { if (scsi_normalize_sense(scmnd->sense_buffer, - SCSI_SENSE_BUFFERSIZE, &sense_hdr)) + SCSI_SENSE_BUFFERSIZE, &sense_hdr) && + do_logging(STORVSC_LOGGING_ERROR)) scsi_print_sense_hdr(scmnd->device, "storvsc", &sense_hdr); } @@ -1239,6 +1260,13 @@ static void storvsc_on_io_completion(struct hv_device *device, stor_pkt->vm_srb.sense_info_length = vstor_packet->vm_srb.sense_info_length; + if (vstor_packet->vm_srb.scsi_status != 0 || + vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) + storvsc_log(device, STORVSC_LOGGING_WARN, + "cmd 0x%x scsi status 0x%x srb status 0x%x\n", + stor_pkt->vm_srb.cdb[0], + vstor_packet->vm_srb.scsi_status, + vstor_packet->vm_srb.srb_status); if ((vstor_packet->vm_srb.scsi_status & 0xFF) == 0x02) { /* CHECK_CONDITION */ @@ -1246,6 +1274,10 @@ static void storvsc_on_io_completion(struct hv_device *device, SRB_STATUS_AUTOSENSE_VALID) { /* autosense data available */ + storvsc_log(device, STORVSC_LOGGING_WARN, + "stor pkt %p autosense data valid - len %d\n", + request, vstor_packet->vm_srb.sense_info_length); + memcpy(request->cmd->sense_buffer, vstor_packet->vm_srb.sense_data, vstor_packet->vm_srb.sense_info_length); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] storvsc: add more logging for error and warning messages
> -Original Message- > From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com] > Sent: Friday, December 4, 2015 1:53 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; James E.J. Bottomley ; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > s...@vger.kernel.org > Subject: Re: [PATCH] storvsc: add more logging for error and warning > messages > > Long Li writes: > > > Introduce a logging level for storvsc to log certain error/warning > > messages. Those messages are helpful in some environments, e.g. > > Microsoft Azure, for customer support and troubleshooting purposes. > > I have an alternative suggestion: let's use dynamic debug! Basically, we need > to convert all non-error logging to using dev_dbg() and this can be enabled > dynamically when needed, even reboot won't be required. This is great idea for debugging! I think the messages (srb errors) we want to log in this patch are real errors. They are not for debugging, but for customer support in production environment. Those errors can be ignored in certain specific storage configurations due to some quirks. They are real errors on Azure, so we want to always log them. > > > > > Signed-off-by: Long Li > > --- > > drivers/scsi/storvsc_drv.c | 30 +- > > 1 file changed, 29 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index 40c43ae..afa1647 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -164,6 +164,21 @@ static int sense_buffer_size = > > PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; > > */ > > static int vmstor_proto_version; > > > > +#define STORVSC_LOGGING_NONE 0 > > +#define STORVSC_LOGGING_ERROR 1 > > +#define STORVSC_LOGGING_WARN 2 > > + > > +static int logging_level = STORVSC_LOGGING_ERROR; > > +module_param(logging_level, int, S_IRUGO|S_IWUSR); > > +MODULE_PARM_DESC(logging_level, > > + "Logging level, 0 - None, 1 - Error (default), 2 - Warning."); > > + > > +static inline bool do_logging(int level) { > > + return (logging_level >= level) ? true : false; } > > + > > + > > struct vmscsi_win8_extension { > > /* > > * The following were added in Windows 8 @@ -1183,7 +1198,7 @@ > > static void storvsc_command_completion(struct storvsc_cmd_request > > *cmd_request) > > > > scmnd->result = vm_srb->scsi_status; > > > > - if (scmnd->result) { > > + if (scmnd->result && do_logging(STORVSC_LOGGING_ERROR)) { > > if (scsi_normalize_sense(scmnd->sense_buffer, > > SCSI_SENSE_BUFFERSIZE, &sense_hdr)) > > scsi_print_sense_hdr(scmnd->device, "storvsc", @@ > -1239,12 > > +1254,25 @@ static void storvsc_on_io_completion(struct hv_device > *device, > > stor_pkt->vm_srb.sense_info_length = > > vstor_packet->vm_srb.sense_info_length; > > > > + if (vstor_packet->vm_srb.scsi_status != 0 || > > + vstor_packet->vm_srb.srb_status != SRB_STATUS_SUCCESS) > > + if (do_logging(STORVSC_LOGGING_WARN)) > > + dev_warn(&device->device, > > + "cmd 0x%x scsi status 0x%x srb status > 0x%x\n", > > + stor_pkt->vm_srb.cdb[0], > > + vstor_packet->vm_srb.scsi_status, > > + vstor_packet->vm_srb.srb_status); > > > > if ((vstor_packet->vm_srb.scsi_status & 0xFF) == 0x02) { > > /* CHECK_CONDITION */ > > if (vstor_packet->vm_srb.srb_status & > > SRB_STATUS_AUTOSENSE_VALID) { > > /* autosense data available */ > > + if (do_logging(STORVSC_LOGGING_WARN)) > > + dev_warn(&device->device, > > + "stor pkt %p autosense data valid - > len %d\n", > > + request, > > + vstor_packet- > >vm_srb.sense_info_length); > > > > memcpy(request->cmd->sense_buffer, > >vstor_packet->vm_srb.sense_data, > > -- > Vitaly ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] scsi: storvsc: use shost_for_each_device() instead of open coding
> -Original Message- > From: KY Srinivasan > Sent: Friday, July 03, 2015 11:35 AM > To: Vitaly Kuznetsov; linux-s...@vger.kernel.org > Cc: Long Li; Haiyang Zhang; James E.J. Bottomley; > de...@linuxdriverproject.org; > linux-ker...@vger.kernel.org > Subject: RE: [PATCH] scsi: storvsc: use shost_for_each_device() instead of > open > coding > > > > > -Original Message- > > From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com] > > Sent: Wednesday, July 1, 2015 2:31 AM > > To: linux-s...@vger.kernel.org > > Cc: Long Li; KY Srinivasan; Haiyang Zhang; James E.J. Bottomley; > > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org > > Subject: [PATCH] scsi: storvsc: use shost_for_each_device() instead of > > open coding > > > > Comment in struct Scsi_Host says that drivers are not supposed to > > access __devices directly. storvsc_host_scan() doesn't happen in irq > > context so we can just use shost_for_each_device(). > > > > Signed-off-by: Vitaly Kuznetsov > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > > --- > > drivers/scsi/storvsc_drv.c | 9 + > > 1 file changed, 1 insertion(+), 8 deletions(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index 3c6584f..9ea912b 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -426,7 +426,6 @@ static void storvsc_host_scan(struct work_struct > > *work) > > struct storvsc_scan_work *wrk; > > struct Scsi_Host *host; > > struct scsi_device *sdev; > > - unsigned long flags; > > > > wrk = container_of(work, struct storvsc_scan_work, work); > > host = wrk->host; > > @@ -443,14 +442,8 @@ static void storvsc_host_scan(struct work_struct > > *work) > > * may have been removed this way. > > */ > > mutex_lock(&host->scan_mutex); > > - spin_lock_irqsave(host->host_lock, flags); > > - list_for_each_entry(sdev, &host->__devices, siblings) { > > - spin_unlock_irqrestore(host->host_lock, flags); > > + shost_for_each_device(sdev, host) > > scsi_test_unit_ready(sdev, 1, 1, NULL); > > - spin_lock_irqsave(host->host_lock, flags); > > - continue; > > - } > > - spin_unlock_irqrestore(host->host_lock, flags); > > mutex_unlock(&host->scan_mutex); > > /* > > * Now scan the host to discover LUNs that may have been added. > > -- > > 2.4.3 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] pci-hyperv: use kmalloc to allocate hypercall params buffer
> -Original Message- > From: KY Srinivasan > Sent: Monday, December 5, 2016 1:23 PM > To: Cathy Avery ; Bjorn Helgaas > ; Long Li > Cc: de...@linuxdriverproject.org > Subject: RE: [PATCH] pci-hyperv: use kmalloc to allocate hypercall params > buffer > > > > > -Original Message- > > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > > Behalf Of Cathy Avery > > Sent: Monday, December 5, 2016 4:54 AM > > To: Bjorn Helgaas ; Long Li > > Cc: de...@linuxdriverproject.org > > Subject: Re: [PATCH] pci-hyperv: use kmalloc to allocate hypercall > > params buffer > > > > Hi, > > > > Is the double semicolon a typo? > > Yes; it is a typo. I'll fix this. > > K. Y > > > > Thanks, > > > > Cathy > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index 763ff87..ca553df 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -378,6 +378,8 @@ struct hv_pcibus_device { > > struct msi_domain_info msi_info; > > struct msi_controller msi_chip; > > struct irq_domain *irq_domain; > > + struct retarget_msi_interrupt retarget_msi_interrupt_params; > > + spinlock_t retarget_msi_interrupt_lock;; > > }; > > > > > > > > On 11/29/2016 06:25 PM, Bjorn Helgaas wrote: > > > On Tue, Nov 08, 2016 at 02:04:38PM -0800, Long Li wrote: > > >> From: Long Li > > >> > > >> hv_do_hypercall assumes that we pass a segment from a physically > > >> continuous buffer. Buffer allocated on the stack may not work if > > >> CONFIG_VMAP_STACK=y is set. > > >> > > >> Change to use kmalloc to allocate this buffer. > > >> > > >> The v2 patch adds locking to access the pre-allocated buffer. > > >> > > >> Signed-off-by: Long Li > > >> Reported-by: Haiyang Zhang > > > Applied with KY's ack to pci/host-hv, thanks! > > > > > >> --- > > >> drivers/pci/host/pci-hyperv.c | 29 +++-- > > >> 1 file changed, 19 insertions(+), 10 deletions(-) > > >> > > >> diff --git a/drivers/pci/host/pci-hyperv.c > > >> b/drivers/pci/host/pci-hyperv.c index 763ff87..ca553df 100644 > > >> --- a/drivers/pci/host/pci-hyperv.c > > >> +++ b/drivers/pci/host/pci-hyperv.c > > >> @@ -378,6 +378,8 @@ struct hv_pcibus_device { > > >> struct msi_domain_info msi_info; > > >> struct msi_controller msi_chip; > > >> struct irq_domain *irq_domain; > > >> +struct retarget_msi_interrupt retarget_msi_interrupt_params; > > >> +spinlock_t retarget_msi_interrupt_lock;; > > >> }; > > >> > > >> /* > > >> @@ -774,34 +776,40 @@ void hv_irq_unmask(struct irq_data *data) > > >> { > > >> struct msi_desc *msi_desc = irq_data_get_msi_desc(data); > > >> struct irq_cfg *cfg = irqd_cfg(data); > > >> -struct retarget_msi_interrupt params; > > >> +struct retarget_msi_interrupt *params; > > >> struct hv_pcibus_device *hbus; > > >> struct cpumask *dest; > > >> struct pci_bus *pbus; > > >> struct pci_dev *pdev; > > >> int cpu; > > >> +unsigned long flags; > > >> > > >> dest = irq_data_get_affinity_mask(data); > > >> pdev = msi_desc_to_pci_dev(msi_desc); > > >> pbus = pdev->bus; > > >> hbus = container_of(pbus->sysdata, struct hv_pcibus_device, > > sysdata); > > >> > > >> -memset(¶ms, 0, sizeof(params)); > > >> -params.partition_id = HV_PARTITION_ID_SELF; > > >> -params.source = 1; /* MSI(-X) */ > > >> -params.address = msi_desc->msg.address_lo; > > >> -params.data = msi_desc->msg.data; > > >> -params.device_id = (hbus->hdev->dev_instance.b[5] << 24) | > > >> +spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags); > > >> + > > >> +params = &hbus->retarget_msi_interrupt_params; > > >> +memset(params, 0, sizeof(*params)); > > >> +params->partition_id = HV_PARTITION_ID_SELF; >
RE: [PATCH] pci-hyperv: use kmalloc to allocate hypercall params buffer
> -Original Message- > From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Monday, December 5, 2016 8:53 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Bjorn Helgaas ; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > p...@vger.kernel.org > Subject: Re: [PATCH] pci-hyperv: use kmalloc to allocate hypercall params > buffer > > On Tue, 8 Nov 2016 14:04:38 -0800 > Long Li wrote: > > > + spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags); > > + > > + params = &hbus->retarget_msi_interrupt_params; > > + memset(params, 0, sizeof(*params)); > > + params->partition_id = HV_PARTITION_ID_SELF; > > + params->source = 1; /* MSI(-X) */ > > + params->address = msi_desc->msg.address_lo; > > + params->data = msi_desc->msg.data; > > + params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | > >(hbus->hdev->dev_instance.b[4] << 16) | > >(hbus->hdev->dev_instance.b[7] << 8) | > >(hbus->hdev->dev_instance.b[6] & 0xf8) | > >PCI_FUNC(pdev->devfn); > > - params.vector = cfg->vector; > > + params->vector = cfg->vector; > > > > for_each_cpu_and(cpu, dest, cpu_online_mask) > > - params.vp_mask |= (1ULL << > vmbus_cpu_number_to_vp_number(cpu)); > > + params->vp_mask |= (1ULL << > vmbus_cpu_number_to_vp_number(cpu)); > > + > > + hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL); > > > > - hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, ¶ms, NULL); > > + spin_unlock_irqrestore(&hbus->retarget_msi_interrupt_lock, flags); > > It looks like the additional locking here is being overly paranoid. > The caller is already holding the irq descriptor lock. Look at fixup_irqs. You are right. On my test machine, there are two possible places calling hv_irq_unmask(): request _irq() and handle_edge_irq(). They both have desc->lock held when calling .irq_unmask on the chip. A review of the IRQ code shows that desc->lock is always held while calling chip->irq_unmask(). Since the lock doesn't do any harm and it is not on performance code path, we can remove the lock in the upcoming patches. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] Retry infinitely for hypercall
From: Long Li Hyper-v host guarantees that a hypercall will succeed. Retry infinitely to avoid returning transient failures to upper layer. Signed-off-by: Long Li --- drivers/hv/connection.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 6ce8b87..4bcb099 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -439,7 +439,6 @@ int vmbus_post_msg(void *buffer, size_t buflen) { union hv_connection_id conn_id; int ret = 0; - int retries = 0; u32 usec = 1; conn_id.asu32 = 0; @@ -447,10 +446,10 @@ int vmbus_post_msg(void *buffer, size_t buflen) /* * hv_post_message() can have transient failures because of -* insufficient resources. Retry the operation a couple of -* times before giving up. +* insufficient resources. We retry infinitely on these failures +* because host guarantees hypercall will eventually succeed. */ - while (retries < 20) { + while (1) { ret = hv_post_message(conn_id, 1, buffer, buflen); switch (ret) { @@ -459,11 +458,11 @@ int vmbus_post_msg(void *buffer, size_t buflen) * We could get this if we send messages too * frequently. */ - ret = -EAGAIN; - break; case HV_STATUS_INSUFFICIENT_MEMORY: case HV_STATUS_INSUFFICIENT_BUFFERS: - ret = -ENOMEM; + /* +* Temporary failure out of resources +*/ break; case HV_STATUS_SUCCESS: return ret; @@ -472,12 +471,12 @@ int vmbus_post_msg(void *buffer, size_t buflen) return -EINVAL; } - retries++; udelay(usec); if (usec < 2048) usec *= 2; } - return ret; + /* Impossible to get here */ + BUG_ON(1); } /* -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] Retry infinitely for hypercall
> -Original Message- > From: Greg KH [mailto:g...@kroah.com] > Sent: Wednesday, January 4, 2017 12:51 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH] Retry infinitely for hypercall > > On Wed, Jan 04, 2017 at 02:39:31PM -0800, Long Li wrote: > > From: Long Li > > > > Hyper-v host guarantees that a hypercall will succeed. Retry infinitely to > avoid returning transient failures to upper layer. > > Please wrap your changelog at the proper column. Will do in V2. > > And what happens when the hypercall does not succeed? How is the kernel > going to recover from that? Sorry I should have used better wording in the patch. It should be "Retry infinitely on transient failures for hypercall". The host guarantees that it will return something other than transient failures in a reasonable small time frame. I will fix the comment in V2. > > > > > Signed-off-by: Long Li > > --- > > drivers/hv/connection.c | 17 - > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index > > 6ce8b87..4bcb099 100644 > > --- a/drivers/hv/connection.c > > +++ b/drivers/hv/connection.c > > @@ -439,7 +439,6 @@ int vmbus_post_msg(void *buffer, size_t buflen) { > > union hv_connection_id conn_id; > > int ret = 0; > > - int retries = 0; > > u32 usec = 1; > > > > conn_id.asu32 = 0; > > @@ -447,10 +446,10 @@ int vmbus_post_msg(void *buffer, size_t buflen) > > > > /* > > * hv_post_message() can have transient failures because of > > -* insufficient resources. Retry the operation a couple of > > -* times before giving up. > > +* insufficient resources. We retry infinitely on these failures > > +* because host guarantees hypercall will eventually succeed. > > */ > > - while (retries < 20) { > > + while (1) { > > ret = hv_post_message(conn_id, 1, buffer, buflen); > > > > switch (ret) { > > @@ -459,11 +458,11 @@ int vmbus_post_msg(void *buffer, size_t buflen) > > * We could get this if we send messages too > > * frequently. > > */ > > - ret = -EAGAIN; > > - break; > > Document you are falling through please, otherwise someone will "fix" > this later. Will add comment in V2. > > > case HV_STATUS_INSUFFICIENT_MEMORY: > > case HV_STATUS_INSUFFICIENT_BUFFERS: > > - ret = -ENOMEM; > > + /* > > +* Temporary failure out of resources > > +*/ > > break; > > case HV_STATUS_SUCCESS: > > return ret; > > @@ -472,12 +471,12 @@ int vmbus_post_msg(void *buffer, size_t buflen) > > return -EINVAL; > > } > > > > - retries++; > > udelay(usec); > > if (usec < 2048) > > usec *= 2; > > } > > - return ret; > > + /* Impossible to get here */ > > + BUG_ON(1); > > If it is impossible, why do you have this line at all? I will remove this line. There is no way for the code to get here. > > What is this trying to solve? Do you need to increase the time spent waiting? > We all know things break, please allow the kernel to stay alive if at all > possible. The purpose is to wait until the host returns a non-transient status code for a hypercall. However, we don't know how many transient failures we are getting before the host returns a final status code. So use the infinite loop to wait until the host returns the final status code. Thanks for reviewing. I will send V2 to address the comment. Long > > thanks, > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] Retry infinitely for hypercall
> -Original Message- > From: Dan Carpenter [mailto:dan.carpen...@oracle.com] > Sent: Wednesday, January 4, 2017 1:48 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH] Retry infinitely for hypercall > > Fix the subsystem prefix in the subject. > > On Wed, Jan 04, 2017 at 02:39:31PM -0800, Long Li wrote: > > From: Long Li > > > > Hyper-v host guarantees that a hypercall will succeed. Retry infinitely to > avoid returning transient failures to upper layer. > > > > Signed-off-by: Long Li > > --- > > drivers/hv/connection.c | 17 - > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index > > 6ce8b87..4bcb099 100644 > > --- a/drivers/hv/connection.c > > +++ b/drivers/hv/connection.c > > @@ -439,7 +439,6 @@ int vmbus_post_msg(void *buffer, size_t buflen) { > > union hv_connection_id conn_id; > > int ret = 0; > > Btw, when you disable GCC's uninitialized variable checking by storing bogus > values in "ret", it's eventually going to bite you in the bum. > Eventually you're going to get a bug that should have been detected through > static analysis if only you hadn't disabled it. > > > - int retries = 0; > > u32 usec = 1; > > > > conn_id.asu32 = 0; > > @@ -447,10 +446,10 @@ int vmbus_post_msg(void *buffer, size_t buflen) > > > > /* > > * hv_post_message() can have transient failures because of > > -* insufficient resources. Retry the operation a couple of > > -* times before giving up. > > +* insufficient resources. We retry infinitely on these failures > > +* because host guarantees hypercall will eventually succeed. > > */ > > - while (retries < 20) { > > + while (1) { > > ret = hv_post_message(conn_id, 1, buffer, buflen); > > > > switch (ret) { > > @@ -459,11 +458,11 @@ int vmbus_post_msg(void *buffer, size_t buflen) > > * We could get this if we send messages too > > * frequently. > > */ > > Move the comment above the code it's commenting about. > > /* >* We could get INVALID_CONNECTION_ID if we flood the >* host with too many messages. >*/ > case HV_STATUS_INVALID_CONNECTION_ID: > case HV_STATUS_INSUFFICIENT_MEMORY: > case HV_STATUS_INSUFFICIENT_BUFFERS: > break; > > > > > - ret = -EAGAIN; > > - break; > > case HV_STATUS_INSUFFICIENT_MEMORY: > > case HV_STATUS_INSUFFICIENT_BUFFERS: > > - ret = -ENOMEM; > > + /* > > +* Temporary failure out of resources > > +*/ > > break; > > case HV_STATUS_SUCCESS: > > return ret; > > return 0; > > Better to be more explicit. When I looked at this I got briefly confused if > this > function was supposed to return HV_ statuses or standard kernel error > codes. It turns out that HV_STATUS_SUCCESS is zero the success returns > map directly to linux kernel code for success but it's clearer to be explicit. > > > @@ -472,12 +471,12 @@ int vmbus_post_msg(void *buffer, size_t buflen) > > return -EINVAL; > > } > > > - retries++; > > udelay(usec); > > if (usec < 2048) > > usec *= 2; > > } > > - return ret; > > + /* Impossible to get here */ > > + BUG_ON(1); > > Remove the comment and the BUG_ON(). > > regards, > dan carpenter Thanks, I will fix those in V2. Long ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH v2] hv: retry infinitely on hypercall transient failures
From: Long Li Hyper-v host guarantees that a hypercall will finish in reasonable time. Retry infinitely on transient failures to avoid returning error to upper layer. Signed-off-by: Long Li --- drivers/hv/connection.c | 30 ++ 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 6ce8b87..4b3cfde 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -438,46 +438,44 @@ void vmbus_on_event(unsigned long data) int vmbus_post_msg(void *buffer, size_t buflen) { union hv_connection_id conn_id; - int ret = 0; - int retries = 0; + int ret; u32 usec = 1; conn_id.asu32 = 0; conn_id.u.id = VMBUS_MESSAGE_CONNECTION_ID; /* -* hv_post_message() can have transient failures because of -* insufficient resources. Retry the operation a couple of -* times before giving up. +* hv_post_message() can have transient failures. We retry infinitely +* on these failures because host guarantees hypercall will finish. */ - while (retries < 20) { + while (1) { ret = hv_post_message(conn_id, 1, buffer, buflen); switch (ret) { + /* +* Retry on transient failures: +* 1. HV_STATUS_INVALID_CONNECTION_ID: +*We send messages too frequently. +* +* 2. HV_STATUS_INSUFFICIENT_MEMORY and +*HV_STATUS_INSUFFICIENT_BUFFERS: +*The host is temporariliy running out of resources. +*/ case HV_STATUS_INVALID_CONNECTION_ID: - /* -* We could get this if we send messages too -* frequently. -*/ - ret = -EAGAIN; - break; case HV_STATUS_INSUFFICIENT_MEMORY: case HV_STATUS_INSUFFICIENT_BUFFERS: - ret = -ENOMEM; break; case HV_STATUS_SUCCESS: - return ret; + return 0; default: pr_err("hv_post_msg() failed; error code:%d\n", ret); return -EINVAL; } - retries++; udelay(usec); if (usec < 2048) usec *= 2; } - return ret; } /* -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] hv: use substraction to update ring buffer index
From: Long Li The ring buffer code uses %= to calculate index. For x86/64, %= compiles to div, more than 10 times slower than sub. Replace div with sub for this data heavy code path. Signed-off-by: Long Li --- drivers/hv/ring_buffer.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index cd49cb1..f8eee6e 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -135,7 +135,8 @@ hv_get_next_readlocation_withoffset(struct hv_ring_buffer_info *ring_info, u32 next = ring_info->ring_buffer->read_index; next += offset; - next %= ring_info->ring_datasize; + if (next >= ring_info->ring_datasize) + next -= ring_info->ring_datasize; return next; } @@ -179,7 +180,8 @@ static u32 hv_copyfrom_ringbuffer( memcpy(dest, ring_buffer + start_read_offset, destlen); start_read_offset += destlen; - start_read_offset %= ring_buffer_size; + if (start_read_offset >= ring_buffer_size) + start_read_offset -= ring_buffer_size; return start_read_offset; } @@ -201,7 +203,8 @@ static u32 hv_copyto_ringbuffer( memcpy(ring_buffer + start_write_offset, src, srclen); start_write_offset += srclen; - start_write_offset %= ring_buffer_size; + if (start_write_offset >= ring_buffer_size) + start_write_offset -= ring_buffer_size; return start_write_offset; } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] hv: use substraction to update ring buffer index
> -Original Message- > From: Dan Carpenter [mailto:dan.carpen...@oracle.com] > Sent: Thursday, January 05, 2017 3:40 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH] hv: use substraction to update ring buffer index > > On Wed, Jan 04, 2017 at 08:08:22PM -0800, Long Li wrote: > > From: Long Li > > > > The ring buffer code uses %= to calculate index. For x86/64, %= > > compiles to div, more than 10 times slower than sub. > > > > Replace div with sub for this data heavy code path. > > > > Signed-off-by: Long Li > > --- > > drivers/hv/ring_buffer.c | 9 ++--- > > 1 file changed, 6 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index > > cd49cb1..f8eee6e 100644 > > --- a/drivers/hv/ring_buffer.c > > +++ b/drivers/hv/ring_buffer.c > > @@ -135,7 +135,8 @@ hv_get_next_readlocation_withoffset(struct > hv_ring_buffer_info *ring_info, > > u32 next = ring_info->ring_buffer->read_index; > > > > next += offset; > > - next %= ring_info->ring_datasize; > > + if (next >= ring_info->ring_datasize) > > + next -= ring_info->ring_datasize; > > I take it that we trust that offset is roughly correct and not more than 2x > ring_info->ring_datasize? I guess there is only one caller so it's probably > true... Yes, you are right. It's not possible that we are getting to 2x ring_datasize, because it's not possible to transfer data more than ring_datasize over ring buffer. > > > > > return next; > > } > > @@ -179,7 +180,8 @@ static u32 hv_copyfrom_ringbuffer( > > memcpy(dest, ring_buffer + start_read_offset, destlen); > > > > start_read_offset += destlen; > > - start_read_offset %= ring_buffer_size; > > + if (start_read_offset >= ring_buffer_size) > > + start_read_offset -= ring_buffer_size; > > I totally don't understand the original code here. We do the memset and > then we verify that we are not copying beyond the end of the ring buffer? If > feels like we should verify that offset + destlen aren't more than > ring_buffer_size before we do the memcpy(). The ring buffer pages are mapped to wraparound 2x virtual address space. Please see hv_ringbuffer_init(). The call to vmap() setup this virtual address space. So we can use memcpy across the last page. > > regards, > dan carpenter > Thanks for reviewing! Long ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] hv: retry infinitely on hypercall transient failures
> -Original Message- > From: Greg KH [mailto:g...@kroah.com] > Sent: Wednesday, January 04, 2017 11:48 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient failures > > On Wed, Jan 04, 2017 at 06:12:20PM -0800, Long Li wrote: > > From: Long Li > > > > Hyper-v host guarantees that a hypercall will finish in reasonable time. > > Retry infinitely on transient failures to avoid returning error to upper > > layer. > > Again, never retry "forever", always have a way out, otherwise you will crash. > > And again, why are you making this change? What problem does it solve? The problem it tries to solve is that in this code we are returning error prematurely on transient failures. The hypercall is used mostly in channel establishment. If we return a transient failure, the VM may not boot or not useful after boot due to some devices missing. Another approach is to increase the number of retries. But we don't know how many retries is safe, and Windows host side expects the guest retry infinitely and not return error on transient failures. > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] hv: retry infinitely on hypercall transient failures
> -Original Message- > From: Greg KH [mailto:g...@kroah.com] > Sent: Friday, January 06, 2017 11:43 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; de...@linuxdriverproject.org; linux- > ker...@vger.kernel.org > Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient failures > > On Sat, Jan 07, 2017 at 07:23:14AM +, Long Li wrote: > > > -Original Message- > > > From: Greg KH [mailto:g...@kroah.com] > > > Sent: Wednesday, January 04, 2017 11:48 PM > > > To: Long Li > > > Cc: KY Srinivasan ; Haiyang Zhang > > > ; de...@linuxdriverproject.org; linux- > > > ker...@vger.kernel.org > > > Subject: Re: [PATCH v2] hv: retry infinitely on hypercall transient > > > failures > > > > > > On Wed, Jan 04, 2017 at 06:12:20PM -0800, Long Li wrote: > > > > From: Long Li > > > > > > > > Hyper-v host guarantees that a hypercall will finish in reasonable time. > > > > Retry infinitely on transient failures to avoid returning error to upper > layer. > > > > > > Again, never retry "forever", always have a way out, otherwise you will > crash. > > > > > > And again, why are you making this change? What problem does it solve? > > > > The problem it tries to solve is that in this code we are returning > > error prematurely on transient failures. The hypercall is used mostly > > in channel establishment. If we return a transient failure, the VM may > > not boot or not useful after boot due to some devices missing. > > > > Another approach is to increase the number of retries. But we don't > > know how many retries is safe, and Windows host side expects the guest > > retry infinitely and not return error on transient failures. > > That implies a lot of trust in the host side, don't you think? > > Worse case, make the delay a minute or so, but give the system a way out > incase there's a bug in the host. As there will be bugs in the host, just > like > there are bugs in the client :) This makes sense. 1 minute is a long time for a hypercall. I will send V3. > > thanks, > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove
Hi Bjorn This patch is still pending. The patch has been ack'ed. Do you want me to resend this patch? Thanks Long > -Original Message- > From: KY Srinivasan > Sent: Friday, November 11, 2016 2:21 PM > To: Bjorn Helgaas ; Long Li > Cc: Haiyang Zhang ; Bjorn Helgaas > ; de...@linuxdriverproject.org; linux- > p...@vger.kernel.org; linux-ker...@vger.kernel.org; Long Li > > Subject: RE: [PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove > > > > > -Original Message- > > From: Bjorn Helgaas [mailto:helg...@kernel.org] > > Sent: Friday, November 11, 2016 1:04 PM > > To: Long Li > > Cc: KY Srinivasan ; Haiyang Zhang > > ; Bjorn Helgaas ; > > de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > > ker...@vger.kernel.org; Long Li > > Subject: Re: [PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove > > > > On Mon, Oct 03, 2016 at 11:42:47PM -0700, Long Li wrote: > > > From: Long Li > > > > > > hv_pci_devices_present is called in hv_pci_remove when we remove a > > > PCI > > device from host (e.g. by disabling SRIOV on a device). In > > hv_pci_remove, the bus is already removed before the call, so we don't > > need to rescan the bus in the workqueue scheduled from > > hv_pci_devices_present. By introducing status hv_pcibus_removed, we > can avoid this situation. > > > > > > Signed-off-by: Long Li > > > Tested-by: Cathy Avery > > > Reported-by: Xiaofeng Wang > > Acked-by: K. Y. Srinivasan > > > > > > I need an ack from the Hyper-V maintainers. I see acks for previous > > versions, but I don't know whether you've changed things that would > > invalidate those acks. If the acks still apply, please include them > > and repost these patches. > > > > Also, please run "git log --oneline drivers/pci/host/pci-hyperv.c" and > > make your subject line match the previous ones. > > > > > --- > > > drivers/pci/host/pci-hyperv.c | 20 +--- > > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 > > > --- a/drivers/pci/host/pci-hyperv.c > > > +++ b/drivers/pci/host/pci-hyperv.c > > > @@ -348,6 +348,7 @@ enum hv_pcibus_state { > > > hv_pcibus_init = 0, > > > hv_pcibus_probed, > > > hv_pcibus_installed, > > > + hv_pcibus_removed, > > > hv_pcibus_maximum > > > }; > > > > > > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct > > work_struct *work) > > > put_pcichild(hpdev, hv_pcidev_ref_initial); > > > } > > > > > > - /* Tell the core to rescan bus because there may have been changes. > > */ > > > - if (hbus->state == hv_pcibus_installed) { > > > + switch (hbus->state) { > > > + case hv_pcibus_installed: > > > + /* > > > + * Tell the core to rescan bus > > > + * because there may have been changes. > > > + */ > > > pci_lock_rescan_remove(); > > > pci_scan_child_bus(hbus->pci_bus); > > > pci_unlock_rescan_remove(); > > > - } else { > > > + break; > > > + > > > + case hv_pcibus_init: > > > + case hv_pcibus_probed: > > > survey_child_resources(hbus); > > > + break; > > > + > > > + default: > > > + break; > > > } > > > > > > up(&hbus->enum_sem); > > > @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, > > > hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); > > > if (!hbus) > > > return -ENOMEM; > > > + hbus->state = hv_pcibus_init; > > > > > > /* > > >* The PCI bus "domain" is what is called "segment" in ACPI and @@ > > > -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) > > > pci_stop_root_bus(hbus->pci_bus); > > > pci_remove_root_bus(hbus->pci_bus); > > > pci_unlock_rescan_remove(); > > > + hbus->state = hv_pcibus_removed; > > > } > > > > > > ret = hv_send_resources_released(hdev); > > > -- > > > 1.8.5.6 > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-pci" > > > in the body of a message to majord...@vger.kernel.org More > majordomo > > > info at > > > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fvger.k > > e > > rnel.org%2Fmajordomo- > > > info.html&data=02%7C01%7Ckys%40microsoft.com%7C982492a275ed4126c4 > > > d308d40a7644da%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C6361 > > > 44950466092469&sdata=9cXs6P1zoQ7qB%2BxYD9bsd%2BLMN%2BjwSPQkxnj > > iqBdv9go%3D&reserved=0 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject
Hi Bjorn, The patch is still pending (along with 1/2 v3). Please let me know if you want me to resend the two patches. Thanks Long > -Original Message- > From: KY Srinivasan > Sent: Tuesday, October 4, 2016 1:49 PM > To: Long Li ; Haiyang Zhang > ; Bjorn Helgaas > Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: RE: [PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject > > > > > -----Original Message- > > From: Long Li > > Sent: Monday, October 3, 2016 11:43 PM > > To: KY Srinivasan ; Haiyang Zhang > > ; Bjorn Helgaas > > Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > > ker...@vger.kernel.org; Long Li > > Subject: [PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject > > > > This sender failed our fraud detection checks and may not be who they > > appear to be. Learn about spoofing at http://aka.ms/LearnAboutSpoofing > > > > From: Long Li > > > > A PCI_EJECT message can arrive at the same time we are calling > > pci_scan_child_bus in the workqueue for the previous > PCI_BUS_RELATIONS > > message or in create_root_hv_pci_bus(), in this case we could > > potentailly modify the bus from multiple places. Properly lock the bus > access. > > > > Thanks Dexuan Cui for pointing out the race > > condition in create_root_hv_pci_bus(). > > > > Signed-off-by: Long Li > > Tested-by: Cathy Avery > > Reported-by: Xiaofeng Wang > > Acked-by: KY Srinivasan > > > --- > > drivers/pci/host/pci-hyperv.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index 4a37598..33c75c9 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -1198,9 +1198,11 @@ static int create_root_hv_pci_bus(struct > > hv_pcibus_device *hbus) > > hbus->pci_bus->msi = &hbus->msi_chip; > > hbus->pci_bus->msi->dev = &hbus->hdev->device; > > > > + pci_lock_rescan_remove(); > > pci_scan_child_bus(hbus->pci_bus); > > pci_bus_assign_resources(hbus->pci_bus); > > pci_bus_add_devices(hbus->pci_bus); > > + pci_unlock_rescan_remove(); > > hbus->state = hv_pcibus_installed; > > return 0; > > } > > @@ -1590,8 +1592,10 @@ static void hv_eject_device_work(struct > > work_struct *work) > > pdev = > > pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, > > 0, > >wslot); > > if (pdev) { > > + pci_lock_rescan_remove(); > > pci_stop_and_remove_bus_device(pdev); > > pci_dev_put(pdev); > > + pci_unlock_rescan_remove(); > > } > > > > memset(&ctxt, 0, sizeof(ctxt)); > > -- > > 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] hv: use substraction to update ring buffer index
> -Original Message- > From: Dexuan Cui > Sent: Sunday, January 15, 2017 7:12 PM > To: Long Li ; KY Srinivasan ; > Haiyang Zhang > Cc: de...@linuxdriverproject.org; linux-ker...@vger.kernel.org > Subject: RE: [PATCH] hv: use substraction to update ring buffer index > > > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > > Behalf Of Long Li > > Sent: Thursday, January 5, 2017 12:08 > > To: KY Srinivasan ; Haiyang Zhang > > > > Cc: de...@linuxdriverproject.org; linux-ker...@vger.kernel.org > > Subject: [PATCH] hv: use substraction to update ring buffer index > > > > From: Long Li > > > > The ring buffer code uses %= to calculate index. For x86/64, %= > > compiles to div, more than 10 times slower than sub. > > > > Replace div with sub for this data heavy code path. > > > > Signed-off-by: Long Li > > --- > > drivers/hv/ring_buffer.c | 9 ++--- > > 1 file changed, 6 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index > > cd49cb1..f8eee6e 100644 > > --- a/drivers/hv/ring_buffer.c > > +++ b/drivers/hv/ring_buffer.c > > @@ -135,7 +135,8 @@ hv_get_next_readlocation_withoffset(struct > > hv_ring_buffer_info *ring_info, > > u32 next = ring_info->ring_buffer->read_index; > > > > next += offset; > > - next %= ring_info->ring_datasize; > > + if (next >= ring_info->ring_datasize) > > + next -= ring_info->ring_datasize; > > > > return next; > > } > > @@ -179,7 +180,8 @@ static u32 hv_copyfrom_ringbuffer( > > memcpy(dest, ring_buffer + start_read_offset, destlen); > > > > start_read_offset += destlen; > > - start_read_offset %= ring_buffer_size; > > + if (start_read_offset >= ring_buffer_size) > > + start_read_offset -= ring_buffer_size; > > > > return start_read_offset; > > } > > @@ -201,7 +203,8 @@ static u32 hv_copyto_ringbuffer( > > memcpy(ring_buffer + start_write_offset, src, srclen); > > > > start_write_offset += srclen; > > - start_write_offset %= ring_buffer_size; > > + if (start_write_offset >= ring_buffer_size) > > + start_write_offset -= ring_buffer_size; > > > > return start_write_offset; > > } > > Hi Long, > I guess you want to fix put_pkt_raw() too. :-) Good point. I will send an updated patch. > > Thanks, > -- Dexuan ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove
Hi Bjorn, This patch and the other one in the series ([Resend PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject) have been Acked. Is there anything else should be done before it can be merged? Please let me know. Thanks Long > -Original Message- > From: KY Srinivasan > Sent: Friday, January 27, 2017 10:42 AM > To: Long Li ; Haiyang Zhang > ; Bjorn Helgaas > Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org; Long Li > Subject: RE: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus > remove > > > > > -----Original Message- > > From: Long Li [mailto:lon...@exchange.microsoft.com] > > Sent: Monday, January 23, 2017 9:45 PM > > To: KY Srinivasan ; Haiyang Zhang > > ; Bjorn Helgaas > > Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > > ker...@vger.kernel.org; Long Li > > Subject: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus > > remove > > > > [This sender failed our fraud detection checks and may not be who they > > appear to be. Learn about spoofing at > > http://aka.ms/LearnAboutSpoofing] > > > > From: Long Li > > > > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI > > device from host (e.g. by disabling SRIOV on a device). In > > hv_pci_remove, the bus is already removed before the call, so we don't > > need to rescan the bus in the workqueue scheduled from > > hv_pci_devices_present. By introducing status hv_pcibus_removed, we > can avoid this situation. > > > > Signed-off-by: Long Li > > Reported-by: Xiaofeng Wang > Acked-by: K. Y. Srinivasan > > --- > > drivers/pci/host/pci-hyperv.c | 20 +--- > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -348,6 +348,7 @@ enum hv_pcibus_state { > > hv_pcibus_init = 0, > > hv_pcibus_probed, > > hv_pcibus_installed, > > + hv_pcibus_removed, > > hv_pcibus_maximum > > }; > > > > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct > > work_struct *work) > > put_pcichild(hpdev, hv_pcidev_ref_initial); > > } > > > > - /* Tell the core to rescan bus because there may have been changes. > */ > > - if (hbus->state == hv_pcibus_installed) { > > + switch (hbus->state) { > > + case hv_pcibus_installed: > > + /* > > +* Tell the core to rescan bus > > +* because there may have been changes. > > +*/ > > pci_lock_rescan_remove(); > > pci_scan_child_bus(hbus->pci_bus); > > pci_unlock_rescan_remove(); > > - } else { > > + break; > > + > > + case hv_pcibus_init: > > + case hv_pcibus_probed: > > survey_child_resources(hbus); > > + break; > > + > > + default: > > + break; > > } > > > > up(&hbus->enum_sem); > > @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, > > hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); > > if (!hbus) > > return -ENOMEM; > > + hbus->state = hv_pcibus_init; > > > > /* > > * The PCI bus "domain" is what is called "segment" in ACPI > > and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device > *hdev) > > pci_stop_root_bus(hbus->pci_bus); > > pci_remove_root_bus(hbus->pci_bus); > > pci_unlock_rescan_remove(); > > + hbus->state = hv_pcibus_removed; > > } > > > > ret = hv_send_resources_released(hdev); > > -- > > 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 6/7] scsi: storvsc: Don't assume that the scatterlist is not chained
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Monday, March 23, 2015 2:07 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Subject: [PATCH 6/7] scsi: storvsc: Don't assume that the scatterlist is not > chained > > The current code assumes that the scatterlists presented are not chained. > Fix the code to not make this assumption. > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > --- > drivers/scsi/storvsc_drv.c | 98 +-- > 1 files changed, 57 insertions(+), 41 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 88f5d79..a599677 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -626,19 +626,6 @@ cleanup: > return NULL; > } > > -/* Disgusting wrapper functions */ > -static inline unsigned long sg_kmap_atomic(struct scatterlist *sgl, int idx) > -{ > - void *addr = kmap_atomic(sg_page(sgl + idx)); > - return (unsigned long)addr; > -} > - > -static inline void sg_kunmap_atomic(unsigned long addr) -{ > - kunmap_atomic((void *)addr); > -} > - > - > /* Assume the original sgl has enough room */ static unsigned int > copy_from_bounce_buffer(struct scatterlist *orig_sgl, > struct scatterlist *bounce_sgl, @@ - > 653,32 +640,39 @@ static unsigned int copy_from_bounce_buffer(struct > scatterlist *orig_sgl, > unsigned long bounce_addr = 0; > unsigned long dest_addr = 0; > unsigned long flags; > + struct scatterlist *cur_dest_sgl; > + struct scatterlist *cur_src_sgl; > > local_irq_save(flags); > - > + cur_dest_sgl = orig_sgl; > + cur_src_sgl = bounce_sgl; > for (i = 0; i < orig_sgl_count; i++) { > - dest_addr = sg_kmap_atomic(orig_sgl,i) + orig_sgl[i].offset; > + dest_addr = (unsigned long) > + kmap_atomic(sg_page(cur_dest_sgl)) + > + cur_dest_sgl->offset; > dest = dest_addr; > destlen = orig_sgl[i].length; > + destlen = cur_dest_sgl->length; > > if (bounce_addr == 0) > - bounce_addr = sg_kmap_atomic(bounce_sgl,j); > + bounce_addr = (unsigned long)kmap_atomic( > + sg_page(cur_src_sgl)); > > while (destlen) { > - src = bounce_addr + bounce_sgl[j].offset; > - srclen = bounce_sgl[j].length - bounce_sgl[j].offset; > + src = bounce_addr + cur_src_sgl->offset; > + srclen = cur_src_sgl->length - cur_src_sgl->offset; > > copylen = min(srclen, destlen); > memcpy((void *)dest, (void *)src, copylen); > > total_copied += copylen; > - bounce_sgl[j].offset += copylen; > + cur_src_sgl->offset += copylen; > destlen -= copylen; > dest += copylen; > > - if (bounce_sgl[j].offset == bounce_sgl[j].length) { > + if (cur_src_sgl->offset == cur_src_sgl->length) { > /* full */ > - sg_kunmap_atomic(bounce_addr); > + kunmap_atomic((void *)bounce_addr); > j++; > > /* > @@ -692,21 +686,27 @@ static unsigned int copy_from_bounce_buffer(struct > scatterlist *orig_sgl, > /* >* We are done; cleanup and return. >*/ > - sg_kunmap_atomic(dest_addr - > orig_sgl[i].offset); > + kunmap_atomic((void *)(dest_addr - > + cur_dest_sgl->offset)); > local_irq_restore(flags); > return total_copied; > } > > /* if we need to use another bounce buffer */ > - if (destlen || i != orig_sgl_count - 1) > -
RE: [PATCH 5/7] scsi: storvsc: Fix a bug in copy_from_bounce_buffer()
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of K. Y. Srinivasan > Sent: Monday, March 23, 2015 2:07 PM > To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > de...@linuxdriverproject.org; oher...@suse.com; jbottom...@parallels.com; > h...@infradead.org; linux-s...@vger.kernel.org; a...@canonical.com; > vkuzn...@redhat.com; jasow...@redhat.com > Cc: sta...@vger.kernel.org > Subject: [PATCH 5/7] scsi: storvsc: Fix a bug in copy_from_bounce_buffer() > > We may exit this function without properly freeing up the maapings we may > have acquired. Fix the bug. > > Signed-off-by: K. Y. Srinivasan Reviewed-by: Long Li > Cc: > --- > drivers/scsi/storvsc_drv.c | 15 --- > 1 files changed, 8 insertions(+), 7 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index > 5c13eec..88f5d79 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -754,21 +754,22 @@ static unsigned int copy_to_bounce_buffer(struct > scatterlist *orig_sgl, > if (bounce_sgl[j].length == PAGE_SIZE) { > /* full..move to next entry */ > sg_kunmap_atomic(bounce_addr); > + bounce_addr = 0; > j++; > + } > > - /* if we need to use another bounce buffer */ > - if (srclen || i != orig_sgl_count - 1) > - bounce_addr = > sg_kmap_atomic(bounce_sgl,j); > + /* if we need to use another bounce buffer */ > + if (srclen && bounce_addr == 0) > + bounce_addr = sg_kmap_atomic(bounce_sgl, > j); > > - } else if (srclen == 0 && i == orig_sgl_count - 1) { > - /* unmap the last bounce that is < PAGE_SIZE > */ > - sg_kunmap_atomic(bounce_addr); > - } > } > > sg_kunmap_atomic(src_addr - orig_sgl[i].offset); > } > > + if (bounce_addr) > + sg_kunmap_atomic(bounce_addr); > + > local_irq_restore(flags); > > return total_copied; > -- > 1.7.4.1 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/3] storvsc: use tagged SRB requests if supported by the device
From: Long Li Properly set SRB flags when hosting device supports tagged queuing. This patch improves the performance on Fiber Channel disks. --- drivers/scsi/storvsc_drv.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 8ccfc9e..a8f3e4c 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -136,6 +136,8 @@ struct hv_fc_wwn_packet { #define SRB_FLAGS_PORT_DRIVER_RESERVED 0x0F00 #define SRB_FLAGS_CLASS_DRIVER_RESERVED0xF000 +#define SP_UNTAGGED((unsigned char) ~0) +#define SRB_SIMPLE_TAG_REQUEST 0x20 /* * Platform neutral description of a scsi request - @@ -1451,6 +1453,12 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd) vm_srb->win8_extension.srb_flags |= SRB_FLAGS_DISABLE_SYNCH_TRANSFER; + if(scmnd->device->tagged_supported) { + vm_srb->win8_extension.srb_flags |= (SRB_FLAGS_QUEUE_ACTION_ENABLE | SRB_FLAGS_NO_QUEUE_FREEZE); + vm_srb->win8_extension.queue_tag = SP_UNTAGGED; + vm_srb->win8_extension.queue_action = SRB_SIMPLE_TAG_REQUEST; + } + /* Build the SRB */ switch (scmnd->sc_data_direction) { case DMA_TO_DEVICE: -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/3] storvsc: properly handle SRB_ERROR when sense message is present
From: Long Li When sense message is present on error, we should pass along to the upper layer to decide how to deal with the error. This patch fixes connectivity issues with Fiber Channel devices. --- drivers/scsi/storvsc_drv.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index a8f3e4c..8328c87 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -890,6 +890,9 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb, switch (SRB_STATUS(vm_srb->srb_status)) { case SRB_STATUS_ERROR: + /* Let upper layer deal with error when sense message is present */ + if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID) + break; /* * If there is an error; offline the device since all * error recovery strategies would have already been -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 3/3] storvsc: use block layer default segment size
From: Long Li We no long have the restriction of page size limit in the SG list. Remove it. The driver can properly handle default block segment size. --- drivers/scsi/storvsc_drv.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 8328c87..ac57f9c 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1271,9 +1271,6 @@ static int storvsc_do_io(struct hv_device *device, static int storvsc_device_configure(struct scsi_device *sdevice) { - - blk_queue_max_segment_size(sdevice->request_queue, PAGE_SIZE); - blk_queue_bounce_limit(sdevice->request_queue, BLK_BOUNCE_ANY); blk_queue_rq_timeout(sdevice->request_queue, (storvsc_timeout * HZ)); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/3] storvsc: fixes issues on Fiber Channel
From: Long Li This patch set fixes connectivity issues and improves performance for Fiber Channel disks. Long Li (3): Use tagged SRB requests if supported by the device Properly handle SRB_ERROR when sense message is present Use block layer default segment size drivers/scsi/storvsc_drv.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/3] storvsc: use tagged SRB requests if supported by the device
> -Original Message- > From: Johannes Thumshirn [mailto:jthumsh...@suse.de] > Sent: Wednesday, September 7, 2016 12:47 AM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; James E.J. Bottomley > ; Martin K. Petersen > ; de...@linuxdriverproject.org; linux- > s...@vger.kernel.org; linux-ker...@vger.kernel.org; Long Li > > Subject: Re: [PATCH 1/3] storvsc: use tagged SRB requests if supported by > the device > > On Tue, Sep 06, 2016 at 02:25:41PM -0700, Long Li wrote: > > From: Long Li > > > > Properly set SRB flags when hosting device supports tagged queuing. This > patch improves the performance on Fiber Channel disks. > > ENOSIGNEDOFF and please use checkpatch.pl on the patch. Thanks for pointing that out. I'll re-send the patches. > > > > > --- > > drivers/scsi/storvsc_drv.c | 8 > > 1 file changed, 8 insertions(+) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index 8ccfc9e..a8f3e4c 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -136,6 +136,8 @@ struct hv_fc_wwn_packet { > > #define SRB_FLAGS_PORT_DRIVER_RESERVED 0x0F00 > > #define SRB_FLAGS_CLASS_DRIVER_RESERVED0xF000 > > > > +#define SP_UNTAGGED((unsigned char) ~0) > > +#define SRB_SIMPLE_TAG_REQUEST 0x20 > > > > /* > > * Platform neutral description of a scsi request - @@ -1451,6 > > +1453,12 @@ static int storvsc_queuecommand(struct Scsi_Host *host, > struct scsi_cmnd *scmnd) > > vm_srb->win8_extension.srb_flags |= > > SRB_FLAGS_DISABLE_SYNCH_TRANSFER; > > > > + if(scmnd->device->tagged_supported) { > > + vm_srb->win8_extension.srb_flags |= > (SRB_FLAGS_QUEUE_ACTION_ENABLE | SRB_FLAGS_NO_QUEUE_FREEZE); > > + vm_srb->win8_extension.queue_tag = SP_UNTAGGED; > > + vm_srb->win8_extension.queue_action = > SRB_SIMPLE_TAG_REQUEST; > > + } > > + > > /* Build the SRB */ > > switch (scmnd->sc_data_direction) { > > case DMA_TO_DEVICE: > > -- > > 1.8.5.6 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-scsi" > > in the body of a message to majord...@vger.kernel.org More > majordomo > > info at > > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fvger.k > > ernel.org%2fmajordomo- > info.html&data=02%7c01%7clongli%40microsoft.com% > > > 7cdedd4c7ad4cf4955224d08d3d6f31f3d%7c72f988bf86f141af91ab2d7cd011db > 47% > > > 7c1%7c0%7c636088312112339554&sdata=QvrOLvFjisQ4Nfz%2bkz1uyt7G7wh > R7Uz7D > > DlYMuc5VUM%3d > > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG > Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 > 0850 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2] pci-hyperv: properly handle pci bus remove
From: Long Li hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. The patch fixes the following kernel panic. [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv] [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: 88007f60 [ 383.853124] RIP: 0010:[] [] pci_is_pcie+0x6/0x20 [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206 [ 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: 88007f603fd8 [ 383.853124] RDX: RSI: RDI: 642f3d4854415056 [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: a045eb9e [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: 880003ee1c00 [ 383.853124] R13: 63702f30303a3137 R14: R15: 0246 [ 383.853124] FS: () GS:88007b40() knlGS: [ 383.853124] CS: 0010 DS: ES: CR0: 80050033 [ 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: 000406f0 [ 383.853124] DR0: DR1: DR2: [ 383.853124] DR3: DR6: 0ff0 DR7: 0400 [ 383.853124] Stack: [ 383.853124] 88007f603d68 8134db17 0008 880003ee1c00 [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 8134ee2d [ 383.853124] 880003d8ed00 88007f603dd8 880075fec320 880003d8edb8 [ 383.853124] Call Trace: [ 383.853124] [] ? pci_scan_slot+0x27/0x140 [ 383.853124] [] pci_scan_child_bus+0x3d/0x150 [ 383.853124] [] pci_devices_present_work+0x3ea/0x400 [pci_hyperv] [ 383.853124] [] process_one_work+0x17b/0x470 [ 383.853124] [] worker_thread+0x126/0x410 [ 383.853124] [] ? rescuer_thread+0x460/0x460 [ 383.853124] [] kthread+0xcf/0xe0 [ 383.853124] [] ? kthread_create_on_node+0x140/0x140 [ 383.853124] [] ret_from_fork+0x58/0x90 [ 383.853124] [] ? kthread_create_on_node+0x140/0x140 [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 66 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 [ 383.853124] RIP [] pci_is_pcie+0x6/0x20 [ 383.853124] RSP Signed-off-by: Long Li --- drivers/pci/host/pci-hyperv.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index daa5fc3..26f049b 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -348,6 +348,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch (hbus->state) { + case hv_pcibus_installed: + /* +* Tell the core to rescan bus +* because there may have been changes. +*/ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); + break; + + default: + break; } up(&hbus->enum_sem); @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } ret = hv_send_resources_released(hdev); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2] pci-hyperv: properly handle device eject
From: Long Li A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message, in this case we could potentailly modify the bus from two places. Properly lock the bus access. Signed-off-by: Long Li --- drivers/pci/host/pci-hyperv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 3c2b330..ca77009 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct work_struct *work) pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, wslot); if (pdev) { - pci_stop_and_remove_bus_device(pdev); + pci_stop_and_remove_bus_device_locked(pdev); pci_dev_put(pdev); } -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] hv: do not lose pending heartbeat vmbus packets
From: Long Li The host keeps sending heartbeat packets independent of guest responding to them. In some situations, there might be multiple heartbeat packets pending in the ring buffer. Don't lose them, read them all. Signed-off-by: Long Li --- drivers/hv/hv_util.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c index d5acaa2..9dc6372 100644 --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -283,10 +283,14 @@ static void heartbeat_onchannelcallback(void *context) u8 *hbeat_txf_buf = util_heartbeat.recv_buffer; struct icmsg_negotiate *negop = NULL; - vmbus_recvpacket(channel, hbeat_txf_buf, -PAGE_SIZE, &recvlen, &requestid); + while (1) { + + vmbus_recvpacket(channel, hbeat_txf_buf, +PAGE_SIZE, &recvlen, &requestid); + + if (!recvlen) + break; - if (recvlen > 0) { icmsghdrp = (struct icmsg_hdr *)&hbeat_txf_buf[ sizeof(struct vmbuspipe_hdr)]; -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> -Original Message- > From: Dexuan Cui > Sent: Tuesday, September 13, 2016 2:51 AM > To: Long Li ; KY Srinivasan ; > Haiyang Zhang ; Bjorn Helgaas > > Cc: de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > p...@vger.kernel.org > Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject > > > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > > Behalf Of Long Li > > Sent: Tuesday, September 13, 2016 7:54 ... > > A PCI_EJECT message can arrive at the same time we are calling > > pci_scan_child_bus in the workqueue for the previous > PCI_BUS_RELATIONS > > message, in this case we could potentailly modify the bus from two places. > > Properly lock the bus access. > > > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct > > work_struct > > *work) > > pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, > 0, > >wslot); > > if (pdev) { > > - pci_stop_and_remove_bus_device(pdev); > > + pci_stop_and_remove_bus_device_locked(pdev); > > pci_dev_put(pdev); > > } > > The _locked version tries to get the mutex pci_rescan_remove_lock. > > But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so how can > this patch make sure the 2 code paths are not running simultaneously? Thanks for the review. The lock is to protect the following call to pci_scan_child_bus() in pci_devices_present_work(): /* * Tell the core to rescan bus * because there may have been changes. */ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); This race condition has shown up in the tests. You raised a valid concern in create_root_hv_pci_bus(). There might be another race condition there. I'll look into this. > > Thanks, > -- Dexuan ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of Long Li > Sent: Tuesday, September 13, 2016 10:33 AM > To: Dexuan Cui ; KY Srinivasan > ; Haiyang Zhang ; Bjorn > Helgaas > Cc: de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > p...@vger.kernel.org > Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject > > This sender failed our fraud detection checks and may not be who they > appear to be. Learn about spoofing at http://aka.ms/LearnAboutSpoofing > > > -Original Message- > > From: Dexuan Cui > > Sent: Tuesday, September 13, 2016 2:51 AM > > To: Long Li ; KY Srinivasan ; > > Haiyang Zhang ; Bjorn Helgaas > > > > Cc: de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > > p...@vger.kernel.org > > Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject > > > > > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] > > > On Behalf Of Long Li > > > Sent: Tuesday, September 13, 2016 7:54 ... > > > A PCI_EJECT message can arrive at the same time we are calling > > > pci_scan_child_bus in the workqueue for the previous > > PCI_BUS_RELATIONS > > > message, in this case we could potentailly modify the bus from two > places. > > > Properly lock the bus access. > > > > > > --- a/drivers/pci/host/pci-hyperv.c > > > +++ b/drivers/pci/host/pci-hyperv.c > > > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct > > > work_struct > > > *work) > > > pdev = > > > pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, > > 0, > > >wslot); > > > if (pdev) { > > > - pci_stop_and_remove_bus_device(pdev); > > > + pci_stop_and_remove_bus_device_locked(pdev); > > > pci_dev_put(pdev); > > > } > > > > The _locked version tries to get the mutex pci_rescan_remove_lock. > > > > But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so > > how can this patch make sure the 2 code paths are not running > simultaneously? > > Thanks for the review. > > The lock is to protect the following call to pci_scan_child_bus() in > pci_devices_present_work(): > > /* > * Tell the core to rescan bus > * because there may have been changes. > */ > pci_lock_rescan_remove(); > pci_scan_child_bus(hbus->pci_bus); > pci_unlock_rescan_remove(); > > This race condition has shown up in the tests. > > You raised a valid concern in create_root_hv_pci_bus(). There might be > another race condition there. I'll look into this. I think this code is safe here. If we reach the code pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is already called. > > > > > Thanks, > > -- Dexuan > ___ > devel mailing list > de...@linuxdriverproject.org > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fdriverde > v.linuxdriverproject.org%2fmailman%2flistinfo%2fdriverdev- > devel&data=02%7c01%7clongli%40microsoft.com%7c3d12ee6d87c140eb5114 > 08d3dbfc1713%7c72f988bf86f141af91ab2d7cd011db47%7c1%7c0%7c6360938 > 48185348266&sdata=a2GYqIBsQAFxszkKg3fl1nqqPgvZHh%2bAY2255RgrvUU > %3d ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> -Original Message- > From: Dexuan Cui > Sent: Tuesday, September 13, 2016 10:45 PM > To: Long Li ; KY Srinivasan ; > Haiyang Zhang ; Bjorn Helgaas > > Cc: de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > p...@vger.kernel.org > Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject > > > From: Long Li > > Sent: Wednesday, September 14, 2016 1:41 > > > > I think this code is safe here. If we reach the code > > pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is > > already called. > > When hv_pci_probe() -> create_root_hv_pci_bus() -> pci_scan_child_bus() > is running on one cpu, I think nothing in the current code can prevent > hv_eject_device_work() -> pci_stop_and_remove_bus_device_locked() > from running on another cpu? > > The race window is pretty small however. This is a valid race condition. I'll work on a V2 patch. Thanks! > > Thanks, > -- Dexuan ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
From: Long Li hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. The patch fixes the following kernel panic. [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv] [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: 88007f60 [ 383.853124] RIP: 0010:[] [] pci_is_pcie+0x6/0x20 [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206 [ 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: 88007f603fd8 [ 383.853124] RDX: RSI: RDI: 642f3d4854415056 [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: a045eb9e [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: 880003ee1c00 [ 383.853124] R13: 63702f30303a3137 R14: R15: 0246 [ 383.853124] FS: () GS:88007b40() knlGS: [ 383.853124] CS: 0010 DS: ES: CR0: 80050033 [ 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: 000406f0 [ 383.853124] DR0: DR1: DR2: [ 383.853124] DR3: DR6: 0ff0 DR7: 0400 [ 383.853124] Stack: [ 383.853124] 88007f603d68 8134db17 0008 880003ee1c00 [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 8134ee2d [ 383.853124] 880003d8ed00 88007f603dd8 880075fec320 880003d8edb8 [ 383.853124] Call Trace: [ 383.853124] [] ? pci_scan_slot+0x27/0x140 [ 383.853124] [] pci_scan_child_bus+0x3d/0x150 [ 383.853124] [] pci_devices_present_work+0x3ea/0x400 [pci_hyperv] [ 383.853124] [] process_one_work+0x17b/0x470 [ 383.853124] [] worker_thread+0x126/0x410 [ 383.853124] [] ? rescuer_thread+0x460/0x460 [ 383.853124] [] kthread+0xcf/0xe0 [ 383.853124] [] ? kthread_create_on_node+0x140/0x140 [ 383.853124] [] ret_from_fork+0x58/0x90 [ 383.853124] [] ? kthread_create_on_node+0x140/0x140 [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 66 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 [ 383.853124] RIP [] pci_is_pcie+0x6/0x20 [ 383.853124] RSP Signed-off-by: Long Li --- drivers/pci/host/pci-hyperv.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -348,6 +348,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch (hbus->state) { + case hv_pcibus_installed: + /* +* Tell the core to rescan bus +* because there may have been changes. +*/ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); + break; + + default: + break; } up(&hbus->enum_sem); @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } ret = hv_send_resources_released(hdev); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2 v2] pci-hyperv: lock pci bus on device eject
From: Long Li A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message or in create_root_hv_pci_bus(), in this case we could potentailly modify the bus from multiple places. Properly lock the bus access. Thanks Dexuan Cui for pointing out the race condition in create_root_hv_pci_bus(). Signed-off-by: Long Li --- drivers/pci/host/pci-hyperv.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 4a37598..33c75c9 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1198,9 +1198,11 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus) hbus->pci_bus->msi = &hbus->msi_chip; hbus->pci_bus->msi->dev = &hbus->hdev->device; + pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_bus_assign_resources(hbus->pci_bus); pci_bus_add_devices(hbus->pci_bus); + pci_unlock_rescan_remove(); hbus->state = hv_pcibus_installed; return 0; } @@ -1590,8 +1592,10 @@ static void hv_eject_device_work(struct work_struct *work) pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, wslot); if (pdev) { + pci_lock_rescan_remove(); pci_stop_and_remove_bus_device(pdev); pci_dev_put(pdev); + pci_unlock_rescan_remove(); } memset(&ctxt, 0, sizeof(ctxt)); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
Thanks for pointing that out. If you don't mind, I will also add "Tested-by: Cathy Avery ". > -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of Cathy Avery > Sent: Friday, September 23, 2016 4:59 AM > To: driverdev-devel@linuxdriverproject.org > Subject: Re: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove > > Hi, > > You seem to be missing the Reported-by tag. > > That's xiaof...@redhat.com. > > Cathy > > On 09/14/2016 10:10 PM, Long Li wrote: > > From: Long Li > > > > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI > device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, > the bus is already removed before the call, so we don't need to rescan the > bus in the workqueue scheduled from hv_pci_devices_present. By > introducing status hv_pcibus_removed, we can avoid this situation. > > > > The patch fixes the following kernel panic. > > > > [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv] > > [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: > > 88007f60 > > [ 383.853124] RIP: 0010:[] [] > > pci_is_pcie+0x6/0x20 > > [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206 [ > > 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: > > 88007f603fd8 > > [ 383.853124] RDX: RSI: RDI: > > 642f3d4854415056 > > [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: > > a045eb9e > > [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: > > 880003ee1c00 > > [ 383.853124] R13: 63702f30303a3137 R14: R15: > > 0246 > > [ 383.853124] FS: () GS:88007b40() > > knlGS: > > [ 383.853124] CS: 0010 DS: ES: CR0: 80050033 [ > > 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: > > 000406f0 > > [ 383.853124] DR0: DR1: DR2: > > > > [ 383.853124] DR3: DR6: 0ff0 DR7: > > 0400 > > [ 383.853124] Stack: > > [ 383.853124] 88007f603d68 8134db17 0008 > > 880003ee1c00 > > [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 > > 8134ee2d [ 383.853124] 880003d8ed00 88007f603dd8 > > 880075fec320 > > 880003d8edb8 > > [ 383.853124] Call Trace: > > [ 383.853124] [] ? pci_scan_slot+0x27/0x140 [ > > 383.853124] [] pci_scan_child_bus+0x3d/0x150 [ > > 383.853124] [] > > pci_devices_present_work+0x3ea/0x400 [pci_hyperv] [ 383.853124] > > [] process_one_work+0x17b/0x470 [ 383.853124] > > [] worker_thread+0x126/0x410 [ 383.853124] > > [] ? rescuer_thread+0x460/0x460 [ 383.853124] > > [] kthread+0xcf/0xe0 [ 383.853124] > > [] ? > > kthread_create_on_node+0x140/0x140 > > [ 383.853124] [] ret_from_fork+0x58/0x90 [ > > 383.853124] [] ? > > kthread_create_on_node+0x140/0x140 > > [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 > > 00 > > 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 > > 66 > > 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 > > [ 383.853124] RIP [] pci_is_pcie+0x6/0x20 [ > > 383.853124] RSP > > > > Signed-off-by: Long Li > > --- > > drivers/pci/host/pci-hyperv.c | 20 +--- > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -348,6 +348,7 @@ enum hv_pcibus_state { > > hv_pcibus_init = 0, > > hv_pcibus_probed, > > hv_pcibus_installed, > > + hv_pcibus_removed, > > hv_pcibus_maximum > > }; > > > > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct > work_struct *work) > > put_pcichild(hpdev, hv_pcidev_ref_initial); > > } > > > > - /* Tell the core to rescan bus because there may have been changes. > */ > > - if (hbus->state == hv_pcibus_installed) { > > + switch (hbus->state) { > > + case hv_pcibus_installed: > > + /* > > +* Tell the core to rescan bus > > +* because there may have been changes. &g
RE: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
> -Original Message- > From: Bjorn Helgaas [mailto:helg...@kernel.org] > Sent: Tuesday, September 27, 2016 12:30 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Bjorn Helgaas ; > de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org; Long Li > Subject: Re: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove > > On Wed, Sep 14, 2016 at 07:10:01PM -0700, Long Li wrote: > > From: Long Li > > > > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI > device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, > the bus is already removed before the call, so we don't need to rescan the > bus in the workqueue scheduled from hv_pci_devices_present. By > introducing status hv_pcibus_removed, we can avoid this situation. > > > > The patch fixes the following kernel panic. > > > > [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv] > > [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: > > 88007f60 > > [ 383.853124] RIP: 0010:[] [] > > pci_is_pcie+0x6/0x20 > > [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206 [ > > 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: > > 88007f603fd8 > > [ 383.853124] RDX: RSI: RDI: > > 642f3d4854415056 > > [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: > > a045eb9e > > [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: > > 880003ee1c00 > > [ 383.853124] R13: 63702f30303a3137 R14: R15: > > 0246 > > [ 383.853124] FS: () GS:88007b40() > > knlGS: > > [ 383.853124] CS: 0010 DS: ES: CR0: 80050033 [ > > 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: > > 000406f0 > > [ 383.853124] DR0: DR1: DR2: > > > > [ 383.853124] DR3: DR6: 0ff0 DR7: > > 0400 > > [ 383.853124] Stack: > > [ 383.853124] 88007f603d68 8134db17 0008 > > 880003ee1c00 > > [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 > > 8134ee2d [ 383.853124] 880003d8ed00 88007f603dd8 > > 880075fec320 > > 880003d8edb8 > > [ 383.853124] Call Trace: > > [ 383.853124] [] ? pci_scan_slot+0x27/0x140 [ > > 383.853124] [] pci_scan_child_bus+0x3d/0x150 [ > > 383.853124] [] > > pci_devices_present_work+0x3ea/0x400 [pci_hyperv] [ 383.853124] > > [] process_one_work+0x17b/0x470 [ 383.853124] > > [] worker_thread+0x126/0x410 [ 383.853124] > > [] ? rescuer_thread+0x460/0x460 [ 383.853124] > > [] kthread+0xcf/0xe0 [ 383.853124] > > [] ? > > kthread_create_on_node+0x140/0x140 > > [ 383.853124] [] ret_from_fork+0x58/0x90 [ > > 383.853124] [] ? > > kthread_create_on_node+0x140/0x140 > > [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 > > 00 > > 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 > > 66 > > 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 > > [ 383.853124] RIP [] pci_is_pcie+0x6/0x20 [ > > 383.853124] RSP > > Personally, I would remove the timestamps and addresses from this trace > because I don't think they contribute to diagnosing the problem. Thanks Bjorn. I will remove those kernel traces and send a v3 patch. > > > Signed-off-by: Long Li > > I'm ready to apply these but am waiting for an ack from the maintainers listed > in MAINTAINERS (feel free to update that if it's out of date). > > > --- > > drivers/pci/host/pci-hyperv.c | 20 +--- > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -348,6 +348,7 @@ enum hv_pcibus_state { > > hv_pcibus_init = 0, > > hv_pcibus_probed, > > hv_pcibus_installed, > > + hv_pcibus_removed, > > hv_pcibus_maximum > > }; > > > > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct > work_struct *work) > > put_pcichild(hpdev, hv_pcidev_ref_initial); > > } > > > > - /* Tell the core to rescan bus because there may have been changes. > */ >
RE: [PATCH] hv: do not lose pending heartbeat vmbus packets
> -Original Message- > From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com] > Sent: Thursday, September 29, 2016 2:22 AM > To: KY Srinivasan ; Long Li > Cc: Haiyang Zhang ; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org > Subject: Re: [PATCH] hv: do not lose pending heartbeat vmbus packets > > Long Li writes: > > > From: Long Li > > > > The host keeps sending heartbeat packets independent of guest > responding to them. In some situations, there might be multiple heartbeat > packets pending in the ring buffer. Don't lose them, read them all. > > > > Signed-off-by: Long Li > > Long, K. Y., > > it seems this patch didn't make it to char-misc tree and it looks like an > important fix. A couple of nitpicks below, > > > --- > > drivers/hv/hv_util.c | 10 +++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c index > > d5acaa2..9dc6372 100644 > > --- a/drivers/hv/hv_util.c > > +++ b/drivers/hv/hv_util.c > > @@ -283,10 +283,14 @@ static void heartbeat_onchannelcallback(void > *context) > > u8 *hbeat_txf_buf = util_heartbeat.recv_buffer; > > struct icmsg_negotiate *negop = NULL; > > > > - vmbus_recvpacket(channel, hbeat_txf_buf, > > -PAGE_SIZE, &recvlen, &requestid); > > + while (1) { > > + > > + vmbus_recvpacket(channel, hbeat_txf_buf, > > +PAGE_SIZE, &recvlen, &requestid); > > We should check vmbus_recvpacket() return value as well. E.g. > hv_ringbuffer_read() may return -EAGAIN in case we didn't receive the > whole packet (and we do this check in other drivers, see > storvsc_on_channel_callback() for example). I agree with you, we should check for -EAGAIN. This should also be done in storvsc_on_channel_callback. I think the chance of hv_ringbuffer_read() returning -EAGAIN is almost zero. Because read_index and write_index are updated after the whole packet is written to the ring buffer, and protected by memory barriers. So getting a partial read is impossible, unless the host is doing something wrong. Checking for recvlen is safe, because it's always set to 0 at the beginning of hv_ringbuffer_read(). Anyway, we should check for -EAGAIN for all hyperv drivers on read. I think this is a separate issue on how we deal with a buggy host. Will send another set of patches . > > > + > > + if (!recvlen) > > so this should be 'if (ret || !recvlen)' > > > + break; > > > > - if (recvlen > 0) { > > icmsghdrp = (struct icmsg_hdr *)&hbeat_txf_buf[ > > sizeof(struct vmbuspipe_hdr)]; > > -- > Vitaly ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove
From: Long Li hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is already removed before the call, so we don't need to rescan the bus in the workqueue scheduled from hv_pci_devices_present. By introducing status hv_pcibus_removed, we can avoid this situation. Signed-off-by: Long Li Tested-by: Cathy Avery Reported-by: Xiaofeng Wang --- drivers/pci/host/pci-hyperv.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -348,6 +348,7 @@ enum hv_pcibus_state { hv_pcibus_init = 0, hv_pcibus_probed, hv_pcibus_installed, + hv_pcibus_removed, hv_pcibus_maximum }; @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work) put_pcichild(hpdev, hv_pcidev_ref_initial); } - /* Tell the core to rescan bus because there may have been changes. */ - if (hbus->state == hv_pcibus_installed) { + switch (hbus->state) { + case hv_pcibus_installed: + /* +* Tell the core to rescan bus +* because there may have been changes. +*/ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - } else { + break; + + case hv_pcibus_init: + case hv_pcibus_probed: survey_child_resources(hbus); + break; + + default: + break; } up(&hbus->enum_sem); @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev, hbus = kzalloc(sizeof(*hbus), GFP_KERNEL); if (!hbus) return -ENOMEM; + hbus->state = hv_pcibus_init; /* * The PCI bus "domain" is what is called "segment" in ACPI and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev) pci_stop_root_bus(hbus->pci_bus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); + hbus->state = hv_pcibus_removed; } ret = hv_send_resources_released(hdev); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2 v3] pci-hyperv: lock pci bus on device eject
From: Long Li A PCI_EJECT message can arrive at the same time we are calling pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS message or in create_root_hv_pci_bus(), in this case we could potentailly modify the bus from multiple places. Properly lock the bus access. Thanks Dexuan Cui for pointing out the race condition in create_root_hv_pci_bus(). Signed-off-by: Long Li Tested-by: Cathy Avery Reported-by: Xiaofeng Wang --- drivers/pci/host/pci-hyperv.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 4a37598..33c75c9 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1198,9 +1198,11 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus) hbus->pci_bus->msi = &hbus->msi_chip; hbus->pci_bus->msi->dev = &hbus->hdev->device; + pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_bus_assign_resources(hbus->pci_bus); pci_bus_add_devices(hbus->pci_bus); + pci_unlock_rescan_remove(); hbus->state = hv_pcibus_installed; return 0; } @@ -1590,8 +1592,10 @@ static void hv_eject_device_work(struct work_struct *work) pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, wslot); if (pdev) { + pci_lock_rescan_remove(); pci_stop_and_remove_bus_device(pdev); pci_dev_put(pdev); + pci_unlock_rescan_remove(); } memset(&ctxt, 0, sizeof(ctxt)); -- 1.8.5.6 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] pci-hyperv: move hypercall buffer from stack to heap
From: Long Li We need to pass a segment from a physically continuous buffer to hv_do_hypercall. Buffer allocated on the stack may not work if CONFIG_VMAP_STACK=y is set. Moving the params buffer from stack to buffer returned by kmalloc. Signed-off-by: Long Li Reported-by: Haiyang Zhang --- drivers/pci/host/pci-hyperv.c | 24 +--- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 763ff87..97e6daf 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -378,6 +378,7 @@ struct hv_pcibus_device { struct msi_domain_info msi_info; struct msi_controller msi_chip; struct irq_domain *irq_domain; + struct retarget_msi_interrupt retarget_msi_interrupt_params; }; /* @@ -774,7 +775,7 @@ void hv_irq_unmask(struct irq_data *data) { struct msi_desc *msi_desc = irq_data_get_msi_desc(data); struct irq_cfg *cfg = irqd_cfg(data); - struct retarget_msi_interrupt params; + struct retarget_msi_interrupt *params; struct hv_pcibus_device *hbus; struct cpumask *dest; struct pci_bus *pbus; @@ -785,23 +786,24 @@ void hv_irq_unmask(struct irq_data *data) pdev = msi_desc_to_pci_dev(msi_desc); pbus = pdev->bus; hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata); - - memset(¶ms, 0, sizeof(params)); - params.partition_id = HV_PARTITION_ID_SELF; - params.source = 1; /* MSI(-X) */ - params.address = msi_desc->msg.address_lo; - params.data = msi_desc->msg.data; - params.device_id = (hbus->hdev->dev_instance.b[5] << 24) | + params = &hbus->retarget_msi_interrupt_params; + + memset(params, 0, sizeof(*params)); + params->partition_id = HV_PARTITION_ID_SELF; + params->source = 1; /* MSI(-X) */ + params->address = msi_desc->msg.address_lo; + params->data = msi_desc->msg.data; + params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | (hbus->hdev->dev_instance.b[4] << 16) | (hbus->hdev->dev_instance.b[7] << 8) | (hbus->hdev->dev_instance.b[6] & 0xf8) | PCI_FUNC(pdev->devfn); - params.vector = cfg->vector; + params->vector = cfg->vector; for_each_cpu_and(cpu, dest, cpu_online_mask) - params.vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu)); + params->vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu)); - hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, ¶ms, NULL); + hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL); pci_msi_unmask_irq(data); } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[Resend] [PATCH] pci-hyperv: use kmalloc to allocate hypercall params buffer
From: Long Li hv_do_hypercall assumes that we pass a segment from a physically continuous buffer. Buffer allocated on the stack may not work if CONFIG_VMAP_STACK=y is set. Use kmalloc to allocate this buffer. Signed-off-by: Long Li Reported-by: Haiyang Zhang --- drivers/pci/host/pci-hyperv.c | 24 +--- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 763ff87..97e6daf 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -378,6 +378,7 @@ struct hv_pcibus_device { struct msi_domain_info msi_info; struct msi_controller msi_chip; struct irq_domain *irq_domain; + struct retarget_msi_interrupt retarget_msi_interrupt_params; }; /* @@ -774,7 +775,7 @@ void hv_irq_unmask(struct irq_data *data) { struct msi_desc *msi_desc = irq_data_get_msi_desc(data); struct irq_cfg *cfg = irqd_cfg(data); - struct retarget_msi_interrupt params; + struct retarget_msi_interrupt *params; struct hv_pcibus_device *hbus; struct cpumask *dest; struct pci_bus *pbus; @@ -785,23 +786,24 @@ void hv_irq_unmask(struct irq_data *data) pdev = msi_desc_to_pci_dev(msi_desc); pbus = pdev->bus; hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata); - - memset(¶ms, 0, sizeof(params)); - params.partition_id = HV_PARTITION_ID_SELF; - params.source = 1; /* MSI(-X) */ - params.address = msi_desc->msg.address_lo; - params.data = msi_desc->msg.data; - params.device_id = (hbus->hdev->dev_instance.b[5] << 24) | + params = &hbus->retarget_msi_interrupt_params; + + memset(params, 0, sizeof(*params)); + params->partition_id = HV_PARTITION_ID_SELF; + params->source = 1; /* MSI(-X) */ + params->address = msi_desc->msg.address_lo; + params->data = msi_desc->msg.data; + params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | (hbus->hdev->dev_instance.b[4] << 16) | (hbus->hdev->dev_instance.b[7] << 8) | (hbus->hdev->dev_instance.b[6] & 0xf8) | PCI_FUNC(pdev->devfn); - params.vector = cfg->vector; + params->vector = cfg->vector; for_each_cpu_and(cpu, dest, cpu_online_mask) - params.vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu)); + params->vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu)); - hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, ¶ms, NULL); + hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL); pci_msi_unmask_irq(data); } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [Resend] [PATCH] pci-hyperv: use kmalloc to allocate hypercall params buffer
> -Original Message- > From: Greg KH [mailto:gre...@linuxfoundation.org] > Sent: Monday, November 7, 2016 11:00 PM > To: Long Li > Cc: KY Srinivasan ; Haiyang Zhang > ; Bjorn Helgaas ; > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > p...@vger.kernel.org > Subject: Re: [Resend] [PATCH] pci-hyperv: use kmalloc to allocate hypercall > params buffer > > On Tue, Nov 08, 2016 at 12:14:14AM -0800, Long Li wrote: > > From: Long Li > > > > hv_do_hypercall assumes that we pass a segment from a physically > continuous buffer. Buffer allocated on the stack may not work if > CONFIG_VMAP_STACK=y is set. Use kmalloc to allocate this buffer. > > Please wrap your changelog at 72 columns. > > > > > Signed-off-by: Long Li > > Reported-by: Haiyang Zhang > > --- > > drivers/pci/host/pci-hyperv.c | 24 +--- > > 1 file changed, 13 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > b/drivers/pci/host/pci-hyperv.c index 763ff87..97e6daf 100644 > > --- a/drivers/pci/host/pci-hyperv.c > > +++ b/drivers/pci/host/pci-hyperv.c > > @@ -378,6 +378,7 @@ struct hv_pcibus_device { > > struct msi_domain_info msi_info; > > struct msi_controller msi_chip; > > struct irq_domain *irq_domain; > > + struct retarget_msi_interrupt retarget_msi_interrupt_params; > > Can you handle potentially unaligned accesses like this? Is there some lock > preventing you from using this structure more than once at the same time? > > > }; > > > > /* > > @@ -774,7 +775,7 @@ void hv_irq_unmask(struct irq_data *data) { > > struct msi_desc *msi_desc = irq_data_get_msi_desc(data); > > struct irq_cfg *cfg = irqd_cfg(data); > > - struct retarget_msi_interrupt params; > > + struct retarget_msi_interrupt *params; > > struct hv_pcibus_device *hbus; > > struct cpumask *dest; > > struct pci_bus *pbus; > > @@ -785,23 +786,24 @@ void hv_irq_unmask(struct irq_data *data) > > pdev = msi_desc_to_pci_dev(msi_desc); > > pbus = pdev->bus; > > hbus = container_of(pbus->sysdata, struct hv_pcibus_device, > > sysdata); > > - > > - memset(¶ms, 0, sizeof(params)); > > - params.partition_id = HV_PARTITION_ID_SELF; > > - params.source = 1; /* MSI(-X) */ > > - params.address = msi_desc->msg.address_lo; > > - params.data = msi_desc->msg.data; > > - params.device_id = (hbus->hdev->dev_instance.b[5] << 24) | > > + params = &hbus->retarget_msi_interrupt_params; > > + > > + memset(params, 0, sizeof(*params)); > > + params->partition_id = HV_PARTITION_ID_SELF; > > + params->source = 1; /* MSI(-X) */ > > + params->address = msi_desc->msg.address_lo; > > + params->data = msi_desc->msg.data; > > + params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | > >(hbus->hdev->dev_instance.b[4] << 16) | > >(hbus->hdev->dev_instance.b[7] << 8) | > >(hbus->hdev->dev_instance.b[6] & 0xf8) | > >PCI_FUNC(pdev->devfn); > > - params.vector = cfg->vector; > > + params->vector = cfg->vector; > > > > for_each_cpu_and(cpu, dest, cpu_online_mask) > > - params.vp_mask |= (1ULL << > vmbus_cpu_number_to_vp_number(cpu)); > > + params->vp_mask |= (1ULL << > vmbus_cpu_number_to_vp_number(cpu)); > > > > - hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, ¶ms, NULL); > > + hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL); > > As you only use this in one spot, why not just allocate it here and then free > it? Why add it to the pcibus device structure? Thanks Greg. I will send a V2. > > thanks, > > greg k-h ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] pci-hyperv: use kmalloc to allocate hypercall params buffer
From: Long Li hv_do_hypercall assumes that we pass a segment from a physically continuous buffer. Buffer allocated on the stack may not work if CONFIG_VMAP_STACK=y is set. Change to use kmalloc to allocate this buffer. The v2 patch adds locking to access the pre-allocated buffer. Signed-off-by: Long Li Reported-by: Haiyang Zhang --- drivers/pci/host/pci-hyperv.c | 29 +++-- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index 763ff87..ca553df 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -378,6 +378,8 @@ struct hv_pcibus_device { struct msi_domain_info msi_info; struct msi_controller msi_chip; struct irq_domain *irq_domain; + struct retarget_msi_interrupt retarget_msi_interrupt_params; + spinlock_t retarget_msi_interrupt_lock;; }; /* @@ -774,34 +776,40 @@ void hv_irq_unmask(struct irq_data *data) { struct msi_desc *msi_desc = irq_data_get_msi_desc(data); struct irq_cfg *cfg = irqd_cfg(data); - struct retarget_msi_interrupt params; + struct retarget_msi_interrupt *params; struct hv_pcibus_device *hbus; struct cpumask *dest; struct pci_bus *pbus; struct pci_dev *pdev; int cpu; + unsigned long flags; dest = irq_data_get_affinity_mask(data); pdev = msi_desc_to_pci_dev(msi_desc); pbus = pdev->bus; hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata); - memset(¶ms, 0, sizeof(params)); - params.partition_id = HV_PARTITION_ID_SELF; - params.source = 1; /* MSI(-X) */ - params.address = msi_desc->msg.address_lo; - params.data = msi_desc->msg.data; - params.device_id = (hbus->hdev->dev_instance.b[5] << 24) | + spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags); + + params = &hbus->retarget_msi_interrupt_params; + memset(params, 0, sizeof(*params)); + params->partition_id = HV_PARTITION_ID_SELF; + params->source = 1; /* MSI(-X) */ + params->address = msi_desc->msg.address_lo; + params->data = msi_desc->msg.data; + params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | (hbus->hdev->dev_instance.b[4] << 16) | (hbus->hdev->dev_instance.b[7] << 8) | (hbus->hdev->dev_instance.b[6] & 0xf8) | PCI_FUNC(pdev->devfn); - params.vector = cfg->vector; + params->vector = cfg->vector; for_each_cpu_and(cpu, dest, cpu_online_mask) - params.vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu)); + params->vp_mask |= (1ULL << vmbus_cpu_number_to_vp_number(cpu)); + + hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL); - hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, ¶ms, NULL); + spin_unlock_irqrestore(&hbus->retarget_msi_interrupt_lock, flags); pci_msi_unmask_irq(data); } @@ -2186,6 +2194,7 @@ static int hv_pci_probe(struct hv_device *hdev, INIT_LIST_HEAD(&hbus->resources_for_children); spin_lock_init(&hbus->config_lock); spin_lock_init(&hbus->device_list_lock); + spin_lock_init(&hbus->retarget_msi_interrupt_lock); sema_init(&hbus->enum_sem, 1); init_completion(&hbus->remove_event); -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [Resend] [PATCH] pci-hyperv: use kmalloc to allocate hypercall params buffer
> -Original Message- > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On > Behalf Of Long Li > Sent: Tuesday, November 8, 2016 8:57 AM > To: Greg KH > Cc: linux-...@vger.kernel.org; Haiyang Zhang ; > linux-ker...@vger.kernel.org; Bjorn Helgaas ; > de...@linuxdriverproject.org > Subject: RE: [Resend] [PATCH] pci-hyperv: use kmalloc to allocate hypercall > params buffer > > This sender failed our fraud detection checks and may not be who they > appear to be. Learn about spoofing at http://aka.ms/LearnAboutSpoofing > > > -Original Message- > > From: Greg KH [mailto:gre...@linuxfoundation.org] > > Sent: Monday, November 7, 2016 11:00 PM > > To: Long Li > > Cc: KY Srinivasan ; Haiyang Zhang > > ; Bjorn Helgaas ; > > de...@linuxdriverproject.org; linux-ker...@vger.kernel.org; linux- > > p...@vger.kernel.org > > Subject: Re: [Resend] [PATCH] pci-hyperv: use kmalloc to allocate > > hypercall params buffer > > > > On Tue, Nov 08, 2016 at 12:14:14AM -0800, Long Li wrote: > > > From: Long Li > > > > > > hv_do_hypercall assumes that we pass a segment from a physically > > continuous buffer. Buffer allocated on the stack may not work if > > CONFIG_VMAP_STACK=y is set. Use kmalloc to allocate this buffer. > > > > Please wrap your changelog at 72 columns. > > > > > > > > Signed-off-by: Long Li > > > Reported-by: Haiyang Zhang > > > --- > > > drivers/pci/host/pci-hyperv.c | 24 +--- > > > 1 file changed, 13 insertions(+), 11 deletions(-) > > > > > > diff --git a/drivers/pci/host/pci-hyperv.c > > > b/drivers/pci/host/pci-hyperv.c index 763ff87..97e6daf 100644 > > > --- a/drivers/pci/host/pci-hyperv.c > > > +++ b/drivers/pci/host/pci-hyperv.c > > > @@ -378,6 +378,7 @@ struct hv_pcibus_device { > > > struct msi_domain_info msi_info; > > > struct msi_controller msi_chip; > > > struct irq_domain *irq_domain; > > > + struct retarget_msi_interrupt retarget_msi_interrupt_params; > > > > Can you handle potentially unaligned accesses like this? Is there > > some lock preventing you from using this structure more than once at the > same time? > > > > > }; > > > > > > /* > > > @@ -774,7 +775,7 @@ void hv_irq_unmask(struct irq_data *data) { > > > struct msi_desc *msi_desc = irq_data_get_msi_desc(data); > > > struct irq_cfg *cfg = irqd_cfg(data); > > > - struct retarget_msi_interrupt params; > > > + struct retarget_msi_interrupt *params; > > > struct hv_pcibus_device *hbus; > > > struct cpumask *dest; > > > struct pci_bus *pbus; > > > @@ -785,23 +786,24 @@ void hv_irq_unmask(struct irq_data *data) > > > pdev = msi_desc_to_pci_dev(msi_desc); > > > pbus = pdev->bus; > > > hbus = container_of(pbus->sysdata, struct hv_pcibus_device, > > > sysdata); > > > - > > > - memset(¶ms, 0, sizeof(params)); > > > - params.partition_id = HV_PARTITION_ID_SELF; > > > - params.source = 1; /* MSI(-X) */ > > > - params.address = msi_desc->msg.address_lo; > > > - params.data = msi_desc->msg.data; > > > - params.device_id = (hbus->hdev->dev_instance.b[5] << 24) | > > > + params = &hbus->retarget_msi_interrupt_params; > > > + > > > + memset(params, 0, sizeof(*params)); > > > + params->partition_id = HV_PARTITION_ID_SELF; > > > + params->source = 1; /* MSI(-X) */ > > > + params->address = msi_desc->msg.address_lo; > > > + params->data = msi_desc->msg.data; > > > + params->device_id = (hbus->hdev->dev_instance.b[5] << 24) | > > >(hbus->hdev->dev_instance.b[4] << 16) | > > >(hbus->hdev->dev_instance.b[7] << 8) | > > >(hbus->hdev->dev_instance.b[6] & 0xf8) | > > >PCI_FUNC(pdev->devfn); > > > - params.vector = cfg->vector; > > > + params->vector = cfg->vector; > > > > > > for_each_cpu_and(cpu, dest, cpu_online_mask) > > > - params.vp_mask |= (1ULL << > > vmbus_cpu_number_to_vp_number(cpu)); > > > + params->vp_mask |= (1ULL << > > vmbus_cpu_number_to_vp_number(cpu)); > > > > > > - hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, ¶ms, NULL); > > &
[PATCH] storvsc: do not assume SG list is continuous when doing bounce buffers (for 4.1 stable only)
From: Long Li This patch is for linux-stable 4.1 branch only. storvsc checks the SG list for gaps before passing them to Hyper-v device. If there are gaps, data is copied to a bounce buffer and a continuous data buffer is passed to Hyper-V. The check on gaps assumes SG list is continuous, and not chained. This is not always true. Failing the check may result in incorrect I/O data passed to the Hyper-v device. This code path is not used post Linux 4.1. Signed-off-by: Long Li --- drivers/scsi/storvsc_drv.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 6c52d14..14dc5c6 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -584,17 +584,18 @@ static int do_bounce_buffer(struct scatterlist *sgl, unsigned int sg_count) for (i = 0; i < sg_count; i++) { if (i == 0) { /* make sure 1st one does not have hole */ - if (sgl[i].offset + sgl[i].length != PAGE_SIZE) + if (sgl->offset + sgl->length != PAGE_SIZE) return i; } else if (i == sg_count - 1) { /* make sure last one does not have hole */ - if (sgl[i].offset != 0) + if (sgl->offset != 0) return i; } else { /* make sure no hole in the middle */ - if (sgl[i].length != PAGE_SIZE || sgl[i].offset != 0) + if (sgl->length != PAGE_SIZE || sgl->offset != 0) return i; } + sgl = sg_next(sgl); } return -1; } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] storvsc: do not assume SG list is continuous when doing bounce buffers (for 4.1 stable only)
> Wouldn't it make sense to backport the changes to set the virt_boundary > (which probably still is the SG_GAPS flag in such an old kernel)? We can make storvsc use SG_GAPS. But the following patch is missing in 4.1 stable block layer to make this work on some I/O situations. Backporting is more difficult and affect other code. commit 5e7c4274a70aa2d6f485996d0ca1dad52d0039ca Author: Jens Axboe Date: Thu Sep 3 19:28:20 2015 +0300 block: Check for gaps on front and back merges We are checking for gaps to previous bio_vec, which can only detect back merges gaps. Moreover, at the point where we check for a gap, we don't know if we will attempt a back or a front merge. Thus, check for gap to prev in a back merge attempt and check for a gap to next in a front merge attempt. Signed-off-by: Jens Axboe [sagig: Minor rename change] Signed-off-by: Sagi Grimberg ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] storvsc: fix memory leak on ring buffer busy
From: Long Li When storvsc is sending I/O to Hyper-v, it may allocate a bigger buffer descriptor for large data payload that can't fit into a pre-allocated buffer descriptor. This bigger buffer is freed on return path. If I/O request to Hyper-v fails due to ring buffer busy, the storvsc allocated buffer descriptor should also be freed. Signed-off-by: Long Li --- drivers/scsi/storvsc_drv.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 009adb0..db52882 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1657,6 +1657,8 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd) ret = storvsc_do_io(dev, cmd_request, smp_processor_id()); if (ret == -EAGAIN) { + if (payload_sz > sizeof(cmd_request->mpb)) + kfree(payload); /* no more space */ return SCSI_MLQUEUE_DEVICE_BUSY; } -- 2.7.4 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel