Re: [PATCH 2/3] usb: dwc3: gadget: Fix early exit in set/clear ep halt

2017-04-10 Thread Felipe Balbi

Hi,

Thinh Nguyen  writes:
> This patch fixes a commit that causes a hang from device waiting for
> data with the wrong sequence number. The commit ffb80fc672c3 ("usb:
> dwc3: gadget: skip Set/Clear Halt when invalid") adds a check to return
> early depending on DWC3_EP_STALL is set or not, prevent sending the ep
> halt command to HW endpoint to do CLEAR_FEATURE(ENDPOINT_HALT) request.
> This was to workaround the issue for macOS where the device hangs from
> sending DWC3 clear stall command.
>
> In USB 3.1 spec, 9.4.5, CLEAR_FEATURE(ENDPOINT_HALT) request always
> results in the data sequence being reinitialized to zero regardless
> whether the endpoint has been halted or not. Some device class depends
> on this feature for its protocol. For instance, in mass storage class,
> there is MSC reset protocol that does CLEAR_FEATURE(ENDPOINT_HALT) on
> bulk endpoints. This protocol reinitializes the data sequence and
> ensures that whatever pending data requested from previous CBW will be
> reset. Otherwise this will cause a hang as the device can wait for the
> data with the wrong sequence number from the previous CBW. We found this
> failure in USB CV: MSC Error Recovery Test with f_mass_storage.
>
> This patch fixes this issue by checking to see whether the set/halt ep
> call is a protocol call before early exit to make sure that set/clear
> halt endpoint command can go through if it is a device class protocol.
>
> Fixes: ffb80fc672c3 ("usb: dwc3: gadget: skip Set/Clear Halt when invalid")
> Signed-off-by: Thinh Nguyen 

this will regress the macOS case I wrote that commit for. We need to
FIRST figure out why ClearStall command times out. Have you done that?

Do you have a testcase which constantly issues
ClearFeature(ENDPOINT_HALT) to a single endpoint and can verify that it
doesn't time out? Why does it timeout on macOS?

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/3] usb: dwc3: gadget: Fix early exit in set/clear ep halt

2017-04-10 Thread Felipe Balbi

Hi again,

Felipe Balbi  writes:
> Thinh Nguyen  writes:
>> This patch fixes a commit that causes a hang from device waiting for
>> data with the wrong sequence number. The commit ffb80fc672c3 ("usb:
>> dwc3: gadget: skip Set/Clear Halt when invalid") adds a check to return
>> early depending on DWC3_EP_STALL is set or not, prevent sending the ep
>> halt command to HW endpoint to do CLEAR_FEATURE(ENDPOINT_HALT) request.
>> This was to workaround the issue for macOS where the device hangs from
>> sending DWC3 clear stall command.
>>
>> In USB 3.1 spec, 9.4.5, CLEAR_FEATURE(ENDPOINT_HALT) request always
>> results in the data sequence being reinitialized to zero regardless
>> whether the endpoint has been halted or not. Some device class depends
>> on this feature for its protocol. For instance, in mass storage class,
>> there is MSC reset protocol that does CLEAR_FEATURE(ENDPOINT_HALT) on
>> bulk endpoints. This protocol reinitializes the data sequence and
>> ensures that whatever pending data requested from previous CBW will be
>> reset. Otherwise this will cause a hang as the device can wait for the
>> data with the wrong sequence number from the previous CBW. We found this
>> failure in USB CV: MSC Error Recovery Test with f_mass_storage.
>>
>> This patch fixes this issue by checking to see whether the set/halt ep
>> call is a protocol call before early exit to make sure that set/clear
>> halt endpoint command can go through if it is a device class protocol.
>>
>> Fixes: ffb80fc672c3 ("usb: dwc3: gadget: skip Set/Clear Halt when invalid")
>> Signed-off-by: Thinh Nguyen 
>
> this will regress the macOS case I wrote that commit for. We need to

no wait, it won't regress macOS, but we're still left with the problem
of host and peripheral being able to get DataToggle/SeqN out of sync.

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH] phy: qcom-qmp: Add dependency on COMMON_CLK

2017-04-10 Thread Vivek Gautam

On 2017-04-10 10:52, Kishon Vijay Abraham I wrote:

On Friday 07 April 2017 01:37 AM, Vivek Gautam wrote:

The driver uses clock provider interface, and therefore
it fails to build when enabled for COMPILE_TEST, since
COMMON_CLK is not enabled at that time.
So, make PHY_QCOM_QMP depend on COMMON_CLK as well.

Cc: Fengguang Wu 
Cc: Kishon Vijay Abraham I 
Signed-off-by: Vivek Gautam 
---

Hi Kishon,

This patch is fixing the build failures for architectures
such as, tile, and ia64, that don't enable COMMON_CLK by default.
You can either squash this into the qmp phy driver patch,
or put it as a separate patch with 'Fix' tag.
Let me know if you want me to add a 'fix' tag with a reference
to the commit ID.


I squashed it with your earlier patch and pushed.


Thanks Kishon.

Regards
Vivek



Thanks
Kishon



Regards
Vivek

 drivers/phy/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index bb8140355608..8550d3dc0b71 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -441,7 +441,7 @@ config PHY_STIH407_USB

 config PHY_QCOM_QMP
tristate "Qualcomm QMP PHY Driver"
-   depends on OF && (ARCH_QCOM || COMPILE_TEST)
+   depends on OF && COMMON_CLK && (ARCH_QCOM || COMPILE_TEST)
select GENERIC_PHY
help
  Enable this to support the QMP PHY transceiver that is used


--
To unsubscribe from this list: send the line "unsubscribe 
linux-arm-msm" in

the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: dwc3: ep0: improve handling of unaligned OUT requests

2017-04-10 Thread Janusz Dziedzic
On 7 April 2017 at 13:36, Felipe Balbi  wrote:
> Just like we did for all other endpoint types, let's rely on a chained
> TRB pointing to ep0_bounce_addr in order to align transfer size. This
> will make the code simpler.
>
> Signed-off-by: Felipe Balbi 
> ---
>  drivers/usb/dwc3/core.h   |  6 +
>  drivers/usb/dwc3/ep0.c| 65 
> +--
>  drivers/usb/dwc3/gadget.c | 50 
>  3 files changed, 29 insertions(+), 92 deletions(-)
>
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index 2a2673767ccd..e1958f6237af 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -43,7 +43,7 @@
>  #define DWC3_PULL_UP_TIMEOUT   500 /* ms */
>  #define DWC3_ZLP_BUF_SIZE  1024/* size of a superspeed bulk */
>  #define DWC3_BOUNCE_SIZE   1024/* size of a superspeed bulk */
> -#define DWC3_EP0_BOUNCE_SIZE   512
> +#define DWC3_EP0_SETUP_SIZE512
>  #define DWC3_ENDPOINTS_NUM 32
>  #define DWC3_XHCI_RESOURCES_NUM2
>
> @@ -763,12 +763,10 @@ struct dwc3_scratchpad_array {
>   * struct dwc3 - representation of our controller
>   * @drd_work - workqueue used for role swapping
>   * @ep0_trb: trb which is used for the ctrl_req
> - * @ep0_bounce: bounce buffer for ep0
>   * @zlp_buf: used when request->zero is set
>   * @setup_buf: used while precessing STD USB requests
>   * @ep0_trb: dma address of ep0_trb
>   * @ep0_usb_req: dummy req used while handling STD USB requests
> - * @ep0_bounce_addr: dma address of ep0_bounce
>   * @scratch_addr: dma address of scratchbuf
>   * @ep0_in_setup: one control transfer is completed and enter setup phase
>   * @lock: for synchronizing
> @@ -866,13 +864,11 @@ struct dwc3 {
> struct work_struct  drd_work;
> struct dwc3_trb *ep0_trb;
> void*bounce;
> -   void*ep0_bounce;
> void*zlp_buf;
> void*scratchbuf;
> u8  *setup_buf;
> dma_addr_t  ep0_trb_addr;
> dma_addr_t  bounce_addr;
> -   dma_addr_t  ep0_bounce_addr;
> dma_addr_t  scratch_addr;
> struct dwc3_request ep0_usb_req;
> struct completion   ep0_in_setup;
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index 3ba2309c837f..04249243e4d3 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -836,7 +836,6 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc,
> struct usb_request  *ur;
> struct dwc3_trb *trb;
> struct dwc3_ep  *ep0;
> -   unsignedtransfer_size = 0;
> unsignedmaxp;
> unsignedremaining_ur_length;
> void*buf;
> @@ -849,9 +848,7 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc,
> ep0 = dwc->eps[0];
>
> dwc->ep0_next_event = DWC3_EP0_NRDY_STATUS;
> -
> trb = dwc->ep0_trb;
> -
> trace_dwc3_complete_trb(ep0, trb);
>
> r = next_request(&ep0->pending_list);
> @@ -872,39 +869,17 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc,
> remaining_ur_length = ur->length;
>
> length = trb->size & DWC3_TRB_SIZE_MASK;
> -
> maxp = ep0->endpoint.maxpacket;
> +   transferred = ur->length - length;
> +   ur->actual += transferred;
>
> if (dwc->ep0_bounced) {
> -   /*
> -* Handle the first TRB before handling the bounce buffer if
> -* the request length is greater than the bounce buffer size
> -*/
> -   if (ur->length > DWC3_EP0_BOUNCE_SIZE) {
> -   transfer_size = ALIGN(ur->length - maxp, maxp);
> -   transferred = transfer_size - length;
> -   buf = (u8 *)buf + transferred;
> -   ur->actual += transferred;
> -   remaining_ur_length -= transferred;
> -
> -   trb++;
> -   length = trb->size & DWC3_TRB_SIZE_MASK;
> -
> -   ep0->trb_enqueue = 0;
> -   }
> -
> -   transfer_size = roundup((ur->length - transfer_size),
> -   maxp);
> -
> -   transferred = min_t(u32, remaining_ur_length,
> -   transfer_size - length);
> -   memcpy(buf, dwc->ep0_bounce, transferred);
> -   } else {
> -   transferred = ur->length - length;
> +   trb++;
> +   trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
> +   ep0->trb_enqueue = 0;
> +   dwc->ep0_bounced = false;
> }
>
> -   ur->actual += transferred;
> -
> if ((epnum & 1) && ur->actual < ur->length) {
> /* for some reason we did

Re: [balbi-usb:testing/next 46/49] undefined reference to `extcon_get_edev_by_phandle'

2017-04-10 Thread Roger Quadros
On 09/04/17 03:28, Randy Dunlap wrote:
> On 04/08/17 09:29, kbuild test robot wrote:
>> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git 
>> testing/next
>> head:   9653f750a746d5abc4bc054d818eb32d1351eae9
>> commit: d5a80bad699c94703a0306edd250f65cfd050580 [46/49] usb: dwc3: Add 
>> dual-role support
>> config: x86_64-randconfig-h0-04081918 (attached as .config)
>> compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
>> reproduce:
>> git checkout d5a80bad699c94703a0306edd250f65cfd050580
>> # save the attached .config to linux build tree
>> make ARCH=x86_64 
>>
>> All errors (new ones prefixed by >>):
>>
>>drivers/built-in.o: In function `dwc3_drd_init':
 (.text+0x2aff13): undefined reference to `extcon_get_edev_by_phandle'
>>drivers/built-in.o: In function `dwc3_drd_init':
 (.text+0x2affb1): undefined reference to `extcon_register_notifier'
>>drivers/built-in.o: In function `dwc3_drd_init':
 (.text+0x2b0029): undefined reference to `extcon_get_state'
>>drivers/built-in.o: In function `dwc3_drd_exit':
 (.text+0x2b00e1): undefined reference to `extcon_unregister_notifier'
> 
> EXTCON=m in this config.
> 
> DWC3_DUAL_ROLE needs to depend on EXTCON...
> 
> 
Thanks for the analysis Randy.

Felipe, let me know if you want me to resend the patch or you'll fix it locally.
Thanks. :)

cheers,
-roger
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: host: plat: Enable xHCI plat runtime PM

2017-04-10 Thread Baolin Wang
Hi Mathias,

On 30 March 2017 at 11:26, Baolin Wang  wrote:
> Enable the xHCI plat runtime PM for parent device to suspend/resume
> xHCI. Also call pm_runtime_forbid() in probe() function to force users
> to explicitly enable runtime pm using power/control in sysfs, in case
> some parent devices didn't implement runtime PM callbacks.
>
> Signed-off-by: Baolin Wang 
> ---
>  drivers/usb/host/xhci-plat.c |   54 
> --
>  1 file changed, 47 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
> index bd02a6c..2036c24 100644
> --- a/drivers/usb/host/xhci-plat.c
> +++ b/drivers/usb/host/xhci-plat.c
> @@ -179,9 +179,15 @@ static int xhci_plat_probe(struct platform_device *pdev)
> return ret;
> }
>
> +   pm_runtime_set_active(&pdev->dev);
> +   pm_runtime_enable(&pdev->dev);
> +   pm_runtime_get_noresume(&pdev->dev);
> +
> hcd = usb_create_hcd(driver, &pdev->dev, dev_name(&pdev->dev));
> -   if (!hcd)
> -   return -ENOMEM;
> +   if (!hcd) {
> +   ret = -ENOMEM;
> +   goto disable_runtime;
> +   }
>
> res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> hcd->regs = devm_ioremap_resource(&pdev->dev, res);
> @@ -258,6 +264,14 @@ static int xhci_plat_probe(struct platform_device *pdev)
> if (ret)
> goto dealloc_usb2_hcd;
>
> +   pm_runtime_put_noidle(&pdev->dev);
> +
> +   /*
> +* Prevent runtime pm from being on as default, users should enable
> +* runtime pm using power/control in sysfs.
> +*/
> +   pm_runtime_forbid(&pdev->dev);
> +
> return 0;
>
>
> @@ -277,6 +291,10 @@ static int xhci_plat_probe(struct platform_device *pdev)
>  put_hcd:
> usb_put_hcd(hcd);
>
> +disable_runtime:
> +   pm_runtime_put_noidle(&pdev->dev);
> +   pm_runtime_disable(&pdev->dev);
> +
> return ret;
>  }
>
> @@ -298,6 +316,9 @@ static int xhci_plat_remove(struct platform_device *dev)
> clk_disable_unprepare(clk);
> usb_put_hcd(hcd);
>
> +   pm_runtime_set_suspended(&dev->dev);
> +   pm_runtime_disable(&dev->dev);
> +
> return 0;
>  }
>
> @@ -325,14 +346,33 @@ static int xhci_plat_resume(struct device *dev)
>
> return xhci_resume(xhci, 0);
>  }
> +#endif /* CONFIG_PM_SLEEP */
> +
> +#ifdef CONFIG_PM
> +static int xhci_plat_runtime_suspend(struct device *dev)
> +{
> +   struct usb_hcd  *hcd = dev_get_drvdata(dev);
> +   struct xhci_hcd *xhci = hcd_to_xhci(hcd);
> +
> +   return xhci_suspend(xhci, device_may_wakeup(dev));
> +}
> +
> +static int xhci_plat_runtime_resume(struct device *dev)
> +{
> +   struct usb_hcd  *hcd = dev_get_drvdata(dev);
> +   struct xhci_hcd *xhci = hcd_to_xhci(hcd);
> +
> +   return xhci_resume(xhci, 0);
> +}
> +#endif /* CONFIG_PM */
>
>  static const struct dev_pm_ops xhci_plat_pm_ops = {
> SET_SYSTEM_SLEEP_PM_OPS(xhci_plat_suspend, xhci_plat_resume)
> +
> +   SET_RUNTIME_PM_OPS(xhci_plat_runtime_suspend,
> +  xhci_plat_runtime_resume,
> +  NULL)
>  };
> -#define DEV_PM_OPS (&xhci_plat_pm_ops)
> -#else
> -#define DEV_PM_OPS NULL
> -#endif /* CONFIG_PM */
>
>  static const struct acpi_device_id usb_xhci_acpi_match[] = {
> /* XHCI-compliant USB Controller */
> @@ -346,7 +386,7 @@ static int xhci_plat_resume(struct device *dev)
> .remove = xhci_plat_remove,
> .driver = {
> .name = "xhci-hcd",
> -   .pm = DEV_PM_OPS,
> +   .pm = &xhci_plat_pm_ops,
> .of_match_table = of_match_ptr(usb_xhci_of_match),
> .acpi_match_table = ACPI_PTR(usb_xhci_acpi_match),
> },

Any comments?

-- 
Baolin.wang
Best Regards
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [balbi-usb:testing/next 46/49] undefined reference to `extcon_get_edev_by_phandle'

2017-04-10 Thread Felipe Balbi

Hi,

Roger Quadros  writes:
>>> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git 
>>> testing/next
>>> head:   9653f750a746d5abc4bc054d818eb32d1351eae9
>>> commit: d5a80bad699c94703a0306edd250f65cfd050580 [46/49] usb: dwc3: Add 
>>> dual-role support
>>> config: x86_64-randconfig-h0-04081918 (attached as .config)
>>> compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
>>> reproduce:
>>> git checkout d5a80bad699c94703a0306edd250f65cfd050580
>>> # save the attached .config to linux build tree
>>> make ARCH=x86_64 
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>>drivers/built-in.o: In function `dwc3_drd_init':
> (.text+0x2aff13): undefined reference to `extcon_get_edev_by_phandle'
>>>drivers/built-in.o: In function `dwc3_drd_init':
> (.text+0x2affb1): undefined reference to `extcon_register_notifier'
>>>drivers/built-in.o: In function `dwc3_drd_init':
> (.text+0x2b0029): undefined reference to `extcon_get_state'
>>>drivers/built-in.o: In function `dwc3_drd_exit':
> (.text+0x2b00e1): undefined reference to `extcon_unregister_notifier'
>> 
>> EXTCON=m in this config.
>> 
>> DWC3_DUAL_ROLE needs to depend on EXTCON...
>> 
>> 
> Thanks for the analysis Randy.
>
> Felipe, let me know if you want me to resend the patch or you'll fix it 
> locally.
> Thanks. :)

8<- cut here 
-

From 28ea6be01e2cf244c461a40c8e9593816f894412 Mon Sep 17 00:00:00 2001
From: Roger Quadros 
Date: Wed, 5 Apr 2017 13:39:31 +0300
Subject: [PATCH v2] usb: dwc3: Add dual-role support

If dr_mode is "otg" then support dual role mode of operation.
Currently this mode is only supported when an extcon handle is
present in the dwc3 device tree node. This is needed to
get the ID status events of the port.

We're using a workqueue to manage the dual-role state transitions
as the extcon notifier (dwc3_drd_notifier) is called in an atomic
context by extcon_sync() and this doesn't go well with
usb_del_gadget_udc() causing a lockdep and softirq warning.

Signed-off-by: Roger Quadros 
Signed-off-by: Felipe Balbi 
---

Add EXTCON dependency

 drivers/usb/dwc3/Kconfig  |  1 +
 drivers/usb/dwc3/Makefile |  4 +++
 drivers/usb/dwc3/core.c   | 11 +++---
 drivers/usb/dwc3/core.h   | 14 
 drivers/usb/dwc3/drd.c| 85 +++
 5 files changed, 111 insertions(+), 4 deletions(-)
 create mode 100644 drivers/usb/dwc3/drd.c

diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig
index c5aa235863e8..aee55114d269 100644
--- a/drivers/usb/dwc3/Kconfig
+++ b/drivers/usb/dwc3/Kconfig
@@ -41,6 +41,7 @@ config USB_DWC3_GADGET
 config USB_DWC3_DUAL_ROLE
bool "Dual Role mode"
depends on ((USB=y || USB=USB_DWC3) && (USB_GADGET=y || 
USB_GADGET=USB_DWC3))
+   depends on (EXTCON=y || EXTCON=USB_DWC3)
help
  This is the default mode of working of DWC3 controller where
  both host and gadget features are enabled.
diff --git a/drivers/usb/dwc3/Makefile b/drivers/usb/dwc3/Makefile
index ffca34029b21..f15fabbd1e59 100644
--- a/drivers/usb/dwc3/Makefile
+++ b/drivers/usb/dwc3/Makefile
@@ -17,6 +17,10 @@ ifneq ($(filter y,$(CONFIG_USB_DWC3_GADGET) 
$(CONFIG_USB_DWC3_DUAL_ROLE)),)
dwc3-y  += gadget.o ep0.o
 endif
 
+ifneq ($(CONFIG_USB_DWC3_DUAL_ROLE),)
+   dwc3-y  += drd.o
+endif
+
 ifneq ($(CONFIG_USB_DWC3_ULPI),)
dwc3-y  += ulpi.o
 endif
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 458e7c6cc002..455d89a1cd6d 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -921,7 +921,12 @@ static int dwc3_core_init_mode(struct dwc3 *dwc)
break;
case USB_DR_MODE_OTG:
INIT_WORK(&dwc->drd_work, __dwc3_set_mode);
-   dwc3_set_mode(dwc, DWC3_GCTL_PRTCAP_DEVICE);
+   ret = dwc3_drd_init(dwc);
+   if (ret) {
+   if (ret != -EPROBE_DEFER)
+   dev_err(dev, "failed to initialize 
dual-role\n");
+   return ret;
+   }
break;
default:
dev_err(dev, "Unsupported mode of operation %d\n", 
dwc->dr_mode);
@@ -941,9 +946,7 @@ static void dwc3_core_exit_mode(struct dwc3 *dwc)
dwc3_host_exit(dwc);
break;
case USB_DR_MODE_OTG:
-   dwc3_set_mode(dwc, DWC3_GCTL_PRTCAP_DEVICE);
-   dwc3_gadget_exit(dwc);
-   flush_work(&dwc->drd_work);
+   dwc3_drd_exit(dwc);
break;
default:
/* do nothing */
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 1fe23e36485f..981c77f5628e 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -785,6 +785,8 @@ struct dwc3_scratchpad_array {
  * @dr_mode: requested mode

Re: [PATCH 3/3] usb: dwc3: ep0: improve handling of unaligned OUT requests

2017-04-10 Thread Felipe Balbi

Hi,

Janusz Dziedzic  writes:
> On 7 April 2017 at 13:36, Felipe Balbi  wrote:
>> Just like we did for all other endpoint types, let's rely on a chained
>> TRB pointing to ep0_bounce_addr in order to align transfer size. This
>> will make the code simpler.
>>
>> Signed-off-by: Felipe Balbi 
>> ---
>>  drivers/usb/dwc3/core.h   |  6 +
>>  drivers/usb/dwc3/ep0.c| 65 
>> +--
>>  drivers/usb/dwc3/gadget.c | 50 
>>  3 files changed, 29 insertions(+), 92 deletions(-)
>>
>> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
>> index 2a2673767ccd..e1958f6237af 100644
>> --- a/drivers/usb/dwc3/core.h
>> +++ b/drivers/usb/dwc3/core.h
>> @@ -43,7 +43,7 @@
>>  #define DWC3_PULL_UP_TIMEOUT   500 /* ms */
>>  #define DWC3_ZLP_BUF_SIZE  1024/* size of a superspeed bulk */
>>  #define DWC3_BOUNCE_SIZE   1024/* size of a superspeed bulk */
>> -#define DWC3_EP0_BOUNCE_SIZE   512
>> +#define DWC3_EP0_SETUP_SIZE512
>>  #define DWC3_ENDPOINTS_NUM 32
>>  #define DWC3_XHCI_RESOURCES_NUM2
>>
>> @@ -763,12 +763,10 @@ struct dwc3_scratchpad_array {
>>   * struct dwc3 - representation of our controller
>>   * @drd_work - workqueue used for role swapping
>>   * @ep0_trb: trb which is used for the ctrl_req
>> - * @ep0_bounce: bounce buffer for ep0
>>   * @zlp_buf: used when request->zero is set
>>   * @setup_buf: used while precessing STD USB requests
>>   * @ep0_trb: dma address of ep0_trb
>>   * @ep0_usb_req: dummy req used while handling STD USB requests
>> - * @ep0_bounce_addr: dma address of ep0_bounce
>>   * @scratch_addr: dma address of scratchbuf
>>   * @ep0_in_setup: one control transfer is completed and enter setup phase
>>   * @lock: for synchronizing
>> @@ -866,13 +864,11 @@ struct dwc3 {
>> struct work_struct  drd_work;
>> struct dwc3_trb *ep0_trb;
>> void*bounce;
>> -   void*ep0_bounce;
>> void*zlp_buf;
>> void*scratchbuf;
>> u8  *setup_buf;
>> dma_addr_t  ep0_trb_addr;
>> dma_addr_t  bounce_addr;
>> -   dma_addr_t  ep0_bounce_addr;
>> dma_addr_t  scratch_addr;
>> struct dwc3_request ep0_usb_req;
>> struct completion   ep0_in_setup;
>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
>> index 3ba2309c837f..04249243e4d3 100644
>> --- a/drivers/usb/dwc3/ep0.c
>> +++ b/drivers/usb/dwc3/ep0.c
>> @@ -836,7 +836,6 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc,
>> struct usb_request  *ur;
>> struct dwc3_trb *trb;
>> struct dwc3_ep  *ep0;
>> -   unsignedtransfer_size = 0;
>> unsignedmaxp;
>> unsignedremaining_ur_length;
>> void*buf;
>> @@ -849,9 +848,7 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc,
>> ep0 = dwc->eps[0];
>>
>> dwc->ep0_next_event = DWC3_EP0_NRDY_STATUS;
>> -
>> trb = dwc->ep0_trb;
>> -
>> trace_dwc3_complete_trb(ep0, trb);
>>
>> r = next_request(&ep0->pending_list);
>> @@ -872,39 +869,17 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc,
>> remaining_ur_length = ur->length;
>>
>> length = trb->size & DWC3_TRB_SIZE_MASK;
>> -
>> maxp = ep0->endpoint.maxpacket;
>> +   transferred = ur->length - length;
>> +   ur->actual += transferred;
>>
>> if (dwc->ep0_bounced) {
>> -   /*
>> -* Handle the first TRB before handling the bounce buffer if
>> -* the request length is greater than the bounce buffer size
>> -*/
>> -   if (ur->length > DWC3_EP0_BOUNCE_SIZE) {
>> -   transfer_size = ALIGN(ur->length - maxp, maxp);
>> -   transferred = transfer_size - length;
>> -   buf = (u8 *)buf + transferred;
>> -   ur->actual += transferred;
>> -   remaining_ur_length -= transferred;
>> -
>> -   trb++;
>> -   length = trb->size & DWC3_TRB_SIZE_MASK;
>> -
>> -   ep0->trb_enqueue = 0;
>> -   }
>> -
>> -   transfer_size = roundup((ur->length - transfer_size),
>> -   maxp);
>> -
>> -   transferred = min_t(u32, remaining_ur_length,
>> -   transfer_size - length);
>> -   memcpy(buf, dwc->ep0_bounce, transferred);
>> -   } else {
>> -   transferred = ur->length - length;
>> +   trb++;
>> +   trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
>> +   ep0->trb_enqueue = 0;
>> +   dwc->ep0_bounced = false;
>>  

[PATCH] smsc95xx: Add comments to the registers definition

2017-04-10 Thread Martin Wetterwald
This chip is used by a lot of embedded devices and also by the Raspberry
Pi 1, 2 & 3 which were created to promote the study of computer
sciences. Students wanting to learn kernel / network device driver
programming through those devices can only rely on the Linux kernel
driver source to make their own.

This commit adds a lot of comments to the registers definition to expand
the register names.

Cc: Steve Glendinning 
Cc: Microchip Linux Driver Support 
CC: David Miller 
Signed-off-by: Martin Wetterwald 
---
 drivers/net/usb/smsc95xx.c |   6 +-
 drivers/net/usb/smsc95xx.h | 474 ++---
 2 files changed, 273 insertions(+), 207 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 831aa33..52d71a8 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -33,7 +33,7 @@
 #include "smsc95xx.h"
 
 #define SMSC_CHIPNAME  "smsc95xx"
-#define SMSC_DRIVER_VERSION"1.0.5"
+#define SMSC_DRIVER_VERSION"1.0.6"
 #define HS_USB_PKT_SIZE(512)
 #define FS_USB_PKT_SIZE(64)
 #define DEFAULT_HS_BURST_CAP_SIZE  (16 * 1024 + 5 * HS_USB_PKT_SIZE)
@@ -1498,7 +1498,7 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
if (ret < 0)
return ret;
 
-   if (val & 0x) {
+   if (val & RX_FIFO_INF_USED_) {
netdev_info(dev->net, "rx fifo not empty in autosuspend\n");
return -EBUSY;
}
@@ -2032,7 +2032,7 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet 
*dev,
skb_push(skb, 4);
tx_cmd_b = (u32)(skb->len - 4);
if (csum)
-   tx_cmd_b |= TX_CMD_B_CSUM_ENABLE;
+   tx_cmd_b |= TX_CMD_B_CSUM_EN;
cpu_to_le32s(&tx_cmd_b);
memcpy(skb->data, &tx_cmd_b, 4);
 
diff --git a/drivers/net/usb/smsc95xx.h b/drivers/net/usb/smsc95xx.h
index 29a4d9e..d82b667 100644
--- a/drivers/net/usb/smsc95xx.h
+++ b/drivers/net/usb/smsc95xx.h
@@ -21,128 +21,158 @@
 #define _SMSC95XX_H
 
 /* Tx command words */
-#define TX_CMD_A_DATA_OFFSET_  (0x001F)
-#define TX_CMD_A_FIRST_SEG_(0x2000)
-#define TX_CMD_A_LAST_SEG_ (0x1000)
-#define TX_CMD_A_BUF_SIZE_ (0x07FF)
+#define TX_CMD_A_DATA_OFFSET_  (0x001F)/* Data Start Offset */
+#define TX_CMD_A_FIRST_SEG_(0x2000)/* First Segment */
+#define TX_CMD_A_LAST_SEG_ (0x1000)/* Last Segment */
+#define TX_CMD_A_BUF_SIZE_ (0x07FF)/* Buffer Size */
 
-#define TX_CMD_B_CSUM_ENABLE   (0x4000)
-#define TX_CMD_B_ADD_CRC_DISABLE_  (0x2000)
-#define TX_CMD_B_DISABLE_PADDING_  (0x1000)
-#define TX_CMD_B_PKT_BYTE_LENGTH_  (0x07FF)
+#define TX_CMD_B_CSUM_EN   (0x4000)/* TX Checksum Enable */
+#define TX_CMD_B_ADD_CRC_DIS_  (0x2000)/* Add CRC Disable */
+#define TX_CMD_B_DIS_PADDING_  (0x1000)/* Disable Frame Padding */
+#define TX_CMD_B_FRAME_LENGTH_ (0x07FF)/* Frame Length (bytes) */
 
 /* Rx status word */
-#define RX_STS_FF_ (0x4000)/* Filter Fail */
-#define RX_STS_FL_ (0x3FFF)/* Frame Length */
-#define RX_STS_ES_ (0x8000)/* Error Summary */
-#define RX_STS_BF_ (0x2000)/* Broadcast Frame */
-#define RX_STS_LE_ (0x1000)/* Length Error */
-#define RX_STS_RF_ (0x0800)/* Runt Frame */
-#define RX_STS_MF_ (0x0400)/* Multicast Frame */
-#define RX_STS_TL_ (0x0080)/* Frame too long */
-#define RX_STS_CS_ (0x0040)/* Collision Seen */
-#define RX_STS_FT_ (0x0020)/* Frame Type */
-#define RX_STS_RW_ (0x0010)/* Receive Watchdog */
-#define RX_STS_ME_ (0x0008)/* Mii Error */
-#define RX_STS_DB_ (0x0004)/* Dribbling */
-#define RX_STS_CRC_(0x0002)/* CRC Error */
-
-/* SCSRs */
-#define ID_REV (0x00)
-#define ID_REV_CHIP_ID_MASK_   (0x)
-#define ID_REV_CHIP_REV_MASK_  (0x)
-#define ID_REV_CHIP_ID_9500_   (0x9500)
-#define ID_REV_CHIP_ID_9500A_  (0x9E00)
-#define ID_REV_CHIP_ID_9512_   (0xEC00)
-#define ID_REV_CHIP_ID_9530_   (0x9530)
-#define ID_REV_CHIP_ID_89530_  (0x9E08)
-#define ID_REV_CHIP_ID_9730_   (0x9730)
-
-#define INT_STS(0x08)
-#define INT_STS_TX_STOP_   (0x0002)
-#define INT_STS_RX_STOP_   (0x0001)
-#define INT_STS_PHY_INT_   (0x8000)
-#define INT_STS_TXE_   (0x4000)
-#define INT_STS_TDFU_  (0x2000)
-#define INT_STS_TDFO_   

[PATCH v2] smsc95xx: Add comments to the registers definition

2017-04-10 Thread Martin Wetterwald
This chip is used by a lot of embedded devices and also by the Raspberry
Pi 1, 2 & 3 which were created to promote the study of computer
sciences. Students wanting to learn kernel / network device driver
programming through those devices can only rely on the Linux kernel
driver source to make their own.

This commit adds a lot of comments to the registers definition to expand
the register names.

Cc: Steve Glendinning 
Cc: Microchip Linux Driver Support 
CC: David Miller 
Signed-off-by: Martin Wetterwald 
---
Changes in v2:
 * Correct David's email address (replace org by net)
 * Remove duplicate register definition (INT_ENP_TX_STOP_)

 drivers/net/usb/smsc95xx.c |   6 +-
 drivers/net/usb/smsc95xx.h | 473 ++---
 2 files changed, 272 insertions(+), 207 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 831aa33..52d71a8 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -33,7 +33,7 @@
 #include "smsc95xx.h"
 
 #define SMSC_CHIPNAME  "smsc95xx"
-#define SMSC_DRIVER_VERSION"1.0.5"
+#define SMSC_DRIVER_VERSION"1.0.6"
 #define HS_USB_PKT_SIZE(512)
 #define FS_USB_PKT_SIZE(64)
 #define DEFAULT_HS_BURST_CAP_SIZE  (16 * 1024 + 5 * HS_USB_PKT_SIZE)
@@ -1498,7 +1498,7 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
if (ret < 0)
return ret;
 
-   if (val & 0x) {
+   if (val & RX_FIFO_INF_USED_) {
netdev_info(dev->net, "rx fifo not empty in autosuspend\n");
return -EBUSY;
}
@@ -2032,7 +2032,7 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet 
*dev,
skb_push(skb, 4);
tx_cmd_b = (u32)(skb->len - 4);
if (csum)
-   tx_cmd_b |= TX_CMD_B_CSUM_ENABLE;
+   tx_cmd_b |= TX_CMD_B_CSUM_EN;
cpu_to_le32s(&tx_cmd_b);
memcpy(skb->data, &tx_cmd_b, 4);
 
diff --git a/drivers/net/usb/smsc95xx.h b/drivers/net/usb/smsc95xx.h
index 29a4d9e..6c8acf1 100644
--- a/drivers/net/usb/smsc95xx.h
+++ b/drivers/net/usb/smsc95xx.h
@@ -21,128 +21,158 @@
 #define _SMSC95XX_H
 
 /* Tx command words */
-#define TX_CMD_A_DATA_OFFSET_  (0x001F)
-#define TX_CMD_A_FIRST_SEG_(0x2000)
-#define TX_CMD_A_LAST_SEG_ (0x1000)
-#define TX_CMD_A_BUF_SIZE_ (0x07FF)
+#define TX_CMD_A_DATA_OFFSET_  (0x001F)/* Data Start Offset */
+#define TX_CMD_A_FIRST_SEG_(0x2000)/* First Segment */
+#define TX_CMD_A_LAST_SEG_ (0x1000)/* Last Segment */
+#define TX_CMD_A_BUF_SIZE_ (0x07FF)/* Buffer Size */
 
-#define TX_CMD_B_CSUM_ENABLE   (0x4000)
-#define TX_CMD_B_ADD_CRC_DISABLE_  (0x2000)
-#define TX_CMD_B_DISABLE_PADDING_  (0x1000)
-#define TX_CMD_B_PKT_BYTE_LENGTH_  (0x07FF)
+#define TX_CMD_B_CSUM_EN   (0x4000)/* TX Checksum Enable */
+#define TX_CMD_B_ADD_CRC_DIS_  (0x2000)/* Add CRC Disable */
+#define TX_CMD_B_DIS_PADDING_  (0x1000)/* Disable Frame Padding */
+#define TX_CMD_B_FRAME_LENGTH_ (0x07FF)/* Frame Length (bytes) */
 
 /* Rx status word */
-#define RX_STS_FF_ (0x4000)/* Filter Fail */
-#define RX_STS_FL_ (0x3FFF)/* Frame Length */
-#define RX_STS_ES_ (0x8000)/* Error Summary */
-#define RX_STS_BF_ (0x2000)/* Broadcast Frame */
-#define RX_STS_LE_ (0x1000)/* Length Error */
-#define RX_STS_RF_ (0x0800)/* Runt Frame */
-#define RX_STS_MF_ (0x0400)/* Multicast Frame */
-#define RX_STS_TL_ (0x0080)/* Frame too long */
-#define RX_STS_CS_ (0x0040)/* Collision Seen */
-#define RX_STS_FT_ (0x0020)/* Frame Type */
-#define RX_STS_RW_ (0x0010)/* Receive Watchdog */
-#define RX_STS_ME_ (0x0008)/* Mii Error */
-#define RX_STS_DB_ (0x0004)/* Dribbling */
-#define RX_STS_CRC_(0x0002)/* CRC Error */
-
-/* SCSRs */
-#define ID_REV (0x00)
-#define ID_REV_CHIP_ID_MASK_   (0x)
-#define ID_REV_CHIP_REV_MASK_  (0x)
-#define ID_REV_CHIP_ID_9500_   (0x9500)
-#define ID_REV_CHIP_ID_9500A_  (0x9E00)
-#define ID_REV_CHIP_ID_9512_   (0xEC00)
-#define ID_REV_CHIP_ID_9530_   (0x9530)
-#define ID_REV_CHIP_ID_89530_  (0x9E08)
-#define ID_REV_CHIP_ID_9730_   (0x9730)
-
-#define INT_STS(0x08)
-#define INT_STS_TX_STOP_   (0x0002)
-#define INT_STS_RX_STOP_   (0x0001)
-#define INT_STS_PHY_INT_   (0x8000)
-#define INT_S

Re: [PATCH v3 1/3] usb: udc: allow adding and removing the same gadget device

2017-04-10 Thread Felipe Balbi

Hi,

Alan Stern  writes:
> On Wed, 5 Apr 2017, Felipe Balbi wrote:
>
>> >> >> --- a/drivers/usb/gadget/udc/core.c
>> >> >> +++ b/drivers/usb/gadget/udc/core.c
>> >> >> @@ -1273,6 +1273,7 @@ void usb_del_gadget_udc(struct usb_gadget 
>> >> >> *gadget)
>> >> >>flush_work(&gadget->work);
>> >> >>device_unregister(&udc->dev);
>> >> >>device_unregister(&gadget->dev);
>> >> >> +  memset(&gadget->dev, 0x00, sizeof(gadget->dev));
>> >> >>  }
>> >> >>  EXPORT_SYMBOL_GPL(usb_del_gadget_udc);
>> >> >
>> >> > Isn't this dangerous?  It's quite possible that the device_unregister() 
>> >> 
>> >> not on the gadget API, no.
>> >> 
>> >> > call on the previous line invokes the gadget->dev.release callback, 
>> >> > which might deallocate gadget.  If that happens, your new memset will 
>> >> > oops.
>> >> 
>> >> that won't happen. struct usb_gadget is a member of the UDC's private
>> >> structure, like this:
>> >> 
>> >> struct dwc3 {
>> >>   [...]
>> >>   struct usb_gadget   gadget;
>> >>   struct usb_gadget_driver *gadget_driver;
>> >>   [...]
>> >> };
>> >
>> > Yes.  So what?  Can't the UDC driver use the refcount inside struct 
>> > usb_gadget to control the lifetime of its private structure?
>> 
>> nope, not being used. At least not yet.
>
> I'm not convinced (yet)...
>
>> > (By the way, can you tell what's going on in net2280.c?  I must be
>> > missing something; it looks like gadget_release() would quickly run
>> > into problems because it calls dev_get_drvdata() for &gadget->dev, but
>> > net2280_probe() never calls dev_set_drvdata() for that device.  
>> > Furthermore, net2280_remove() continues to reference the net2280 struct
>> > after calling usb_del_gadget_udc(), and it never does seem to do a
>> > final put.)
>> 
>> static int net2280_probe(struct pci_dev *pdev, const struct pci_device_id 
>> *id)
>> {
>>  struct net2280  *dev;
>>  unsigned long   resource, len;
>>  void__iomem *base = NULL;
>>  int retval, i;
>> 
>>  /* alloc, and start init */
>>  dev = kzalloc(sizeof(*dev), GFP_KERNEL);
>>  if (dev == NULL) {
>>  retval = -ENOMEM;
>>  goto done;
>>  }
>> 
>>  pci_set_drvdata(pdev, dev);
>>  ^^^
>
> That sets the driver data in the struct pci_dev, not in
> dev->gadget.dev.  As far as I can see, _nothing_ in the driver sets the 
> driver data in dev->gadget.dev.

hmmm, indeed. The same is happening with other callers of
usb_add_gadget_udc_release().

I guess this should be enough?

@@ -3557,7 +3557,7 @@ static irqreturn_t net2280_irq(int irq, void *_dev)
 
 static void gadget_release(struct device *_dev)
 {
-   struct net2280  *dev = dev_get_drvdata(_dev);
+   struct net2280  *dev = dev_get_drvdata(_dev->parent);
 
kfree(dev);
 }


> (Even after all these years, I still get bothered by the way Dave 
> Brownell used to call everything "dev"...  IIRC, at one time he had a 
> line of code that went something like:  dev->dev.dev = &pdev->dev !)

:-)

>> >> I'm actually thinking that struct usb_gadget shouldn't have a struct
>> >> device at all. Just a pointer to a device, that would solve all these
>> >> issues.
>> >
>> > A pointer to which device?  The UDC?  That would change the directory 
>> > layout in sysfs.
>> 
>> indeed. Would that be a problem?
>
> Possibly for some userspace tool.

yeah, we can do dynamic allocation of the device pointer, no issue.

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH] usb: host: plat: Enable xHCI plat runtime PM

2017-04-10 Thread Mathias Nyman

On 10.04.2017 10:57, Baolin Wang wrote:

Hi Mathias,

On 30 March 2017 at 11:26, Baolin Wang  wrote:

Enable the xHCI plat runtime PM for parent device to suspend/resume
xHCI. Also call pm_runtime_forbid() in probe() function to force users
to explicitly enable runtime pm using power/control in sysfs, in case
some parent devices didn't implement runtime PM callbacks.

Signed-off-by: Baolin Wang 
---

Any comments?


Patch looks good, but it didn't apply cleanly on top of the series going to
usb next last Friday.
I wanted those to go forward so this got postponed.

Can you rebase this on latest usb-next?

-Mathias



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Felipe Balbi

Hi,

Thinh Nguyen  writes:
> f_mass_storage has a memorry barrier issue with the sleep and wake
> functions that can cause a deadlock. This results in intermittent hangs
> during MSC file transfer. The host will reset the device after receiving
> no response to resume the transfer. This issue is seen when dwc3 is
> processing 2 transfer-in-progress events at the same time, invoking
> completion handlers for CSW and CBW. Also this issue occurs depending on
> the system timing and latency.
>
> To increase the chance to hit this issue, you can force dwc3 driver to
> wait and process those 2 events at once by adding a small delay (~100us)
> in dwc3_check_event_buf() whenever the request is for CSW and read the
> event count again. Avoid debugging with printk and ftrace as extra
> delays and memory barrier will mask this issue.
>
> Scenario which can lead to failure:
> ---
> 1) The main thread sleeps and waits for the next command.
> 2) bulk_in_complete() wakes up main thread for CSW.
> 3) bulk_out_complete() tries to wake up the running main thread for CBW.
> 4) thread_wakeup_needed is not loaded with correct value in sleep_thread().
> 5) Main thread goes to sleep.
>
> See more detail below:
>
> Note the 2 critical variables.
>  * common->thread_wakeup_needed
>  * bh->state
>
> Eval'd  | CPU0 (main thread) | CPU1
> +--+--
> | get_next_command() |
> | ...|
> while(1)| while (bh->state != BUF_STATE_FULL) {  |
> |  for (;;) {|
> |set_current_state(TASK_INTERRUPTIBLE);|
> if(0)   |if (thread_wakeup_needed)   |
> |schedule()  |
>   |  | bulk_in_complete() {
>   |  | smp_wmb();
>   |  | bh->state = BUF_STATE_EMPTY
>   |  | smp_wmb();
>   |  | thread_wakeup_needed = 1
>   |  | wake_up_process()
>   |  | }
>   |  /* 2nd for loop iteration */|
> |  for (;;) {|
> |set_current_state(TASK_INTERRUPTIBLE);|
> if(1)   |if (thread_wakeup_needed)   |
> |  break |
> |  } |
>   |  | bulk_out_complete() {
>   |  | smp_wmb();
>   |  __set_current_state(TASK_RUNNING);  |
>   |  thread_wakeup_needed = 0;   |
>   |  smp_rmb();  |
>   |  |
>   |  /* 2nd while loop iteration */  |
> while(1)| while (bh->state != BUF_STATE_FULL) {  |
>   |  | bh->state = BUF_STATE_FULL;
>   |  | smp_wmb();
>   |   -->| thread_wakeup_needed = 1
>   |  | wake_up_process()
>   |  | }
> |  for (;;) {|
> |set_current_state(TASK_INTERRUPTIBLE);|
> if(0)   |if (thread_wakeup_needed)   |<--
> |schedule()  |
> ||
>   |  |
>
> thread_wakeup_needed is not guaranteed to be loaded before checking it.
> Note that this happens when wake_up_process() is called on a running
> process.
>
> This patch fixes this issue by explicitly use smp_mb() after clearing of
> thread_wakeup_needed in the sleep function. It serializes the order in
> which thread_wakeup_needed is loaded and stored, thus prevent loading
> the wrong value when checking it in the sleeper.
>
> However, a better solution in the future would be to use wait_queue
> method that takes care of managing memory barrier between waker and
> waiter.
>
> See DEFINE_WAIT_FUNC comment in kernel scheduling wait.c as this
> solution is similar to its implementation.
>
> Signed-off-by: Thinh Nguyen 
> ---
>  drivers/usb/gadget/function/f_mass_storage.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/usb/gadget/function/f_mass_storage.c 
> b/drivers/usb/gadget/function/f_mass_storage.c
> index 4c8aacc232c0..f40433eb8259 100644
> --- a/drivers/usb/gadget/function/f_mass_storage.c
> +++ b/drivers/usb/gadget/function/f_mass_storage.c
> @@ -627,7 +627,7 @@ static int sleep_thread(struct fsg_common *common, bool 
> can_freeze)
>   }
> 

Re: [PATCH 2/2] usb: host: xhci: using dma pool for scratchpad buffer

2017-04-10 Thread Mathias Nyman

On 10.04.2017 05:57, Peter Chen wrote:

On Thu, Mar 30, 2017 at 11:29:57AM +0300, Mathias Nyman wrote:


the other way around. First you patch dma_alloc_coherent() to
dma_zalloc_coherent() and that could go in during the -rc and possibly
get a stable tag, since spec requires kernel to zero the memory area.

--
balbi


In fact, it fixes two things at spec, the other thing is the scratchpad
buffer needs to be page_size boundary. I will send these two fixes
first.



If I remember correctly dma_alloc_coherent() takes care of proper alignment
and won't cross the boundary



Sorry to reply late.

The dma_alloc_coherent design is different per platform, and both the
code include/linux/dma-mapping.h and doc Documentation/DMA-API.txt
does not mention boundary and alignment.



DMA-API-HOWTO.txt mentions alignment and boundary:

"dma_alloc_coherent() returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The CPU virtual address and the DMA address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."

To me it looks like it's taken care of

-Mathias


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: host: plat: Enable xHCI plat runtime PM

2017-04-10 Thread Baolin Wang
Hi Mathias,

On 10 April 2017 at 18:09, Mathias Nyman  wrote:
> On 10.04.2017 10:57, Baolin Wang wrote:
>>
>> Hi Mathias,
>>
>> On 30 March 2017 at 11:26, Baolin Wang  wrote:
>>>
>>> Enable the xHCI plat runtime PM for parent device to suspend/resume
>>> xHCI. Also call pm_runtime_forbid() in probe() function to force users
>>> to explicitly enable runtime pm using power/control in sysfs, in case
>>> some parent devices didn't implement runtime PM callbacks.
>>>
>>> Signed-off-by: Baolin Wang 
>>> ---
>>
>> Any comments?
>
>
> Patch looks good, but it didn't apply cleanly on top of the series going to
> usb next last Friday.
> I wanted those to go forward so this got postponed.
>
> Can you rebase this on latest usb-next?

Sure. I will resend it. Thanks.

-- 
Baolin.wang
Best Regards
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Synopsis XHCI controller lock up

2017-04-10 Thread Roger Quadros
Hi,

I'm using a Texas Instruments PCIe XHCI USB card on x86-64 machine running 
Linux v4.11-rc6 with Ubuntu.
I can manage to get the XHCI controller in a unresponsive state with the 
following message on a
device disconnect

[  242.817353] usb 9-3: USB disconnect, device number 9
[  247.919845] xhci_hcd :03:00.0: xHCI host not responding to stop endpoint 
command.
[  247.919898] xhci_hcd :03:00.0: Assuming host is dying, halting host.
[  247.94] xhci_hcd :03:00.0: Host halt failed, -110
[  247.947781] xhci_hcd :03:00.0: Non-responsive xHCI host is not halting.
[  247.947782] xhci_hcd :03:00.0: Completing active URBs anyway.
[  247.947789] xhci_hcd :03:00.0: HC died; cleaning up
[  247.948070] sd 6:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT 
driverbyte=DRIVER_OK
[  247.948074] sd 6:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 00 26 9a 00 00 f0 
00
[  247.948076] blk_update_request: I/O error, dev sdb, sector 9882
[  247.948086] sd 6:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT 
driverbyte=DRIVER_OK
[  247.948087] sd 6:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 00 27 8a 00 00 10 
00
[  247.948088] blk_update_request: I/O error, dev sdb, sector 10122
[  247.948586] sd 6:0:0:0: [sdb] Synchronizing SCSI cache
[  247.948605] sd 6:0:0:0: [sdb] Synchronize Cache(10) failed: Result: 
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

Test case is pretty simple
1) connect *full-speed* mass storage device
2) start file transfer from mass storage to host (Bulk IN transfer)
3) while transfer is active pull out the mass storage device

I can see the lock up within 10 iterations

NOTE: I cannot reproduce the problem if transfer direction is from host to 
device or on a high speed device.
Same issue is seen on older kernels as well. I've tried v4.1 and v4.4.

lspci -nn shows
03:00.0 USB controller [0c03]: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 
xHCI Host Controller [104c:8241] (rev 02)

Apparently this controller uses Synopsis XHCI core which is XHCI v0.96 compliant
SNPS_ID: 0x5533171a

John,
Is there any quirk required for this controller that we're missing?

cheers,
-roger
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: using dma pool for scratchpad buffer

2017-04-10 Thread Peter Chen
On Mon, Apr 10, 2017 at 01:13:54PM +0300, Mathias Nyman wrote:
> On 10.04.2017 05:57, Peter Chen wrote:
> >On Thu, Mar 30, 2017 at 11:29:57AM +0300, Mathias Nyman wrote:
> 
> the other way around. First you patch dma_alloc_coherent() to
> dma_zalloc_coherent() and that could go in during the -rc and possibly
> get a stable tag, since spec requires kernel to zero the memory area.
> 
> --
> balbi
> >>>
> >>>In fact, it fixes two things at spec, the other thing is the scratchpad
> >>>buffer needs to be page_size boundary. I will send these two fixes
> >>>first.
> >>>
> >>
> >>If I remember correctly dma_alloc_coherent() takes care of proper alignment
> >>and won't cross the boundary
> >>
> >
> >Sorry to reply late.
> >
> >The dma_alloc_coherent design is different per platform, and both the
> >code include/linux/dma-mapping.h and doc Documentation/DMA-API.txt
> >does not mention boundary and alignment.
> >
> 
> DMA-API-HOWTO.txt mentions alignment and boundary:
> 
> "dma_alloc_coherent() returns two values: the virtual address which you
> can use to access it from the CPU and dma_handle which you pass to the
> card.
> 
> The CPU virtual address and the DMA address are both
> guaranteed to be aligned to the smallest PAGE_SIZE order which
> is greater than or equal to the requested size.  This invariant
> exists (for example) to guarantee that if you allocate a chunk
> which is smaller than or equal to 64 kilobytes, the extent of the
> buffer you receive will not cross a 64K boundary."
> 
> To me it looks like it's taken care of
> 
> -Mathias

Thanks, Mathias, then I will send patch for only zeroed Scratchpad
Buffer. How about my 1st patch in series?

-- 

Best Regards,
Peter Chen
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] usb: host: xhci-mem: allocate zeroed Scratchpad Buffer

2017-04-10 Thread Peter Chen
According to xHCI ch4.20 Scratchpad Buffers, the Scratchpad
Buffer needs to be zeroed.

...
The following operations take place to allocate
Scratchpad Buffers to the xHC:
...
b. Software clears the Scratchpad Buffer to '0'

Cc: stable 
Signed-off-by: Peter Chen 
---
 drivers/usb/host/xhci-mem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 032a702..580c890 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1718,7 +1718,7 @@ static int scratchpad_alloc(struct xhci_hcd *xhci, gfp_t 
flags)
xhci->dcbaa->dev_context_ptrs[0] = 
cpu_to_le64(xhci->scratchpad->sp_dma);
for (i = 0; i < num_sp; i++) {
dma_addr_t dma;
-   void *buf = dma_alloc_coherent(dev, xhci->page_size, &dma,
+   void *buf = dma_zalloc_coherent(dev, xhci->page_size, &dma,
flags);
if (!buf)
goto fail_sp5;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: dwc3: gadget: delay unmap of bounced requests

2017-04-10 Thread Roger Quadros
Hi Felipe,

On 13/03/17 16:38, Felipe Balbi wrote:
> From: Janusz Dziedzic 
> 
> In the case of bounced ep0 requests, we must delay DMA operation until
> after ->complete() otherwise we might overwrite contents of req->buf.
> 
> This caused problems with RNDIS gadget.
> 
> Signed-off-by: Janusz Dziedzic 
> Signed-off-by: Felipe Balbi 
> ---

Sekhar found out that this fixes RNDIS for us on v4.9.
Could you please tag this for stable? Thanks.

cheers,
-roger

>  drivers/usb/dwc3/gadget.c | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 0d75158e43fe..79e7a3480d51 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -171,6 +171,7 @@ void dwc3_gadget_giveback(struct dwc3_ep *dep, struct 
> dwc3_request *req,
>   int status)
>  {
>   struct dwc3 *dwc = dep->dwc;
> + unsigned intunmap_after_complete = false;
>  
>   req->started = false;
>   list_del(&req->list);
> @@ -180,11 +181,19 @@ void dwc3_gadget_giveback(struct dwc3_ep *dep, struct 
> dwc3_request *req,
>   if (req->request.status == -EINPROGRESS)
>   req->request.status = status;
>  
> - if (dwc->ep0_bounced && dep->number <= 1)
> + /*
> +  * NOTICE we don't want to unmap before calling ->complete() if we're
> +  * dealing with a bounced ep0 request. If we unmap it here, we would end
> +  * up overwritting the contents of req->buf and this could confuse the
> +  * gadget driver.
> +  */
> + if (dwc->ep0_bounced && dep->number <= 1) {
>   dwc->ep0_bounced = false;
> -
> - usb_gadget_unmap_request_by_dev(dwc->sysdev,
> - &req->request, req->direction);
> + unmap_after_complete = true;
> + } else {
> + usb_gadget_unmap_request_by_dev(dwc->sysdev,
> + &req->request, req->direction);
> + }
>  
>   trace_dwc3_gadget_giveback(req);
>  
> @@ -192,6 +201,10 @@ void dwc3_gadget_giveback(struct dwc3_ep *dep, struct 
> dwc3_request *req,
>   usb_gadget_giveback_request(&dep->endpoint, &req->request);
>   spin_lock(&dwc->lock);
>  
> + if (unmap_after_complete)
> + usb_gadget_unmap_request_by_dev(dwc->sysdev,
> + &req->request, req->direction);
> +
>   if (dep->number > 1)
>   pm_runtime_put(dwc->dev);
>  }
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: dwc3: gadget: delay unmap of bounced requests

2017-04-10 Thread Felipe Balbi

Hi,

Roger Quadros  writes:
>> From: Janusz Dziedzic 
>> 
>> In the case of bounced ep0 requests, we must delay DMA operation until
>> after ->complete() otherwise we might overwrite contents of req->buf.
>> 
>> This caused problems with RNDIS gadget.
>> 
>> Signed-off-by: Janusz Dziedzic 
>> Signed-off-by: Felipe Balbi 
>> ---
>
> Sekhar found out that this fixes RNDIS for us on v4.9.
> Could you please tag this for stable? Thanks.

Was this a regression on v4.9 or was it already broken? Was it broken
forever or which commit regressed it? If it falls into
"has-never-worked-before" then I'm not sure we should tag stable. If we
can blame a commit, then I can send the patch to stable, no issues.

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH 2/2] usb: host: xhci: using dma pool for scratchpad buffer

2017-04-10 Thread Mathias Nyman

On 10.04.2017 14:11, Peter Chen wrote:

On Mon, Apr 10, 2017 at 01:13:54PM +0300, Mathias Nyman wrote:

On 10.04.2017 05:57, Peter Chen wrote:

On Thu, Mar 30, 2017 at 11:29:57AM +0300, Mathias Nyman wrote:


the other way around. First you patch dma_alloc_coherent() to
dma_zalloc_coherent() and that could go in during the -rc and possibly
get a stable tag, since spec requires kernel to zero the memory area.

--
balbi


In fact, it fixes two things at spec, the other thing is the scratchpad
buffer needs to be page_size boundary. I will send these two fixes
first.



If I remember correctly dma_alloc_coherent() takes care of proper alignment
and won't cross the boundary



Sorry to reply late.

The dma_alloc_coherent design is different per platform, and both the
code include/linux/dma-mapping.h and doc Documentation/DMA-API.txt
does not mention boundary and alignment.



DMA-API-HOWTO.txt mentions alignment and boundary:

"dma_alloc_coherent() returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The CPU virtual address and the DMA address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."

To me it looks like it's taken care of

-Mathias


Thanks, Mathias, then I will send patch for only zeroed Scratchpad
Buffer. How about my 1st patch in series?



Looks good, I'll add it to the queue for usb-next.

Thanks
-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] usb: host: xhci-mem: allocate zeroed Scratchpad Buffer

2017-04-10 Thread Mathias Nyman

On 10.04.2017 14:15, Peter Chen wrote:

According to xHCI ch4.20 Scratchpad Buffers, the Scratchpad
Buffer needs to be zeroed.

...
The following operations take place to allocate
Scratchpad Buffers to the xHC:
...
b. Software clears the Scratchpad Buffer to '0'

Cc: stable 
Signed-off-by: Peter Chen 


Thanks, adding to queue for usb-linus (4.12-rc2)

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: dwc3: gadget: delay unmap of bounced requests

2017-04-10 Thread Sekhar Nori
Hi Felipe,

On Monday 10 April 2017 05:53 PM, Felipe Balbi wrote:
> 
> Hi,
> 
> Roger Quadros  writes:
>>> From: Janusz Dziedzic 
>>>
>>> In the case of bounced ep0 requests, we must delay DMA operation until
>>> after ->complete() otherwise we might overwrite contents of req->buf.
>>>
>>> This caused problems with RNDIS gadget.
>>>
>>> Signed-off-by: Janusz Dziedzic 
>>> Signed-off-by: Felipe Balbi 
>>> ---
>>
>> Sekhar found out that this fixes RNDIS for us on v4.9.
>> Could you please tag this for stable? Thanks.
> 
> Was this a regression on v4.9 or was it already broken? Was it broken
> forever or which commit regressed it? If it falls into
> "has-never-worked-before" then I'm not sure we should tag stable. If we
> can blame a commit, then I can send the patch to stable, no issues.

Mainline v4.9 works. v4.9.20 is broken because commit d62145929992
("usb: dwc3: gadget: always unmap EP0 requests") was backported it.

I think the $SUBJECT patch needs to be propagated to all the stable
kernels to which commit d62145929992 was back-ported to.

Thanks,
Sekhar
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: dwc3: gadget: delay unmap of bounced requests

2017-04-10 Thread Felipe Balbi

Hi,

Sekhar Nori  writes:
>> Roger Quadros  writes:
 From: Janusz Dziedzic 

 In the case of bounced ep0 requests, we must delay DMA operation until
 after ->complete() otherwise we might overwrite contents of req->buf.

 This caused problems with RNDIS gadget.

 Signed-off-by: Janusz Dziedzic 
 Signed-off-by: Felipe Balbi 
 ---
>>>
>>> Sekhar found out that this fixes RNDIS for us on v4.9.
>>> Could you please tag this for stable? Thanks.
>> 
>> Was this a regression on v4.9 or was it already broken? Was it broken
>> forever or which commit regressed it? If it falls into
>> "has-never-worked-before" then I'm not sure we should tag stable. If we
>> can blame a commit, then I can send the patch to stable, no issues.
>
> Mainline v4.9 works. v4.9.20 is broken because commit d62145929992
> ("usb: dwc3: gadget: always unmap EP0 requests") was backported it.
>
> I think the $SUBJECT patch needs to be propagated to all the stable
> kernels to which commit d62145929992 was back-ported to.

okay, thanks for letting me know :-)

-- 
balbi


signature.asc
Description: PGP signature


Re: [PATCH] smsc95xx: Add comments to the registers definition

2017-04-10 Thread Andrew Lunn
Hi Martin

> @@ -2032,7 +2032,7 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet 
> *dev,
>   skb_push(skb, 4);
>   tx_cmd_b = (u32)(skb->len - 4);
>   if (csum)
> - tx_cmd_b |= TX_CMD_B_CSUM_ENABLE;
> + tx_cmd_b |= TX_CMD_B_CSUM_EN;

This changed seems a step backwards, ENABLE is much more readable than EN.

>  
> -#define TX_CMD_B_CSUM_ENABLE (0x4000)
> -#define TX_CMD_B_ADD_CRC_DISABLE_(0x2000)
> -#define TX_CMD_B_DISABLE_PADDING_(0x1000)
> -#define TX_CMD_B_PKT_BYTE_LENGTH_(0x07FF)
> +#define TX_CMD_B_CSUM_EN (0x4000)/* TX Checksum Enable */

And there is space for ABLE here.

Andrew
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Alan Stern
On Mon, 10 Apr 2017, Felipe Balbi wrote:

> Hi,
> 
> Thinh Nguyen  writes:
> > f_mass_storage has a memorry barrier issue with the sleep and wake
> > functions that can cause a deadlock. This results in intermittent hangs
> > during MSC file transfer. The host will reset the device after receiving
> > no response to resume the transfer. This issue is seen when dwc3 is
> > processing 2 transfer-in-progress events at the same time, invoking
> > completion handlers for CSW and CBW. Also this issue occurs depending on
> > the system timing and latency.
> >
> > To increase the chance to hit this issue, you can force dwc3 driver to
> > wait and process those 2 events at once by adding a small delay (~100us)
> > in dwc3_check_event_buf() whenever the request is for CSW and read the
> > event count again. Avoid debugging with printk and ftrace as extra
> > delays and memory barrier will mask this issue.

> >
> > Scenario which can lead to failure:
> > ---
> > 1) The main thread sleeps and waits for the next command.
> > 2) bulk_in_complete() wakes up main thread for CSW.
> > 3) bulk_out_complete() tries to wake up the running main thread for CBW.
> > 4) thread_wakeup_needed is not loaded with correct value in sleep_thread().
> > 5) Main thread goes to sleep.
> >
> > See more detail below:
> >
> > Note the 2 critical variables.
> >  * common->thread_wakeup_needed
> >  * bh->state
> >
> > Eval'd  | CPU0 (main thread)   | CPU1
> > +--+--
> > | get_next_command()   |
> > | ...  |
> > while(1)| while (bh->state != BUF_STATE_FULL) {|
> > |  for (;;) {  |
> > |set_current_state(TASK_INTERRUPTIBLE);|
> > if(0)   |if (thread_wakeup_needed) |
> > |schedule()|
> > |  | bulk_in_complete() {
> > |  | smp_wmb();
> > |  | bh->state = BUF_STATE_EMPTY
> > |  | smp_wmb();
> > |  | thread_wakeup_needed = 1
> > |  | wake_up_process()
> > |  | }
> > |  /* 2nd for loop iteration */|
> > |  for (;;) {  |
> > |set_current_state(TASK_INTERRUPTIBLE);|
> > if(1)   |if (thread_wakeup_needed) |
> > |  break   |
> > |  }   |
> > |  | bulk_out_complete() {
> > |  | smp_wmb();
> > |  __set_current_state(TASK_RUNNING);  |
> > |  thread_wakeup_needed = 0;   |
> > |  smp_rmb();  |
> > |  |
> > |  /* 2nd while loop iteration */  |
> > while(1)| while (bh->state != BUF_STATE_FULL) {|
> > |  | bh->state = BUF_STATE_FULL;
> > |  | smp_wmb();
> > |   -->| thread_wakeup_needed = 1
> > |  | wake_up_process()
> > |  | }
> > |  for (;;) {  |
> > |set_current_state(TASK_INTERRUPTIBLE);|
> > if(0)   |if (thread_wakeup_needed) |<--
> > |schedule()|
> > |  |
> > |  |
> >
> > thread_wakeup_needed is not guaranteed to be loaded before checking it.
> > Note that this happens when wake_up_process() is called on a running
> > process.
> >
> > This patch fixes this issue by explicitly use smp_mb() after clearing of
> > thread_wakeup_needed in the sleep function. It serializes the order in
> > which thread_wakeup_needed is loaded and stored, thus prevent loading
> > the wrong value when checking it in the sleeper.
> >
> > However, a better solution in the future would be to use wait_queue
> > method that takes care of managing memory barrier between waker and
> > waiter.
> >
> > See DEFINE_WAIT_FUNC comment in kernel scheduling wait.c as this
> > solution is similar to its implementation.
> >
> > Signed-off-by: Thinh Nguyen 
> > ---
> >  drivers/usb/gadget/function/f_mass_storage.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/gadget/function/f_mass_storage.c 
> > b/drivers/usb/gadget/function/f_mass_storage.c
> > index 4c8aacc232c0..f4043

Re: [PATCH 1/1] usb: ehci: put wmb at the end sitd/itd_link memory access

2017-04-10 Thread pierre kuo
hi Alan
>> > If you need to flush something, wmb() is the wrong primitive.
>> > It merely ensures ordering. It is not guaranteed to flush writes
>> Not guaranteed to flush writes?that makes me curious.
>> As far as I know, wmb() make sure any write operation before the wmb()
>> is completed and are observed before prior to the execution of any
>> subsequent write.
>
> No, that's not correct.  wmb() makes sure that any write operation
> before the wmb() is completed prior to the execution of any subsequent
> write.  It does _not_ guarantee that the earlier write is _observed_
> prior to the execution of subsequent writes.
>
> However, it _does_ guarantee that the earlier write is observed before
> any subsequent write is observed.  In other words, if we have A, wmb(),
> B, then:
>
> A completes before B is executed, and
> A is observed before B is observed.
>
> But we do not have: A is observed before B is executed.
Got it and thanks for your kind explanation. ^^

>
>> I can understand ur opinion that below (1) wmb is needed, since it
>> preserve the write order of (A) and (B).
>> But shouldn't we add wmb in (2) to make sure the ordering of (B) and
>> following writes?
>>   sitd->hw_next = ehci->periodic[frame];
>> ehci->pshadow[frame].sitd = sitd;
>> sitd->frame = frame; > (A)
>> -   wmb();  > (1)
>> ehci->periodic[frame] = cpu_to_hc32(ehci, sitd->sitd_dma |
>> Q_TYPE_SITD); -> (B)
>> +   /* make sure host see periodic frame change before scheduling */
>> +   wmb(); > (2)
>
> No.  (B) does not need to be ordered before following writes.  Think
> about it -- what would happen if a subsequent write was observed before
> (B)?
Per my understanding,
1. this part of code is protected by ehci->lock, so there will be only
one processor update the data.
2. Even subsequent write was observed before above (B), but all
necessary work is done on __this__ frame, and following write after
(1)  will only affect next frame's data.
if I am wrong, please let me know.

Appreciate your kind suggestion and correction,
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] usb: udc: allow adding and removing the same gadget device

2017-04-10 Thread Alan Stern
On Mon, 10 Apr 2017, Felipe Balbi wrote:

> Hi,
> 
> Alan Stern  writes:
> > On Wed, 5 Apr 2017, Felipe Balbi wrote:
> >
> >> >> >> --- a/drivers/usb/gadget/udc/core.c
> >> >> >> +++ b/drivers/usb/gadget/udc/core.c
> >> >> >> @@ -1273,6 +1273,7 @@ void usb_del_gadget_udc(struct usb_gadget 
> >> >> >> *gadget)
> >> >> >>  flush_work(&gadget->work);
> >> >> >>  device_unregister(&udc->dev);
> >> >> >>  device_unregister(&gadget->dev);
> >> >> >> +memset(&gadget->dev, 0x00, sizeof(gadget->dev));
> >> >> >>  }
> >> >> >>  EXPORT_SYMBOL_GPL(usb_del_gadget_udc);
> >> >> >
> >> >> > Isn't this dangerous?  It's quite possible that the 
> >> >> > device_unregister() 
> >> >> 
> >> >> not on the gadget API, no.
> >> >> 
> >> >> > call on the previous line invokes the gadget->dev.release callback, 
> >> >> > which might deallocate gadget.  If that happens, your new memset will 
> >> >> > oops.
> >> >> 
> >> >> that won't happen. struct usb_gadget is a member of the UDC's private
> >> >> structure, like this:
> >> >> 
> >> >> struct dwc3 {
> >> >> [...]
> >> >> struct usb_gadget   gadget;
> >> >> struct usb_gadget_driver *gadget_driver;
> >> >> [...]
> >> >> };
> >> >
> >> > Yes.  So what?  Can't the UDC driver use the refcount inside struct 
> >> > usb_gadget to control the lifetime of its private structure?
> >> 
> >> nope, not being used. At least not yet.
> >
> > I'm not convinced (yet)...
> >
> >> > (By the way, can you tell what's going on in net2280.c?  I must be
> >> > missing something; it looks like gadget_release() would quickly run
> >> > into problems because it calls dev_get_drvdata() for &gadget->dev, but
> >> > net2280_probe() never calls dev_set_drvdata() for that device.  
> >> > Furthermore, net2280_remove() continues to reference the net2280 struct
> >> > after calling usb_del_gadget_udc(), and it never does seem to do a
> >> > final put.)
> >> 
> >> static int net2280_probe(struct pci_dev *pdev, const struct pci_device_id 
> >> *id)
> >> {
> >>struct net2280  *dev;
> >>unsigned long   resource, len;
> >>void__iomem *base = NULL;
> >>int retval, i;
> >> 
> >>/* alloc, and start init */
> >>dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> >>if (dev == NULL) {
> >>retval = -ENOMEM;
> >>goto done;
> >>}
> >> 
> >>pci_set_drvdata(pdev, dev);
> >>^^^
> >
> > That sets the driver data in the struct pci_dev, not in
> > dev->gadget.dev.  As far as I can see, _nothing_ in the driver sets the 
> > driver data in dev->gadget.dev.
> 
> hmmm, indeed. The same is happening with other callers of
> usb_add_gadget_udc_release().
> 
> I guess this should be enough?
> 
> @@ -3557,7 +3557,7 @@ static irqreturn_t net2280_irq(int irq, void *_dev)
>  
>  static void gadget_release(struct device *_dev)
>  {
> - struct net2280  *dev = dev_get_drvdata(_dev);
> + struct net2280  *dev = dev_get_drvdata(_dev->parent);
>  
>   kfree(dev);
>  }

Oddly enough, yes.  But it doesn't explain why this code doesn't blow 
up every time it gets called, in its current form.

And it doesn't help with the fact that net2280_remove() continues to 
access the private data structure after calling usb_del_gadget_udc().  
Strictly speaking, that routine should do

get_device(&dev->gadget.dev);

at the start, with a corresponding put_device() at the end.

There's another problem.  Suppose a call to 
usb_add_gadget_udc_release() fails.  At the end of that routine, the 
error pathway does put_device(&gadget->dev).  This will invoke the 
release callback, deallocating the private data structure without 
giving the caller (i.e., the UDC driver) a chance to clean up.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201

2017-04-10 Thread Daniele Palmas
Telit LE920A4 uses the same pid 0x1201 of LE920, but modem
implementation is different, since it requires DTR to be set for
answering to qmi messages.

This patch replaces QMI_FIXED_INTF with QMI_QUIRK_SET_DTR: tests on
LE920 have been performed in order to verify backward compatibility.

Signed-off-by: Daniele Palmas 
---

Hi Bjørn,

as a side note in latest kernels I had troubles with qmi devices
(e.g. I/O error when using qmicli).

I found your suggestion in libqmi mailing list to revert commit

833415a3e781a26fe480a34d45086bdb4fe1e4c0
cdc-wdm: fix "out-of-sync" due to missing notifications

and this seems to work.

Regards,
Daniele

---
 drivers/net/usb/qmi_wwan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 156f7f8..2474618 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -908,7 +908,7 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x2357, 0x9000, 4)},/* TP-LINK MA260 */
{QMI_QUIRK_SET_DTR(0x1bc7, 0x1040, 2)}, /* Telit LE922A */
{QMI_FIXED_INTF(0x1bc7, 0x1200, 5)},/* Telit LE920 */
-   {QMI_FIXED_INTF(0x1bc7, 0x1201, 2)},/* Telit LE920 */
+   {QMI_QUIRK_SET_DTR(0x1bc7, 0x1201, 2)}, /* Telit LE920, LE920A4 */
{QMI_FIXED_INTF(0x1c9e, 0x9b01, 3)},/* XS Stick W100-2 from 4G 
Systems */
{QMI_FIXED_INTF(0x0b3c, 0xc000, 4)},/* Olivetti Olicard 100 */
{QMI_FIXED_INTF(0x0b3c, 0xc001, 4)},/* Olivetti Olicard 120 */
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 11:01:59AM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Felipe Balbi wrote:
> 
> > Hi,
> > 
> > Thinh Nguyen  writes:
> > > f_mass_storage has a memorry barrier issue with the sleep and wake
> > > functions that can cause a deadlock. This results in intermittent hangs
> > > during MSC file transfer. The host will reset the device after receiving
> > > no response to resume the transfer. This issue is seen when dwc3 is
> > > processing 2 transfer-in-progress events at the same time, invoking
> > > completion handlers for CSW and CBW. Also this issue occurs depending on
> > > the system timing and latency.
> > >
> > > To increase the chance to hit this issue, you can force dwc3 driver to
> > > wait and process those 2 events at once by adding a small delay (~100us)
> > > in dwc3_check_event_buf() whenever the request is for CSW and read the
> > > event count again. Avoid debugging with printk and ftrace as extra
> > > delays and memory barrier will mask this issue.
> 
> > >
> > > Scenario which can lead to failure:
> > > ---
> > > 1) The main thread sleeps and waits for the next command.
> > > 2) bulk_in_complete() wakes up main thread for CSW.
> > > 3) bulk_out_complete() tries to wake up the running main thread for CBW.
> > > 4) thread_wakeup_needed is not loaded with correct value in 
> > > sleep_thread().
> > > 5) Main thread goes to sleep.
> > >
> > > See more detail below:
> > >
> > > Note the 2 critical variables.
> > >  * common->thread_wakeup_needed
> > >  * bh->state
> > >
> > > Eval'd  | CPU0 (main thread) | CPU1
> > > +--+--
> > > | get_next_command() |
> > > | ...|
> > > while(1)| while (bh->state != BUF_STATE_FULL) {  |
> > > |  for (;;) {|
> > > |set_current_state(TASK_INTERRUPTIBLE);|
> > > if(0)   |if (thread_wakeup_needed)   |
> > > |schedule()  |
> > >   |  | bulk_in_complete() {
> > >   |  | smp_wmb();
> > >   |  | bh->state = BUF_STATE_EMPTY
> > >   |  | smp_wmb();
> > >   |  | thread_wakeup_needed = 1
> > >   |  | wake_up_process()
> > >   |  | }
> > >   |  /* 2nd for loop iteration */|
> > > |  for (;;) {|
> > > |set_current_state(TASK_INTERRUPTIBLE);|
> > > if(1)   |if (thread_wakeup_needed)   |
> > > |  break |
> > > |  } |
> > >   |  | bulk_out_complete() {
> > >   |  | smp_wmb();
> > >   |  __set_current_state(TASK_RUNNING);  |
> > >   |  thread_wakeup_needed = 0;   |
> > >   |  smp_rmb();  |
> > >   |  |
> > >   |  /* 2nd while loop iteration */  |
> > > while(1)| while (bh->state != BUF_STATE_FULL) {  |
> > >   |  | bh->state = BUF_STATE_FULL;
> > >   |  | smp_wmb();
> > >   |   -->| thread_wakeup_needed = 1
> > >   |  | wake_up_process()
> > >   |  | }
> > > |  for (;;) {|
> > > |set_current_state(TASK_INTERRUPTIBLE);|
> > > if(0)   |if (thread_wakeup_needed)   |<--
> > > |schedule()  |
> > > ||
> > >   |  |
> > >
> > > thread_wakeup_needed is not guaranteed to be loaded before checking it.
> > > Note that this happens when wake_up_process() is called on a running
> > > process.
> > >
> > > This patch fixes this issue by explicitly use smp_mb() after clearing of
> > > thread_wakeup_needed in the sleep function. It serializes the order in
> > > which thread_wakeup_needed is loaded and stored, thus prevent loading
> > > the wrong value when checking it in the sleeper.
> > >
> > > However, a better solution in the future would be to use wait_queue
> > > method that takes care of managing memory barrier between waker and
> > > waiter.
> > >
> > > See DEFINE_WAIT_FUNC comment in kernel scheduling wait.c as this
> > > solution is similar to its implementation.
> > >
> > > Signed-off-by: Thinh Nguyen 
> > > ---
> > >  drivers/usb/gadget/function/f_mass_storage.c | 2 +-
> > >  1 file c

Re: [PATCH 1/1] usb: ehci: put wmb at the end sitd/itd_link memory access

2017-04-10 Thread Alan Stern
On Mon, 10 Apr 2017, pierre kuo wrote:

> hi Alan

Hello.

> >> > If you need to flush something, wmb() is the wrong primitive.
> >> > It merely ensures ordering. It is not guaranteed to flush writes
> >> Not guaranteed to flush writes?that makes me curious.
> >> As far as I know, wmb() make sure any write operation before the wmb()
> >> is completed and are observed before prior to the execution of any
> >> subsequent write.
> >
> > No, that's not correct.  wmb() makes sure that any write operation
> > before the wmb() is completed prior to the execution of any subsequent
> > write.  It does _not_ guarantee that the earlier write is _observed_
> > prior to the execution of subsequent writes.
> >
> > However, it _does_ guarantee that the earlier write is observed before
> > any subsequent write is observed.  In other words, if we have A, wmb(),
> > B, then:
> >
> > A completes before B is executed, and
> > A is observed before B is observed.
> >
> > But we do not have: A is observed before B is executed.
> Got it and thanks for your kind explanation. ^^
> 
> >
> >> I can understand ur opinion that below (1) wmb is needed, since it
> >> preserve the write order of (A) and (B).
> >> But shouldn't we add wmb in (2) to make sure the ordering of (B) and
> >> following writes?
> >>   sitd->hw_next = ehci->periodic[frame];
> >> ehci->pshadow[frame].sitd = sitd;
> >> sitd->frame = frame; > (A)
> >> -   wmb();  > (1)
> >> ehci->periodic[frame] = cpu_to_hc32(ehci, sitd->sitd_dma |
> >> Q_TYPE_SITD); -> (B)
> >> +   /* make sure host see periodic frame change before scheduling */
> >> +   wmb(); > (2)
> >
> > No.  (B) does not need to be ordered before following writes.  Think
> > about it -- what would happen if a subsequent write was observed before
> > (B)?
> Per my understanding,
> 1. this part of code is protected by ehci->lock, so there will be only
> one processor update the data.
> 2. Even subsequent write was observed before above (B), but all
> necessary work is done on __this__ frame, and following write after
> (1)  will only affect next frame's data.
> if I am wrong, please let me know.

You are correct.  But there's more to it.  Even if a subsequent write
was observed before B, it wouldn't matter.  If the EHCI hardware sees
the old value of ehci->periodic[frame] -- the value before B updated it
-- then it will start reading the frame's data from the old starting
point, the position after sitd.  That's okay, skipping over the sitd
that was just added won't hurt anything.

But suppose you remove the wmb at (1).  Then the EHCI hardware might 
see the new value of ehci->periodic[frame] and start reading the data 
from the new sitd, before it sees the value stored in sitd->hw_next.  
In other words, the hardware would read garbage from sitd->hw_next 
rather than the value stored two lines above (A), and the hardware 
would crash.  That's why the wmb at (1) is necessary.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Alan Stern
On Mon, 10 Apr 2017, Paul E. McKenney wrote:

> > But I would like to get this matter settled first.  Is the explicit 
> > barrier truly necessary?
> 
> If you are using wait_event()/wake_up() or friends, the explicit
> barrier -is- necessary.  To see this, look at v4.10's wait_event():
> 
>   #define wait_event(wq, condition)   \
>   do {\
>   might_sleep();  \
>   if (condition)  \
>   break;  \
>   __wait_event(wq, condition);\
>   } while (0)
> 
> As you can see, if the condition is set just before the wait_event()
> macro checks it, there is no ordering whatsoever.

This is true, but it is not relevant to the question I was asking.

>  And if wake_up()
> finds nothing to wake up, there is no relevant ordering on that side,
> either.
> 
> So you had better supply your own ordering, period, end of story.

The question is: Exactly what ordering do I need to supply?  The 
ordering among my own variables is okay; I know how to deal with that.  
But what about the ordering between my variables and current->state?

For example, __wait_event() calls prepare_to_wait(), which calls
set_current_state(), which calls smp_store_mb(), thereby inserting a
full memory barrier between setting current->state and checking the
condition.  But I didn't see any comparable barrier inserted by
wake_up(), between setting the condition and checking task->state.

However, now that I look more closely, I do see that wakeup_process() 
calls try_to_wake_up(), which begins with:

/*
 * If we are going to wake up a thread waiting for CONDITION we
 * need to ensure that CONDITION=1 done by the caller can not be
 * reordered with p->state check below. This pairs with mb() in
 * set_current_state() the waiting thread does.
 */
smp_mb__before_spinlock();
raw_spin_lock_irqsave(&p->pi_lock, flags);
if (!(p->state & state))

So it does insert a full barrier after all, and there is nothing to 
worry about.

This also means that the analysis provided by Thinh Nguyen in the 
original patch description is wrong.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> 
> > > But I would like to get this matter settled first.  Is the explicit 
> > > barrier truly necessary?
> > 
> > If you are using wait_event()/wake_up() or friends, the explicit
> > barrier -is- necessary.  To see this, look at v4.10's wait_event():
> > 
> > #define wait_event(wq, condition)   \
> > do {\
> > might_sleep();  \
> > if (condition)  \
> > break;  \
> > __wait_event(wq, condition);\
> > } while (0)
> > 
> > As you can see, if the condition is set just before the wait_event()
> > macro checks it, there is no ordering whatsoever.
> 
> This is true, but it is not relevant to the question I was asking.

Apologies!  What I get for answering email too early on Monday, I guess...

> >  And if wake_up()
> > finds nothing to wake up, there is no relevant ordering on that side,
> > either.
> > 
> > So you had better supply your own ordering, period, end of story.
> 
> The question is: Exactly what ordering do I need to supply?  The 
> ordering among my own variables is okay; I know how to deal with that.  
> But what about the ordering between my variables and current->state?

The ordering with current->state is sadly not relevant because it is
only touched if wake_up() actually wakes the process up.

> For example, __wait_event() calls prepare_to_wait(), which calls
> set_current_state(), which calls smp_store_mb(), thereby inserting a
> full memory barrier between setting current->state and checking the
> condition.  But I didn't see any comparable barrier inserted by
> wake_up(), between setting the condition and checking task->state.
> 
> However, now that I look more closely, I do see that wakeup_process() 
> calls try_to_wake_up(), which begins with:
> 
>   /*
>* If we are going to wake up a thread waiting for CONDITION we
>* need to ensure that CONDITION=1 done by the caller can not be
>* reordered with p->state check below. This pairs with mb() in
>* set_current_state() the waiting thread does.
>*/
>   smp_mb__before_spinlock();
>   raw_spin_lock_irqsave(&p->pi_lock, flags);
>   if (!(p->state & state))
> 
> So it does insert a full barrier after all, and there is nothing to 
> worry about.

Nice!

Hmmm...

Another valid (and I believe more common) idiom is this:

spin_lock(&mylock);
changes_that_must_be_visible_to_woken_thread();
WRITE_ONCE(need_wake_up, true);
spin_unlock(&mylock);

---

wait_event(wq, READ_ONCE(need_wake_up));
spin_lock(&mylock);
access_variables_used_by_waking_thread();
spin_unlock(&mylock);

In this case, the locks do all the required ordering.

> This also means that the analysis provided by Thinh Nguyen in the 
> original patch description is wrong.

And that the bug is elsewhere?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Alan Stern
On Mon, 10 Apr 2017, Paul E. McKenney wrote:

> On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote:
> > On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> > 
> > > > But I would like to get this matter settled first.  Is the explicit 
> > > > barrier truly necessary?
> > > 
> > > If you are using wait_event()/wake_up() or friends, the explicit
> > > barrier -is- necessary.  To see this, look at v4.10's wait_event():
> > > 
> > >   #define wait_event(wq, condition)   \
> > >   do {\
> > >   might_sleep();  \
> > >   if (condition)  \
> > >   break;  \
> > >   __wait_event(wq, condition);\
> > >   } while (0)
> > > 
> > > As you can see, if the condition is set just before the wait_event()
> > > macro checks it, there is no ordering whatsoever.
> > 
> > This is true, but it is not relevant to the question I was asking.
> 
> Apologies!  What I get for answering email too early on Monday, I guess...
> 
> > >  And if wake_up()
> > > finds nothing to wake up, there is no relevant ordering on that side,
> > > either.
> > > 
> > > So you had better supply your own ordering, period, end of story.
> > 
> > The question is: Exactly what ordering do I need to supply?  The 
> > ordering among my own variables is okay; I know how to deal with that.  
> > But what about the ordering between my variables and current->state?
> 
> The ordering with current->state is sadly not relevant because it is
> only touched if wake_up() actually wakes the process up.

Well, it is _written_ only if wake_up() actually wakes the process up.  
But it is _read_ in every case.

> > For example, __wait_event() calls prepare_to_wait(), which calls
> > set_current_state(), which calls smp_store_mb(), thereby inserting a
> > full memory barrier between setting current->state and checking the
> > condition.  But I didn't see any comparable barrier inserted by
> > wake_up(), between setting the condition and checking task->state.
> > 
> > However, now that I look more closely, I do see that wakeup_process() 
> > calls try_to_wake_up(), which begins with:
> > 
> > /*
> >  * If we are going to wake up a thread waiting for CONDITION we
> >  * need to ensure that CONDITION=1 done by the caller can not be
> >  * reordered with p->state check below. This pairs with mb() in
> >  * set_current_state() the waiting thread does.
> >  */
> > smp_mb__before_spinlock();
> > raw_spin_lock_irqsave(&p->pi_lock, flags);
> > if (!(p->state & state))
> > 
> > So it does insert a full barrier after all, and there is nothing to 
> > worry about.
> 
> Nice!

It looks like the other wakeup pathways end up funnelling through 
try_to_wake_up(), so this is true in general.

> Hmmm...
> 
> Another valid (and I believe more common) idiom is this:
> 
>   spin_lock(&mylock);
>   changes_that_must_be_visible_to_woken_thread();
>   WRITE_ONCE(need_wake_up, true);
>   spin_unlock(&mylock);
> 
>   ---
> 
>   wait_event(wq, READ_ONCE(need_wake_up));
>   spin_lock(&mylock);
>   access_variables_used_by_waking_thread();
>   spin_unlock(&mylock);
> 
> In this case, the locks do all the required ordering.
> 
> > This also means that the analysis provided by Thinh Nguyen in the 
> > original patch description is wrong.
> 
> And that the bug is elsewhere?

Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
the problem down to this particular mechanism.  The driver in question
in drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
are bulk_out_complete()/wakeup_thread(), which do:

// bulk_out_complete()
bh->state = BH_STATE_FULL;

// wakeup_thread()
smp_wmb();  /* ensure the write of bh->state is complete */
/* Tell the main thread that something has happened */
common->thread_wakeup_needed = 1;
if (common->thread_task)
wake_up_process(common->thread_task);

and the waiters are get_next_command()/sleep_thread(), which do:

// get_next_command()
while (bh->state == BH_STATE_EMPTY) {

// sleep_thread()
for (;;) {
if (can_freeze)
try_to_freeze();
set_current_state(TASK_INTERRUPTIBLE);
if (signal_pending(current)) {
rc = -EINTR;
break;
}
if (common->thread_wakeup_needed)
break;
schedule();
}
__set_current_state(TASK_RUNNING);
common->thread_wakeup_needed = 0;
smp_rmb();  /* ensure the latest bh->state is visible */

}

and he said that the problem was caused by the waiter seeing 
thread_wakeup_needed

Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 01:48:13PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> 
> > On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote:
> > > On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> > > 
> > > > > But I would like to get this matter settled first.  Is the explicit 
> > > > > barrier truly necessary?
> > > > 
> > > > If you are using wait_event()/wake_up() or friends, the explicit
> > > > barrier -is- necessary.  To see this, look at v4.10's wait_event():
> > > > 
> > > > #define wait_event(wq, condition)   
> > > > \
> > > > do {
> > > > \
> > > > might_sleep();  
> > > > \
> > > > if (condition)  
> > > > \
> > > > break;  
> > > > \
> > > > __wait_event(wq, condition);
> > > > \
> > > > } while (0)
> > > > 
> > > > As you can see, if the condition is set just before the wait_event()
> > > > macro checks it, there is no ordering whatsoever.
> > > 
> > > This is true, but it is not relevant to the question I was asking.
> > 
> > Apologies!  What I get for answering email too early on Monday, I guess...
> > 
> > > >  And if wake_up()
> > > > finds nothing to wake up, there is no relevant ordering on that side,
> > > > either.
> > > > 
> > > > So you had better supply your own ordering, period, end of story.
> > > 
> > > The question is: Exactly what ordering do I need to supply?  The 
> > > ordering among my own variables is okay; I know how to deal with that.  
> > > But what about the ordering between my variables and current->state?
> > 
> > The ordering with current->state is sadly not relevant because it is
> > only touched if wake_up() actually wakes the process up.
> 
> Well, it is _written_ only if wake_up() actually wakes the process up.  
> But it is _read_ in every case.

For wake_up_process(), agreed.  But for wake_up(), if the process
doing the wait_event() saw the changed state on the very first check,
the waking process won't have any way of gaining a reference to the
"awakened" task, so cannot access its ->state.

> > > For example, __wait_event() calls prepare_to_wait(), which calls
> > > set_current_state(), which calls smp_store_mb(), thereby inserting a
> > > full memory barrier between setting current->state and checking the
> > > condition.  But I didn't see any comparable barrier inserted by
> > > wake_up(), between setting the condition and checking task->state.
> > > 
> > > However, now that I look more closely, I do see that wakeup_process() 
> > > calls try_to_wake_up(), which begins with:
> > > 
> > >   /*
> > >* If we are going to wake up a thread waiting for CONDITION we
> > >* need to ensure that CONDITION=1 done by the caller can not be
> > >* reordered with p->state check below. This pairs with mb() in
> > >* set_current_state() the waiting thread does.
> > >*/
> > >   smp_mb__before_spinlock();
> > >   raw_spin_lock_irqsave(&p->pi_lock, flags);
> > >   if (!(p->state & state))
> > > 
> > > So it does insert a full barrier after all, and there is nothing to 
> > > worry about.
> > 
> > Nice!
> 
> It looks like the other wakeup pathways end up funnelling through 
> try_to_wake_up(), so this is true in general.

Only for wake_up_process() and friends, not for wake_up() and friends.
Again, although wake_up_process() unconditionally checks the awakened
processm, wake_up() doesn't even have any way of knowing what process
it woke up in the case where the "awakened" process didn't actually sleep.

But even in the wake_up_process() case, don't we need the wait-side
process to have appropriate barriers (or locks or whatever) manually
inserted?

> > Hmmm...
> > 
> > Another valid (and I believe more common) idiom is this:
> > 
> > spin_lock(&mylock);
> > changes_that_must_be_visible_to_woken_thread();
> > WRITE_ONCE(need_wake_up, true);
> > spin_unlock(&mylock);
> > 
> > ---
> > 
> > wait_event(wq, READ_ONCE(need_wake_up));
> > spin_lock(&mylock);
> > access_variables_used_by_waking_thread();
> > spin_unlock(&mylock);
> > 
> > In this case, the locks do all the required ordering.
> > 
> > > This also means that the analysis provided by Thinh Nguyen in the 
> > > original patch description is wrong.
> > 
> > And that the bug is elsewhere?
> 
> Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
> the problem down to this particular mechanism.  The driver in question
> in drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
> are bulk_out_complete()/wakeup_thread(), which do:
> 
>   // bulk_out_complete()
>   bh->state = BH_STATE_FULL;
> 
>   // wakeup_thread()
>   smp_wmb();  /* ensure the write of bh->state i

Re: [PATCH 1/3] usb: dwc3: gadget: Prevent losing events in event cache

2017-04-10 Thread John Youn
On 04/08/2017 12:38 PM, Janusz Dziedzic wrote:
> 2017-04-08 1:57 GMT+02:00 Thinh Nguyen :
>> The dwc3 driver can overwite its previous events if its top half IRQ
>> handler gets invoked again before processing the events in the cache. We
>> see this as a hang in the file transfer and the host will attempt to
>> reset the device. This shouldn't be possible in the dwc3 implementation
>> if the HW and the SW are in sync. However, through testing and reading
>> the pcie trace, the top half IRQ handler occasionally still gets invoked
>> one more time after HW interrupt deassertion. We suspect that there is a
>> small detection delay in the SW.
>>
>> Top half IRQ handler deasserts the interrupt line after it gets the
>> event count by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. There's a
>> small window for a new event coming in between reading the event count
>> and the interrupt deassertion. More generally, we will see 0 event
>> count, which should not affect anything.
>>
>> To avoid this issue, we ensure that the events in the cache are
>> processed before checking for the new ones. The bottom half IRQ will
>> unmask the DWC3_GEVNTSIZ_INTMASK once it finishes processing the events.
>> Top half IRQ handler needs to check whether the DWC3_GEVNTSIZ_INTMASK is
>> still set before storing the new events. It also should return
>> IRQ_HANDLED if the mask is still set since IRQ handler was invoked for
>> our usb interrupt.
>>
>> Signed-off-by: Thinh Nguyen 
>> ---
>>  drivers/usb/dwc3/gadget.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 4dc80729ae11..93d98fb7215e 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -3045,6 +3045,10 @@ static irqreturn_t dwc3_check_event_buf(struct 
>> dwc3_event_buffer *evt)
>> return IRQ_HANDLED;
>> }
>>
>> +   reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>> +   if (reg & DWC3_GEVNTSIZ_INTMASK)
>> +   return IRQ_HANDLED;
>> +
>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>> count &= DWC3_GEVNTCOUNT_MASK;
>> if (!count)
>> @@ -3054,7 +3058,6 @@ static irqreturn_t dwc3_check_event_buf(struct 
>> dwc3_event_buffer *evt)
>> evt->flags |= DWC3_EVENT_PENDING;
>>
>> /* Mask interrupt */
>> -   reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>> reg |= DWC3_GEVNTSIZ_INTMASK;
>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
>>
>> --
>> 2.11.0
>>
> This seems like a workaround, while when we mask interrupts we should
> not get IRQ before BH will done. Current driver implementation base on
> this.
> We are using 2.60a and didn't see problem you describe. Is it specyfic
> for some HW revision?

Doesn't seem to be. We can try other HW versions.

>
> I can imagine we can have the "last" interrupt and you will return
> IRQ_HANDLED. Will HW issue this interrupt once again? If not will we
> loose this IRQ (eg. disconnect event)?
>

Yes it will re-assert. The interrupt line will remain asserted as long
events remain in the buffer and it is not masked. So when we
eventually unmask, the interrupt will be reasserted again immediately.

> BTW, other option here could be using (like Baolin propose some time
> ago) dwc->lock in top half while this is held for BH.
> Question how long spin_lock() will be held in top half...
> I am not sure what option is the best.

That won't help in this case since you can still have two calls top
top-half before a call to bottom. The lock only prevents the two from
stepping on each other, but that should already be the case without
needing the lock.

Regards,
John
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Alan Stern
On Mon, 10 Apr 2017, Paul E. McKenney wrote:

> > > The ordering with current->state is sadly not relevant because it is
> > > only touched if wake_up() actually wakes the process up.
> > 
> > Well, it is _written_ only if wake_up() actually wakes the process up.  
> > But it is _read_ in every case.
> 
> For wake_up_process(), agreed.  But for wake_up(), if the process
> doing the wait_event() saw the changed state on the very first check,
> the waking process won't have any way of gaining a reference to the
> "awakened" task, so cannot access its ->state.

True.  But that would be okay, since the waiter has definitely seen the
changed state.  I was concerned about the possibility that there was no
wakeup and the waiter did _not_ see the changed state.  That's how you 
get tasks staying asleep indefinitely when they should be running.

> > It looks like the other wakeup pathways end up funnelling through 
> > try_to_wake_up(), so this is true in general.
> 
> Only for wake_up_process() and friends, not for wake_up() and friends.
> Again, although wake_up_process() unconditionally checks the awakened
> processm, wake_up() doesn't even have any way of knowing what process
> it woke up in the case where the "awakened" process didn't actually sleep.

Like the above, this would be okay.

> But even in the wake_up_process() case, don't we need the wait-side
> process to have appropriate barriers (or locks or whatever) manually
> inserted?

Only for accesses among the driver's own variables.  There's no need to
order the local variables against current->state; as we have seen,
that's all handled for us.

> > > > This also means that the analysis provided by Thinh Nguyen in the 
> > > > original patch description is wrong.
> > > 
> > > And that the bug is elsewhere?
> > 
> > Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
> > the problem down to this particular mechanism.  The driver in question
> > in drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
> > are bulk_out_complete()/wakeup_thread(), which do:
> > 
> > // bulk_out_complete()
> > bh->state = BH_STATE_FULL;
> > 
> > // wakeup_thread()
> > smp_wmb();  /* ensure the write of bh->state is complete */
> > /* Tell the main thread that something has happened */
> > common->thread_wakeup_needed = 1;
> > if (common->thread_task)
> > wake_up_process(common->thread_task);
> > 
> > and the waiters are get_next_command()/sleep_thread(), which do:
> > 
> > // get_next_command()
> > while (bh->state == BH_STATE_EMPTY) {
> > 
> > // sleep_thread()
> > for (;;) {
> > if (can_freeze)
> > try_to_freeze();
> > set_current_state(TASK_INTERRUPTIBLE);
> > if (signal_pending(current)) {
> > rc = -EINTR;
> > break;
> > }
> > if (common->thread_wakeup_needed)
> > break;
> > schedule();
> > }
> > __set_current_state(TASK_RUNNING);
> > common->thread_wakeup_needed = 0;
> > smp_rmb();  /* ensure the latest bh->state is visible */
> > 
> > }
> > 
> > and he said that the problem was caused by the waiter seeing 
> > thread_wakeup_needed == 0, so the wakeup was getting lost.
> > 
> > Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1
> > could race with the waiter's thread_wakeup_needed = 0.  If there are
> > two waits in quick succession, the second could get lost.  The pattern
> > would be:
> > 
> > bh->state = BH_STATE_FULL;
> > smp_wmb();
> > thread_wakeup_needed = 0;   thread_wakeup_needed = 1;
> > smp_rmb();
> > if (bh->state != BH_STATE_FULL)
> > sleep again...
> > 
> > This is the so-called R pattern, and it also needs full memory barriers
> > on both sides.  The barriers we have are not sufficient.  (This is an
> > indication that the driver's design needs to be re-thought.)  As it is,
> > the waiter's thread_wakeup_needed = 0 can overwrite the waker's
> > thread_wakeup_needed = 1 while the waiter's read of bh->state then
> > fails to see the waker's write.  (This analysis is similar to but 
> > different from the one in the patch description.)
> > 
> > To fix this problem, both the smp_rmb() in sleep_thread() and the
> > smp_wmb() in wakeup_thread() should be changed to smp_mb().
> 
> Good catch!
> 
> However, if this failure was seen on x86, there is something else going
> on as well.

Why do you say that?  Isn't R-pattern reordering observable on x86?  
It involves a write followed by a read, which is the sort of thing x86
is able to reorder.

Anyway, I don't know what type of system was used for testing.  The
patch description didn't say.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info 

Re: [PATCH 1/3] usb: dwc3: gadget: Prevent losing events in event cache

2017-04-10 Thread John Youn
On 04/09/2017 11:59 PM, Felipe Balbi wrote:
>
> Hi,
>
> Thinh Nguyen  writes:
>> The dwc3 driver can overwite its previous events if its top half IRQ
>> handler gets invoked again before processing the events in the cache. We
>
> interrupts are masked, why would top half get invoked again? Is this,
> perhaps, related to DWC3 3.00a which has the "Interrupt line doesn't
> lower when masked" problem? We've added a lot of code to workaround that
> problem and, apparently, it wasn't enough.

No, it is not related to that. We verified with PCIe traces. The
interrupt line gets deasserted after we mask it. And we put the
masking as close to the beginning of the top-half as possible.

>
> In any case, there's no way top half would be invoked again in a
> properly working DWC3.

Yet we still see it sometimes. Usually it doesn't create a problem,
but if there happens to be a new event there, we get the failure.

We didn't trace into that part of the kernel so we can't explain why.
But if there is any chance the interrupt line deassertion wasn't
detected in time, whatever part of the kernel that thinks it is still
asserted might just call our top-half again. This could be a totally
wrong assumption, but it doesn't seem too far-fetched.

If you have any insight on how to debug this, we can investigate this
more.


>
>> see this as a hang in the file transfer and the host will attempt to
>> reset the device. This shouldn't be possible in the dwc3 implementation
>> if the HW and the SW are in sync. However, through testing and reading
>
> it's not about SW and HW being in sync, it's about the HW being quirky,
> right?

No, we don't see any problem in the hardware. Here, being out of
"sync" just means the device hangs because of lost events. There are a
number of pretty bad failure scenarios that could happen due to this.

>
>> the pcie trace, the top half IRQ handler occasionally still gets invoked
>> one more time after HW interrupt deassertion. We suspect that there is a
>> small detection delay in the SW.
>
> so you don't really know what's going on :-) Why send a patch if you
> don't understand the problem fully?
>
>> Top half IRQ handler deasserts the interrupt line after it gets the
>> event count by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. There's a
>> small window for a new event coming in between reading the event count
>> and the interrupt deassertion. More generally, we will see 0 event
>> count, which should not affect anything.
>
> this new event can come, and it'll remain in the event buffer. We have
> masked the IRQ line and this should be enough to make sure IRQ isn't
> generated again. Why is the IRQ raising again? Which kernel version are
> you testing? Which DWC3 revision? Do you have interrupt moderation?

It's not about the new event. We see the case where the top-half is
called again just after we mask the interrupt without any events too.
That's probably the majority of the time it occurs. It's just that if
this case happens to coincide with a new event, everything gets
screwed up.

An alternative fix is to not allow the top-half to overwrite the event
cache. Just append and update it. The top and bottom are already
mutually exclusive so it should be fine.

Regards,
John
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Paul E. McKenney
On Mon, Apr 10, 2017 at 03:00:34PM -0400, Alan Stern wrote:
> On Mon, 10 Apr 2017, Paul E. McKenney wrote:
> 
> > > > The ordering with current->state is sadly not relevant because it is
> > > > only touched if wake_up() actually wakes the process up.
> > > 
> > > Well, it is _written_ only if wake_up() actually wakes the process up.  
> > > But it is _read_ in every case.
> > 
> > For wake_up_process(), agreed.  But for wake_up(), if the process
> > doing the wait_event() saw the changed state on the very first check,
> > the waking process won't have any way of gaining a reference to the
> > "awakened" task, so cannot access its ->state.
> 
> True.  But that would be okay, since the waiter has definitely seen the
> changed state.  I was concerned about the possibility that there was no
> wakeup and the waiter did _not_ see the changed state.  That's how you 
> get tasks staying asleep indefinitely when they should be running.
> 
> > > It looks like the other wakeup pathways end up funnelling through 
> > > try_to_wake_up(), so this is true in general.
> > 
> > Only for wake_up_process() and friends, not for wake_up() and friends.
> > Again, although wake_up_process() unconditionally checks the awakened
> > processm, wake_up() doesn't even have any way of knowing what process
> > it woke up in the case where the "awakened" process didn't actually sleep.
> 
> Like the above, this would be okay.
> 
> > But even in the wake_up_process() case, don't we need the wait-side
> > process to have appropriate barriers (or locks or whatever) manually
> > inserted?
> 
> Only for accesses among the driver's own variables.  There's no need to
> order the local variables against current->state; as we have seen,
> that's all handled for us.

Fair enough.  However, the kernel documentation needs to consider
the driver's own variables.

> > > > > This also means that the analysis provided by Thinh Nguyen in the 
> > > > > original patch description is wrong.
> > > > 
> > > > And that the bug is elsewhere?
> > > 
> > > Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
> > > the problem down to this particular mechanism.  The driver in question
> > > in drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
> > > are bulk_out_complete()/wakeup_thread(), which do:
> > > 
> > >   // bulk_out_complete()
> > >   bh->state = BH_STATE_FULL;
> > > 
> > >   // wakeup_thread()
> > >   smp_wmb();  /* ensure the write of bh->state is complete */
> > >   /* Tell the main thread that something has happened */
> > >   common->thread_wakeup_needed = 1;
> > >   if (common->thread_task)
> > >   wake_up_process(common->thread_task);
> > > 
> > > and the waiters are get_next_command()/sleep_thread(), which do:
> > > 
> > > // get_next_command()
> > > while (bh->state == BH_STATE_EMPTY) {
> > > 
> > >   // sleep_thread()
> > >   for (;;) {
> > >   if (can_freeze)
> > >   try_to_freeze();
> > >   set_current_state(TASK_INTERRUPTIBLE);
> > >   if (signal_pending(current)) {
> > >   rc = -EINTR;
> > >   break;
> > >   }
> > >   if (common->thread_wakeup_needed)
> > >   break;
> > >   schedule();
> > >   }
> > >   __set_current_state(TASK_RUNNING);
> > >   common->thread_wakeup_needed = 0;
> > >   smp_rmb();  /* ensure the latest bh->state is visible */
> > > 
> > > }
> > > 
> > > and he said that the problem was caused by the waiter seeing 
> > > thread_wakeup_needed == 0, so the wakeup was getting lost.
> > > 
> > > Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1
> > > could race with the waiter's thread_wakeup_needed = 0.  If there are
> > > two waits in quick succession, the second could get lost.  The pattern
> > > would be:
> > > 
> > >   bh->state = BH_STATE_FULL;
> > >   smp_wmb();
> > >   thread_wakeup_needed = 0;   thread_wakeup_needed = 1;
> > >   smp_rmb();
> > >   if (bh->state != BH_STATE_FULL)
> > >   sleep again...
> > > 
> > > This is the so-called R pattern, and it also needs full memory barriers
> > > on both sides.  The barriers we have are not sufficient.  (This is an
> > > indication that the driver's design needs to be re-thought.)  As it is,
> > > the waiter's thread_wakeup_needed = 0 can overwrite the waker's
> > > thread_wakeup_needed = 1 while the waiter's read of bh->state then
> > > fails to see the waker's write.  (This analysis is similar to but 
> > > different from the one in the patch description.)
> > > 
> > > To fix this problem, both the smp_rmb() in sleep_thread() and the
> > > smp_wmb() in wakeup_thread() should be changed to smp_mb().
> > 
> > Good catch!
> > 
> > However, if this failure was seen on x86, there is something else going
> > on as well.
> 
> Why do you say that?  Isn't R-pattern reordering observable on x86?  
> It involves a write followed b

Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

2017-04-10 Thread Thinh Nguyen

On 4/10/2017 12:31 PM, Paul E. McKenney wrote:

On Mon, Apr 10, 2017 at 03:00:34PM -0400, Alan Stern wrote:

On Mon, 10 Apr 2017, Paul E. McKenney wrote:


The ordering with current->state is sadly not relevant because it is
only touched if wake_up() actually wakes the process up.


Well, it is _written_ only if wake_up() actually wakes the process up.
But it is _read_ in every case.


For wake_up_process(), agreed.  But for wake_up(), if the process
doing the wait_event() saw the changed state on the very first check,
the waking process won't have any way of gaining a reference to the
"awakened" task, so cannot access its ->state.


True.  But that would be okay, since the waiter has definitely seen the
changed state.  I was concerned about the possibility that there was no
wakeup and the waiter did _not_ see the changed state.  That's how you
get tasks staying asleep indefinitely when they should be running.


It looks like the other wakeup pathways end up funnelling through
try_to_wake_up(), so this is true in general.


Only for wake_up_process() and friends, not for wake_up() and friends.
Again, although wake_up_process() unconditionally checks the awakened
processm, wake_up() doesn't even have any way of knowing what process
it woke up in the case where the "awakened" process didn't actually sleep.


Like the above, this would be okay.


But even in the wake_up_process() case, don't we need the wait-side
process to have appropriate barriers (or locks or whatever) manually
inserted?


Only for accesses among the driver's own variables.  There's no need to
order the local variables against current->state; as we have seen,
that's all handled for us.


Fair enough.  However, the kernel documentation needs to consider
the driver's own variables.


This also means that the analysis provided by Thinh Nguyen in the
original patch description is wrong.


And that the bug is elsewhere?


Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
the problem down to this particular mechanism.  The driver in question
in drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
are bulk_out_complete()/wakeup_thread(), which do:

// bulk_out_complete()
bh->state = BH_STATE_FULL;

// wakeup_thread()
smp_wmb();  /* ensure the write of bh->state is complete */
/* Tell the main thread that something has happened */
common->thread_wakeup_needed = 1;
if (common->thread_task)
wake_up_process(common->thread_task);

and the waiters are get_next_command()/sleep_thread(), which do:

// get_next_command()
while (bh->state == BH_STATE_EMPTY) {

// sleep_thread()
for (;;) {
if (can_freeze)
try_to_freeze();
set_current_state(TASK_INTERRUPTIBLE);
if (signal_pending(current)) {
rc = -EINTR;
break;
}
if (common->thread_wakeup_needed)
break;
schedule();
}
__set_current_state(TASK_RUNNING);
common->thread_wakeup_needed = 0;
smp_rmb();  /* ensure the latest bh->state is visible */

}

and he said that the problem was caused by the waiter seeing
thread_wakeup_needed == 0, so the wakeup was getting lost.

Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1
could race with the waiter's thread_wakeup_needed = 0.  If there are
two waits in quick succession, the second could get lost.  The pattern
would be:

bh->state = BH_STATE_FULL;
smp_wmb();
thread_wakeup_needed = 0;   thread_wakeup_needed = 1;
smp_rmb();
if (bh->state != BH_STATE_FULL)
sleep again...

This is the so-called R pattern, and it also needs full memory barriers
on both sides.  The barriers we have are not sufficient.  (This is an
indication that the driver's design needs to be re-thought.)  As it is,
the waiter's thread_wakeup_needed = 0 can overwrite the waker's
thread_wakeup_needed = 1 while the waiter's read of bh->state then
fails to see the waker's write.  (This analysis is similar to but
different from the one in the patch description.)

To fix this problem, both the smp_rmb() in sleep_thread() and the
smp_wmb() in wakeup_thread() should be changed to smp_mb().


Wouldn't it be sufficient to have smp_mb() just either in the 
sleep_thread() or the wakeup_thread() in this case?


With just 1 smp_mb(), either bh->state or thread_wakeup_needed is set in 
time to break the loop depending on when the wakeup_thread() enters.


Case 1: smp_mb() in sleep_thread()
bh->state = BH_STATE_FULL;
smp_wmb();
thread_wakeup_needed = 0;   thread_wakeup_needed = 1;
-->  smp_mb();
if (bh->state != BH_STATE_FULL)
 

Re: [PATCH 2/3] usb: dwc3: gadget: Fix early exit in set/clear ep halt

2017-04-10 Thread Thinh Nguyen

Hi Felipe,

On 4/10/2017 12:04 AM, Felipe Balbi wrote:


Hi again,

Felipe Balbi  writes:

Thinh Nguyen  writes:

This patch fixes a commit that causes a hang from device waiting for
data with the wrong sequence number. The commit ffb80fc672c3 ("usb:
dwc3: gadget: skip Set/Clear Halt when invalid") adds a check to return
early depending on DWC3_EP_STALL is set or not, prevent sending the ep
halt command to HW endpoint to do CLEAR_FEATURE(ENDPOINT_HALT) request.
This was to workaround the issue for macOS where the device hangs from
sending DWC3 clear stall command.

In USB 3.1 spec, 9.4.5, CLEAR_FEATURE(ENDPOINT_HALT) request always
results in the data sequence being reinitialized to zero regardless
whether the endpoint has been halted or not. Some device class depends
on this feature for its protocol. For instance, in mass storage class,
there is MSC reset protocol that does CLEAR_FEATURE(ENDPOINT_HALT) on
bulk endpoints. This protocol reinitializes the data sequence and
ensures that whatever pending data requested from previous CBW will be
reset. Otherwise this will cause a hang as the device can wait for the
data with the wrong sequence number from the previous CBW. We found this
failure in USB CV: MSC Error Recovery Test with f_mass_storage.

This patch fixes this issue by checking to see whether the set/halt ep
call is a protocol call before early exit to make sure that set/clear
halt endpoint command can go through if it is a device class protocol.

Fixes: ffb80fc672c3 ("usb: dwc3: gadget: skip Set/Clear Halt when invalid")
Signed-off-by: Thinh Nguyen 


this will regress the macOS case I wrote that commit for. We need to


no wait, it won't regress macOS, but we're still left with the problem
of host and peripheral being able to get DataToggle/SeqN out of sync.



This patch is for the regression we have. Can you provide more 
information for the macOS? I'm not sure if this is the case for macOS, 
but maybe there is still pending transfer when it tries to send the 
request? (There shouldn't be any before issuing ClearStall command). Do 
you see Starttransfer command after clear transfer?


BR,
Thinh Nguyen
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] usb: dwc3: gadget: Prevent losing events in event cache

2017-04-10 Thread Janusz Dziedzic
On 10 April 2017 at 20:43, John Youn  wrote:
> On 04/08/2017 12:38 PM, Janusz Dziedzic wrote:
>> 2017-04-08 1:57 GMT+02:00 Thinh Nguyen :
>>> The dwc3 driver can overwite its previous events if its top half IRQ
>>> handler gets invoked again before processing the events in the cache. We
>>> see this as a hang in the file transfer and the host will attempt to
>>> reset the device. This shouldn't be possible in the dwc3 implementation
>>> if the HW and the SW are in sync. However, through testing and reading
>>> the pcie trace, the top half IRQ handler occasionally still gets invoked
>>> one more time after HW interrupt deassertion. We suspect that there is a
>>> small detection delay in the SW.
>>>
>>> Top half IRQ handler deasserts the interrupt line after it gets the
>>> event count by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. There's a
>>> small window for a new event coming in between reading the event count
>>> and the interrupt deassertion. More generally, we will see 0 event
>>> count, which should not affect anything.
>>>
>>> To avoid this issue, we ensure that the events in the cache are
>>> processed before checking for the new ones. The bottom half IRQ will
>>> unmask the DWC3_GEVNTSIZ_INTMASK once it finishes processing the events.
>>> Top half IRQ handler needs to check whether the DWC3_GEVNTSIZ_INTMASK is
>>> still set before storing the new events. It also should return
>>> IRQ_HANDLED if the mask is still set since IRQ handler was invoked for
>>> our usb interrupt.
>>>
>>> Signed-off-by: Thinh Nguyen 
>>> ---
>>>  drivers/usb/dwc3/gadget.c | 5 -
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>> index 4dc80729ae11..93d98fb7215e 100644
>>> --- a/drivers/usb/dwc3/gadget.c
>>> +++ b/drivers/usb/dwc3/gadget.c
>>> @@ -3045,6 +3045,10 @@ static irqreturn_t dwc3_check_event_buf(struct 
>>> dwc3_event_buffer *evt)
>>> return IRQ_HANDLED;
>>> }
>>>
>>> +   reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>>> +   if (reg & DWC3_GEVNTSIZ_INTMASK)
>>> +   return IRQ_HANDLED;
>>> +
>>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>>> count &= DWC3_GEVNTCOUNT_MASK;
>>> if (!count)
>>> @@ -3054,7 +3058,6 @@ static irqreturn_t dwc3_check_event_buf(struct 
>>> dwc3_event_buffer *evt)
>>> evt->flags |= DWC3_EVENT_PENDING;
>>>
>>> /* Mask interrupt */
>>> -   reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>>> reg |= DWC3_GEVNTSIZ_INTMASK;
>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
>>>
>>> --
>>> 2.11.0
>>>
>> This seems like a workaround, while when we mask interrupts we should
>> not get IRQ before BH will done. Current driver implementation base on
>> this.
>> We are using 2.60a and didn't see problem you describe. Is it specyfic
>> for some HW revision?
>
> Doesn't seem to be. We can try other HW versions.
>
>>
>> I can imagine we can have the "last" interrupt and you will return
>> IRQ_HANDLED. Will HW issue this interrupt once again? If not will we
>> loose this IRQ (eg. disconnect event)?
>>
>
> Yes it will re-assert. The interrupt line will remain asserted as long
> events remain in the buffer and it is not masked. So when we
> eventually unmask, the interrupt will be reasserted again immediately.
>
>> BTW, other option here could be using (like Baolin propose some time
>> ago) dwc->lock in top half while this is held for BH.
>> Question how long spin_lock() will be held in top half...
>> I am not sure what option is the best.
>
> That won't help in this case since you can still have two calls top
> top-half before a call to bottom. The lock only prevents the two from
> stepping on each other, but that should already be the case without
> needing the lock.
>
Really can we have two TH calls before BH?
Interesting case :)

You suggest we have something like this:
dwc3 IRQ
   kernel irq_mask
  dwc3 TH
 mask dwc3 interrupts
 get evt->count, evt->cache
 write to hw (count)
 return WAKE_THREAD
   kernel irq_unmask
a) Before BH start we get another dwc3 IRQ?
b) CPU0 starts with BH while we get dwc3 IRQ on CPU1?

*) in normal case BH should start and in the end unmask
dwc3_interrupts and after that we could get new irq

If this could happen then for sure you need dwc->lock protection in TH
while base on your description CPU0 can execute BH and CPU1 can handle
TH - so you have to protect dwc3_event_buffer.

Could you describe this more? How to reproduce?
Do you think we introduce this when adding evt->cache in TH?
Even without evt->cache we still could overwrite evt->count - so, was
that seen before evt->cache?

BR
Janusz

> Regards,
> John
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: sen

Re: [PATCH 1/3] usb: dwc3: gadget: Prevent losing events in event cache

2017-04-10 Thread John Youn
On 04/10/2017 11:17 PM, Janusz Dziedzic wrote:
> On 10 April 2017 at 20:43, John Youn  wrote:
>> On 04/08/2017 12:38 PM, Janusz Dziedzic wrote:
>>> 2017-04-08 1:57 GMT+02:00 Thinh Nguyen :
 The dwc3 driver can overwite its previous events if its top half IRQ
 handler gets invoked again before processing the events in the cache. We
 see this as a hang in the file transfer and the host will attempt to
 reset the device. This shouldn't be possible in the dwc3 implementation
 if the HW and the SW are in sync. However, through testing and reading
 the pcie trace, the top half IRQ handler occasionally still gets invoked
 one more time after HW interrupt deassertion. We suspect that there is a
 small detection delay in the SW.

 Top half IRQ handler deasserts the interrupt line after it gets the
 event count by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. There's a
 small window for a new event coming in between reading the event count
 and the interrupt deassertion. More generally, we will see 0 event
 count, which should not affect anything.

 To avoid this issue, we ensure that the events in the cache are
 processed before checking for the new ones. The bottom half IRQ will
 unmask the DWC3_GEVNTSIZ_INTMASK once it finishes processing the events.
 Top half IRQ handler needs to check whether the DWC3_GEVNTSIZ_INTMASK is
 still set before storing the new events. It also should return
 IRQ_HANDLED if the mask is still set since IRQ handler was invoked for
 our usb interrupt.

 Signed-off-by: Thinh Nguyen 
 ---
  drivers/usb/dwc3/gadget.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

 diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
 index 4dc80729ae11..93d98fb7215e 100644
 --- a/drivers/usb/dwc3/gadget.c
 +++ b/drivers/usb/dwc3/gadget.c
 @@ -3045,6 +3045,10 @@ static irqreturn_t dwc3_check_event_buf(struct 
 dwc3_event_buffer *evt)
 return IRQ_HANDLED;
 }

 +   reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
 +   if (reg & DWC3_GEVNTSIZ_INTMASK)
 +   return IRQ_HANDLED;
 +
 count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
 count &= DWC3_GEVNTCOUNT_MASK;
 if (!count)
 @@ -3054,7 +3058,6 @@ static irqreturn_t dwc3_check_event_buf(struct 
 dwc3_event_buffer *evt)
 evt->flags |= DWC3_EVENT_PENDING;

 /* Mask interrupt */
 -   reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
 reg |= DWC3_GEVNTSIZ_INTMASK;
 dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);

 --
 2.11.0

>>> This seems like a workaround, while when we mask interrupts we should
>>> not get IRQ before BH will done. Current driver implementation base on
>>> this.
>>> We are using 2.60a and didn't see problem you describe. Is it specyfic
>>> for some HW revision?
>>
>> Doesn't seem to be. We can try other HW versions.
>>
>>>
>>> I can imagine we can have the "last" interrupt and you will return
>>> IRQ_HANDLED. Will HW issue this interrupt once again? If not will we
>>> loose this IRQ (eg. disconnect event)?
>>>
>>
>> Yes it will re-assert. The interrupt line will remain asserted as long
>> events remain in the buffer and it is not masked. So when we
>> eventually unmask, the interrupt will be reasserted again immediately.
>>
>>> BTW, other option here could be using (like Baolin propose some time
>>> ago) dwc->lock in top half while this is held for BH.
>>> Question how long spin_lock() will be held in top half...
>>> I am not sure what option is the best.
>>
>> That won't help in this case since you can still have two calls top
>> top-half before a call to bottom. The lock only prevents the two from
>> stepping on each other, but that should already be the case without
>> needing the lock.
>>
> Really can we have two TH calls before BH?
> Interesting case :)

That's what we seem to be seeing. I'm not sure if it is normal or not.

>
> You suggest we have something like this:
> dwc3 IRQ
>kernel irq_mask
>   dwc3 TH
>  mask dwc3 interrupts
>  get evt->count, evt->cache
>  write to hw (count)
>  return WAKE_THREAD
>kernel irq_unmask
> a) Before BH start we get another dwc3 IRQ?
> b) CPU0 starts with BH while we get dwc3 IRQ on CPU1?
>
> *) in normal case BH should start and in the end unmask
> dwc3_interrupts and after that we could get new irq
>
> If this could happen then for sure you need dwc->lock protection in TH
> while base on your description CPU0 can execute BH and CPU1 can handle
> TH - so you have to protect dwc3_event_buffer.
>
> Could you describe this more? How to reproduce?

We can reproduce by running a file transfer continuously. It fails
fairly reliably, but it can take a while to hit the condition. We are
using our PCIe based HAPS FPGA on x86_64

Re: [PATCH v3 3/6] usb: usbip tool: Check the return of get_nports()

2017-04-10 Thread Yuyang Du
Hi Shuah,

Could you please take a look at these patches? I got more patches
that sit on these to send out.

Thanks,
Yuyang

On Thu, Apr 06, 2017 at 06:03:24AM +0800, Yuyang Du wrote:
> If we get nonpositive number of ports, there is no sense to
> continue, then fail gracefully.
> 
> In addition, the commit 0775a9cbc694e8c72 ("usbip: vhci extension:
> modifications to vhci driver") introduced configurable numbers of
> controllers and ports, but we have a static port number maximum,
> MAXNPORT. If exceeded, the idev array will be overflown. We fix
> it by validating the nports to make sure the port number max is
> not exceeded.
> 
> Signed-off-by: Yuyang Du 
> ---
>  tools/usb/usbip/libsrc/vhci_driver.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/usb/usbip/libsrc/vhci_driver.c 
> b/tools/usb/usbip/libsrc/vhci_driver.c
> index f659c14..151580a 100644
> --- a/tools/usb/usbip/libsrc/vhci_driver.c
> +++ b/tools/usb/usbip/libsrc/vhci_driver.c
> @@ -220,9 +220,17 @@ int usbip_vhci_driver_open(void)
>   }
>  
>   vhci_driver->nports = get_nports();
> -
>   dbg("available ports: %d", vhci_driver->nports);
>  
> + if (vhci_driver->nports <=0) {
> + err("no available ports");
> + goto err;
> + }
> + else if (vhci_driver->nports > MAXNPORT) {
> + err("port number exceeds %d", MAXNPORT);
> + goto err;
> + }
> +
>   if (refresh_imported_device_list())
>   goto err;
>  
> -- 
> 2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] usb: dwc3: gadget: Prevent losing events in event cache

2017-04-10 Thread Felipe Balbi

Hi,

John Youn  writes:
>>> Yes it will re-assert. The interrupt line will remain asserted as long
>>> events remain in the buffer and it is not masked. So when we
>>> eventually unmask, the interrupt will be reasserted again immediately.
>>>
 BTW, other option here could be using (like Baolin propose some time
 ago) dwc->lock in top half while this is held for BH.
 Question how long spin_lock() will be held in top half...
 I am not sure what option is the best.
>>>
>>> That won't help in this case since you can still have two calls top
>>> top-half before a call to bottom. The lock only prevents the two from
>>> stepping on each other, but that should already be the case without
>>> needing the lock.
>>>
>> Really can we have two TH calls before BH?
>> Interesting case :)
>
> That's what we seem to be seeing. I'm not sure if it is normal or not.

no, that's not normal. If this is happening, there's either a HW bug
somewhere, or we're not clearing events properly.

Can you provide tracepoints of the failure? Perhaps add a trace_printk()
call on both BH and TH just so we see when they're both called.

>> You suggest we have something like this:
>> dwc3 IRQ
>>kernel irq_mask
>>   dwc3 TH
>>  mask dwc3 interrupts
>>  get evt->count, evt->cache
>>  write to hw (count)
>>  return WAKE_THREAD
>>kernel irq_unmask
>> a) Before BH start we get another dwc3 IRQ?
>> b) CPU0 starts with BH while we get dwc3 IRQ on CPU1?
>>
>> *) in normal case BH should start and in the end unmask
>> dwc3_interrupts and after that we could get new irq
>>
>> If this could happen then for sure you need dwc->lock protection in TH
>> while base on your description CPU0 can execute BH and CPU1 can handle
>> TH - so you have to protect dwc3_event_buffer.
>>
>> Could you describe this more? How to reproduce?
>
> We can reproduce by running a file transfer continuously. It fails
> fairly reliably, but it can take a while to hit the condition. We are
> using our PCIe based HAPS FPGA on x86_64 using mainline kernel.

Okay, please provide tracepoints of the problem in question.

> Thinh can provide which IP versions we tried with.

Thank you

>> Do you think we introduce this when adding evt->cache in TH?
>> Even without evt->cache we still could overwrite evt->count - so, was
>> that seen before evt->cache?
>
> The multiple TH calls could have happened even before we introduced
> evt->cache, but only with the cache would it have caused a failure due
> to overwriting the cached events on multiple calls.

Right, multiple TH calls should NEVER happen, because our IRQ line is
masked. Unless, and this is a long shot, IRQ line is shared and ANOTHER
device is causing IRQ to be raised. Can you show the output of:

# grep dwc3 /proc/interrupts

If another device raises the interrupt, then we will get into our TH,
however we should just return IRQ_HANDLED in that case because we
shouldn't be generating events.

Hmm, can you apply this patch and provide output when the failure
happens? I suggest setting up a 100MiB trace buffer. You don't need to
enable any tracers nor any event. trace_printk() is always enabled.

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 6f6f0b3be3ad..3b8185d81f9f 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -3005,6 +3005,7 @@ static irqreturn_t dwc3_thread_interrupt(int irq, void 
*_evt)
unsigned long flags;
irqreturn_t ret = IRQ_NONE;
 
+   trace_printk("BH\n");
spin_lock_irqsave(&dwc->lock, flags);
ret = dwc3_process_event_buf(evt);
spin_unlock_irqrestore(&dwc->lock, flags);
@@ -3019,6 +3020,8 @@ static irqreturn_t dwc3_check_event_buf(struct 
dwc3_event_buffer *evt)
u32 count;
u32 reg;
 
+   trace_printk("evt->flags %08x\n", evt->flags);
+
if (pm_runtime_suspended(dwc->dev)) {
pm_runtime_get(dwc->dev);
disable_irq_nosync(dwc->irq_gadget);
@@ -3027,7 +3030,9 @@ static irqreturn_t dwc3_check_event_buf(struct 
dwc3_event_buffer *evt)
}
 
count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
+   trace_printk("GEVNTCOUNT %08x\n", count);
count &= DWC3_GEVNTCOUNT_MASK;
+   trace_printk("%u events pending\n", count);
if (!count)
return IRQ_NONE;
 
@@ -3036,10 +3041,12 @@ static irqreturn_t dwc3_check_event_buf(struct 
dwc3_event_buffer *evt)
 
/* Mask interrupt */
reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
+   trace_printk("GEVNTSIZ %08x\n", reg);
reg |= DWC3_GEVNTSIZ_INTMASK;
dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
 
amount = min(count, evt->length - evt->lpos);
+   trace_printk("copying events to cache\n");
memcpy(evt->cache + evt->lpos, evt->buf + evt->lpos, amount);
 
if (amount < count)
@@ -3053,7 +3060,7 @@ static irqreturn_t dwc3_check_event_buf(struct 
dwc3_event_buffer *evt)
 static ir