Re: [PATCH] usb: misc: usbtest: fix pattern tests for scatterlists.

2016-05-04 Thread Mathias Nyman

On 03.05.2016 17:37, Alan Stern wrote:

On Tue, 3 May 2016, Felipe Balbi wrote:


Hi,

Alan Stern  writes:

On Mon, 2 May 2016, Mathias Nyman wrote:


The current implemenentation restart the sent pattern for each entry in
the sg list. The receiving end expects a continuous pattern, and test
will fail unless scatterilst entries happen to be aligned with the
pattern


Ah.  We didn't spot this earlier because for non-xHCI controllers, the
scatterlist entries _must_ be aligned with the maxpacket size and
therefore with the pattern.


interesting. We actually found a similar issue with XHCI. scatterlist
has to be aligned to wMaxPacketSize but only before a link TRB. Mathias
has been working on a solution which involves memcpy()ing enough bytes
to align to wMaxPacketSize before the link TRB (it's very infrequent as
we have 256 TRBs per segment on XHCI), but if you know of a nicer way,
we're all ears :-)


You should be able to handle this without memcpy'ing anything.

If the individual scatterlist entries are large enough, you can simply
end the ring segment early.  Either put a link TRB before the physical
segment end, at the last point where the cumulative transfer size is
divisible by maxpacket, or else fill out from there to the end of the
segment with no-op TRBs.  Common case: Each scatterlist entry is a
multiple of 512 bytes and the maxpacket size is 1024.  Then you either
have to split the last entry in two or move it completely into the next
ring segment.

This approach doesn't work quite as well if the scatterlist entries are
very small.  For instance, if they are 8 bytes or smaller then you
might have to fill out the segment with 128 or more no-op TRBs, which
is not so good if the segment can hold only 256 TRBs.  I suppose we
could simply rule this out by requiring SG transfers to have a minimum
entry size of 128 bytes, or something like that.



First idea involved 3 steps to align the last trb.
1. If possible split the trb, doable if it contains a maxpacket boundary, else
2. Roll back to a maxpacket boundary trb, split it and set link trb mid segment.
3. If that is not possible use a bounce buffer.

While working on step 2 in noticed that I was rewriting way too much of xhci,
probably creating more regressions than solving alignment issues,
so current implementation just tries to split the trb, if it fails it will
use a bounce buffer.

And the bounce buffer part was a lot simpler than I first expected, I only
need to memcpy up to maxpacket size bytes in the worst case, and I can allocate 
a
maxpacket size bounce buffer in advance for every ring segment. I only need
to do a memcpy and dma map/unmap if bounce buffers are needed.

so far it passes all usbtest out tests, even with varying size scatterlists
with a small max side. In tranfsers are not yet ready

Once this is working, it can be optimized later with adding the mid-segment link
trb solution.

I'll post it as a RFC once I get the the in direction working

I hope most cases can be resolved by just splitting the TRB at maxpacket 
boundary.

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: xhci_handle_command_timeout and wait_for_completion

2016-05-09 Thread Mathias Nyman

On 06.05.2016 23:32, Joe Lawrence wrote:

Hello Mathias,

In c311e391a7ef "xhci: rework command timeout and cancellation," xHCI
command timeouts were refactored to flow through
xhci_handle_command_timeout.  We've seen a few instances of
hangs/crashes with upstream and RHEL7.2 kernels where someone gets stuck
on wait_for_completion(cmd->completion) when the underlying host
controller has been hotplug removed.

Without wait_for_completion_interruptible_timeout, I was wondering how
xhci_handle_command_timeout would handle scenarios that avoid the code
inside the ESHUTDOWN case below:

   void xhci_handle_command_timeout(unsigned long data)
   {
   struct xhci_hcd *xhci;
   int ret;
   unsigned long flags;
   u64 hw_ring_state;
   struct xhci_command *cur_cmd = NULL;
   xhci = (struct xhci_hcd *) data;

   /* mark this command to be cancelled */
   spin_lock_irqsave(&xhci->lock, flags);
   if (xhci->current_cmd) {
   cur_cmd = xhci->current_cmd;
   cur_cmd->status = COMP_CMD_ABORT;
   }


   /* Make sure command ring is running before aborting it */
   hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
   if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
   (hw_ring_state & CMD_RING_RUNNING))  {

   spin_unlock_irqrestore(&xhci->lock, flags);
   xhci_dbg(xhci, "Command timeout\n");
   ret = xhci_abort_cmd_ring(xhci);
   if (unlikely(ret == -ESHUTDOWN)) {

xhci_err(xhci, "Abort command ring failed\n");
xhci_cleanup_command_queue(xhci);
usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
xhci_dbg(xhci, "xHCI host controller is dead.\n");

   }
   return;
   }
   /* command timeout on stopped ring, ring can't be aborted */
   xhci_dbg(xhci, "Command timeout on stopped ring\n");
   xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
   spin_unlock_irqrestore(&xhci->lock, flags);
   return;
   }


When xhci_abort_cmd_ring returns -ESHUTDOWN, it's clear that
xhci_cleanup_command_queue will iterate through the cmd_list, calling
complete on the cmd->completion variable when necessary.

I was uncertain about cases that don't fall into the -ESHUTDOWN path.
By what mechanism will they (re)timeout or hit handle_cmd_completion?
What happens if the device were to be removed before then?

To give you an idea of the kinds of lockups we're seeing, here's an
example from our RHEL7.2 machine.  PID 4892 wants the dev->mutex that
PID 44765 has.  The latter is waiting for a command completion and the
former grows tired and emits a hung task warning:

   INFO: task java:4892 blocked for more than 120 seconds.
   ...
   PID: 4892   TASK: 880852305080  CPU: 42  COMMAND: "java"
#0 [880fed2078d0] __schedule at 8163a78d
#1 [880fed207938] schedule_preempt_disabled at 8163bf09
#2 [880fed207948] __mutex_lock_slowpath at 81639c05
#3 [880fed2079a8] mutex_lock at 8163906f
#4 [880fed2079c0] usb_disconnect at 81438957
#5 [880fed207a08] usb_remove_hcd at 8143e3e1
#6 [880fed207a38] usb_hcd_pci_remove at 814509ff
#7 [880fed207a60] xhci_pci_remove at 81456901
#8 [880fed207a80] pci_device_remove at 81328d9b
#9 [880fed207aa8] __device_release_driver at 813f60ef
   #10 [880fed207ac8] device_release_driver at 813f6183
   #11 [880fed207ae8] pci_stop_bus_device at 81322104
   #12 [880fed207b10] pci_stop_bus_device at 813220ab
   #13 [880fed207b38] pci_stop_bus_device at 813220ab
   #14 [880fed207b60] pci_stop_and_remove_bus_device at 813221f2
   #15 [880fed207b78] ioboard_bringdown at a05eb4c7 [ftmod]
   #16 [880fed207ba0] ftmod_ioboard_event at a05eb7c6 [ftmod]
   #17 [880fed207c28] io_state_change at a05e92d1 [ftmod]
   #18 [880fed207c78] OsIoStateChange at a06ba2e4 [ccmod]
   #19 [880fed207c90] IoStateChange at a06bad18 [ccmod]
   #20 [880fed207dc0] CcIoBringDownEx at a06baee0 [ccmod]
   #21 [880fed207e20] CcIoBringDownEx at a05e54db [ftmod]
   #22 [880fed207e48] cmd_bringdown at a05e68b8 [ftmod]
   #23 [880fed207e60] ctl_ioctl at a05e6c22 [ftmod]
   #24 [880fed207eb8] do_vfs_ioctl at 811f1f15
   #25 [880fed207f30] sys_ioctl at 811f2191
   #26 [880fed207f80] system_call_fastpath at 81645e89
   RIP: 7f88860cd257  RSP: 7f87a0d323d8  RFLAGS: 00010206
   RAX: 0010  RBX: 81645e89  RCX: 
   RDX: 7f87a0d32480  RSI:   RDI: 

Re: xhci_handle_command_timeout and wait_for_completion

2016-05-11 Thread Mathias Nyman

On 10.05.2016 17:04, Joe Lawrence wrote:

On 05/09/2016 06:18 AM, Mathias Nyman wrote:

On 06.05.2016 23:32, Joe Lawrence wrote:

...snip...

Given that the default command timeout is 5 seconds, it seems strange to
hit a 120 second hung task warning in this instance.  I can only think
that maybe something goofy is going on with xhci_handle_command_timeout
and an unfortunately timed host controller removal.




Idea is that when xhci is removed xhci_mem_cleanup() will take care of
the pending completions.

xhci_mem_cleanup()
   xhci_cleanup_command_queue()
  list_for_each_entry_safe(cur_cmd, tmp_cmd, &xhci->cmd_list, cmd_list)
 xhci_complete_del_and_free_cmd(cur_cmd, COMP_CMD_ABORT);
if (cmd->completion) {
 complete(cmd->completion);


xhci_stop will want the xhci->mutex when it calls xhci_mem_cleanup.
What happens if someone like xhci_alloc_dev has that mutex while it's
waiting for TRB_ENABLE_SLOT to complete?


Ah, now I got your concern.
I focused too much on the crash instead of the explanation.

Yes, its possible that case could happen. If xhci_alloc_dev() is
waiting for completion with mutex held while host is hotplug removed,
then command will timeout.

xhci_handle_command_timeout() might think command ring is not running
and just try to turn the command to no-op, and restart the command ring.
Host is removed so command ring will never start, and completion is never 
called.

There is another related issue when aborting the command ring never
genertes a cmd completion event, and we never call completion in that case 
either,
but thats was related to host dying completely.

I'll start writing a patch for both these cases

-Mathias


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFT PATCH] xhci: Fix handling timeouted commands on hosts in weird states.

2016-05-12 Thread Mathias Nyman
If commands timeout we mark them for abortion, then stop the command
ring, and turn the commands to no-ops and finally restart the command
ring.

If the host is working properly the no-op commands will finish and
pending completions are called.
If we notice the host is failing driver clears the command ring and
completes, deletes and frees all pending commands.

There are two separate cases reported where host is believed to work
properly but is not. In the first case we successfully stop the ring
but no abort or stop commnand ring event is ever sent and host locks up.

The second case is if a host is removed, command times out and driver
believes the ring is stopped, and assumes it be restarted, but actually
ends up timing out on the same command forever.
If one of the pending commands has the xhci->mutex held it will block
xhci_stop() in the remove codepath which otherwise would cleanup pending
commands.

Add a check that clears all pending commands in case host is removed,
or we are stuck timeouting on the same command. Also restart the
command timeout timer when stopping the command ring to ensure we
recive an ring stop/abort event.

Cc: stable 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52deae4..c82570d 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -290,6 +290,14 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
 
temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
xhci->cmd_ring_state = CMD_RING_STATE_ABORTED;
+
+   /*
+* Writing the CMD_RING_ABORT bit should cause a cmd completion event,
+* however on some host hw the CMD_RING_RUNNING bit is correctly cleared
+* but the completion event in never sent. Use the cmd timeout timer to
+* handle those cases. Use twice the time to cover the bit polling retry
+*/
+   mod_timer(&xhci->cmd_timer, jiffies + (2 * XHCI_CMD_DEFAULT_TIMEOUT));
xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
&xhci->op_regs->cmd_ring);
 
@@ -314,6 +322,7 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
 
xhci_err(xhci, "Stopped the command ring failed, "
"maybe the host is dead\n");
+   del_timer(&xhci->cmd_timer);
xhci->xhc_state |= XHCI_STATE_DYING;
xhci_quiesce(xhci);
xhci_halt(xhci);
@@ -1246,22 +1255,21 @@ void xhci_handle_command_timeout(unsigned long data)
int ret;
unsigned long flags;
u64 hw_ring_state;
-   struct xhci_command *cur_cmd = NULL;
+   bool second_timeout = false;
xhci = (struct xhci_hcd *) data;
 
/* mark this command to be cancelled */
spin_lock_irqsave(&xhci->lock, flags);
if (xhci->current_cmd) {
-   cur_cmd = xhci->current_cmd;
-   cur_cmd->status = COMP_CMD_ABORT;
+   if (xhci->current_cmd->status == COMP_CMD_ABORT)
+   second_timeout = true;
+   xhci->current_cmd->status = COMP_CMD_ABORT;
}
 
-
/* Make sure command ring is running before aborting it */
hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
(hw_ring_state & CMD_RING_RUNNING))  {
-
spin_unlock_irqrestore(&xhci->lock, flags);
xhci_dbg(xhci, "Command timeout\n");
ret = xhci_abort_cmd_ring(xhci);
@@ -1273,6 +1281,15 @@ void xhci_handle_command_timeout(unsigned long data)
}
return;
}
+
+   /* command ring failed to restart, or host removed. Bail out */
+   if (second_timeout || xhci->xhc_state & XHCI_STATE_REMOVING) {
+   spin_unlock_irqrestore(&xhci->lock, flags);
+   xhci_dbg(xhci, "command timed out twice, ring start fail?\n");
+   xhci_cleanup_command_queue(xhci);
+   return;
+   }
+
/* command timeout on stopped ring, ring can't be aborted */
xhci_dbg(xhci, "Command timeout on stopped ring\n");
xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: xhci DWC3 flavor problem

2016-05-19 Thread Mathias Nyman

On 19.05.2016 14:23, Joao Pinto wrote:

Hi Felipe,

On 5/19/2016 11:32 AM, Felipe Balbi wrote:


Hi,


Note that we really did get a command timeout. Can you add a little
extra debugging to try and figure out why that command failed?



After instrumenting and capturing FPGA signals, the driver could go a bit
further... Could this be a FPGA timming issue?


..


xhci-hcd xhci-hcd.0.auto: Timeout while waiting for setup device command
usb 3-1: hub failed to enable device, error -62
xhci-hcd xhci-hcd.0.auto: Endpoint 0x0 ep reset callback called
xhci-hcd xhci-hcd.0.auto: xHCI dying or halted, can't queue_command
xhci-hcd xhci-hcd.0.auto: FIXME: allocate a command ring segment
usb usb3-port1: couldn't allocate usb_device

Joao


Does the patch from Chris Bainbridge help?
It's currently only Gregs tree in the usb-next branch.

It fixes a locking issue where hw can't handle several ports being in default 
state at
the same time, and setup device command timeout issue when both usb2 and usb3 
devices
try to enumerate at the same time.

commit feb26ac31a2a5cb88d86680d9a94916a6343e9e6
usb: core: hub: hub_port_init lock controller instead of bus

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB 3.1 Gen 2 - 4.6 mainline broken

2016-05-20 Thread Mathias Nyman

Hi

I'm adding the mailing list to this reply
(removed your contact info)

USB 3.1 Gen 2 in 4.6 is not broken just because device is not seen as
SuperSpeedPlus 10Gbps device.

On 20.05.2016 02:04, Everett Wenzel wrote:

Hello,

I noticed that an FPGA was used to develop most of the USB3.1 Gen2 code.
This appears to be broken with actual devices. I have 2 startech docks,
and 2 startech hdd enclosures. I am also testing with an akitio gen2 ssd
tray.

When testing speeds with 3.0 and 3.1 ports I do notice a difference in
speeds. But dmesg and lsusb -t will never enumerate a device above
Gen1(5000m cap). It always states "new SuperSpeed device number *" in
dmesg.  I believe the code itself is broken so I was reaching out to you
for some advice, as this subject is still new and unexplored in depth.

Motherboard: https://www.asus.com/us/Motherboards/Z97AUSB_31/
Atikio: https://www.akitio.com/adapters/neutrino-bridge
Startech:
https://www.startech.com/HDD/Docking/2-bay-usb-3-1-gen-2-sata-dock~SDOCK2U
313



All parts in the chain need to support usb3.1 Gen2 capability before the speed 
of a
device is seen as SuperSpeedPlus.

1. First the xhci host controller SBRN register (Serial Bus Release Number) must
  be 3.1 (0x31). If this is true you will see this info message during boot:
  "Host supports USB 3.1 Enhanced SuperSpeed"

2. The supported protocols for ports listed in xhci extended capabilities need
   to have major and minor revisions set to 3 and 1. (3.1)
   If the xhci host has custom speed mappings (PSIC is set) then they must 
include
   the necessary settings for 3.1 10Gbps support (LP = 1 etc). If not then 
default
   speed mappings are set. Default mappings includ SSP 10Gbps support.
   The major and minor revisions can be seen with lsusb -v,
   (bcdUSB = 3.10 in device descriptor)
   If you compile the latest lsusb from sources it will also show the 
SuperSpeedPlus
device capability with the supported speeds.

3. The device itself needs to be 3.1 capable, bcdUSB = 3.1 and it needs to have 
the
   SuperSpeedPlus device capability with the correct speeds supported.

So far I've seen two 3.1 Gen2 host controllers with SBRN set to 0x30 instead of 
0x31
So that could be a likely issue.

If this is becoming a trend we might need to work around it, or ignore checking 
sbrn.

sample outputs:
# ./lsusb -t
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 1M
|__ Port 1: Dev 2, If 0, Class=, Driver=usb-storage, 1M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
|__ Port 7: Dev 2, If 0, Class=, Driver=, 12M
|__ Port 7: Dev 2, If 1, Class=, Driver=, 12M


** usb 3.1 root hub: **
#./lsusb -v
Bus 004 Device 001: ID 1d6b:0003
Device Descriptor:
  bLength18
  bDescriptorType 1
  bcdUSB   3.10
  bDeviceClass9
  bDeviceSubClass 0
  bDeviceProtocol 3
  ...
Hub Descriptor:
  ...
Binary Object Store Descriptor:
  ...
  SuperSpeed USB Device Capability:
  ...
  SuperSpeedPlus USB Device Capability:
bLength28
bDescriptorType16
bDevCapabilityType 10
bmAttributes 0x0023
  Sublink Speed Attribute count 3
  Sublink Speed ID count 1
wFunctionalitySupport   0x0001
bmSublinkSpeedAttr[0]   0x00050034
  Speed attr ID: 4 5Gb/s SuperSpeed
bmSublinkSpeedAttr[1]   0x000500b4
  Speed attr ID: 4 5Gb/s SuperSpeed
bmSublinkSpeedAttr[2]   0x000a4035
  Speed attr ID: 5 10Gb/s SuperSpeedPlus
bmSublinkSpeedAttr[3]   0x000a40b5
  Speed attr ID: 5 10Gb/s SuperSpeedPlus

** usb 3.1 mass storage device **
#./lsusb -v
Bus 004 Device 002: ID 174c:1351
Device Descriptor:
  bLength18
  bDescriptorType 1
  bcdUSB   3.10
  ...
  iManufacturer   2 Generic
  iProduct3 USB3.1 Device
Binary Object Store Descriptor:
  USB 2.0 Extension Device Capability:
  SuperSpeed USB Device Capability:
...
  SuperSpeedPlus USB Device Capability:
bLength20
bDescriptorType16
bDevCapabilityType 10
bmAttributes 0x0001
  Sublink Speed Attribute count 1
  Sublink Speed ID count 0
wFunctionalitySupport   0x1100
bmSublinkSpeedAttr[0]   0x000a4030
  Speed attr ID: 0 10Gb/s SuperSpeedPlus
bmSublinkSpeedAttr[1]   0x000a40b0
  Speed attr ID: 0 10Gb/s SuperSpeedPlus

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: xhci DWC3 flavor problem

2016-05-20 Thread Mathias Nyman

On 19.05.2016 18:42, Joao Pinto wrote:


After a few moments the schedule problem happen again:

# INFO: task kworker/0:1:349 blocked for more than 120 seconds.
   Not tainted 4.6.0-rc5 #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1 D ff8008086c60 0   349  2 0x
Workqueue: usb_hub_wq hub_event
Call trace:
[] __switch_to+0xc8/0xd4
[] __schedule+0x18c/0x5c8
[] schedule+0x38/0x98
[] schedule_timeout+0x160/0x1ac
[] wait_for_common+0xac/0x150
[] wait_for_completion+0x14/0x1c
[] xhci_alloc_dev+0xf4/0x2a0
[] usb_alloc_dev+0x68/0x2cc
[] hub_event+0x784/0x11f4
[] process_one_work+0x130/0x2f4
[] worker_thread+0x54/0x434
[] kthread+0xd4/0xe8
[] ret_from_fork+0x10/0x40
INFO: task kworker/0:1:349 blocked for more than 120 seconds.
   Not tainted 4.6.0-rc5 #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1 D ff8008086c60 0   349  2 0x
Workqueue: usb_hub_wq hub_event
Call trace:
[] __switch_to+0xc8/0xd4
[] __schedule+0x18c/0x5c8
[] schedule+0x38/0x98
[] schedule_timeout+0x160/0x1ac
[] wait_for_common+0xac/0x150
[] wait_for_completion+0x14/0x1c
[] xhci_alloc_dev+0xf4/0x2a0
[] usb_alloc_dev+0x68/0x2cc
[] hub_event+0x784/0x11f4
[] process_one_work+0x130/0x2f4
[] worker_thread+0x54/0x434
[] kthread+0xd4/0xe8
[] ret_from_fork+0x10/0x40

So Chris' patch did not solve this problem either.

Thanks.


Looks like there still are some issue in cleaning up
pending commands after host dies.

This shouldn't happen, when host dies we clean up the command
ring and call completion for all pending commands, we set the state
to DISABLED or HALTED to prevent new commands from being queued.

But this is all long after the first command times out, ring abortion
fails, and we kill the host, right? So the bigger concern for you is
why the command never complete in the first place

-Mathias





--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 6/6] xhci: TD-fragment, align the unsplittable case with a bounce buffer

2016-05-26 Thread Mathias Nyman
If the last trb before a link is not packet size aligned, and is not
splittable then use a bounce buffer for that chunk of max packet size
unalignable data.

Allocate a max packet size bounce buffer for every segment of a bulk
endpoint ring at the same time as allocating the ring.
If we need to align the data before the link trb in that segment then
copy the data to the segment bounce buffer, dma map it, and enqueue it.
Once the td finishes, or is cancelled, unmap it.

For in transfers we need to first map the bounce buffer, then queue it,
after it finishes, copy the bounce buffer to the original sg list, and
finally unmap it

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-mem.c  |  76 ++---
 drivers/usb/host/xhci-ring.c | 113 ++-
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  11 -
 4 files changed, 164 insertions(+), 41 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index bad0d1f..08621b7 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -37,7 +37,9 @@
  * "All components of all Command and Transfer TRBs shall be initialized to 
'0'"
  */
 static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci,
-   unsigned int cycle_state, gfp_t flags)
+  unsigned int cycle_state,
+  unsigned int max_packet,
+  gfp_t flags)
 {
struct xhci_segment *seg;
dma_addr_t  dma;
@@ -53,6 +55,14 @@ static struct xhci_segment *xhci_segment_alloc(struct 
xhci_hcd *xhci,
return NULL;
}
 
+   if (max_packet) {
+   seg->bounce_buf = kzalloc(max_packet, flags | GFP_DMA);
+   if (!seg->bounce_buf) {
+   dma_pool_free(xhci->segment_pool, seg->trbs, dma);
+   kfree(seg);
+   return NULL;
+   }
+   }
/* If the cycle state is 0, set the cycle bit to 1 for all the TRBs */
if (cycle_state == 0) {
for (i = 0; i < TRBS_PER_SEGMENT; i++)
@@ -70,6 +80,7 @@ static void xhci_segment_free(struct xhci_hcd *xhci, struct 
xhci_segment *seg)
dma_pool_free(xhci->segment_pool, seg->trbs, seg->dma);
seg->trbs = NULL;
}
+   kfree(seg->bounce_buf);
kfree(seg);
 }
 
@@ -317,11 +328,11 @@ static void xhci_initialize_ring_info(struct xhci_ring 
*ring,
 static int xhci_alloc_segments_for_ring(struct xhci_hcd *xhci,
struct xhci_segment **first, struct xhci_segment **last,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_segment *prev;
 
-   prev = xhci_segment_alloc(xhci, cycle_state, flags);
+   prev = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!prev)
return -ENOMEM;
num_segs--;
@@ -330,7 +341,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
while (num_segs > 0) {
struct xhci_segment *next;
 
-   next = xhci_segment_alloc(xhci, cycle_state, flags);
+   next = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!next) {
prev = *first;
while (prev) {
@@ -360,7 +371,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
  */
 static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd *xhci,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_ring*ring;
int ret;
@@ -370,13 +381,15 @@ static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd 
*xhci,
return NULL;
 
ring->num_segs = num_segs;
+   ring->bounce_buf_len = max_packet;
INIT_LIST_HEAD(&ring->td_list);
ring->type = type;
if (num_segs == 0)
return ring;
 
ret = xhci_alloc_segments_for_ring(xhci, &ring->first_seg,
-   &ring->last_seg, num_segs, cycle_state, type, flags);
+   &ring->last_seg, num_segs, cycle_state, type,
+   max_packet, flags);
if (ret)
goto fail;
 
@@ -470,7 +483,8 @@ int xhci_ring_expansion(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
ring->num_segs : num_segs_needed;
 
ret = xhci_alloc_segments_for_ring(xhci, &first, &last,
-  

[RFC PATCH 1/6] xhci: rename ep_ring variable in queue_bulk_tx(), no functional change

2016-05-26 Thread Mathias Nyman
Tiny change, a bit more readable.
The real reason for this change is that the coming td fragment work
had several over 80 lines character lines split just because of a few
extra characters in variable names.

no functional changes

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52deae4..c3e9a60 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3102,7 +3102,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
 {
-   struct xhci_ring *ep_ring;
+   struct xhci_ring *ring;
struct urb_priv *urb_priv;
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
@@ -3111,14 +3111,13 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
bool zero_length_needed;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len;
-   unsigned int full_len;
+   unsigned int running_total, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
 
-   ep_ring = xhci_urb_to_transfer_ring(xhci, urb);
-   if (!ep_ring)
+   ring = xhci_urb_to_transfer_ring(xhci, urb);
+   if (!ring)
return -EINVAL;
 
/* If we have scatter/gather list, we use it. */
@@ -3159,8 +3158,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 * until we've finished creating all the other TRBs.  The ring's cycle
 * state may change as we enqueue the other TRBs, so save it too.
 */
-   start_trb = &ep_ring->enqueue->generic;
-   start_cycle = ep_ring->cycle_state;
+   start_trb = &ring->enqueue->generic;
+   start_cycle = ring->cycle_state;
 
full_len = urb->transfer_buffer_length;
running_total = 0;
@@ -3199,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
-   field |= ep_ring->cycle_state;
+   field |= ring->cycle_state;
 
/* Chain all the TRBs together; clear the chain bit in the last
 * TRB to indicate it's the last TRB in the chain.
@@ -3209,10 +3208,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
} else {
field |= TRB_IOC;
if (i == last_trb_num)
-   td->last_trb = ep_ring->enqueue;
+   td->last_trb = ring->enqueue;
else if (zero_length_needed) {
trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ep_ring->enqueue;
+   urb_priv->td[1]->last_trb = ring->enqueue;
}
}
 
@@ -3233,7 +3232,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
more_trbs_coming = true;
else
more_trbs_coming = false;
-   queue_trb(xhci, ep_ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 0/6] xhci: conform with xhci td fragment constraints

2016-05-26 Thread Mathias Nyman
This patchseries makes sure we follow the td fragmentation constraints
in xhci 4.11.7.1 for bulk tranfers.

The specs say that:
"constraints must be imposed to ensure that the hardware gate count and
validation requirements are minimized for xHC implementations"

The constraint we need to conform to is to make sure the payload data is
packet aligned in the last trb we queue to a segment before a link trb
moves us to the next segment.

In most usecases the payload data is packet aligned, but cases where
scatterlists with several small enrtries are used may hit this issue.

Running "testusb -a -t 7 -s 2048 -v 41 -c 5000" was able to hang the host
after minutes of testing because of unaligned payload data.
After applying this series testusb ran without issues for 4 days.

usb-eth network adapters can potentially hit this issue.

This solution tries to align the data by splitting the trb at a packet
boundary, if that is not possible then a bounce buffer is used.
Initially the plan was to try to insert a link trb mid segment and mid TD
to fulfill the constraints before resorting to a bounce buffer, but
it turned out it requires a lot of driver rewrite. That is an option
that can be implemented later after driver cleanup.

The first 4 patches are more generic cleanups and reworks needed for
the two last patches that actually makes sure data is aligned.

Mathias Nyman (6):
  xhci: rename ep_ring variable in queue_bulk_tx(), no functional change
  xhci: properly prepare zero packet TD after normal bulk TD.
  xhci: use boolean to indicate last trb in td remainder calculation
  xhci: don't rely on precalculated value of needed trbs in the enqueue
loop
  xhci: align the last trb before link if it is easily splittable.
  xhci: TD-fragment, align the unsplittable case with a bounce buffer

 drivers/usb/host/xhci-mem.c  |  76 -
 drivers/usb/host/xhci-ring.c | 265 +--
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  11 +-
 4 files changed, 240 insertions(+), 117 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 5/6] xhci: align the last trb before link if it is easily splittable.

2016-05-26 Thread Mathias Nyman
TD fragments section 4.11.7.1 in xhci specs have additional requirements
on how trbs in TDs must be organized.

TD fragments shall not span transfer ring segments and TD fragments must
be packet aligned. Normally we don't care about TD fragments, on TD is one
big fragment, but if a TD spans ring segments it will be treated as two
fragments, and we need to comply with the alignment requirements.

For us this means that the payload data must be packet aligned in the
last trb before a link trb.
In most mass storage bulk tranfers we are lucky as the block size aligns
nicely with packet size, and there are no issues.
However, usb network adapters using scatterlists can hit this alignment
issue, and usbtest in kernel triggers this in minutes.

This patch is a partial solution, it solves the easy case when the last
trb before the link trb contains a packet boundary.
If that is the case then just split the trb at the boundary.
If not, then just print a debug message and continue as we have always
done, hoping for the best

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d86da81..c7c9521 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3098,6 +3098,27 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return (total_packet_count - ((transferred + trb_buff_len) / maxp));
 }
 
+static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
+u32 *trb_buff_len)
+{
+   unsigned int unalign;
+   unsigned int max_pkt;
+
+   max_pkt = usb_endpoint_maxp(&urb->ep->desc); /*FIXME MATTU GET_MAX..? */
+   unalign = (enqd_len + *trb_buff_len) % max_pkt;
+
+   /* we got lucky, last normal TRB data on segment is packet aligned */
+   if (unalign == 0)
+   return 0;
+
+   /* is the last nornal TRB alignable by splitting it */
+   if (*trb_buff_len > unalign) {
+   *trb_buff_len -= unalign;
+   return 0;
+   }
+   return 1;
+}
+
 /* This is very similar to what ehci-q.c qtd_fill() does */
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
@@ -3180,6 +3201,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 */
if (enqd_len + trb_buff_len < full_len) {
field |= TRB_CHAIN;
+   if (last_trb(xhci, ring, ring->enq_seg,
+ring->enqueue + 1)) {
+   if (xhci_align_td(xhci, urb, enqd_len,
+&trb_buff_len))
+   xhci_dbg(xhci, "TRB align fail\n");
+   }
} else {
field |= TRB_IOC;
more_trbs_coming = false;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 3/6] xhci: use boolean to indicate last trb in td remainder calculation

2016-05-26 Thread Mathias Nyman
We only need to know if we are queuing the last trb for a TD when
calculating the td remainder field.
The total number of trbs left is not used.

We won't be able to trust the pre-calculated number of trbs used if we
need to align trb data by splitting or merging trbs in order to satisfy
comply with data alignment requirements in xhci specs section 4.11.7.1.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 15dd226..f74ac1c 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3074,7 +3074,7 @@ int xhci_queue_intr_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
  */
 static u32 xhci_td_remainder(struct xhci_hcd *xhci, int transferred,
  int trb_buff_len, unsigned int td_total_len,
- struct urb *urb, unsigned int num_trbs_left)
+ struct urb *urb, bool more_trbs_coming)
 {
u32 maxp, total_packet_count;
 
@@ -3083,7 +3083,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return ((td_total_len - transferred) >> 10);
 
/* One TRB with a zero-length data packet. */
-   if (num_trbs_left == 0 || (transferred == 0 && trb_buff_len == 0) ||
+   if (!more_trbs_coming || (transferred == 0 && trb_buff_len == 0) ||
trb_buff_len == td_total_len)
return 0;
 
@@ -3198,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   more_trbs_coming = need_zero_pkt;
+   more_trbs_coming = false;
td->last_trb = ring->enqueue;
}
 
@@ -3209,13 +3209,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
/* Set the TRB length, TD size, and interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
trb_buff_len, full_len,
-   urb, num_trbs - i - 1);
-
+   urb, more_trbs_coming);
length_field = TRB_LEN(trb_buff_len) |
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   queue_trb(xhci, ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming | need_zero_pkt,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
@@ -3639,7 +3638,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, 
gfp_t mem_flags,
/* Set the TRB length, TD size, & interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
   trb_buff_len, td_len,
-  urb, trbs_per_td - j - 1);
+  urb, more_trbs_coming);
 
length_field = TRB_LEN(trb_buff_len) |
TRB_INTR_TARGET(0);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 2/6] xhci: properly prepare zero packet TD after normal bulk TD.

2016-05-26 Thread Mathias Nyman
If a zero-length packet is needed after a bulk transfer, then an
additional zero length TD was prepared before enqueueing the bulk transfer
This set up the zero packet TD structure with incorrect td->start_seg
and td->first_trb pointers.

Prepare the zero packet TD after the data bulk TD is enqueued instead.
It sets these pointers correctly.

This change also simplifies unnecessary complexity related to keeping
track of the last trb when enqueuing trbs.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 38 +++---
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c3e9a60..15dd226 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3107,8 +3107,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
struct scatterlist *sg = NULL;
-   bool more_trbs_coming;
-   bool zero_length_needed;
+   bool more_trbs_coming = true;
+   bool need_zero_pkt = false;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
unsigned int running_total, block_len, trb_buff_len, full_len;
@@ -3139,17 +3139,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
last_trb_num = num_trbs - 1;
 
/* Deal with URB_ZERO_PACKET - need one more td/trb */
-   zero_length_needed = urb->transfer_flags & URB_ZERO_PACKET &&
-   urb_priv->length == 2;
-   if (zero_length_needed) {
-   num_trbs++;
-   xhci_dbg(xhci, "Creating zero length td.\n");
-   ret = prepare_transfer(xhci, xhci->devs[slot_id],
-   ep_index, urb->stream_id,
-   1, urb, 1, mem_flags);
-   if (unlikely(ret < 0))
-   return ret;
-   }
+   if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
+   need_zero_pkt = true;
 
td = urb_priv->td[0];
 
@@ -3207,12 +3198,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   if (i == last_trb_num)
-   td->last_trb = ring->enqueue;
-   else if (zero_length_needed) {
-   trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ring->enqueue;
-   }
+   more_trbs_coming = need_zero_pkt;
+   td->last_trb = ring->enqueue;
}
 
/* Only set interrupt on short packet for IN endpoints */
@@ -3228,10 +3215,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   if (i < num_trbs - 1)
-   more_trbs_coming = true;
-   else
-   more_trbs_coming = false;
queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
@@ -3253,6 +3236,15 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
}
}
 
+   if (need_zero_pkt) {
+   ret = prepare_transfer(xhci, xhci->devs[slot_id],
+  ep_index, urb->stream_id,
+  1, urb, 1, mem_flags);
+   urb_priv->td[1]->last_trb = ring->enqueue;
+   field = TRB_TYPE(TRB_NORMAL) | ring->cycle_state | TRB_IOC;
+   queue_trb(xhci, ring, 0, 0, 0, TRB_INTR_TARGET(0), field);
+   }
+
check_trb_math(urb, running_total);
giveback_first_trb(xhci, slot_id, ep_index, urb->stream_id,
start_cycle, start_trb);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 4/6] xhci: don't rely on precalculated value of needed trbs in the enqueue loop

2016-05-26 Thread Mathias Nyman
Queue trbs until all payload data in the urb is tranferred.

The actual number of trbs might need to change from the pre-calculated
number when the packet alignment restrictions for td fragments in
xhci 4.11.7.1 are taken into account.

Long term plan is to get rid of calculating the needed trbs in advance
all together. It's an unnecessary extra walk through the scatterlist.

This change also allows some bulk queue function simplifications

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 75 +---
 1 file changed, 29 insertions(+), 46 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f74ac1c..d86da81 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3109,9 +3109,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct scatterlist *sg = NULL;
bool more_trbs_coming = true;
bool need_zero_pkt = false;
-   unsigned int num_trbs, last_trb_num, i;
+   bool first_trb = true;
+   unsigned int num_trbs;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len, full_len;
+   unsigned int enqd_len, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
@@ -3120,14 +3121,19 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (!ring)
return -EINVAL;
 
+   full_len = urb->transfer_buffer_length;
/* If we have scatter/gather list, we use it. */
if (urb->num_sgs) {
num_sgs = urb->num_mapped_sgs;
sg = urb->sg;
+   addr = (u64) sg_dma_address(sg);
+   block_len = sg_dma_len(sg);
num_trbs = count_sg_trbs_needed(urb);
-   } else
+   } else {
num_trbs = count_trbs_needed(urb);
-
+   addr = (u64) urb->transfer_dma;
+   block_len = full_len;
+   }
ret = prepare_transfer(xhci, xhci->devs[slot_id],
ep_index, urb->stream_id,
num_trbs, urb, 0, mem_flags);
@@ -3136,8 +3142,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 
urb_priv = urb->hcpriv;
 
-   last_trb_num = num_trbs - 1;
-
/* Deal with URB_ZERO_PACKET - need one more td/trb */
if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
need_zero_pkt = true;
@@ -3152,40 +3156,20 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
start_trb = &ring->enqueue->generic;
start_cycle = ring->cycle_state;
 
-   full_len = urb->transfer_buffer_length;
-   running_total = 0;
-   block_len = 0;
-
/* Queue the TRBs, even if they are zero-length */
-   for (i = 0; i < num_trbs; i++) {
+   for (enqd_len = 0; enqd_len < full_len; enqd_len += trb_buff_len) {
field = TRB_TYPE(TRB_NORMAL);
 
-   if (block_len == 0) {
-   /* A new contiguous block. */
-   if (sg) {
-   addr = (u64) sg_dma_address(sg);
-   block_len = sg_dma_len(sg);
-   } else {
-   addr = (u64) urb->transfer_dma;
-   block_len = full_len;
-   }
-   /* TRB buffer should not cross 64KB boundaries */
-   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
-   trb_buff_len = min_t(unsigned int,
-   trb_buff_len,
-   block_len);
-   } else {
-   /* Further through the contiguous block. */
-   trb_buff_len = block_len;
-   if (trb_buff_len > TRB_MAX_BUFF_SIZE)
-   trb_buff_len = TRB_MAX_BUFF_SIZE;
-   }
+   /* TRB buffer should not cross 64KB boundaries */
+   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
+   trb_buff_len = min_t(unsigned int, trb_buff_len, block_len);
 
-   if (running_total + trb_buff_len > full_len)
-   trb_buff_len = full_len - running_total;
+   if (enqd_len + trb_buff_len > full_len)
+   trb_buff_len = full_len - enqd_len;
 
/* Don't change the cycle bit of the first TRB until later */
-   if (i == 0) {
+   if (first_trb) {
+   first_trb = false;
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
@@ -3194,7 +3178,7 @@ int xhci_que

Re: [RFC PATCH 6/6] xhci: TD-fragment, align the unsplittable case with a bounce buffer

2016-05-26 Thread Mathias Nyman

On 26.05.2016 14:32, Felipe Balbi wrote:


Hi,

Mathias Nyman  writes:

@@ -3098,24 +3136,66 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return (total_packet_count - ((transferred + trb_buff_len) / maxp));
  }




+   }
+   *trb_buff_len = new_buff_len;
+   seg->bounce_len = new_buff_len;
+   seg->bounce_offs = enqd_len;
+
+   xhci_dbg(xhci, "Bounce align, new buff len %d\n", *trb_buff_len);
+
+   /* FIXME MATTU make sure memory allocated memory is 64k aligned */

 ^
 do you wanna clean this up ?



Will do, new version coming

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 1/6] xhci: rename ep_ring variable in queue_bulk_tx(), no functional change

2016-05-26 Thread Mathias Nyman
Tiny change, a bit more readable.
The real reason for this change is that the coming td fragment work
had several over 80 lines character lines split just because of a few
extra characters in variable names.

no functional changes

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52deae4..c3e9a60 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3102,7 +3102,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
 {
-   struct xhci_ring *ep_ring;
+   struct xhci_ring *ring;
struct urb_priv *urb_priv;
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
@@ -3111,14 +3111,13 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
bool zero_length_needed;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len;
-   unsigned int full_len;
+   unsigned int running_total, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
 
-   ep_ring = xhci_urb_to_transfer_ring(xhci, urb);
-   if (!ep_ring)
+   ring = xhci_urb_to_transfer_ring(xhci, urb);
+   if (!ring)
return -EINVAL;
 
/* If we have scatter/gather list, we use it. */
@@ -3159,8 +3158,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 * until we've finished creating all the other TRBs.  The ring's cycle
 * state may change as we enqueue the other TRBs, so save it too.
 */
-   start_trb = &ep_ring->enqueue->generic;
-   start_cycle = ep_ring->cycle_state;
+   start_trb = &ring->enqueue->generic;
+   start_cycle = ring->cycle_state;
 
full_len = urb->transfer_buffer_length;
running_total = 0;
@@ -3199,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
-   field |= ep_ring->cycle_state;
+   field |= ring->cycle_state;
 
/* Chain all the TRBs together; clear the chain bit in the last
 * TRB to indicate it's the last TRB in the chain.
@@ -3209,10 +3208,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
} else {
field |= TRB_IOC;
if (i == last_trb_num)
-   td->last_trb = ep_ring->enqueue;
+   td->last_trb = ring->enqueue;
else if (zero_length_needed) {
trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ep_ring->enqueue;
+   urb_priv->td[1]->last_trb = ring->enqueue;
}
}
 
@@ -3233,7 +3232,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
more_trbs_coming = true;
else
more_trbs_coming = false;
-   queue_trb(xhci, ep_ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 0/6] xhci: conform with xhci td fragment constraints

2016-05-26 Thread Mathias Nyman
This patchseries makes sure we follow the td fragmentation constraints
in xhci 4.11.7.1 for bulk tranfers.

The specs say that:
"constraints must be imposed to ensure that the hardware gate count and
validation requirements are minimized for xHC implementations"

The constraint we need to conform to is to make sure the payload data is
packet aligned in the last trb we queue to a segment before a link trb
moves us to the next segment.

In most usecases the payload data is packet aligned, but cases where
scatterlists with several small enrtries are used may hit this issue.

Running "testusb -a -t 7 -s 2048 -v 41 -c 5000" was able to hang the host
after minutes of testing because of unaligned payload data.
After applying this series testusb ran without issues for 4 days.

usb-eth network adapters can potentially hit this issue.

This solution tries to align the data by splitting the trb at a packet
boundary, if that is not possible then a bounce buffer is used.
Initially the plan was to try to insert a link trb mid segment and mid TD
to fulfill the constraints before resorting to a bounce buffer, but
it turned out it requires a lot of driver rewrite. That is an option
that can be implemented later after driver cleanup.

The first 4 patches are more generic cleanups and reworks needed for
the two last patches that actually makes sure data is aligned.

changes since v1:
  cleanup code comments used during development.

Mathias Nyman (6):
  xhci: rename ep_ring variable in queue_bulk_tx(), no functional change
  xhci: properly prepare zero packet TD after normal bulk TD.
  xhci: use boolean to indicate last trb in td remainder calculation
  xhci: don't rely on precalculated value of needed trbs in the enqueue
loop
  xhci: align the last trb before link if it is easily splittable.
  xhci: TD-fragment, align the unsplittable case with a bounce buffer

 drivers/usb/host/xhci-mem.c  |  74 +++-
 drivers/usb/host/xhci-ring.c | 263 +--
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  11 +-
 4 files changed, 236 insertions(+), 117 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 4/6] xhci: don't rely on precalculated value of needed trbs in the enqueue loop

2016-05-26 Thread Mathias Nyman
Queue trbs until all payload data in the urb is tranferred.

The actual number of trbs might need to change from the pre-calculated
number when the packet alignment restrictions for td fragments in
xhci 4.11.7.1 are taken into account.

Long term plan is to get rid of calculating the needed trbs in advance
all together. It's an unnecessary extra walk through the scatterlist.

This change also allows some bulk queue function simplifications

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 75 +---
 1 file changed, 29 insertions(+), 46 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f74ac1c..d86da81 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3109,9 +3109,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct scatterlist *sg = NULL;
bool more_trbs_coming = true;
bool need_zero_pkt = false;
-   unsigned int num_trbs, last_trb_num, i;
+   bool first_trb = true;
+   unsigned int num_trbs;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len, full_len;
+   unsigned int enqd_len, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
@@ -3120,14 +3121,19 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (!ring)
return -EINVAL;
 
+   full_len = urb->transfer_buffer_length;
/* If we have scatter/gather list, we use it. */
if (urb->num_sgs) {
num_sgs = urb->num_mapped_sgs;
sg = urb->sg;
+   addr = (u64) sg_dma_address(sg);
+   block_len = sg_dma_len(sg);
num_trbs = count_sg_trbs_needed(urb);
-   } else
+   } else {
num_trbs = count_trbs_needed(urb);
-
+   addr = (u64) urb->transfer_dma;
+   block_len = full_len;
+   }
ret = prepare_transfer(xhci, xhci->devs[slot_id],
ep_index, urb->stream_id,
num_trbs, urb, 0, mem_flags);
@@ -3136,8 +3142,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 
urb_priv = urb->hcpriv;
 
-   last_trb_num = num_trbs - 1;
-
/* Deal with URB_ZERO_PACKET - need one more td/trb */
if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
need_zero_pkt = true;
@@ -3152,40 +3156,20 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
start_trb = &ring->enqueue->generic;
start_cycle = ring->cycle_state;
 
-   full_len = urb->transfer_buffer_length;
-   running_total = 0;
-   block_len = 0;
-
/* Queue the TRBs, even if they are zero-length */
-   for (i = 0; i < num_trbs; i++) {
+   for (enqd_len = 0; enqd_len < full_len; enqd_len += trb_buff_len) {
field = TRB_TYPE(TRB_NORMAL);
 
-   if (block_len == 0) {
-   /* A new contiguous block. */
-   if (sg) {
-   addr = (u64) sg_dma_address(sg);
-   block_len = sg_dma_len(sg);
-   } else {
-   addr = (u64) urb->transfer_dma;
-   block_len = full_len;
-   }
-   /* TRB buffer should not cross 64KB boundaries */
-   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
-   trb_buff_len = min_t(unsigned int,
-   trb_buff_len,
-   block_len);
-   } else {
-   /* Further through the contiguous block. */
-   trb_buff_len = block_len;
-   if (trb_buff_len > TRB_MAX_BUFF_SIZE)
-   trb_buff_len = TRB_MAX_BUFF_SIZE;
-   }
+   /* TRB buffer should not cross 64KB boundaries */
+   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
+   trb_buff_len = min_t(unsigned int, trb_buff_len, block_len);
 
-   if (running_total + trb_buff_len > full_len)
-   trb_buff_len = full_len - running_total;
+   if (enqd_len + trb_buff_len > full_len)
+   trb_buff_len = full_len - enqd_len;
 
/* Don't change the cycle bit of the first TRB until later */
-   if (i == 0) {
+   if (first_trb) {
+   first_trb = false;
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
@@ -3194,7 +3178,7 @@ int xhci_que

[RFC PATCH v2 2/6] xhci: properly prepare zero packet TD after normal bulk TD.

2016-05-26 Thread Mathias Nyman
If a zero-length packet is needed after a bulk transfer, then an
additional zero length TD was prepared before enqueueing the bulk transfer
This set up the zero packet TD structure with incorrect td->start_seg
and td->first_trb pointers.

Prepare the zero packet TD after the data bulk TD is enqueued instead.
It sets these pointers correctly.

This change also simplifies unnecessary complexity related to keeping
track of the last trb when enqueuing trbs.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 38 +++---
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c3e9a60..15dd226 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3107,8 +3107,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
struct scatterlist *sg = NULL;
-   bool more_trbs_coming;
-   bool zero_length_needed;
+   bool more_trbs_coming = true;
+   bool need_zero_pkt = false;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
unsigned int running_total, block_len, trb_buff_len, full_len;
@@ -3139,17 +3139,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
last_trb_num = num_trbs - 1;
 
/* Deal with URB_ZERO_PACKET - need one more td/trb */
-   zero_length_needed = urb->transfer_flags & URB_ZERO_PACKET &&
-   urb_priv->length == 2;
-   if (zero_length_needed) {
-   num_trbs++;
-   xhci_dbg(xhci, "Creating zero length td.\n");
-   ret = prepare_transfer(xhci, xhci->devs[slot_id],
-   ep_index, urb->stream_id,
-   1, urb, 1, mem_flags);
-   if (unlikely(ret < 0))
-   return ret;
-   }
+   if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
+   need_zero_pkt = true;
 
td = urb_priv->td[0];
 
@@ -3207,12 +3198,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   if (i == last_trb_num)
-   td->last_trb = ring->enqueue;
-   else if (zero_length_needed) {
-   trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ring->enqueue;
-   }
+   more_trbs_coming = need_zero_pkt;
+   td->last_trb = ring->enqueue;
}
 
/* Only set interrupt on short packet for IN endpoints */
@@ -3228,10 +3215,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   if (i < num_trbs - 1)
-   more_trbs_coming = true;
-   else
-   more_trbs_coming = false;
queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
@@ -3253,6 +3236,15 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
}
}
 
+   if (need_zero_pkt) {
+   ret = prepare_transfer(xhci, xhci->devs[slot_id],
+  ep_index, urb->stream_id,
+  1, urb, 1, mem_flags);
+   urb_priv->td[1]->last_trb = ring->enqueue;
+   field = TRB_TYPE(TRB_NORMAL) | ring->cycle_state | TRB_IOC;
+   queue_trb(xhci, ring, 0, 0, 0, TRB_INTR_TARGET(0), field);
+   }
+
check_trb_math(urb, running_total);
giveback_first_trb(xhci, slot_id, ep_index, urb->stream_id,
start_cycle, start_trb);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 6/6] xhci: TD-fragment, align the unsplittable case with a bounce buffer

2016-05-26 Thread Mathias Nyman
If the last trb before a link is not packet size aligned, and is not
splittable then use a bounce buffer for that chunk of max packet size
unalignable data.

Allocate a max packet size bounce buffer for every segment of a bulk
endpoint ring at the same time as allocating the ring.
If we need to align the data before the link trb in that segment then
copy the data to the segment bounce buffer, dma map it, and enqueue it.
Once the td finishes, or is cancelled, unmap it.

For in transfers we need to first map the bounce buffer, then queue it,
after it finishes, copy the bounce buffer to the original sg list, and
finally unmap it

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-mem.c  |  74 ++---
 drivers/usb/host/xhci-ring.c | 111 ++-
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  11 -
 4 files changed, 160 insertions(+), 41 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index bad0d1f..6afe323 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -37,7 +37,9 @@
  * "All components of all Command and Transfer TRBs shall be initialized to 
'0'"
  */
 static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci,
-   unsigned int cycle_state, gfp_t flags)
+  unsigned int cycle_state,
+  unsigned int max_packet,
+  gfp_t flags)
 {
struct xhci_segment *seg;
dma_addr_t  dma;
@@ -53,6 +55,14 @@ static struct xhci_segment *xhci_segment_alloc(struct 
xhci_hcd *xhci,
return NULL;
}
 
+   if (max_packet) {
+   seg->bounce_buf = kzalloc(max_packet, flags | GFP_DMA);
+   if (!seg->bounce_buf) {
+   dma_pool_free(xhci->segment_pool, seg->trbs, dma);
+   kfree(seg);
+   return NULL;
+   }
+   }
/* If the cycle state is 0, set the cycle bit to 1 for all the TRBs */
if (cycle_state == 0) {
for (i = 0; i < TRBS_PER_SEGMENT; i++)
@@ -70,6 +80,7 @@ static void xhci_segment_free(struct xhci_hcd *xhci, struct 
xhci_segment *seg)
dma_pool_free(xhci->segment_pool, seg->trbs, seg->dma);
seg->trbs = NULL;
}
+   kfree(seg->bounce_buf);
kfree(seg);
 }
 
@@ -317,11 +328,11 @@ static void xhci_initialize_ring_info(struct xhci_ring 
*ring,
 static int xhci_alloc_segments_for_ring(struct xhci_hcd *xhci,
struct xhci_segment **first, struct xhci_segment **last,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_segment *prev;
 
-   prev = xhci_segment_alloc(xhci, cycle_state, flags);
+   prev = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!prev)
return -ENOMEM;
num_segs--;
@@ -330,7 +341,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
while (num_segs > 0) {
struct xhci_segment *next;
 
-   next = xhci_segment_alloc(xhci, cycle_state, flags);
+   next = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!next) {
prev = *first;
while (prev) {
@@ -360,7 +371,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
  */
 static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd *xhci,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_ring*ring;
int ret;
@@ -370,13 +381,15 @@ static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd 
*xhci,
return NULL;
 
ring->num_segs = num_segs;
+   ring->bounce_buf_len = max_packet;
INIT_LIST_HEAD(&ring->td_list);
ring->type = type;
if (num_segs == 0)
return ring;
 
ret = xhci_alloc_segments_for_ring(xhci, &ring->first_seg,
-   &ring->last_seg, num_segs, cycle_state, type, flags);
+   &ring->last_seg, num_segs, cycle_state, type,
+   max_packet, flags);
if (ret)
goto fail;
 
@@ -470,7 +483,8 @@ int xhci_ring_expansion(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
ring->num_segs : num_segs_needed;
 
ret = xhci_alloc_segments_for_ring(xhci, &first, &last,
-  

[RFC PATCH v2 5/6] xhci: align the last trb before link if it is easily splittable.

2016-05-26 Thread Mathias Nyman
TD fragments section 4.11.7.1 in xhci specs have additional requirements
on how trbs in TDs must be organized.

TD fragments shall not span transfer ring segments and TD fragments must
be packet aligned. Normally we don't care about TD fragments, on TD is one
big fragment, but if a TD spans ring segments it will be treated as two
fragments, and we need to comply with the alignment requirements.

For us this means that the payload data must be packet aligned in the
last trb before a link trb.
In most mass storage bulk tranfers we are lucky as the block size aligns
nicely with packet size, and there are no issues.
However, usb network adapters using scatterlists can hit this alignment
issue, and usbtest in kernel triggers this in minutes.

This patch is a partial solution, it solves the easy case when the last
trb before the link trb contains a packet boundary.
If that is the case then just split the trb at the boundary.
If not, then just print a debug message and continue as we have always
done, hoping for the best

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d86da81..c7c9521 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3098,6 +3098,27 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return (total_packet_count - ((transferred + trb_buff_len) / maxp));
 }
 
+static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
+u32 *trb_buff_len)
+{
+   unsigned int unalign;
+   unsigned int max_pkt;
+
+   max_pkt = usb_endpoint_maxp(&urb->ep->desc); /*FIXME MATTU GET_MAX..? */
+   unalign = (enqd_len + *trb_buff_len) % max_pkt;
+
+   /* we got lucky, last normal TRB data on segment is packet aligned */
+   if (unalign == 0)
+   return 0;
+
+   /* is the last nornal TRB alignable by splitting it */
+   if (*trb_buff_len > unalign) {
+   *trb_buff_len -= unalign;
+   return 0;
+   }
+   return 1;
+}
+
 /* This is very similar to what ehci-q.c qtd_fill() does */
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
@@ -3180,6 +3201,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 */
if (enqd_len + trb_buff_len < full_len) {
field |= TRB_CHAIN;
+   if (last_trb(xhci, ring, ring->enq_seg,
+ring->enqueue + 1)) {
+   if (xhci_align_td(xhci, urb, enqd_len,
+&trb_buff_len))
+   xhci_dbg(xhci, "TRB align fail\n");
+   }
} else {
field |= TRB_IOC;
more_trbs_coming = false;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v2 3/6] xhci: use boolean to indicate last trb in td remainder calculation

2016-05-26 Thread Mathias Nyman
We only need to know if we are queuing the last trb for a TD when
calculating the td remainder field.
The total number of trbs left is not used.

We won't be able to trust the pre-calculated number of trbs used if we
need to align trb data by splitting or merging trbs in order to satisfy
comply with data alignment requirements in xhci specs section 4.11.7.1.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 15dd226..f74ac1c 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3074,7 +3074,7 @@ int xhci_queue_intr_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
  */
 static u32 xhci_td_remainder(struct xhci_hcd *xhci, int transferred,
  int trb_buff_len, unsigned int td_total_len,
- struct urb *urb, unsigned int num_trbs_left)
+ struct urb *urb, bool more_trbs_coming)
 {
u32 maxp, total_packet_count;
 
@@ -3083,7 +3083,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return ((td_total_len - transferred) >> 10);
 
/* One TRB with a zero-length data packet. */
-   if (num_trbs_left == 0 || (transferred == 0 && trb_buff_len == 0) ||
+   if (!more_trbs_coming || (transferred == 0 && trb_buff_len == 0) ||
trb_buff_len == td_total_len)
return 0;
 
@@ -3198,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   more_trbs_coming = need_zero_pkt;
+   more_trbs_coming = false;
td->last_trb = ring->enqueue;
}
 
@@ -3209,13 +3209,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
/* Set the TRB length, TD size, and interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
trb_buff_len, full_len,
-   urb, num_trbs - i - 1);
-
+   urb, more_trbs_coming);
length_field = TRB_LEN(trb_buff_len) |
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   queue_trb(xhci, ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming | need_zero_pkt,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
@@ -3639,7 +3638,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, 
gfp_t mem_flags,
/* Set the TRB length, TD size, & interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
   trb_buff_len, td_len,
-  urb, trbs_per_td - j - 1);
+  urb, more_trbs_coming);
 
length_field = TRB_LEN(trb_buff_len) |
TRB_INTR_TARGET(0);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 5/6] xhci: align the last trb before link if it is easily splittable.

2016-05-26 Thread Mathias Nyman

On 26.05.2016 15:08, Mathias Nyman wrote:

TD fragments section 4.11.7.1 in xhci specs have additional requirements
on how trbs in TDs must be organized.

TD fragments shall not span transfer ring segments and TD fragments must
be packet aligned. Normally we don't care about TD fragments, on TD is one
big fragment, but if a TD spans ring segments it will be treated as two
fragments, and we need to comply with the alignment requirements.

For us this means that the payload data must be packet aligned in the
last trb before a link trb.
In most mass storage bulk tranfers we are lucky as the block size aligns
nicely with packet size, and there are no issues.
However, usb network adapters using scatterlists can hit this alignment
issue, and usbtest in kernel triggers this in minutes.

This patch is a partial solution, it solves the easy case when the last
trb before the link trb contains a packet boundary.
If that is the case then just split the trb at the boundary.
If not, then just print a debug message and continue as we have always
done, hoping for the best

Signed-off-by: Mathias Nyman 
---
  drivers/usb/host/xhci-ring.c | 27 +++
  1 file changed, 27 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d86da81..c7c9521 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3098,6 +3098,27 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return (total_packet_count - ((transferred + trb_buff_len) / maxp));
  }

+static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
+u32 *trb_buff_len)
+{
+   unsigned int unalign;
+   unsigned int max_pkt;
+
+   max_pkt = usb_endpoint_maxp(&urb->ep->desc); /*FIXME MATTU GET_MAX..? */


Sigh.. so I managed to amend the cleanup to the wrong patch, was supposed to
be in 5/6 not 6/6

yet another version on its way

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 0/6] xhci: conform with xhci td fragment constraints

2016-05-26 Thread Mathias Nyman
This patchseries makes sure we follow the td fragmentation constraints
in xhci 4.11.7.1 for bulk tranfers.

The specs say that:
"constraints must be imposed to ensure that the hardware gate count and
validation requirements are minimized for xHC implementations"

The constraint we need to conform to is to make sure the payload data is
packet aligned in the last trb we queue to a segment before a link trb
moves us to the next segment.

In most usecases the payload data is packet aligned, but cases where
scatterlists with several small enrtries are used may hit this issue.

Running "testusb -a -t 7 -s 2048 -v 41 -c 5000" was able to hang the host
after minutes of testing because of unaligned payload data.
After applying this series testusb ran without issues for 4 days.

usb-eth network adapters can potentially hit this issue.

This solution tries to align the data by splitting the trb at a packet
boundary, if that is not possible then a bounce buffer is used.
Initially the plan was to try to insert a link trb mid segment and mid TD
to fulfill the constraints before resorting to a bounce buffer, but
it turned out it requires a lot of driver rewrite. That is an option
that can be implemented later after driver cleanup.

The first 4 patches are more generic cleanups and reworks needed for
the two last patches that actually makes sure data is aligned.

changes since v1:
  cleanup code comments used during development.

changes since v2:
  cleanup the code comments in the correct patches

Mathias Nyman (6):
  xhci: rename ep_ring variable in queue_bulk_tx(), no functional change
  xhci: properly prepare zero packet TD after normal bulk TD.
  xhci: use boolean to indicate last trb in td remainder calculation
  xhci: don't rely on precalculated value of needed trbs in the enqueue
loop
  xhci: align the last trb before link if it is easily splittable.
  xhci: TD-fragment, align the unsplittable case with a bounce buffer

 drivers/usb/host/xhci-mem.c  |  74 +++-
 drivers/usb/host/xhci-ring.c | 263 +--
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  11 +-
 4 files changed, 236 insertions(+), 117 deletions(-)

--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 1/6] xhci: rename ep_ring variable in queue_bulk_tx(), no functional change

2016-05-26 Thread Mathias Nyman
Tiny change, a bit more readable.
The real reason for this change is that the coming td fragment work
had several over 80 lines character lines split just because of a few
extra characters in variable names.

no functional changes

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52deae4..c3e9a60 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3102,7 +3102,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
 {
-   struct xhci_ring *ep_ring;
+   struct xhci_ring *ring;
struct urb_priv *urb_priv;
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
@@ -3111,14 +3111,13 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
bool zero_length_needed;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len;
-   unsigned int full_len;
+   unsigned int running_total, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
 
-   ep_ring = xhci_urb_to_transfer_ring(xhci, urb);
-   if (!ep_ring)
+   ring = xhci_urb_to_transfer_ring(xhci, urb);
+   if (!ring)
return -EINVAL;
 
/* If we have scatter/gather list, we use it. */
@@ -3159,8 +3158,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 * until we've finished creating all the other TRBs.  The ring's cycle
 * state may change as we enqueue the other TRBs, so save it too.
 */
-   start_trb = &ep_ring->enqueue->generic;
-   start_cycle = ep_ring->cycle_state;
+   start_trb = &ring->enqueue->generic;
+   start_cycle = ring->cycle_state;
 
full_len = urb->transfer_buffer_length;
running_total = 0;
@@ -3199,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
-   field |= ep_ring->cycle_state;
+   field |= ring->cycle_state;
 
/* Chain all the TRBs together; clear the chain bit in the last
 * TRB to indicate it's the last TRB in the chain.
@@ -3209,10 +3208,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
} else {
field |= TRB_IOC;
if (i == last_trb_num)
-   td->last_trb = ep_ring->enqueue;
+   td->last_trb = ring->enqueue;
else if (zero_length_needed) {
trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ep_ring->enqueue;
+   urb_priv->td[1]->last_trb = ring->enqueue;
}
}
 
@@ -3233,7 +3232,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
more_trbs_coming = true;
else
more_trbs_coming = false;
-   queue_trb(xhci, ep_ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 2/6] xhci: properly prepare zero packet TD after normal bulk TD.

2016-05-26 Thread Mathias Nyman
If a zero-length packet is needed after a bulk transfer, then an
additional zero length TD was prepared before enqueueing the bulk transfer
This set up the zero packet TD structure with incorrect td->start_seg
and td->first_trb pointers.

Prepare the zero packet TD after the data bulk TD is enqueued instead.
It sets these pointers correctly.

This change also simplifies unnecessary complexity related to keeping
track of the last trb when enqueuing trbs.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 38 +++---
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c3e9a60..15dd226 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3107,8 +3107,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
struct scatterlist *sg = NULL;
-   bool more_trbs_coming;
-   bool zero_length_needed;
+   bool more_trbs_coming = true;
+   bool need_zero_pkt = false;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
unsigned int running_total, block_len, trb_buff_len, full_len;
@@ -3139,17 +3139,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
last_trb_num = num_trbs - 1;
 
/* Deal with URB_ZERO_PACKET - need one more td/trb */
-   zero_length_needed = urb->transfer_flags & URB_ZERO_PACKET &&
-   urb_priv->length == 2;
-   if (zero_length_needed) {
-   num_trbs++;
-   xhci_dbg(xhci, "Creating zero length td.\n");
-   ret = prepare_transfer(xhci, xhci->devs[slot_id],
-   ep_index, urb->stream_id,
-   1, urb, 1, mem_flags);
-   if (unlikely(ret < 0))
-   return ret;
-   }
+   if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
+   need_zero_pkt = true;
 
td = urb_priv->td[0];
 
@@ -3207,12 +3198,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   if (i == last_trb_num)
-   td->last_trb = ring->enqueue;
-   else if (zero_length_needed) {
-   trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ring->enqueue;
-   }
+   more_trbs_coming = need_zero_pkt;
+   td->last_trb = ring->enqueue;
}
 
/* Only set interrupt on short packet for IN endpoints */
@@ -3228,10 +3215,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   if (i < num_trbs - 1)
-   more_trbs_coming = true;
-   else
-   more_trbs_coming = false;
queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
@@ -3253,6 +3236,15 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
}
}
 
+   if (need_zero_pkt) {
+   ret = prepare_transfer(xhci, xhci->devs[slot_id],
+  ep_index, urb->stream_id,
+  1, urb, 1, mem_flags);
+   urb_priv->td[1]->last_trb = ring->enqueue;
+   field = TRB_TYPE(TRB_NORMAL) | ring->cycle_state | TRB_IOC;
+   queue_trb(xhci, ring, 0, 0, 0, TRB_INTR_TARGET(0), field);
+   }
+
check_trb_math(urb, running_total);
giveback_first_trb(xhci, slot_id, ep_index, urb->stream_id,
start_cycle, start_trb);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 4/6] xhci: don't rely on precalculated value of needed trbs in the enqueue loop

2016-05-26 Thread Mathias Nyman
Queue trbs until all payload data in the urb is tranferred.

The actual number of trbs might need to change from the pre-calculated
number when the packet alignment restrictions for td fragments in
xhci 4.11.7.1 are taken into account.

Long term plan is to get rid of calculating the needed trbs in advance
all together. It's an unnecessary extra walk through the scatterlist.

This change also allows some bulk queue function simplifications

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 75 +---
 1 file changed, 29 insertions(+), 46 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f74ac1c..d86da81 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3109,9 +3109,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct scatterlist *sg = NULL;
bool more_trbs_coming = true;
bool need_zero_pkt = false;
-   unsigned int num_trbs, last_trb_num, i;
+   bool first_trb = true;
+   unsigned int num_trbs;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len, full_len;
+   unsigned int enqd_len, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
@@ -3120,14 +3121,19 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (!ring)
return -EINVAL;
 
+   full_len = urb->transfer_buffer_length;
/* If we have scatter/gather list, we use it. */
if (urb->num_sgs) {
num_sgs = urb->num_mapped_sgs;
sg = urb->sg;
+   addr = (u64) sg_dma_address(sg);
+   block_len = sg_dma_len(sg);
num_trbs = count_sg_trbs_needed(urb);
-   } else
+   } else {
num_trbs = count_trbs_needed(urb);
-
+   addr = (u64) urb->transfer_dma;
+   block_len = full_len;
+   }
ret = prepare_transfer(xhci, xhci->devs[slot_id],
ep_index, urb->stream_id,
num_trbs, urb, 0, mem_flags);
@@ -3136,8 +3142,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 
urb_priv = urb->hcpriv;
 
-   last_trb_num = num_trbs - 1;
-
/* Deal with URB_ZERO_PACKET - need one more td/trb */
if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
need_zero_pkt = true;
@@ -3152,40 +3156,20 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
start_trb = &ring->enqueue->generic;
start_cycle = ring->cycle_state;
 
-   full_len = urb->transfer_buffer_length;
-   running_total = 0;
-   block_len = 0;
-
/* Queue the TRBs, even if they are zero-length */
-   for (i = 0; i < num_trbs; i++) {
+   for (enqd_len = 0; enqd_len < full_len; enqd_len += trb_buff_len) {
field = TRB_TYPE(TRB_NORMAL);
 
-   if (block_len == 0) {
-   /* A new contiguous block. */
-   if (sg) {
-   addr = (u64) sg_dma_address(sg);
-   block_len = sg_dma_len(sg);
-   } else {
-   addr = (u64) urb->transfer_dma;
-   block_len = full_len;
-   }
-   /* TRB buffer should not cross 64KB boundaries */
-   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
-   trb_buff_len = min_t(unsigned int,
-   trb_buff_len,
-   block_len);
-   } else {
-   /* Further through the contiguous block. */
-   trb_buff_len = block_len;
-   if (trb_buff_len > TRB_MAX_BUFF_SIZE)
-   trb_buff_len = TRB_MAX_BUFF_SIZE;
-   }
+   /* TRB buffer should not cross 64KB boundaries */
+   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
+   trb_buff_len = min_t(unsigned int, trb_buff_len, block_len);
 
-   if (running_total + trb_buff_len > full_len)
-   trb_buff_len = full_len - running_total;
+   if (enqd_len + trb_buff_len > full_len)
+   trb_buff_len = full_len - enqd_len;
 
/* Don't change the cycle bit of the first TRB until later */
-   if (i == 0) {
+   if (first_trb) {
+   first_trb = false;
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
@@ -3194,7 +3178,7 @@ int xhci_que

[RFC PATCH v3 6/6] xhci: TD-fragment, align the unsplittable case with a bounce buffer

2016-05-26 Thread Mathias Nyman
If the last trb before a link is not packet size aligned, and is not
splittable then use a bounce buffer for that chunk of max packet size
unalignable data.

Allocate a max packet size bounce buffer for every segment of a bulk
endpoint ring at the same time as allocating the ring.
If we need to align the data before the link trb in that segment then
copy the data to the segment bounce buffer, dma map it, and enqueue it.
Once the td finishes, or is cancelled, unmap it.

For in transfers we need to first map the bounce buffer, then queue it,
after it finishes, copy the bounce buffer to the original sg list, and
finally unmap it

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-mem.c  |  74 ++
 drivers/usb/host/xhci-ring.c | 106 +++
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  10 +++-
 4 files changed, 155 insertions(+), 40 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index bad0d1f..6afe323 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -37,7 +37,9 @@
  * "All components of all Command and Transfer TRBs shall be initialized to 
'0'"
  */
 static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci,
-   unsigned int cycle_state, gfp_t flags)
+  unsigned int cycle_state,
+  unsigned int max_packet,
+  gfp_t flags)
 {
struct xhci_segment *seg;
dma_addr_t  dma;
@@ -53,6 +55,14 @@ static struct xhci_segment *xhci_segment_alloc(struct 
xhci_hcd *xhci,
return NULL;
}
 
+   if (max_packet) {
+   seg->bounce_buf = kzalloc(max_packet, flags | GFP_DMA);
+   if (!seg->bounce_buf) {
+   dma_pool_free(xhci->segment_pool, seg->trbs, dma);
+   kfree(seg);
+   return NULL;
+   }
+   }
/* If the cycle state is 0, set the cycle bit to 1 for all the TRBs */
if (cycle_state == 0) {
for (i = 0; i < TRBS_PER_SEGMENT; i++)
@@ -70,6 +80,7 @@ static void xhci_segment_free(struct xhci_hcd *xhci, struct 
xhci_segment *seg)
dma_pool_free(xhci->segment_pool, seg->trbs, seg->dma);
seg->trbs = NULL;
}
+   kfree(seg->bounce_buf);
kfree(seg);
 }
 
@@ -317,11 +328,11 @@ static void xhci_initialize_ring_info(struct xhci_ring 
*ring,
 static int xhci_alloc_segments_for_ring(struct xhci_hcd *xhci,
struct xhci_segment **first, struct xhci_segment **last,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_segment *prev;
 
-   prev = xhci_segment_alloc(xhci, cycle_state, flags);
+   prev = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!prev)
return -ENOMEM;
num_segs--;
@@ -330,7 +341,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
while (num_segs > 0) {
struct xhci_segment *next;
 
-   next = xhci_segment_alloc(xhci, cycle_state, flags);
+   next = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!next) {
prev = *first;
while (prev) {
@@ -360,7 +371,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
  */
 static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd *xhci,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_ring*ring;
int ret;
@@ -370,13 +381,15 @@ static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd 
*xhci,
return NULL;
 
ring->num_segs = num_segs;
+   ring->bounce_buf_len = max_packet;
INIT_LIST_HEAD(&ring->td_list);
ring->type = type;
if (num_segs == 0)
return ring;
 
ret = xhci_alloc_segments_for_ring(xhci, &ring->first_seg,
-   &ring->last_seg, num_segs, cycle_state, type, flags);
+   &ring->last_seg, num_segs, cycle_state, type,
+   max_packet, flags);
if (ret)
goto fail;
 
@@ -470,7 +483,8 @@ int xhci_ring_expansion(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
ring->num_segs : num_segs_needed;
 
ret = xhci_alloc_segments_for_ring(xhci, &first, &last,
-  

[RFC PATCH v3 5/6] xhci: align the last trb before link if it is easily splittable.

2016-05-26 Thread Mathias Nyman
TD fragments section 4.11.7.1 in xhci specs have additional requirements
on how trbs in TDs must be organized.

TD fragments shall not span transfer ring segments and TD fragments must
be packet aligned. Normally we don't care about TD fragments, on TD is one
big fragment, but if a TD spans ring segments it will be treated as two
fragments, and we need to comply with the alignment requirements.

For us this means that the payload data must be packet aligned in the
last trb before a link trb.
In most mass storage bulk tranfers we are lucky as the block size aligns
nicely with packet size, and there are no issues.
However, usb network adapters using scatterlists can hit this alignment
issue, and usbtest in kernel triggers this in minutes.

This patch is a partial solution, it solves the easy case when the last
trb before the link trb contains a packet boundary.
If that is the case then just split the trb at the boundary.
If not, then just print a debug message and continue as we have always
done, hoping for the best

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d86da81..bf9ffa4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3098,6 +3098,27 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return (total_packet_count - ((transferred + trb_buff_len) / maxp));
 }
 
+static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
+u32 *trb_buff_len)
+{
+   unsigned int unalign;
+   unsigned int max_pkt;
+
+   max_pkt = GET_MAX_PACKET(usb_endpoint_maxp(&urb->ep->desc));
+   unalign = (enqd_len + *trb_buff_len) % max_pkt;
+
+   /* we got lucky, last normal TRB data on segment is packet aligned */
+   if (unalign == 0)
+   return 0;
+
+   /* is the last nornal TRB alignable by splitting it */
+   if (*trb_buff_len > unalign) {
+   *trb_buff_len -= unalign;
+   return 0;
+   }
+   return 1;
+}
+
 /* This is very similar to what ehci-q.c qtd_fill() does */
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
@@ -3180,6 +3201,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 */
if (enqd_len + trb_buff_len < full_len) {
field |= TRB_CHAIN;
+   if (last_trb(xhci, ring, ring->enq_seg,
+ring->enqueue + 1)) {
+   if (xhci_align_td(xhci, urb, enqd_len,
+&trb_buff_len))
+   xhci_dbg(xhci, "TRB align fail\n");
+   }
} else {
field |= TRB_IOC;
more_trbs_coming = false;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 3/6] xhci: use boolean to indicate last trb in td remainder calculation

2016-05-26 Thread Mathias Nyman
We only need to know if we are queuing the last trb for a TD when
calculating the td remainder field.
The total number of trbs left is not used.

We won't be able to trust the pre-calculated number of trbs used if we
need to align trb data by splitting or merging trbs in order to satisfy
comply with data alignment requirements in xhci specs section 4.11.7.1.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 15dd226..f74ac1c 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3074,7 +3074,7 @@ int xhci_queue_intr_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
  */
 static u32 xhci_td_remainder(struct xhci_hcd *xhci, int transferred,
  int trb_buff_len, unsigned int td_total_len,
- struct urb *urb, unsigned int num_trbs_left)
+ struct urb *urb, bool more_trbs_coming)
 {
u32 maxp, total_packet_count;
 
@@ -3083,7 +3083,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return ((td_total_len - transferred) >> 10);
 
/* One TRB with a zero-length data packet. */
-   if (num_trbs_left == 0 || (transferred == 0 && trb_buff_len == 0) ||
+   if (!more_trbs_coming || (transferred == 0 && trb_buff_len == 0) ||
trb_buff_len == td_total_len)
return 0;
 
@@ -3198,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   more_trbs_coming = need_zero_pkt;
+   more_trbs_coming = false;
td->last_trb = ring->enqueue;
}
 
@@ -3209,13 +3209,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
/* Set the TRB length, TD size, and interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
trb_buff_len, full_len,
-   urb, num_trbs - i - 1);
-
+   urb, more_trbs_coming);
length_field = TRB_LEN(trb_buff_len) |
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   queue_trb(xhci, ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming | need_zero_pkt,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
@@ -3639,7 +3638,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, 
gfp_t mem_flags,
/* Set the TRB length, TD size, & interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
   trb_buff_len, td_len,
-  urb, trbs_per_td - j - 1);
+  urb, more_trbs_coming);
 
length_field = TRB_LEN(trb_buff_len) |
TRB_INTR_TARGET(0);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Support for Pravega USB3 controller

2016-05-30 Thread Mathias Nyman

On 30.05.2016 11:31, Mason wrote:

Hello Felipe,

On 30/05/2016 08:58, Felipe Balbi wrote:


Mason writes:


I'm working on a SoC which embeds an IP block from GDA Technologies
labeled "Pravega USB3 SuperSpeed Controller" (data-sheet is v0.99r
dated 2014-01-29). A cursory search returns:

http://www.sourcing.co.jp/prod_ip.htm
http://www.sourcing.co.jp/prod_ip/usb_host_pb.pdf

In the compliance section, the data-sheet lists:

- USB 3.0 Revision 1.0 and all associated ECNs
- Inter-Chip Supplement to the USB Revision 3.0 Specification, Revision 1.02
- Backward compatible with USB2.0 Revision 2.0 and all associated ECNs
- High Speed Inter Chip Specification, Rev 1.0 and associated ECNs
- USB 2.0 Link Power Management Addendum and associated Erratas
- xHCI specification version 1.0 (in host mode)

My question is: should I be able to use the generic XHCI driver to
drive this controller?


yes, you should. BUT... is this controller dual-role? If it is, then
it'd be nice to support both roles.


Dual-role means OTG? or just static host/device?
AFAIU, my version of the controller only supports host-mode.


(The Makefile builds xhci-plat-hcd.o but I don't see xhci-plat-hcd.c
Is it a generated file?)


The makefile is smart enough to figure it out. Don't worry. Kernel's
build system knows that it needs xhci-plat-hcd.c in order to build
xhci-plat-hcd.o


Oh, I wasn't worried. I was curious how/where that file is generated.


One more thing: AFAIU, the USB PHY is made by Synopsys.
Do I need a driver for that too? (I should examine 7b8ef22ea547)


IIRC there are no registers to be configured on Synopsys PHY, so no.


I looked more closely at the SoC documentation, which lists registers
such as

0x0 CONFIG
0x4 CONTROL
0x8 TEST
0xC STATUS
0x10CLK_RST_0
0x14CLK_RST_1
0x18PARAM_0
0x1CPARAM_1
0x20PARAM_2
0x80SNPS_CR_ADD
0x84SNPS_CR_DATA
0xC0RESET_CTRL

For example, PARAM_1 contains a field "TX De-emphasis at 6 dB".
That's PHY stuff, right?

Are registers such as PHY_TIMER_REGISTER or LTSSM_* standard in USB3/XHCI,
or is that PHY stuff too?


Not part of xhci standard at least,

http://www.intel.com/content/www/us/en/io/universal-serial-bus/extensible-host-controler-interface-usb-xhci.html

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Unable to safely detach external HDD on USB 3.0

2016-05-31 Thread Mathias Nyman

On 30.05.2016 19:39, Alan Stern wrote:

On Sun, 29 May 2016, Marco Chiappero wrote:


Hello,

I'm experiencing a problem I'm not able to troubleshoot.

I own a WD Elements Portable USB 3.0 HDD (WDBU6Y0020BBK) containing a
drive model WD20NMVW-11AV3S3. When connected to a USB 3.0 port the drive
works perfectly fine until I try to stop and detach the drive from the
system, either with gnome-disk-utility or udisksctl: it stops spinning
but it restarts soon after and reattaches itself to the system again.
The problem does not occur when using a USB 2.0 port and does not occur
under Windows 10 with both USB 2.0 and 3.0, spinning down and modifying
the led activity accordingly, as expected.


This looks like a problem in the USB-3 link-state management.

Mathias, usb_remove_device() calls hub_port_logical_disconnect(), which
calls hub_port_disable(), which calls hub_usb3_port_disable().  That
routine sets the link state to SS_DISABLED, waits for the port to
enter that state, and then sets the state to RX_DETECT.  Next the hub
is suspended, but about 40 ms later it wakes up again because of a
connect-change event on the port.  You can see all this in the
Wireshark capture log (the wd_elements-usb3-linux_4.4.0.pcapng file).

This isn't supposed to happen.  The intention is that the port will
remain disabled until the device is unplugged and something is plugged
back in.  But apparently being in the RX_DETECT state causes the port
to get a connect-change event even though the cable has remained
plugged into the port the whole time.

So the question is, how do we prevent the port from re-enabling itself
until the cable has been unplugged?  Maybe the port should not be in
RX_DETECT at all?  But then how do we tell when the cable is unplugged?

Alan Stern


xhci specs say that port will move from Disconnected (PLS = RX_DETECT) to
Polling if "SuperSpeed far-end receiver terminations are detected"

From power-off state (PP=0) the connect status changes are only reported
if OCA is set so it seems that is not an option.

I'm looking at the capture log and if I understand it correcty, in frame
49 SET FEATURE, PLS = ss.disabled   (xhci will write PORT_PED bit)
59 SET FEATURE, PLS = RxDetect  (xhci will set pls=RxDetect)
.. get_port_status, nothing set, no changes reported (several of these)
73 SET FEATURE -> Port remote wake mask
.. get_port status reports Connected, enabled and port status change event

Could it be that setting the port remote wake mask somehow triggers the
connect status change event? It's the first time we write to PORTSC for
this port after setting link state to rx.detect

I need to do some more reading and investigating stillon this still.

-Mathias




I'm not sure whether the problem is a non compliant USB-Sata bridge
(probably a JMicron) or bad timing, bad FSM implementation or something
else, but I would like to determine the cause to either solve the
problem or steer away from this vendor in the future. Please note the
this issue is not distribution specific and is affecting a lot of people
and other manufacturers as well, e.g.:
https://bugzilla.novell.com/show_bug.cgi?id=922634
https://bugs.launchpad.net/ubuntu/+source/nautilus/+bug/792085
https://bugzilla.redhat.com/show_bug.cgi?id=1278944

Here you can find a few captures with Wireshark on both Ubuntu Linux
(3.13.0 & 4.4.0) and Windows 10:
https://drive.google.com/open?id=0B2tgr9hETOtgbVFvdExQZkpTQmc

However USBPcap for Windows cannot capture a bunch of relevant messages,
so there is almost nothing to see when detaching. Under Linux, USB 3.0
seems to be chattier than 2.0 after the SCSI START STOP UNIT command.

Any suggestion on how to investigate the issue is very welcome.

Thank you,
Regards


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/2] usb: host: xhci: rcar: fix the quirks setting of XHCI_NO_64BIT_SUPPORT

2016-06-01 Thread Mathias Nyman

isn't the following enough?

@@ -4886,7 +4886,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t 
get_quirks)
xhci->hcc_params2 = readl(&xhci->cap_regs->hcc_params2);
xhci_print_registers(xhci);

-   xhci->quirks = quirks;
+   xhci->quirks |= quirks;

get_quirks(dev, xhci);


Thank you for the comment!
You're correct. This also can resolve the issue.
Do you prefer such a simple patch?
At least, I prefer such a simple patch :)


I'll defer that to Mathias :-)



I think that xhci->quirks |= quirks will do as a rc fix.

looks like setting xhci->quirks need some generic cleanup for usb-next.
Now in 4.7-rc1 we set xhci->quirks before xhci_gen_setup in 
xhci_priv_init_quirk(),
and during xhci_gen_setup() when copying module parameters quirk, and when 
calling
the get_quirk() callback.

There is nothing special happening between xhci_priv_init_quirk and when calling
get_quirks() in xhci_gen_setup() that need quirks to be set in two stages.

xhci pci drivers only use the get_quirk callback, platform drivers use both.

Looks like the get_quirk() callback is a legacy thing from when the memory for
the xhci struct was allocated in xhci_gen_setup, and xhci->quirks could only be
set after that.

Now xhci struct is part of the usb_hcd, so we could probably get rid of the 
get_quirk
callback alltogether by setting the pci quirks in xhci_pci_setup

-Mathias  
--

To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] xhci fixes for usb-linus

2016-06-01 Thread Mathias Nyman
Hi Greg

These xhci fixes solve a 4.7-rc1 regression in xhci platform driver
quirks, a couple of pending completion cases during driver unload/host
lockup, and a deferred probe case.

-Mathias

Gabriel Krisman Bertazi (1):
  xhci: Cleanup only when releasing primary hcd

Mathias Nyman (2):
  xhci: Fix handling timeouted commands on hosts in weird states.
  xhci: fix platform quirks overwrite regression in 4.7-rc1

Thomas Petazzoni (1):
  usb: xhci-plat: properly handle probe deferral for devm_clk_get()

 drivers/usb/host/xhci-plat.c |  3 +++
 drivers/usb/host/xhci-ring.c | 30 --
 drivers/usb/host/xhci.c  | 29 -
 3 files changed, 43 insertions(+), 19 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] xhci: fix platform quirks overwrite regression in 4.7-rc1

2016-06-01 Thread Mathias Nyman
commit b1c127ae990b ("usb: host: xhci: plat: make use of new methods in
xhci_plat_priv") sets xhci->quirks before calling xhci_gen_setup(), which
will overwrite them.

Don't overwite the quirks, just add the new ones

Fixes: b1c127ae990b ("usb: host: xhci: plat: make use of new methods in 
xhci_plat_priv")
Reported-by: Yoshihiro Shimoda 
Cc: Felipe Balbi 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index fe95602..f2f9518 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4889,7 +4889,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t 
get_quirks)
xhci->hcc_params2 = readl(&xhci->cap_regs->hcc_params2);
xhci_print_registers(xhci);
 
-   xhci->quirks = quirks;
+   xhci->quirks |= quirks;
 
get_quirks(dev, xhci);
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] xhci: Fix handling timeouted commands on hosts in weird states.

2016-06-01 Thread Mathias Nyman
If commands timeout we mark them for abortion, then stop the command
ring, and turn the commands to no-ops and finally restart the command
ring.

If the host is working properly the no-op commands will finish and
pending completions are called.
If we notice the host is failing, driver clears the command ring and
completes, deletes and frees all pending commands.

There are two separate cases reported where host is believed to work
properly but is not. In the first case we successfully stop the ring
but no abort or stop command ring event is ever sent and host locks up.

The second case is if a host is removed, command times out and driver
believes the ring is stopped, and assumes it will be restarted, but
actually ends up timing out on the same command forever.
If one of the pending commands has the xhci->mutex held it will block
xhci_stop() in the remove codepath which otherwise would cleanup pending
commands.

Add a check that clears all pending commands in case host is removed,
or we are stuck timing out on the same command. Also restart the
command timeout timer when stopping the command ring to ensure we
recive an ring stop/abort event.

Cc: stable 
Tested-by: Joe Lawrence 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 1287339..d7d5025 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -290,6 +290,14 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
 
temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
xhci->cmd_ring_state = CMD_RING_STATE_ABORTED;
+
+   /*
+* Writing the CMD_RING_ABORT bit should cause a cmd completion event,
+* however on some host hw the CMD_RING_RUNNING bit is correctly cleared
+* but the completion event in never sent. Use the cmd timeout timer to
+* handle those cases. Use twice the time to cover the bit polling retry
+*/
+   mod_timer(&xhci->cmd_timer, jiffies + (2 * XHCI_CMD_DEFAULT_TIMEOUT));
xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
&xhci->op_regs->cmd_ring);
 
@@ -314,6 +322,7 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
 
xhci_err(xhci, "Stopped the command ring failed, "
"maybe the host is dead\n");
+   del_timer(&xhci->cmd_timer);
xhci->xhc_state |= XHCI_STATE_DYING;
xhci_quiesce(xhci);
xhci_halt(xhci);
@@ -1246,22 +1255,21 @@ void xhci_handle_command_timeout(unsigned long data)
int ret;
unsigned long flags;
u64 hw_ring_state;
-   struct xhci_command *cur_cmd = NULL;
+   bool second_timeout = false;
xhci = (struct xhci_hcd *) data;
 
/* mark this command to be cancelled */
spin_lock_irqsave(&xhci->lock, flags);
if (xhci->current_cmd) {
-   cur_cmd = xhci->current_cmd;
-   cur_cmd->status = COMP_CMD_ABORT;
+   if (xhci->current_cmd->status == COMP_CMD_ABORT)
+   second_timeout = true;
+   xhci->current_cmd->status = COMP_CMD_ABORT;
}
 
-
/* Make sure command ring is running before aborting it */
hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
(hw_ring_state & CMD_RING_RUNNING))  {
-
spin_unlock_irqrestore(&xhci->lock, flags);
xhci_dbg(xhci, "Command timeout\n");
ret = xhci_abort_cmd_ring(xhci);
@@ -1273,6 +1281,15 @@ void xhci_handle_command_timeout(unsigned long data)
}
return;
}
+
+   /* command ring failed to restart, or host removed. Bail out */
+   if (second_timeout || xhci->xhc_state & XHCI_STATE_REMOVING) {
+   spin_unlock_irqrestore(&xhci->lock, flags);
+   xhci_dbg(xhci, "command timed out twice, ring start fail?\n");
+   xhci_cleanup_command_queue(xhci);
+   return;
+   }
+
/* command timeout on stopped ring, ring can't be aborted */
xhci_dbg(xhci, "Command timeout on stopped ring\n");
xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] xhci: Cleanup only when releasing primary hcd

2016-06-01 Thread Mathias Nyman
From: Gabriel Krisman Bertazi 

Under stress occasions some TI devices might not return early when
reading the status register during the quirk invocation of xhci_irq made
by usb_hcd_pci_remove.  This means that instead of returning, we end up
handling this interruption in the middle of a shutdown.  Since
xhci->event_ring has already been freed in xhci_mem_cleanup, we end up
accessing freed memory, causing the Oops below.

commit 8c24d6d7b09d ("usb: xhci: stop everything on the first call to
xhci_stop") is the one that changed the instant in which we clean up the
event queue when stopping a device.  Before, we didn't call
xhci_mem_cleanup at the first time xhci_stop is executed (for the shared
HCD), instead, we only did it after the invocation for the primary HCD,
much later at the removal path.  The code flow for this oops looks like
this:

xhci_pci_remove()
usb_remove_hcd(xhci->shared)
xhci_stop(xhci->shared)
xhci_halt()
xhci_mem_cleanup(xhci);  // Free the event_queue
usb_hcd_pci_remove(primary)
xhci_irq()  // Access the event_queue if STS_EINT is set. Crash.
xhci_stop()
xhci_halt()
// return early

The fix modifies xhci_stop to only cleanup the xhci data when releasing
the primary HCD.  This way, we still have the event_queue configured
when invoking xhci_irq.  We still halt the device on the first call to
xhci_stop, though.

I could reproduce this issue several times on the mainline kernel by
doing a bind-unbind stress test with a specific storage gadget attached.
I also ran the same test over-night with my patch applied and didn't
observe the issue anymore.

[  113.334124] Unable to handle kernel paging request for data at address 
0x0028
[  113.335514] Faulting instruction address: 0xdd4f767c
[  113.336839] Oops: Kernel access of bad area, sig: 11 [#1]
[  113.338214] SMP NR_CPUS=1024 NUMA PowerNV

[c00efe47ba90] c0720850 usb_hcd_irq+0x50/0x80
[c00efe47bac0] c073d328 usb_hcd_pci_remove+0x68/0x1f0
[c00efe47bb00] ddaf0128 xhci_pci_remove+0x78/0xb0
[xhci_pci]
[c00efe47bb30] c055cf70 pci_device_remove+0x70/0x110
[c00efe47bb70] c061c6bc __device_release_driver+0xbc/0x190
[c00efe47bba0] c061c7d0 device_release_driver+0x40/0x70
[c00efe47bbd0] c0619510 unbind_store+0x120/0x150
[c00efe47bc20] c06183c4 drv_attr_store+0x64/0xa0
[c00efe47bc60] c039f1d0 sysfs_kf_write+0x80/0xb0
[c00efe47bca0] c039e14c kernfs_fop_write+0x18c/0x1f0
[c00efe47bcf0] c02e962c __vfs_write+0x6c/0x190
[c00efe47bd90] c02eab40 vfs_write+0xc0/0x200
[c00efe47bde0] c02ec85c SyS_write+0x6c/0x110
[c00efe47be30] c0009260 system_call+0x38/0x108

Signed-off-by: Gabriel Krisman Bertazi 
Cc: Roger Quadros 
Cc: j...@jms.id.au
Cc: sta...@vger.kernel.org
Reviewed-by: Roger Quadros 
Cc:  #v4.3+
Tested-by: Joel Stanley 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c |  3 ++-
 drivers/usb/host/xhci.c  | 27 +++
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52deae4..1287339 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2721,7 +2721,8 @@ hw_died:
writel(irq_pending, &xhci->ir_set->irq_pending);
}
 
-   if (xhci->xhc_state & XHCI_STATE_DYING) {
+   if (xhci->xhc_state & XHCI_STATE_DYING ||
+   xhci->xhc_state & XHCI_STATE_HALTED) {
xhci_dbg(xhci, "xHCI dying, ignoring interrupt. "
"Shouldn't IRQs be disabled?\n");
/* Clear the event handler busy flag (RW1C);
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index fa7e1ef..fe95602 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -685,20 +685,23 @@ void xhci_stop(struct usb_hcd *hcd)
u32 temp;
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 
-   if (xhci->xhc_state & XHCI_STATE_HALTED)
-   return;
-
mutex_lock(&xhci->mutex);
-   spin_lock_irq(&xhci->lock);
-   xhci->xhc_state |= XHCI_STATE_HALTED;
-   xhci->cmd_ring_state = CMD_RING_STATE_STOPPED;
 
-   /* Make sure the xHC is halted for a USB3 roothub
-* (xhci_stop() could be called as part of failed init).
-*/
-   xhci_halt(xhci);
-   xhci_reset(xhci);
-   spin_unlock_irq(&xhci->lock);
+   if (!(xhci->xhc_state & XHCI_STATE_HALTED)) {
+   spin_lock_irq(&xhci->lock);
+
+   xhci->xhc_state |= XHCI_STATE_HALTED;
+   xhci->cmd_ring_state = CMD_RING_STATE_STOPPED

[PATCH 3/4] usb: xhci-plat: properly handle probe deferral for devm_clk_get()

2016-06-01 Thread Mathias Nyman
From: Thomas Petazzoni 

On some platforms, the clocks might be registered by a platform
driver. When this is the case, the clock platform driver may very well
be probed after xhci-plat, in which case the first probe() invocation
of xhci-plat will receive -EPROBE_DEFER as the return value of
devm_clk_get().

The current code handles that as a normal error, and simply assumes
that this means that the system doesn't have a clock for the XHCI
controller, and continues probing without calling
clk_prepare_enable(). Unfortunately, this doesn't work on systems
where the XHCI controller does have a clock, but that clock is
provided by another platform driver. In order to fix this situation,
we handle the -EPROBE_DEFER error condition specially, and abort the
XHCI controller probe(). It will be retried later automatically, the
clock will be available, devm_clk_get() will succeed, and the probe()
will continue with the clock prepared and enabled as expected.

In practice, such issue is seen on the ARM64 Marvell 7K/8K platform,
where the clocks are registered by a platform driver.

Cc: 
Signed-off-by: Thomas Petazzoni 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-plat.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index 676ea45..1f3f981 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -196,6 +196,9 @@ static int xhci_plat_probe(struct platform_device *pdev)
ret = clk_prepare_enable(clk);
if (ret)
goto put_hcd;
+   } else if (PTR_ERR(clk) == -EPROBE_DEFER) {
+   ret = -EPROBE_DEFER;
+   goto put_hcd;
}
 
xhci = hcd_to_xhci(hcd);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Unable to safely detach external HDD on USB 3.0

2016-06-08 Thread Mathias Nyman

On 06.06.2016 21:31, Alan Stern wrote:

On Sat, 4 Jun 2016, Marco Chiappero wrote:


On 31/05/2016 16:34, Alan Stern wrote:

On Tue, 31 May 2016, Mathias Nyman wrote:



xhci specs say that port will move from Disconnected (PLS = RX_DETECT) to
Polling if "SuperSpeed far-end receiver terminations are detected"

   From power-off state (PP=0) the connect status changes are only reported
if OCA is set so it seems that is not an option.

I'm looking at the capture log and if I understand it correcty, in frame
49 SET FEATURE, PLS = ss.disabled   (xhci will write PORT_PED bit)
59 SET FEATURE, PLS = RxDetect  (xhci will set pls=RxDetect)
.. get_port_status, nothing set, no changes reported (several of these)
73 SET FEATURE -> Port remote wake mask
.. get_port status reports Connected, enabled and port status change event

Could it be that setting the port remote wake mask somehow triggers the
connect status change event? It's the first time we write to PORTSC for
this port after setting link state to rx.detect


It's possible, but I doubt it.  More likely it just takes a few hundred
ms for the RxDetect link training to realize there is a connection.

Marco, you can check whether this is true by doing (as root):

echo on >/sys/bus/usb/devices/usb4/power/control

before starting your test.

Alan Stern


Hi,

Sorry for my late reply but I've been very busy this week.

You can find the new captures in the same directory. Or just use the
following direct links:
https://drive.google.com/open?id=0B2tgr9hETOtgTEU1Qmc2eWNIYlE
https://drive.google.com/open?id=0B2tgr9hETOtgZUV2TkZRRGJjMWs

I could not notice any difference though. Anyway I'm happy to test
patches if you want to try changes.


The new capture proves my earlier point.  It shows the same sequence of
link-state changes as before (DISABLED -> RX_DETECT -> CONNECTED), but
the port's remote wakeup mask was not set.

As far as I can tell, USB-3 does not have any link state that will
avoid negotiating a connection with a plugged-in cable but will signal
the host when the cable is unplugged.  Perhaps the best we can do is
put the link into U3 with remote wakeup disabled on the device.

Mathias, what do you think?



I found a device that I can reproduce this with, I also tried on windows and the
analyzer shows "safely remove hardware" puts it into U3

So U3 makes sense, but hub_usb3_port_disable() is used in several places in usb 
core,
and I'm not sure if U3 fits all of them.

Right now I'm even suspecting that the drives that auto-remount are the ones 
that are
working better according to the specs.

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Unable to safely detach external HDD on USB 3.0

2016-06-08 Thread Mathias Nyman

On 08.06.2016 16:47, Mathias Nyman wrote:

On 06.06.2016 21:31, Alan Stern wrote:

On Sat, 4 Jun 2016, Marco Chiappero wrote:


On 31/05/2016 16:34, Alan Stern wrote:

On Tue, 31 May 2016, Mathias Nyman wrote:



xhci specs say that port will move from Disconnected (PLS = RX_DETECT) to
Polling if "SuperSpeed far-end receiver terminations are detected"

   From power-off state (PP=0) the connect status changes are only reported
if OCA is set so it seems that is not an option.

I'm looking at the capture log and if I understand it correcty, in frame
49 SET FEATURE, PLS = ss.disabled   (xhci will write PORT_PED bit)
59 SET FEATURE, PLS = RxDetect  (xhci will set pls=RxDetect)
.. get_port_status, nothing set, no changes reported (several of these)
73 SET FEATURE -> Port remote wake mask
.. get_port status reports Connected, enabled and port status change event

Could it be that setting the port remote wake mask somehow triggers the
connect status change event? It's the first time we write to PORTSC for
this port after setting link state to rx.detect


It's possible, but I doubt it.  More likely it just takes a few hundred
ms for the RxDetect link training to realize there is a connection.

Marco, you can check whether this is true by doing (as root):

echo on >/sys/bus/usb/devices/usb4/power/control

before starting your test.

Alan Stern


Hi,

Sorry for my late reply but I've been very busy this week.

You can find the new captures in the same directory. Or just use the
following direct links:
https://drive.google.com/open?id=0B2tgr9hETOtgTEU1Qmc2eWNIYlE
https://drive.google.com/open?id=0B2tgr9hETOtgZUV2TkZRRGJjMWs

I could not notice any difference though. Anyway I'm happy to test
patches if you want to try changes.


The new capture proves my earlier point.  It shows the same sequence of
link-state changes as before (DISABLED -> RX_DETECT -> CONNECTED), but
the port's remote wakeup mask was not set.

As far as I can tell, USB-3 does not have any link state that will
avoid negotiating a connection with a plugged-in cable but will signal
the host when the cable is unplugged.  Perhaps the best we can do is
put the link into U3 with remote wakeup disabled on the device.

Mathias, what do you think?



I found a device that I can reproduce this with, I also tried on windows and the
analyzer shows "safely remove hardware" puts it into U3

So U3 makes sense, but hub_usb3_port_disable() is used in several places in usb 
core,
and I'm not sure if U3 fits all of them.

Right now I'm even suspecting that the drives that auto-remount are the ones 
that are
working better according to the specs.



Tried a quick hack that sets link state in hub_usb3_port_disable() to U3 and 
return,
instead of SS_DISABLED -> RX_DETECT.

Seems to work, the mass storage no longer re-mounts after "udisks --detach"
Analyzer also shows that after unplugging the cable while device is in U3 the
device (upstream port) goes SS.Disabled, and downstream port (root hub) goes to 
RX_DETECT,
and new is detected again if cable is conneted back in.

-Mathias 


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[TESTPATCH] xhci: usb2 hw lpm extra debug

2016-06-15 Thread Mathias Nyman
custom testpatch to show more info about touching usb2 hw lpm
setting for usb3 devices

not for upstream, just testing

Signed-off-by: Mathias Nyman 
---
 drivers/usb/core/hub.c  | 32 +---
 drivers/usb/host/xhci.c | 11 +++
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index bee1351..14fcd87 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -3201,9 +3201,11 @@ int usb_port_suspend(struct usb_device *udev, 
pm_message_t msg)
}
 
/* disable USB2 hardware LPM */
-   if (udev->usb2_hw_lpm_enabled == 1)
+   if (udev->usb2_hw_lpm_enabled == 1) {
+   if (!udev->usb2_hw_lpm_capable)
+   dev_err(&udev->dev, "USB2 LPM disabling for non-capable 
dev\n");
usb_set_usb2_hardware_lpm(udev, 0);
-
+   }
if (usb_disable_ltm(udev)) {
dev_err(&udev->dev, "Failed to disable LTM before suspend\n.");
status = -ENOMEM;
@@ -4306,6 +4308,9 @@ static void hub_set_initial_usb2_lpm_policy(struct 
usb_device *udev)
if (!udev->usb2_hw_lpm_capable)
return;
 
+   if (udev->speed >= USB_SPEED_SUPER)
+   dev_err(&udev->dev, "USB2 LPM: set initial for SS devic\n");
+
if (hub)
connect_type = hub->ports[udev->portnum - 1]->connect_type;
 
@@ -4648,7 +4653,12 @@ hub_port_init(struct usb_hub *hub, struct usb_device 
*udev, int port1,
hcd->driver->update_device(hcd, udev);
hub_set_initial_usb2_lpm_policy(udev);
 fail:
+
+
if (retval) {
+   if (udev->usb2_hw_lpm_capable)
+   dev_err(&udev->dev, "USB2 LPM %s fails but capable is 
set, speed %d\n",
+   __func__, udev->speed);
hub_port_disable(hub, port1, 0);
update_devnum(udev, devnum);/* for disconnect processing */
}
@@ -4814,11 +4824,13 @@ static void hub_port_connect(struct usb_hub *hub, int 
port1, u16 portstatus,
udev->wusb = hub_is_wusb(hub);
 
/* Devices connected to SuperSpeed hubs are USB 3.0 or later */
-   if (hub_is_superspeed(hub->hdev))
+   if (hub_is_superspeed(hub->hdev)) {
udev->speed = USB_SPEED_SUPER;
-   else
+   if (udev->usb2_hw_lpm_capable)
+   dev_err(&udev->dev, "USB2 LPM capable set for 
fresh SS device\n");
+   } else {
udev->speed = USB_SPEED_UNKNOWN;
-
+   }
choose_devnum(udev);
if (udev->devnum <= 0) {
status = -ENOTCONN; /* Don't retry */
@@ -5444,9 +5456,15 @@ static int usb_reset_and_verify_device(struct usb_device 
*udev)
/* Disable USB2 hardware LPM.
 * It will be re-enabled by the enumeration process.
 */
-   if (udev->usb2_hw_lpm_enabled == 1)
+   if (udev->usb2_hw_lpm_enabled == 1) {
+   if (udev->speed >= USB_SPEED_SUPER)
+   dev_err(&udev->dev, " USB2 LPM %s for SS device\n",
+   __func__);
+   if (!udev->usb2_hw_lpm_capable)
+   dev_err(&udev->dev, " USB2 LPM %s for uncapable 
device\n",
+   __func__);
usb_set_usb2_hardware_lpm(udev, 0);
-
+   }
/* Disable LPM and LTM while we reset the device and reinstall the alt
 * settings.  Device-initiated LPM settings, and system exit latency
 * settings are cleared when the device is reset, so we have to set
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index fa7e1ef..fdda854 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4137,6 +4137,13 @@ int xhci_set_usb2_hardware_lpm(struct usb_hcd *hcd,
hlpm_addr = port_array[port_num] + PORTHLPMC;
field = le32_to_cpu(udev->bos->ext_cap->bmAttributes);
 
+   if (port_num >= xhci->num_usb2_ports)
+   dev_warn(&udev->dev, "USB2 LPM portnum %d > num_usb2_ports 
%d\n",
+port_num, xhci->num_usb2_ports);
+
+   if (udev->speed >= USB_SPEED_SUPER)
+   dev_warn(&udev->dev, "USB2 LPM for SS device!!\n");
+
xhci_dbg(xhci, "%s port %d USB2 hardware LPM\n",
enable ? "enable" : "disable", port_num + 1);
 
@@ -4247,6 +4254,10 @@ int xhci_update_device(struct usb_hcd *hcd, struct 
usb_device *udev)
xhci_check_usb2_port_capability(
xhci, portnu

[PATCH 00/14] xhci features for usb-next

2016-06-21 Thread Mathias Nyman
Hi Greg

These patches for 4.8 contain td-fragment work that ensures bulk tranfers
finally follows the aligment requirements in the xhci specs.

Also reworking helper functions for checking last and link trbs,
and cleanups for platform data

-Mathias

Heikki Krogerus (3):
  xhci: plat: adapt to unified device property interface
  usb: dwc3: host: use build-in property instead of platform data
  xhci: get rid of platform data

Mathias Nyman (11):
  xhci: rename ep_ring variable in queue_bulk_tx(), no functional change
  xhci: properly prepare zero packet TD after normal bulk TD.
  xhci: use boolean to indicate last trb in td remainder calculation
  xhci: don't rely on precalculated value of needed trbs in the enqueue
loop
  xhci: align the last trb before link if it is easily splittable.
  xhci: TD-fragment, align the unsplittable case with a bounce buffer
  xhci: clean up event ring checks from inc_enq()
  xhci: use and add separate function for checking for link trbs
  xhci: rework inc_deq() and fix off by one error.
  xhci: remove enqueue_is_link() helper
  xhci: rename and simplify last_trb_on_last_seg() helper

 drivers/usb/dwc3/host.c  |  18 +-
 drivers/usb/host/xhci-mem.c  |  74 ---
 drivers/usb/host/xhci-plat.c |   8 +-
 drivers/usb/host/xhci-ring.c | 457 ++-
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  10 +-
 include/linux/usb/xhci_pdriver.h |  27 ---
 7 files changed, 326 insertions(+), 273 deletions(-)
 delete mode 100644 include/linux/usb/xhci_pdriver.h

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/14] xhci: properly prepare zero packet TD after normal bulk TD.

2016-06-21 Thread Mathias Nyman
If a zero-length packet is needed after a bulk transfer, then an
additional zero length TD was prepared before enqueueing the bulk transfer
This set up the zero packet TD structure with incorrect td->start_seg
and td->first_trb pointers.

Prepare the zero packet TD after the data bulk TD is enqueued instead.
It sets these pointers correctly.

This change also simplifies unnecessary complexity related to keeping
track of the last trb when enqueuing trbs.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 38 +++---
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c3e9a60..15dd226 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3107,8 +3107,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
struct scatterlist *sg = NULL;
-   bool more_trbs_coming;
-   bool zero_length_needed;
+   bool more_trbs_coming = true;
+   bool need_zero_pkt = false;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
unsigned int running_total, block_len, trb_buff_len, full_len;
@@ -3139,17 +3139,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
last_trb_num = num_trbs - 1;
 
/* Deal with URB_ZERO_PACKET - need one more td/trb */
-   zero_length_needed = urb->transfer_flags & URB_ZERO_PACKET &&
-   urb_priv->length == 2;
-   if (zero_length_needed) {
-   num_trbs++;
-   xhci_dbg(xhci, "Creating zero length td.\n");
-   ret = prepare_transfer(xhci, xhci->devs[slot_id],
-   ep_index, urb->stream_id,
-   1, urb, 1, mem_flags);
-   if (unlikely(ret < 0))
-   return ret;
-   }
+   if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
+   need_zero_pkt = true;
 
td = urb_priv->td[0];
 
@@ -3207,12 +3198,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   if (i == last_trb_num)
-   td->last_trb = ring->enqueue;
-   else if (zero_length_needed) {
-   trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ring->enqueue;
-   }
+   more_trbs_coming = need_zero_pkt;
+   td->last_trb = ring->enqueue;
}
 
/* Only set interrupt on short packet for IN endpoints */
@@ -3228,10 +3215,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   if (i < num_trbs - 1)
-   more_trbs_coming = true;
-   else
-   more_trbs_coming = false;
queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
@@ -3253,6 +3236,15 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
}
}
 
+   if (need_zero_pkt) {
+   ret = prepare_transfer(xhci, xhci->devs[slot_id],
+  ep_index, urb->stream_id,
+  1, urb, 1, mem_flags);
+   urb_priv->td[1]->last_trb = ring->enqueue;
+   field = TRB_TYPE(TRB_NORMAL) | ring->cycle_state | TRB_IOC;
+   queue_trb(xhci, ring, 0, 0, 0, TRB_INTR_TARGET(0), field);
+   }
+
check_trb_math(urb, running_total);
giveback_first_trb(xhci, slot_id, ep_index, urb->stream_id,
start_cycle, start_trb);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/14] xhci: use boolean to indicate last trb in td remainder calculation

2016-06-21 Thread Mathias Nyman
We only need to know if we are queuing the last trb for a TD when
calculating the td remainder field.
The total number of trbs left is not used.

We won't be able to trust the pre-calculated number of trbs used if we
need to align trb data by splitting or merging trbs in order to satisfy
comply with data alignment requirements in xhci specs section 4.11.7.1.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 15dd226..f74ac1c 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3074,7 +3074,7 @@ int xhci_queue_intr_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
  */
 static u32 xhci_td_remainder(struct xhci_hcd *xhci, int transferred,
  int trb_buff_len, unsigned int td_total_len,
- struct urb *urb, unsigned int num_trbs_left)
+ struct urb *urb, bool more_trbs_coming)
 {
u32 maxp, total_packet_count;
 
@@ -3083,7 +3083,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return ((td_total_len - transferred) >> 10);
 
/* One TRB with a zero-length data packet. */
-   if (num_trbs_left == 0 || (transferred == 0 && trb_buff_len == 0) ||
+   if (!more_trbs_coming || (transferred == 0 && trb_buff_len == 0) ||
trb_buff_len == td_total_len)
return 0;
 
@@ -3198,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
field |= TRB_CHAIN;
} else {
field |= TRB_IOC;
-   more_trbs_coming = need_zero_pkt;
+   more_trbs_coming = false;
td->last_trb = ring->enqueue;
}
 
@@ -3209,13 +3209,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
/* Set the TRB length, TD size, and interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
trb_buff_len, full_len,
-   urb, num_trbs - i - 1);
-
+   urb, more_trbs_coming);
length_field = TRB_LEN(trb_buff_len) |
TRB_TD_SIZE(remainder) |
TRB_INTR_TARGET(0);
 
-   queue_trb(xhci, ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming | need_zero_pkt,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
@@ -3639,7 +3638,7 @@ static int xhci_queue_isoc_tx(struct xhci_hcd *xhci, 
gfp_t mem_flags,
/* Set the TRB length, TD size, & interrupter fields. */
remainder = xhci_td_remainder(xhci, running_total,
   trb_buff_len, td_len,
-  urb, trbs_per_td - j - 1);
+  urb, more_trbs_coming);
 
length_field = TRB_LEN(trb_buff_len) |
TRB_INTR_TARGET(0);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/14] xhci: TD-fragment, align the unsplittable case with a bounce buffer

2016-06-21 Thread Mathias Nyman
If the last trb before a link is not packet size aligned, and is not
splittable then use a bounce buffer for that chunk of max packet size
unalignable data.

Allocate a max packet size bounce buffer for every segment of a bulk
endpoint ring at the same time as allocating the ring.
If we need to align the data before the link trb in that segment then
copy the data to the segment bounce buffer, dma map it, and enqueue it.
Once the td finishes, or is cancelled, unmap it.

For in transfers we need to first map the bounce buffer, then queue it,
after it finishes, copy the bounce buffer to the original sg list, and
finally unmap it

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-mem.c  |  74 ++
 drivers/usb/host/xhci-ring.c | 106 +++
 drivers/usb/host/xhci.c  |   5 +-
 drivers/usb/host/xhci.h  |  10 +++-
 4 files changed, 155 insertions(+), 40 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index bad0d1f..6afe323 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -37,7 +37,9 @@
  * "All components of all Command and Transfer TRBs shall be initialized to 
'0'"
  */
 static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci,
-   unsigned int cycle_state, gfp_t flags)
+  unsigned int cycle_state,
+  unsigned int max_packet,
+  gfp_t flags)
 {
struct xhci_segment *seg;
dma_addr_t  dma;
@@ -53,6 +55,14 @@ static struct xhci_segment *xhci_segment_alloc(struct 
xhci_hcd *xhci,
return NULL;
}
 
+   if (max_packet) {
+   seg->bounce_buf = kzalloc(max_packet, flags | GFP_DMA);
+   if (!seg->bounce_buf) {
+   dma_pool_free(xhci->segment_pool, seg->trbs, dma);
+   kfree(seg);
+   return NULL;
+   }
+   }
/* If the cycle state is 0, set the cycle bit to 1 for all the TRBs */
if (cycle_state == 0) {
for (i = 0; i < TRBS_PER_SEGMENT; i++)
@@ -70,6 +80,7 @@ static void xhci_segment_free(struct xhci_hcd *xhci, struct 
xhci_segment *seg)
dma_pool_free(xhci->segment_pool, seg->trbs, seg->dma);
seg->trbs = NULL;
}
+   kfree(seg->bounce_buf);
kfree(seg);
 }
 
@@ -317,11 +328,11 @@ static void xhci_initialize_ring_info(struct xhci_ring 
*ring,
 static int xhci_alloc_segments_for_ring(struct xhci_hcd *xhci,
struct xhci_segment **first, struct xhci_segment **last,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_segment *prev;
 
-   prev = xhci_segment_alloc(xhci, cycle_state, flags);
+   prev = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!prev)
return -ENOMEM;
num_segs--;
@@ -330,7 +341,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
while (num_segs > 0) {
struct xhci_segment *next;
 
-   next = xhci_segment_alloc(xhci, cycle_state, flags);
+   next = xhci_segment_alloc(xhci, cycle_state, max_packet, flags);
if (!next) {
prev = *first;
while (prev) {
@@ -360,7 +371,7 @@ static int xhci_alloc_segments_for_ring(struct xhci_hcd 
*xhci,
  */
 static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd *xhci,
unsigned int num_segs, unsigned int cycle_state,
-   enum xhci_ring_type type, gfp_t flags)
+   enum xhci_ring_type type, unsigned int max_packet, gfp_t flags)
 {
struct xhci_ring*ring;
int ret;
@@ -370,13 +381,15 @@ static struct xhci_ring *xhci_ring_alloc(struct xhci_hcd 
*xhci,
return NULL;
 
ring->num_segs = num_segs;
+   ring->bounce_buf_len = max_packet;
INIT_LIST_HEAD(&ring->td_list);
ring->type = type;
if (num_segs == 0)
return ring;
 
ret = xhci_alloc_segments_for_ring(xhci, &ring->first_seg,
-   &ring->last_seg, num_segs, cycle_state, type, flags);
+   &ring->last_seg, num_segs, cycle_state, type,
+   max_packet, flags);
if (ret)
goto fail;
 
@@ -470,7 +483,8 @@ int xhci_ring_expansion(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
ring->num_segs : num_segs_needed;
 
ret = xhci_alloc_segments_for_ring(xhci, &first, &last,
-  

[PATCH 08/14] xhci: use and add separate function for checking for link trbs

2016-06-21 Thread Mathias Nyman
Add a new is_link_trb() function that only checks for link trbs.
We want to split generic last_trb() function which is used for both
event rings without link trbs, and endpoint and command rings with links.

This will allow us to easier check for link trbs added mid segments.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c13b842..4de8a2b 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -115,6 +115,11 @@ static int last_trb(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
return TRB_TYPE_LINK_LE32(trb->link.control);
 }
 
+static bool trb_is_link(union xhci_trb *trb)
+{
+   return TRB_TYPE_LINK_LE32(trb->link.control);
+}
+
 static int enqueue_is_link_trb(struct xhci_ring *ring)
 {
struct xhci_link_trb *link = &ring->enqueue->link;
@@ -130,7 +135,7 @@ static void next_trb(struct xhci_hcd *xhci,
struct xhci_segment **seg,
union xhci_trb **trb)
 {
-   if (last_trb(xhci, ring, *seg, *trb)) {
+   if (trb_is_link(*trb)) {
*seg = (*seg)->next;
*trb = ((*seg)->trbs);
} else {
@@ -150,8 +155,7 @@ static void inc_deq(struct xhci_hcd *xhci, struct xhci_ring 
*ring)
 * If this is not event ring, and the dequeue pointer
 * is not on a link TRB, there is one more usable TRB
 */
-   if (ring->type != TYPE_EVENT &&
-   !last_trb(xhci, ring, ring->deq_seg, ring->dequeue))
+   if (ring->type != TYPE_EVENT && !trb_is_link(ring->dequeue))
ring->num_trbs_free++;
 
do {
@@ -199,13 +203,13 @@ static void inc_enq(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
 
chain = le32_to_cpu(ring->enqueue->generic.field[3]) & TRB_CHAIN;
/* If this is not event ring, there is one less usable TRB */
-   if (!last_trb(xhci, ring, ring->enq_seg, ring->enqueue))
+   if (!trb_is_link(ring->enqueue))
ring->num_trbs_free--;
next = ++(ring->enqueue);
 
ring->enq_updates++;
/* Update the dequeue pointer further if that was a link TRB */
-   while (last_trb(xhci, ring, ring->enq_seg, next)) {
+   while (trb_is_link(next)) {
 
/*
 * If the caller doesn't plan on enqueueing more TDs before
@@ -931,7 +935,7 @@ static void update_ring_for_set_deq_completion(struct 
xhci_hcd *xhci,
 * the dequeue pointer one segment further, or we'll jump off
 * the segment into la-la-land.
 */
-   if (last_trb(xhci, ep_ring, ep_ring->deq_seg, ep_ring->dequeue)) {
+   if (trb_is_link(ep_ring->dequeue)) {
ep_ring->deq_seg = ep_ring->deq_seg->next;
ep_ring->dequeue = ep_ring->deq_seg->trbs;
}
@@ -940,8 +944,7 @@ static void update_ring_for_set_deq_completion(struct 
xhci_hcd *xhci,
/* We have more usable TRBs */
ep_ring->num_trbs_free++;
ep_ring->dequeue++;
-   if (last_trb(xhci, ep_ring, ep_ring->deq_seg,
-   ep_ring->dequeue)) {
+   if (trb_is_link(ep_ring->dequeue)) {
if (ep_ring->dequeue ==
dev->eps[ep_index].queued_deq_ptr)
break;
@@ -2880,7 +2883,7 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
xhci_ring *ep_ring,
 
next = ring->enqueue;
 
-   while (last_trb(xhci, ring, ring->enq_seg, next)) {
+   while (trb_is_link(next)) {
/* If we're not dealing with 0.95 hardware or isoc rings
 * on AMD 0.96 host, clear the chain bit.
 */
@@ -3269,8 +3272,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 */
if (enqd_len + trb_buff_len < full_len) {
field |= TRB_CHAIN;
-   if (last_trb(xhci, ring, ring->enq_seg,
-ring->enqueue + 1)) {
+   if (trb_is_link(ring->enqueue + 1)) {
if (xhci_align_td(xhci, urb, enqd_len,
  &trb_buff_len,
  ring->enq_seg)) {
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/14] xhci: clean up event ring checks from inc_enq()

2016-06-21 Thread Mathias Nyman
Remove the event ring related checks in inc_enq()

Host hardware is the producer of events on the event ring,
driver will not queue anything, or call inc_enq() for the
event ring.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 64 +++-
 1 file changed, 28 insertions(+), 36 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 142e53e..c13b842 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -199,50 +199,42 @@ static void inc_enq(struct xhci_hcd *xhci, struct 
xhci_ring *ring,
 
chain = le32_to_cpu(ring->enqueue->generic.field[3]) & TRB_CHAIN;
/* If this is not event ring, there is one less usable TRB */
-   if (ring->type != TYPE_EVENT &&
-   !last_trb(xhci, ring, ring->enq_seg, ring->enqueue))
+   if (!last_trb(xhci, ring, ring->enq_seg, ring->enqueue))
ring->num_trbs_free--;
next = ++(ring->enqueue);
 
ring->enq_updates++;
-   /* Update the dequeue pointer further if that was a link TRB or we're at
-* the end of an event ring segment (which doesn't have link TRBS)
-*/
+   /* Update the dequeue pointer further if that was a link TRB */
while (last_trb(xhci, ring, ring->enq_seg, next)) {
-   if (ring->type != TYPE_EVENT) {
-   /*
-* If the caller doesn't plan on enqueueing more
-* TDs before ringing the doorbell, then we
-* don't want to give the link TRB to the
-* hardware just yet.  We'll give the link TRB
-* back in prepare_ring() just before we enqueue
-* the TD at the top of the ring.
-*/
-   if (!chain && !more_trbs_coming)
-   break;
 
-   /* If we're not dealing with 0.95 hardware or
-* isoc rings on AMD 0.96 host,
-* carry over the chain bit of the previous TRB
-* (which may mean the chain bit is cleared).
-*/
-   if (!(ring->type == TYPE_ISOC &&
-   (xhci->quirks & XHCI_AMD_0x96_HOST))
-   && !xhci_link_trb_quirk(xhci)) {
-   next->link.control &=
-   cpu_to_le32(~TRB_CHAIN);
-   next->link.control |=
-   cpu_to_le32(chain);
-   }
-   /* Give this link TRB to the hardware */
-   wmb();
-   next->link.control ^= cpu_to_le32(TRB_CYCLE);
+   /*
+* If the caller doesn't plan on enqueueing more TDs before
+* ringing the doorbell, then we don't want to give the link TRB
+* to the hardware just yet. We'll give the link TRB back in
+* prepare_ring() just before we enqueue the TD at the top of
+* the ring.
+*/
+   if (!chain && !more_trbs_coming)
+   break;
 
-   /* Toggle the cycle bit after the last ring segment. */
-   if (last_trb_on_last_seg(xhci, ring, ring->enq_seg, 
next)) {
-   ring->cycle_state ^= 1;
-   }
+   /* If we're not dealing with 0.95 hardware or isoc rings on
+* AMD 0.96 host, carry over the chain bit of the previous TRB
+* (which may mean the chain bit is cleared).
+*/
+   if (!(ring->type == TYPE_ISOC &&
+ (xhci->quirks & XHCI_AMD_0x96_HOST)) &&
+   !xhci_link_trb_quirk(xhci)) {
+   next->link.control &= cpu_to_le32(~TRB_CHAIN);
+   next->link.control |= cpu_to_le32(chain);
}
+   /* Give this link TRB to the hardware */
+   wmb();
+   next->link.control ^= cpu_to_le32(TRB_CYCLE);
+
+   /* Toggle the cycle bit after the last ring segment. */
+   if (last_trb_on_last_seg(xhci, ring, ring->enq_seg, next))
+   ring->cycle_state ^= 1;
+
ring->enq_seg = ring->enq_seg->next;
ring->enqueue = ring->enq_seg->trbs;
next = ring->enqueue;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/14] xhci: rename ep_ring variable in queue_bulk_tx(), no functional change

2016-06-21 Thread Mathias Nyman
Tiny change, a bit more readable.
The real reason for this change is that the coming td fragment work
had several over 80 lines character lines split just because of a few
extra characters in variable names.

no functional changes

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52deae4..c3e9a60 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3102,7 +3102,7 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
 {
-   struct xhci_ring *ep_ring;
+   struct xhci_ring *ring;
struct urb_priv *urb_priv;
struct xhci_td *td;
struct xhci_generic_trb *start_trb;
@@ -3111,14 +3111,13 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
bool zero_length_needed;
unsigned int num_trbs, last_trb_num, i;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len;
-   unsigned int full_len;
+   unsigned int running_total, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
 
-   ep_ring = xhci_urb_to_transfer_ring(xhci, urb);
-   if (!ep_ring)
+   ring = xhci_urb_to_transfer_ring(xhci, urb);
+   if (!ring)
return -EINVAL;
 
/* If we have scatter/gather list, we use it. */
@@ -3159,8 +3158,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 * until we've finished creating all the other TRBs.  The ring's cycle
 * state may change as we enqueue the other TRBs, so save it too.
 */
-   start_trb = &ep_ring->enqueue->generic;
-   start_cycle = ep_ring->cycle_state;
+   start_trb = &ring->enqueue->generic;
+   start_cycle = ring->cycle_state;
 
full_len = urb->transfer_buffer_length;
running_total = 0;
@@ -3199,7 +3198,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
-   field |= ep_ring->cycle_state;
+   field |= ring->cycle_state;
 
/* Chain all the TRBs together; clear the chain bit in the last
 * TRB to indicate it's the last TRB in the chain.
@@ -3209,10 +3208,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
} else {
field |= TRB_IOC;
if (i == last_trb_num)
-   td->last_trb = ep_ring->enqueue;
+   td->last_trb = ring->enqueue;
else if (zero_length_needed) {
trb_buff_len = 0;
-   urb_priv->td[1]->last_trb = ep_ring->enqueue;
+   urb_priv->td[1]->last_trb = ring->enqueue;
}
}
 
@@ -3233,7 +3232,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
more_trbs_coming = true;
else
more_trbs_coming = false;
-   queue_trb(xhci, ep_ring, more_trbs_coming,
+   queue_trb(xhci, ring, more_trbs_coming,
lower_32_bits(addr),
upper_32_bits(addr),
length_field,
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/14] xhci: don't rely on precalculated value of needed trbs in the enqueue loop

2016-06-21 Thread Mathias Nyman
Queue trbs until all payload data in the urb is tranferred.

The actual number of trbs might need to change from the pre-calculated
number when the packet alignment restrictions for td fragments in
xhci 4.11.7.1 are taken into account.

Long term plan is to get rid of calculating the needed trbs in advance
all together. It's an unnecessary extra walk through the scatterlist.

This change also allows some bulk queue function simplifications

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 75 +---
 1 file changed, 29 insertions(+), 46 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f74ac1c..d86da81 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3109,9 +3109,10 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
struct scatterlist *sg = NULL;
bool more_trbs_coming = true;
bool need_zero_pkt = false;
-   unsigned int num_trbs, last_trb_num, i;
+   bool first_trb = true;
+   unsigned int num_trbs;
unsigned int start_cycle, num_sgs = 0;
-   unsigned int running_total, block_len, trb_buff_len, full_len;
+   unsigned int enqd_len, block_len, trb_buff_len, full_len;
int ret;
u32 field, length_field, remainder;
u64 addr;
@@ -3120,14 +3121,19 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
if (!ring)
return -EINVAL;
 
+   full_len = urb->transfer_buffer_length;
/* If we have scatter/gather list, we use it. */
if (urb->num_sgs) {
num_sgs = urb->num_mapped_sgs;
sg = urb->sg;
+   addr = (u64) sg_dma_address(sg);
+   block_len = sg_dma_len(sg);
num_trbs = count_sg_trbs_needed(urb);
-   } else
+   } else {
num_trbs = count_trbs_needed(urb);
-
+   addr = (u64) urb->transfer_dma;
+   block_len = full_len;
+   }
ret = prepare_transfer(xhci, xhci->devs[slot_id],
ep_index, urb->stream_id,
num_trbs, urb, 0, mem_flags);
@@ -3136,8 +3142,6 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 
urb_priv = urb->hcpriv;
 
-   last_trb_num = num_trbs - 1;
-
/* Deal with URB_ZERO_PACKET - need one more td/trb */
if (urb->transfer_flags & URB_ZERO_PACKET && urb_priv->length > 1)
need_zero_pkt = true;
@@ -3152,40 +3156,20 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
start_trb = &ring->enqueue->generic;
start_cycle = ring->cycle_state;
 
-   full_len = urb->transfer_buffer_length;
-   running_total = 0;
-   block_len = 0;
-
/* Queue the TRBs, even if they are zero-length */
-   for (i = 0; i < num_trbs; i++) {
+   for (enqd_len = 0; enqd_len < full_len; enqd_len += trb_buff_len) {
field = TRB_TYPE(TRB_NORMAL);
 
-   if (block_len == 0) {
-   /* A new contiguous block. */
-   if (sg) {
-   addr = (u64) sg_dma_address(sg);
-   block_len = sg_dma_len(sg);
-   } else {
-   addr = (u64) urb->transfer_dma;
-   block_len = full_len;
-   }
-   /* TRB buffer should not cross 64KB boundaries */
-   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
-   trb_buff_len = min_t(unsigned int,
-   trb_buff_len,
-   block_len);
-   } else {
-   /* Further through the contiguous block. */
-   trb_buff_len = block_len;
-   if (trb_buff_len > TRB_MAX_BUFF_SIZE)
-   trb_buff_len = TRB_MAX_BUFF_SIZE;
-   }
+   /* TRB buffer should not cross 64KB boundaries */
+   trb_buff_len = TRB_BUFF_LEN_UP_TO_BOUNDARY(addr);
+   trb_buff_len = min_t(unsigned int, trb_buff_len, block_len);
 
-   if (running_total + trb_buff_len > full_len)
-   trb_buff_len = full_len - running_total;
+   if (enqd_len + trb_buff_len > full_len)
+   trb_buff_len = full_len - enqd_len;
 
/* Don't change the cycle bit of the first TRB until later */
-   if (i == 0) {
+   if (first_trb) {
+   first_trb = false;
if (start_cycle == 0)
field |= TRB_CYCLE;
} else
@@ -3194,7 +3178,7 @@ int xhci_que

[PATCH 05/14] xhci: align the last trb before link if it is easily splittable.

2016-06-21 Thread Mathias Nyman
TD fragments section 4.11.7.1 in xhci specs have additional requirements
on how trbs in TDs must be organized.

TD fragments shall not span transfer ring segments and TD fragments must
be packet aligned. Normally we don't care about TD fragments, on TD is one
big fragment, but if a TD spans ring segments it will be treated as two
fragments, and we need to comply with the alignment requirements.

For us this means that the payload data must be packet aligned in the
last trb before a link trb.
In most mass storage bulk tranfers we are lucky as the block size aligns
nicely with packet size, and there are no issues.
However, usb network adapters using scatterlists can hit this alignment
issue, and usbtest in kernel triggers this in minutes.

This patch is a partial solution, it solves the easy case when the last
trb before the link trb contains a packet boundary.
If that is the case then just split the trb at the boundary.
If not, then just print a debug message and continue as we have always
done, hoping for the best

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d86da81..bf9ffa4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3098,6 +3098,27 @@ static u32 xhci_td_remainder(struct xhci_hcd *xhci, int 
transferred,
return (total_packet_count - ((transferred + trb_buff_len) / maxp));
 }
 
+static int xhci_align_td(struct xhci_hcd *xhci, struct urb *urb, u32 enqd_len,
+u32 *trb_buff_len)
+{
+   unsigned int unalign;
+   unsigned int max_pkt;
+
+   max_pkt = GET_MAX_PACKET(usb_endpoint_maxp(&urb->ep->desc));
+   unalign = (enqd_len + *trb_buff_len) % max_pkt;
+
+   /* we got lucky, last normal TRB data on segment is packet aligned */
+   if (unalign == 0)
+   return 0;
+
+   /* is the last nornal TRB alignable by splitting it */
+   if (*trb_buff_len > unalign) {
+   *trb_buff_len -= unalign;
+   return 0;
+   }
+   return 1;
+}
+
 /* This is very similar to what ehci-q.c qtd_fill() does */
 int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
struct urb *urb, int slot_id, unsigned int ep_index)
@@ -3180,6 +3201,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t 
mem_flags,
 */
if (enqd_len + trb_buff_len < full_len) {
field |= TRB_CHAIN;
+   if (last_trb(xhci, ring, ring->enq_seg,
+ring->enqueue + 1)) {
+   if (xhci_align_td(xhci, urb, enqd_len,
+&trb_buff_len))
+   xhci_dbg(xhci, "TRB align fail\n");
+   }
} else {
field |= TRB_IOC;
more_trbs_coming = false;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/14] xhci: remove enqueue_is_link() helper

2016-06-21 Thread Mathias Nyman
Only used in one place, replace with trb_is_link() helper

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 086b871..2d8dcb1f 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -107,12 +107,6 @@ static bool trb_is_link(union xhci_trb *trb)
return TRB_TYPE_LINK_LE32(trb->link.control);
 }
 
-static int enqueue_is_link_trb(struct xhci_ring *ring)
-{
-   struct xhci_link_trb *link = &ring->enqueue->link;
-   return TRB_TYPE_LINK_LE32(link->control);
-}
-
 static bool last_trb_on_seg(struct xhci_segment *seg, union xhci_trb *trb)
 {
return trb == &seg->trbs[TRBS_PER_SEGMENT - 1];
@@ -2873,7 +2867,7 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
xhci_ring *ep_ring,
}
}
 
-   if (enqueue_is_link_trb(ep_ring)) {
+   if (trb_is_link(ep_ring->enqueue)) {
struct xhci_ring *ring = ep_ring;
union xhci_trb *next;
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/14] xhci: rework inc_deq() and fix off by one error.

2016-06-21 Thread Mathias Nyman
inc_deq() is called both for rings with link trbs and the event ring
without link trbs.
The last_trb() check in inc_deq() has a off by one error, going beyond
allocated array when checking if trb == [TRBS_PER_SEGMENT], and the whole
inc_deq() depend on this.

Rewrite the inc_deq() funciton, remove the faulty last_trb() helper, add
new last_trb_on_seg() and last_trb_on_ring() helpers

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 68 +---
 1 file changed, 32 insertions(+), 36 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 4de8a2b..086b871 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -102,19 +102,6 @@ static bool last_trb_on_last_seg(struct xhci_hcd *xhci, 
struct xhci_ring *ring,
return le32_to_cpu(trb->link.control) & LINK_TOGGLE;
 }
 
-/* Is this TRB a link TRB or was the last TRB the last TRB in this event ring
- * segment?  I.e. would the updated event TRB pointer step off the end of the
- * event seg?
- */
-static int last_trb(struct xhci_hcd *xhci, struct xhci_ring *ring,
-   struct xhci_segment *seg, union xhci_trb *trb)
-{
-   if (ring == xhci->event_ring)
-   return trb == &seg->trbs[TRBS_PER_SEGMENT];
-   else
-   return TRB_TYPE_LINK_LE32(trb->link.control);
-}
-
 static bool trb_is_link(union xhci_trb *trb)
 {
return TRB_TYPE_LINK_LE32(trb->link.control);
@@ -126,6 +113,17 @@ static int enqueue_is_link_trb(struct xhci_ring *ring)
return TRB_TYPE_LINK_LE32(link->control);
 }
 
+static bool last_trb_on_seg(struct xhci_segment *seg, union xhci_trb *trb)
+{
+   return trb == &seg->trbs[TRBS_PER_SEGMENT - 1];
+}
+
+static bool last_trb_on_ring(struct xhci_ring *ring,
+   struct xhci_segment *seg, union xhci_trb *trb)
+{
+   return last_trb_on_seg(seg, trb) && (seg->next == ring->first_seg);
+}
+
 /* Updates trb to point to the next TRB in the ring, and updates seg if the 
next
  * TRB is in a new segment.  This does not skip over link TRBs, and it does not
  * effect the ring dequeue or enqueue pointers.
@@ -151,31 +149,29 @@ static void inc_deq(struct xhci_hcd *xhci, struct 
xhci_ring *ring)
 {
ring->deq_updates++;
 
-   /*
-* If this is not event ring, and the dequeue pointer
-* is not on a link TRB, there is one more usable TRB
-*/
-   if (ring->type != TYPE_EVENT && !trb_is_link(ring->dequeue))
-   ring->num_trbs_free++;
-
-   do {
-   /*
-* Update the dequeue pointer further if that was a link TRB or
-* we're at the end of an event ring segment (which doesn't have
-* link TRBS)
-*/
-   if (last_trb(xhci, ring, ring->deq_seg, ring->dequeue)) {
-   if (ring->type == TYPE_EVENT &&
-   last_trb_on_last_seg(xhci, ring,
-   ring->deq_seg, ring->dequeue)) {
-   ring->cycle_state ^= 1;
-   }
-   ring->deq_seg = ring->deq_seg->next;
-   ring->dequeue = ring->deq_seg->trbs;
-   } else {
+   /* event ring doesn't have link trbs, check for last trb */
+   if (ring->type == TYPE_EVENT) {
+   if (!last_trb_on_seg(ring->deq_seg, ring->dequeue)) {
ring->dequeue++;
+   return;
}
-   } while (last_trb(xhci, ring, ring->deq_seg, ring->dequeue));
+   if (last_trb_on_ring(ring, ring->deq_seg, ring->dequeue))
+   ring->cycle_state ^= 1;
+   ring->deq_seg = ring->deq_seg->next;
+   ring->dequeue = ring->deq_seg->trbs;
+   return;
+   }
+
+   /* All other rings have link trbs */
+   if (!trb_is_link(ring->dequeue)) {
+   ring->dequeue++;
+   ring->num_trbs_free++;
+   }
+   while (trb_is_link(ring->dequeue)) {
+   ring->deq_seg = ring->deq_seg->next;
+   ring->dequeue = ring->deq_seg->trbs;
+   }
+   return;
 }
 
 /*
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/14] xhci: rename and simplify last_trb_on_last_seg() helper

2016-06-21 Thread Mathias Nyman
It's only used with rings that have link trbs

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 65 +---
 1 file changed, 25 insertions(+), 40 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 2d8dcb1f..26ed512 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -89,19 +89,6 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
return seg->dma + (segment_offset * sizeof(*trb));
 }
 
-/* Does this link TRB point to the first segment in a ring,
- * or was the previous TRB the last TRB on the last segment in the ERST?
- */
-static bool last_trb_on_last_seg(struct xhci_hcd *xhci, struct xhci_ring *ring,
-   struct xhci_segment *seg, union xhci_trb *trb)
-{
-   if (ring == xhci->event_ring)
-   return (trb == &seg->trbs[TRBS_PER_SEGMENT]) &&
-   (seg->next == xhci->event_ring->first_seg);
-   else
-   return le32_to_cpu(trb->link.control) & LINK_TOGGLE;
-}
-
 static bool trb_is_link(union xhci_trb *trb)
 {
return TRB_TYPE_LINK_LE32(trb->link.control);
@@ -118,6 +105,11 @@ static bool last_trb_on_ring(struct xhci_ring *ring,
return last_trb_on_seg(seg, trb) && (seg->next == ring->first_seg);
 }
 
+static bool link_trb_toggles_cycle(union xhci_trb *trb)
+{
+   return le32_to_cpu(trb->link.control) & LINK_TOGGLE;
+}
+
 /* Updates trb to point to the next TRB in the ring, and updates seg if the 
next
  * TRB is in a new segment.  This does not skip over link TRBs, and it does not
  * effect the ring dequeue or enqueue pointers.
@@ -226,7 +218,7 @@ static void inc_enq(struct xhci_hcd *xhci, struct xhci_ring 
*ring,
next->link.control ^= cpu_to_le32(TRB_CYCLE);
 
/* Toggle the cycle bit after the last ring segment. */
-   if (last_trb_on_last_seg(xhci, ring, ring->enq_seg, next))
+   if (link_trb_toggles_cycle(next))
ring->cycle_state ^= 1;
 
ring->enq_seg = ring->enq_seg->next;
@@ -2867,36 +2859,29 @@ static int prepare_ring(struct xhci_hcd *xhci, struct 
xhci_ring *ep_ring,
}
}
 
-   if (trb_is_link(ep_ring->enqueue)) {
-   struct xhci_ring *ring = ep_ring;
-   union xhci_trb *next;
-
-   next = ring->enqueue;
+   while (trb_is_link(ep_ring->enqueue)) {
+   /* If we're not dealing with 0.95 hardware or isoc rings
+* on AMD 0.96 host, clear the chain bit.
+*/
+   if (!xhci_link_trb_quirk(xhci) &&
+   !(ep_ring->type == TYPE_ISOC &&
+ (xhci->quirks & XHCI_AMD_0x96_HOST)))
+   ep_ring->enqueue->link.control &=
+   cpu_to_le32(~TRB_CHAIN);
+   else
+   ep_ring->enqueue->link.control |=
+   cpu_to_le32(TRB_CHAIN);
 
-   while (trb_is_link(next)) {
-   /* If we're not dealing with 0.95 hardware or isoc rings
-* on AMD 0.96 host, clear the chain bit.
-*/
-   if (!xhci_link_trb_quirk(xhci) &&
-   !(ring->type == TYPE_ISOC &&
-(xhci->quirks & XHCI_AMD_0x96_HOST)))
-   next->link.control &= cpu_to_le32(~TRB_CHAIN);
-   else
-   next->link.control |= cpu_to_le32(TRB_CHAIN);
+   wmb();
+   ep_ring->enqueue->link.control ^= cpu_to_le32(TRB_CYCLE);
 
-   wmb();
-   next->link.control ^= cpu_to_le32(TRB_CYCLE);
+   /* Toggle the cycle bit after the last ring segment. */
+   if (link_trb_toggles_cycle(ep_ring->enqueue))
+   ep_ring->cycle_state ^= 1;
 
-   /* Toggle the cycle bit after the last ring segment. */
-   if (last_trb_on_last_seg(xhci, ring, ring->enq_seg, 
next)) {
-   ring->cycle_state ^= 1;
-   }
-   ring->enq_seg = ring->enq_seg->next;
-   ring->enqueue = ring->enq_seg->trbs;
-   next = ring->enqueue;
-   }
+   ep_ring->enq_seg = ep_ring->enq_seg->next;
+   ep_ring->enqueue = ep_ring->enq_seg->trbs;
}
-
return 0;
 }
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 13/14] usb: dwc3: host: use build-in property instead of platform data

2016-06-21 Thread Mathias Nyman
From: Heikki Krogerus 

This should allow xhci to remove handling of platform data.

Signed-off-by: Heikki Krogerus 
Cc: Felipe Balbi 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/dwc3/host.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
index c679f63..67f90d7 100644
--- a/drivers/usb/dwc3/host.c
+++ b/drivers/usb/dwc3/host.c
@@ -16,14 +16,13 @@
  */
 
 #include 
-#include 
 
 #include "core.h"
 
 int dwc3_host_init(struct dwc3 *dwc)
 {
+   struct property_entry   props[2];
struct platform_device  *xhci;
-   struct usb_xhci_pdata   pdata;
int ret;
 
xhci = platform_device_alloc("xhci-hcd", PLATFORM_DEVID_AUTO);
@@ -47,14 +46,15 @@ int dwc3_host_init(struct dwc3 *dwc)
goto err1;
}
 
-   memset(&pdata, 0, sizeof(pdata));
+   memset(props, 0, sizeof(struct property_entry) * ARRAY_SIZE(props));
 
-   pdata.usb3_lpm_capable = dwc->usb3_lpm_capable;
-
-   ret = platform_device_add_data(xhci, &pdata, sizeof(pdata));
-   if (ret) {
-   dev_err(dwc->dev, "couldn't add platform data to xHCI 
device\n");
-   goto err1;
+   if (dwc->usb3_lpm_capable) {
+   props[0].name = "usb3-lpm-capable";
+   ret = platform_device_add_properties(xhci, props);
+   if (ret) {
+   dev_err(dwc->dev, "failed to add properties to xHCI\n");
+   goto err1;
+   }
}
 
phy_create_lookup(dwc->usb2_generic_phy, "usb2-phy",
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 12/14] xhci: plat: adapt to unified device property interface

2016-06-21 Thread Mathias Nyman
From: Heikki Krogerus 

Requesting the only property that the driver needs using the
unified device property interface so it will be available
for all types of platforms, not just the ones using DT.

Signed-off-by: Heikki Krogerus 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-plat.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index 676ea45..ff916b3 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -138,7 +138,6 @@ MODULE_DEVICE_TABLE(of, usb_xhci_of_match);
 
 static int xhci_plat_probe(struct platform_device *pdev)
 {
-   struct device_node  *node = pdev->dev.of_node;
struct usb_xhci_pdata   *pdata = dev_get_platdata(&pdev->dev);
const struct of_device_id *match;
const struct hc_driver  *driver;
@@ -199,7 +198,7 @@ static int xhci_plat_probe(struct platform_device *pdev)
}
 
xhci = hcd_to_xhci(hcd);
-   match = of_match_node(usb_xhci_of_match, node);
+   match = of_match_node(usb_xhci_of_match, pdev->dev.of_node);
if (match) {
const struct xhci_plat_priv *priv_match = match->data;
struct xhci_plat_priv *priv = hcd_to_xhci_priv(hcd);
@@ -220,7 +219,7 @@ static int xhci_plat_probe(struct platform_device *pdev)
goto disable_clk;
}
 
-   if ((node && of_property_read_bool(node, "usb3-lpm-capable")) ||
+   if (device_property_read_bool(&pdev->dev, "usb3-lpm-capable") ||
(pdata && pdata->usb3_lpm_capable))
xhci->quirks |= XHCI_LPM_SUPPORT;
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 14/14] xhci: get rid of platform data

2016-06-21 Thread Mathias Nyman
From: Heikki Krogerus 

No more users for it.

Signed-off-by: Heikki Krogerus 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-plat.c |  5 +
 include/linux/usb/xhci_pdriver.h | 27 ---
 2 files changed, 1 insertion(+), 31 deletions(-)
 delete mode 100644 include/linux/usb/xhci_pdriver.h

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index ff916b3..906e445 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include "xhci.h"
@@ -138,7 +137,6 @@ MODULE_DEVICE_TABLE(of, usb_xhci_of_match);
 
 static int xhci_plat_probe(struct platform_device *pdev)
 {
-   struct usb_xhci_pdata   *pdata = dev_get_platdata(&pdev->dev);
const struct of_device_id *match;
const struct hc_driver  *driver;
struct xhci_hcd *xhci;
@@ -219,8 +217,7 @@ static int xhci_plat_probe(struct platform_device *pdev)
goto disable_clk;
}
 
-   if (device_property_read_bool(&pdev->dev, "usb3-lpm-capable") ||
-   (pdata && pdata->usb3_lpm_capable))
+   if (device_property_read_bool(&pdev->dev, "usb3-lpm-capable"))
xhci->quirks |= XHCI_LPM_SUPPORT;
 
if (HCC_MAX_PSA(xhci->hcc_params) >= 4)
diff --git a/include/linux/usb/xhci_pdriver.h b/include/linux/usb/xhci_pdriver.h
deleted file mode 100644
index 376654b..000
--- a/include/linux/usb/xhci_pdriver.h
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
- * for more details.
- *
- */
-
-#ifndef __USB_CORE_XHCI_PDRIVER_H
-#define __USB_CORE_XHCI_PDRIVER_H
-
-/**
- * struct usb_xhci_pdata - platform_data for generic xhci platform driver
- *
- * @usb3_lpm_capable:  determines if this xhci platform supports USB3
- * LPM capability
- *
- */
-struct usb_xhci_pdata {
-   unsignedusb3_lpm_capable:1;
-};
-
-#endif /* __USB_CORE_XHCI_PDRIVER_H */
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] xhci: free the correct ring

2016-06-30 Thread Mathias Nyman

On 30.06.2016 15:26, Arnd Bergmann wrote:

gcc warns about what first looks like a reference to an uninitialized
variable:

drivers/usb/host/xhci-ring.c: In function 'handle_cmd_completion':
drivers/usb/host/xhci-ring.c:753:4: error: 'ep_ring' may be used uninitialized 
in this function [-Werror=maybe-uninitialized]
 xhci_unmap_td_bounce_buffer(xhci, ep_ring, cur_td);
 ^~
drivers/usb/host/xhci-ring.c:647:20: note: 'ep_ring' was declared here
   struct xhci_ring *ep_ring;
 ^~~

It's clear to see that the list_empty() check means it can never be
uninitialized, however it still looks wrong:

When ep->cancelled_td_list contains more than one entry, the
ep_ring variable will point to the ring that was retrieved
from the last urb, and we have to look it up again in the
second loop instead, which fixes the behavior and gets rid of the
warning too.

Signed-off-by: Arnd Bergmann 
Fixes: f9c589e142d0 ("xhci: TD-fragment, align the unsplittable case with a bounce 
buffer")
---
  drivers/usb/host/xhci-ring.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 21e1dd62ebf8..918e0c739b79 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -749,6 +749,7 @@ remove_finished_td:
/* Doesn't matter what we pass for status, since the core will
 * just overwrite it (because the URB has been unlinked).
 */
+   ep_ring = xhci_urb_to_transfer_ring(xhci, cur_td->urb);
if (ep_ring && cur_td->bounce_seg)
xhci_unmap_td_bounce_buffer(xhci, ep_ring, cur_td);
    xhci_giveback_urb_in_irq(xhci, cur_td, 0);



Thanks,

Acked-by: Mathias Nyman 

In most cases each endpoint has only one ring and one list of canceled TDs,
so each TD on a cancelled_td_list is for the same ring.

But with streams in use that might no longer be the case.
This does ensure that the streams work properly.

Greg, any chance to get this into usb-next and 4.8?

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] usb: xhci: add support for performing fake doorbell

2016-11-21 Thread Mathias Nyman

On 21.11.2016 09:57, Rafał Miłecki wrote:

Hi Mathias,

On 17 October 2016 at 22:30, Rafał Miłecki  wrote:

From: Rafał Miłecki 

Broadcom's Northstar XHCI controllers seem to need a special start
procedure to work correctly. There isn't any official documentation of
this, the problem is that controller doesn't detect any connected
devices with default setup. Moreover connecting USB device to controller
that doesn't run properly can cause SoC's watchdog issues.

A workaround that was successfully tested on multiple devices is to
perform a fake doorbell. This patch adds code for doing this and enables
it on BCM4708 family.

Signed-off-by: Rafał Miłecki 
---
V2: Enable quirk for brcm,bcm4708 machines instead of adding separated binding
 for it. Thanks Rob for your comment on this.


Do you think you can pick & push this one? V2 follows Rob's suggestion
and he has some DT knowledge for sure, so I guess it should be OK.
--


Is there some more background information on this?

I don't have any contacts to Broadcom myself, adding the BMC Kernel Feedback 
list to CC.
Maybe someone over there has an errata, documentation or just general feedback.

How was this workaround even figured out? ringing the doorbell for the first
device doesn't seem like something found by trial and error,  especially when
xhci specs state that:

"Software shall not write the Doorbell of an endpoint until after it has issued 
a
Configure Endpoint Command for the endpoint and received a successful Command
Completion Event."

The whole workaround is a bit intrusive, allocating a fake device, ring a 
doorbell for a
fake device in the wrong state, clearing off HSE (host system error) which 
should only be set
when things really go bad, some random usleeps, and possible calling 
xhci_start() twice.

I can't take this as is without some more info.

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Flood of hub_ext_port_status failed (err = -71) messages

2016-11-22 Thread Mathias Nyman

On 21.11.2016 23:52, Simon Arlott wrote:

I have a 5-port USB 2.0 hub attached to a USB 3.0 host by a long length
of Cat 5e cable (via a pair of cheap USB<->Cat5e adapters from eBay)
that only works at full speed and doesn't stay connected all the time.

This wouldn't be a problem except that the kernel can get stuck in a
loop with the "URB transfer length is wrong, xHC issue?" error.

The message "hub_ext_port_status failed (err = -71)" in particular can
occur 70+ times for every instance of this error. There's no guarantee
that it will ever disconnect the device and stop doing this, it has
previously suddenly started generating 30MB/minute of log messages for
30 minutes until I unplugged the device.

Would it be ok to rate limit both of these messages to reduce the impact
of poor connections?

2016-06-17T09:56:09.219+01:00  kernel: xhci_hcd :00:14.0: URB 
transfer length is wrong, xHC issue? req. len = 4, act. len = 4294967292


Your broken hardware triggered another interesting issue.
The len value 4294967292 looks like a u32 wrapped around.

What kernelversion was this?

-Mathias  


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Flood of hub_ext_port_status failed (err = -71) messages

2016-11-22 Thread Mathias Nyman

On 22.11.2016 11:52, Simon Arlott wrote:

On 22/11/16 08:37, Mathias Nyman wrote:

On 21.11.2016 23:52, Simon Arlott wrote:

2016-06-17T09:56:09.219+01:00  kernel: xhci_hcd :00:14.0: URB 
transfer length is wrong, xHC issue? req. len = 4, act. len = 4294967292


Your broken hardware triggered another interesting issue.
The len value 4294967292 looks like a u32 wrapped around.

What kernelversion was this?


These versions:


   1 Linux version 4.4.0-24-generic (buildd@lgw01-12) (gcc version 5.3.1 
20160413 (Ubuntu 5.3.1-14ubuntu2.1) ) #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 
2016 (Ubuntu 4.4.0-24.43-generic 4.4.10)
  134694 xhci_hcd :00:14.0: URB transfer length is wrong, xHC issue? req. 
len = 4, act. len = 4294967292



Ok, there are no fixes in the control transfers pde since 4.4-0-24, so whatever 
is causing this is still there.
Probably related to how we calculate the length of these erronous control 
transfers.

one clear fault is that we overwrite the previous error code with a zero if the 
length mismatch.

Can you take a log with xhci debugging enabled, this can be done by:
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
before plugging in the faulty hub (cable)

A usbmon trace would also help, see Documentation/usb/usbmon.txt

How about If I write some test patches, do you have the time to apply them and 
try them out?

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFT PATCH 1/1] xhci: free xhci virtual devices with leaf nodes first

2016-11-23 Thread Mathias Nyman
the tt_info provided by a HS hub might be in use to by a child device
Make sure we free the devices in the correct order.

This is needed in special cases such as when xhci controller is
reset when resuming from hibernate, and all virt_devices are freed.

Also free the virt_devices starting from max slot_id as children
more commonly have higher slot_id than parent.

CC: 
Signed-off-by: Mathias Nyman 

---

Guenter Roeck, does this work for you?

A rework of how tt_info is stored and used might be needed,
but that will take some time and won't go to stable as easily.
---
 drivers/usb/host/xhci-mem.c | 38 --
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 6afe323..b3a5cd8 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -979,6 +979,40 @@ void xhci_free_virt_device(struct xhci_hcd *xhci, int 
slot_id)
xhci->devs[slot_id] = NULL;
 }
 
+/*
+ * Free a virt_device structure.
+ * If the virt_device added a tt_info (a hub) and has children pointing to
+ * that tt_info, then free the child first. Recursive.
+ * We can't rely on udev at this point to find child-parent relationships.
+ */
+void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_id)
+{
+   struct xhci_virt_device *vdev;
+   struct list_head *tt_list_head;
+   struct xhci_tt_bw_info *tt_info, *next;
+   int i;
+
+   vdev = xhci->devs[slot_id];
+   if (!vdev)
+   return;
+
+   tt_list_head = &(xhci->rh_bw[vdev->real_port - 1].tts);
+   list_for_each_entry_safe(tt_info, next, tt_list_head, tt_list) {
+   /* is this a hub device that added a tt_info to the tts list */
+   if (tt_info->slot_id == slot_id) {
+   /* are any devices using this tt_info? */
+   for (i = 1; i < HCS_MAX_SLOTS(xhci->hcs_params1); i++) {
+   vdev = xhci->devs[i];
+   if (vdev && (vdev->tt_info == tt_info))
+   xhci_free_virt_devices_depth_first(
+   xhci, i);
+   }
+   }
+   }
+   /* we are now at a leaf device */
+   xhci_free_virt_device(xhci, slot_id);
+}
+
 int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
struct usb_device *udev, gfp_t flags)
 {
@@ -1829,8 +1863,8 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci)
}
}
 
-   for (i = 1; i < MAX_HC_SLOTS; ++i)
-   xhci_free_virt_device(xhci, i);
+   for (i = HCS_MAX_SLOTS(xhci->hcs_params1); i > 0; i--)
+   xhci_free_virt_devices_depth_first(xhci, i);
 
dma_pool_destroy(xhci->segment_pool);
xhci->segment_pool = NULL;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT PATCH 1/1] xhci: free xhci virtual devices with leaf nodes first

2016-11-23 Thread Mathias Nyman

On 23.11.2016 15:32, Guenter Roeck wrote:

On 11/23/2016 04:24 AM, Mathias Nyman wrote:

the tt_info provided by a HS hub might be in use to by a child device
Make sure we free the devices in the correct order.

This is needed in special cases such as when xhci controller is
reset when resuming from hibernate, and all virt_devices are freed.

Also free the virt_devices starting from max slot_id as children
more commonly have higher slot_id than parent.

CC: 
Signed-off-by: Mathias Nyman 

---

Guenter Roeck, does this work for you?

A rework of how tt_info is stored and used might be needed,
but that will take some time and won't go to stable as easily.


I'll give it a try. One concern, though: xhci_free_virt_device() is called
from multiple places, and does not always remove all devices. Is it save
to assume that all other callers remove children first ?



This should be the only place that xhci does a massive xhci_free_virt_device(),

In the other places it's initiated per device by usb core which should handle
child-parent relationships, or then just error paths when failing to allocate
a device in the first place (no children yet)

-Mathias




--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT PATCH 1/1] xhci: free xhci virtual devices with leaf nodes first

2016-11-24 Thread Mathias Nyman

On 24.11.2016 11:02, Felipe Balbi wrote:


Hi,

Mathias Nyman  writes:

the tt_info provided by a HS hub might be in use to by a child device
Make sure we free the devices in the correct order.

This is needed in special cases such as when xhci controller is
reset when resuming from hibernate, and all virt_devices are freed.

Also free the virt_devices starting from max slot_id as children
more commonly have higher slot_id than parent.

CC: 
Signed-off-by: Mathias Nyman 

---

Guenter Roeck, does this work for you?

A rework of how tt_info is stored and used might be needed,
but that will take some time and won't go to stable as easily.
---
  drivers/usb/host/xhci-mem.c | 38 --
  1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 6afe323..b3a5cd8 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -979,6 +979,40 @@ void xhci_free_virt_device(struct xhci_hcd *xhci, int 
slot_id)
xhci->devs[slot_id] = NULL;
  }

+/*
+ * Free a virt_device structure.
+ * If the virt_device added a tt_info (a hub) and has children pointing to
+ * that tt_info, then free the child first. Recursive.
+ * We can't rely on udev at this point to find child-parent relationships.
+ */
+void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_id)
+{
+   struct xhci_virt_device *vdev;
+   struct list_head *tt_list_head;
+   struct xhci_tt_bw_info *tt_info, *next;
+   int i;
+
+   vdev = xhci->devs[slot_id];
+   if (!vdev)
+   return;
+
+   tt_list_head = &(xhci->rh_bw[vdev->real_port - 1].tts);
+   list_for_each_entry_safe(tt_info, next, tt_list_head, tt_list) {
+   /* is this a hub device that added a tt_info to the tts list */
+   if (tt_info->slot_id == slot_id) {


if (tt_info->slot_id != slot_id)
continue;


+   /* are any devices using this tt_info? */
+   for (i = 1; i < HCS_MAX_SLOTS(xhci->hcs_params1); i++) {


off-by-one here ? Why is i starting from 1?


+   vdev = xhci->devs[i];


slit_id 0 is reserved and xhci->devs[0] is not used, so ne need to check it.

All other places that check xhci->devs[0] are avtually buggy

-Mathias  


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT PATCH 1/1] xhci: free xhci virtual devices with leaf nodes first

2016-11-24 Thread Mathias Nyman

On 24.11.2016 13:03, Felipe Balbi wrote:


Hi,

Mathias Nyman  writes:

+   /* are any devices using this tt_info? */
+   for (i = 1; i < HCS_MAX_SLOTS(xhci->hcs_params1); i++) {


off-by-one here ? Why is i starting from 1?


+   vdev = xhci->devs[i];


slit_id 0 is reserved and xhci->devs[0] is not used, so ne need to
check it.


hmm... it's reserved for the HW, sure. Do you need to over allocate the
array by 1 just to keep this first member unused? Couldn't you handle
the +1/-1 (depending on the case) in xhci driver itself? Saves a bit of
memory there.



There are many things that needs fixing in this area, but not in this patch

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] xhci: Remove busy loop from xhci_abort_cmd_ring()

2016-11-24 Thread Mathias Nyman

On 16.11.2016 07:01, OGAWA Hirofumi wrote:

Now, xhci_abort_cmd_ring() is sleepable. So no reason to use busy loop
anymore.

- Convert udelay(1000) => msleep(1).


Sounds good.


- Add xhci_handshake_sleep(), and use it.


This seems a but overkill, I'd rather don't have xhci_handshake(), 
xhci_handshake_sleep()
and __xhci_handshake() to maintain.

As we now can sleep in xhci_abort_cmd_ring() I would rather just first wait for 
the completion
and then maybe handshake check the register.  At that point it shouldn't need 
to busyloop anymore,
one read should do it.

But this is all just optimizing an error path, I'm actually fine with taking 
just
patch 1/3 and 2/3 v3, and skipping 3/3. the udelay(1000) to msleep(1) I can add 
myself
 


As related change, current xhci_handshake() is strange behavior,
E.g. xhci_handshake(ptr, mask, done, 1) does

 result = readl(ptr);
 /* check result */
 udelay(1); <= meaningless delay
 return -ETIMEDOUT;


Only in the timeout case, so if we after 3 or 5 million register reads + 
udelays(1)
still don't get the value we want then there is an unnecessary udelay(1).

Not worth putting much effort on it now.

Sorry about the review delay, I got my hands quite full at the moment

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] usb: host: xhci: dynamically allocate devs array

2016-11-24 Thread Mathias Nyman

On 24.11.2016 15:33, Felipe Balbi wrote:

Instead of always defaulting to a 256-entry array,
we can dynamically allocate devs based on what HW
tells us it supports.

Note that we can't, yet, purge MAX_HC_SLOTS
completely because of struct
xhci_device_context_array reliance on it.

Signed-off-by: Felipe Balbi 
---
  drivers/usb/host/xhci-hub.c  |  2 +-
  drivers/usb/host/xhci-mem.c  |  4 ++--
  drivers/usb/host/xhci-ring.c |  2 +-
  drivers/usb/host/xhci.c  | 19 +++
  drivers/usb/host/xhci.h  |  2 +-
  5 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 0ef16900efed..7b1b58ad6aac 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -356,7 +356,7 @@ int xhci_find_slot_id_by_port(struct usb_hcd *hcd, struct 
xhci_hcd *xhci,
enum usb_device_speed speed;

slot_id = 0;
-   for (i = 0; i < MAX_HC_SLOTS; i++) {
+   for (i = 0; i < HCS_MAX_SLOTS(xhci->hcs_params1); i++) {


Normally that is what it should look like, but as we in xhci driver don't use
xhci->devs[0], and xhci->devs[HCS_MAX_SLOTS(xhci->hcs_params1)] can be a valid
virt_device it needs a +1.
Same goes for everywhere else (also when allocating)

This is probably originally due to that xhci hw returns a slot_id 0 for 
failure, and
1 to, including HCD_MAX_SLOTS[hcs_params1) for successful enable device command.

the virt_dev is then straight allocated to xhci->devs[slot_id] = kzalloc(..)
   
-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: handle COMP_STOP from SETUP phase too

2016-11-24 Thread Mathias Nyman

On 24.11.2016 15:33, Felipe Balbi wrote:

Stop Endpoint command can come at any point and we
have no control of that. We should make sure to
handle COMP_STOP on SETUP phase as well, otherwise
urb->actual_lenght might be set to negative values
in some occasions such as below:

  urb->length = 4;
  build_control_transfer_td_for(urb, ep);

stop_endpoint(ep);

COMP_STOP:
[...]
urb->actual_length = urb->length - trb->length;

trb->length is 8 for SETUP stage (8 control request
bytes), so actual_length would be set to -4 in this
case.

While doing that, also make sure to use TRB_TYPE
field of the actual TRB instead of matching pointers
to figure out in which stage of the control transfer
we got our completion event.

Signed-off-by: Felipe Balbi 
---


This is awsome. This probably fixes tons of issues
related to failing getting port status and other control transfers.

Looking at it now it turns out that in COMP_STOP case we can't rely on
endpoint ring dequeue pointer to be correct, and still we used it to determine
the control transfer stage. (setup, data, status).
No wonder there has been issues related to control transfers.

Using the TRB_TYPE of the actual TRB on the endpoint ring is definitely the
way to go

Thanks
-Mathias


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Apollo Lake

2016-11-28 Thread Mathias Nyman

On 28.11.2016 09:24, Greg KH wrote:

On Mon, Nov 28, 2016 at 09:53:52AM +0800, 
wan.ahmad.zainie.wan.moha...@intel.com wrote:

From: Wan Ahmad Zainie 

Intel Apollo Lake also requires XHCI_PME_STUCK_QUIRK.
Adding its PCI ID to quirk.

Signed-off-by: Wan Ahmad Zainie 
---
  drivers/usb/host/xhci-pci.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index e96ae80..954abfd 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -165,7 +165,8 @@ static void xhci_pci_quirks(struct device *dev, struct 
xhci_hcd *xhci)
 pdev->device == PCI_DEVICE_ID_INTEL_SUNRISEPOINT_H_XHCI ||
 pdev->device == PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI ||
 pdev->device == PCI_DEVICE_ID_INTEL_BROXTON_M_XHCI ||
-pdev->device == PCI_DEVICE_ID_INTEL_BROXTON_B_XHCI)) {
+pdev->device == PCI_DEVICE_ID_INTEL_BROXTON_B_XHCI ||
+pdev->device == PCI_DEVICE_ID_INTEL_APL_XHCI)) {
xhci->quirks |= XHCI_PME_STUCK_QUIRK;


When is Intel going to fix this?  Why don't we just blacklist all intel
hosts :(



Apollo Lake is Broxton-based/a Broxton derivative, with, well pretty much the 
same xHCI IP with
the same flaws, but different PCI ID.

I agree the quirk list is getting long. And annoying to maintain.
I'll see if there's a better way to identify the hosts that need this quirk.

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] usb: xhci: Remove unuseful 'return' and 'break' statement

2016-11-28 Thread Mathias Nyman

On 28.11.2016 09:41, Baolin Wang wrote:

On 28 November 2016 at 15:21, Greg KH  wrote:

On Mon, Nov 28, 2016 at 02:29:25PM +0800, Baolin Wang wrote:

Hi Mathias,

On 24 November 2016 at 19:16, Baolin Wang  wrote:

Since these 'return' statements are not generally useful in void
function, remove them. Also remove one unuseful 'break' statement
in xhci_setup_addressable_virt_dev() function.

Signed-off-by: Baolin Wang 
---
Changes since v1:
  - Add description of removing 'break' statement in commitlog.
---


Could you apply this patch if there are no other comments? Thanks.


Less than a week response for a simple cleanup patch?  Why the rush and
pressure?  Relax, this really isn't an important patch...


I am sorry for the pressure, I just thought it is one simple cleanup
patch. It is okay for me to wait for.



Looks ok.

If it applies I'll send it forward to usb-next after 4.10-rc1,
It should end up in 4.11

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT PATCH 1/1] xhci: free xhci virtual devices with leaf nodes first

2016-11-30 Thread Mathias Nyman

On 28.11.2016 22:24, Guenter Roeck wrote:

On Wed, Nov 23, 2016 at 02:24:27PM +0200, Mathias Nyman wrote:

the tt_info provided by a HS hub might be in use to by a child device
Make sure we free the devices in the correct order.

This is needed in special cases such as when xhci controller is
reset when resuming from hibernate, and all virt_devices are freed.

Also free the virt_devices starting from max slot_id as children
more commonly have higher slot_id than parent.

CC: 
Signed-off-by: Mathias Nyman 

---

Guenter Roeck, does this work for you?


Yes, it does.

Tested-by: Guenter Roeck 

Thanks,
Guenter



Thanks, I'll send it forward with proper Reported-by and Tested-by tags
after 4.10-rc1

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] usb: return error code when platform_get_irq fails

2016-11-30 Thread Mathias Nyman

On 30.11.2016 15:41, Matthias Brugger wrote:



On 29/11/16 13:57, Pan Bian wrote:

In function xhci_mtk_probe(), variable ret takes the return value. Its
value should be negative on failures. However, when the call to function
platform_get_irq() fails, it does not set the error code, and 0 will be
returned. 0 indicates no error. As a result, the callers of function
xhci_mtk_probe() will not be able to detect the error. This patch fixes
the bug by assigning the return value of platform_get_irq() to variable
ret if it fails.

Signed-off-by: Pan Bian 
---
 drivers/usb/host/xhci-mtk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 79959f1..f2365a4 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -560,8 +560,10 @@ static int xhci_mtk_probe(struct platform_device *pdev)
 goto disable_ldos;

 irq = platform_get_irq(pdev, 0);
-if (irq < 0)
+if (irq < 0) {
+ret = irq;
 goto disable_clk;
+}

 /* Initialize dma_mask and coherent_dma_mask to 32-bits */
 ret = dma_set_coherent_mask(dev, DMA_BIT_MASK(32));




Reviewed-by: Matthias Brugger 



Thanks, Added to queue

-Mathias



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] usb: host: xhci: Remove the watchdog timer and use command timer to watch stop endpoint command

2016-11-30 Thread Mathias Nyman

On 30.11.2016 11:02, Baolin Wang wrote:

If the hardware never responds to the stop endpoint command, the
URBs will never be completed, and we might hang the USB subsystem.
The original watchdog timer is used to watch if one stop endpoint
command is timeout, if timeout, then the watchdog timer will set
XHCI_STATE_DYING, try to halt the xHCI host, and give back all
pending URBs.

But now we already have one command timer to control command timeout,
thus we can also use the command timer to watch the stop endpoint
command, instead of one duplicate watchdog timer which need to be
removed.

Meanwhile we don't need the 'stop_cmds_pending' flag to identy if
this is the last stop endpoint command of one endpoint. Since we
can make sure we only set one stop endpoint command for one endpoint
by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove
this flag.

We also need to clean up the command queue before trying to halt the
xHCI host in xhci_stop_endpoint_command_timeout() function.


This isn't a bad idea.

There are anyway some corner cases and details that need to be
checked, such as suspend (which will clear the command queue), module unload
and abrupt host removal (like pci hotplug removal of host controller)
we need to make sure we can trust the command timer to always return the 
canceled URB

-Mathias


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] xhci: Put warning message on a single line

2016-12-01 Thread Mathias Nyman

On 01.12.2016 12:17, Alexander Stein wrote:

This allows someone to grep for the complete warning message as in;
xhci-hcd xhci-hcd.0.auto: USB core suspending device not in U0/U1/U2.

Signed-off-by: Alexander Stein 
---


Thanks, added to queue for 4.11

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] usb: host: xhci: Remove the watchdog timer and use command timer to watch stop endpoint command

2016-12-01 Thread Mathias Nyman

On 01.12.2016 06:54, Baolin Wang wrote:

On 30 November 2016 at 22:09, Mathias Nyman
 wrote:

On 30.11.2016 11:02, Baolin Wang wrote:


If the hardware never responds to the stop endpoint command, the
URBs will never be completed, and we might hang the USB subsystem.
The original watchdog timer is used to watch if one stop endpoint
command is timeout, if timeout, then the watchdog timer will set
XHCI_STATE_DYING, try to halt the xHCI host, and give back all
pending URBs.

But now we already have one command timer to control command timeout,
thus we can also use the command timer to watch the stop endpoint
command, instead of one duplicate watchdog timer which need to be
removed.

Meanwhile we don't need the 'stop_cmds_pending' flag to identy if
this is the last stop endpoint command of one endpoint. Since we
can make sure we only set one stop endpoint command for one endpoint
by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove
this flag.

We also need to clean up the command queue before trying to halt the
xHCI host in xhci_stop_endpoint_command_timeout() function.



This isn't a bad idea.

There are anyway some corner cases and details that need to be
checked, such as suspend (which will clear the command queue), module unload
and abrupt host removal (like pci hotplug removal of host controller)
we need to make sure we can trust the command timer to always return the
canceled URB


Yes, you are right, we need to check these carefully.

Suspend process, module unload and abrupt host removal, they all will
issue usb_disconnect() firstly before clear the command queue, it will
check URBs for every endpoint by
usb_disconnect()--->usb_disable_device()--->usb_disable_endpoint(),
which will make sure every URBs of endpoints will be cancelled by the
stop endpoint command responding or the timeout function of stop
endpoint command (xhci_stop_endpoint_command_timeout()) in
usb_hcd_flush_endpoint(). From that point, we can make sure the
command timer will be useful to handle stop endpoint command timeout.
Please correct me if I said something wrong. Thanks.



This relies on current queued command that times out to be the stop endpoint 
command.

If host partially, or completely hangs there might be any number of commands in 
the
command queue, and we would need to wait for each one of them to timeout, finish
before we finally get to the stop endpoint command, and give back the urb.

I think it would be better to first fix the issues with the current watchdog 
function, get
those fixes into stable, and then think about moving to the command queue timer.

In short, this patch doesn't currently fix any existing issue, but might cause 
the
timeout to be more unreliable

-Mathias   
 
--

To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/25] usb: host: xhci: dynamically allocate devs array

2016-12-02 Thread Mathias Nyman

On 01.12.2016 15:30, Felipe Balbi wrote:

Instead of always defaulting to a 256-entry array,
we can dynamically allocate devs based on what HW
tells us it supports.

Note that we can't, yet, purge MAX_HC_SLOTS
completely because of struct
xhci_device_context_array reliance on it.

Signed-off-by: Felipe Balbi 
---
  drivers/usb/host/xhci-hub.c  |  2 +-
  drivers/usb/host/xhci-mem.c  |  4 ++--
  drivers/usb/host/xhci-ring.c |  2 +-
  drivers/usb/host/xhci.c  | 19 +++
  drivers/usb/host/xhci.h  |  2 +-
  5 files changed, 20 insertions(+), 9 deletions(-)


...


diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 1cd56417cbec..1113b5fea7b4 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -766,6 +766,8 @@ void xhci_shutdown(struct usb_hcd *hcd)
/* Yet another workaround for spurious wakeups at shutdown with HSW */
if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
pci_set_power_state(to_pci_dev(hcd->self.controller), 
PCI_D3hot);
+
+   kfree(xhci->devs);


I don't think freeing xhci->devs at PCI .shutdown callback will work for driver 
unload, or for the
PCI hotplugged Alpine Ridge xhci solution.  xhci->devs would leak at next 
driver load.

xhci_mem_clenup() would be better


  }

  #ifdef CONFIG_PM
@@ -4896,6 +4898,11 @@ int xhci_gen_setup(struct usb_hcd *hcd, 
xhci_get_quirks_t get_quirks)

xhci->quirks |= quirks;

+   xhci->devs = kcalloc(HCS_MAX_SLOTS(xhci->hcs_params1) + 1,
+   sizeof(struct xhci_virt_device *), GFP_KERNEL);


maybe allocate the memory in xhci_mem_init(), it's done a bit later in 
xhci_gen_setup()
from xhci_init().

we halt, reset and set dma mask in between, but those should not depend on 
xhci->devs in
any way.

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 24/25] usb: host: xhci: use trb_to_noop() from xhci_handle_stopped_cmd_ring()

2016-12-02 Thread Mathias Nyman

On 01.12.2016 15:31, Felipe Balbi wrote:

instead of open coding how to convert a TRB to no-op, let's use our
newly introduced helper.

Signed-off-by: Felipe Balbi 
---
  drivers/usb/host/xhci-ring.c | 12 ++--
  1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 80fa3dbdbdd8..973182ee6954 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1218,7 +1218,6 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd 
*xhci,
 struct xhci_command *cur_cmd)
  {
struct xhci_command *cmd;
-   u32 cycle_state;

/* Turn all aborted commands in list to no-ops, then restart */
list_for_each_entry(cmd, &xhci->cmd_list,
@@ -1231,15 +1230,8 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd 
*xhci,

xhci_dbg(xhci, "Turn aborted command %p to no-op\n",
 cmd->command_trb);
-   /* get cycle state from the original cmd trb */
-   cycle_state = le32_to_cpu(
-   cmd->command_trb->generic.field[3]) & TRB_CYCLE;
-   /* modify the command trb to no-op command */
-   cmd->command_trb->generic.field[0] = 0;
-   cmd->command_trb->generic.field[1] = 0;
-   cmd->command_trb->generic.field[2] = 0;
-   cmd->command_trb->generic.field[3] = cpu_to_le32(
-   TRB_TYPE(TRB_CMD_NOOP) | cycle_state);
+
+   trb_to_noop(cmd->command_trb);


This won't work, notice that no ops for transfer and command rings are 
different.
The trb_to_noop() helper you added in 23/25 set trb type to TRB_TR_NOOP

/* Transfer Ring No-op (not for the command ring) */
#define TRB_TR_NOOP 8
/* No-op Command - not for transfer rings */
#define TRB_CMD_NOOP23

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] usb: xhci: fix possible wild pointer

2016-12-02 Thread Mathias Nyman

On 02.12.2016 04:29, Lu Baolu wrote:

handle_cmd_completion() frees a command structure which might
be still referenced by xhci->current_cmd. This might cause
problem when xhci->current_cmd is accessed after that.

A real-life case could be like this. The host takes a very long
time to respond to a command, and the command timer is fired at
the same time when the command completion event arrives. The
command completion handler frees xhci->current_cmd before the
timer function can grab xhci->lock. Afterward, timer function
grabs the lock and go ahead with checking and setting members
of xhci->current_cmd.

Cc:  # v3.16+
Signed-off-by: Lu Baolu 
---
  drivers/usb/host/xhci-ring.c | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index bdf6b13..13e05f6 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1267,14 +1267,16 @@ void xhci_handle_command_timeout(unsigned long data)
bool second_timeout = false;
xhci = (struct xhci_hcd *) data;

-   /* mark this command to be cancelled */
spin_lock_irqsave(&xhci->lock, flags);
-   if (xhci->current_cmd) {
-   if (xhci->current_cmd->status == COMP_CMD_ABORT)
-   second_timeout = true;
-   xhci->current_cmd->status = COMP_CMD_ABORT;
+   if (!xhci->current_cmd) {
+   spin_unlock_irqrestore(&xhci->lock, flags);
+   return;
}

+   if (xhci->current_cmd->status == COMP_CMD_ABORT)
+   second_timeout = true;
+   xhci->current_cmd->status = COMP_CMD_ABORT;
+
/* Make sure command ring is running before aborting it */
hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
@@ -1422,6 +1424,10 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
xhci->current_cmd = list_entry(cmd->cmd_list.next,
   struct xhci_command, cmd_list);
mod_timer(&xhci->cmd_timer, jiffies + XHCI_CMD_DEFAULT_TIMEOUT);
+   } else if (xhci->current_cmd == cmd) {
+   xhci->current_cmd = NULL;
+   } else {
+   xhci_warn(xhci, "WARN current_cmd doesn't match command\n");
}

  event_handled:



Thanks,

I might do some tweaking to (or remove)  the warn message when applying if
that is ok

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB3.0 DWC3 can't work with latest Linux-4.9.rc7

2016-12-05 Thread Mathias Nyman

On 05.12.2016 13:00, Felipe Balbi wrote:


(no top-posting, please)

Hi,

it helps if you include maintainers for correct
drivers. scripts/get_maintainer.pl helps a lot.

Jerry Huang  writes:

And I tested USB2.0 disk, with Sriram's 6 patches, it works well. So Just 
USB3.0 can't work.

BTW,
The board with USB3.0 DWC3 controller and the USB3.0 disk work well with 
Linux-4.1.8.


alright, in that case you can bisect. Please do so and find the first
commit when this started happening.


-Original Message-
From: Jerry Huang
Sent: Monday, December 05, 2016 6:11 PM
To: st...@rowland.harvard.edu; linux-usb@vger.kernel.org
Cc: Sriram Dash 
Subject: USB3.0 DWC3 can't work with latest Linux-4.9.rc7

Hi, Guys,
1. I tested the USB3.0 DWC3 with latest Linux-4.9-rc7 on LS1043ardb board.
But I got the below errors, this error is returned from dma_supported(dev, mask) while running 
" dma_set_mask_and_coherent" in function " xhci_plat_probe".
...
[0.756166] xhci-hcd: probe of xhci-hcd.0.auto failed with error -5
[0.762438] xhci-hcd: probe of xhci-hcd.1.auto failed with error -5
[0.768712] xhci-hcd: probe of xhci-hcd.2.auto failed with error -5
[0.775049] usbcore: registered new interface driver usb-storage
...

2. Then, I applied 6 patches from Sriram:
https://patchwork.kernel.org/patch/9434163/
https://patchwork.kernel.org/patch/9434155/
https://patchwork.kernel.org/patch/9434131/
https://patchwork.kernel.org/patch/9434165/
https://patchwork.kernel.org/patch/9434167/
https://patchwork.kernel.org/patch/9434133/

however, I got another error as below shows when I attach one USB disk to this 
board.
It seems USB device return the 8 bytes descriptor, but all 8 byte data driver 
read  are zero from the buffer received.
Has any comment for this issue?

usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[  164.323396] usb 2-1: device descriptor read/8, error -61


hmm, connection refused. Mathias, any idea?



usb/core/message.c  usb_get_descritptor() sets the -61 (ENODATA) if
the control transfer received data,  but the descriptor type byte doesn't match.
(as the 8 zeros probably would do)

does xhci verbose debugging show any control transfer related messages during 
this?

echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control

-Mathias
  


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command

2016-12-05 Thread Mathias Nyman

On 05.12.2016 09:51, Baolin Wang wrote:

When current command was supposed to be aborted, host will free the command
in handle_cmd_completion() function. But it might be still referenced by
xhci->current_cmd, which need to set NULL.

Signed-off-by: Baolin Wang 
---
This patch is based on Lu Baolu's new fix patch:
usb: xhci: fix possible wild pointer
https://www.spinics.net/lists/linux-usb/msg150219.html
---
  drivers/usb/host/xhci-ring.c |5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 62dd1c7..9965a4c 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1362,8 +1362,11 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
 */
if (cmd_comp_code == COMP_CMD_ABORT) {
xhci->cmd_ring_state = CMD_RING_STATE_STOPPED;
-   if (cmd->status == COMP_CMD_ABORT)
+   if (cmd->status == COMP_CMD_ABORT) {
+   if (xhci->current_cmd == cmd)
+   xhci->current_cmd = NULL;
goto event_handled;
+   }
}

cmd_type = TRB_FIELD_TO_TYPE(le32_to_cpu(cmd_trb->generic.field[3]));



True, thanks

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/25] usb: host: xhci: cleanup series

2016-12-07 Thread Mathias Nyman

On 07.12.2016 14:41, Felipe Balbi wrote:


Hi,

Felipe Balbi  writes:

hi Mathias,

here's a much longer series of cleanups which I have been working on for
the past few days. Let me know what you think about it.

I did some light tests SKL and everything still works as before. I know
you have some reservations about my changes to trb_in_td() but IMHO, if
we can assume this function to be always correct, there's no need to add
debugging messages to it and as for the trb_in_td() call which existed
only for debugging purposes, I guess we need to find a better way of
adding debug statements for that part of the code.

Let me know what you think.


I have patches updated in xhci-cleanup on my k.org branch. Just making
sure these are all the comments I'll get. Mathias, are you done
reviewing?



Not yet, there are other things pending that took priority over this series

-Mathias
 


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Should xhci_irq() call usb_hc_died()?

2016-12-12 Thread Mathias Nyman

On 12.12.2016 10:43, Felipe Balbi wrote:


Hi,

Bjorn Helgaas  writes:

Hi Mathias,

ehci_irq(), ohci_irq(), fotg210_irq(), and oxu210_hcd_irq() contain code
equivalent to this:

   status = ehci_readl(...);
   if (status == ~(u32) 0) {
 ...
 usb_hc_died(hcd);
 ...
 return IRQ_HANDLED;
   }

xhci_irq() has a similar check, but does not call usb_hc_died():

   status = readl(...);
   if (status = 0x) {
 ...
 return IRQ_HANDLED;
   }

Should xhci_irq() also call usb_hc_died()?  Maybe there's some reason
for it to be different than the others, but it wasn't obvious to this
casual observer :)


It probably should, I'm not aware of any reason why not, and a quick look at the
logs didn't reveal anything.

Currently we are calling usb_hcd_died() in a couple of timeout cases if we read
0x from the pci registers, So eventually usb_hc_died() will be called.

I'll take a look at this in more detail



you might just have fixed several bugs in dealing with a dead HC :-)

Can you provide a patch? (well, unless Mathias has a strong reason not
to call usb_hc_died(), of course).


I don't think this is the worst case, there are a couple of other reasons such 
as
normal pci remove case we halt the host and reset the hardware after first HCD 
(USB2)
is removed, with all the secondary HCD (USB3) sand all its devices still 
connected,

Or then the abnormal case where HC disappears, we may time out while giving 
back a
killed URB, and  may end up never returning it. USB core waits with the roothub 
device
lock held for the URB, and we try to tear down xhci, which also requires the 
roothub
device lock at some point -> deadlock.

I'm am looking at these, but I need to make sure i fix it properly and not 
cause even
more issues.

-Mathias  
  

 



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: usb:xhci: support disable usb2 LPM Remote Wakeup

2016-12-12 Thread Mathias Nyman

On 12.12.2016 06:00, Thang Q. Nguyen wrote:

On Sat, Dec 10, 2016 at 4:36 AM, Rob Herring  wrote:

On Sun, Dec 04, 2016 at 07:42:01PM +0700, Thang Q. Nguyen wrote:

From: Thang Nguyen 

As per USB 2.0 link power management addendum ECN, table 1-2, page 4,
device or host initiated via resume signaling; device-initiated resumes
can be optionally enabled/disabled by software. This patch adds support
to control enabling the USB2 RWE feature via DT/ACPI attribute.

Signed-off-by: Vu Nguyen 
Signed-off-by: Thang Nguyen 
---
  Documentation/devicetree/bindings/usb/usb-xhci.txt | 1 +
  drivers/usb/host/xhci-plat.c   | 3 +++
  drivers/usb/host/xhci.c| 5 -
  drivers/usb/host/xhci.h| 1 +
  4 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt 
b/Documentation/devicetree/bindings/usb/usb-xhci.txt
index 966885c..9b4cd14 100644
--- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
+++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
@@ -25,6 +25,7 @@ Required properties:

  Optional properties:
- clocks: reference to a clock
+  - usb2-rwe-disable: disable USB2 LPM Remote Wakeup capable


Remote wakeup has been around since USB 1.0 days. Does this need to be
USB2 or XHCI specific?

This is XHCI specific. Per XHCI specification 1.1, remote wakeup is
optional for XHCI 1.0 and required for XHCI 1.1. This patch provides
ability for software to disable RWE for USB2 in XHCI1.0 controller.


I think I understand what's going on.

USB:
  
The good old USB2 suspend is called L2. Device enters it after 3ms if there is no link activity.

If a device can remote wakeup (RWE) it's stated in the descriptor. RWE can be 
turned on
of off using standard SET/CLEAR Fature requests

The LPM L1 USB2 state again is entered with a LPM extended transaction to avoid 
the
3ms wait before powersaving. L1 state is exit can be done with a simialr RWE as 
L2 resume.
The RWE from L1 can turned on/off using a bit in the LPM extended transaction.

XHCI:

Specs say that if the device supports RWE we should enable it for LPM L1 exit 
as well.
This is done by setting the RWE (LPM L1) bit in PORTPMSC register. This bit 
only affect LPM L1 remote
wake. see 4.23.5.1.1.1

The issue might be that xhci driver never check if the device actually supports 
RWE, we always
set the PORTPMSC RWE  (for LPM L1) bit.

How about checking something like udev->do_remote_wakeup and setting and 
setting the bit
based on that.

The function that you are changing,  xhci_set_usb2_hardware_lpm() should only 
be used if
host has Hardware LPM Cabaility bit (HLC) set for that USB2 port in the
USB 2.0 xHCI Supported Protocol Capability.
Host that don't supprt LPM won't have that set. See xhci 7.2.2.1.3.2
 
-Mathias








--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT PATCH 1/1] xhci: free xhci virtual devices with leaf nodes first

2016-12-12 Thread Mathias Nyman

On 09.12.2016 23:28, Guenter Roeck wrote:

On Wed, Nov 30, 2016 at 01:41:24PM +0200, Mathias Nyman wrote:

On 28.11.2016 22:24, Guenter Roeck wrote:

On Wed, Nov 23, 2016 at 02:24:27PM +0200, Mathias Nyman wrote:

the tt_info provided by a HS hub might be in use to by a child device
Make sure we free the devices in the correct order.

This is needed in special cases such as when xhci controller is
reset when resuming from hibernate, and all virt_devices are freed.

Also free the virt_devices starting from max slot_id as children
more commonly have higher slot_id than parent.

CC: 
Signed-off-by: Mathias Nyman 

---

Guenter Roeck, does this work for you?


Yes, it does.

Tested-by: Guenter Roeck 

Thanks,
Guenter



Thanks, I'll send it forward with proper Reported-by and Tested-by tags
after 4.10-rc1


Do you have this patch sitting in some branch, by any chance ?
I would like to pick it up if possible.



Now in

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git for-usb-linus

Branch is not very well tested yet

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-12 Thread Mathias Nyman

On 05.12.2016 09:51, Baolin Wang wrote:

If a command event is found on the event ring during an interrupt,
we need to stop the command timer with del_timer(). Since del_timer()
can fail if the timer is running and waiting on the xHCI lock, then
it maybe get the wrong timeout command in xhci_handle_command_timeout()
if host fetched a new command and updated the xhci->current_cmd in
handle_cmd_completion(). For this situation, we need a way to signal
to the command timer that everything is fine and it should exit.


Ah, right, this could actually happen.
 


We should introduce a counter (xhci->current_cmd_pending) for the number
of pending commands. If we need to cancel the command timer and del_timer()
succeeds, we decrement the number of pending commands. If del_timer() fails,
we leave the number of pending commands alone.

For handling timeout command, in xhci_handle_command_timeout() we will check
the counter after decrementing it, if the counter (xhci->current_cmd_pending)
is 0, which means xhci->current_cmd is the right timeout command. If the
counter (xhci->current_cmd_pending) is greater than 0, which means current
timeout command has been handled by host and host has fetched new command as
xhci->current_cmd, then just return and wait for new current command.


A counter like this could work.

Writing the abort bit can generate either ABORT+STOP, or just STOP
event, this seems to cover both.

quick check, case 1: timeout and cmd completion at the same time.

cpu1cpu2

queue_command(first), p++ (=1)
queue_command(more),
--completion irq fires---- timer times out at same time--
handle_cmd_completion() handle_cmd_timeout(),)
lock(xhci_lock  )   spin_on(xhci_lock)
del_timer() fail, p (=1, nochange)
cur_cmd = list_next(), p++ (=2)
unlock(xhci_lock)   
lock(xhci_lock)
p-- (=1)
if (p > 0), exit
OK works

case 2: normal timeout case with ABORT+STOP, no race.

cpu1cpu2

queue_command(first), p++ (=1)
queue_command(more),
handle_cmd_timeout()
p-- (P=0), don't exit
mod_timer(), p++ (P=1)
write_abort_bit()
handle_cmd_comletion(ABORT)
del_timer(), ok, p-- (p = 0)
handle_cmd_completion(STOP) 
del_timer(), fail, (P=0)
handle_stopped_cmd_ring()
cur_cmd = list_next(), p++ (=1)
mod_timer()

OK, works, and same for just STOP case, with the only difference that
during handle_cmd_completion(STOP) p is decremented (p--)

So unless there is a way to find out if cur_cmd is valid in command timeout
in command timeout with the help of existing flags and lists this would be a 
working
solution.

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] xhci: Remove busy loop from xhci_abort_cmd_ring()

2016-12-14 Thread Mathias Nyman

On 14.12.2016 01:40, OGAWA Hirofumi wrote:

ping about [PATCH 1/3, 2/3, 3/3]?



1/3 and 2/3 will be sent to 4.10 usb-linus after rc1,
3/3 maybe to usb-next after that

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: xhci_reset_endpoint() doesn't reset endpoint

2016-12-14 Thread Mathias Nyman

On 14.12.2016 12:58, Michal Necasek wrote:

prior to the endpoint reset. SetFeature(CLEAR_HALT) resets the toggle
on the device, but not on the host. But we know for a fact that the
device sends a packet (with data toggle 0) which the host USB stack
never sees, and a data toggle mismatch explains that quite well.

We are using USBFS to talk to the printer, but that shouldn't matter
much. I will note that the available documentation<1> explicitly says
that USBDEVFS_RESETEP and USBDEVFS_CLEAR_HALT both reset the data
toggle. That is indeed the case for the Linux EHCI driver but not
xHCI. Both of the USBFS IOCTLs call into xhci_reset_endpoint() which
does nothing.



This is very likely the case.

xhci can not reset the host side of the endpoint unless it really is halted.
xhci 4.6.8:

"If the endpoint is not in the Halted state when an Reset Endpoint  Command
is executed -The xHC shall reject the command and generate a  Command
Completion Event with the Completion Code set to Context State Error."


Normal halt/stall case is that xhci receives a STALL from the device,
and immediately resets the endpoint (clears toggle, host side) then
propagates the HALT status to usb core.
USB core then sends SetFeature(CLEAR_HALT) to the device which will reset the
toggle for the device side of the endpoint, and host and device toggles
will be in sync.

After this xhci_endpoint_reset() is called by usb core to inform xhci that the
endpoint was reset, but currently we don't do anything in it.

If SetFeature(CLEAR_HALT) is called without endpoint actually being HALTED we 
can not
reset it from xhci. we should issue a config endpoint command to reset the host 
side
toggle, as mentioned in xhci 1.0 120814 as a last note:

"Note: The Reset Endpoint Command may only be issued to endpoints in the Halted 
state.
If software wishes reset the Data Toggle or Sequence Number of an endpoint that 
isn't
in the Halted state, then software may issue a Configure Endpoint Command with 
the Drop
and Add bits set for the target endpoint. that is in the Stopped state."

There was a case with a scanner we believed had the same issue, and we tried to
resolve it by issuing the configure endpoint command in xhci_endpoint_reset() 
but
if I remember correctly It did not resolve the case and code never got anywhere.

I might have some really old implementation somewhere for this, at least there 
is
a really old and outdated hack at


git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git ep_reset_halt_test
https://git.kernel.org/cgit/linux/kernel/git/mnyman/xhci.git/log/?h=ep_reset_halt_test

which really is quite a hack, and based on 3.19 kernel so it's probably only 
useful
as an Idea to base a real solution on.

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-19 Thread Mathias Nyman

On 13.12.2016 05:21, Baolin Wang wrote:

Hi Mathias,

On 12 December 2016 at 23:52, Mathias Nyman
 wrote:

On 05.12.2016 09:51, Baolin Wang wrote:


If a command event is found on the event ring during an interrupt,
we need to stop the command timer with del_timer(). Since del_timer()
can fail if the timer is running and waiting on the xHCI lock, then
it maybe get the wrong timeout command in xhci_handle_command_timeout()
if host fetched a new command and updated the xhci->current_cmd in
handle_cmd_completion(). For this situation, we need a way to signal
to the command timer that everything is fine and it should exit.



Ah, right, this could actually happen.




We should introduce a counter (xhci->current_cmd_pending) for the number
of pending commands. If we need to cancel the command timer and
del_timer()
succeeds, we decrement the number of pending commands. If del_timer()
fails,
we leave the number of pending commands alone.

For handling timeout command, in xhci_handle_command_timeout() we will
check
the counter after decrementing it, if the counter
(xhci->current_cmd_pending)
is 0, which means xhci->current_cmd is the right timeout command. If the
counter (xhci->current_cmd_pending) is greater than 0, which means current
timeout command has been handled by host and host has fetched new command
as
xhci->current_cmd, then just return and wait for new current command.



A counter like this could work.

Writing the abort bit can generate either ABORT+STOP, or just STOP
event, this seems to cover both.

quick check, case 1: timeout and cmd completion at the same time.

cpu1cpu2

queue_command(first), p++ (=1)
queue_command(more),
--completion irq fires---- timer times out at same time--
handle_cmd_completion() handle_cmd_timeout(),)
lock(xhci_lock  )   spin_on(xhci_lock)
del_timer() fail, p (=1, nochange)
cur_cmd = list_next(), p++ (=2)
unlock(xhci_lock)
 lock(xhci_lock)
 p-- (=1)
 if (p > 0), exit
OK works

case 2: normal timeout case with ABORT+STOP, no race.

cpu1cpu2

queue_command(first), p++ (=1)
queue_command(more),
 handle_cmd_timeout()
 p-- (P=0), don't exit
 mod_timer(), p++ (P=1)
 write_abort_bit()
handle_cmd_comletion(ABORT)
del_timer(), ok, p-- (p = 0)
handle_cmd_completion(STOP)
del_timer(), fail, (P=0)
handle_stopped_cmd_ring()
cur_cmd = list_next(), p++ (=1)
mod_timer()

OK, works, and same for just STOP case, with the only difference that
during handle_cmd_completion(STOP) p is decremented (p--)


Yes, that's the cases what I want to handle, thanks for your explicit
explanation.



Gave this some more thought over the weekend, and this implementation
doesn't solve the case when the last command times out and races with the
completion handler:

cpu1cpu2

queue_command(first), p++ (=1)
--completion irq fires---- timer times out at same time--
handle_cmd_completion() handle_cmd_timeout(),)
lock(xhci_lock )spin_on(xhci_lock)
del_timer() fail, p (=1, nochange)
no more commands, P (=1, nochange)
unlock(xhci_lock)
lock(xhci_lock)
p-- (=0)
p == 0, continue, even if we should not.
  
For this we still need to rely on checking cur_cmd == NULL in the timeout function.

(Baolus patch sets it to NULL if there are no more commands pending)

And then we could replace the whole counter with a simple check if the timeout 
timer
is pending in the timeout function:

xhci_handle_command_timeout()
lock()
if (!cur_cmd || timer_pending(timeout_timer)) {
unlock();
return;
}

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-19 Thread Mathias Nyman

On 19.12.2016 13:34, Baolin Wang wrote:

Hi Mathias,

On 19 December 2016 at 18:33, Mathias Nyman
 wrote:

On 13.12.2016 05:21, Baolin Wang wrote:


Hi Mathias,

On 12 December 2016 at 23:52, Mathias Nyman
 wrote:


On 05.12.2016 09:51, Baolin Wang wrote:



If a command event is found on the event ring during an interrupt,
we need to stop the command timer with del_timer(). Since del_timer()
can fail if the timer is running and waiting on the xHCI lock, then
it maybe get the wrong timeout command in xhci_handle_command_timeout()
if host fetched a new command and updated the xhci->current_cmd in
handle_cmd_completion(). For this situation, we need a way to signal
to the command timer that everything is fine and it should exit.




Ah, right, this could actually happen.




We should introduce a counter (xhci->current_cmd_pending) for the number
of pending commands. If we need to cancel the command timer and
del_timer()
succeeds, we decrement the number of pending commands. If del_timer()
fails,
we leave the number of pending commands alone.

For handling timeout command, in xhci_handle_command_timeout() we will
check
the counter after decrementing it, if the counter
(xhci->current_cmd_pending)
is 0, which means xhci->current_cmd is the right timeout command. If the
counter (xhci->current_cmd_pending) is greater than 0, which means
current
timeout command has been handled by host and host has fetched new
command
as
xhci->current_cmd, then just return and wait for new current command.




A counter like this could work.

Writing the abort bit can generate either ABORT+STOP, or just STOP
event, this seems to cover both.

quick check, case 1: timeout and cmd completion at the same time.

cpu1cpu2

queue_command(first), p++ (=1)
queue_command(more),
--completion irq fires---- timer times out at same time--
handle_cmd_completion() handle_cmd_timeout(),)
lock(xhci_lock  )   spin_on(xhci_lock)
del_timer() fail, p (=1, nochange)
cur_cmd = list_next(), p++ (=2)
unlock(xhci_lock)
  lock(xhci_lock)
  p-- (=1)
  if (p > 0), exit
OK works

case 2: normal timeout case with ABORT+STOP, no race.

cpu1cpu2

queue_command(first), p++ (=1)
queue_command(more),
  handle_cmd_timeout()
  p-- (P=0), don't exit
  mod_timer(), p++ (P=1)
  write_abort_bit()
handle_cmd_comletion(ABORT)
del_timer(), ok, p-- (p = 0)
handle_cmd_completion(STOP)
del_timer(), fail, (P=0)
handle_stopped_cmd_ring()
cur_cmd = list_next(), p++ (=1)
mod_timer()

OK, works, and same for just STOP case, with the only difference that
during handle_cmd_completion(STOP) p is decremented (p--)



Yes, that's the cases what I want to handle, thanks for your explicit
explanation.



Gave this some more thought over the weekend, and this implementation
doesn't solve the case when the last command times out and races with the
completion handler:

cpu1cpu2

queue_command(first), p++ (=1)
--completion irq fires---- timer times out at same time--
handle_cmd_completion() handle_cmd_timeout(),)
lock(xhci_lock )spin_on(xhci_lock)
del_timer() fail, p (=1, nochange)
no more commands, P (=1, nochange)
unlock(xhci_lock)
 lock(xhci_lock)
 p-- (=0)
 p == 0, continue, even if we should
not.
   For this we still need to rely on
checking cur_cmd == NULL in the timeout function.
(Baolus patch sets it to NULL if there are no more commands pending)


As I pointed out in patch 1 of this patchset, this patchset is based
on Lu Baolu's new fix patch:
usb: xhci: fix possible wild pointer
https://www.spinics.net/lists/linux-usb/msg150219.html

After applying Baolu's patch, after decrement the counter, we will
check the xhci->cur_command if is NULL. So in this situation:
cpu1cpu2

  queue_command(first), p++ (=1)
  --completion irq fires---- timer times out at same time--
  handle_cmd_completion() handle_cmd_timeout(),)
  lock(xhci_lock )spin_on(xhci_lock)
  del_timer() fail, p (=1, nochange)
  no more commands, P (=1, nochange)
  unlock(xhci_lock)
  lock(xhci_lock)
  p-- (=0)
  no current command, return
  if (!xhci->current_cmd) {
   

Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-20 Thread Mathias Nyman

On 20.12.2016 09:30, Baolin Wang wrote:
...

Alright, I gathered all current work related to xhci races and timeouts
and put them into a branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes

Its based on 4.9
It includes a few other patches just to avoid conflicts and  make my life easier

Interesting patches are:

ee4eb91 xhci: remove unnecessary check for pending timer
0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
4f2535f xhci: Handle command completion and timeout race
b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
529a5a0 usb: xhci: fix possible wild pointer
4766555 xhci: Fix race related to abort operation
de834a3 xhci: Use delayed_work instead of timer for command timeout
69973b8 Linux 4.9

The fixes for command queue races will go to usb-linus and stable, the
reworks for stop ep watchdog timer will go to usb-next.

Still completely untested, (well it compiles)

Felipe gave instructions how to modify dwc3 driver to timeout on address
devicecommands to test these, I'll try to set that up.

All additional testing is welcome, especially if you can trigger timeouts
and races

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-21 Thread Mathias Nyman

On 21.12.2016 08:17, Lu Baolu wrote:

Hi Mathias,

I have some comments for the implementation of xhci_abort_cmd_ring() below.

On 12/20/2016 11:13 PM, Mathias Nyman wrote:

On 20.12.2016 09:30, Baolin Wang wrote:
...

Alright, I gathered all current work related to xhci races and timeouts
and put them into a branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes

Its based on 4.9
It includes a few other patches just to avoid conflicts and  make my life easier

Interesting patches are:

ee4eb91 xhci: remove unnecessary check for pending timer
0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
4f2535f xhci: Handle command completion and timeout race
b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
529a5a0 usb: xhci: fix possible wild pointer
4766555 xhci: Fix race related to abort operation
de834a3 xhci: Use delayed_work instead of timer for command timeout
69973b8 Linux 4.9

The fixes for command queue races will go to usb-linus and stable, the
reworks for stop ep watchdog timer will go to usb-next.

Still completely untested, (well it compiles)

Felipe gave instructions how to modify dwc3 driver to timeout on address
devicecommands to test these, I'll try to set that up.

All additional testing is welcome, especially if you can trigger timeouts
and races

-Mathias




Below is the latest code. I put my comments in line.

  322 static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
  323 {
  324 u64 temp_64;
  325 int ret;
  326
  327 xhci_dbg(xhci, "Abort command ring\n");
  328
  329 reinit_completion(&xhci->cmd_ring_stop_completion);
  330
  331 temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
  332 xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
  333 &xhci->op_regs->cmd_ring);

We should hold xhci->lock when we are modifying xhci registers
at runtime.



Makes sense, but we need to unlock it before sleeping or waiting for completion.
I need to look into that in more detail.

But this was an issue already before these changes.


The retry of setting CMD_RING_ABORT is not necessary according to
previous discussion. We have cleaned code for second try in
xhci_handle_command_timeout(). Need to clean up here as well.



Yes it can be cleaned up as well, but the two cases are a bit different.
The cleaned up one was about command ring not starting again after it was 
stopped.

This second try is a workaround for what we thought was the command ring failing
to stop in the first place, but is most likely due to the race that OGAWA 
Hirofumi
fixed.  It races if the stop command ring interrupt happens between writing the 
abort
bit and polling for the ring stopped bit. The interrupt hander may start the 
command
ring again, and we would believe we failed to stop it in the first place.

This race could probably be fixed by just extending the lock (and preventing
interrupts) to cover both writing the abort bit and polling for the command ring
running bit, as you pointed out here previously.

But then again I really like OGAWA Hiroumi's solution that separates the
command ring stopping from aborting commands and restarting the ring.

The current way of always restarting the command ring as a response to
a stop command ring command really limits its usage.

So, with this in mind most reasonable would be to
1. fix the lock to cover abort+CRR check, and send it to usb-linus +stable
2. rebase OGAWA Hirofumi's changes on top of that, and send to usb-linus only
3. remove unnecessary second abort try as a separate patch, send to usb-next
4. remove polling for the Command ring running (CRR), waiting for completion
   is enough, if completion times out then we can check CRR. for usb-next
   
I'll fix the typos these patches would introduce. Fixing old typos can be done as separate

patches later.

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-21 Thread Mathias Nyman

On 21.12.2016 08:57, Lu Baolu wrote:

Hi Mathias,

I have some comments for the implementation of
xhci_handle_command_timeout() as well.

On 12/20/2016 11:13 PM, Mathias Nyman wrote:

On 20.12.2016 09:30, Baolin Wang wrote:
...

Alright, I gathered all current work related to xhci races and timeouts
and put them into a branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes

Its based on 4.9
It includes a few other patches just to avoid conflicts and  make my life easier

Interesting patches are:

ee4eb91 xhci: remove unnecessary check for pending timer
0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
4f2535f xhci: Handle command completion and timeout race
b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
529a5a0 usb: xhci: fix possible wild pointer
4766555 xhci: Fix race related to abort operation
de834a3 xhci: Use delayed_work instead of timer for command timeout
69973b8 Linux 4.9

The fixes for command queue races will go to usb-linus and stable, the
reworks for stop ep watchdog timer will go to usb-next.

Still completely untested, (well it compiles)

Felipe gave instructions how to modify dwc3 driver to timeout on address
devicecommands to test these, I'll try to set that up.

All additional testing is welcome, especially if you can trigger timeouts
and races

-Mathias




I post the code below and add my comments in line.

1276 void xhci_handle_command_timeout(struct work_struct *work)
1277 {
1278 struct xhci_hcd *xhci;
1279 int ret;
1280 unsigned long flags;
1281 u64 hw_ring_state;
1282
1283 xhci = container_of(to_delayed_work(work), struct xhci_hcd, 
cmd_timer);
1284
1285 spin_lock_irqsave(&xhci->lock, flags);
1286
1287 /*
1288  * If timeout work is pending, or current_cmd is NULL, it means we
1289  * raced with command completion. Command is handled so just 
return.
1290  */
1291 if (!xhci->current_cmd || delayed_work_pending(&xhci->cmd_timer)) {
1292 spin_unlock_irqrestore(&xhci->lock, flags);
1293 return;
1294 }
1295 /* mark this command to be cancelled */
1296 xhci->current_cmd->status = COMP_CMD_ABORT;
1297
1298 /* Make sure command ring is running before aborting it */
1299 hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
1300 if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
1301 (hw_ring_state & CMD_RING_RUNNING))  {
1302 /* Prevent new doorbell, and start command abort */
1303 xhci->cmd_ring_state = CMD_RING_STATE_ABORTED;
1304 spin_unlock_irqrestore(&xhci->lock, flags);
1305 xhci_dbg(xhci, "Command timeout\n");
1306 ret = xhci_abort_cmd_ring(xhci);
1307 if (unlikely(ret == -ESHUTDOWN)) {
1308 xhci_err(xhci, "Abort command ring failed\n");
1309 xhci_cleanup_command_queue(xhci);
1310 usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
1311 xhci_dbg(xhci, "xHCI host controller is dead.\n");
1312 }
1313 return;
1314 }
1315
1316 /* host removed. Bail out */
1317 if (xhci->xhc_state & XHCI_STATE_REMOVING) {
1318 spin_unlock_irqrestore(&xhci->lock, flags);
1319 xhci_dbg(xhci, "host removed, ring start fail?\n");
1320 xhci_cleanup_command_queue(xhci);
1321 return;
1322 }

I think this part of code should be moved up to line 1295.


The XHCI_STATE_REMOVING and XHCI_STATE_DYING needs a rework,
I'm working on that.

Basically we want XHCI_STATE_REMOVING to mean that  all devices are going,
away and driver will be removed. Don't bother with re-calculating available
bandwidths after every device removal, but do use xhci hardware to disable
devices cleanly etc.

XHCI_STATE_DYING should mean hardware is not working/responding. Don't
bother writing any registers or queuing anything. Just return all
pending and cancelled URBs, notify core we died, and free all allocated memory.



1323
1324 /* command timeout on stopped ring, ring can't be aborted */
1325 xhci_dbg(xhci, "Command timeout on stopped ring\n");
1326 xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
1327 spin_unlock_irqrestore(&xhci->lock, flags);

This part of code is tricky. I have no idea about in which case should this
code be executed? Anyway, we shouldn't call xhci_handle_stopped_cmd_ring()
here, right?



This isn't changed it these patches.

It will remove the aborted commands and restart the ring. It's useful if

Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-21 Thread Mathias Nyman

On 21.12.2016 04:22, Baolin Wang wrote:

Hi Mathias,

On 20 December 2016 at 23:13, Mathias Nyman
 wrote:

On 20.12.2016 09:30, Baolin Wang wrote:
...

Alright, I gathered all current work related to xhci races and timeouts
and put them into a branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
timeout_race_fixes

Its based on 4.9
It includes a few other patches just to avoid conflicts and  make my life
easier

Interesting patches are:

ee4eb91 xhci: remove unnecessary check for pending timer
0cba67d xhci: detect stop endpoint race using pending timer instead of
counter.
4f2535f xhci: Handle command completion and timeout race
b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort
command
529a5a0 usb: xhci: fix possible wild pointer
4766555 xhci: Fix race related to abort operation
de834a3 xhci: Use delayed_work instead of timer for command timeout
69973b8 Linux 4.9

The fixes for command queue races will go to usb-linus and stable, the
reworks for stop ep watchdog timer will go to usb-next.


How about applying below patch in your 'timeout_race_fixes' branch?
[PATCH] usb: host: xhci: Clean up commands when stop endpoint command is timeout
https://lkml.org/lkml/2016/12/13/94



usb_hc_died() should eventyally end up calling xhci_mem_cleanup()
which will cleanup the command queue. But I need to look into that

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-21 Thread Mathias Nyman

On 21.12.2016 16:10, OGAWA Hirofumi wrote:

Mathias Nyman  writes:


Below is the latest code. I put my comments in line.

   322 static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
   323 {
   324 u64 temp_64;
   325 int ret;
   326
   327 xhci_dbg(xhci, "Abort command ring\n");
   328
   329 reinit_completion(&xhci->cmd_ring_stop_completion);
   330
   331 temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
   332 xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
   333 &xhci->op_regs->cmd_ring);

We should hold xhci->lock when we are modifying xhci registers
at runtime.



Makes sense, but we need to unlock it before sleeping or waiting for
completion.  I need to look into that in more detail.

But this was an issue already before these changes.


We set CMD_RING_STATE_ABORTED state under locking. I'm not checking what
is for taking lock for register though, I guess it should be enough just
lock around of read=>write of ->cmd_ring if need lock.


After your patch it should be enough to have the lock only while
reading and writing the cmd_ring register.

If we want a locking fix that applies more easily to older stable releases 
before
your change then the lock needs to cover set CMD_RING_STATE_ABORT, read cmd_reg,
write cmd_reg and busiloop checking CRR bit.  Otherwise the stop cmd ring 
interrupt
handler may restart the ring just before we start checing CRR. The stop cmd ring
interrupt will set the CMD_RING_STATE_ABORTED to CMD_RING_STATE_RUNNING so
ring will really restart in the interrupt handler.



[Rather ->cmd_ring user should check CMD_RING_STATE_ABORTED state.]


But then again I really like OGAWA Hiroumi's solution that separates the
command ring stopping from aborting commands and restarting the ring.

The current way of always restarting the command ring as a response to
a stop command ring command really limits its usage.

So, with this in mind most reasonable would be to
1. fix the lock to cover abort+CRR check, and send it to usb-linus +stable
2. rebase OGAWA Hirofumi's changes on top of that, and send to usb-linus only
3. remove unnecessary second abort try as a separate patch, send to usb-next
4. remove polling for the Command ring running (CRR), waiting for completion
 is enough, if completion times out then we can check CRR. for usb-next


I think we should check both of CRR and even of stop completion. Because
CRR and stop completion was not same time (both can be first).


We can keep both, maybe change the order and do the busylooping-checking after
waiting for completion, but thats a optimization for usb-next sometimes later.

Thanks

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] Intel cherrytrail xhci extended cap phy/mux support

2016-12-22 Thread Mathias Nyman

On 22.12.2016 14:45, Hans de Goede wrote:

Hi,

On 22-12-16 13:11, Felipe Balbi wrote:


Hi,

Hans de Goede  writes:

Hi All,

Here are 2 patches which can and should be merged separately, but
which do belong together, as together they add support for the
usb-phy / mux bits found in the Intel Cherrytrail xhci vendor defined
extended capabilities registers.

I did not want to hide an entire phy driver inside the xhci code,
nor did I want to add an extcon dependency to the xhci code, so
I've chosen to simply make the xhci code register a platform childdev
with an IOMEM resource with the Intel Cherrytrail xhci vendor defined
extended capabilities registers when these are found. This keeps the
xhci driver clean and allows writing a separate phy driver, this is
the first patch, which should be merged through the USB tree.

The second patch adds the actual phy driver, one thing which stands
out is that it completely depends on extcon devices to get idpin and
vbus state as that is done by other chips on cherrytrail devices,
atm the extcon devices are hardcoded to axp288_extcon for the vbus
detection and the ACPI INT3496:00 device for the idpin detection.

On some boards a different pmic then the axp288 may be used, so in
the future we may end up with an array with extcon device names to
try until we've found one.

The phy driver also adds a mode sysfs attribute, this is useful for
testing, as well with a so called charging oth hub, where the idpin
will indicate device/charge mode, but we can also use usb-devices
plugged into the hub by switching the data lines to host mode.


Baolu has worked on this for months but it turned out that several folks
said 'no' to his patchset. You're not really dealing with a PHY, it's
just a portmux which can generate some UTMI messages to make sure dwc3
is happy.


Right, I opted for a phy driver because that is the closest I could
get to something resembling what this thing does.


The end of the story about this, at least for broxton, was that mux
control was moved to ASL. I'm not sure why folks didn't update CHT (or
other devices') BIOS though.


I don't think expecting things to get "fixed" by firmware updates, esp.
firmware updates adding whole new functionality is going to happen, that
just is not realistic (esp. with cheap devices like these). And even if
it were fixed in firmware users are really bad add updating there firmware.


Baolu, any comments to this series?

Personally, I'm unwilling to take this (or the other 2-patch dwc3 series
adding IDA) because there will only be a single user.


Only a single user ? You mean only a single SoC will use this I guess ?

But there are many many devices with this SoC and lots of people are
trying to run "plain" Linux on these.

I can see how you don't want this as a phy driver though, I would
be happy to drop the phy-binding bits (they're not really used
anyways) and drop this under drivers/misc adding myself to MAINTAINERS
for it, would that be an acceptable solution ?

About the "other 2-patch dwc3 series", I'm fine with you not taking
the IDA patch, but the first patch is unrelated to IDA, it is a pure
bug-fix and really should be upstreamed.

Matthias, assuming Felipe is ok with putting this in drivers/misc
(minus the phy bindings), are you ok with the xhci blurb which
registers the platform-device for the "misc" driver to bind to?


Not the ideal solution, but creating a device would sound better than carrying 
it
in xhci driver. ASL/firmware would be the preferred solution.

I'm probably not aware of all possible solutions to this kind of a MFD-like
situation so I'm open to other Ideas.

patch [1/2] will create a platform device for almost all Intel hosts, even if
we only want it for CHT, so as is that solution can't be accepted.

-Mathias

 



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/1] xhci: Fix race related to abort operation

2016-12-23 Thread Mathias Nyman

On 23.12.2016 08:50, Lu Baolu wrote:

Hi Mathias,

This is a follow-up patch for below comment

"rebase OGAWA Hirofumi's changes on top of that, and send to usb-linus only"

in below discussion thread.

https://lkml.org/lkml/2016/12/21/186

I rebased the patch and added unlock-before and lock-after xhci->lock during
waiting for the command stopped event.



Thanks,
I edited the patch in a very similar way already,

-Mathias


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] usb: xhci: hold lock over xhci_abort_cmd_ring()

2016-12-23 Thread Mathias Nyman

On 23.12.2016 08:46, Lu Baolu wrote:

In command timer function, xhci_handle_command_timeout(), xhci->lock
is unlocked before call into xhci_abort_cmd_ring(). This might cause
race between the timer function and the event handler.

The xhci_abort_cmd_ring() function sets the CMD_RING_ABORT bit in the
command register and polling it until the setting takes effect. A stop
command ring event might be handled between writing the abort bit and
polling for it. The event handler will restart the command ring, which
causes the failure of polling, and we ever believed that we failed to
stop it.

As a bonus, this also fixes some issues of calling functions without
locking in xhci_handle_command_timeout().



Did the same thing, moved the unlock to cover also abort_cmd_ring(),

but this one takes care of locking the command ring cleanup as well
so I'll pick up this instead

-Mathias


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2016-12-23 Thread Mathias Nyman

On 22.12.2016 03:46, Lu Baolu wrote:

Hi,

On 12/21/2016 11:18 PM, OGAWA Hirofumi wrote:

Mathias Nyman  writes:


We set CMD_RING_STATE_ABORTED state under locking. I'm not checking what
is for taking lock for register though, I guess it should be enough just
lock around of read=>write of ->cmd_ring if need lock.

After your patch it should be enough to have the lock only while
reading and writing the cmd_ring register.

If we want a locking fix that applies more easily to older stable
releases before your change then the lock needs to cover set
CMD_RING_STATE_ABORT, read cmd_reg, write cmd_reg and busiloop
checking CRR bit.  Otherwise the stop cmd ring interrupt handler may
restart the ring just before we start checing CRR. The stop cmd ring
interrupt will set the CMD_RING_STATE_ABORTED to
CMD_RING_STATE_RUNNING so ring will really restart in the interrupt
handler.

Just for record (no chance to make patch I myself for now, sorry), while
checking locking slightly, I noticed unrelated missing locking.

xhci_cleanup_command_queue()

We are calling it without locking, but we need to lock for accessing list.


Yeah. I can make the patch.



Force updated timeout_race_fixes branch.

I'll be out for a few days during xmas, continue testing after that

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: xHCI issues Reset Device Command at invalid states

2017-01-02 Thread Mathias Nyman

On 30.12.2016 14:01, Felipe Balbi wrote:


Hi Mathias,

So the problem I found with v4.10-rc1 doesn't appear to be a
regression. I can't, however, trigger it with Broadwell, only Skylake
and Kabylake.

According to tracepoints, our Reset Device Command sometimes completes
with "Context State Error", which tells us that we're issuing Reset
Device Command when we shouldn't.


This could happen if reset device is issued twice,

xhci reset device command only works for slots in Addressed or
Configured states. Reset device sets the slot to "Defult" state.

If the slot is already in Default state when a reset device command
is issued xHC will return a  "Context State Error"



Another thing I noticed is that we're clearing PortFeature(PortReset)
less than 20ms after setting it. Full tracepoint data attached (slot 28
is the device in question. Slot 12, AFAICT, it's parent hub), but here's
a snippet showing both problems:


The device is connected to a external hub. I think this should be handled
by usb core.

hub_port_reset() in usb core should have 10ms TDRST to drive resume
and 10 ms TRSTRCY for reset recovery, (looks like it has 10+40ms for recovery)

I haven't checked, but I think it should be ok to ask the hub about the port 
status
GetPortStatus() even during reset recovery, as long as we don't communicate 
with the
device behind the hut that is being reset.

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: xHCI issues Reset Device Command at invalid states

2017-01-02 Thread Mathias Nyman

On 02.01.2017 14:13, Felipe Balbi wrote:


Hi,

Felipe Balbi  writes:

Hi Mathias,

So the problem I found with v4.10-rc1 doesn't appear to be a
regression. I can't, however, trigger it with Broadwell, only Skylake
and Kabylake.

According to tracepoints, our Reset Device Command sometimes completes
with "Context State Error", which tells us that we're issuing Reset
Device Command when we shouldn't.

Another thing I noticed is that we're clearing PortFeature(PortReset)
less than 20ms after setting it. Full tracepoint data attached (slot 28


actually, this is not a problem. Reset is supposed to be 10ms minimum.


device-reset-2819  [002] d..1   154.868782: xhci_queue_trb: CMD: Reset Device 
Command: slot 28 flags C
   -0 [003] d.h2   154.868832: xhci_handle_event: EVENT: TRB 
0001be2b2e60 status 'Context State Error' len 0 slot 28 ep 0 type 'Command 
Completion Event' flags e:C


And here's xHCI telling us that slot 28's Context was is invalid status.


I traced this down to Slot being in Default state. According to Figure
10 Slot State Diagram, Reset Device is only allow from Configured or
Addressed. Any other states, Reset Device is invalid and shouldn't be
issued.

Now I'm wondering whether we should hide this fact from USB Core, or do
we have a bigger problem with USB Core itself.


when core calls hcd->driver->reset_device() xhci
will always submit a reset device command  without checking the slot state,
even if it will return context state error when slot is in a default state.

As we just checked it looks like USB2 specs, fig 9.1 allows resetting
the device is default state, so this limitation is xhci specific.

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command

2017-01-02 Thread Mathias Nyman

On 27.12.2016 05:07, Baolin Wang wrote:

Hi,

On 21 December 2016 at 21:00, Mathias Nyman
 wrote:

On 21.12.2016 04:22, Baolin Wang wrote:


Hi Mathias,

On 20 December 2016 at 23:13, Mathias Nyman
 wrote:


On 20.12.2016 09:30, Baolin Wang wrote:
...

Alright, I gathered all current work related to xhci races and timeouts
and put them into a branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
timeout_race_fixes

Its based on 4.9
It includes a few other patches just to avoid conflicts and  make my life
easier

Interesting patches are:

ee4eb91 xhci: remove unnecessary check for pending timer
0cba67d xhci: detect stop endpoint race using pending timer instead of
counter.
4f2535f xhci: Handle command completion and timeout race
b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort
command
529a5a0 usb: xhci: fix possible wild pointer
4766555 xhci: Fix race related to abort operation
de834a3 xhci: Use delayed_work instead of timer for command timeout
69973b8 Linux 4.9

The fixes for command queue races will go to usb-linus and stable, the
reworks for stop ep watchdog timer will go to usb-next.



How about applying below patch in your 'timeout_race_fixes' branch?
[PATCH] usb: host: xhci: Clean up commands when stop endpoint command is
timeout
https://lkml.org/lkml/2016/12/13/94



usb_hc_died() should eventyally end up calling xhci_mem_cleanup()
which will cleanup the command queue. But I need to look into that


usb_hc_died() did not call xhci_mem_cleanup() to clean up command
queue, it just disconnects all children devices attached on the dying
hub. I did not find where it will clean up the command queue when
issuing usb_hc_died(). Also before issuing usb_hc_died() in
xhci_handle_command_timeout(), we will call
xhci_cleanup_command_queue(). I think it should same as in
xhci_stop_endpoint_command_watchdog().



You're right, it doesn't call xhci_mem_cleanup.
Need to look at this after getting first series of fixes to usb-linus

-Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   4   5   6   7   8   9   10   >