On 6/12/2024 2:34 AM, Alexander Lobakin wrote:
From: Josh Hay <joshua.a....@intel.com>
Date: Tue, 11 Jun 2024 11:13:53 -0700



On 6/11/2024 3:44 AM, Alexander Lobakin wrote:
From: David Decotigny <ddeco...@gmail.com>
Date: Tue, 4 Jun 2024 16:34:48 -0700

[...]

Note that there are several patches fixing Tx (incl. timeouts) in my
tree, including yours (Joshua's) which you somehow didn't send yet ._.
Maybe start from them first?

I believe it was you who specifically asked our team to defer pushing
any upstream patches while you were working on the XDP series "to avoid
having to rebase", which was a reasonable request at the time. We also

It was only related to the virtchnl refactoring and later I cancelled
that when I realized it will go earlier than our series.

had no reason to believe the existing upstream idpf implementation was
experiencing timeouts (it is being tested by numerous validation teams).
So there was no urgency to get those patches upstream. Which patches in
your tree do you believe fix specific timeout situations? It appears you

[0][1][2]

pulled in some of the changes from the out-of-tree driver, but those
were all enhancements. It wasn't until the workload that David mentioned

No, there are all fixes.

[0] is your from the OOT, extended. > [1] is mine and never was in the OOT.
[2] is your from the OOT, extended by MichaƂ.

My main point was since no other tx timeouts have been reported on the upstream driver, none of these seem like critical fixes. This particular timeout signature did not seem to match any of these in general. E.g. it would have been obvious if the timeout was because of what [0] fixes. It's also possible these timeouts did not manifest on the upstream driver because it is missing other OOT changes.


They really do help.

No disagreement there. I would've loved to push these changes sooner, but we already covered why that didn't happen.


Note that there's one more Tx timeout patch from you in the OOT, but it
actually broke Tx xD

If you are implying that our team would commit code that is knowingly broken, that is absolutely not true. I believe what you're referring to is a change that introduced a tx timeout that also took a very specific workload to trigger it. That issue was fixed and not applicable to the current upstream implementation, so I do not see how that is relevant to this conversation.


was run on the current driver that we had any indication there were
timeout issues.


I don't buy 30 seconds, at least for now. Maybe I'm missing something.

Nacked-by: Alexander Lobakin <aleksander.loba...@intel.com>


In the process of debugging the newly discovered timeout, our
architecture team made it clear that 5 seconds is too low for this type
of device, with a non deterministic pipeline where packets can take a
number of exception/slow paths. Admittedly, we don't know the exact

Slowpath which takes 30 seconds to complete, seriously?

The architecture team said 5s is too low. 30s was chosen to give ample cushion to avoid changing the timeout should this situation come up again.


number, so the solution for the time being was to bump it up with a
comfortable buffer. As we tune things and debug with various workloads,
we can bring it back down. As David mentioned, there is precedent for an
extended timeout for smartnics. Why is it suddenly unacceptable for
Intel's device?

I don't know where this "suddenly" comes from.
Because even 5 seconds is too much.
HW usually send packets in microseconds if not faster. Extending the
timeout will hide real issues and make debugging more difficult.

Can you please elaborate on exactly why it would be more difficult? If something is actually wrong in HW, it seems unlikely extra time would correct it. If something is functionally wrong in the driver, why does it matter if it's 5s, 15s, or 30s? It will timeout just the same.


I suspect this all is for OOO packets with an explicit sending timestamp
passed from the userspace, but as I said, you need to teach the kernel
watchdog to account them.

Out of order completions can happen for numerous reasons, some of which the driver will know nothing about, i.e. the userspace timestamps are not the only things that trigger OOO completions. We can detect that we're processing completion B before A, but it's only at that time that we can tell the stack to _maybe_ expect a delayed completion. I'm open to discussing this further, but it does not seem like a simple solution that can be implemented in the immediate future.

Otherwise, I can ask the driver to send a packet in 31 seconds, what
then, there will be a timeout and you will send a patch to extend it to
60 seconds?

I hope there are checks in the stack itself that would not allow the packet to be scheduled beyond the timeout window :)




Thanks,
Olek

Thanks,
Josh

[0] https://github.com/alobakin/linux/commit/aad547037598
[1] https://github.com/alobakin/linux/commit/50f4c27ba13e
[2] https://github.com/alobakin/linux/commit/4a9b6c5d0ee8

Thanks,
Olek

Thanks,
Josh

Reply via email to