https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
Mark Johnston changed:
What|Removed |Added
Resolution|--- |FIXED
Status|Open
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #26 from shail...@google.com ---
(In reply to Mark Johnston from comment #25)
It does. I found this bug only on top of the two prior changes:
https://reviews.freebsd.org/D46690
https://reviews.freebsd.org/D46691
and I figured
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #25 from Mark Johnston ---
(In reply to shailend from comment #24)
Does that diff depend on the other two gve diffs which had been posted
previously? That is, in what order should they be reviewed?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #24 from shail...@google.com ---
Fixed in https://reviews.freebsd.org/D47138
Thanks a lot for all the help @markj, @kib, and @gallatin!
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #23 from shail...@google.com ---
(In reply to Mark Johnston from comment #22)
Yup, gve_xmit_br enqueueing itself is the problem. Since the cleanup task
gve_tx_cleanup_tq already runs off of interrupts, I am thinking of fixing thi
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #22 from Mark Johnston ---
To be clear, the problem is that gve_xmit_br() requeues itself when gve_xmit()
is full (i.e., returns ENOBUFS)? Shouldn't it be queuing a cleanup task?
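
To make the question above concrete, here is a minimal userspace sketch of the two patterns. It is a toy model, not the gve driver: every name in it (toy_ring, toy_xmit, toy_xmit_task_requeue_self, toy_xmit_task_park, toy_cleanup) is hypothetical, and it does not reproduce the actual fix, which lives in the review linked from this bug. It only illustrates why a transmit task that re-enqueues itself on ENOBUFS can spin without progress when nothing else gets to free ring slots, and how handing the retry to the cleanup path avoids that.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: all names are hypothetical, none come from gve. */
struct toy_ring {
	int capacity;   /* descriptor slots in the ring */
	int in_flight;  /* slots posted and awaiting completion */
	int pending;    /* packets queued by the stack, not yet posted */
	bool stalled;   /* transmit task is parked until slots are freed */
};

/* Post one pending packet; returns ENOBUFS when the ring is full. */
int
toy_xmit(struct toy_ring *r)
{
	if (r->in_flight == r->capacity)
		return (ENOBUFS);
	r->in_flight++;
	r->pending--;
	return (0);
}

/*
 * Pattern A: on ENOBUFS the task immediately runs again.  If the
 * cleanup path never gets to run, every further invocation fails the
 * same way.  Returns how many times the task ran, capped at max_runs.
 */
int
toy_xmit_task_requeue_self(struct toy_ring *r, int max_runs)
{
	int runs = 0;

	while (r->pending > 0 && runs < max_runs) {
		runs++;
		while (r->pending > 0 && toy_xmit(r) == 0)
			;
		/* Ring full: the task "re-enqueues itself" and loops. */
	}
	return (runs);
}

/* Pattern B: on ENOBUFS, mark the queue stalled and stop running. */
void
toy_xmit_task_park(struct toy_ring *r)
{
	while (r->pending > 0) {
		if (toy_xmit(r) == ENOBUFS) {
			r->stalled = true;
			return;
		}
	}
}

/*
 * Cleanup path: retire completed slots.  Returns true when a parked
 * transmit task should be scheduled again.
 */
bool
toy_cleanup(struct toy_ring *r, int completed)
{
	assert(completed >= 0 && completed <= r->in_flight);
	r->in_flight -= completed;
	if (r->stalled && r->in_flight < r->capacity) {
		r->stalled = false;
		return (true);
	}
	return (false);
}
```

With a two-slot ring and five pending packets, pattern A uses up its whole run budget while three packets stay pending. Pattern B stops after the first ENOBUFS and resumes only once toy_cleanup() reports that slots were freed.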
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #21 from Konstantin Belousov ---
(In reply to shailend from comment #20)
Then this especially looks like a live-lock.
User thread should not have the priority 4, it is in the range of priorities of
the interrupt threads. S
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #20 from shail...@google.com ---
(In reply to Konstantin Belousov from comment #19)
Thanks for the explanation. The iperf thread owning the lock and the driver
thread looping on the cpu both have priority 4. The driver thread wa
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #19 from Konstantin Belousov ---
(In reply to shailend from comment #18)
Locks (except spinlocks) do not have any magic properties WRT disabling
scheduling. So it is absolutely fine for a thread owning a lock to be
put off CPU
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #18 from shail...@google.com ---
(In reply to Konstantin Belousov from comment #14)
Although I do not have access to the VMs to do `show pcpu`, I checked my notes
to find this `ps` entry:
100438 Run CPU 11
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #17 from Konstantin Belousov ---
(In reply to Mark Johnston from comment #16)
I doubt that the system would stay silent about a CPU with disabled interrupts;
our IPI code does not tolerate such a condition.
In fact, I asked about pcp
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #16 from Mark Johnston ---
This smells a bit like a thread disabled interrupts and then went off-CPU
somehow. The iperf thread is stuck in the runqueue of a CPU and nothing gets
scheduled there, so it doesn't run.
If this is n
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #15 from shail...@google.com ---
(In reply to Konstantin Belousov from comment #14)
Unfortunately I have lost access to the VMs in this repro and will need to make
a fresh repro. I'll post the "show pcpu" for the new repro, hopef
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #14 from Konstantin Belousov ---
(In reply to shailend from comment #13)
What does 'show pcpu 11' show?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #13 from shail...@google.com ---
(In reply to Andrew Gallatin from comment #12)
Hmmm interesting. In this case though, I'm sure nothing is traversing the
networking stack, and no cpu is being consumed. The offending thread seems
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #12 from Andrew Gallatin ---
Comment on attachment 253834
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=253834
procstat_kka
Are we absolutely certain that this is a deadlock and not a livelock? If you
look at netwo
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #11 from shail...@google.com ---
Just a more succinct view of iperf thread 100719's central role in this
deadlock:
```
db> show lockchain 100413
thread 100413 (pid 0, gve0 rxq 0) is blocked on lock 0xfe00df57a3d0 (sleep
mute
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #10 from shail...@google.com ---
Superficially it looks like the iperf thread 100719 was interrupted by an
ipi while it held the uma zone lock. It is the only iperf thread in the "run"
state, the rest are in "stop".
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #9 from shail...@google.com ---
(In reply to Mark Johnston from comment #7)
Also the trace for the uma zone lock holding iperf thread:
```
db> trace 100719
Tracing pid 857 tid 100719 td 0xf800b87ca000
sched_switch() at sche
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #8 from shail...@google.com ---
Created attachment 253834
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=253834&action=edit
procstat_kka
This is the output of procstat -kka, after the onset of a deadlock, with a
singl
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #7 from Mark Johnston ---
(In reply to shailend from comment #6)
The memory utilization is low, so this is not a low memory deadlock.
We have an iperf thread which is holding a UMA zone lock and an inpcb lock, and
it looks like
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #6 from shail...@google.com ---
I reproduced this with invariants+witness+ddb. It takes much longer to hit the
deadlock because invariants and witness lower the throughput. Backtraces of
locked driver threads:
```
[root@Fr
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
Andrew Gallatin changed:
What|Removed |Added
CC||galla...@freebsd.org
--- Comment
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
Mark Johnston changed:
What|Removed |Added
CC||ma...@freebsd.org
Stat
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #3 from Konstantin Belousov ---
If you have WITNESS configured, then you can get an overview of the locks
ownership on the system, using the 'show alllocks' ddb command. This should
allow you to see lock owners, including the s
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
--- Comment #2 from shail...@google.com ---
Actually I did run it with INVARIANTS and WITNESS and other options listed on
https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-deadlocks
and the deadlock reproduces wi
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
Konstantin Belousov changed:
What|Removed |Added
CC||k...@freebsd.org
--- Comment
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560
Mark Linimon changed:
What|Removed |Added
Keywords||vendor
Assignee|b...@free