Sean,
Was this really "Reviewed by: sbruno@" or just "Tested by: sbruno@"? I
was very happy to see so many reviewers on the original commit but you
seem to be the only one still left on the list.
Regards,
Navdeep
On 01/20/15 15:35, Sean Bruno wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
On 01/20/15 14:30, Hans Petter Selasky wrote:
On 01/20/15 22:11, Gleb Smirnoff wrote:
On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov
wrote: K> > Like stated in the manual page,
callout_reset_curcpu/on() does not work K> > with MPSAFE callouts
any more! K> I.e. you 'fixed' some undeterminate bugs in callout
migration by not K> doing migration at all anymore. K> K> > K> >
You need to use callout_init_{mtx,rm,rw} and remove the custom
locking K> > inside the callback in the TCP stack to get it
working like before! K> K> No, you need to do this, if you think
that whole callout KPI must be K> rototiled. It is up to the
person who modifies the KPI, to ensure that K> existing code is
not broken. K> K> As I understand, currently we are back to the
one-cpu callouts. K> Do other people consider this situation
acceptable ?
I think this isn't acceptable. The commit to a complex subsystem
lacked a review from persons involved in the system before. The
commit to subsystem broke consumers of the subsystem and this was
even done not accidentially, but due to Hans not caring about
it.
As for me this is enough to request a backout, and let the
change back in only after proper review.
Hi Gleb,
Backing out my callout API patch means we will for sure
re-introduce an unknown callout spinlock hang, as noted to me by
several people. What do you think about that? dram Maybe "Jason
Wolfe" CC'ed can add to 10-stable w/o my patches:
Jason picked up this patch for work and it resolved our instability
issues that had remained unsolved for quite some time as reported to
freebsd-net:
https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html
This had gone undiagnosed for some time (even with the gracious help
of jhb in offline emails, thanks btw!).
There's some diagnostics in that email thread that may be of value to
you folks for determination of the validity of changing the callout
API or at least understanding why we were involved in diagnostics.
While I'd sure love to tune performance, the fact that our machines
were basically going out to lunch without these changes, probably
means that others were seeing it and didn't know what else to do. As
much as I enjoy a good "break out the pitch forks and torches" email
thread, this increased stability for us and is allowing us to upgrade
from freebsd8 to freebsd10. Bear this in mind when you throw your
voice in favor of reverting.
int callout_reset_sbt_on(struct callout *c, sbintime_t sbt,
sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int
flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int
cancelled, direct;
+ cpu = timeout_cpu; /* XXX test code XXX */
cancelled = 0;
Jason or I would have to run this in production, which would be
problematic I fear. We never had a deterministic test case that would
exhibit the reported failure. We merely "tested in production" and
saw that panics ceased. We didn't note a dropoff in our traffic
either, perhaps we are not as efficient as others in this corner case,
but we were consistently seeing the spinlock hangs after a day or so
of traffic.
And see if he observes a callout spinlock hang or not on his test
setup. The patch above should force all callouts to the same thread
basically. Then we could maybe see if single threading the callouts
has anything to do with solving the spinlock hang.
The "rewritten" callout API still has all the features and
capabilities the old one had, when used as described in "man 9
callout".
At the present moment I'm not technically convinced a backout is
correct.
Neither am I, to be honest. Just based on *results*.
Gleb: I think we would see far better results with high speed
internet links using TCP if we could extend the LRO (large receive
offload) code to accumulate more than 64KBytes worth of data per
call to the TCP stack instead of complaining about some callouts
ending up on the same thread! Actually I have a patch for that.
--HPS
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQF8BAEBCgBmBQJUvuYrXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kJTMIAMfh6ghV/AwQauY+a44q1hjJ
WC7E3u69FK0opgSYg71kk6HckbyB+sTWND6HdXnpyrcMLXUt74zlB8c48wbUUV5+
EwKNYzGNNnDNhoc0WtPMect8e9Y1kBRvSGfHBdVrqATXfPOyZEa+i4lQAXpiFKIt
nndqVrAH7bJM6143YDpnIg7vaR+8IQnC2ztSP4ogJzh03DZ7zVsg4BsoCPg50eVZ
kr46cXcE+SP/TsQBsVNVwRJD5NFie6QJdLoTnwkd0XfQGJMIWivNgUcE4tIxqPIf
nGQ0xMJCotpNLuPtzYzCIurSaDHdHmL6bjkhjTtBmWsMNfdGH8TKoih79GDxkTg=
=Y3rd
-----END PGP SIGNATURE-----
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"