rrs added a comment.
Hmm thinking about your comment jhb, we could easily add the
callout_drain_async to the
current callout code. If you think its worth while maybe we should add that to
D1711
Jhb, if you think its worth doing add that to D1711 and I will work on it ;-)
REVISION DETAIL
http
adrian added a comment.
.. except he said callout_drain(). What happens if that's put in as part of the
teardown process?
REVISION DETAIL
https://reviews.freebsd.org/D1777
To: rrs, imp, sbruno, gnn, rwatson, lstewart, kostikbel, adrian, bz, jhb
Cc: bz, emaste, hiren, julian, hselasky, freebsd
rrs added a comment.
JHB:
The scenario you outline is *exactly* the panic that was seen by sbruno. I
guess my description
was unclear.
The existing code in that other thread right now does a callout_stop and
tests the return code. If its one its one (which says I canceled a callout)
then it
l
On Thu, 5 Feb 2015 02:14:41 +0300, Lev Serebryakov wrote:
> On 05.02.2015 01:16, Lev Serebryakov wrote:
>
> > I have such rules in my firewall:
> >
> > nat 9 config redirect_port tcp 192.168.134.2:16881 16881
> > redirect_port udp 192.158.134.2:16881 16881 redirect_port tcp
> > 192.168.134
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512
On 02/02/15 09:35, Pieper, Jeffrey E wrote:
> In the past we have, yes. This was a few years ago, but iirc the
> current implementation is supposed to be the official solution.
>
> Jeff
Jeff:
Thanks for you feedback.
Jack:
I've tried compiling
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512
On 05.02.2015 01:16, Lev Serebryakov wrote:
> I have such rules in my firewall:
>
> nat 9 config redirect_port tcp 192.168.134.2:16881 16881
> redirect_port udp 192.158.134.2:16881 16881 redirect_port tcp
> 192.168.134.2:22 2
>
> nat 1 config
jhb added a comment.
This is just "How It Works". You are always supposed to do a callout_drain()
before freeing the storage belonging to a callout. I don't understand how you
are preventing the callout/lock being freed out from under the callout routine
in this version either. Now you can h
rrs added a comment.
I don't think this is a refcnt issue bz, the base of this is a hole in the way
the callout code works. Basically there is a window when
a) The callout_wheel is executing, it sees that a "lock" has been configured,
so it goes to
release the callout wheel lock and then lo
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512
On 05.02.2015 01:16, Lev Serebryakov wrote:
> nat 9 config redirect_port tcp 192.168.134.2:16881 16881
> redirect_port udp 192.158.134.2:16881 16881 redirect_port tcp
> 192.168.134.2:22 2
Also, if I add "log" to this config:
nat 9 config log r
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512
I have such rules in my firewall:
nat 9 config redirect_port tcp 192.168.134.2:16881 16881 redirect_port
udp 192.158.134.2:16881 16881 redirect_port tcp 192.168.134.2:22 2
nat 1 config ip $EXT_IP same_ports
...
// Packets from outer world
110
bz added a subscriber: bz.
bz added a reviewer: bz.
bz added a comment.
It smells like a lle_refcnt bug to me and I have send a private email to errs
to understand the problem better (and also if other optional kernel options
might have been used while this was experienced).
REVISION DETAIL
h
adrian added a comment.
I'm still a little iffy about this kind of fix. It looks more like a lifecycle
management thing that we should fix, rather than just moving to MPSAFE. But I
unfortunately don't have the time to go look at the lle code and figure out how
it should look. :(
REVISION DETAI
rrs created this revision.
rrs added reviewers: jhb, imp, sbruno, gnn, rwatson, lstewart, kostikbel,
adrian.
rrs added subscribers: hselasky, julian, hiren, emaste, freebsd-net.
REVISION SUMMARY
There is a hole in the timer code that when you give it
a lock, it will lock that lock and then ch
rrs added a comment.
I have created D1777 to address the nd6/arp crash separately. I am currently in
the midst
of testing these.
REVISION DETAIL
https://reviews.freebsd.org/D1711
To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel,
adrian added a comment.
I think we should talk to rwatson about his mbuf hijinx and see about extending
flags. This isn't the only option to add.
REVISION DETAIL
https://reviews.freebsd.org/D1761
To: hselasky, rrs, glebius, gnn, emaste, lstewart, rwatson, bz, imp, np, adrian
Cc: freebsd-net
_
rrs added a comment.
Imp:
Ok I have spent a bit of time puzzling this out. First I was mistaken, the
callouts being run
are either arptimer or nd6timer(function name not right). These are not using
giant but
the passed in lle structure rw_lock.
We need to adjust these so that they check they:
hselasky added a comment.
Adrian: We do have one bit left after M_FLOWID was removed. Should I use it for
this? And what would be an appropriate name. M_LENGTH_OK ?
REVISION DETAIL
https://reviews.freebsd.org/D1761
To: hselasky, rrs, glebius, gnn, emaste, lstewart, rwatson, bz, imp, np, adria
imp added a comment.
So why doesn't the lle* code set mpsafe to 1? After re-reading the man page
several times, I'm thinking that's the solution here. It already uses other
locks and reference counts to synchronize things, so why get Giant involved at
all?
REVISION DETAIL
https://reviews.fre
adrian requested changes to this revision.
adrian added a comment.
This revision now requires changes to proceed.
I still don't want to see the hash type stuff polluted with things that aren't
hashes, making it more difficult to implement RSS work distribution along with
other kinds of features.
imp added a comment.
For the lle* code, it looks like the reference count for the data structure
improperly doesn't cover the implicit use of the mutex by the callout system.
That seems to be the real bug here, no? Protecting a mutex with a reference
count without holding a reference to that mu
rrs added a comment.
Julian:
The simple fix is to change the callout_init_rw -> callout_init(c, 1);
That makes it so the callout_stop() in this instance would return 0, not 1.
Since the
callout could not be stopped.
This would then cause the reference count to drop to 1 (it was 2 the original
rrs updated this revision to Diff 3631.
rrs added a comment.
This revision now requires review to proceed.
This fixes the comment as imp suggested and the indent...
CHANGES SINCE LAST UPDATE
https://reviews.freebsd.org/D1711?vs=3603&id=3631
REVISION DETAIL
https://reviews.freebsd.org/D1711
hselasky added a comment.
julian: What do you mean by "wait a bit". Spinning or sleeping?
REVISION DETAIL
https://reviews.freebsd.org/D1711
To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
julian added a comment.
let me see if this is correct..
I'm assuming that the reference count can not change if you hold the lock?
if so then before releasing the lock, we can check if the count is 1.. if it is
then we are the only person who even knows where this thing is so there can't
be any
rrs added a comment.
Julian:
The point is *exactly* that, the callout *has* a reference.. and now that the
table is being
flushed if the callout_stop returns 1 it thinks the callout has been stopped,
which it has, which
means it will not run and release its reference. Thus the lowering of the
hselasky added a comment.
julian: Hence a lock is used, the callback won't be called when callout_stop()
returns 1. Only the mutex will still be used. Maybe a callout_reset() having 1
tick as timeout will work instead of callout_stop(). If the callout_reset()
returns 0 we add a ref instead of g
julian added a subscriber: julian.
julian added a comment.
>>! In D1711#66, @rrs wrote:
> 3) The callout_stop() on CPU 2 does what it is supposed to and sets the
> cc_cancel bit to true and
> return 1. This causes callout_stop() to lower the reference count which
> means when llentry_free(
hselasky edited reviewers, added: np; removed: rmacklem
hselasky removed a subscriber: np.
hselasky updated this revision to Diff 3630.
hselasky added a comment.
Address comments from imp and np:
- Update patch to use IP_MAXPACKET as default for LRO which preserves existing
behaviour.
- Add to n
rrs added a comment.
Ok guys, I have puzzled out what that crash *may* be that was posted by Hiren.
The same
issue exists in the timeout code rewrite that Han's has up on the board as
well. Though
the callout_drain_async() may solve it if the user called that instead. Here is
what is happening.
On 2/4/15 6:08 PM, bycn82 wrote:
/Cool,
But maybe not all people are following this topic, so can you please
simplify it by answering below question in order to allow more
people to know what is going on here.
/
/What kind of problem you are facing and how does your patch resolve it?
/
le
On 2/4/15 5:24 PM, Lev Serebryakov wrote:
--
Re-installation of state (with second, third, etc... packet of
connection) should update TCP state of state (sorry!), or it will die
in 10 seconds.
This version seems to be final (apart from name of new option!).
It works perfectly on my route
hiren added a comment.
>>! In D1711#59, @rrs wrote:
> Hiren:
>
> Ok looking at kern_timeout.c thats a call to
> class->lc_lock(c_lock, lock_status);
>
> If my 10.x matches yours.
>
> And the call inside that kern_rwlock.c:757
> is
>
> v = rw->rw_lock;
> owner = (struct thread *)RW_OWNER(v);
hiren added a comment.
>>! In D1711#60, @hiren wrote:
>>>! In D1711#58, @rrs wrote:
>
>> hiren:
>>
>> This looks interesting to me, it is definitely something I would like to
>> look at. I assume you
>> are on 10.stable like Sean?
>
> Yes, its plain stable10+D1711.
> Also, all 3 panics are fr
*Cool, But maybe not all people are following this topic, so can you please
simplify it by answering below question in order to allow more people to
know what is going on here.*
*What kind of problem you are facing and how does your patch resolve it?*
On 4 February 2015 at 17:24, Lev Serebryako
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512
On 03.02.2015 19:55, Lev Serebryakov wrote:
>> Ok, "allow-state"/"deny-state" was very limited idea. Here is
>> more universal mechanism: new "keep-state-only" (aliased as
>> "record-only") option, which works exactly as "keep-state" BUT
>> cancel
hiren added a comment.
>>! In D1711#61, @hselasky wrote:
> Hi,
>
> There is only one or two likely consumers of callout_init_rw() at the present
> moment, and one of them is:
>
> ./netinet6/nd6.c: canceled = callout_stop(&ln->ln_timer_ch);
> ./netinet6/nd6.c: can
hiren added a comment.
>>! In D1711#59, @rrs wrote:
> Hiren:
>
> Ok looking at kern_timeout.c thats a call to
> class->lc_lock(c_lock, lock_status);
>
> If my 10.x matches yours.
It's not :-(
Looks like what we have here is not stock stable10 really. I'll check all the
details and get back
37 matches
Mail list logo