On Dec 7, 2006, at 11:05, Jeff Garzik wrote:
Yes, I merged the code, but looking deeper at phy its clear I
missed some things.
Looking into libphy's workqueue stuff, it has the following sequence:
disable interrupts
schedule_work()
... time passes ...
... wo
On Thu, 7 Dec 2006, Andrew Morton wrote:
>
> We _can_ trust it in the context of
>
> void flush_work(struct work_struct *work)
Yes, but the way the bits are defined, the "pending" bit is not meaningful
as a synchronization event, for example - because _other_ users can't
trust it once
On Thu, 7 Dec 2006 10:01:56 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 7 Dec 2006, Andrew Morton wrote:
> >
> > umm.. Putting a work_struct* into struct cpu_workqueue_struct and then
> > doing appropriate things with cpu_workqueue_struct.lock might work.
>
> Yeah, tha
On Thu, 7 Dec 2006 09:57:15 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Thu, 07 Dec 2006 12:05:38 -0500
> Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
> > Yes, I merged the code, but looking deeper at phy its clear I missed
> > some things.
> >
> > Looking into libphy's workqueue stuff, it
On Thu, 7 Dec 2006, Jeff Garzik wrote:
> Looking into libphy's workqueue stuff, it has the following sequence:
>
> disable interrupts
> schedule_work()
>
> ... time passes ...
> ... workqueue routine is called ...
>
> enable interrupts
> handle interrupt
>
>
On Thu, 7 Dec 2006, Andrew Morton wrote:
>
> umm.. Putting a work_struct* into struct cpu_workqueue_struct and then
> doing appropriate things with cpu_workqueue_struct.lock might work.
Yeah, that looks sane. We can't hide anything in "struct work", because we
can't trust it any more once it'
On Thu, 07 Dec 2006 12:05:38 -0500
Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Yes, I merged the code, but looking deeper at phy its clear I missed
> some things.
>
> Looking into libphy's workqueue stuff, it has the following sequence:
>
> disable interrupts
> schedule_work()
>
>
On Thu, 7 Dec 2006 08:49:02 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Wed, 6 Dec 2006, Andrew Morton wrote:
> >
> > But this will return to the caller if the callback is presently running on
> > a different CPU. The whole point here is to be able to reliably kill off
> > th
Yes, I merged the code, but looking deeper at phy its clear I missed
some things.
Looking into libphy's workqueue stuff, it has the following sequence:
disable interrupts
schedule_work()
... time passes ...
... workqueue routine is called ...
enable int
On Wed, 6 Dec 2006, Andrew Morton wrote:
>
> But this will return to the caller if the callback is presently running on
> a different CPU. The whole point here is to be able to reliably kill off
> the pending work so that the caller can free resources.
I mentioned that in one of the emails.
We
On Wed, 6 Dec 2006, Linus Torvalds wrote:
> I didn't get any answers on this. I'd like to get this issue resolved, but
> since I don't even use libphy on my main machine, I need somebody else to
> test it for me.
I tested it in the evening with the system I implemented it for
originally. It
> On Thu, 07 Dec 2006 10:29:39 + David Howells <[EMAIL PROTECTED]> wrote:
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > I guess I don't understand exactly what problem the noautorel stuff is
> > trying to solve. It _seems_ to me that in all cases we can simply stuff
> > the old `data' fiel
Andrew Morton <[EMAIL PROTECTED]> wrote:
> I guess I don't understand exactly what problem the noautorel stuff is
> trying to solve. It _seems_ to me that in all cases we can simply stuff
> the old `data' field in alongside the controlling work_struct or
> delayed_work which wants to operate on i
On Wed, 6 Dec 2006 22:42:07 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:
> But I wouldn't want to think about an implementation as long as we have
> that WORK_STRUCT_NOAUTOREL horror in there. Can we just nuke that? Only
> three drivers need it and I bet they can be modified to use the usual
>
On Wed, 6 Dec 2006 17:21:50 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Wed, 6 Dec 2006, Linus Torvalds wrote:
> >
> > How about something like this?
>
> I didn't get any answers on this. I'd like to get this issue resolved, but
> since I don't even use libphy on my main ma
On Wed, 6 Dec 2006, Linus Torvalds wrote:
>
> How about something like this?
I didn't get any answers on this. I'd like to get this issue resolved, but
since I don't even use libphy on my main machine, I need somebody else to
test it for me.
Just to remind you all, here's the patch again. Th
On Wed, 6 Dec 2006, David Howells wrote:
>
> Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> > (a) "volatile" on kernel data is basically always a bug, and you should
> > use locking.
>
> But what about when you're building a lock? Actually, I suspect correct usage
> of asm constraints an
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> (a) "volatile" on kernel data is basically always a bug, and you should
> use locking.
But what about when you're building a lock? Actually, I suspect correct usage
of asm constraints and memory barriers trumps volatile anyway even there.
David
On Wed, 6 Dec 2006, Linus Torvalds wrote:
>
> and remove the "volatile" from all the bitop accessor functions.
It might also be interesting to see if this would change code-size at all.
There's a number of things that check different bits in the same word
right now, and they just reload the w
On Wed, 6 Dec 2006, Linus Torvalds wrote:
>
> Sadly, gcc doesn't do it in this case. Still, I'd rather keep the source
> clean, and hope that the compiler improves eventually, than to make the
> code uglier.
Actually, it's our own damn fault.
We've long had our arguments "const volatile" to
On Wed, 6 Dec 2006, Jeff Garzik wrote:
>
> Honestly I wonder if some of these situations really want
> "kill_scheduled_work_unless_it_is_already_running_right_now_if_so_wait_for_it"
We could do that, and the code to do it would be fairly close to what the
"run it" code is. Just replace the
On Wed, 6 Dec 2006, Linus Torvalds wrote:
>
> Gcc should do it for us, afaik. I didn't check, but gcc is generally
> pretty good at combining logical operations like this, because it's very
> common.
Sadly, gcc doesn't do it in this case. Still, I'd rather keep the source
clean, and hope tha
On Wed, 6 Dec 2006, David Howells wrote:
>
> + if (get_wq_data(work) == cwq
> + && test_bit(WORK_STRUCT_PENDING, &work->management)
>
> I wonder if those can be combined, perhaps:
Gcc should do it for us, afaik. I didn't check, but gcc is generally
pretty good at combining logical
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> test_bit() with a constant number is done very much in C, and very much on
> purpose. _Exactly_ to allow the compiler to combine these kinds of things.
Ah... I've read that several times, and each time I've assumed it's referring
to *addr not nr as be
Honestly I wonder if some of these situations really want
"kill_scheduled_work_unless_it_is_already_running_right_now_if_so_wait_for_it"
Since its during shutdown, usually the task just wants to know that the
code in the workqueue won't be touching the hardware or data structures
after point.
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> How about something like this?
At first glance, this looks reasonable.
It also looks like it should be used to replace a lot of the
cancel_delayed_work() calls that attempt to cancel _undelayed_ work items.
That would allow a number of work items to be
On Wed, 6 Dec 2006, Andrew Morton wrote:
>
> I think so too. But it would be imprudent to hang around waiting for me
> to write it :(
How about something like this?
This
(a) depends on the just-merged "struct work" cleanup
(b) is totally untested
(c) probably kills you slowly and painfully
On Wed, 6 Dec 2006 15:25:22 + (GMT)
"Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote:
> > Maybe the lesson here is that flush_scheduled_work() is a bad function.
> > It should really be flush_this_work(struct work_struct *w). That is in
> > fact what approximately 100% of the flush_scheduled_wor
On Tue, 5 Dec 2006, Andrew Morton wrote:
> But running flush_scheduled_work() from within dev_close() is a very
> sensible thing to do, and dev_close is called under rtnl_lock().
> davem is -> thattaway ;)
And when within dev_close() there is quite a chance there is
linkwatch_event() somewhere
On Tue, 5 Dec 2006, Andy Fleming wrote:
> We need to make sure there are no more pending phy_change() invocations in the
> work queue, this is true, however I'm not convinced that this avoids the
> problem. And now that I come back to this email after Linus's response, let
> me add that I agree w
> Ah. The point is that the phy code doesn't want to flush _all_ pending
> callbacks. It only wants to flush its own one. And its own one doesn't
> take rtnl_lock().
OK, got it. You're absolutely correct.
> Maybe the lesson here is that flush_scheduled_work() is a bad function.
> It shou
> But running flush_scheduled_work() from within dev_close() is a very
> sensible thing to do, and dev_close is called under rtnl_lock().
I can't argue with that -- this has actually bitten me in the past.
Hmm, I'll try to understand why we need rtnl_lock() to cover dev_close...
- R.
-
To uns
On Tue, 05 Dec 2006 13:37:37 -0800
Roland Dreier <[EMAIL PROTECTED]> wrote:
> > a) Ban the calling of flush_scheduled_work() from under rtnl_lock().
> >Sounds hard.
>
> Unfortunate if this is happening a lot. It seems like the most
> sensible fix -- flush_scheduled_work() is in effect cal
> a) Ban the calling of flush_scheduled_work() from under rtnl_lock().
>Sounds hard.
Unfortunate if this is happening a lot. It seems like the most
sensible fix -- flush_scheduled_work() is in effect calling into
an unknown and changeable in the future set of functions (since it
waits for
On Tue, 5 Dec 2006 14:59:31 -0600
Andy Fleming <[EMAIL PROTECTED]> wrote:
> Ok, I think this is the summary:
>
> - phy_change() is the work queue callback function (scheduled when a
> PHY interrupt occurs)
>
> - dev_close() invokes the controller's stop/close/whatever function,
> and it call
On Dec 5, 2006, at 14:39, Andrew Morton wrote:
On Tue, 5 Dec 2006 17:48:05 + (GMT)
"Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote:
Essentially there is a race when disconnecting from a PHY, because
interrupt delivery uses the event queue for processing. The
function to
handle interrupt
On Tue, 5 Dec 2006 17:48:05 + (GMT)
"Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote:
> Essentially there is a race when disconnecting from a PHY, because
> interrupt delivery uses the event queue for processing. The function to
> handle interrupts that is called from the event queue is phy_c
On Tue, 5 Dec 2006 10:07:21 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Tue, 5 Dec 2006, Maciej W. Rozycki wrote:
> >
> > One way of avoiding it is calling flush_scheduled_work() from
> > phy_stop_interrupts(). This is fine as long as a caller of
> > phy_stop_interrupts()
On Dec 5, 2006, at 11:48, Maciej W. Rozycki wrote:
Essentially there is a race when disconnecting from a PHY, because
interrupt delivery uses the event queue for processing. The
function to
handle interrupts that is called from the event queue is phy_change().
It takes a pointer to a stru
On Mon, 4 Dec 2006, Steve Fox wrote:
> Unfortunately Maciej didn't get cc'ed correctly on your mail. Hopefully
> I've added the right person to this post as well as Andy who has also
> recently changed this code.
Thanks -- my parser of the LKML does not always trigger.
Maciej
-
To unsubscribe
On Tue, 5 Dec 2006, Maciej W. Rozycki wrote:
>
> One way of avoiding it is calling flush_scheduled_work() from
> phy_stop_interrupts(). This is fine as long as a caller of
> phy_stop_interrupts() (not necessarily the immediate one calling into
> libphy) does not hold the netlink lock.
>
>
On Sun, 3 Dec 2006, Andrew Morton wrote:
> wtf? That code was merged? This bug has been known for months and after
> several unsuccessful attempts at trying to understand what on earth that
> hackery is supposed to be doing I gave up on it and asked Jeff to look after
> it.
I am surprised it g
On Sun, 2006-12-03 at 01:16 -0800, Andrew Morton wrote:
> On Sun, 03 Dec 2006 00:50:55 -0500
> Ben Collins <[EMAIL PROTECTED]> wrote:
>
> > Fixes this problem when libphy is compiled as module:
> >
> > WARNING: "current_is_keventd" [drivers/net/phy/libphy.ko] undefined!
> >
> > diff --git a/kernel
On Sun, 03 Dec 2006 00:50:55 -0500
Ben Collins <[EMAIL PROTECTED]> wrote:
> Fixes this problem when libphy is compiled as module:
>
> WARNING: "current_is_keventd" [drivers/net/phy/libphy.ko] undefined!
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 17c2f03..1cf226b 100644
> --
Fixes this problem when libphy is compiled as module:
WARNING: "current_is_keventd" [drivers/net/phy/libphy.ko] undefined!
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 17c2f03..1cf226b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -608,6 +608,7 @@ int current_is_kevent
45 matches
Mail list logo