Hi,
Thank you for your comment.
> this looks a lot better than the previous patch!! However, we already have a
> state marker for _down_ that we should probably reuse. Can you try the
> attached patch and see if it works for you? It's basically your patch without
> the added remove flag.
Kenzo Iwami wrote:
Hi,
I created a patch that uses watchdog_task but fixes the race condition
that occurred in the old e1000 driver.
I've obtained information about the panic caused by the old e1000 driver
using e1000_watchdog_task. According to the crash dump, the panic was
caused by a timer_list whose contents we [...]
Hi,
> My patch may seem like a huge change, but in essence the change is
> pretty simple.
>
> In my patch, the interrupt handler code will check whether the interrupted
> code is holding the swfw semaphore. If it is held, the watchdog function
> is deferred until the swfw semaphore is released.
> The [...]
Hi,
Thank you for your comment.
> thanks for staying patient while most of us were out or busy. Apart from
> acknowledging that you might have fixed a problem with your patch, we're very
> reluctant to merge such a huge change in our driver that touches many more
> cases than the one that [...]
Kenzo Iwami wrote:
With this patch applied, I confirmed that the system doesn't panic.
I think this patch can fix this problem.
Does this patch have problems?
Kenzo,
thanks for staying patient while most of us were out or busy. Apart from
acknowledging that you might have fixed a problem with your patch [...]
Hi,
During the holiday season, I posted a patch that fixed this problem without
using spinlocks or disabling interrupts.
http://marc.theaimsgroup.com/?l=linux-netdev&m=116649413613845&w=2
With this patch applied, I confirmed that the system doesn't panic.
I think this patch can fix this problem.
Hi,
Previously, I posted a patch that fixed this problem without using spinlocks
or disabling interrupts.
I have rebased this patch for 2.6.20-rc1.
Does this patch have problems?
I welcome any comments.
--
Kenzo Iwami ([EMAIL PROTECTED])
Signed-off-by: Kenzo Iwami <[EMAIL PROTECTED]>
diff
Hi,
> There are several issues that are conflicting and mixing that make it less
> than intuitive to decide what the better fix is.
>
> Most of all, we discussed that adding a spinlock is not going to fix the
> underlying problem of contention, as the code that would need to be
> spinlocked [...]
Kenzo Iwami wrote:
Hi,
Doesn't this just mean that we need a spinlock or some other kind of
semaphore around acquiring, using, and releasing this resource? We keep
going around and around about this but I'm pretty sure spinlocks are
meant to be able to solve exactly this issue.
The problem is going [...]
Hi,
>> Doesn't this just mean that we need a spinlock or some other kind of
>> semaphore around acquiring, using, and releasing this resource? We keep
>> going around and around about this but I'm pretty sure spinlocks are
>> meant to be able to solve exactly this issue.
>>
>> The problem is going [...]
Kenzo Iwami wrote:
> ethtool processing holding semaphore
> INTERRUPT
> e1000_watchdog waits for semaphore to be released
>
> The semaphore e1000_watchdog is waiting for can only be released when
> ethtool resumes from interrupt after e1000_watchdog finishes
> (basically a deadlock)
>
Hi,
Thank you for your comment.
>>> I think this problem occurs because the interrupt handler is executed on
>>> the same CPU as the process that acquired the semaphore.
>>> How about disabling interrupts while the process is holding the semaphore?
>>> I think this is possible, if the total lock time has been reduced [...]
Kenzo Iwami wrote:
Hi,
Even if the total lock time can be reduced, it's possible that the interrupt
handler is executed while the interrupted code is still holding the
semaphore.
I think your method only decreases the frequency of this problem.
Why does reducing the lock time solve this problem?
Hi,
>>> Even if the total lock time can be reduced, it's possible that the
>>> interrupt handler is executed while the interrupted code is still holding
>>> the semaphore.
>>> I think your method only decreases the frequency of this problem.
>>> Why does reducing the lock time solve this problem?
>> [...]
On Mon, Oct 30, 2006 at 09:30:24AM -0800, Auke Kok wrote:
> >Even if the total lock time can be reduced, it's possible that the interrupt
> >handler is executed while the interrupted code is still holding the
> >semaphore.
> >I think your method only decreases the frequency of this problem.
> >Why does reducing the lock time solve this problem? [...]
Kenzo Iwami wrote:
Hi,
Thank you for your comment.
Anyway as I said in the same e-mail, we're working on reducing the lock timeout to a
reasonable time. This will unfortunately take some time, as we need to change some major
components in the driver to make sure this doesn't happen.
How about [...]
Hi,
Thank you for your comment.
> Anyway as I said in the same e-mail, we're working on reducing the lock
> timeout to a reasonable time. This will unfortunately take some time, as we
> need to change some major components in the driver to make sure this doesn't
> happen.
Hi,
Thank you for your comment.
>>> Anyway as I said in the same e-mail, we're working on reducing the lock
>>> timeout to a reasonable time. This will unfortunately take some time, as we
>>> need to change some major components in the driver to make sure this
>>> doesn't happen.
>>
>> How about [...]
Kenzo Iwami wrote:
Hi,
This problem originally occurred in a very large cluster system using snmp
for server management. About two servers panicked each day. The program I sent
is to reproduce this problem in a very short time. It does occur under normal
load when there are a lot of servers.
hmm, [...]
Hi,
>> This problem originally occurred in a very large cluster system using snmp
>> for server management. About two servers panicked each day. The program I
>> sent is to reproduce this problem in a very short time. It does occur under
>> normal load when there are a lot of servers.
>
> hmm,
Kenzo Iwami wrote:
Hi,
Thank you for your comment.
This panic report falls in the category "how hard can I break my system as root".
Explicitly abusing the system performing restricted calls depletes resources and
harasses the sw lock (in this case). The reason that the driver attempts to wait [...]
Hi,
Thank you for your comment.
> This panic report falls in the category "how hard can I break my system as
> root". Explicitly abusing the system performing restricted calls depletes
> resources and harasses the sw lock (in this case). The reason that the driver
> attempts to wait that [...]
Kenzo Iwami wrote:
Hi,
Thank you for your comment.
A watchdog timeout panic occurred in the e1000 driver (7.2.9-NAPI).
where's the panic message?
attached the panic message (e1000_panic).
[...]
This problem only occurs on a server using the ethernet controller inside the
631xESB/632xESB, and the NMI watchdog [...]
Kenzo Iwami wrote:
A watchdog timeout panic occurred in e1000 driver (7.2.9-NAPI).
where's the panic message ?
Please CC the maintainers of the driver at all times. Our e-mail addresses are widely
visible everywhere.
If e1000_watchdog is called when processing an ioctl from ethtool, the system [...]