On Friday 07 December 2012 15:51:02 Vitalii Demianets wrote: > On Friday 07 December 2012 11:41:45 Vitalii Demianets wrote: > > On Friday 07 December 2012 00:15:59 Hans J. Koch wrote: > > > On Thu, Dec 06, 2012 at 11:11:38AM +0200, Vitalii Demianets wrote: > > > > On Thursday 06 December 2012 04:41:01 Hans J. Koch wrote: > > > > > > @@ -63,7 +68,7 @@ static irqreturn_t uio_pdrv_genirq_handler(int > > > > > > irq, > > > > > > > > struct uio_info *dev_info) > > > > > > > > > > * remember the state so we can allow user space to enable it > > > > > > later. */ > > > > > > > > > > > > - if (!test_and_set_bit(0, &priv->flags)) > > > > > > + if (!test_and_set_bit(UIO_IRQ_DISABLED, &priv->flags)) > > > > > > disable_irq_nosync(irq); > > > > > > > > > > That is not safe. That flag can only be changed by userspace, and > > > > > if the userspace part is not running, the CPU on which the irq is > > > > > pending will hang. > > > > > The handler has to disable the interrupt in any case. > > > > > (The original version had the same problem, I know...) > > > > > > > > Perhaps I'm missing something here, but I don't see your point. If > > > > userspace is not (implicitly) calling uio_pdrv_genirq_irqcontrol() by > > > > writing to the device file, then on the first invocation of interrupt > > > > handler that IRQ is disabled by disable_irq_nosync() and that's the > > > > end of story - no more irqs, no problems. Until userspace writes > > > > non-zero to the device file, of course. > > > > > > If you run into the irq handler, the interrupt was obviously enabled. > > > Since irq sharing is not supported by this driver, you ALWAYS have to > > > disable it because it IS your interrupt. Storing the enabled/disabled > > > status in a variable is not needed at all here. Or am I missing > > > something? > > > > Good point. I agree that having UIO_IRQ_DISABLED flag is redundant. > > Inside the irq handler we know for sure that irq is enabled. In > > uio_pdrv_genirq_irqcontrol() this knowledge is not mandatory, we could > > enable/disable irq unconditionally. > > On second thought, we can't call enable_irq()/disable_irq() unconditionally > because of the potential disable counter (irq_desc->depth) disbalance. > That's why we need UIO_IRQ_DISABLED flag, and that's why we should check it > in uio_pdrv_genirq_irqcontrol(). > On the other hand, you are right in that we don't need to check it inside > irq handler. Inside irq handler we can disable irq and set the flag > unconditionally, because: > a) We know for sure that irqs are enabled, because we are inside > (not-shared) irq handler; > and > b) We are guarded from potential race conditions by spin_lock_irqsave() > call in uio_pdrv_genirq_irqcontrol(). > > So,yes, we can get rid of costly atomic call to > test_and_set_bit(UIO_IRQ_DISABLED,..) inside irq handler. But I still don't > like the idea of mixing this optimization with bug fixes in a single patch.
On the third thought, we can't ;) Imagine the SMP system where uio_pdrv_genirq_irqcontrol() is being executed on CPU0 and irq handler is running concurrently on CPU1. To protect from disable_irq counter disbalance we must first check current irq status, and in atomic manner. Thus we prevent double-disable, one from uio_pdrv_genirq_irqcontrol() running on CPU0 and another form irq handler running on CPU1. Above consideration justifies current code. But it seems we have potential concurrency problem here anyway. Here is theoretical scenario: 1) Initial state: irq is enabled, uio_pdrv_genirq_irqcontrol() starts being executed on CPU0 with irq_on=1 and at the same time, concurrently, irq handler starts being executed on CPU1; 2) irq handler executes line if (!test_and_set_bit(UIO_IRQ_DISABLED, &priv->flags)) as irq was enabled, the condition holds. And now UIO_IRQ_DISABLED is set. 3) uio_pdrv_genirq_irqcontrol() executes line if (test_and_clear_bit(UIO_IRQ_DISABLED, &priv->flags)) as UIO_IRQ_DISABLED was set by CPU1, the condition holds. And now UIO_IRQ_DISABLED is cleared. 4) CPU0 executes enable_irq() . IRQ is enabled. "Unbalanced enable for IRQ %d\n" warning is issued. 5) irq handler on CPU1 executes disable_irq_nosync(). IRQ is disabled. And now we are in situation where IRQ is disabled, but UIO_IRQ_DISABLED is cleared. Bad. The above scenario is purely theoretical, I have no means to check it in hardware. The (theoretical) solution is to guard UIO_IRQ_DISABLED test and actual irq disabling inside irq handler by priv->lock. What do you think about it? Is it worth worrying about? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/