Re: [PATCH v2 05/10] mailbox: add MediaTek GPUEB IPI mailbox

Chia-I Wu Mon, 15 Sep 2025 21:55:51 -0700
On Mon, Sep 15, 2025 at 6:34 AM Nicolas Frattaroli
<nicolas.frattar...@collabora.com> wrote:
>
> On Saturday, 13 September 2025 00:11:10 Central European Summer Time Chia-I 
> Wu wrote:
> > On Fri, Sep 12, 2025 at 11:38 AM Nicolas Frattaroli
> > <nicolas.frattar...@collabora.com> wrote:
> > <snipped>
> > > +static irqreturn_t mtk_gpueb_mbox_thread(int irq, void *data)
> > > +{
> > > +       struct mtk_gpueb_mbox_chan *ch = data;
> > > +       int status;
> > > +
> > > +       status = atomic_cmpxchg(&ch->rx_status,
> > > +                               MBOX_FULL | MBOX_CLOGGED, MBOX_FULL);
> > > +       if (status == (MBOX_FULL | MBOX_CLOGGED)) {
> > > +               mtk_gpueb_mbox_read_rx(ch);
> > > +               writel(BIT(ch->num), ch->ebm->mbox_ctl + 
> > > MBOX_CTL_IRQ_CLR);
> > > +               mbox_chan_received_data(&ch->ebm->mbox.chans[ch->num],
> > > +                                       ch->rx_buf);
> > Given what other drivers do, and how mtk_mfg consumes the data, we should
> >
> >   char buf[MAX_OF_RX_LEN]; //  MAX_OF_RX_LEN is 32; we can also
> > allocate it during probe
> >   mtk_gpueb_mbox_read_rx(ch);
> >   mbox_chan_received_data(..., buf);
> >
> > mtx_mfg makes a copy eventually anyway.
>
> We don't right now, at least not until after the callback returns.
> So we need to have the copy in the mtk_mfg callback, not after the
> completion. That's fine and I do want to do this as this is what
> the mailbox framework seems to expect clients to do.
>
> > We don't need to maintain any
> > extra copy.
> >
> > Then we might not need rx_status.
>
> We can probably get rid of it if we keep the per-channel
> interrupt handler. Otherwise, we may still need clogged,
> as we don't want to process interrupts on channels we have
> no user for.
>
> >
> > > +               atomic_set(&ch->rx_status, 0);
> > > +               return IRQ_HANDLED;
> > > +       }
> > > +
> > > +       return IRQ_NONE;
> > > +}
> > > +
> > > +static int mtk_gpueb_mbox_send_data(struct mbox_chan *chan, void *data)
> > > +{
> > > +       struct mtk_gpueb_mbox_chan *ch = chan->con_priv;
> > > +       int i;
> > > +       u32 *values = data;
> > > +
> > > +       if (atomic_read(&ch->rx_status))
> > > +               return -EBUSY;
> > > +
> > > +       /*
> > > +        * We don't want any fancy nonsense, just write the 32-bit values 
> > > in
> > > +        * order. memcpy_toio/__iowrite32_copy don't work here, because 
> > > fancy.
> > > +        */
> > > +       for (i = 0; i < ch->c->tx_len; i += 4)
> > > +               writel(values[i / 4], ch->ebm->mbox_mmio + 
> > > ch->c->tx_offset + i);
> > > +
> > > +       writel(BIT(ch->num), ch->ebm->mbox_ctl + MBOX_CTL_IRQ_SET);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static int mtk_gpueb_mbox_startup(struct mbox_chan *chan)
> > > +{
> > > +       struct mtk_gpueb_mbox_chan *ch = chan->con_priv;
> > > +       int ret;
> > > +
> > > +       atomic_set(&ch->rx_status, 0);
> > > +
> > > +       ret = clk_enable(ch->ebm->clk);
> > > +       if (ret) {
> > > +               dev_err(ch->ebm->dev, "Failed to enable EB clock: %pe\n",
> > > +                       ERR_PTR(ret));
> > > +               goto err_clog;
> > > +       }
> > > +
> > > +       writel(BIT(ch->num), ch->ebm->mbox_ctl + MBOX_CTL_IRQ_CLR);
> > > +
> > > +       ret = devm_request_threaded_irq(ch->ebm->dev, ch->ebm->irq, 
> > > mtk_gpueb_mbox_isr,
> > > +                                       mtk_gpueb_mbox_thread, 
> > > IRQF_SHARED | IRQF_ONESHOT,
> > > +                                       ch->full_name, ch);
> > I don't think this warrants a per-channel irq thread.
> >
> > mbox_chan_received_data is atomic. I think wecan start simple with
> > just a devm_request_irq for all channels. mtk_gpueb_mbox_isr can
> >
> >   read bits from MBOX_CTL_RX_STS
> >   for each bit set:
> >     read data from rx
> >     mbox_chan_received_data
> >   write bits to MBOX_CTL_IRQ_CLR
> >
>
> I don't like this approach. It brings us back to having to process
> multiple channels per ISR, keep track of when the interrupt should
> be enabled and disabled based on how many channels are in use, and
> also is not in line with what e.g. omap-mailbox.c does.
>
> Remember that `mbox_chan_received_data` synchronously calls the
> mailbox client's rx_callback. In mediatek_mfg's case, this is
> fairly small, though with the request to not make the rx buffer
> persist beyond the rx_callback it will gain an additional memory
> copy. But we can't guarantee that someone isn't going to put a
> slow operation in the path. Sure, it's going to be atomic, but
> waiting for a spinlock is atomic and not something an ISR would
> enjoy. I don't think mailbox clients would expect that if they
> take their time they'll stall the interrupt handler for every
> other channel.
>
> So we'd keep the interrupt disabled for all channels until the
> client that received a message has processed it.
>
> I can see myself getting rid of the handler and just having the
> thread function as the bottom half, but I'd really like to keep
> the one-IRQ-request-per-channel thing I've got going now as it
> made the code a lot easier to reason about. However, doing this
> would mean the interrupt is re-enabled after the generic upper
> half, when all the business logic that needs to not run
> concurrently for an individual channel is in the bottom half.
>
> As far as I can tell, this would then mean we'd have to add
> some concurrency exclusion mechanism to the bottom half.
>
> Moving all the logic into the upper half handler function
> would make that handler somewhat longer, and I don't know
> if IRQF_ONESHOT masks the interrupt for all users of that
> IRQ number or just for those with that dev_id. If it's per
> dev_id, then I'm fine with moving stuff up there. But from
> my reading of the core IRQ handling code, that does not
> appear to be the case; one channel getting a reply would
> mask *all* channels of the mailbox until the upper half is
> completed, and if the upper half calls into a driver
> callback synchronously, that may take a hot minute.
>
> Put differently: Is there a problem with one thread per used
> channel, or are we going off vibes here? The way it currently
> works uses the shared interrupt to mark just that one channel
> as busy with rx_status before letting the IRQ for all channels
> be unmasked again, which seems ideal to me.
No, one thread per used channel can work. I can't say I like it, but I
also don't know the hw as well as you do.
Re: [PATCH v2 05/10] mailbox: add MediaTek GPUEB IPI mailbox

Reply via email to