On Thu, Jul 14, 2022 at 02:44:16PM +0200, Alexander Bluhm wrote:
> On Thu, Jul 14, 2022 at 06:16:14AM +0300, Vitaliy Makkoveev wrote:
> > > On 14 Jul 2022, at 05:51, Visa Hankala <[email protected]> wrote:
> > > 
> > > On Thu, Jul 14, 2022 at 04:39:33AM +0300, Vitaliy Makkoveev wrote:
> > >> It looks like sometimes the `unp_conn' doesn't reloaded in
> > >> the "unp->unp_conn != unp2" check, and READ_ONCE() should prevent this.
> > > 
> > > Are you sure about the non-reloading of unp->unp_conn? I did a quick
> > > look in the compiled output on amd64, mips64 and riscv64, and did not
> > > spot any obvious change (there is no change on mips64 and riscv64).
> > > 
> > 
> > It seems to be. Otherwise I have no explanation of syzkaller reports.
> > 
> > May be it is not compiler, but CPU cache specific. I see the
> > difference on amd64, compiler loads to register before compare.
> > 
> > mov    0x20(%r13),%rax
> > cmp    %r12,%rax
> > je     0xffff...
> > 
> > instead of
> > 
> > cmp    %r12,0x20(%r13)
> > je     0xffff...
> > 
> > I can???t say what is the real difference in cmp behaviour in both
> > cases. May be some one could explain.
> 
> For me this chunk of assembly looks equivalent.  If this is the
> only binary change, I don't expect READ_ONCE() to fix syzkaller.
> I am not against adding READ_ONCE(), it prevents unexpected compiler
> optimisations.  But currently such a optimization does not happen.
> 
> > > The loads are surrounded by lock/reference operations that imply
> > > memory clobbering. The compiler should not assume that unp->unp_conn
> > > is unchanged.
> 
> Maybe I did not understand correctly why READ_ONCE() is needed.
> 
> > >>        if ((unp2 = unp->unp_conn) == NULL)
> > >>                return NULL;
> > >> 
> > >>        so2 = unp2->unp_socket;
> 
> Can the compiler modify this code to this?
> 
>           if ((unp2 = unp->unp_conn) == NULL)
>                   return NULL;
>           so2 = unp->unp_conn->unp_socket;
> 
> Then we can get a NULL deref if another CPU sets unp->unp_conn to
> NULL between our two reads.  There READ_ONCE() makes sense.
>

Another CPU can't modify unp->unp_conn because unp is locked. We need to
lock both sockets to modify unp->unp_conn or unp->unp_refs. I added this
to be sure that `unp2' is reassigned when we returned here from
"if (unp->unp_conn != unp2)" block.

> > >>                solock(so);
> > >> 
> > >>                /* Datagram socket could be reconnected due to re-lock. */
> > >>                if (unp->unp_conn != unp2) {
> > >>                        sounlock(so2);
> > >>                        unp_rele(unp2);
> > >>                        goto again;
> > >>                }
> > >> 
> > >>                unp_rele(unp2);
> 
> Can the compiler assume unp->unp_conn == unp2 ?  Then it would be
> valid to do unp_rele(unp->unp_conn).  Not sure whether C standard
> allows this.  And it would be stupid if the compiler would do that.
> 
> Sould we use more READ_ONCE() to be safe?
> 

The compiler could decide that `unp2' can't be changed and skip this
block. But it shouldn't because we have rwlock and atomic operations
between.

Also may be the problem lays in other layer, but triggered here...

Reply via email to