----- Original Message -----
> From: "Michael Cree" <mc...@orcon.net.nz>
> To: "Mathieu Desnoyers" <mathieu.desnoy...@efficios.com>
> Cc: "Greg KH" <gre...@linuxfoundation.org>, linux-al...@vger.kernel.org,
> "Richard Henderson" <r...@twiddle.net>, "Ivan
> Kokshaysky" <i...@jurassic.park.msu.ru>, "Matt Turner" <matts...@gmail.com>,
> "Huang Ying" <ying.hu...@intel.com>,
> linux-kernel@vger.kernel.org, "Paul McKenney" <paul...@linux.vnet.ibm.com>,
> "David Howells" <dhowe...@redhat.com>,
> "Pranith Kumar" <bobby.pr...@gmail.com>, sta...@vger.kernel.org
> Sent: Saturday, February 7, 2015 7:47:29 PM
> Subject: Re: [PATCH] llist: Fix missing lockless_dereference()
>
> On Sat, Feb 07, 2015 at 10:30:44PM +0000, Mathieu Desnoyers wrote:
> > > On Fri, Feb 06, 2015 at 09:08:21PM -0500, Mathieu Desnoyers wrote:
> > > > A lockless_dereference() appears to be missing in llist_del_first().
> > > > It should only matter for Alpha in practice.
> >
> What could one anticipate to be the symptoms of such a missing
> lockless_dereference()?

This can trigger corruption of the lockless linked-list, which is used
across a few subsystems. AFAIU, the scenario is as follows. Please bear
with me, because it's been a while since I've read up on the Alpha
multi-cache-bank behavior.

The list here would be initially non-empty. Initial state of
new_last->next is unset (newly allocated); IOW: garbage.

CPU A adds a node into the list while CPU B removes a node from the
head of the list.

CPU A                                 CPU B

llist_add_batch()
- Stores to new_last->next
- implicit full mb before cmpxchg
  makes the update to CPU A's cache
  bank containing new_last->next
  visible to other CPUs before CPU
  A's cache bank update making
  head->first visible to other CPUs.
- cmpxchg updates
  head->first = new_first
                                      llist_del_first()
                                      - entry = load head->first
                                      -> here, lack of barrier on Alpha
                                         creates a window where CPU B's
                                         cache bank can see the updated
                                         "head->first", but the cache
                                         bank holding the next value did
                                         not receive the update yet,
                                         since each cache bank has its
                                         own channel, which can be
                                         independently saturated.
                                      - next = load entry->next
                                        (dereference entry pointer)
                                      - cmpxchg updates
                                        head->first = next
                                      -> can store unset "next" value
                                         into head->first, thus
                                         corrupting the linked list.

>
> The Alpha kernel is behaving pretty well provided one builds a machine
> specific kernel and UP. When running an SMP kernel some packages
> (most notably the java runtime, but there are a few others) occasionally
> lock up in a pthread call --- could be a problem in libc rather then the
> kernel.

Are those lockups always occasional, or do you have ways to reproduce
them frequently with stress tests?

Thanks,

Mathieu

>
> > > Meta-comment, do we really care about Alpha anymore? Is it still
> > > consered an "active" arch we support?
>
> There are a few of us still running recent kernels on Alpha. I am
> maintaining the unofficial Debian alpha port at debian-ports, and the
> Debian popcon shows about 10 installations of Debian Alpha.
>
> Cheers
> Michael.
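
For illustration only, the CPU A / CPU B scenario described earlier in this
message can be sketched in user-space C11. This is not the kernel code: the
struct and function names merely mirror the kernel's llist API, the
sequentially consistent compare-and-swap is assumed to play the role of
cmpxchg with its implied full barrier, and memory_order_consume is used as an
assumed stand-in for lockless_dereference(). Current compilers generally
implement consume as acquire, which is stronger than strictly needed but
still orders the load of entry->next after the load of head->first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel's llist_node / llist_head. */
struct llist_node {
    struct llist_node *next;
};

struct llist_head {
    _Atomic(struct llist_node *) first;
};

/*
 * CPU A side: push the chain new_first..new_last onto the list.
 * Returns true if the list was empty before the push.
 */
static bool llist_add_batch(struct llist_node *new_first,
                            struct llist_node *new_last,
                            struct llist_head *head)
{
    struct llist_node *first;

    assert(new_first != NULL && new_last != NULL);

    first = atomic_load_explicit(&head->first, memory_order_relaxed);
    do {
        /* Plain store; it must become visible before head->first does. */
        new_last->next = first;
        /* seq_cst CAS: assumed analogue of cmpxchg plus full barrier. */
    } while (!atomic_compare_exchange_weak(&head->first, &first, new_first));

    return first == NULL;
}

/* CPU B side: pop the first node, or return NULL if the list is empty. */
static struct llist_node *llist_del_first(struct llist_head *head)
{
    struct llist_node *entry, *next;

    /*
     * Dependency-ordered load: the later read of entry->next may not
     * observe a value older than the publication of entry. A relaxed
     * load here would reopen the window described for Alpha.
     */
    entry = atomic_load_explicit(&head->first, memory_order_consume);
    while (entry != NULL) {
        next = entry->next;
        /* On failure, entry is reloaded with the same ordering. */
        if (atomic_compare_exchange_weak_explicit(&head->first, &entry, next,
                                                  memory_order_seq_cst,
                                                  memory_order_consume))
            break;
    }
    return entry;
}
```

As in the kernel implementation, this sketch assumes at most one concurrent
caller of llist_del_first() (or external locking among removers); it does not
address ABA between multiple removers.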
>

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com