> From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com]
> Subject: Re: [PATCH 7/8] zswap: add to mm/
>
> On 01/03/2013 04:33 PM, Dan Magenheimer wrote:
> >> From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com]
> >>
> >> However, once the flushing code was introduced and could free an entry
> >> from the zswap_fs_store() path, it became necessary to add a per-entry
> >> refcount to make sure that the entry isn't freed while another code
> >> path was operating on it.
> >
> > Hmmm... doesn't the refcount at least need to be an atomic_t?
>
> An entry's refcount is only ever changed under the tree lock, so
> making them atomic_t would be redundantly atomic.

Maybe I'm missing something still but then I think you also need
to evaluate and act on the refcount (not just read it) while your
treelock is held. I.e., in:

> +	/* page is already in the swap cache, ignore for now */
> +	spin_lock(&tree->lock);
> +	refcount = zswap_entry_put(entry);
> +	spin_unlock(&tree->lock);
> +
> +	if (likely(refcount))
> +		return 0;
> +
> +	/* if the refcount is zero, invalidate must have come in */
> +	/* free */
> +	zs_free(tree->pool, entry->handle);
> +	zswap_entry_cache_free(entry);
> +	atomic_dec(&zswap_stored_pages);

the entry's refcount may be changed by another processor
immediately after the unlock, and then the "if (refcount)"
is testing a stale value and you will get (I think) a memory leak.

There is similar racy code in zswap_fs_invalidate_page which
I think could lead to a double free. There's another I think
in zswap_fs_load... And the refcount is dec'd in one path
inside of zswap_fs_store as well which may race with the above.

When flushing multiple zpages to free a pageframe, you may
need to test refcounts for all the entries while within the lock.
If so, this is one place where the high-density storage will make
things messy, especially if page boundaries are crossed.

A nit: Even I, steeped in tmem terminology, was confused by
your use of "fs"... to nearly all readers it will be translated
as "filesystem" which is mystifying. Just spell it out
"frontswap", even if it causes a few lines to be wrapped.

Have a good weekend!
Dan
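
For illustration only, the point about evaluating and acting on the
refcount while the lock is held can be sketched in user space. This is
not the zswap code: a pthread mutex stands in for tree->lock, a
one-slot "tree" stands in for the rbtree, and the names struct entry,
struct tree, tree_get() and tree_put() are all hypothetical. The idea
is that the drop to zero and the unlink happen in one critical section,
so by the time the caller tests the result nobody else can look the
entry up and change its refcount.

```c
/* User-space sketch under stated assumptions; all names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct entry {
	int refcount;		/* only ever changed with tree->lock held */
};

struct tree {
	pthread_mutex_t lock;	/* stand-in for the per-tree spinlock */
	struct entry *slot;	/* one-slot stand-in for the rbtree */
};

/* Look up the entry and take a reference in one critical section. */
static struct entry *tree_get(struct tree *t)
{
	struct entry *e;

	pthread_mutex_lock(&t->lock);
	e = t->slot;
	if (e)
		e->refcount++;
	pthread_mutex_unlock(&t->lock);
	return e;
}

/*
 * Drop a reference. The zero test and the unlink are both done under
 * the lock. Returns true when the caller held the last reference and
 * must free the entry; at that point no lookup can reach it any more.
 */
static bool tree_put(struct tree *t, struct entry *e)
{
	bool last;

	pthread_mutex_lock(&t->lock);
	last = (--e->refcount == 0);
	if (last && t->slot == e)
		t->slot = NULL;	/* unlink before anyone else can find it */
	pthread_mutex_unlock(&t->lock);
	return last;
}
```

A caller would then write "if (tree_put(t, e)) free(e);". The boolean
it tests cannot go stale in the way a raw refcount value can, because
the entry was made unreachable before the lock was dropped.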