On Thu, Oct 18, 2018 at 5:00 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> The below code seems to be problemetic:
> dsm_cleanup_using_control_segment()
> {
> ..
> if (!dsm_control_segment_sane(old_control, mapped_size))
> {
> dsm_impl_op(DSM_OP_DETACH, old_control_handle, 0, &impl_private,
> &mapped_address, &mapped_size, LOG);
> ..
> }
>
> Here, don't we need to use dsm_control_* variables instead of local
> variable mapped_* variables?
I was a little fuzzy on when exactly dsm_cleanup_using_control_segment()
and dsm_postmaster_shutdown() run, but after some more testing I think I
have this straight now.  You can test by setting dsm_control->magic to 42
in a debugger and trying three cases:

1.  Happy shutdown: dsm_postmaster_shutdown() complains on shutdown.
2.  kill -9 a non-postmaster process: dsm_postmaster_shutdown() complains
    during auto-restart.
3.  kill -9 the postmaster, manually start up again:
    dsm_cleanup_using_control_segment() runs.  It quietly ignores the old
    segment if it doesn't pass the sanity test.

So to answer your question: no, dsm_cleanup_using_control_segment() is
case 3.  This entirely new postmaster process has never had the segment
mapped in, so the dsm_control_* variables are not relevant here.

Hmm... but if you're running N other independent clusters on the same
machine that started up after this cluster crashed in case 3, I think
there is an N-in-four-billion chance that the segment with that ID now
belongs to another cluster and happens to be its DSM control segment, and
therefore passes the magic-number sanity test, and then we'll nuke it and
all the segments it references.  Am I missing something?  Could we fix
that by putting some kind of cluster ID in the control segment?

So on second thoughts, I think I should remove the
dsm_cleanup_using_control_segment() change from the patch, because it
widens the risk of case 3 nuking someone else's stuff to include even
segments that don't pass the sanity test.  That's not good.  Hunk removed.

> ...  Do we need to reset the dsm_control_*
> variables in dsm_cleanup_for_mmap?

I don't think so -- that unlinks the files in file-backed DSM, but we
also call the other code in that case.

> @@ -336,13 +333,14 @@ dsm_postmaster_shutdown(int code, Datum arg)
> * stray shared memory segments, but there's not much we can do about that
> * if the metadata is gone.
> */
> - nitems = dsm_control->nitems;
> if (!dsm_control_segment_sane(dsm_control, dsm_control_mapped_size))
> {
> ereport(LOG,
> (errmsg("dynamic shared memory control segment is corrupt")));
> - return;
> + nitems = 0;
>
> The comment above this code says:
> /*
> * If some other backend exited uncleanly, it might have corrupted the
> * control segment while it was dying.  In that case, we warn and ignore
> * the contents of the control segment.  This may end up leaving behind
> * stray shared memory segments, but there's not much we can do about that
> * if the metadata is gone.
> */
>
> Don't we need to adjust this comment, if we want to go with what you
> have proposed?

The comment seems correct: we ignore the *contents* of the control
segment.  But to be clearer I'll add a sentence to say that we still
attempt to destroy the control segment itself.

> BTW, it doesn't appear insane to me that if the
> control segment got corrupted, then we leave it as it is rather than
> trying to destroy it.  I am not sure, but if it is corrupted, then is
> it guaranteed that destroy will always succeed?

You're right that it might fail.  If the shm_unlink() fails it doesn't
matter, we just produce a LOG message, but if the unmap() fails we still
consider the segment to be mapped (because it is!), so ... hmm, we'd
better forget about it if we plan to continue running, by clearing those
variables explicitly.  That's actually another pre-existing way for the
assertion to fail in current master.

(Realistically, that unmap() could only fail if our global variables are
trashed so that we try to unmap() a junk address; AFAIK our model here is
that the postmaster is sane, even if everything else is insane, so that's
sort of outside the model.  But this new patch addresses it anyway.)

Thanks for the review!

-- 
Thomas Munro
http://www.enterprisedb.com
0001-Fix-thinko-in-handling-of-corrupted-DSM-control-a-v2.patch
Description: Binary data