Yes, there has been a conflation of the standard deviation and the r.m.s. of the distribution when it comes to "sigmas". The mathematical formulas look similar (for a Normal distribution) so some people have sloppily transferred the meanings of the mathematical symbols from one concept to the other.
There is another matter in this topic that has also bothered me. People talk about how many sigmas high a peak is (meaning how many r.m.s.'s) when making some argument about the probability of a peak being that high. The problem is that the r.m.s. is calculated from individual samples of the map at grid points, but the conclusion is about "peaks". If you want to comment on the probability of a peak having a given height or larger, you have to work with the distribution of peak heights (the values on the shoulders of the peaks being irrelevant to the topic). You need to identify all the peaks in the map and work from the distribution of their heights.
In a 2Fo-Fc style map there is a bimodal distribution, with a large number of small peaks in the bulk solvent and a bunch of strong peaks in the region of the ordered molecules. While the r.m.s. of the bulk solvent region might give a reasonable estimate of the sigma (as in the uncertainty in peak heights) of this map, the one r.m.s. cutoff for interpreting the map is simply a tool to try to find the line separating the two distributions, so that the big peaks will be inside the contours and the small peaks will be outside.
As was stated previously in this thread, when there is a greater proportion of bulk solvent in the crystal the small peaks contribute more to the r.m.s. calculation, lowering it, and the big peaks, of the same significance, will appear to ride higher above the one "sigma" contour. A 1.2 r.m.s. peak in a map with 80% solvent should be considered less likely to be an atom than a 1.2 r.m.s. peak in a map with 40% solvent.
When searching for water you have a nearly complete atomic model and can use that to put your map on an absolute scale of "electron scattering equivalents"/A^3 (OMG we're not back to that again ;-) ) and avoid this whole sliding scale problem.
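To make the sliding scale concrete, here is a minimal toy sketch, not a real map calculation. It assumes (my assumption, purely for illustration) that solvent grid points scatter narrowly and protein grid points scatter widely around a common mean, with made-up widths of 0.05 and 0.60 in arbitrary density units; the function name `rms_units` is hypothetical.

```python
import math
import random

def rms_units(peak_height, solvent_fraction, n_points=50_000, seed=0):
    """Express a fixed absolute peak height in units of the whole-map r.m.s.

    Toy assumption: the map is a mixture of flat solvent grid points
    (narrow spread) and ordered-region grid points (wide spread).
    """
    rng = random.Random(seed)
    n_solvent = round(solvent_fraction * n_points)
    # Hypothetical spreads, chosen only to make the two regions differ.
    rho = [rng.gauss(0.0, 0.05) for _ in range(n_solvent)]
    rho += [rng.gauss(0.0, 0.60) for _ in range(n_points - n_solvent)]
    mean = sum(rho) / len(rho)
    # r.m.s. deviation from the mean, taken over every grid point.
    rms = math.sqrt(sum((r - mean) ** 2 for r in rho) / len(rho))
    return peak_height / rms

for frac in (0.4, 0.8):
    print(f"{frac:.0%} solvent: 0.5-unit peak = {rms_units(0.5, frac):.2f} r.m.s.")
```

Under these assumptions the same 0.5-unit feature scores noticeably higher in r.m.s. units in the 80% solvent map than in the 40% one, even though nothing about the feature itself has changed.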
Dale Tronrud

On 04/21/10 17:21, James Holton wrote:
> Like so many rules of thumb, the 3-sigma fofc and 1-sigma 2fofc is a
> reasonable guideline that works very well in most cases despite being
> based on a flawed assumption. The "0.3% chance" of a peak being above 3
> "sigmas" assumes that the histogram of electron density values is
> Gaussian. It is not! In fact, it is a funny-looking bimodal
> distribution (the peaks are protein and solvent regions). Programs like
> SOLVE use this fact to identify the correct heavy-atom constellation
> among all the wrong ones (which tend to produce maps with more
> Gaussian-looking histograms).
>
> It also seems to be a very common misconception that "1 sigma" is the
> "noise level" in an electron density map. Not sure where that one got
> started or how. No doubt due to the unfortunate use of the greek letter
> "sigma" to denote a standard deviation in statistics. Indeed, the
> "sigma" scaling of an electron density map is calculated the same way as
> a standard deviation, but one need only calculate a "noise free" map
> from a PDB file to notice that the "sigma" of such maps is not zero.
>
> Does anyone know original references for sigma cutoff rules like this?
>
> -James Holton
> MAD Scientist
>
> Ed Pozharski wrote:
>> I second Tim's opinion. In the days of CNS/O, there was a popular rule
>> to place waters in 3 sigma peaks that make chemical sense, then
>> re-refine and keep those waters that produce more than 1 sigma in 2fo-fc
>> map. (With Coot the default cutoff is 5).
>>
>> There could be a bizarre probabilistic argument for a particular choice
>> of sigma cutoff - with 3 sigmas you have ~0.3% chance of a particular
>> peak to be simply a random spike. Which means that if the map is on,
>> say, 0.5A grid, there is a decent chance to have one such peak per
>> 3.5x3.5x3.5A volume. With 5 sigmas the size of the cube goes up to
>> ~60x60x60A, so 5 sigma peaks are almost guaranteed not to be flukes.
>>
>> On Sat, 2010-04-17 at 22:46 +0200, Tim Gruene wrote:
>>
>>> Hello Sudhir Kumar,
>>>
>>> most of all the waters in your structure should make chemical sense.
>>> When the density around the water is weak it may just mean that the
>>> water is not fully occupied.
>>>
>>> Tim
>>>
>>> On Sat, Apr 17, 2010 at 09:47:35PM +0900, Sudhir Kumar wrote:
>>>
>>>> hi all
>>>> sorry for such a basic query, i'ld like to know what is the
>>>> acceptable sigma cut off for waters to be kept in a model if data
>>>> is of about 1.6 A.
>>>> thanks in advance
>>>> Sudhir Kumar
>>>> Research Scholar
>>>> Structural Biology Laboratory
>>>> SLS, JNU,
>>>> New Delhi-110067
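
For reference, the grid arithmetic in Ed Pozharski's quoted message can be reproduced with a short sketch. It takes the quoted argument at face value, so it assumes Gaussian statistics, independent grid points, and a two-sided tail, which is exactly the assumption questioned in this thread. The function names are illustrative only.

```python
import math

def two_sided_tail(n_sigma):
    """Probability that a Gaussian value lies more than n_sigma from the mean."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def cube_edge_per_spike(n_sigma, grid_spacing):
    """Edge of a cube (same units as grid_spacing) expected to hold one spike.

    Assumes every grid point is an independent Gaussian sample.
    """
    points_per_spike = 1.0 / two_sided_tail(n_sigma)
    return grid_spacing * points_per_spike ** (1.0 / 3.0)

for n in (3, 5):
    print(f"{n} sigma: p = {two_sided_tail(n):.2e}, "
          f"cube edge = {cube_edge_per_spike(n, 0.5):.1f} A")
```

On a 0.5 A grid this gives a cube edge of about 3.6 A at 3 sigma and about 60 A at 5 sigma, in line with the quoted figures. It says nothing about peaks as such, only about independent grid points.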
