I got the error propagation formula from the Undisputed Source of All
Human Knowledge (USAHK):
http://en.wikipedia.org/wiki/Propagation_of_uncertainty
Specifically, the power law formula: if f = a*A^(+/-b), then sigma(f)/f
= b*sigma(A)/A
However, I hope nobody mistook the "derivation" at the top of my last
post as any sort of "truth"! I was actually trying to show that the
flawed assumption that I=F^2 (with no scale factor) leads to a
ridiculous conclusion: sigma(F)=0.5.
Anyway, as long as I=F^2, Ron's formula and mine are equivalent:
sigma(I)/I = 2*sigma(F)/F
sigma(I)/(F^2) = 2*sigma(F)/F
sigma(I) = 2*F^2*sigma(F)/F
sigma(I) = 2*F*sigma(F)
The main difference is that Ron derived his the "correct" way, by
differentiating the parent formula.
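If you'd rather check that equivalence numerically than algebraically, a
quick numpy sketch will do it (the F and sigma(F) values below are just
ones I made up for illustration):

import numpy as np

rng = np.random.default_rng(0)

F_true = 4.0   # some "true" amplitude
sigF   = 0.1   # small compared to F, so linear error propagation should hold

F_obs = rng.normal(F_true, sigF, size=1_000_000)   # noisy F values
I_obs = F_obs**2                                   # corresponding intensities

print("observed sigma(I):", I_obs.std())
print("2*F*sigma(F)     :", 2 * F_true * sigF)
# the two agree to well within 1% as long as sigma(F) << F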
Ron does make a very good point that this and pretty much all other
error propagation formulas rest on the fundamental assumption that the
errors obey a Gaussian (normal) distribution. This is not really the
case for F as it approaches zero. This is very much what French &
Wilson (1978) were on about:
http://dx.doi.org/10.1107/S0567739478001114
Pedagogically, one can address this F->0 problem with a simulation.
That is, take a range of values for F, square them and add noise (the
noise in a spot intensity does have a normal distribution in all but the
near-zero-background case) to get a long list of "Iobs" vs F. Then
graph Iobs on the x axis vs F on the y to see what the "most likely"
value of F is, given that you've seen a given Iobs. If you try this
(and I have) you immediately realize that the histogram of "possible Fs"
for a given Iobs depends on the distribution of F values you used in the
first place.
For example, if you make your list of "Iobs" vs F values using
sigma(Iobs)=1 and F evenly distributed between 0 and 5, then you get an
"Iobs vs F" graph where the points all cluster around the sqrt(Iobs)
curve. No surprises there. If you take all values of F that gave Iobs
between 15.9 and 16.1 and compute the average and rms deviation from
that average you get: <F> = 4 and sigma(F) = 0.125, which agrees with
the sigma(F) = sigma(I)/(2*F) rule above. Still no surprises. However,
if you take all values of F that produced Iobs between -0.1 and 0.1 you
get an average value <F> = 0.58 and sigma(F) = 0.38. This does not
obey the error-propagation rule! But if you think about it: since you
have the "prior knowledge" that F cannot be negative then the average
value of "possible F" is always going to be positive. It is interesting
to note here that although the signal-to-noise ratio for intensity is
zero (<Iobs> ~ 0, sigma(Iobs) = 1), the signal-to-noise ratio for F is
~1.5. Is this for real? Yes, it seems to be. The extra "signal" comes
not only from the knowledge that F can't be less than zero, but that it
is evenly distributed between 0 and 5.
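If you want to try this yourself, here is a minimal numpy sketch of the
uniform-prior version of the simulation (the seed, sample size and bin
widths are just my choices, nothing magic about them):

import numpy as np

rng = np.random.default_rng(1)
n = 5_000_000

F = rng.uniform(0.0, 5.0, n)            # "true" F, evenly distributed 0..5
Iobs = F**2 + rng.normal(0.0, 1.0, n)   # add noise with sigma(Iobs) = 1

# all the F values that could have produced Iobs ~ 16
sel = (Iobs > 15.9) & (Iobs < 16.1)
print("Iobs ~ 16:  <F> =", F[sel].mean(), " sigma(F) =", F[sel].std())

# all the F values that could have produced Iobs ~ 0
sel = (Iobs > -0.1) & (Iobs < 0.1)
print("Iobs ~  0:  <F> =", F[sel].mean(), " sigma(F) =", F[sel].std())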
Oh, wait. Is F really evenly distributed? No, it's not. In fact,
Wilson (1949) noticed that F^2 has an exponential distribution (small
values being much much more likely than large ones). If you use this
"Wilson distribution", then the collection of F values that give Iobs
between -0.1 and 0.1 has <F> = 0.66 and sigma(F) = 0.31. That is a bit
different from the 0<F<5 case above, and if you look at the histogram of
these F values, you will see that it is very different. In neither case
is it Gaussian, and this is where we start getting into "likelihood
distribution" stuff.
It is probably worth mentioning here that the French-Wilson "truncation"
procedure is a bit controversial, and there are some rather prominent
crystallographers who think refinement should stay in the intensity
domain, where the error distributions are nicely Gaussian. Then the
"prior knowledge" comes from the model itself and is not "double
counted". That said, the extra "boost" in F/sigma(F) can potentially
have some advantages, for example in anomalous differences where delta-F
is comparable to F itself. But that is a much longer story, and a bit
more than I think the OP was asking about.
-James Holton
MAD Scientist
On 8/21/2011 1:43 AM, Ronald E Stenkamp wrote:
James, could you please give more information about where and/or how
you obtained the relationship "sigma(I)/I = 2*sigma(F)/F"? A
different equation, sigma(I)=2*F*sigma(F), can be derived from
sigma(I)^2 = (d(I)/dF)^2 * sigma(F)^2. I understand that that
equation is based on normal distribution of errors and has numerical
problems when F is small, so there are other approximations that have
been used to convert sigma(I) to sigma(F). However, none that I've
seen end up stating that sigma(F)=0.5. Thanks. Ron
On Sat, 20 Aug 2011, James Holton wrote:
There is a formula for sigma(F) (aka "SIGF"), but it is actually a
common misconception that it is simply related to F. You need to
know a few other things about the experiment that was done to collect
the data. The misconception seems to arise because the first thing
textbooks tell you is that F = sqrt(I), where "I" is the intensity of
the spot. Then, later on, they tell you that sigma(I) = sqrt(I)
because of "counting statistics". Now, if you look up a table of
error-propagation formulas, you will find that if I=F^2, then
sigma(I)/I = 2*sigma(F)/F, and by substituting these equations
together you readily obtain:
sigma(F) = F/2*sigma(I)/I
sigma(F) = F/2*sigma(I)/F^2
sigma(F) = sigma(I)/(2*F)
sigma(F) = sigma(I)/(2*sqrt(I))
sigma(F) = sqrt(I)/(2*sqrt(I))
sigma(F) = 0.5
Which says that the error in F is always the same, no matter what
your exposure time? Hmm.
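Funnily enough, that statement is literally true for raw photon counts:
the square root of a Poisson-distributed count has a standard deviation
of about 0.5 no matter how many photons you count. A quick numpy check,
if you don't believe it:

import numpy as np

rng = np.random.default_rng(3)

for mean_photons in (100, 10_000, 1_000_000):
    counts = rng.poisson(mean_photons, 1_000_000)
    print(mean_photons, np.sqrt(counts).std())   # ~0.5 in every case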
The critical thing missing from the equations above is something we
crystallographers call a "scale factor". We love scale factors
because they let us get away with not knowing a great many things,
like the volume of the crystal, the absolute intensity of the x-ray
beam, and the exact "gain" of the detector. It's not that we can't
measure or look up these things, but few of us have the time. And,
by and large, as long as you are aware that there is always an
unknown "scale factor", it doesn't really get in your way. So, the
real equation is:
I_in_photons = scale*F^2
where scale =
Ibeam*re^2*Vxtal*lambda^3*Lorentz_factor*Polar_factor*Attenuation_factor*exposure_time/deltaphi/Vcell^2
This "scale factor" comes from Equation 1 in the following paper:
http://dx.doi.org/10.1107/S0907444910007262
where we took pains to describe the exact meaning of each of these
variables (and their units!) in great detail. It is open access, so
I won't go through them here. I will, however, add that for spots on
the detector there are a few other factors still missing, like the
detector gain, obliquity, parallax, and spot partiality, but these
are all "taken care of" by the data processing program. The main
thing is to figure out the number of photons that were accumulated
for a given h,k,l index, and then take the square root of that to get
the "counting error". Oh, and you also need to know the number of
"background" photons that fell into the pixels used to add up photons
for the h,k,l of interest. The square root of this count must be
combined with the "counting error" of the spot photons, along with a
few other sources of error. This is what we discuss around Equation
(18) in the linked-to paper above.
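If it helps to see the bookkeeping written out, the relationship above
is nothing more than this (variable names mirror the equation; for what
each one means, and its units, you really do need Equation 1 of the
paper, so treat this as a transcription rather than a ready-to-use
calculator):

def scale_factor(Ibeam, re, Vxtal, wavelength, lorentz, polar, attenuation,
                 exposure_time, deltaphi, Vcell):
    # the "scale" from the text: everything about the experiment except F^2
    return (Ibeam * re**2 * Vxtal * wavelength**3 * lorentz * polar
            * attenuation * exposure_time / deltaphi / Vcell**2)

def expected_photons(F, scale):
    # I_in_photons = scale * F^2
    return scale * F**2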
The short answer, however, is that sqrt(I_in_photons) is only one
component of sigma(I). The other factors fall into three main
categories: readout noise, counting noise and what I call "fractional
noise". Now, if you have a number of different sources of noise, you
get the total noise by adding up the squares of all the components,
and then taking the square root:
sigma(I_in_photons) = sqrt( I_in_photons + background_photons +
sigma_readout^2 + (frac_error*I_in_photons)^2 )
For those of you who use SCALA and think the sqrt( sigI^2 + B*I +
(sdadd*I)^2 ) form of this equation looks a lot like the SDCORRection
line, good job! That is a very perceptive observation.
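In code form, that combination rule looks something like this (a sketch
only; the actual numbers come from the data processing program, and the
example values at the bottom are made up):

import math

def sigma_I(spot_photons, background_photons, sigma_readout, frac_error):
    # add the independent noise sources in quadrature
    return math.sqrt(
        spot_photons                       # counting noise of the spot itself
        + background_photons               # counting noise of the subtracted background
        + sigma_readout**2                 # detector readout noise (photon equivalents)
        + (frac_error * spot_photons)**2   # "fractional" noise, proportional to signal
    )

# example: a 10,000-photon spot on 2,000 photons of background
print(sigma_I(10_000, 2_000, 3.0, 0.03))   # ~319 photons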
What separates the three kinds of noise is how they relate to the
exposure time. For example, readout noise is always the same, no
matter what the exposure time is, but as you increase the exposure
time, the number of photons in the spots and the background go up
proportionally. This means that the contribution of "counting noise"
to sigma(I) increases as the square root of the exposure time. On
modern detectors, the read-out noise is equivalent to the "counting
noise" of a few (or even zero) photons/pixel, and so as soon as you
have more than about 10 photons/pixel of background, the readout noise
is no longer significant.
So, in general, noise increases with the square root of exposure
time, but the signal (I_in_photons) increases in direct proportion to
exposure time, so the signal-to-noise ratio (from counting noise
alone) goes up with the square root of exposure time. That is, until
you hit the third type of noise: fractional noise. There are many
sources of fractional noise: shutter timing error, crystal vibration,
flicker in the incident beam intensity, inaccurate scaling factors
(including the absorption correction), and variations in detector
sensitivity across its face. Essentially, these amount to the error
bars on all the terms in the "scale" formula above. On a good
diffractometer, all these errors are small, usually less than a few
percent, but one thing they all have in common is their contribution
to sigma(I) is proportional to the signal (I_in_photons), not the
square root of it! This is the reason why the Rmerge of bright,
low-angle spots never gets down to the 0.1% you would expect from
counting a million photons, even if you did count that many. It is
also the reason why "SDadd" in SCALA (the "estimated error" in
scalepack) tends to be around 3-5%. It is also the reason why
measuring anomalous differences smaller than ~3% is so difficult.
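To put some (made-up) numbers on that, here is a little sketch of how
I/sigma(I) behaves as you turn up the exposure time; the photon rates,
readout noise and fractional error below are purely illustrative:

import numpy as np

spot_rate  = 100.0   # spot photons per second
bg_rate    = 50.0    # background photons per second
sigma_read = 3.0     # readout noise, in photon equivalents
frac_error = 0.03    # 3% "fractional" noise

for t in (0.1, 1.0, 10.0, 100.0, 1000.0):
    I  = spot_rate * t
    bg = bg_rate * t
    sigma_I = np.sqrt(I + bg + sigma_read**2 + (frac_error * I)**2)
    print(f"t = {t:7.1f} s   I/sigma(I) = {I / sigma_I:6.1f}")

# I/sigma grows roughly as sqrt(t) while counting noise dominates, then
# flattens out near 1/frac_error (about 33 here) once fractional noise
# takes over.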
So, if you know the magnitude of all these sources of error, then it
should be possible to derive a formula for SIGF, but I don't think
anyone has quite written down the whole thing yet. Possibly because
the difference between Fobs and Fcalc is about 4-5x bigger than SIGF
anyway.
-James Holton
MAD Scientist
On 8/18/2011 8:19 AM, G Y wrote:
Dear all,
I am a student in crystallography, so I am not quite familiar with even
some basic concepts.
In a shelx .hkl file or a ccp4 .mtz file there is a column SIGF which is
related to the standard deviation of the structure factor. I have read
through many crystallography textbooks, and there are many formulas
about this topic. Sometimes it is the square of sigma, sometimes it is
not.
My question is:
1. What is the exact mathematical formula for SIGF or SIGFP in a ccp4
or shelx format file?
2. If it can be calculated from F, why is it necessary to include it
in a ccp4 or shelx reflection file (they already have F)?
3. Is this value really important in structure determination? Why
and how? As I understand it, during data collection each reflection is
measured several times, so there is a deviation from the average F.
That is the meaning of SIGF. But how is this value used in structure
determination? Is there some kind of correction or refinement of F
according to SIGF?
Also, when the multiplicity during data collection is low, SIGF
would not be so interesting. So does that mean SIGF is not so
important in some measurements?
Any kind reply from you guys would be greatly appreciated. Many thanks!
Best regards,
G