[ccp4bb] AW: [ccp4bb] Series termination effect calculation.

Oliver Einsle Tue, 18 Sep 2012 08:43:31 -0700

Good point! With 9 parameters there must be a zillion combinations that produce 
a decent fit.  But even if they are chosen to be similar they have no physical 
meaning, right?


Cheers, Oliver.


-----Ursprüngliche Nachricht-----
Von: Tim Gruene [mailto:t...@shelx.uni-ac.gwdg.de]
Gesendet: Dienstag, 18. September 2012 15:32
An: Oliver Einsle
Cc: CCP4BB@JISCMAIL.AC.UK
Betreff: Re: [ccp4bb] Series termination effect calculation.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Oliver,

when you fit the values from ICA Tab 6.1.1.1 with gnuplot, the values of C and 
N become much more comparable. c(C) = 0.017 and especially
c(N) = 0.025 > 0!!!
for C:
Final set of parameters            Asymptotic Standard Error
=======================            ==========================

a1              = 0.604126         +/- 0.02326      (3.85%)
a2              = 2.63343          +/- 0.03321      (1.261%)
a3              = 1.52123          +/- 0.03528      (2.319%)
a4              = 1.2211           +/- 0.02225      (1.822%)
b1              = 0.185807         +/- 0.00629      (3.385%)
b2              = 14.6332          +/- 0.1355       (0.9263%)
b3              = 41.6948          +/- 0.5345       (1.282%)
b4              = 0.717984         +/- 0.01251      (1.743%)
c               = 0.0171359        +/- 0.002045     (11.93%)

for N:
Final set of parameters            Asymptotic Standard Error
=======================            ==========================

a1              = 0.723788         +/- 0.04334      (5.988%)
a2              = 3.24589          +/- 0.04074      (1.255%)
a3              = 1.90049          +/- 0.04422      (2.327%)
a4              = 1.10071          +/- 0.0413       (3.752%)
b1              = 0.157345         +/- 0.007552     (4.8%)
b2              = 10.106           +/- 0.1041       (1.03%)
b3              = 30.0211          +/- 0.3946       (1.314%)
b4              = 0.567116         +/- 0.01914      (3.376%)
c               = 0.0252303        +/- 0.003284     (13.01%)

In 1967, Mann only calculated to sin \theta/lambda = 0, ... 1.5, and their 
tabulated values do indeed fit decently within that range, but not out to 6A.

I thought this was notworthy, and I am curious which values for these constants 
refinement programs use nowadays. Maybe George, Garib, Pavel, and Gerard may 
want to comment?

Cheers,
Tim

On 09/18/2012 10:11 AM, Oliver Einsle wrote:
> Hi there,
>
> I was just pointed to this thread and should comment on the
> discussion, as actually made the plots for this paper. James has
> clarified the issue much better than I could have, and indeed the
> calculations will fail for larger Bragg angles if you do not assume a
> reasonable B-factor (I used B=10 for the plots).
>
> Doug Rees has pointed out at the time that for large theta the c-term
> of the Cromer/Mann approximation becomes dominant, and this is where
> chaos comes in, as the Cromer/Mann parameters are only derived from a
> fit to the actual HF-calculation. They are numbers without physical
> meaning, which becomes particularly obvious if you compare the
> parameters for C and N:
>
>
> C:   2.3100  20.8439   1.0200  10.2075   1.5886  0.5687  0.8650
> 51.6512 0.2156 N:  12.2126  0.0057   3.1322  9.8933   2.0125
> 28.9975  1.1663  0.5826 -11.5290
>
> The scattering factors for these are reasonably similar, but the
> c-values are entirely different. The B-factor dampens this out and
> this is an essential point.
>
>
>
> For clarity: I made the plots using Waterloo Maple with the following
> code:
>
> restart; SF :=Matrix(17,9,readdata("scatter.dat",float,9));
>
> biso := 10; e    :=  1; AFF  :=
> (e)->(SF[e,1]*exp(-SF[e,2]*s^2)+SF[e,3]*exp(-SF[e,4]*s^2)
> +SF[e,5]*exp(-SF[e,6]*s^2)+SF[e,7]*exp(-SF[e,8]*s^2)
> +SF[e,9])*exp(-biso*s^2/4);
>
> H    :=  AFF(1); C    :=  AFF(2); N    :=  AFF(3); Ox   :=
> AFF(4); S    :=  AFF(5); Fe   :=  AFF(6); Fe2  :=  AFF(7); Fe3  :=
> AFF(8); Cu   :=  AFF(9); Cu1  :=  AFF(10); Cu2  :=  AFF(11); Mo
> :=  AFF(12); Mo4  :=  AFF(13); Mo5  :=  AFF(14); Mo6  :=  AFF(15);
>
> // Plot scattering factors
>
> plot([C,N,Fe,S], s=0..1);
>
>
> // Figure 1:
>
> rho0 := (r) ->  Int((4*Pi*s^2)*Fe2*sin(2*Pi*s*r)/(2*Pi*s*r),
> s=0..1/dmax); dmax := 1.0; plot (rho0, -5..5);
>
>
> // Figure 1 (inset): Electron Density Profile
>
> rho := (r,f)
> ->(Int((4*Pi*s^2)*f*sin(2*Pi*s*r)/(2*Pi*s*r),s=0..1/dmax));
> cofactor:= 9*rho(3.3,S) + 6*rho(2.0,Fe2) + 1*rho(3.49,Mo6) +
> 1*rho(3.51,Fe3); plot(cofactor, dmax=0.5..3.5);
>
>
> The file scatter.dat is simply a collection of some form factors,
> courtesy of atomsf.lib (see attachment).
>
>
>
> Cheers,
>
> Oliver.
>
>
>
> Am 9/17/12 11:24 AM schrieb "Tim Gruene" unter
> <t...@shelx.uni-ac.gwdg.de>:
>
> Dear James et al.,
>
> so to summarise, the answer to Niu's question is that he must add a
> factor of e^(-Bs^2) to the formula of Cromer/Mann and then adjust the
> value of B until it matches the inset. Given that you claim
> rho=0.025e/A^3 (I assume for 1/dmax approx. 0) for B=12 and the inset
> shows a value of about 0.6, a somewhat higher B-value should work.
>
> Cheers, Tim
>
> On 09/17/2012 08:32 AM, James Holton wrote:
>>>> Yes, the constant term in the "5-Gaussian" structure factor tables
>>>> does become annoying when you try to plot electron density in real
>>>> space, but only if you try to make the B factor zero.  If the B
>>>> factors are ~12 (like they are in 1m1n), then the electron density
>>>> 2.0 A from an Fe atom is not
>>>> -0.2 e-/A^3, it is 0.025 e-/A^3. This is only 1% of the electron
>>>> density at the center of a nitrogen atom with the same B factor.
>>>>
>>>> But if you do set the B factor to zero, then the electron density
>>>> at the center of any atom (using the 5-Gaussian
>>>> model) is infinity.  To put it in gnuplot-ish, the structure factor
>>>> of Fe (in reciprocal space) can be plotted with this
>>>> function:
>>>>
>>>> Fe_sf(s)=Fe_a1*exp(-Fe_b1*s*s)+Fe_a2*exp(-Fe_b2*s*s)+Fe_a3*exp(-Fe_
>>>> b3*s*s
>>>>
>>>>
)+Fe_a4*exp(-Fe_b4*s*s)+Fe_c
>>>>
>>>>
>>>> where: Fe_c = 1.036900; Fe_a1 = 11.769500; Fe_a2 = 7.357300;
>>>> Fe_a3 = 3.522200; Fe_a4 = 2.304500; Fe_b1 = 4.761100; Fe_b2 =
>>>> 0.307200; Fe_b3 = 15.353500; Fe_b4 = 76.880501; and "s" is
>>>> sin(theta)/lambda
>>>>
>>>> applying a B factor is then just multiplication by
>>>> exp(-B*s*s)
>>>>
>>>>
>>>> Since the terms are all Gaussians, the inverse Fourier transform
>>>> can actually be done analytically, giving the real-space version,
>>>> or the expression for electron density vs distance from the nucleus
>>>> (r):
>>>>
>>>> Fe_ff(r,B) = \
>>>> +Fe_a1*(4*pi/(Fe_b1+B))**1.5*safexp(-4*pi**2/(Fe_b1+B)*r*r)
>>>> \ +Fe_a2*(4*pi/(Fe_b2+B))**1.5*safexp(-4*pi**2/(Fe_b2+B)*r*r)
>>>> \ +Fe_a3*(4*pi/(Fe_b3+B))**1.5*safexp(-4*pi**2/(Fe_b3+B)*r*r)
>>>> \ +Fe_a4*(4*pi/(Fe_b4+B))**1.5*safexp(-4*pi**2/(Fe_b4+B)*r*r)
>>>> \ +Fe_c *(4*pi/(B))**1.5*safexp(-4*pi**2/(B)*r*r);
>>>>
>>>> Where here applying a B factor requires folding it into each
>>>> Gaussian term.  Notice how the Fe_c term blows up as B->0?
>>>> This is where most of the series-termination effects come from. If
>>>> you want the above equations for other atoms, you can get them from
>>>> here:
>>>> http://bl831.als.lbl.gov/~jamesh/pickup/all_atomsf.gnuplot
>>>> http://bl831.als.lbl.gov/~jamesh/pickup/all_atomff.gnuplot
>>>>
>>>> This "infinitely sharp spike problem" seems to have led some people
>>>> to conclude that a zero B factor is non-physical, but nothing could
>>>> be further from the truth!  The scattering from mono-atomic gasses
>>>> is an excellent example of how one can
>>>> observe the B=0 structure factor.   In fact, gas scattering
>>>> is how the quantum mechanical self-consistent field calculations of
>>>> electron clouds around atoms was experimentally verified.  Does
>>>> this mean that there really is an infinitely sharp "spike" in the
>>>> middle of every atom?  Of course not.  But there is a "very" sharp
>>>> spike.
>>>>
>>>> So, the problem of "infinite density" at the nucleus is really just
>>>> an artifact of the 5-Gaussian formalism.
>>>> Strictly speaking, the "5-Gaussian" structure factor representation
>>>> you find in ${CLIBD}/atomsf.lib (or Table
>>>> 6.1.1.4 in the International Tables volume C) is nothing more than
>>>> a curve fit to the "true" values listed in ITC volume C tables
>>>> 6.1.1.1 (neutral atoms) and 6.1.1.3 (ions).  These latter tables
>>>> are the Fourier transform of the "true"
>>>> electron density distribution around a particular atom/ion obtained
>>>> from quantum mechanical self-consistent field calculations (like
>>>> those of Cromer, Mann and many others).
>>>>
>>>> The important thing to realize is that the fit was done in
>>>> _reciprocal_ space, and if you look carefully at tables
>>>> 6.1.1.1 and 6.1.1.3, you can see that even at REALLY high angle
>>>> (sin(theta)/lambda = 6, or 0.083 A resolution) there is still
>>>> significant elastic scattering from the heavier atoms.
>>>> The purpose of the "constant term" in the 5-Gaussian representation
>>>> is to try and capture this high-angle "tail", and for the really
>>>> heavy atoms this can be more than 5 electron equivalents.  In real
>>>> space, this is equivalent to saying that about 5 electrons are
>>>> located within at least
>>>> ~0.03 A of the nucleus.  That's a very short distance, but it is
>>>> also not zero.  This is because the first few shells of electrons
>>>> around things like a Uranium nucleus actually are very small and
>>>> dense.  How, then, can we have any hope of modelling heavy atoms
>>>> properly without using a map grid sampling of 0.01A ?  Easy!  The B
>>>> factors are never zero.
>>>>
>>>> Even for a truly infinitely sharp peak (aka a single electron), it
>>>> doesn't take much of a B factor to spread it out to a reasonable
>>>> size. For example, applying a B factor of
>>>> 9 to a point charge will give it a full-width-half max (FWHM) of
>>>> 0.8 A, the same as the "diameter" of a carbon atom.  A carbon atom
>>>> with B=12 has FWHM = 1.1 A, the same as a "point"
>>>> charge with B=16.  Carbon at B=80 and a point with B=93 both have
>>>> FWHM = 2.6 A.  As the B factor becomes larger and larger, it tends
>>>> to dominate the atomic shape (looks like a single Gaussian).  This
>>>> is why it is so hard to assign atom types from density alone.  In
>>>> fact, with B=80, a Uranium atom at 1/100th occupancy is essentially
>>>> indistinguishable from a hydrogen atom. That is, even a modest B
>>>> factor pretty much "washes out" any sharp features the atoms might
>>>> have.
>>>> Sometimes I wonder why we bother with "form factors" at all, since
>>>> at modest resolutions all we really need is Z (the atomic number)
>>>> and the B factor.  But, then again, I suppose it doesn't hurt
>>>> either.
>>>>
>>>>
>>>> So, what does this have to do with series termination?
>>>> Series termination arises in the inverse Fourier transform (making
>>>> a map from structure factors).  Technically, the "tails" of a
>>>> Gaussian never reach zero, so any sort of "resolution cutoff"
>>>> always introduces some error into the electron density calculation.
>>>> That is, if you create an arbitrary electron-density map, convert
>>>> it into structure factors and then "fft" it back, you do _not_ get
>>>> the same map that you started with!  How much do they differ?
>>>> Depends on the RMS value of the high-angle structure factors that
>>>> have been cut off (Parseval's theorem).  The "infinitely sharp
>>>> spike" problem exacerbates this, because the B=0 structure factors
>>>> do not tend toward zero as fast as a Gaussian with the "atomic
>>>> width" would.
>>>>
>>>> So, for a given resolution, when does the B factor get "too sharp"?
>>>> Well, for "protein" atoms, the following B factors will introduce
>>>> an rms error in the electron density map equal to about 5% of the
>>>> peak height of the atoms when the data are
>>>> cut to the following resolution: d     B 1.0 <5 1.5 8 2.0 27
>>>> 2.5 45 3.0 65 3.5 86 4.0 >99
>>>>
>>>> smaller B factors than this will introduce more than 5% error at
>>>> each of these resolutions.  Now, of course, one is often not nearly
>>>> as concerned with the average error in the map as you are with the
>>>> error at a particular point of interest, but the above numbers can
>>>> serve as a rough guide.  If you want to see the series-termination
>>>> error at a particular point in the map, you will have to calculate
>>>> the "true" map of your model (using a program like SFALL), and then
>>>> run the map back and forth through the Fourier transform and
>>>> resolution cutoff (such as with SFALL and FFT).  You can then use
>>>> MAPMAN or Chimera to probe the electron density at the point of
>>>> interest.
>>>>
>>>> But, to answer the OP's question, I would not recommend trying to
>>>> do fancy map interpretation to identify a mystery atom.  Instead,
>>>> just refine the occupancy of the mystery atom and see where that
>>>> goes. Perhaps jiggling the rest of the molecule with "kick maps" to
>>>> see how stable the occupancy is.
>>>> Since refinement only does forward-FFTs, it is formally insensitive
>>>> to series termination errors.  It is only map calculation where
>>>> series termination can become a problem.
>>>>
>>>> Thanks to Garib for clearing up that last point for me!
>>>>
>>>> -James Holton MAD Scientist
>>>>
>>>>
>>>> On 9/15/2012 3:12 AM, Tim Gruene wrote: Dear Ian,
>>>>
>>>> provided that f(s) is given by the formula in the Cromer/Mann
>>>> article, which I believe we have agreed on, the inset of
>>>> Fig.1 of the Science article we are talking about is claimed to be
>>>> the graph of the function g, which I added as pdf to this email for
>>>> better readability.
>>>>
>>>> Irrespective of what has been plotted in any other article
>>>> meantioned throughout this thread, this claim is incorrect, given
>>>> a_i, b_i, c > 0.
>>>>
>>>> I am sure you can figure this out yourself. My argument was not
>>>> involving mathematical programs but only one-dimensional calculus.
>>>>
>>>> Cheers, Tim
>>>>
>>>> On 09/14/2012 04:46 PM, Ian Tickle wrote:
>>>>>>> On 14 September 2012 15:15, Tim Gruene <t...@shelx.uni-ac.gwdg.de>
>>>>>>> wrote:
>>>>>>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>>>>>>
>>>>>>>> Hello Ian,
>>>>>>>>
>>>>>>>> your article describes f(s) as sum of four Gaussians, which is
>>>>>>>> not the same f(s) from Cromer's and Mann's paper and the one
>>>>>>>> used both by Niu and me. Here, f(s) contains a constant, as I
>>>>>>>> pointed out to in my response, which makes the integral
>>>>>>>> oscillate between plus and minus infinity as the upper integral
>>>>>>>> border (called 1/dmax in the article Niu refers to) goes to
>>>>>>>> infinity).
>>>>>>>>
>>>>>>>> Maybe you can shed some light on why your article uses a
>>>>>>>> different f(s) than Cromer/Mann. This explanation might be the
>>>>>>>> answer to Nius question, I reckon, and feed my curiosity, too.
>>>>>>> Tim & Niu, oops yes a small slip in the paper there, it should
>>>>>>> have read "4 Gaussians + constant term": this is clear from the
>>>>>>> ITC reference given and the $CLIBD/atomsf.lib table referred to.
>>>>>>> In practice it's actually rendered as a sum of 5 Gaussians after
>>>>>>> you multiply the f(s) and atomic Biso factor terms, so unless
>>>>>>> Biso = 0 (very unphysical!) there is actually no constant term.
>>>>>>> My integral for rho(r) certainly doesn't oscillate between plus
>>>>>>> and minus infinity as d_min -> zero.  If yours does then I
>>>>>>> suspect that either the Biso term was forgotten or if not then a
>>>>>>> bug in the integration routine (e.g. can it handle properly the
>>>>>>> point at r = 0 where the standard formula for the density gives
>>>>>>> 0/0?).  I used QUADPACK
>>>>>>> (http://people.sc.fsu.edu/~jburkardt/f_src/quadpack/quadpack.htm
>>>>>>> l)
>>>>>>>
>>>>>>>
which seems pretty good at taking care of such singularities
>>>>>>> (assuming of course that the integral does actually converge).
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> -- Ian
>>>>>>>
>>>> -- - -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr.
>>>> 4 D-37077 Goettingen
>>>>
>>>> GPG Key ID = A46BEE1A
>>>>
>>>>
>

- --
- --
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFQWHfUUxlJ7aRr7hoRAr4VAJ4isN0PLYafsdZgVOYseV+MricBVgCfftQd
4a7EpBF1hud6sM6L0SxBXqE=
=ijgy
-----END PGP SIGNATURE-----

[ccp4bb] AW: [ccp4bb] Series termination effect calculation.

Reply via email to