Hi James, using dynamic N-Gaussian approximation to form-factor tables as described here (pages 27-29):
http://cci.lbl.gov/publications/download/iucrcompcomm_jan2004.pdf and used in Phenix since 2004, avoids both: singularity at B=0 and inaccurate density values (compared to the raw forma-factor tables) for B->0. Attached is the script that proves this point. To run, simply "phenix.python run.py". Pavel On Sun, Sep 16, 2012 at 11:32 PM, James Holton <jmhol...@lbl.gov> wrote: > Yes, the constant term in the "5-Gaussian" structure factor tables does > become annoying when you try to plot electron density in real space, but > only if you try to make the B factor zero. If the B factors are ~12 (like > they are in 1m1n), then the electron density 2.0 A from an Fe atom is not > -0.2 e-/A^3, it is 0.025 e-/A^3. This is only 1% of the electron density at > the center of a nitrogen atom with the same B factor. > > But if you do set the B factor to zero, then the electron density at the > center of any atom (using the 5-Gaussian model) is infinity. To put it in > gnuplot-ish, the structure factor of Fe (in reciprocal space) can be > plotted with this function: > Fe_sf(s)=Fe_a1*exp(-Fe_b1*s*s)**+Fe_a2*exp(-Fe_b2*s*s)+Fe_a3*** > exp(-Fe_b3*s*s)+Fe_a4*exp(-Fe_**b4*s*s)+Fe_c > > where: > Fe_c = 1.036900; > Fe_a1 = 11.769500; Fe_a2 = 7.357300; Fe_a3 = 3.522200; Fe_a4 = 2.304500; > Fe_b1 = 4.761100; Fe_b2 = 0.307200; Fe_b3 = 15.353500; Fe_b4 = 76.880501; > and "s" is sin(theta)/lambda > > applying a B factor is then just multiplication by exp(-B*s*s) > > > Since the terms are all Gaussians, the inverse Fourier transform can > actually be done analytically, giving the real-space version, or the > expression for electron density vs distance from the nucleus (r): > > Fe_ff(r,B) = \ > +Fe_a1*(4*pi/(Fe_b1+B))**1.5***safexp(-4*pi**2/(Fe_b1+B)*r*r) \ > +Fe_a2*(4*pi/(Fe_b2+B))**1.5***safexp(-4*pi**2/(Fe_b2+B)*r*r) \ > +Fe_a3*(4*pi/(Fe_b3+B))**1.5***safexp(-4*pi**2/(Fe_b3+B)*r*r) \ > +Fe_a4*(4*pi/(Fe_b4+B))**1.5***safexp(-4*pi**2/(Fe_b4+B)*r*r) \ > +Fe_c *(4*pi/(B))**1.5*safexp(-4*pi****2/(B)*r*r); > > Where here applying a B factor requires folding it into each Gaussian > term. Notice how the Fe_c term blows up as B->0? This is where most of the > series-termination effects come from. If you want the above equations for > other atoms, you can get them from here: > http://bl831.als.lbl.gov/~**jamesh/pickup/all_atomsf.**gnuplot<http://bl831.als.lbl.gov/~jamesh/pickup/all_atomsf.gnuplot> > http://bl831.als.lbl.gov/~**jamesh/pickup/all_atomff.**gnuplot<http://bl831.als.lbl.gov/~jamesh/pickup/all_atomff.gnuplot> > > This "infinitely sharp spike problem" seems to have led some people to > conclude that a zero B factor is non-physical, but nothing could be further > from the truth! The scattering from mono-atomic gasses is an excellent > example of how one can observe the B=0 structure factor. In fact, gas > scattering is how the quantum mechanical self-consistent field calculations > of electron clouds around atoms was experimentally verified. Does this > mean that there really is an infinitely sharp "spike" in the middle of > every atom? Of course not. But there is a "very" sharp spike. > > So, the problem of "infinite density" at the nucleus is really just an > artifact of the 5-Gaussian formalism. Strictly speaking, the "5-Gaussian" > structure factor representation you find in ${CLIBD}/atomsf.lib (or Table > 6.1.1.4 in the International Tables volume C) is nothing more than a curve > fit to the "true" values listed in ITC volume C tables 6.1.1.1 (neutral > atoms) and 6.1.1.3 (ions). These latter tables are the Fourier transform > of the "true" electron density distribution around a particular atom/ion > obtained from quantum mechanical self-consistent field calculations (like > those of Cromer, Mann and many others). > > The important thing to realize is that the fit was done in _reciprocal_ > space, and if you look carefully at tables 6.1.1.1 and 6.1.1.3, you can see > that even at REALLY high angle (sin(theta)/lambda = 6, or 0.083 A > resolution) there is still significant elastic scattering from the heavier > atoms. The purpose of the "constant term" in the 5-Gaussian representation > is to try and capture this high-angle "tail", and for the really heavy > atoms this can be more than 5 electron equivalents. In real space, this is > equivalent to saying that about 5 electrons are located within at least > ~0.03 A of the nucleus. That's a very short distance, but it is also not > zero. This is because the first few shells of electrons around things like > a Uranium nucleus actually are very small and dense. How, then, can we > have any hope of modelling heavy atoms properly without using a map grid > sampling of 0.01A ? Easy! The B factors are never zero. > > Even for a truly infinitely sharp peak (aka a single electron), it doesn't > take much of a B factor to spread it out to a reasonable size. For > example, applying a B factor of 9 to a point charge will give it a > full-width-half max (FWHM) of 0.8 A, the same as the "diameter" of a carbon > atom. A carbon atom with B=12 has FWHM = 1.1 A, the same as a "point" > charge with B=16. Carbon at B=80 and a point with B=93 both have FWHM = > 2.6 A. As the B factor becomes larger and larger, it tends to dominate the > atomic shape (looks like a single Gaussian). This is why it is so hard to > assign atom types from density alone. In fact, with B=80, a Uranium atom > at 1/100th occupancy is essentially indistinguishable from a hydrogen atom. > That is, even a modest B factor pretty much "washes out" any sharp features > the atoms might have. Sometimes I wonder why we bother with "form factors" > at all, since at modest resolutions all we really need is Z (the atomic > number) and the B factor. But, then again, I suppose it doesn't hurt > either. > > > So, what does this have to do with series termination? Series termination > arises in the inverse Fourier transform (making a map from structure > factors). Technically, the "tails" of a Gaussian never reach zero, so any > sort of "resolution cutoff" always introduces some error into the electron > density calculation. That is, if you create an arbitrary electron-density > map, convert it into structure factors and then "fft" it back, you do _not_ > get the same map that you started with! How much do they differ? Depends > on the RMS value of the high-angle structure factors that have been cut off > (Parseval's theorem). The "infinitely sharp spike" problem exacerbates > this, because the B=0 structure factors do not tend toward zero as fast as > a Gaussian with the "atomic width" would. > > So, for a given resolution, when does the B factor get "too sharp"? Well, > for "protein" atoms, the following B factors will introduce an rms error in > the electron density map equal to about 5% of the peak height of the atoms > when the data are cut to the following resolution: > d B > 1.0 <5 > 1.5 8 > 2.0 27 > 2.5 45 > 3.0 65 > 3.5 86 > 4.0 >99 > > smaller B factors than this will introduce more than 5% error at each of > these resolutions. Now, of course, one is often not nearly as concerned > with the average error in the map as you are with the error at a particular > point of interest, but the above numbers can serve as a rough guide. If > you want to see the series-termination error at a particular point in the > map, you will have to calculate the "true" map of your model (using a > program like SFALL), and then run the map back and forth through the > Fourier transform and resolution cutoff (such as with SFALL and FFT). You > can then use MAPMAN or Chimera to probe the electron density at the point > of interest. > > But, to answer the OP's question, I would not recommend trying to do fancy > map interpretation to identify a mystery atom. Instead, just refine the > occupancy of the mystery atom and see where that goes. Perhaps jiggling > the rest of the molecule with "kick maps" to see how stable the occupancy > is. Since refinement only does forward-FFTs, it is formally insensitive to > series termination errors. It is only map calculation where series > termination can become a problem. > > Thanks to Garib for clearing up that last point for me! > > -James Holton > MAD Scientist > > > > On 9/15/2012 3:12 AM, Tim Gruene wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Dear Ian, >> >> provided that f(s) is given by the formula in the Cromer/Mann article, >> which I believe we have agreed on, the inset of Fig.1 of the Science >> article we are talking about is claimed to be the graph of the >> function g, which I added as pdf to this email for better readability. >> >> Irrespective of what has been plotted in any other article meantioned >> throughout this thread, this claim is incorrect, given a_i, b_i, c > 0. >> >> I am sure you can figure this out yourself. My argument was not >> involving mathematical programs but only one-dimensional calculus. >> >> Cheers, >> Tim >> >> On 09/14/2012 04:46 PM, Ian Tickle wrote: >> >>> On 14 September 2012 15:15, Tim Gruene <t...@shelx.uni-ac.gwdg.de> >>> wrote: >>> >>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >>>> >>>> Hello Ian, >>>> >>>> your article describes f(s) as sum of four Gaussians, which is >>>> not the same f(s) from Cromer's and Mann's paper and the one used >>>> both by Niu and me. Here, f(s) contains a constant, as I pointed >>>> out to in my response, which makes the integral oscillate between >>>> plus and minus infinity as the upper integral border (called >>>> 1/dmax in the article Niu refers to) goes to infinity). >>>> >>>> Maybe you can shed some light on why your article uses a >>>> different f(s) than Cromer/Mann. This explanation might be the >>>> answer to Nius question, I reckon, and feed my curiosity, too. >>>> >>> Tim & Niu, oops yes a small slip in the paper there, it should >>> have read "4 Gaussians + constant term": this is clear from the >>> ITC reference given and the $CLIBD/atomsf.lib table referred to. >>> In practice it's actually rendered as a sum of 5 Gaussians after >>> you multiply the f(s) and atomic Biso factor terms, so unless Biso >>> = 0 (very unphysical!) there is actually no constant term. My >>> integral for rho(r) certainly doesn't oscillate between plus and >>> minus infinity as d_min -> zero. If yours does then I suspect that >>> either the Biso term was forgotten or if not then a bug in the >>> integration routine (e.g. can it handle properly the point at r = 0 >>> where the standard formula for the density gives 0/0?). I used >>> QUADPACK >>> (http://people.sc.fsu.edu/~**jburkardt/f_src/quadpack/**quadpack.html<http://people.sc.fsu.edu/~jburkardt/f_src/quadpack/quadpack.html> >>> ) >>> which seems pretty good at taking care of such singularities >>> (assuming of course that the integral does actually converge). >>> >>> Cheers >>> >>> -- Ian >>> >>> - -- - -- >> Dr Tim Gruene >> Institut fuer anorganische Chemie >> Tammannstr. 4 >> D-37077 Goettingen >> >> GPG Key ID = A46BEE1A >> >> -----BEGIN PGP SIGNATURE----- >> Version: GnuPG v1.4.12 (GNU/Linux) >> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ >> >> iD8DBQFQVFSBUxlJ7aRr7hoRAoPYAK**DNQu84ozIz5Mn/qmRKiLxXPw/**zPgCgwd75 >> KUHsKzaSdi9mL5kzZBeOqUI= >> =mbnY >> -----END PGP SIGNATURE----- >> >
run.py
Description: Binary data