Francis E Reyes wrote:
Sorry, a latecomer to this thread, but the OP mentioned "tweaking the error model" in HKL2000. I have heard this before. What's the validity of this? Does it actually help, or does it only improve the integration statistics while you pay for it during refinement?
FR

There is no validity to "tweaking the error model"! It is a HORRIBLE idea! Unless, of course, you are trying to figure out where you made a "mistake".

It is one thing to TRY increasing the error bars to see how big your "unknown systematic error" is, but you should never just sally forth after doing that. This is equivalent to taking the data points:
110 +/- 5
100 +/- 5
90 +/- 5
50 +/- 5
45 +/- 5
and using an "error scale factor" of 5.3 to change all the error bars to 26.5 and make your "merged" result 79 +/- 13. The "Chi^2" here is 1.0 because the new "sigma" of each point is equal to the rms scatter in the observations. However, the way you arrived at this Chi^2=1 does not make any sense! Why would the individual error bars be so ridiculously underestimated? Looking at the points, this is obviously a bimodal distribution (i.e. over-merging).
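The arithmetic above is easy to check with a few lines of plain Python (numbers taken straight from the example above):

```python
import math

# The five observations from the example above, each reported as +/- 5
obs = [110, 100, 90, 50, 45]
sigma = 5.0
n = len(obs)

mean = sum(obs) / n                                        # merged value: 79.0
rms = math.sqrt(sum((x - mean) ** 2 for x in obs) / n)     # rms scatter: ~26.5

# The "error scale factor" needed to make each sigma equal the rms scatter
scale = rms / sigma                                        # ~5.3

# Chi^2 per observation with the inflated sigmas is 1.0 by construction
chi2 = sum((x - mean) ** 2 for x in obs) / (n * rms ** 2)

# Error on the merged mean (sample std / sqrt(n)): ~13
merged_sigma = math.sqrt(sum((x - mean) ** 2 for x in obs) / (n - 1) / n)
```

The point of the sketch is that the scale factor is derived entirely from the scatter of the data about their own mean, so Chi^2 = 1 comes out by construction, regardless of whether the scatter is noise or (as here) a bimodal distribution.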

I once saw one poor sod turn a 7-fold NCS molecule into a 2-fold crystallographic operator by simply rejecting "outlier" spots. Fortunately, they caught it eventually. But I fear there are more than a few wrong structures in the PDB because of these ill-advised practices: rejecting tons of perfectly good data, or using a completely non-physical "error model". Remember, by inflating your error bars you can make any number of wrong conclusions agree with your data to "within the error bars".

In tzhou's case, I think the "wrong conclusion" was assuming no radiation damage. Effectively, the crystal at the end of the data set was different from the crystal at the beginning. Sounds like even the unit cell dimensions changed significantly! I hope it is not surprising to anyone that refining a single-conformer atomic model against a moving average of the damaged and undamaged structures will get your R-factors "stuck".


I suppose it is appropriate here to mention the meaning of the parameters in the "error model". In denzo/HKL2K, these are the "error scale factor" value and the "estimated error" table. In SCALA, the "error model" is given on the "SDCORRection" line, which has 3 numbers: sdfac (equivalent to the "error scale factor"), sdB (no equivalent in denzo), and sdadd (equivalent to the "estimated error"). In XSCALE, the "WEIGHT" keyword defines the equivalent of sdfac/"error scale factor". How these numbers are applied to the data is described fairly well in the various programs' manuals, but the noise sources they account for do not seem to be widely known:

sdadd > 0 reflects the sum of all sources of fractional noise, or "% error". If the x-ray beam is flickering, the shutter timing is not perfect, the crystal is vibrating in the cryo stream, or the detector calibration (a scale factor on each pixel) is not perfect, then you get a source of error that is proportional to intensity. Essentially, sdadd represents the combined error of all the "scale factors" in the experiment. In typical experiments, this all sums to about 3%/spot (sdadd = 0.03), which is why I/sd tends to top out at ~30 for datasets with modest multiplicity (see Diederichs, Acta D, 2010).
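As a sketch of how these three parameters enter the corrected sigma (this is the SCALA-style form as I recall it; check the program manual for the exact expression), and why sdadd puts a ceiling on I/sd:

```python
import math

def corrected_sigma(I, sigma, sdfac=1.0, sdB=0.0, sdadd=0.03):
    """SCALA-style corrected sigma (form assumed, not quoted from the manual):
    sd'(I) = sdfac * sqrt(sigma^2 + sdB*I + (sdadd*I)^2)"""
    return sdfac * math.sqrt(sigma ** 2 + sdB * I + (sdadd * I) ** 2)

# For a strong spot the (sdadd*I)^2 term dominates, so I/sd' -> 1/sdadd,
# i.e. ~33 for sdadd = 0.03, no matter how many photons you count.
I = 1.0e6
sigma_poisson = math.sqrt(I)   # counting noise alone
print(I / corrected_sigma(I, sigma_poisson))   # ~33
```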

Now would be a good time to ask yourself why a particular resolution bin would have an "estimated error" different from the rest?

sdfac != 1.0 generally arises from using an incorrect detector gain (a nearly universal practice). The fact that almost every detector has a non-zero point-spread function (PSF) tends to make the rms variation in a field of identically illuminated pixels (a flat field) less than what one would expect from "photon-counting noise". This is because the "true noise" is being "blurred" by the PSF. The reduced noise can fool one into thinking that the detector experienced more photons than it actually did (making the signal-to-noise ratio higher), leading to the conclusion that the detector gain (ADU/photon) is lower than it actually is. This is a fine assumption for getting the signal-to-noise of the background, but unfortunately this "noise suppression" does not apply to spots. There is some debate about which "gain" is "right" to use in data processing, but in my experience correcting things after-the-fact in scaling using sdfac is not harmful. Unless, of course, you inflate sdfac beyond what is required to correct the gain!
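The gain-underestimation effect is easy to demonstrate with a toy simulation (made-up but typical numbers: 100 photons/pixel flat field, a 3-pixel box "PSF"; any real PSF behaves similarly):

```python
import numpy as np

rng = np.random.default_rng(0)
gain = 1.83                                # ADU/photon (illustrative value)
photons = rng.poisson(100, 200_000)        # flat-field: 100 photons/pixel
pixels = gain * photons                    # recorded signal in ADU

# A common gain estimate for Poisson data is variance/mean (in ADU)
print(pixels.var() / pixels.mean())        # ~1.83, the true gain

# Now "blur" with a 3-pixel box PSF: each output pixel averages 3 inputs
blurred = np.convolve(pixels, np.ones(3) / 3, mode="valid")

# The mean is unchanged but the variance drops ~3-fold, so the
# variance/mean estimate of the gain comes out ~3x too low
print(blurred.var() / blurred.mean())      # ~0.61
```

The blurred flat field "looks" quieter than Poisson statistics allow for the true gain, which is exactly the trap described above: the fitted gain comes out low, spot sigmas come out underestimated, and sdfac > 1 is needed to repair them.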

sdB != 0 introduces a factor proportional to the square root of intensity (proportional to photon counting noise), which can also arise from an incorrect detector gain. In practice, this term tends to "soak up" problems in the scaling of the other two. However, in my "optimization" of these error-model parameters I have found that sdB "refines" to 0 if I have used the correct detector gain.

It is worth noting here that no "error model" implementations seem to have a factor for handling noise sources that are independent of intensity, such as the detector read-out noise. However, one can "fudge" this by lowering the "zero" level in data processing (ADCOFFSET in MOSFLM). Essentially, you are fooling MOSFLM into thinking that there is a constant, flat "background" of "extra photons" in every pixel and equating the read-out noise to the noise you would get from this "extra" background. For example, an ADSC Q315r in hardware bin mode has a "true" GAIN of 1.83 ADU/(1 A photon) and a "true" ADCOFFSET of 40 ADU. However, the read-out noise tends to give rms 3 ADU on a blank image, which is equivalent to the noise deposited by ~3 photons/pixel. Therefore, lowering the ADCOFFSET to 37 will "simulate" the read-out noise. However, for other readout modes and other detector types this "fudge" will be different. Interestingly, the current default ADCOFFSET in MOSFLM for ADSC detectors is 8, which would be appropriate for a Quantum 4.
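The photon-equivalence arithmetic above can be checked in a couple of lines (numbers from the Q315r example; the only assumption is that the read-out rms is matched to an equivalent Poisson background):

```python
gain = 1.83          # ADU per 1 A photon (hardware-binned Q315r, per the numbers above)
readout_rms = 3.0    # ADU rms measured on a blank image

# A Poisson background of B photons/pixel contributes gain*sqrt(B) ADU rms,
# so the background level that mimics the read-out noise is:
B = (readout_rms / gain) ** 2
print(B)             # ~2.7, i.e. roughly 3 photons/pixel
```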

For all practical purposes, however, the read-out noise from a modern detector is almost always lost in the background, which dominates the per-pixel error in all but the most exotic cases (weak spots and near-zero background).

Anyway, my summary recommendation is to "get to know" your detector and what kind of "error model" you usually get from it. If you find that you have to "inflate your sigmas", then something is wrong.

-James Holton
MAD Scientist


On Jun 23, 2010, at 8:25 AM, "Zhou, Tongqing (NIH/VRC) [E]" <tz...@mail.nih.gov> wrote:

Hi All,

The problem has also been solved with a new 2.0A dataset collected over the last weekend. Same space group and dimensions, much less radiation damage. This time I used APS SER-CAT's weaker BM beamline.

Thanks,


Tongqing

Tongqing Zhou, Ph.D.
Staff Scientist
Structural Biology Section
Vaccine Research Center, NIAID/NIH
Building 40, Room 4607B
40 Convent Drive, MSC3027
Bethesda, MD 20892
(301) 594-8710 (Tel)
(301) 793-0794 (Cell)
(301) 480-2658 (Fax)
******************************************************************
The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives.
******************************************************************


-----Original Message-----
From: Zhou, Tongqing (NIH/VRC) [E]
Sent: Tuesday, June 15, 2010 10:45 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Stuck refinement

Hi, Everyone,

Thank you all very much for the nice suggestions. I am trying to reply within this email.

I agree that the problem may be rooted in the crystal itself. We noticed during data collection that a wedge of the rotation was very mosaic; HKL2000 was able to pick up the right spots, but scaling gave high chi^2, and when I used the rejection files HKL2000 complained "more than 50000 rejections". Colleagues suggested tweaking the error model; the "more than 50000 rejections" complaint went away and the rejections dropped to below 300 spots. The new error model reduced the chi^2 as well as the I/sigI in the low resolution shells.

I ran the P222 data set through Xtriage; the report says no twinning, but that the symmetry may be too low. However, HKL2000 won't even pick up higher symmetry groups during indexing. I also rescaled the data omitting the bad wedge, and Xtriage gives a "normal" report.


Refinement was done with a combination of simulated annealing, TLS, ADP refinement, and individual sites in Phenix. Molecular replacement was done with CDR-loop-trimmed antibody Fab and antigen structures. The map quality was good, and I was able to rebuild the new loops without any problem.

I will have beam time later this week, so I think it will be better to put on a better crystal.

Best regards,


Tongqing



-----Original Message-----
From: Eleanor Dodson [mailto:c...@ysbl.york.ac.uk]
Sent: Tuesday, June 15, 2010 4:46 AM
To: Zhou, Tongqing (NIH/VRC) [E]
Cc: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Stuck refinement

When this happens, I firstly suspect that the space group may be wrong. We had a case where the symmetry was pseudo-I4122 but was really I222 (or was it really I212121?). Anyway, most of the structure obeyed the I4122 symmetry, but there was a tail which did not.

Feed the unmerged reflections into pointless and see what it suggests

Eleanor

Zhou, Tongqing (NIH/VRC) [E] wrote:
Hi Everyone,

I have a problem refining a structure. The data go to 2.4A (with some 30% completeness at 2.15A); the structure was solved by MR with Phaser and refined with Phenix, but R and R-free are now stuck at 26% and 32%, even with all plausible waters and missing fragments added. Data were collected at APS under cryo conditions. One thing I noticed during HKL2000 data processing was that the chi^2 values were way too high in the lower resolution shells; I had to adjust the default error model in HKL2000 to get the chi^2 to around 1, but this adjustment reduced the overall I/sigI a lot (from around 20 to 5).

The quality of electron density maps looks fine to me for a 2.4 A data set and I was able to build all the missing CDR loops for the antibody in the complex. I am lost now, should I just re-collect a new data set?

Thanks,


Tongqing


