Hi Dale,

You're absolutely right - the multiple hypothesis testing problem is one that is often not considered, let alone properly accounted for. Whilst this can be accounted for by appropriate adjustment of significance levels when a known number of explicit hypotheses are tested (and when estimated sigmas are appropriate and reliable...), this is extremely difficult in the present context when we passively conduct a large number of quick map evaluations subjectively by eye. Objective guidelines in such a case, which don't essentially boil down to an automated procedure, or unduly inhibit the process in other ways, would be valuable. I don't think there's a clear answer to this today, although raising awareness of such issues is very prudent. Indeed, there is an outstanding need for additional approaches for cross-validation, and perhaps re-evaluation of policies regarding provision of evidence of the reproducibility of crystallographic models. You're correct to say that, ultimately, there is (presently) no substitute for education and experience.
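
To put rough numbers on the explicit-hypotheses case, here is a minimal Python sketch using the figures from your survey example quoted below (20 x 20 = 400 comparisons, each judged at p = 0.05). The function names are hypothetical, the Bonferroni correction is simply one standard adjustment picked for illustration, and the family-wise rate assumes the tests are independent. None of this addresses the harder case of informal by-eye map evaluation, where the number of tests isn't even known.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of spurious 'significant' results if every null is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """Chance of at least one spurious hit, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_tests = 20 * 20  # twenty dietary behaviours x twenty health outcomes
    alpha = 0.05
    print("expected spurious hits:", expected_false_positives(n_tests, alpha))
    print("chance of at least one:", familywise_error_rate(n_tests, alpha))
    print("Bonferroni per-test p: ", bonferroni_threshold(n_tests, alpha))
```

With these inputs the expected number of spurious hits is 20, matching the figure in your letter, and the Bonferroni per-test threshold drops to 0.05/400 = 0.000125.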
Best regards,
Rob

> On 3 Dec 2020, at 08:09, Dale Tronrud <de...@daletronrud.com> wrote:
>
> Hi,
>
> Dr Nicholls brings up many interesting points, but doesn't touch on the
> major point I had hoped to make in my letter. Whenever you start making
> multiple tests of your hypothesis you have to evaluate each of those tests
> with a higher standard than you would if you only applied one. If you take a
> survey of the amount of fat people eat along with their history of heart
> disease you can calculate a correlation and find it significant with a p
> value of 0.05. If, instead, you perform a survey asking for twenty different
> dietary behaviors and twenty health outcomes and find a correlation between
> eating fat and heart disease you need a much higher "signal" to determine its
> significance. You just made 400 comparisons and a p of 0.05 allows about 20
> spurious correlations to appear significant by chance.
>
> If you are exploring your data set to decide if a compound has bound, and
> you try several different refinement programs and calculate several
> different map types based on the results of those refinements, and then
> adjust the blur of each map, and pick the map with the strongest peak in the
> putative binding site, you have to consider the significance of that peak
> height to be less than if you had just calculated one map and got that same
> height.
>
> Ignoring this counterintuitive fact has resulted in the publication, in
> many fields, of a huge number of studies that ultimately turned out not to
> be reproducible. It likely has also resulted in the deposition of a lot of
> "complex" models in the PDB that aren't correct.
>
> Yes, I am arguing for an ideal, hoping to pull some of you over toward my
> side a bit. I certainly understand that one has to be flexible when solving
> a difficult problem, but you can't ignore that this "flexibility" has
> significant consequences for understanding the results of your work.
>
> Dr Nicholls' letter brings up a related topic which I'd like to explore.
> His letter repeatedly mentions the importance of "intuition" when
> interpreting a map. Yes, the power of human intuition, and our inability to
> replicate it in silico, is the reason we are still staring at maps in Coot.
> Intuition is a remarkable tool which, by its nature, is difficult to describe.
>
> Yet, no one is born with an innate intuition for interpreting electron
> density maps. Intuition is acquired thru practice. Practice is not simple
> repetition, however. You can't become proficient in shooting basketball
> hoops by simply repeatedly throwing a basketball on the roof of your garage.
> You have to have a proper backboard and a hoop. Now, after repeatedly
> throwing the ball and "feeling" the difference between it going through the
> hoop and not, you will develop the ability to make a basket w/o really
> thinking about it. You will have developed an intuition for achieving that
> task.
>
> There are two caveats. First, you have to actually watch the ball go
> through the hoop. If you close your eyes right after your throw you will
> never develop a useful skill. It is the feedback from the success or failure
> of each attempt that makes it practice. Second, no matter how much time you
> spend shooting baskets, you will never get better at dribbling the ball.
> Good practice allows you to develop intuition, but only intuition about that
> task.
>
> Let's say you are working on a project, but having difficulty interpreting
> your map at some critical location. You ask around and learn of some spiffy
> new map calculation and you want to try it. While you certainly can
> calculate the map, you have no intuition on how to interpret it. You have
> not practiced with that type of map.
>
> It may look similar to the maps you've looked at before, but that
> similarity can be a trap.
> By now a large number of us here on the BB have
> had the experience of looking at a high resolution electrostatic potential
> (ESP) map and "feeling" that something is wrong with it. The carbonyl oxygen
> bumps are too small and the acid groups are oddly weak. Wow, those magnesium
> ions really stand out -- Maybe they're potassium instead? No, there is
> nothing wrong with the ESP map. The fault is with our intuition which was
> based on many, many hours of looking at ED maps. To interpret ESP maps you
> have to practice with a bunch of ESP maps first.
>
> You cannot develop intuition for the spiffy map calculated from your
> project's data since you don't know its correct interpretation -- It cannot
> give you feedback. Before you calculate this map for your data you should
> calculate versions for many other *completed* projects and get a "feel" for
> what that kind of map shows under different circumstances. Practice,
> practice, practice, then you will be ready to return to your little mystery
> and be able to apply your, newly acquired, intuition.
>
> Yes, I try new refinement programs - But first I run refinement with them
> on familiar proteins. Yes, I try new styles of map calculations - But first I
> calculate those maps for cases where I know the answer. I've refined a fair
> number of structures, probably not as many as most of you, but at the end of
> a refinement I take the answer and go back to the original maps. Looking at
> those maps in light of the answer is what improves my map interpretation
> skills, such as they are, the most.
>
> All of my practice has been with ED (and some ESP) maps of better than 3 A
> resolution. Despite all the intuition I can bring to bear on them, when it
> comes to a 4 A resolution map I'm no better than an undergrad.
>
> Your first experience with a new technique should never be with your
> current project's data. You should work to add that technique to your tool
> box, and then move back to your data.
> Practice, and more practice will build
> that squishy neural network in your head.
>
> Descending from soapbox,
> Dale Tronrud
>
>
> On 12/1/2020 8:31 AM, Robert Nicholls wrote:
>> Dear all,
>> I feel the need to respond following last week’s critique of the use of
>> Coot’s map blurring tool for providing diagnostic insight and aiding ligand
>> identification…
>>> On 24 Nov 2020, at 16:02, Dale Tronrud <de...@daletronrud.com> wrote:
>>>
>>> To me, this sounds like a very dangerous way to use this tool to decide if a
>>> ligand has bound. I would be very reluctant to modify my map with a range
>>> of arbitrary parameters until it looked like what I wanted to see. The
>>> sharpening and blurring of this tool is not guided or limited by theory or
>>> data.
>> I disagree with this, subject to the important qualification that care is
>> needed with interpretation. Blurring isn't a crime - it merely involves
>> adjusting the weighting given to lower versus higher resolution reflections,
>> and thus allows relaxation of the choice of high-resolution limit, and
>> facilitates local investigation of regions that exhibit a poor
>> signal-to-noise ratio. This is particularly pertinent to bound ligands,
>> which are typically present with sub-unitary occupancies.
>> Coot's blurring merely involves convolution of the whole map with an
>> isotropic 3D Gaussian, with a parameter (B-factor) to control the standard
>> deviation of the Gaussian. This corresponds to reweighting the structure
>> factors in order to give higher weight to lower-resolution reflections. This
>> approach is guided by a very simple theory: higher resolution structure
>> factors (SFs) are typically noisier, with a worse signal-to-noise ratio than
>> lower resolution SFs (due to increased errors in both observed
>> higher-resolution reflections and calculated phases).
>> Consequently, increasing the blurring B-factor reduces the effect of the
>> noisier higher-resolution SFs. This results in a map that should be more
>> reliable, but at the expense of reduced structural detail due to
>> artificially reducing the effective resolution.
>> It should be noted that this does assume that lower resolution reflections
>> are more reliable than higher resolution ones. So, good low-resolution data
>> quality and completeness are important.
>> Unfortunately, determination of an optimal B-factor parameter is not
>> presently automated. Consequently, users are currently expected to trial
>> different values in the Coot slider tool in order to maximise information
>> and gain, for want of a better word, intuition. Furthermore, due to the
>> spatially heterogeneous nature of atomic positional uncertainty in
>> macromolecular complexes, it can be that different B-factor parameters are
>> of optimal usefulness in different local regions of the map that exhibit
>> different signal-to-noise ratios. Such issues are on-going areas of research.
>> The main problem is that interpretation is subjective. In difficult cases,
>> it is necessary to obtain as much information and insight as possible in
>> order to gain a good intuition. If you can't see a ligand in the "standard"
>> maps, but you can see evidence for a ligand in blurred density (or
>> difference density) maps of the various types, then it means that careful
>> exploration of those avenues is required. Any "evidence" from viewing such
>> maps and map types should serve to guide intuition, and should be digested
>> along with all other available information. Such complementary maps should
>> be seen as diagnostics to gain intuition, rather than something that can be
>> used as an unequivocal argument for ligand binding.
>> Ultimately, the presence of significant density in a blurred map means that
>> there is something substantial present.
>> Or, in a blurred difference density map, it means that there is something
>> missing from the current model. This could be a
>> missing ligand, or it could be a mismodelled region of the macromolecule, or
>> it could be mismodelled solvent (in which case re-evaluating any solvent
>> mask may be worthwhile). Ultimately it is down to the practitioner to
>> explore all potential explanations for any such behaviour, in order to
>> maximise intuition and convince themselves of the crystal's structural
>> composition.
>> In some cases the presence of density in a blurred map might be sufficient
>> to convince the practitioner that it is worth pursuing investigation of
>> binding. This may take various forms: hypothesising an approximate pose for
>> the ligand; examining the nature of interactions in the structural
>> environment of the macromolecule; re-evaluation after modelling and
>> refinement; or simply stating that there may be evidence of binding. In many
>> cases, the latter is the appropriate action, and, as Robbie quite rightly
>> pointed out: "in a scientific setting this digging is not to come to a
>> strong conclusion, but only to see if you should pursue the project and do
>> additional experiments".
>>> On 24 Nov 2020, at 16:02, Dale Tronrud <de...@daletronrud.com> wrote:
>>> [...] to avoid bias in the interpretation of the results, all of the
>>> statistical procedures are decided upon BEFORE the study has even begun.
>>> This protocol is written down and peer reviewed at the start. Then the
>>> study is performed and the protocol is followed exactly.
>>> [...] I would recommend that you decide what sort of map you think is the
>>> best at showing features of your active site, based on the resolution of
>>> your data set and other qualities of your project, before you calculate
>>> your first Fourier transform. If you think a Polder map is the bee's knees
>>> then calculate a Polder map and live with it.
>>> If you are convinced of the
>>> value of a FEM, or a Buster map, or a SA omit map, or whatever, calculate
>>> that map instead and live with it.
>> I agree that such an approach would be more scientific, and I certainly find
>> this idea very appealing. Whilst I hesitate to speak against such a
>> philosophy, I feel it is necessary to temper/balance this view by pitching a
>> counterargument in the interests of pragmatism - in general it's just not
>> that practical. And perhaps propositions for revolution of best-practice
>> policies within the field should be distinct from current practical
>> recommendation, in the interests of avoiding potential confusion for the
>> student/user who simply wants a solution that they can apply to today's
>> problems.
>> Whilst it sounds like a nice ideal, in general it is difficult to know which
>> pathologies might be encountered (e.g. ambiguous density in the binding
>> site; twinning; modelling difficulties around a symmetry axis; multiple
>> conformations; semi-disorder; post-translational chemical modifications;
>> radiation damage… the list goes on). It's completely acceptable for someone
>> encountering a problem for the first time to explore what tools are
>> available to guide any decision-making, in the hope of achieving the best
>> model possible. A typical user cannot be expected to outline a strategy for
>> every eventuality a priori - that sounds more like the design of an
>> automated pipeline, not advice that users should be expected to follow.
>> In summary, it's inadvisable to put all eggs in one basket (of one type of
>> map, Polder or otherwise). If an experienced user likes a particular tool
>> because it's worked well for them in the past, it doesn't mean that they
>> shouldn't try other tools (in this case: view other types of maps) the
>> next time they encounter a problem. Especially given that tools in our field
>> are still very much evolving over time.
>> Different approaches may have more
>> value and provide more insight in different circumstances.
>> Best regards,
>> Rob
>> ------------------------------------------------------------------------
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>
> ########################################################################
>
> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing
> list hosted by www.jiscmail.ac.uk, terms & conditions are available at
> https://www.jiscmail.ac.uk/policyandsecurity/
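
A footnote on the blurring described in the quoted message of 1 Dec: convolving a map with an isotropic Gaussian is equivalent to multiplying each structure factor by a resolution-dependent weight. The sketch below assumes the conventional crystallographic B-factor form, weight = exp(-B / (4 d^2)) for a reflection at d-spacing d, with B = 8 pi^2 sigma^2 relating B to the Gaussian's standard deviation sigma. This has not been checked against Coot's source, and the function names are hypothetical.

```python
import math


def blur_weight(b_blur: float, d_spacing: float) -> float:
    """Weight applied to a structure factor at d-spacing d (in Angstrom).

    Assumes the conventional form exp(-B * s^2 / 4) with s = 1/d.
    """
    return math.exp(-b_blur / (4.0 * d_spacing ** 2))


def gaussian_sigma(b_blur: float) -> float:
    """Standard deviation (Angstrom) of the equivalent real-space Gaussian,
    assuming B = 8 * pi^2 * sigma^2."""
    return math.sqrt(b_blur / (8.0 * math.pi ** 2))


if __name__ == "__main__":
    for b_blur in (0.0, 20.0, 40.0, 80.0):
        weights = ", ".join(
            f"d={d:.1f}: {blur_weight(b_blur, d):.3f}" for d in (4.0, 3.0, 2.0, 1.5)
        )
        print(f"B={b_blur:5.1f}  sigma={gaussian_sigma(b_blur):.2f} A  {weights}")
```

For any positive B the weight falls as d decreases, so the higher-resolution terms are down-weighted relative to the lower-resolution ones, which is the reweighting described in that message; B = 0 leaves the map unchanged.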