Cutting data back is a BAD THING. If the information is not provided, no refinement program can use it. For B factor estimation especially, it is the high resolution data that indicate that AlphaHelix1 is better positioned than SurfaceLoop 3.

E
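To put a rough number on why the high resolution terms carry the B factor information, here is a minimal sketch using the standard Debye-Waller attenuation exp(-B/(4d^2)) of a structure factor amplitude at resolution d. The two B values (20 and 40 A^2 for a "helix" and a "loop" atom) are illustrative assumptions, not values from any structure in this thread.

```python
import math


def debye_waller(b_factor: float, d_spacing: float) -> float:
    """Amplitude attenuation exp(-B / (4 d^2)) for a reflection at resolution d (A)."""
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))


def contrast(b_low: float, b_high: float, d_spacing: float) -> float:
    """Ratio of the contributions of a well-ordered atom and a floppy atom at resolution d."""
    return debye_waller(b_low, d_spacing) / debye_waller(b_high, d_spacing)


if __name__ == "__main__":
    # Hypothetical B values: 20 A^2 for a helix atom, 40 A^2 for a loop atom.
    for d in (3.0, 2.0, 1.0):
        print(f"d = {d:.1f} A: helix/loop ratio = {contrast(20.0, 40.0, d):.1f}")
```

Under these assumptions the two atoms differ by less than a factor of two at 3 A but by more than a factor of a hundred at 1 A, so truncating the data removes most of the terms that tell the two B values apart.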
On Fri, 2 Aug 2024 at 17:19, Reza Khayat <rkha...@ccny.cuny.edu> wrote:

> With regard to the image that Bohdan sent and Eleanor's statements, I'm curious whether the splitting of B-factors in Bohdan's image is due to the increased amount of data (which may diminish the extent of uncertainty), to the diminished ensemble of structures within the crystal, or to both. What happens to the B-factors of a structure that was derived from a 1 Angstrom data set if you cut the data back to 3 Angstrom? In other words, you are diminishing the amount of data but not affecting the ensemble of structures that define the crystal. Perhaps I'm way off on this one...
>
> Best wishes,
> Reza
> ------------------------------
> *From:* CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of John R Helliwell <jrhelliw...@gmail.com>
> *Sent:* 02 August 2024 11:47 AM
> *To:* CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
> *Subject:* [EXTERNAL] Re: [ccp4bb] How high a B factor is too high to assume a loop is in place, in the AlphaFold era?
>
> Dear Colleagues,
> I think this paper from 1979 is still very interesting:
> Crystallographic studies of the dynamic properties of lysozyme
> <https://www.nature.com/articles/280563a0>
> Have a great weekend,
> John
>
> Emeritus Professor John R Helliwell DSc
>
> On 2 Aug 2024, at 16:29, Bohdan Schneider <bohdan.schnei...@gmail.com> wrote:
>
> Hello:
>
> yes, a great discussion! I second Eleanor's statement that B-factors of high resolution structures do carry a message about atom flexibility. I attach a screenshot of a figure from our paper (Schneider et al.: Local dynamics of proteins and DNA evaluated from crystallographic B factors, Acta Cryst. (2014).
> D70, 2413–2419) that shows a clear resolution dependence of B factors at protein/protein interfaces for amino acids and waters. Our high resolution group of structures could not be restricted to below 1 Å as Eleanor suggests, but even with a modest limit of 1.9 Å, comparison with the structures at 1.9-2.5 and 2.5-3.0 Å shows the effect clearly. We looked at several other groups of atoms (backbone/side chains at the protein core, at the protein surface, DNA phosphates/bases, waters at the interfaces or bound on the protein surface) and saw the same dependence.
>
> Best,
>
> Bohdan, bs.structbio.org
>
> On 2024-08-02 13:26, Eleanor Dodson wrote:
>
> > All interesting points. (And good to see a reference to "P.A. Machin, J.W. Campbell, M. Elder (Eds), Refinement of Protein Structures, SERC Daresbury Laboratory, Warrington, UK (1980)": for those who remember, a super exciting discussion over what was feasible for refinement, and how to do it!)
> >
> > My take: if a crystal diffracts to 1A we can be fairly sure of the accurate position of most of the coordinates, see other conformations for some regions, and give realistic B values to most atoms.
> >
> > If the crystal only diffracts to 3A then the lattice is not perfect, and there must be multiple conformations for lots of the molecule.
> >
> > There is not going to be sufficient experimental data to model this properly, so every parameter assuming a single conformer (coordinate, B value, occupancy) is an approximation. Restraints help to some extent, but they impose prior knowledge and do not glean information from the experimental data.
> >
> > The "trash can" should indicate the degree of uncertainty, and interpreting that is a bit problematic. B values twice the overall B? Hmm, do NOT put too much faith in that part of the model. As crystallographers I think maybe we need to flag this better for trusting users of the information. Omitting that region? I am not sure. How do others model those floppy lysines?
> > I usually make a sort of informed guess, but indeed giving a single conformation is not the truth, the whole truth, and nothing but the truth.
> >
> > On Fri, 2 Aug 2024 at 01:14, James Holton <jmhol...@lbl.gov> wrote:
> >
> > I submit that modern B factor restraints make them much less trashy than they were in the early days. As Pavel points out, the exact strategies differ from program to program, but I don't think anybody does unrestrained B factor refinement. Not by default.
> >
> > Besides, all we are really doing is fitting Gaussian-shaped peaks to the "curve" of the data. These peaks have a width and a height. For example, a carbon atom with B=20 has a peak density of 1.6 e-/A^3 and a full-width-at-half-max (FWHM) of 1.4 A. That is it! That is the model density being fit. If you increase to B=80 the peak drops to 0.3 e-/A^3 and the FWHM increases to 2.6 A. At the largest B you can stuff into a PDB file (999.99), the peak height is 0.008 e-/A^3 and the "peak" is 8.45 A wide. Your disordered loop, however, is probably not sampling from a symmetric Gaussian distribution like that. This is the real problem with large B factors. They can fit better than sharper B atoms, but that doesn't mean they fit well.
> >
> > Occupancy is easy because all it does is scale the height without affecting the width. So, a 0.5 occupancy atom model is half the height of a full-occupancy one. The width is unchanged. B factors impact both width and height because they must preserve the number of electrons in the peak. This is perhaps why they are often confusing and mysterious. We should also never forget that bulk solvent gets excluded with exactly the same radii rules from every modeled atom, regardless of B factor and occupancy. So, the "change in density" from adding or deleting an atom is a little more complicated than adding or subtracting a Gaussian peak.
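The peak heights and widths James quotes can be approximated with a short sketch. It assumes an isolated carbon atom described by a five-Gaussian (Cromer-Mann style) form factor with an isotropic B added to each term; the coefficients below are quoted from memory of the International Tables and should be checked before being relied on. Bulk solvent, resolution cut-offs and neighbouring atoms are all ignored, and the function names are hypothetical.

```python
import math

# Five-Gaussian form-factor coefficients for neutral carbon (assumed values,
# quoted from memory of the International Tables; verify before relying on them).
CARBON_A = (2.3100, 1.0200, 1.5886, 0.8650)
CARBON_B = (20.8439, 10.2075, 0.5687, 51.6512)
CARBON_C = 0.2156


def density(r, b_factor, occupancy=1.0):
    """Model density (e-/A^3) at distance r (A) from a carbon atom; needs b_factor > 0."""
    terms = list(zip(CARBON_A, CARBON_B)) + [(CARBON_C, 0.0)]
    total = 0.0
    for a, b in terms:
        b_tot = b + b_factor  # Gaussian widths add when the atom is smeared by B
        total += a * (4.0 * math.pi / b_tot) ** 1.5 * math.exp(
            -4.0 * math.pi ** 2 * r * r / b_tot
        )
    return occupancy * total  # occupancy scales the height only


def fwhm(b_factor, occupancy=1.0):
    """Full width at half maximum (A), found by bisection on the radial profile."""
    half = 0.5 * density(0.0, b_factor, occupancy)
    lo, hi = 0.0, 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if density(mid, b_factor, occupancy) > half:
            lo = mid
        else:
            hi = mid
    return lo + hi  # twice the half-maximum radius


if __name__ == "__main__":
    for b in (20.0, 80.0, 999.99):
        print(f"B = {b:7.2f}: peak = {density(0.0, b):.3f} e-/A^3, FWHM = {fwhm(b):.2f} A")
    print(f"occupancy 0.5 at B = 20: peak = {density(0.0, 20.0, 0.5):.3f} e-/A^3, "
          f"FWHM = {fwhm(20.0, 0.5):.2f} A")
```

Under these assumptions the sketch should land close to the figures quoted above, and it shows the distinction being drawn: halving the occupancy halves the peak and leaves the width alone, while raising B lowers and broadens the peak at a constant electron count.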
> >
> > Nevertheless, if you want to fit peak height and width independently (like we do in pretty much every other kind of curve fitting), then you should refine occupancy and B factors at the same time. Over-fitting, you say? Hardly. Polynomials are easy to over-fit, but not Gaussians. Observations/parameters is a useful guide for polynomial fits, but in general the hallmark of over-fitting is that the prediction passes exactly through all the observed points (and not the cross-validation or "Rfree" points). I have never seen a macromolecular refinement end up with Rwork = 0. Have you?
> >
> > At the end of the day, what we do with our models is look at their parameters and try to extract the physically meaningful reality they are trying to capture. Restraints are very helpful in preventing many types of unrealistic situations, but ultimately it is up to you to decide if the fitted model makes sense.
> >
> > -James Holton
> > MAD Scientist
> >
> > On 7/30/2024 11:30 AM, Ian Tickle wrote:
> >
> > Obviously no refined parameters can ever be completely error-free; it's just that for the co-ordinates we have very accurate geometric restraints, so that the relative uncertainty in the refined co-ordinates is small (but try refining co-ordinates without restraints!). For the B factors we don't have accurate estimates (if any) for their restraints, so their relative uncertainty after refinement is much greater.
> >
> > -- Ian
> >
> > On Tue, Jul 30, 2024 at 6:57 PM Oganesyan, Vaheh <vaheh.oganes...@astrazeneca.com> wrote:
> >
> > Yes, it is and I like the definition of shared “trash bin”. It will have more physical meaning if we can separate those contributions into separate bins.
> >
> > Vaheh
> >
> > *From:* Pavel Afonine <pafon...@gmail.com>
> > *Sent:* Tuesday, July 30, 2024 1:51 PM
> > *To:* Oganesyan, Vaheh <vaheh.oganes...@astrazeneca.com>
> > *Cc:* CCP4BB@jiscmail.ac.uk
> > *Subject:* Re: [ccp4bb] How high a B factor is too high to assume a loop is in place, in the AlphaFold era?
> >
> > Vaheh,
> >
> > I think coordinates are no different from B factors, occupancies, f', or f'' in this respect. Coordinates can play their "trash bin" role by adjusting to the noise at the expense of violated geometry (bonds, angles, planes, torsions, etc.). As I mentioned in my previous email, their trash bin capacity is much smaller (but definitely not zero!) because the number and strength (confidence) of geometry restraints are much greater than those of ADP restraints.
> >
> > I agree that all refined parameters share this trash bin capacity, but to varying degrees. Isn't this essentially what we call the error on the refined parameter? All refined parameters have their error bars, which we have referred to as the "trash bin" in this thread.
> >
> > Pavel
> >
> > On Tue, Jul 30, 2024 at 10:09 AM Oganesyan, Vaheh <vaheh.oganes...@astrazeneca.com> wrote:
> >
> > Your point is taken, Pavel. However, regardless of resolution, you define the coordinate of an atom as a geometric point with no width. Although coordinates are “refineable”, they have no capacity for “trash”. Their “trash” still goes into the B-factor “trash bin”. At least this is how I see it.
> >
> > Thank you.
> >
> > *Vaheh Oganesyan, Ph.D.*
> > *R&D | Biologics Engineering*
> > One Medimmune Way, Gaithersburg, MD 20878
> > T: 301-398-5851
> > vaheh.oganes...@astrazeneca.com
> >
> > *From:* Pavel Afonine <pafon...@gmail.com>
> > *Sent:* Tuesday, July 30, 2024 11:45 AM
> > *To:* Oganesyan, Vaheh <vaheh.oganes...@astrazeneca.com>
> > *Cc:* CCP4BB@jiscmail.ac.uk
> > *Subject:* Re: [ccp4bb] How high a B factor is too high to assume a loop is in place, in the AlphaFold era?
> >
> > From this perspective, all refinable atomic model parameters can be viewed as trash bins, with the size of these bins being inversely related to the amount of prior information (restraints) imposed on these parameters. For example, coordinates have the most restraints and thus are the smallest trash bins, while B factors have the fewest restraints and thus are one of the largest bins.
> >
> > Pavel
> >
> > On Tue, Jul 30, 2024 at 8:25 AM Oganesyan, Vaheh <vaheh.oganes...@astrazeneca.com> wrote:
> >
> > Early in my crystallography life I was a postdoc with Robert Huber in Munich. We had those gatherings once a week where, in a very informal way, we could ask and answer questions. I remember my question about B factors: how is it possible to have a high resolution structure and an average B-factor of 100 A^2? I think it was Robert or Albrecht Messerschmidt who said that the B-factor is a “trash can” that describes not only loosely positioned atoms but also all other problems that either you created during processing or harvesting, or the crystal had from the beginning.
> >
> > *Vaheh Oganesyan, Ph.D.*
> >
> > *From:* CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> *On Behalf Of* James Holton
> > *Sent:* Tuesday, July 30, 2024 10:35 AM
> > *To:* CCP4BB@JISCMAIL.AC.UK
> > *Subject:* Re: [ccp4bb] How high a B factor is too high to assume a loop is in place, in the AlphaFold era?
> >
> > How high B factors can go depends on the refinement program you are using.
> >
> > In fact, my impression is that the division between the "let the B factors blow up" and "delete the unseen" camps is correlated with their preferred refinement program. You see, phenix.refine is relatively aggressive with B factor refinement, and will allow "missing" atoms to attain very high B factors. Refmac, on the other hand, has restraints that try to make B factor distributions look like those found in the PDB, and so tends to keep nearby B factors similar. As a result, you may get "red density" for disordered regions from refmac, inviting you to delete the offending atoms, but not from phenix, which will raise the B factor until the density fits.
> >
> > Then there are programs like VagaBond that don't formally have B factors, but rather let an ensemble of chains spread out in the loopy regions you are concerned about. This might be the way to go?
> >
> > You can also do ensemble refinement in the latest Amber. That is, you run an MD simulation of a unit cell (or more) and gradually increase structure factor restraints. This would probably result in the "fan" of loops you have in mind?
> >
> > -James Holton
> > MAD Scientist
> >
> > On 7/28/2024 8:13 AM, Javier Gonzalez wrote:
> >
> > Dear CCP4bb,
> >
> > I'm refining the ~3A crystal structure of a big protein, largely composed of alpha helices connected by poorly-resolved loops.
> >
> > In the old pre-AlphaFold (AF) days I used to simply remove those loops/regions with too high B factors, because there was little to no density at 1 sigma in a 2Fo-Fc map.
> >
> > However, considering that the quality of a readily-computable AF model is comparable to a 3A experimental structure, and that the UniProt database is flooded with noodle-like AF models, I was considering depositing a combined model in the PDB.
> >
> > Once R/Rfree reach a minimum for the model truncated in poorly resolved loops, I would calculate an augmented model with AF-calculated missing regions (provided they have an acceptable pLDDT value), assign them zero occupancy, and run only one cycle of refinement to calculate the formal refinement statistics.
> >
> > Would that be acceptable? Has anyone tried a similar approach?
> >
> > I'd rather do that instead of depositing a counterintuitive model with truncated regions that few people would find useful!!
> >
> > Thank you for your comments,
> >
> > Javier
> > --
> > Dr. Javier M. González
> > Instituto de Bionanotecnología del NOA (INBIONATEC-CONICET)
> > Universidad Nacional de Santiago del Estero (UNSE)
> > RN9, Km 1125. Villa El Zanjón. (G4206XCP)
> > Santiago del Estero.
> > Argentina
> > Tel: +54-(0385)-4238352
>
> [Attachment: Figure_2_Acta Cryst. (2014).D70,2413.png]

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/
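For the zero-occupancy step in Javier's original question at the bottom of the thread, a minimal sketch of the bookkeeping might look like the following. It assumes legacy fixed-column PDB records (chain ID in column 22, residue number in columns 23-26, occupancy in columns 55-60); the function name and the residue range in the usage example are hypothetical, and nothing here is tied to a particular refinement program.

```python
def zero_occupancy(pdb_lines, chain_id, first_res, last_res):
    """Return a copy of pdb_lines with occupancy set to 0.00 for one residue range."""
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 60:
            try:
                res_seq = int(line[22:26])  # columns 23-26: residue sequence number
            except ValueError:
                res_seq = None  # leave malformed records untouched
            in_range = res_seq is not None and first_res <= res_seq <= last_res
            if line[21] == chain_id and in_range:
                # columns 55-60 hold the occupancy as a 6.2f field
                line = line[:54] + "  0.00" + line[60:]
        out.append(line)
    return out


if __name__ == "__main__":
    # Hypothetical example: one modelled loop atom in chain A, residue 42.
    fmt = "ATOM  %5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f"
    atom = fmt % (1, " CA", "", "LYS", "A", 42, "", 11.0, 12.0, 13.0, 1.0, 35.0)
    for record in zero_occupancy([atom], "A", 40, 50):
        print(record)
```

Coordinates and the B factor column are left as they were, so whatever the modelled loop carries there (for an AlphaFold model, the pLDDT score) stays visible in the output.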