[ccp4bb] Introducing the UNTANGLE Challenge

James Holton Thu, 18 Jan 2024 16:34:31 -0800

Greetings Everybody,

I present to you a Challenge.

Structural biology would be far more powerful if we could get our models out of local minima, and together, I believe we can find a way to escape them.

tldr: I dare any one of you to build a model that scores better than my "best.pdb" model below. That is probably impossible, so I also dare you to approach or even match "best.pdb" by doing something more clever than just copying it. Difficulty levels range from 0 to 11. First one to match the best.pdb energy score and Rfree wins the challenge, and I'd like you to be on my paper. You have nine months.

Details of the challenge, scoring system, test data, and available starting points can be found here:

https://bl831.als.lbl.gov/~jamesh/challenge/twoconf/

Why am I doing this?

We all know that macromolecules adopt multiple conformations. That is how they function. And yet, ensemble refinement still has a hard time competing with conventional single-conformer-with-a-few-split-side-chain models when it comes to revealing correlated motions, or even just simultaneously satisfying density data and chemical restraints. That is, ensembles still suffer from the battle between R factors and geometry restraints. This is because the ensemble member chains cannot pass through each other, and get tangled. The tangling comes from the density, not the chemistry. Refinement in refmac, shelxl, phenix, simulated annealing, qFit, and even coot cannot untangle them.

The good news is: knowledge of chemistry, combined with R factors, appears to be a powerful indicator of how near a model is to being untangled. What is really exciting is that the genuine, underlying ensemble cannot be tangled. The true ensemble _defines_ the density; it is not being fit to it. The more untangled a model gets the closer it comes to the true ensemble, with deviations from reasonable chemistry becoming easier and easier to detect. In the end, when all alternative hypotheses have been eliminated, the model must match the truth.

Why can't we do this with real data? Because all ensemble models are tangled. Let's get to untangling them, shall we?

To demonstrate, I have created a series of examples that are progressively more difficult to solve, but the ground truth model and density are the same in all cases. Build the right model, and it will not only explain the data to within experimental error and have the best possible validation stats, but it will also reveal the true, underlying cooperative motion of the protein.


Unless, of course, you can prove me wrong?

-James Holton
MAD Scientist

