Dear Wouter,
That does sound like a useful tool indeed - finding the proverbial needle in a haystack! That's the challenge with such a rare event: rather like a "true" Ramachandran outlier, when they do occur they're usually a sign of an important motif in your protein that should be remarked upon.

To others making the same point: yes, I'm well aware of the existence of true cis peptides, and in my paper I both re-calculate the background rate in high-resolution structures and briefly discuss their nature. My personal favourite example is tissue transglutaminase (2q3z), which contains two, one of which is induced by the formation of a vicinal disulfide bond. It's believed that reduction of the disulfide switches the backbone back to trans to activate the enzyme. But I'm currently unaware of any protein that contains more than 3-4 cis bonds that stand up under scrutiny, while there are many models out there with tens of them, or up to a few hundred. For examples of erroneous assignment at high resolution, look at 3ncq, 2gec or 2j82.

It's not such a problem at high resolution, but at lower resolutions I'm more concerned about why the cis bonds have crept into the model. Are they simple innocuous oversights (as pointed out by Robbie Joosten, most - but certainly not all - appear in poorly-defined density), or have they come about due to accidentally force-fitting a loop that is fundamentally wrong (e.g. due to an adjacent strand being out of register)? In most cases it's of course the former, but what worries me is the example of a structure I found (since corrected by the authors) that had 86 cis bonds (1.4%), yet only 0.4% Ramachandran and RSRZ outliers. In a "good" structure one would expect an erroneous cis bond to introduce an outlier in some other metric - but it seems equally possible that in a "bad" structure it could bring an outlier back into a favoured region.

Hope this clarifies my point.
Cheers,
Tristan

________________________________
From: wouter.t...@radboudumc.nl <wouter.t...@radboudumc.nl>
Sent: Monday, 16 February 2015 9:55 PM
To: Tristan Croll; CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Cis-peptide bond checking

Dear Tristan,

Thank you for your post of earlier today regarding the problem of cis and trans peptide planes in the PDB. We also realised this problem a while ago and an article describing this problem and a solution is presently under review at Acta Cryst. D. After analysis of the PDB we can state with >95% certainty that ~4600 trans -> cis flips in ~2800 entries (and ~70K peptide-plane flips) are needed in the PDB. Around a third of the trans -> cis corrections concern non-prolines. We hope to be able to deal with the problem of cis -> trans corrections later. In the tradition of our group, the software to detect these flips is already available at swift.cmbi.ru.nl. Hopefully, the referees of our article consider this topic just as important as you and I do :-).

Kind regards,
Wouter Touw and Gert Vriend

On 02/16/2015 10:58 AM, Tristan Croll wrote:

Dear all,

My apologies for the spam-like nature of my post, but I would like to draw your attention to an important issue (outlined in an upcoming short communication to Acta D, which will appear at doi:10.1107/S1399004715000826 once it's online). At present, neither the structural quality checks in commonly-used crystallography packages nor those run on deposition of a structure to the PDB are flagging the presence of non-proline cis peptide bonds. This has led to the presence of many erroneous cis bonds creeping into the PDB - primarily in low-resolution structures as one would expect, but I have identified clearly erroneous examples in structures with resolutions as high as 1.3 Angstroms. From my analysis, I estimate that a few thousand structures have been affected to some extent, with the worst cases having as high as 3% of their peptide bonds in cis.
Particularly if you have published anything >2.5 Angstroms in the past few years, may I gently suggest that you make a quick double-check of your deposited structures? This can be done quickly and simply in Coot (Extensions-Modelling-Residues with Cis peptide bonds).

Best regards,
Tristan
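
[Editorial addition, not part of the original messages.] For readers who would rather script the check than use the Coot menu, here is a minimal Python sketch of the underlying geometry. A peptide bond is cis when the omega dihedral (CA(i), C(i), N(i+1), CA(i+1)) is near 0 degrees and trans when it is near 180 degrees. The function names, the residue dictionary layout and the 30-degree tolerance are all assumptions made for illustration; this is not the method used by Coot, by the WHAT_CHECK server mentioned above, or by either paper. It also assumes consecutive list entries are covalently linked, so chain breaks would need handling in real use.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def find_cis_peptides(residues, tolerance_deg=30.0):
    """Return (index, name, omega) for each residue preceded by a cis bond.

    `residues` is a hypothetical list of dicts with keys "name", "N",
    "CA" and "C" (coordinates as 3-tuples), assumed to be in chain order
    with no breaks. `tolerance_deg` is an assumed cut-off, not a standard.
    """
    hits = []
    for i in range(len(residues) - 1):
        prev, nxt = residues[i], residues[i + 1]
        omega = dihedral(prev["CA"], prev["C"], nxt["N"], nxt["CA"])
        if abs(omega) < tolerance_deg:
            hits.append((i + 1, nxt["name"], omega))
    return hits
```

Filtering the returned list for names other than "PRO" gives the non-proline cis bonds that the thread is concerned with; each hit would still need to be inspected against the density before being treated as an error.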