Re: [ccp4bb] choosing an NMR structure from PDB

Harry Powell Wed, 03 May 2023 07:16:46 -0700

Hi folks

many thanks for the replies - all very helpful.


I can see that my example of P01132 might have been taken as a protein that I 
have a real current interest in - unfortunately, it isn’t (pace! to anyone who 
is working on malaria), it was just the first example I found that had no 
structures other than NMR (for any of the 60,000 odd other UniProts in the PDB 
which have both NMR and X-ray structures, I’ll leave it up to you to decide 
which I would choose!).

As David points out, PDBe-KB is incredibly useful for finding things like this 
(and is in fact what I tend to use as a first port of call), and the sorting is 
useful - though (moving away from my original question to the wonderful world 
of X-ray) I find that it’s still necessary to engage brain when looking at the 
results (e.g. for P38398, 4y2g (2.5Å, Rwork 0.215, Rfree 0.252) is #1 but 1t15 
(1.85Å, Rwork 0.206, Rfree 0.222) is at #6 - why? looking at the PDB-REDO 
entries is educational), so it’s challenging to write scripts that will scrape 
the whole DB and give the “best” model for each.
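
To make the ranking problem concrete, here is a minimal Python sketch that orders the two P38398 entries quoted above purely by resolution and then Rfree. The names Entry and naive_rank are hypothetical, made up for illustration; this is an assumption about what a simple scraping script might do, not how PDBe-KB sorts its list (which also weighs UniProt coverage and validation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record holding the numbers quoted in the message."""
    pdb_id: str
    resolution: float  # Angstrom; lower is better
    r_work: float
    r_free: float


def naive_rank(entries):
    """Sort by resolution, then Rfree (both ascending).

    Deliberately ignores UniProt coverage and validation, so it is
    only an illustration of a simplistic 'best model' script.
    """
    return sorted(entries, key=lambda e: (e.resolution, e.r_free))


# Values quoted above for P38398
entries = [
    Entry("4y2g", 2.5, 0.215, 0.252),
    Entry("1t15", 1.85, 0.206, 0.222),
]

ranked = naive_rank(entries)
for position, entry in enumerate(ranked, start=1):
    gap = entry.r_free - entry.r_work
    print(f"#{position} {entry.pdb_id}: {entry.resolution} A, "
          f"Rfree - Rwork = {gap:.3f}")
```

On these two entries the naive ordering puts 1t15 ahead of 4y2g, the reverse of what the PDBe-KB page shows, which is exactly why the results still need a human look.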

Harry


> Hi Harry,
> 
> A useful starting point when looking for the 'best' [insert criteria here] 
> structure in the PDB for a specific protein is to visit the PDBe-KB 
> aggregated views pages (https://pdbekb.org). These pages group PDB data based 
> upon UniProt accession and the 'structures' tab on these pages shows all the 
> available PDB entries containing this protein, as well as other resources 
> that provide structural data for this protein (including some 'new-fangled 
> predicted models'). For your UniProt ID, the relevant page is 
> https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.
> 
> The list of PDB entries is sorted to have the 'best' structure at the top 
> - in this case, weighted by a combination of UniProt coverage, resolution 
> (for X-ray/EM) and validation. It also displays information on resolution (if 
> applicable), any bound ligands etc. to give context that helps in choosing 
> a suitable structure.
> 
> As you mention, for your example these are all NMR entries containing similar 
> sized fragments of the full length UniProt sequence, so the ordering is 
> predominantly using validation data to sort these. Unfortunately, in your 
> case the most recent entry was before mandatory deposition of chemical 
> shifts, so you do not have the option of experimental validation which is now 
> available for recently deposited NMR entries. Therefore these are all ordered 
> based on geometric validation.
> 
> So, although there is no concrete answer to your question, the above process 
> should help in filtering the options.
> 
> Kind Regards,
> David

> On 3 May 2023, at 13:51, Randy John Read <rj...@cam.ac.uk> wrote:
> 
> Hi Harry,
> 
> My advice would be to use one of those new-fangled predicted models. You can 
> find a model in the AlphaFold database at the EBI 
> (https://alphafold.ebi.ac.uk/entry/P01132). If you look at it, there are 
> parts (likely corresponding to the constructs that were crystallised) that 
> look confidently predicted, connected by poorly-predicted loops. If you take 
> the PDB file and the PAE matrix, you can run process_predicted_model either 
> from Phenix or CCP4, which will give you individual files for the confident 
> parts of the full prediction that are likely to have the correct relative 
> orientations. (If you want to use the models for molecular replacement, 
> you’ll find that the least-confident parts are downweighted by being assigned 
> high B-factors, which is much better than having the best parts of the 
> models, with pLDDT near 100, downweighted the most by interpreting pLDDT as a 
> B-factor.) I would bet that these models will be more accurate than typical 
> NMR models.
> 
> Best wishes,
> 
> Randy
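
The point about pLDDT and B-factors can be sketched in a few lines of Python. The relation below (an RMSD estimate of 1.5 * exp(4 * (0.7 - pLDDT/100)), then B = 8 * pi^2 * rmsd^2 / 3) is, as far as I know, the kind of empirical conversion used by process_predicted_model, but the constants here should be treated as an assumption and checked against the Phenix or CCP4 documentation. The function name plddt_to_bfactor and the max_b cap are hypothetical.

```python
import math


def plddt_to_bfactor(plddt, max_b=999.0):
    """Convert a pLDDT value (0-100 scale) to a pseudo B-factor in A^2.

    Assumed empirical relation (not taken from the message above):
        rmsd = 1.5 * exp(4 * (0.7 - pLDDT / 100))
        B    = 8 * pi^2 * rmsd^2 / 3
    max_b is a hypothetical cap to keep values printable in PDB format.
    """
    if not 0.0 <= plddt <= 100.0:
        raise ValueError("pLDDT must be between 0 and 100")
    rmsd = 1.5 * math.exp(4.0 * (0.7 - plddt / 100.0))
    b_factor = 8.0 * math.pi ** 2 * rmsd ** 2 / 3.0
    return min(b_factor, max_b)


# Confident residues get low B-factors, poorly predicted ones get high B-factors
for plddt in (95.0, 70.0, 40.0):
    print(f"pLDDT {plddt:5.1f} -> B = {plddt_to_bfactor(plddt):6.1f} A^2")
```

The design point is the direction of the mapping: a high pLDDT gives a low B-factor, so the best parts of the model carry the most weight in molecular replacement, whereas reading the raw pLDDT column as a B-factor would do the opposite.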

> Hi Harry
>  
> First off I would look at quality metrics. For the structures from the same 
> paper you will need to check what the differences are in the paper and choose 
> what’s closest to what you need (experimental conditions etc.), assuming 
> they all have similar quality.
>  
> As a secondary priority you will want to look at the number of restraints 
> (and especially in the region of interest).
>  
> Note I would expect the more modern structures from CYANA and ARIA to be 
> better due to improved methodologies.
>  
> Regards
> Gary

> Hi, here is my small contribution.
> 
> The answer to your question depends quite dramatically on the intended use. 
> If you want the "best" structure you might want to see how many restraints 
> per residue were used and whether high-resolution restraints such as RDCs were 
> used. Recall that NMR structures are built using monomer libraries that are 
> seriously different from the X-ray ones. 
> 
> Best,
> 
> 
> E.
> 

> 
>> On 3 May 2023, at 11:45, Harry Powell 
>> <0000193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
>> 
>> Hi folks
>> 
>> I was wondering.
>> 
>> If there is a UniProt entry (for example, P01132, but there are plenty of 
>> others) for which I want the “best” (whatever that might mean) 
>> representative _experimental_ structure (i.e. not one of these new-fangled 
>> predicted models that some folk say have removed the need for actually doing 
>> experiments), but there are only NMR models - how do I choose?
>> 
>> I don’t mean “which model from the ensemble do I choose” - that’s a 
>> different question.
>> 
>> For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG, 
>> 1EPH, 1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same 
>> paper, so may be in different conditions (e.g. pH). All except the first 
>> (1A3P) cover the same bit of sequence.
>> 
>> Specifically, what should I look for in the downloadable files (mmCIF, for 
>> example) from the PDB?
>> 
>> Thoughts?
>> 
>> Harry
>> ########################################################################
>> 
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>> 
>> This message was issued to members of http://www.jiscmail.ac.uk/CCP4BB, a 
>> mailing list hosted by http://www.jiscmail.ac.uk/, terms & conditions are 
>> available at https://www.jiscmail.ac.uk/policyandsecurity/
> 
> -----
> Randy J. Read
> Department of Haematology, University of Cambridge
> Cambridge Institute for Medical Research     Tel: +44 1223 336500
> The Keith Peters Building
> Hills Road                                   E-mail: rj...@cam.ac.uk
> Cambridge CB2 0XY, U.K.                      www-structmed.cimr.cam.ac.uk
> 

