Hi Harry,

A useful starting point when looking for the 'best' [insert criteria here] structure in the PDB for a specific protein is to visit the PDBe-KB aggregated views pages (https://pdbekb.org). These pages group PDB data based upon UniProt accession and the 'structures' tab on these pages shows all the available PDB entries containing this protein, as well as other resources that provide structural data for this protein (including some 'new-fangled predicted models'). For your UniProt ID, the relevant page is https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.

The list of PDB entries is sorted so that the 'best' structure appears at the top; in this case, the ranking is weighted by a combination of UniProt coverage, resolution (for X-ray/EM) and validation. The list also shows the resolution (if applicable), any bound ligands and so on, to give context that helps in choosing a suitable structure.
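To make the idea of a weighted ordering concrete, here is a minimal Python sketch of how such a ranking could be computed. It is not the actual PDBe-KB algorithm: the `Entry` class, the `score` and `rank` functions, the weights, the resolution transform and the example values are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    label: str                   # placeholder label, not a real PDB ID
    coverage: float              # fraction of the UniProt sequence covered (0..1)
    resolution: Optional[float]  # in angstroms; None when not applicable (e.g. NMR)
    validation: float            # hypothetical summary score (0..1, higher is better)


def score(e: Entry, w_cov: float = 0.5, w_res: float = 0.25, w_val: float = 0.25) -> float:
    """Combine coverage, resolution and validation into one number (illustrative weights)."""
    if e.resolution is None:
        # No resolution (e.g. NMR): rank on coverage and validation only,
        # renormalising so the remaining weights sum to 1.
        return (w_cov * e.coverage + w_val * e.validation) / (w_cov + w_val)
    # Smaller resolution values are better, so map them to a term that decreases.
    res_term = 1.0 / (1.0 + e.resolution)
    return w_cov * e.coverage + w_res * res_term + w_val * e.validation


def rank(entries: List[Entry]) -> List[Entry]:
    """Return the entries sorted with the highest score first."""
    return sorted(entries, key=score, reverse=True)


# Made-up NMR-like entries covering the same fragment, so only validation differs.
entries = [
    Entry("entry_a", coverage=0.04, resolution=None, validation=0.6),
    Entry("entry_b", coverage=0.04, resolution=None, validation=0.9),
    Entry("entry_c", coverage=0.04, resolution=None, validation=0.3),
]
ranked_labels = [e.label for e in rank(entries)]
```

When every candidate has the same coverage and no resolution, as in this made-up set, the order is decided by the validation term alone.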

As you mention, for your example these are all NMR entries containing similar sized fragments of the full length UniProt sequence, so the ordering is driven predominantly by validation data. Unfortunately, in your case the most recent entry was deposited before deposition of chemical shifts became mandatory, so the validation against experimental data that is now available for recently deposited NMR entries is not an option here. These entries are therefore all ordered based on geometric validation.

So, although there is no concrete answer to your question, the above process should help in filtering the options.

Kind Regards,
David

On 03/05/2023 11:45, Harry Powell wrote:
Hi folks

I was wondering.

If there is a UniProt entry (for example, P01132, but there are plenty of 
others) for which I want the “best” (whatever that might mean) representative 
_experimental_ structure (i.e. not one of these new-fangled predicted models 
that some folk say have removed the need for actually doing experiments), but 
there are only NMR models - how do I choose?

I don’t mean “which model from the ensemble do I choose” - that’s a different 
question.

For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG, 1EPH, 
1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same paper, so 
may be in different conditions (e.g. pH). All except the first (1A3P) cover the 
same bit of sequence.

Specifically, what should I look for in the downloadable files (mmCIF, for 
example) from the PDB?

Thoughts?

Harry
########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/

--
David Armstrong
Outreach and Training Lead
PDBe
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD UK
