Hi Harry,
A useful starting point when looking for the 'best' [insert criteria
here] structure in the PDB for a specific protein is to visit the
PDBe-KB aggregated views pages (https://pdbekb.org). These pages group
PDB data based upon UniProt accession and the 'structures' tab on these
pages shows all the available PDB entries containing this protein, as
well as other resources that provide structural data for this protein
(including some 'new-fangled predicted models'). For your UniProt ID,
the relevant page is
https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.
The list of PDB entries is sorted to have the 'best' structure at the
top - in this case, weighted by a combination of UniProt coverage,
resolution (for X-ray/EM) and validation. It also displays the
resolution (if applicable), any bound ligands etc., to give some
context to help in choosing a suitable structure.
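If you would rather pull the same kind of list into a script than read it
off the web page, the following is a minimal sketch. It assumes the PDBe
REST API exposes a 'best_structures' mapping endpoint at the URL below, that
the response is keyed by UniProt accession, and that each entry carries the
field names used here (pdb_id, chain_id, experimental_method, coverage,
resolution). None of these details are stated in this email, so please check
them against the PDBe API documentation; whether the API ordering matches
the ordering on the 'structures' tab is also an assumption.

```python
import json
from urllib.request import urlopen

# Assumption: PDBe REST API endpoint for the ranked entry list of a UniProt
# accession. It is not named in this email; check the PDBe API documentation.
BEST_STRUCTURES_URL = (
    "https://www.ebi.ac.uk/pdbe/api/mappings/best_structures/{accession}"
)


def fetch_best_structures(accession):
    """Download the entry list for one UniProt accession (needs network)."""
    url = BEST_STRUCTURES_URL.format(accession=accession)
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    # Assumption: the JSON is keyed by accession and maps to a list of dicts.
    return payload.get(accession, [])


def keep_method(entries, keyword):
    """Keep entries whose experimental method contains keyword (any case)."""
    keyword = keyword.lower()
    return [
        e for e in entries
        if keyword in str(e.get("experimental_method", "")).lower()
    ]


def summarise(entries):
    """Reduce each entry to (PDB ID, chain, method, coverage, resolution)."""
    # Assumption: these field names match what the API returns; absent
    # fields come back as None rather than raising an error.
    fields = ("pdb_id", "chain_id", "experimental_method", "coverage", "resolution")
    return [tuple(e.get(f) for f in fields) for e in entries]
```

For example, summarise(keep_method(fetch_best_structures("P01132"), "nmr"))
should give one tuple per NMR chain, in whatever order the service returns
them. The sketch deliberately does not re-sort anything, since the exact
weighting used for the ranking is not something I have spelled out here.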
As you mention, for your example these are all NMR entries containing
similar-sized fragments of the full-length UniProt sequence, so the
ordering predominantly relies on validation data to sort these.
Unfortunately, in your case the most recent entry was deposited before
deposition of chemical shifts became mandatory, so you do not have the
option of experimental validation, which is now available for recently
deposited NMR entries. Therefore these are all ordered based on geometric
validation.
So, although there is no concrete answer to your question, the above
process should help in filtering the options.
Kind Regards,
David
On 03/05/2023 11:45, Harry Powell wrote:
Hi folks
I was wondering.
If there is a UniProt entry (for example, P01132, but there are plenty of
others) for which I want the “best” (whatever that might mean) representative
_experimental_ structure (i.e. not one of these new-fangled predicted models
that some folk say have removed the need for actually doing experiments), but
there are only NMR models - how do I choose?
I don’t mean “which model from the ensemble do I choose” - that’s a different
question.
For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG, 1EPH,
1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same paper, so
may be in different conditions (e.g. pH). All except the first (1A3P) cover the
same bit of sequence.
Specifically, what should I look for in the downloadable files (mmCIF, for
example) from the PDB?
Thoughts?
Harry
########################################################################
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/
--
David Armstrong
Outreach and Training Lead
PDBe
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD UK