Hi folks, many thanks for the replies - all very helpful.

I can see that my example of P01132 might have been taken as a protein that I have a real current interest in - unfortunately, it isn’t (pace! to anyone who is working on malaria); it was just the first example I found that had no structures other than NMR. (For any of the 60,000-odd other UniProts in the PDB which have both NMR and X-ray structures, I’ll leave it up to you to decide which I would choose!)

As David points out, PDBe-KB is incredibly useful for finding things like this (and is in fact what I tend to use as a first port of call), and the sorting is useful. However (moving away from my original question to the wonderful world of X-ray), I find that it’s still necessary to engage brain when looking at the results. For P38398, for example, 4y2g (2.5Å, Rwork 0.215, Rfree 0.252) is #1 but 1t15 (1.85Å, Rwork 0.206, Rfree 0.222) is at #6 - why? Looking at the PDB-REDO entries is educational. So it’s challenging to write scripts that will scrape the whole DB and give the “best” model for each.

Harry

> Hi Harry,
>
> A useful starting point when looking for the 'best' [insert criteria here]
> structure in the PDB for a specific protein is to visit the PDBe-KB
> aggregated views pages (https://pdbekb.org). These pages group PDB data based
> upon UniProt accession and the 'structures' tab on these pages shows all the
> available PDB entries containing this protein, as well as other resources
> that provide structural data for this protein (including some 'new-fangled
> predicted models'). For your UniProt ID, the relevant page is
> https://www.ebi.ac.uk/pdbe/pdbe-kb/proteins/P01132/structures.
>
> The list of PDB entries is sorted to have the 'best' structure at the top
> - in this case, weighted by a combination of UniProt coverage, resolution
> (for X-ray/EM) and validation. It also displays information on resolution (if
> applicable), any bound ligands etc. to give this context to help in choosing
> a suitable structure.
>
> As you mention, for your example these are all NMR entries containing similar
> sized fragments of the full length UniProt sequence, so the ordering is
> predominantly using validation data to sort these. Unfortunately, in your
> case the most recent entry was before mandatory deposition of chemical
> shifts, so you do not have the option of experimental validation which is now
> available for recently deposited NMR entries. Therefore these are all ordered
> based on geometric validation.
>
> So, although there is no concrete answer to your question, the above process
> should help in filtering the options.
>
> Kind Regards,
> David
>
> On 3 May 2023, at 13:51, Randy John Read <rj...@cam.ac.uk> wrote:
>
> Hi Harry,
>
> My advice would be to use one of those new-fangled predicted models. You can
> find a model in the AlphaFold database at the EBI
> (https://alphafold.ebi.ac.uk/entry/P01132). If you look at it, there are
> parts (likely corresponding to the constructs that were crystallised) that
> look confidently predicted, connected by poorly-predicted loops. If you take
> the PDB file and the PAE matrix, you can run process_predicted_model either
> from Phenix or CCP4, which will give you individual files for the confident
> parts of the full prediction that are likely to have the correct relative
> orientations. (If you want to use the models for molecular replacement,
> you’ll find that the least-confident parts are downweighted by being assigned
> high B-factors, which is much better than having the best parts of the
> models, with pLDDT near 100, downweighted the most by interpreting pLDDT as a
> B-factor.) I would bet that these models will be more accurate than typical
> NMR models.
>
> Best wishes,
>
> Randy
>
> Hi Harry
>
> First off I would look at quality metrics.
> For the structures from the same paper you will need to check what the
> differences are in the paper and choose what’s closest to what you need
> (experimental conditions etc.), assuming they all have similar quality.
>
> As a secondary priority you will want to look at the number of restraints
> (especially in the region of interest).
>
> Note I would expect the more modern structures from CYANA and ARIA to be
> better due to improved methodologies.
>
> Regards
> Gary
>
> Hi, here is my small contribution.
>
> The answer to your question depends quite dramatically on the intended use.
> If you want the "best" structure you might want to see how many restraints
> per residue were used and whether high-resolution restraints such as RDCs
> were used.
> Recall that NMR structures are built using monomer libraries that are
> seriously different from the X-ray ones.
>
> Best,
>
> E.
>
>> On 3 May 2023, at 11:45, Harry Powell
>> <0000193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
>>
>> Hi folks
>>
>> I was wondering.
>>
>> If there is a UniProt entry (for example, P01132, but there are plenty of
>> others) for which I want the “best” (whatever that might mean)
>> representative _experimental_ structure (i.e. not one of these new-fangled
>> predicted models that some folk say have removed the need for actually doing
>> experiments), but there are only NMR models - how do I choose?
>>
>> I don’t mean “which model from the ensemble do I choose” - that’s a
>> different question.
>>
>> For P01132, for example, I could choose (from the PDB) 1A3P, 1EGF, 1EPG,
>> 1EPH, 1EPI, 1EPJ, 1GK5 or 3EGF. Note that some of these are from the same
>> paper, so may be in different conditions (e.g. pH). All except the first
>> (1A3P) cover the same bit of sequence.
>>
>> Specifically, what should I look for in the downloadable files (mmCIF, for
>> example) from the PDB?
>>
>> Thoughts?
>>
>> Harry
>> ########################################################################
>>
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>>
>> This message was issued to members of http://www.jiscmail.ac.uk/CCP4BB, a
>> mailing list hosted by http://www.jiscmail.ac.uk/, terms & conditions are
>> available at https://www.jiscmail.ac.uk/policyandsecurity/
>
> -----
> Randy J. Read
> Department of Haematology, University of Cambridge
> Cambridge Institute for Medical Research
> The Keith Peters Building
> Hills Road
> Cambridge CB2 0XY, U.K.
> Tel: +44 1223 336500
> E-mail: rj...@cam.ac.uk
> www-structmed.cimr.cam.ac.uk
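
A footnote to Harry's P38398 example at the top of the thread: a minimal sketch of why a script that ranks on resolution and Rfree alone will not reproduce the PDBe-KB ordering. The numbers are the ones quoted in the thread. The `Entry` class and the `naive_rank` function are hypothetical names made up for illustration, and the sort key is an assumption, not the PDBe-KB weighting (which, per David, also takes UniProt coverage and validation into account).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record holding the X-ray statistics quoted in the thread."""
    pdb_id: str
    resolution: float  # in Angstrom
    r_work: float
    r_free: float


def naive_rank(entries):
    """Sort by resolution, then Rfree (lower is better for both).

    This is an assumed, deliberately simplistic criterion: it ignores
    UniProt coverage and validation, which PDBe-KB also weighs.
    """
    return sorted(entries, key=lambda e: (e.resolution, e.r_free))


# The two P38398 entries mentioned in the thread, with the values given there.
entries = [
    Entry("4y2g", 2.5, 0.215, 0.252),
    Entry("1t15", 1.85, 0.206, 0.222),
]

ranked = naive_rank(entries)
for position, entry in enumerate(ranked, start=1):
    gap = entry.r_free - entry.r_work  # Rfree-Rwork gap, shown for context only
    print(f"#{position} {entry.pdb_id}: {entry.resolution} A, "
          f"Rfree {entry.r_free}, Rfree-Rwork {gap:.3f}")
```

On these numbers the naive sort puts 1t15 ahead of 4y2g, the reverse of the PDBe-KB order Harry reports. That is consistent with coverage and validation carrying real weight there, and it is a reminder that any scripted "best model" pick still needs a human look.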