And if the actual residue ID is ambiguous? UNK is exactly what you should use. There's also a distinction between getting it to work in the refinement program and having it properly annotated in the PDB - e.g. I've encountered some monomer inconsistencies between Coot and Phenix.
The RCSB ligand page for UNK is strange and isn't going to reduce the confusion:
https://www.rcsb.org/ligand/UNK
but is consistent with them allowing Cgamma in UNK residues. But from:
https://www.wwpdb.org/documentation/procedure

"Use of UNX/UNL/UNK

There are times when an amino acid residue, nucleotide, atom, or ligand is unidentified. These ligand codes should be used in the following cases:

* UNX: unknown atom or ion
* UNL: unknown ligand
* UNK: unknown amino acid
* N: unknown nucleotide

UNX
UNX is the code for one atom or ion, by itself, when author does not know the identity of that atom or ion. NOTE: The ligand name is UNX, but the atom name is UNK. The atom type is "X".

UNL
UNL is the code for unknown ligand. This is for where author has added atoms to the coordinates to satisfy the electron density but the true ligand identity is not known. For example, see PDB entry 3MHO.

UNK
UNK is the code for unknown amino acid only. For example, a poly-ALA or poly-GLY chain would be processed as poly-UNK, if the author does not know how the coordinates align with the sequence and the residue numbering is arbitrary. The sequence would be poly UNK and the residues in coordinates would be listed as UNK. The sequence, if it is known, may be listed in the REMARK 999 and its mmCIF tokens. (If the authors do know the alignment of sequence and coordinates, the poly-ALA or poly-GLY residues should be changed to match the sequence). The atom names of UNK are N,CA,CB,CG,O,C, and the atom types are N,C,C,C,O,C. We are aware of issue regarding where atom passed CG but amino acid identity is not known, and the issues for the break in electron density between UNK residues."

So in the case I indicated - Nyv1 - I was using it in the way described above - I can see backbone but there's a 1 residue ambiguity as to the residue assignment.
I "voted" by giving it a residue number but actual assignment was uncertain because the segment overlaps upon itself by symmetry, blurring the already unexceptional electron density. This is the only time I've used it but IMHO the correct usage - using an actual residue like ALA assigns the wrong sequence to the residue in most cases. I do not remember if UNK is successfully refined in phenix (I might have used ALA) but it would take you 10 seconds in a text editor to make the change if you were using PDB format.

Phil Jeffrey
Princeton

________________________________
From: CCP4 bulletin board <[email protected]> on behalf of Nicholas Larsen <[email protected]>
Sent: Saturday, February 13, 2021 8:24 PM
To: [email protected] <[email protected]>
Subject: Re: [ccp4bb] Bug in mmCIF handling of UNK residues?

I hope this doesn't confuse the discussion, but my understanding was "UNK" stood for "unknown" residue and this will cause errors. UNK naming convention is the default output of Schrodinger when generating ligand PDB files. Coot will display the PDB containing "UNK" as a residue, but if you try to use the CIF file to real-space refine, the ligand will blow up. I found that renaming the residue in the output PDB and regenerating the CIF file with the corrected RESID name solved the problem. So in my experience, the problem is the name "UNK" and this just needs to be switched to something else. Has anyone else seen this?

Nick

On Sat, Feb 13, 2021 at 4:29 PM Tristan Croll <[email protected]> wrote:

Browsing backwards through a dozen or so of the most recent UNK-containing structures, I haven't found a counter-example yet - apart from those where the UNK residues are a single contiguous stretch and given their own chain ID. So a recent problem?

________________________________
From: Philip D. Jeffrey <[email protected]>
Sent: 12 February 2021 19:56
To: [email protected] <[email protected]>; Tristan Croll <[email protected]>
Subject: Re: Bug in mmCIF handling of UNK residues?

Doesn't seem to be the case for all instances: that table isn't present in 5BV0 despite the N-terminal residues of Nyv1 being modeled as UNK in the Vps16:Vps33:Nyv1 complex due to a symmetry overlap. Nyv1, C165-179 are UNK with partial occupancy, which is the N-terminal part of the model for that chain, then there's a gap of 3 missing residues, and then there's polypeptide model which we've assigned to sequence, all in the same chain.

Not sure if the missing table is something that turned up after the deposition date (June 2015), or if it's related to the missing residues between the UNK segment and the defined amino acids.

Phil Jeffrey
Princeton

________________________________
From: CCP4 bulletin board <[email protected]> on behalf of Tristan Croll <[email protected]>
Sent: Friday, February 12, 2021 1:03 PM
To: [email protected] <[email protected]>
Subject: [ccp4bb] Bug in mmCIF handling of UNK residues?

Hi all,

If I open (as far as I can tell) the mmCIF for any structure in the wwPDB that contains both defined amino acids and UNK in the same chain, then the UNK section is treated as covalently bonded to the flanking sequence.
This appears to be a bug in the mmCIF generation itself, not in the viewing software (ChimeraX, in this case): if I look in 7kzn as a random example, I see:

loop_
_pdbx_validate_polymer_linkage.id
_pdbx_validate_polymer_linkage.PDB_model_num
_pdbx_validate_polymer_linkage.auth_atom_id_1
_pdbx_validate_polymer_linkage.auth_asym_id_1
_pdbx_validate_polymer_linkage.auth_comp_id_1
_pdbx_validate_polymer_linkage.auth_seq_id_1
_pdbx_validate_polymer_linkage.PDB_ins_code_1
_pdbx_validate_polymer_linkage.label_alt_id_1
_pdbx_validate_polymer_linkage.auth_atom_id_2
_pdbx_validate_polymer_linkage.auth_asym_id_2
_pdbx_validate_polymer_linkage.auth_comp_id_2
_pdbx_validate_polymer_linkage.auth_seq_id_2
_pdbx_validate_polymer_linkage.PDB_ins_code_2
_pdbx_validate_polymer_linkage.label_alt_id_2
_pdbx_validate_polymer_linkage.dist
1 1 C X UNK 345 ? ? N X UNK 348 ? ? 10.08
2 1 C X UNK 396 ? ? N X UNK 403 ? ? 28.65
3 1 C Y UNK 281 ? ? N Y UNK 284 ? ? 6.72
4 1 C Y UNK 387 ? ? N Y UNK 394 ? ? 22.26
5 1 C Y UNK 420 ? ? N Y UNK 424 ? ? 12.82

Considering that the coords themselves generally seem fine, I guess this is happening post deposition?

Best regards,

Tristan

________________________________

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

--
Nick Larsen
Director - Lead Discovery
H3 Biomedicine | an Eisai Oncology Company
300 Technology Sq #5
Cambridge, MA 02139
https://www.h3biomedicine.com/
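
For anyone wanting to check other entries for the same symptom, here is a minimal Python sketch that reads the data rows of the `_pdbx_validate_polymer_linkage` loop quoted from 7kzn in this thread and lists the C-N "linkages" that are too long to be a peptide bond. The 2.0 A cutoff, the `Linkage` type and the function names are my own assumptions for illustration (a real peptide C-N bond is roughly 1.33 A); it only splits whitespace-separated rows in the column order shown in the thread and is not a general mmCIF parser.

```python
from typing import List, NamedTuple

# Data rows copied from the 7kzn _pdbx_validate_polymer_linkage loop in this thread.
ROWS = """\
1 1 C X UNK 345 ? ? N X UNK 348 ? ? 10.08
2 1 C X UNK 396 ? ? N X UNK 403 ? ? 28.65
3 1 C Y UNK 281 ? ? N Y UNK 284 ? ? 6.72
4 1 C Y UNK 387 ? ? N Y UNK 394 ? ? 22.26
5 1 C Y UNK 420 ? ? N Y UNK 424 ? ? 12.82
"""

# Assumption: generous cutoff in Angstrom; a real peptide C-N bond is ~1.33 A.
MAX_PEPTIDE_BOND = 2.0


class Linkage(NamedTuple):
    """One row of the linkage table (hypothetical helper type)."""
    chain: str
    comp_1: str
    seq_1: int
    comp_2: str
    seq_2: int
    dist: float


def parse_linkages(text: str) -> List[Linkage]:
    """Parse rows in the 15-column order listed in the loop_ header."""
    links = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 15:
            continue  # skip blank or malformed lines
        links.append(Linkage(
            chain=fields[3],       # auth_asym_id_1
            comp_1=fields[4],      # auth_comp_id_1
            seq_1=int(fields[5]),  # auth_seq_id_1
            comp_2=fields[10],     # auth_comp_id_2
            seq_2=int(fields[11]), # auth_seq_id_2
            dist=float(fields[14]),
        ))
    return links


def broken_linkages(links: List[Linkage],
                    cutoff: float = MAX_PEPTIDE_BOND) -> List[Linkage]:
    """Return linkages whose C-N distance exceeds the cutoff."""
    return [link for link in links if link.dist > cutoff]


for link in broken_linkages(parse_linkages(ROWS)):
    skipped = link.seq_2 - link.seq_1 - 1  # residue numbers missing in between
    print(f"chain {link.chain}: {link.comp_1} {link.seq_1} -> "
          f"{link.comp_2} {link.seq_2}, {link.dist:.2f} A, "
          f"{skipped} residue number(s) skipped")
```

Every row in the quoted table spans a jump in residue numbering, which fits the idea that these are gaps in the model being reported as polymer linkages rather than real bonds.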
