Good afternoon both, there is also the issue of inconsistency of presentation.
For example, lysine: L-lysine (LYS) is protonated on the side-chain nitrogen (NZ), whereas D-lysine (DLY) is not, i.e. you have NZ(HZ1, HZ2) for DLY and NZ(HZ1, HZ2, HZ3) for LYS.

Miri

On Wed, 2015-06-24 at 13:35 +0100, Ian Tickle wrote:
> Hi Ben
>
> From discussions we have had with PDBe they consider tautomers to be
> different compounds (just as stereoisomers would be considered to be
> different compounds), since they require different restraint
> dictionaries, so each tautomer that was observed would require a
> unique 3-letter code. Of course you still have to have evidence
> (e.g. from the H-bonding pattern) that what you are really seeing are
> different tautomers, but that's a different question.
>
> Cheers
>
> -- Ian
>
> On 24 June 2015 at 12:50, Ben Bax <benjamin.d....@gsk.com> wrote:
> Another major problem with the PDB is that it does not seem to
> believe in the existence of different tautomers or protonation
> states.
>
> For example the ATP analogue AMPPNP can have the nitrogen
> between the beta and gamma phosphates protonated (-P-NH-P) or
> unprotonated (P-N=P), and there are well documented examples
> of both tautomers in the PDB (NH being a hydrogen bond donor
> and N a hydrogen bond acceptor).
> If you look in the CSD you can see that the protonation state
> of the nitrogen changes the geometry of the P-N-P bond.
>
> However, as I understand it, the PDB considers all tautomeric
> (and protonated) forms of AMPPNP the same. When I tried to
> deposit a specific AMPPNP tautomer in 2013, they would not
> accept it. The PDB also seems to believe, as I understand it,
> that the overall charge on AMPPNP is zero and that the
> phosphates do not carry negative charge.
>
> Ben Bax
> Senior Scientific Investigator
> BioMolecular Sciences UK
> RD Platform Technology & Science
>
> GSK
> Medicines Research Centre, Gunnels Wood Road, Stevenage, SG1 2NY, UK
> Email benjamin.d....@gsk.com
> Mobile +44 (0) 7912 600604
> Tel +44 (0) 1438 55 1156
>
> -----Original Message-----
> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Martyn Symmons
> Sent: 22 June 2015 23:39
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
>
> Well the problem is there is a lot more to a ligand than PDB
> coordinates - little things like bond orders... In addition
> people can publish ligands with atoms for which they have no
> density - so zero-occupancy is allowed too. So who should get
> priority - the group who publishes a ligand first, or the ones
> who actually have density for all the atoms?
>
> These sorts of complications mean we all benefit from
> peer review of the structure - that is why we put things on
> hold. And authors should have a chance to change their ligand
> definition based on reviewers' comments - just as they are
> allowed to improve the PDB coordinates. So it is a worry for
> them that the PDB might 'publish' the ligand aspect of their
> work before they have completed the peer-review process.
>
> Maybe you don't believe in peer review - in reply to which I'd
> paraphrase what people say about democracy - it's pretty bad
> but better than the alternatives.
>
> But to return to the point I made: what really is the problem
> with maintaining and modifying _separate_ definitions with
> authors' _separate_ deposited coordinates (and bond orders) while
> structures are on hold and being reviewed? Journals manage to
> keep all those submitted papers separate in their databases.
>
> cheers
> M.
>
> On Mon, Jun 22, 2015 at 3:12 AM, Edward A.
Berry > <ber...@upstate.edu> wrote: > >> I can't imagine a journal doing that can you? When I > work on my > >> supplementary material in a paper I don't expect that the > journal > >> will take a bit out and publish it separately to support > the work of > >> my competitors. Not out of spite that I was beaten - but > because I > >> don't want to take the responsibility for checking their > science for them! > > > > > > I don't see the problem here. What about the dozens of > authors who > > will benefit from using your ligand in their structure > _after_ your > > structure comes out? You don't take responsibility for > checking their > > science. Every author gets a copy of his final structure to > check > > before it is released and each is responsible for his own. > > The only difference here is whether the competitor got to > use it > > first, (which might sting a bit) or only after you had > already made it > > your own with the first structure. > > > > I guess the ligand database is the responsibility of the > pdb, but they > > depend on first depositors to help set up each ligand, so it > is not > > surprising if the type model has coordinates from the first > > depositor's structure (although it would be convenient if > they were > > all moved to c.o.m. at 0,0,0). When another group publishes > a > > structure with the ligand, they will not be publishing the > first > > depositor's coordinates because the ligand will be moved to > its > > position in their structure and refined against their data, > probably > > with somewhat different restraints. > > > > If the ligand is a top secret novel drug lead that your > company is > > developing I guess it would come as a shock to find someone > else has > > already deposited it, and it might be good to hasten not > the > > publication but protecting of the compound with a patent! 
> > > > Although Miriam says a new 3-letter code is generated when > no match is > > found, I believe the depositor's code will be used if it is > available, > > at least one of mine was last year, so there is some use for > Nigel's > > utility if you want to stamp your new compound with a > rememberable name. > > > > eab > > > > > > On 06/21/2015 06:33 PM, Martyn Symmons wrote: > >> > >> Miri raises important points about issues in the PDB > Chemical > >> Component Dictionary - I think part of the problem is that > this is > >> published completely separately from the actual PDB - so > for example > >> I don't think we have an archive of the CCD for comparison > alongside > >> the PDB snapshots? This makes it difficult to follow the > convoluted > >> track of particular ligands through the PDB's many,many > changes to > >> small molecule definitions. > >> > >> But following discussion with other contributors offline I > want to > >> make it clear what is my understanding of the ZA3 > (2Y2I /2Y59) case: > >> > >> I am clear there was no unethical behaviour by either group > in the > >> course of their work on these structures and the > publication of them. > >> > >> The problem I am highlighting is that the PDB don't > understand > >> publishing ethics - what happened in ZA3 was that they > published a > >> little bit of one group's work to support the work of > someone who was > >> scooping them! > >> > >> I can't imagine a journal doing that can you? When I > work on my > >> supplementary material in a paper I don't expect that the > journal > >> will take a bit out and publish it separately to support > the work of > >> my competitors. Not out of spite that I was beaten - but > because I > >> don't want to take the responsibility for checking their > science for them! 
> >> > >> All the best > >> Martyn > >> > >> Cambridge > >> > >> On Sun, Jun 21, 2015 at 7:01 PM, Miri Hirshberg > >> <000002897e8e9f0f-dmarc-requ...@jiscmail.ac.uk> wrote: > >>> > >>> Sun., June 21st 2015 > >>> > >>> Good evening, > >>> > >>> adding several general points to the thread. > >>> > >>> (1) Fundamentally PDB unlike other chemical databases > insists that > >>> all equal structures should have the same 3-letter code > and the same > >>> atom names - obviously for amino acids and say ATP. > >>> > >>> (1.1) Needless to say there are endless examples in the > PDB of two > >>> ligands differ by let say one hydroxyl group, where > equivalent atoms > >>> in the two ligands having totally different names. > >>> > >>> (2) When a structure is deposited with a ligand, the > ligand is first > >>> compared against PDB chem_comp database (CCD) and against > the > >>> on-hold chem_comp (CCD) (naturally the latter is not > publicly > >>> available), and only if no-match can be found a new > three-letter > >>> code is generated and assigned. > >>> > >>> If not, then this is a mistake in annotation and should > not happen. > >>> > >>> (3) Exception to the above take several different > flavours. This > >>> include: > >>> > >>> (3.1) When the same ligand is described in PDB as a > 3-letters-code > >>> and as well as a combination of two different > 3-letters-code ligands. > >>> An example out of many is phosphoserine. The 3-letter-code > in PDB > >>> CCD is SEP which is used in 704 PDB entries (RCSB > counting > >>> 21-June-2015). But in the PDB entry 3uw2 the phosphoserine > 109A is > >>> described as a combination of SER and the inorganic > phosphate PO4 !!! > >>> (a side point: note the inorganic PO4 became organic upon > this > >>> linkage - a PDB chemical conundrum!!). 
> >>> > >>> (3.2) CCDC does not make any attempt to standardise atom > names nor > >>> to match same structures to have equal atom names - > original author > >>> atom names are kept so that amino acids may have bizarre > atom names > >>> and where required symmetry atom names are generated - > this is rare > >>> in the PDB but not unknown, and the PDB is poor at > completing > >>> atom/ligand names where symmetry is required and in fact > often is > >>> not completed in any chemical reasonable sense as this > would require changes in occupancy. > >>> > >>> The simplest case is in racemic PDB entries where the > symmetry > >>> generated structure for say L-ALA should be the D-version > DAL, but > >>> PDB as is, has not coped with it, as it would require two > sets of > >>> coordinates each at say 1/2 occupancy (usually). > >>> > >>> One of several examples in the PDB archive is pdb entry > 3e7r. The > >>> Xray structure of Racemic Plectasin. The entry consists of > one > >>> protein chain, in SPG P-1. > >>> > >>> In the manuscript > >>> http://onlinelibrary.wiley.com/doi/10.1002/pro.127/pdf > >>> > >>> Figure 3a, for example shows Crystal packing. > >>> (a) Centrosymmetric P-1 unit cell. The L-plectasin > molecule is shown > >>> in blue and the D-plectasin molecule is in gold. > >>> > >>> But if you use the PDB entry, and the symmetry operator of > P-1 to > >>> generate the two symmetry related mates in the unit cell > you will > >>> get a chain with L- naming residues > >>> GLY-PHE-GLY-CYS-ASN-GLY-PRO-........ etc representing D- > amino > >>> acids. > >>> (GLY is a special case). > >>> > >>> (3.3) There is also the problem in assigning a 3-letter > code where > >>> the submission has obviously assigned the wrong chirality. > One > >>> example is a where the sugar must be NAG but is assigned > NGA in a > >>> glycopeptide where NGA is impossible - the PDB should have > assigned > >>> NAG with a CAVEAT that the chirality is incorrect. 
Note, > >>> re-refinement by other software will require a > bond-breakage. > >>> NGA is used in 90 entries (RCSB counting 21-June-2015) > >>> > >>> regards Miri > >>> > >>> > >>> > >>> > >>>>> From: Yong Wang <wang_yon...@lilly.com> > >>>>> Reply-to: Yong Wang <wang_yon...@lilly.com> > >>>>> To: CCP4BB@JISCMAIL.AC.UK > >>>>> Subject: Re: [ccp4bb] FW: New ligand 3-letter code > (help-7071) > >>>>> Date: Sat, 20 Jun 2015 18:36:34 +0000 > >>>>> > >>>>> Sharing a ligand name should only be limited to having > the same > >>>>> compound, i.e. same 2D structure or connectivity. Each > deposition > >>>>> should have its own 3D coordinates. If a different > publication > >>>>> gets your ligand 3D coordinates ("2Y59 actually embodies > the > >>>>> atomic coordinates from the 2Y2I"), that looks to me an > oversight > >>>>> by PDB. It is hard to believe that PDB intended to use > the 3D > >>>>> coordinates from one entry for the other, ligand or not. > In fact, > >>>>> the restraints as described by the ligand dictionary > should also be kept separate as that reflects how the authors > refine their ligand. > >>>>> > >>>>> Yong > >>>>> > >>>>> -----Original Message----- > >>>>> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] > On Behalf > >>>>> Of Martyn Symmons > >>>>> Sent: Friday, June 19, 2015 8:39 PM > >>>>> To: CCP4BB@JISCMAIL.AC.UK > >>>>> Subject: Re: [ccp4bb] FW: New ligand 3-letter code > (help-7071) > >>>>> > >>>>> By oversimplifying the situation here the PDB does not > answer my > >>>>> related point about competing crystallographers: > >>>>> My scenario: > >>>>> > >>>>> Group A deposits structure with new drug - gets their > three-letter > >>>>> code for example ZA3 they then get to check the > coordinates and > >>>>> chemical definition of this ligand. > >>>>> > >>>>> But suppose a little after that a competing group B > deposits their > >>>>> structure with the same drug which they think is novel - > but no... 
> >>>>> they get assigned the now described ZA3 which has been checked by
> >>>>> the other group.
> >>>>>
> >>>>> Then it is a race to see who gets to publish and release first.
> >>>>> And if it is the second group B who wins then they are publishing
> >>>>> the work of their A competitors - who have done the depositing and
> >>>>> checking of the ligand description.
> >>>>>
> >>>>> Sounds unlikely? Well, it actually happened in 2011 for my exact
> >>>>> example ZA3 - present in 2Y2I and in 2Y59 from competing groups.
> >>>>>
> >>>>> From the dates in the mmcif it was 2Y2I depositors who set up
> >>>>> and had a chance to review the description of ZA3 ligand. Only to
> >>>>> see it released a week before their crystal structure, when their
> >>>>> ZA3 appeared to accompany competing 2Y59! It is amazing that the
> >>>>> PDB did not spot this and arrange a suitable workaround.
> >>>>>
> >>>>> Just to check:
> >>>>> mmcif for ZA3 shows it was created for 2Y2I:
> >>>>> ...
> >>>>> _chem_comp.pdbx_model_coordinates_db_code 2Y2I
> >>>>> ...
> >>>>> But it was modified for release:
> >>>>> ...
> >>>>> _chem_comp.pdbx_modified_date 2011-07-22
> >>>>> ...
> >>>>> corresponding to the early 2011-07-27 release date of the competing
> >>>>> structure: 2Y59 even though this PDB was _deposited_ second.
> >>>>>
> >>>>> The ZA3 ligand definition released with 2Y59 actually embodies the
> >>>>> atomic coordinates from the 2Y2I structure:
> >>>>>
> >>>>> <mmcif>
> >>>>> ZA3 O6 O6 O 0 1 N N N 8.279 7.165 40.963 0.311 -1.061 -0.920 O6 ZA3 1
> >>>>> ZA3 C5 C5 C 0 1 N N N 9.132 8.047 40.908 0.147 -0.205 -0.073 C5 ZA3 2
> >>>>> ...
> >>>>> <PDB 2Y2I>
> >>>>> HETATM 3598 O6 ZA3 A1000 8.279 7.165 40.963 1.00 41.25 O
> >>>>> HETATM 3599 C5 ZA3 A1000 9.132 8.047 40.908 1.00 63.20 C
> >>>>> ...
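The coordinate match in the excerpt above can also be checked mechanically rather than by eye. The following is only a minimal Python sketch: the rows are copied from the excerpt, the column positions (model x, y, z as fields 10 to 12 of the ligand definition row, and fields 6 to 8 of the whitespace-split HETATM row) are assumptions read off those rows, and the helper names are hypothetical. It is not a general mmCIF or PDB parser.

```python
# Rows copied from the excerpt above. Column positions are assumptions
# read off these whitespace-separated rows, not a general parser.
CCD_ROWS = [
    "ZA3 O6 O6 O 0 1 N N N 8.279 7.165 40.963 0.311 -1.061 -0.920 O6 ZA3 1",
    "ZA3 C5 C5 C 0 1 N N N 9.132 8.047 40.908 0.147 -0.205 -0.073 C5 ZA3 2",
]
PDB_ROWS = [
    "HETATM 3598 O6 ZA3 A1000 8.279 7.165 40.963 1.00 41.25 O",
    "HETATM 3599 C5 ZA3 A1000 9.132 8.047 40.908 1.00 63.20 C",
]


def ccd_model_xyz(row):
    """Return (atom name, model x/y/z) from a ligand definition atom row."""
    fields = row.split()
    return fields[1], tuple(float(v) for v in fields[9:12])


def pdb_xyz(row):
    """Return (atom name, x/y/z) from a whitespace-split HETATM row."""
    fields = row.split()
    return fields[2], tuple(float(v) for v in fields[5:8])


def matching_atoms(ccd_rows, pdb_rows, tol=1e-3):
    """List atom names whose model coordinates equal the entry coordinates."""
    entry = dict(pdb_xyz(r) for r in pdb_rows)
    matches = []
    for name, xyz in (ccd_model_xyz(r) for r in ccd_rows):
        other = entry.get(name)
        if other is not None and all(abs(a - b) <= tol for a, b in zip(xyz, other)):
            matches.append(name)
    return matches


print(matching_atoms(CCD_ROWS, PDB_ROWS))  # ['O6', 'C5']
```

Both atoms shown in the excerpt match to three decimal places, which is what one would expect if the model coordinates were taken directly from 2Y2I.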
> >>>>> > >>>>> Surely a better approach would be to allow both groups a > chance to > >>>>> work through and sign off on independent ligand > descriptions? > >>>>> > >>>>> Then whoever releases first would release both a novel > structure > >>>>> and the ligand definition _they_ deposited and checked. > Their > >>>>> priority can then be asserted and the other group > contacted to ask > >>>>> if they agree to accept this definition. This also has > the > >>>>> advantage of better confidentiality pre-publication. > >>>>> > >>>>> Another problem from any cross-linking of definitions is > that say > >>>>> group A are motivated by reviewers' reports to change > the > >>>>> definition of ZA3 pre-release. Well now the change > impinges on the > >>>>> chemical meaning of other group B's deposited structure. > For example ZA3 mmcif has a statement: > >>>>> > >>>>> ZA3 "Modify aromatic_flag" 2011-06-04 RCSB > >>>>> > >>>>> so this change was pre-release - but we cannot be sure > what > >>>>> motivated this - whether it was signed off by the 2Y2I > authors or > >>>>> the 2Y59 authors (or both?).... > >>>>> > >>>>> With the accelerating pace of drug discovery for sure > this sort of > >>>>> uncertainty is going to happen again.Unless the PDB have > changed > >>>>> their practice for ligand deposition? > >>>>> > >>>>> All the best > >>>>> Martyn > >>>>> > >>>>> Cambridge. > >>>>> > >>>>> On Fri, Jun 19, 2015 at 1:49 PM, Sheriff, Steven > >>>>> <steven.sher...@bms.com> wrote: > >>>>>> > >>>>>> All: > >>>>>> > >>>>>> > >>>>>> > >>>>>> Since the original query was cross-posted on both the > COOT > >>>>>> mailing list and the CCP4BB Rachel Green gave me > permission to > >>>>>> forward this to both. She provides links about the > mechanism of > >>>>>> assignment of 3-letter codes. In the third link below, > my > >>>>>> original suggestion to the COOT mailing list that one > could just > >>>>>> use UNK is incorrect as that is reserved for unknown > amino acids. 
> >>>>>> According to this document, I should have suggested UNL > for an > >>>>>> unknown ligand. > >>>>>> > >>>>>> > >>>>>> > >>>>>> Steven > >>>>>> > >>>>>> > >>>>>> > >>>>>> From: Rachel Kramer Green > [mailto:kra...@rcsb.rutgers.edu] > >>>>>> Sent: Tuesday, June 16, 2015 10:21 AM > >>>>>> To: Sheriff, Steven > >>>>>> Cc: info > >>>>>> Subject: Re: New ligand 3-letter code (help-7071) > >>>>>> > >>>>>> > >>>>>> > >>>>>> Dear Steven, > >>>>>> > >>>>>> During annotation of ligands, all chemical components > present in > >>>>>> the structure are compared against the definitions in > the > >>>>>> Chemical Component Dictionary > (http://www.wwpdb.org/data/ccd). If > >>>>>> the ligand is not in the dictionary, a three letter > code is > >>>>>> assigned. See > >>>>>> > http://www.wwpdb.org/documentation/policy#toc_assignment. In > the > >>>>>> future, a group of three-letter codes may be set aside > to be used during refinement to flag new ligands. > >>>>>> > >>>>>> Clarification about the ligand ids assignment and in > particular > >>>>>> the usage of UNX/UNL/UNK residues can be found at > >>>>>> http://www.wwpdb.org/documentation/procedure#toc_2. > >>>>>> > >>>>>> Best wishes, > >>>>>> Rachel > >>>>>> > >>>>>> > >>>>>> > >>>>>> ________________________________ > >>>>>> > >>>>>> Rachel Kramer Green, Ph.D. > >>>>>> > >>>>>> RCSB PDB > >>>>>> > >>>>>> kra...@rcsb.rutgers.edu > >>>>>> > >>>>>> > >>>>>> > >>>>>> New! Deposit X-ray data with the wwPDB at: > >>>>>> > >>>>>> http://deposit.wwpdb.org/deposition (NMR and 3DEM > coming soon). > >>>>>> > >>>>>> > ___________________________________________________________ > >>>>>> > >>>>>> Twitter: https://twitter.com/#!/buildmodels > >>>>>> > >>>>>> Facebook: http://www.facebook.com/RCSBPDB > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 6/5/2015 7:50 AM, Sheriff, Steven wrote: > >>>>>> > >>>>>> All: > >>>>>> > >>>>>> > >>>>>> > >>>>>> Why the concern for unassigned three-letter codes? 
The wwPDB isn’t going to let you assign a three-letter code, it will choose its own code.
> >>>>>>
> >>>>>> At BMS (a pharmaceutical company), we do many hundreds of
> >>>>>> structures a year with ligands and we assign the same, already
> >>>>>> assigned, three-letter code for all of our ligands (unless we
> >>>>>> have two or more different ligands in a single structure, in
> >>>>>> which case we use two or more different already assigned
> >>>>>> three-letter codes). COOT can mostly handle this.
> >>>>>>
> >>>>>> However, I believe that if you want an unassigned code, the wwPDB
> >>>>>> has set aside UNK[nown] for this purpose.
> >>>>>>
> >>>>>> Steven
> >>>>>>
> >>>>>> From: Mailing list for users of COOT Crystallographic Software
> >>>>>> [mailto:c...@jiscmail.ac.uk] On Behalf Of Eleanor Dodson
> >>>>>> Sent: Friday, June 05, 2015 6:28 AM
> >>>>>> To: c...@jiscmail.ac.uk
> >>>>>> Subject: Re: New ligand 3-letter code
> >>>>>>
> >>>>>> I use your method - trial & error..
> >>>>>>
> >>>>>> It would be nice if at least there was a list somewhere of
> >>>>>> unassigned codes!
> >>>>>>
> >>>>>> On 5 June 2015 at 09:16, Lau Sze Yi (SIgN)
> >>>>>> <lau_sze...@immunol.a-star.edu.sg> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> What is the proper way of generating 3-letter code for a new ligand?
> >>>>>> As of now, I insert my ligand in Coot using smiles string and for
> >>>>>> the 3-letter code I picked a non-existent code by trial and error
> >>>>>> (not very efficient). A cif file with corresponding name which I
> >>>>>> generated using Phenix was imported into Coot.
> >>>>>>
> >>>>>> I am sure there is a proper way of doing this. Appreciate your
> >>>>>> feedback.
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Sze Yi
> >>>>>>
> >>>>>> ________________________________
> >>>>>>
> >>>>>> This message (including any attachments) may contain
> >>>>>> confidential, proprietary, privileged and/or private information.
> >>>>>> The information is intended to be for the use of the individual
> >>>>>> or entity designated above. If you are not the intended recipient
> >>>>>> of this message, please notify the sender immediately, and delete
> >>>>>> the message and any attachments. Any disclosure, reproduction,
> >>>>>> distribution or other use of this message or any attachments by
> >>>>>> an individual or entity other than the intended recipient is prohibited.
>
> ________________________________
>
> This e-mail was sent by GlaxoSmithKline Services Unlimited
> (registered in England and Wales No. 1047315), which is a
> member of the GlaxoSmithKline group of companies. The
> registered address of GlaxoSmithKline Services Unlimited
> is 980 Great West Road, Brentford, Middlesex TW8 9GS.
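
On the wish earlier in the thread for a list of unassigned codes: the trial-and-error search can at least be automated locally during refinement. The sketch below is a rough illustration in Python. It assumes you already hold a set of assigned three-letter codes (for example, extracted yourself from a copy of the Chemical Component Dictionary) and that codes are drawn from upper-case letters and digits; both are assumptions. The sample set holds only codes mentioned in this thread and is not a real list of assigned codes, and the function name is hypothetical. As noted above, the wwPDB assigns the final code at deposition, so this only helps to avoid clashes in your own working files.

```python
from itertools import islice, product
import string

# Assumed character set for three-letter codes (upper-case letters and digits).
ALPHABET = string.ascii_uppercase + string.digits


def unassigned_codes(assigned):
    """Yield three-character codes, in alphabetical order, not in `assigned`."""
    taken = {code.upper() for code in assigned}
    for chars in product(ALPHABET, repeat=3):
        code = "".join(chars)
        if code not in taken:
            yield code


# Illustrative sample only: codes mentioned in this thread, NOT the real
# list of assigned codes. Replace with codes read from your own CCD copy.
sample_assigned = {"LYS", "DLY", "SEP", "SER", "PO4", "NAG", "NGA",
                   "ZA3", "DAL", "UNK", "UNL", "UNX"}

# Take the first few candidates that do not clash with the sample set.
print(list(islice(unassigned_codes(sample_assigned), 3)))
```

A generator is used so that only as many candidates as are requested get produced, rather than building all 46,656 combinations up front.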