Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]

Edward A. Berry Sun, 21 Jun 2015 19:13:46 -0700

  I can't imagine a journal doing that can you?  When I work on my
supplementary material in a paper I don't expect that the journal will
take a bit out and publish it separately to support the work of my
competitors. Not out of spite that I was beaten - but because I don't
want to take the responsibility for checking their science for them!


I don't see the problem here. What about the dozens of authors who
will benefit from using your ligand in their structure _after_ your
structure comes out? You don't take responsibility for checking their
science. Every author gets a copy of his final structure to check
before it is released and each is responsible for his own.
The only difference here is whether the competitor got to use it first,
(which might sting a bit) or only after you had already made it your
own with the first structure.

I guess the ligand database is the responsibility of the pdb, but
they depend on first depositors to help set up each ligand, so
it is not surprising if the type model has coordinates from the
first depositor's structure (although it would be convenient if
they were all moved to c.o.m. at 0,0,0). When another group publishes
a structure with the ligand, they will not be publishing the first
depositor's coordinates because the ligand will be moved to its position
in their structure and refined against their data, probably with
somewhat different restraints.
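
A minimal sketch of the recentring suggested above, assuming a mass-weighted centre and a tiny hand-written mass table. The function name `recentre` and the table `MASS` are illustrative, not part of any PDB tool:

```python
# Approximate atomic masses; only the elements needed for this example.
MASS = {"C": 12.011, "O": 15.999}

def recentre(atoms):
    """Shift atoms so that their centre of mass sits at (0, 0, 0).

    atoms: list of (element, x, y, z) tuples.
    """
    total = sum(MASS[a[0]] for a in atoms)
    com = [sum(MASS[a[0]] * a[i] for a in atoms) / total for i in (1, 2, 3)]
    return [(a[0], a[1] - com[0], a[2] - com[1], a[3] - com[2]) for a in atoms]

# The two ZA3 atoms quoted further down this thread (model coordinates from 2Y2I).
za3_fragment = [("O", 8.279, 7.165, 40.963), ("C", 9.132, 8.047, 40.908)]
centred = recentre(za3_fragment)
```

A type model stored this way would no longer carry the first depositor's absolute position, only the internal geometry.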

If the ligand is a top secret novel drug lead that your company is
developing I guess it would come as a shock to find someone else has
already deposited it, and it might be good to hasten not the
publication but the protection of the compound with a patent!

Although Miriam says a new 3-letter code is generated when no match is found,
I believe the depositor's code will be used if it is available
(at least one of mine was last year), so there is some use for Nigel's
utility if you want to stamp your new compound with a memorable name.

eab

On 06/21/2015 06:33 PM, Martyn Symmons wrote:

Miri raises important points about issues in the PDB Chemical
Component Dictionary - I think part of the problem is that this is
published completely separately from the actual PDB - so for example I
don't think we have an archive of the CCD for comparison alongside the
PDB snapshots? This makes it difficult to follow the convoluted track
of particular ligands through the PDB's many, many changes to small
molecule definitions.

But following discussion with other contributors offline I want to
make it clear what is my understanding of the ZA3 (2Y2I /2Y59) case:

I am clear there was no unethical behaviour by either group in the
course of their work on these structures and the publication of them.

The problem I am highlighting is that the PDB don't understand
publishing ethics - what happened in ZA3 was that they published a
little bit of one group's work to support the work of someone who was
scooping them!

  I can't imagine a journal doing that can you?  When I work on my
supplementary material in a paper I don't expect that the journal will
take a bit out and publish it separately to support the work of my
competitors. Not out of spite that I was beaten - but because I don't
want to take the responsibility for checking their science for them!

All the best
   Martyn

Cambridge

On Sun, Jun 21, 2015 at 7:01 PM, Miri Hirshberg
<000002897e8e9f0f-dmarc-requ...@jiscmail.ac.uk> wrote:

Sun., June 21st 2015

Good evening,

adding several general points to the thread.

(1) Fundamentally, the PDB, unlike other chemical databases,
insists that all identical structures should have the same 3-letter
code and the same atom names - obviously so for amino acids and, say, ATP.

  (1.1) Needless to say, there are endless examples in the PDB of two
ligands that differ by, let's say, one hydroxyl group, where equivalent
atoms in the two ligands have totally different names.

(2) When a structure is deposited with a ligand, the ligand is first
compared against the PDB chem_comp database (CCD) and against the on-hold
chem_comp (CCD) (naturally the latter is not publicly available),
and only if no match can be found is a new three-letter code generated
and assigned.

If it is done any other way, this is a mistake in annotation and should not happen.
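
The lookup order in (2) can be sketched as follows. This is only an illustration: the canonical `key` strings are hypothetical placeholders for whatever 2D-structure comparison the annotators actually use, and `assign_code` is not a wwPDB function.

```python
def assign_code(key, released, on_hold, new_code):
    """Return (code, is_new) for a ligand identified by a canonical key.

    key      : hypothetical canonical identifier of the 2D structure
    released : dict key -> code, standing in for the public dictionary
    on_hold  : dict key -> code, standing in for unreleased components
    new_code : code to use only when neither dictionary has a match
    """
    for dictionary in (released, on_hold):
        if key in dictionary:
            return dictionary[key], False
    return new_code, True

# Placeholder keys; the codes ATP, SEP and ZA3 are the ones named in this thread.
released = {"key-for-ATP": "ATP", "key-for-phosphoserine": "SEP"}
on_hold = {"key-for-new-drug": "ZA3"}

print(assign_code("key-for-new-drug", released, on_hold, "XXX"))  # ('ZA3', False)
```

A second depositor of the same compound therefore receives the on-hold code rather than a fresh one, which is the situation discussed elsewhere in this thread.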

(3) Exceptions to the above take several different flavours. These
include:

  (3.1) When the same ligand is described in the PDB both by a single
3-letter code and by a combination of two different 3-letter-code ligands.
An example out of many is phosphoserine. Its 3-letter code
in the PDB CCD is SEP, which is used in 704 PDB entries (RCSB count,
21-June-2015). But in PDB entry 3uw2 the phosphoserine 109A is
described as a combination of SER and the inorganic phosphate PO4 !!!
(a side point: note the inorganic PO4 became organic upon this linkage -
a PDB chemical conundrum!!).

  (3.2) CCDC does not make any attempt to standardise atom names, nor to
match identical structures so that they have equal atom names. Original
author atom names are kept, so amino acids may have bizarre atom names, and
where required symmetry atom names are generated. This is rare in the PDB
but not unknown, and the PDB is poor at completing atom/ligand names where
symmetry is required; in fact this is often not completed in any chemically
reasonable sense, as it would require changes in occupancy.

The simplest case is in racemic PDB entries, where the symmetry-generated
structure for, say, L-ALA should be the D-version DAL,
but the PDB as it stands has not coped with this, as it would require two
sets of coordinates, each at, say, 1/2 occupancy (usually).

One of several examples in the PDB archive is PDB entry 3e7r, the
X-ray structure of racemic plectasin. The entry consists of one protein
chain, in space group P-1.

In the manuscript
http://onlinelibrary.wiley.com/doi/10.1002/pro.127/pdf

Figure 3a, for example, shows the crystal packing:
(a) Centrosymmetric P-1 unit cell. The
L-plectasin molecule is shown in blue and the
D-plectasin molecule is in gold.

But if you use the PDB entry and the symmetry operator of P-1
to generate the two symmetry-related mates in the unit cell,
you will get a chain with L- residue names
GLY-PHE-GLY-CYS-ASN-GLY-PRO-........ etc
representing D- amino acids.
(GLY is a special case).
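
A small sketch of why the inversion operator of P-1 turns an L- residue into its D- mirror image: the signed volume spanned by three substituents around a centre changes sign under (-x, -y, -z). The coordinates below are hypothetical and are not taken from 3e7r, and the inversion is applied about the Cartesian origin for simplicity. The name table holds only the ALA/DAL pair mentioned above.

```python
def signed_volume(centre, a, b, c):
    """Scalar triple product of the vectors from centre to a, b and c."""
    u, v, w = ([p[i] - centre[i] for i in range(3)] for p in (a, b, c))
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def invert(point):
    """Apply the inversion operator (-x, -y, -z)."""
    return tuple(-x for x in point)

D_NAMES = {"ALA": "DAL"}  # only the pair named in this thread; extend as needed

def mate_name(resname):
    """Residue name for the inverted copy; GLY is achiral and keeps its name."""
    if resname == "GLY":
        return "GLY"
    return D_NAMES.get(resname, resname + "?")  # "?" flags a missing D- code

# Hypothetical positions for CA, N, C, CB of one residue.
ca, n, c, cb = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

original = signed_volume(ca, n, c, cb)
mate = signed_volume(*(invert(p) for p in (ca, n, c, cb)))
print(original, mate)  # opposite signs: the mate has the opposite handedness
```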

  (3.3) There is also the problem of assigning a 3-letter code where the
submission has obviously assigned the wrong chirality. One example is
where the sugar must be NAG but is assigned NGA in a
glycopeptide where NGA is impossible - the PDB should have assigned NAG
with a CAVEAT that the chirality is incorrect. Note that re-refinement by
other software will require breaking a bond.
NGA is used in 90 entries (RCSB count, 21-June-2015).

regards Miri

From: Yong Wang <wang_yon...@lilly.com>
Reply-to: Yong Wang <wang_yon...@lilly.com>
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)
Date: Sat, 20 Jun 2015 18:36:34 +0000

Sharing a ligand name should be limited to cases of the same compound, i.e. the same 2D 
structure or connectivity.  Each deposition should have its own 3D coordinates.  If a 
different publication gets your ligand 3D coordinates ("2Y59 actually embodies the 
atomic coordinates from the 2Y2I"), that looks to me an oversight by PDB.  It is 
hard to believe that PDB intended to use the 3D coordinates from one entry for the other, 
ligand or not.  In fact, the restraints as described by the ligand dictionary should also 
be kept separate as that reflects how the authors refine their ligand.

Yong

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Martyn Symmons
Sent: Friday, June 19, 2015 8:39 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)

By oversimplifying the situation here the PDB does not answer my related point 
about competing crystallographers:
My scenario:

Group A deposits a structure with a new drug and gets a three-letter code, 
for example ZA3. They then get to check the coordinates and chemical 
definition of this ligand.

But suppose a little after that a competing group B deposits their structure 
with the same drug which they think is novel - but no...
they get assigned the now described ZA3 which has been checked by the other 
group.

  Then it is a race to see who gets to publish and release first. And if it is 
the second group B who wins, then they are publishing the work of their 
competitors in group A - who have done the depositing and checking of the ligand 
description.

  Sounds unlikely? Well, it actually happened in 2011 for my exact example ZA3 
- present in 2Y2I and in 2Y59 from competing groups.

  From the dates in the mmcif it was 2Y2I depositors who set up and had a 
chance to review the description of ZA3 ligand. Only to see it released a week 
before their crystal structure, when their ZA3 appeared to accompany competing 
2Y59! It is amazing that the PDB did not spot this and arrange a suitable 
workaround.

Just to check:
mmcif for ZA3 shows it was created for 2Y2I:
...
_chem_comp.pdbx_model_coordinates_db_code        2Y2I
...
But it was modified for release:
...
_chem_comp.pdbx_modified_date                    2011-07-22
...
corresponding to the early 2011-07-27 release date of the competing
structure: 2Y59 even though this PDB was  _deposited_ second.
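
The check above can be reproduced with a few lines, assuming only the two items quoted here and the 2Y59 release date given in this message. The parser `read_items` is a deliberately naive helper for simple "name value" lines, not a general mmCIF reader:

```python
from datetime import date

excerpt = """\
_chem_comp.pdbx_model_coordinates_db_code        2Y2I
_chem_comp.pdbx_modified_date                    2011-07-22
"""

def read_items(text):
    """Collect simple 'name value' mmCIF items into a dict (no loops, no quoting)."""
    items = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            items[parts[0]] = parts[1].strip()
    return items

items = read_items(excerpt)
modified = date.fromisoformat(items["_chem_comp.pdbx_modified_date"])
release_2y59 = date(2011, 7, 27)  # release date of 2Y59 as stated in this message

print(items["_chem_comp.pdbx_model_coordinates_db_code"],
      (release_2y59 - modified).days)
# prints: 2Y2I 5
```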

The ZA3 ligand definition released with 2Y59 actually embodies the atomic 
coordinates from the 2Y2I structure:

<mmcif>
ZA3 O6   O6   O 0  1 N N N 8.279  7.165  40.963 0.311  -1.061 -0.920 O6   ZA3 1
ZA3 C5   C5   C 0  1 N N N 9.132  8.047  40.908 0.147  -0.205 -0.073 C5   ZA3 2
...
<PDB 2Y2I>
HETATM 3598  O6  ZA3 A1000       8.279   7.165  40.963  1.00 41.25           O
HETATM 3599  C5  ZA3 A1000       9.132   8.047  40.908  1.00 63.20           C
...
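
The comparison can also be done mechanically. The sketch below uses only the two atoms quoted above and splits the rows on whitespace, which is an assumption that happens to work for these lines; a general reader should use the fixed columns of the PDB format and a proper mmCIF parser. The helper names are illustrative.

```python
ccd_rows = [
    "ZA3 O6   O6   O 0  1 N N N 8.279  7.165  40.963 0.311  -1.061 -0.920 O6   ZA3 1",
    "ZA3 C5   C5   C 0  1 N N N 9.132  8.047  40.908 0.147  -0.205 -0.073 C5   ZA3 2",
]
pdb_rows = [
    "HETATM 3598  O6  ZA3 A1000       8.279   7.165  40.963  1.00 41.25           O",
    "HETATM 3599  C5  ZA3 A1000       9.132   8.047  40.908  1.00 63.20           C",
]

def ccd_model_xyz(row):
    """Atom name and model coordinates, assumed to be tokens 1 and 9-11."""
    t = row.split()
    return t[1], tuple(float(v) for v in t[9:12])

def hetatm_xyz(row):
    """Atom name and coordinates, assumed to be tokens 2 and 5-7 of these rows."""
    t = row.split()
    return t[2], tuple(float(v) for v in t[5:8])

ccd = dict(ccd_model_xyz(r) for r in ccd_rows)
pdb = dict(hetatm_xyz(r) for r in pdb_rows)

# True when every dictionary atom has exactly the coordinates found in 2Y2I.
identical = all(ccd[name] == pdb[name] for name in ccd)
print(identical)  # True
```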

Surely a better approach would be to allow both groups a chance to work through 
and sign off on independent ligand descriptions?

Then whoever releases first would release both a novel structure and the ligand 
definition _they_ deposited and checked. Their priority can then be asserted 
and the other group contacted to ask if they agree to accept this definition. 
This also has the advantage of better confidentiality pre-publication.

Another problem from any cross-linking of definitions arises if, say, group A are 
motivated by reviewers' reports to change the definition of ZA3 pre-release. 
Now the change impinges on the chemical meaning of group B's 
deposited structure. For example, the ZA3 mmcif has a statement:

ZA3 "Modify aromatic_flag" 2011-06-04 RCSB

so this change was pre-release - but we cannot be sure what motivated this - 
whether it was signed off by the 2Y2I authors or the 2Y59 authors (or both?)....

With the accelerating pace of drug discovery, this sort of uncertainty 
is sure to happen again. Unless the PDB have changed their practice for ligand 
deposition?

All the best
  Martyn

Cambridge.

On Fri, Jun 19, 2015 at 1:49 PM, Sheriff, Steven <steven.sher...@bms.com> wrote:

All:



Since the original query was cross-posted on both the COOT mailing
list and the CCP4BB, Rachel Green gave me permission to forward this to
both. She provides links about the mechanism of assignment of 3-letter
codes. According to the third link below, my original suggestion to the COOT
mailing list that one could just use UNK is incorrect, as that is reserved for
unknown amino acids; I should have suggested UNL for an unknown
ligand.



Steven



From: Rachel Kramer Green [mailto:kra...@rcsb.rutgers.edu]
Sent: Tuesday, June 16, 2015 10:21 AM
To: Sheriff, Steven
Cc: info
Subject: Re: New ligand 3-letter code (help-7071)



Dear Steven,

During annotation of ligands, all chemical components present in the
structure are compared against the definitions in the Chemical
Component Dictionary (http://www.wwpdb.org/data/ccd). If the ligand is
not in the dictionary, a three letter code is assigned. See
http://www.wwpdb.org/documentation/policy#toc_assignment.  In the
future, a group of three-letter codes may be set aside to be used
during refinement to flag new ligands.

Clarification about the ligand ids assignment and in particular the
usage of UNX/UNL/UNK residues can be found at
http://www.wwpdb.org/documentation/procedure#toc_2.

Best wishes,
Rachel



________________________________

Rachel Kramer Green, Ph.D.

RCSB PDB

kra...@rcsb.rutgers.edu










On 6/5/2015 7:50 AM, Sheriff, Steven wrote:

All:



Why the concern for unassigned three-letter codes? The wwPDB isn’t
going to let you assign a three-letter code, it will choose its own code.



At BMS (a pharmaceutical company), we do many hundreds of structures a
year with ligands and we assign the same, already assigned,
three-letter code for all of our ligands (unless we have two or more
different ligands in a single structure, in which case we use two or
more different already assigned three-letter codes).  COOT can mostly handle 
this.



However, I believe that if you want an unassigned code, the wwPDB has
set aside UNK[nown] for this purpose.



Steven



From: Mailing list for users of COOT Crystallographic Software
[mailto:c...@jiscmail.ac.uk] On Behalf Of Eleanor Dodson
Sent: Friday, June 05, 2015 6:28 AM
To: c...@jiscmail.ac.uk
Subject: Re: New ligand 3-letter code



I use your method - trial & error..

It would be nice if at least there was a list somewhere of unassigned codes!
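
Given a local set of codes already in use, such a list is easy to produce. This is a rough sketch only: the `used` set below is a hypothetical stand-in for the codes in the Chemical Component Dictionary, and the wwPDB still assigns the final code at deposition.

```python
from itertools import islice, product
from string import ascii_uppercase, digits

ALPHABET = ascii_uppercase + digits

def unused_codes(used, limit=5):
    """Return up to `limit` three-character codes that are not in `used`."""
    candidates = ("".join(chars) for chars in product(ALPHABET, repeat=3))
    return list(islice((code for code in candidates if code not in used), limit))

# Hypothetical stand-in for the codes already taken, plus the reserved
# UNK/UNL/UNX codes mentioned elsewhere in this thread.
used = {"AAA", "AAB", "UNK", "UNL", "UNX"}

print(unused_codes(used, 3))  # ['AAC', 'AAD', 'AAE']
```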



On 5 June 2015 at 09:16, Lau Sze Yi (SIgN)
<lau_sze...@immunol.a-star.edu.sg> wrote:

Hi,



What is the proper way of generating a 3-letter code for a new ligand?
As of now, I insert my ligand in Coot using a SMILES string, and for the
3-letter code I pick a non-existent code by trial and error (not
very efficient). A cif file with the corresponding name, which I generated
using Phenix, was imported into Coot.



I am sure there is a proper way of doing this. Appreciate your feedback.



Regards,

Sze Yi



________________________________

This message (including any attachments) may contain confidential,
proprietary, privileged and/or private information. The information is
intended to be for the use of the individual or entity designated
above. If you are not the intended recipient of this message, please
notify the sender immediately, and delete the message and any
attachments. Any disclosure, reproduction, distribution or other use
of this message or any attachments by an individual or entity other than the 
intended recipient is prohibited.



