This suggestion violates a basic principle of database theory: a
single data item cannot encode two pieces of information. The whole
structure of CIF falls apart if this is done.
Does the new PDB convention contain a CIF record of the link that
bridges the protein chain and the now-separated glycan chain?
If not, I think this is the principal failing of their new scheme.
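
As a rough sketch of why an explicit link record matters: if the file carries the protein-glycan bond in the standard mmCIF _struct_conn category, a viewer can recover which protein chain each glycan chain is attached to without reading anything into the chain name. The fragment below is made up for illustration (chain IDs, residue numbers and the choice of columns are assumptions, not taken from a real entry), the whitespace-splitting parser is a toy that ignores CIF quoting rules, and treating ASN/SER/THR as the protein end of the bond is a simplification.

```python
# Hypothetical _struct_conn loop; all values are illustrative only.
STRUCT_CONN = """\
loop_
_struct_conn.id
_struct_conn.conn_type_id
_struct_conn.ptnr1_auth_asym_id
_struct_conn.ptnr1_auth_comp_id
_struct_conn.ptnr1_auth_seq_id
_struct_conn.ptnr2_auth_asym_id
_struct_conn.ptnr2_auth_comp_id
_struct_conn.ptnr2_auth_seq_id
covale1 covale A ASN 61  C NAG 1
covale2 covale C NAG 1   C NAG 2
covale3 covale B ASN 120 D NAG 1
disulf1 disulf A CYS 30  A CYS 75
"""

# Simplifying assumption: the protein end of a glycosidic link is one of these.
GLYCOSYLATED_RESIDUES = {"ASN", "SER", "THR"}


def parse_loop(text):
    """Toy parser for one whitespace-delimited CIF loop (no quoted values)."""
    tags, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("_"):
            tags.append(line)
        else:
            rows.append(dict(zip(tags, line.split())))
    return rows


def glycan_parents(rows):
    """Map each glycan chain ID to (protein chain, residue name, residue number)."""
    parents = {}
    for row in rows:
        if row["_struct_conn.conn_type_id"] != "covale":
            continue
        # One (chain, residue name, residue number) tuple per bond partner.
        ends = [
            tuple(
                row[f"_struct_conn.ptnr{i}_auth_{key}"]
                for key in ("asym_id", "comp_id", "seq_id")
            )
            for i in (1, 2)
        ]
        if ends[0][0] == ends[1][0]:
            continue  # bond within one chain, e.g. NAG-NAG inside a glycan
        # Try both orientations, since either partner may be the protein side.
        for protein_end, glycan_end in (ends, ends[::-1]):
            if protein_end[1] in GLYCOSYLATED_RESIDUES:
                parents[glycan_end[0]] = protein_end
    return parents


if __name__ == "__main__":
    for glycan, (chain, comp, seq) in glycan_parents(parse_loop(STRUCT_CONN)).items():
        print(f"glycan chain {glycan} is attached to {comp} {seq} of chain {chain}")
```

With the association stored as its own data item like this, the chain label can remain an arbitrary identifier and does not have to double as a pointer to the parent chain.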
Dale Tronrud
On 12/4/2020 12:06 AM, Tristan Croll wrote:
> To go one step further: in large, heavily glycosylated multi-chain complexes the assignment of a random new chain ID to each glycan will lead to headaches for people building visualisations using existing viewers, because it loses the easy name-based association of glycan to parent protein chain. A suggestion: why not take full advantage of the mmCIF capability for multi-character chain IDs, and name them by appending characters to the parent chain ID? Using chain A as an example, perhaps the glycans could become Ag1, Ag2, etc.?
>
>> On 4 Dec 2020, at 07:48, Luca Jovine <luca.jov...@ki.se> wrote:
>>
>> CC: pdb-l
>>
>> Dear Zhijie and Robbie,
>>
>> I agree with both of you that the new carbohydrate chain assignment convention that has been recently adopted by PDB introduces confusion, not just for PDB-REDO but also - and especially - for end users.
>>
>> Could we kindly ask PDB to improve consistency by either assigning a separate chain to all covalently attached carbohydrates (regardless of whether one or more residues have been traced), or reverting to the old system (where N-/O-glycans inherited the chain ID of the protein to which they are attached)? The current hybrid solution hardly seems optimal...
>>
>> Best regards,
>>
>> Luca
>>
>>> On 3 Dec 2020, at 20:17, Robbie Joosten <robbie_joos...@hotmail.com> wrote:
>>>
>>> Dear Zhijie,
>>>
>>> In general I like the treatment of carbohydrates now as branched polymers. I didn't realise there was an exception. It makes sense for unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as these might change during model building or, in my case, carbohydrate rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>>>
>>> Cheers,
>>> Robbie
>>>
>>>> -----Original Message-----
>>>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Zhijie Li
>>>> Sent: Thursday, December 3, 2020 19:52
>>>> To: CCP4BB@JISCMAIL.AC.UK
>>>> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
>>>> PDB -- N-glycans are now separate chains if more than one residue
>>>>
>>>> Hi all,
>>>>
>>>> I was confused when I saw mysterious new glycan chains emerging during
>>>> PDB deposition and spent quite some time trying to find out what was
>>>> wrong with my coordinates. Then it occurred to me that a lot of recent
>>>> structures also had tens of N-glycan chains. Finally I realized that this
>>>> phenomenon is a consequence of this PDB policy announced here in July.
>>>>
>>>>
>>>> For future depositors who might also get puzzled, let's put it in a short
>>>> sentence: O- and N-glycans are now separate chains if they contain more
>>>> than one residue; single residues remain with the protein chain.
>>>>
>>>>
>>>> "Oligosaccharide molecules are classified as a new entity type, branched,
>>>> assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
>>>> category introduced to define the type of branching
>>>> (_pdbx_entity_branch.type)."
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I found the differential treatment of single-residue glycans and multi-residue
>>>> glycans not only somewhat unaesthetic but also misleading. When a structure
>>>> contains both NAG-NAG... and single NAG on N-glycosylation sites, it might
>>>> be because of a lack of density for building more residues, or because
>>>> some of the glycosylation sites are now indeed single NAGs (endoH etc.)
>>>> while some others are not cleaved due to accessibility issues. Leaving NAGs
>>>> on the protein chain while assigning NAG-NAG... to a new chain feels like
>>>> suggesting something about their true oligomeric state.
>>>>
>>>>
>>>> For example, for cryoEM structures, building only a single NAG at a
>>>> site does not necessarily mean that the protein was treated with endoH. In
>>>> fact all sites are extended to at least tri-Man in most cases. Then why
>>>> keep some sites associated with the protein chain while others are kicked
>>>> out?
>>>>
>>>> Zhijie
>>>>
>>>>
>>>>
>>>> ________________________________
>>>>
>>>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of John
>>>> Berrisford <j...@ebi.ac.uk>
>>>> Sent: Thursday, July 9, 2020 4:39 AM
>>>> To: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
>>>> Subject: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB
>>>>
>>>>
>>>> Dear CCP4BB
>>>>
>>>> PDB data will shortly incorporate a new data representation for
>>>> carbohydrates in PDB entries and reference data that improves the
>>>> Findability and Interoperability of these molecules in macromolecular
>>>> structures. In order to remediate and improve the representation of
>>>> carbohydrates across the archive, the wwPDB has:
>>>>
>>>> * standardized Chemical Component Dictionary nomenclature
>>>>   following IUPAC-IUBMB recommendations
>>>> * provided uniform representation for oligosaccharides
>>>> * adopted Glycoscience-community commonly used linear descriptors
>>>>   using community tools
>>>> * annotated glycosylation sites in PDB structures
>>>>
>>>> Starting July 29, 2020, users will be able to access the improved data via FTP
>>>> or wwPDB partner websites. Detailed information about this project is
>>>> available at the wwPDB website; lists of impacted entries and chemical
>>>> components will be published on this page after data release.
>>>>
>>>> The wwPDB has created a new ‘branched’ entity representation for
>>>> polysaccharides, describing all the individual monosaccharide components of
>>>> these in the PDB entry. As part of this process, we have standardized atom
>>>> nomenclature of >1,000 monosaccharides in the Chemical Component
>>>> Dictionary (CCD) and applied a branched entity representation to
>>>> oligosaccharides for >8000 PDB entries. To guarantee unambiguous chemical
>>>> description of oligosaccharides in the affected PDB entries, an explicit
>>>> description of covalent linkage information between their monosaccharide
>>>> units is included. In addition, wwPDB validation reports provide consistent
>>>> representation for these oligosaccharides and include 2D representations
>>>> based on the Symbol Nomenclature for Glycans (SNFG).
>>>>
>>>> To support the remediation of carbohydrate representation, software tools
>>>> providing linear descriptors were developed in collaboration with the
>>>> glycoscience community to enable easy translation of PDB data to other
>>>> representations commonly used by glycobiologists. These include Condensed
>>>> IUPAC from GMML at University of Georgia, WURCS from PDB2Glycan at The
>>>> Noguchi Institute, Japan, and LINUCS from pdb-care in Germany.
>>>>
>>>> Furthermore, to ensure continued Findability of 118 common
>>>> oligosaccharides (e.g., sucrose, Lewis Y antigen), we have expanded the
>>>> Biologically Interesting molecule Reference Dictionary (BIRD) that contains
>>>> the covalent linkage information and common synonyms for such molecules.
>>>>
>>>> wwPDB has also used this opportunity to improve the organization of
>>>> chemical synonyms in the CCD by introducing a new
>>>> _pdbx_chem_comp_synonyms data category. This will enable more
>>>> comprehensive capture of alternative names for small molecules in the PDB.
>>>> To minimize disruption to users, the legacy data item,
>>>> _chem_comp.pdbx_synonyms, will be retained for a transition period
>>>> through 2021.
>>>>
>>>> The carbohydrate remediation project is a wwPDB collaborative project that
>>>> is carried out principally by RCSB PDB at Rutgers, The State University of
>>>> New Jersey and is funded by NIGMS grant U01 CA221216 in collaboration with
>>>> Complex Carbohydrate Research Center at the University of Georgia.
>>>>
>>>> If you have any comments or queries regarding the changes to carbohydrate
>>>> representation, please visit the wwPDB website or contact us at
>>>> deposit-h...@mail.wwpdb.org.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>>
>>>>
>>>> John
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> John Berrisford
>>>>
>>>> PDBe
>>>>
>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>
>>>> European Molecular Biology Laboratory
>>>>
>>>> Wellcome Trust Genome Campus
>>>>
>>>> Hinxton
>>>>
>>>> Cambridge CB10 1SD UK
>>>>
>>>> Tel: +44 1223 492529
>>>>
>>
>>
>>
>> Sending email to Karolinska Institutet (KI) will result in KI processing your personal data. You can read more about KI’s processing of personal data here<https://ki.se/en/staff/data-protection-policy>.
########################################################################
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/