Dear Jasmine,

Thank you for contributing to this thread.

This has been asked in a different way, but can we simply assume at this point that the mmCIF/PDB records will no longer contain any separate chain-ID-like item that groups a protein chain together with its glycans, as has been the universal practice in glycoprotein structural biology for decades?

In other words, one usually needs to select chain A together with all its glycans in visualization software (e.g., using select chain A in PyMOL, which will no longer work). Will we now need distance-cutoff-based selections to define such groups, or scripts that interpret the connectivity records?

Not a big deal; it is still a one-liner in PyMOL. But I'd rather not do it every time if PyMOL or other software can be convinced to read a separate chain-ID-like item from mmCIF files that lets us select those actual chains easily by those identifiers.
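
To make the distance-cutoff idea concrete, here is a minimal sketch in plain Python rather than PyMOL. Everything in it is an assumption made for illustration: the atom tuples, the coordinates, the list of glycan residue names, the 2.0 Å cutoff, and the helper names chains_near and select_with_glycans. It only shows the logic of "chain A plus every glycan chain that touches it".

```python
from math import dist

# Each atom is (chain_id, residue_name, (x, y, z)).
# The coordinates are made up for illustration, not taken from a PDB entry.
atoms = [
    ("A", "ASN", (0.0, 0.0, 0.0)),
    ("A", "GLY", (3.8, 0.0, 0.0)),
    ("I", "NAG", (0.0, 1.4, 0.0)),   # glycan chain attached to chain A
    ("I", "NAG", (0.0, 5.0, 0.0)),
    ("B", "ASN", (50.0, 0.0, 0.0)),
    ("K", "NAG", (50.0, 1.4, 0.0)),  # glycan chain attached to chain B
]

# Hypothetical, incomplete list of sugar residue names.
GLYCAN_NAMES = frozenset({"NAG", "MAN", "BMA", "FUC"})


def chains_near(atoms, protein_chain, glycan_names=GLYCAN_NAMES, cutoff=2.0):
    """Return chain IDs of glycan atoms lying within `cutoff` of the protein chain."""
    protein_xyz = [xyz for chain, _, xyz in atoms if chain == protein_chain]
    near = set()
    for chain, resname, xyz in atoms:
        if chain == protein_chain or resname not in glycan_names:
            continue
        if any(dist(xyz, p) <= cutoff for p in protein_xyz):
            near.add(chain)
    return near


def select_with_glycans(atoms, protein_chain, cutoff=2.0):
    """Return all atoms of the protein chain and of every glycan chain touching it."""
    keep = {protein_chain} | chains_near(atoms, protein_chain, cutoff=cutoff)
    return [atom for atom in atoms if atom[0] in keep]


selection = select_with_glycans(atoms, "A")
print(sorted({chain for chain, _, _ in selection}))
```

Because the whole glycan chain is kept once any of its atoms is within the cutoff, sugars further out along the branch come along too. A brute-force pairwise search like this is slow on large structures, so a real script would probably use a neighbor search or the connectivity records instead.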

Thank you,

Engin


On 12/10/20 1:47 PM, Jasmine Young wrote:
Dear Marcin,

The cif item, _pdbx_branch_scheme.pdb_asym_id, in the pdbx_branch_scheme category is a pointer to _atom_site.auth_asym_id in the atom_site category (I know this is confusing). The labels are defined consistently with those in _pdbx_poly_seq_scheme and _pdbx_nonpoly_scheme.

To use the wwPDB-assigned chain ID in publications, _atom_site.auth_seq_id, _atom_site.auth_comp_id, and _atom_site.auth_asym_id can be used for the residue number, residue ID, and chain ID, respectively.


Regards,

Jasmine

===========================================================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
Research Professor
Institute for Quantitative Biomedicine
Rutgers, The State University of New Jersey
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087

Email: jasm...@rcsb.rutgers.edu
Phone: (848)445-0103 ext 4920
Fax: (732)445-4320
===========================================================

On 12/9/20 4:31 PM, Marcin Wojdyr wrote:
Dear Jasmine,

thank you for this explanation. It's the best explanation of this
remediation I've read.

The use of IDs may confuse people, so I'd like to reiterate it and ask
for clarification.
Every residue in the mmCIF format has three (3) independent chain IDs
assigned to it (and three sequence numbers, and three residue names).

In your example:
J 4 NAG 1 I NAG 1 A NAG 1310 n

J - asym_id = _atom_site.label_asym_id
I - pdb_asym_id = _atom_site.auth_asym_id (?!)
A - auth_asym_id = n/a

(correct me if I got it wrong, but I see that _atom_site.auth_asym_id
corresponds to _pdbx_branch_scheme.pdb_asym_id and not to auth_asym_id
as one could expect).
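
The mapping above can be sketched in a few lines of Python. The column order in COLUMNS is an assumption about how the pdbx_branch_scheme loop is laid out in the file, chosen so that the three chain IDs fall where the example puts them; the real order must be read from the loop_ header. The helper name parse_branch_row is made up for this illustration.

```python
# Assumed item order of the pdbx_branch_scheme loop. mmCIF does not fix the
# order of items, so a real reader should take it from the loop_ header.
COLUMNS = [
    "asym_id", "entity_id", "mon_id", "num",
    "pdb_asym_id", "pdb_mon_id", "pdb_seq_num",
    "auth_asym_id", "auth_mon_id", "auth_seq_num",
    "hetero",
]


def parse_branch_row(line):
    """Split one whitespace-separated row into a dict keyed by item name.

    Plain split() is a simplification: it does not handle quoted mmCIF values.
    """
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return dict(zip(COLUMNS, values))


row = parse_branch_row("J 4 NAG 1 I NAG 1 A NAG 1310 n")

# The three chain IDs of this residue, labeled by where they appear.
chain_ids = {
    "_atom_site.label_asym_id": row["asym_id"],
    "_atom_site.auth_asym_id": row["pdb_asym_id"],
    "_pdbx_branch_scheme.auth_asym_id (no atom_site counterpart)": row["auth_asym_id"],
}

for name, value in chain_ids.items():
    print(f"{value}  {name}")
```

A dictionary keyed by the full item name sidesteps the naming problem to some extent: documentation can refer to "_atom_site.auth_asym_id" or "_pdbx_branch_scheme.auth_asym_id" explicitly instead of the ambiguous bare "auth_asym_id".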

What should these chain IDs be called? When I write software
documentation, I need to refer to chain IDs (and sequence numbers), but
I can't find proper words to say clearly which ID I'm referring to. I
was using hard-to-read names such as auth_asym_id, but now I see that
even this is ambiguous.

BTW, when you write that wwPDB encourages depositors to use the
wwPDB-assigned chain ID in publications, which of the two
wwPDB-assigned chain IDs do you mean?

Thank you,
Marcin

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/

--
Engin Özkan, Ph.D.
Assistant Professor
Dept of Biochemistry and Molecular Biology
University of Chicago
Phone: (773) 834-5498
http://voices.uchicago.edu/ozkanlab
