Thanks Paul,

BTW, I call it the tilde-hetnam extension of the PDB format:
https://gemmi.readthedocs.io/en/stable/mol.html#tilde-hetnam
"The tilde-hetnam extension addresses this issue: a long CCD code is
substituted with a 3-character alias that starts with a tilde (~); the
original code is stored in columns 72-79 of the HETNAM record."
But I don't think any other project has decided to implement it.
Programs that use the gemmi library will handle such PDB files
automatically, though.
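
For a program that doesn't use gemmi, reading the extension could look
roughly like the sketch below. It is plain Python, not gemmi's actual
code; the function names hetnam_aliases and residue_name are made up for
illustration, and it assumes the residue name sits in columns 12-14 of
HETNAM and 18-20 of ATOM/HETATM, with the long code in columns 72-79 of
HETNAM as described above.

```python
def hetnam_aliases(pdb_lines):
    """Map tilde aliases (e.g. '~U6') to the long CCD codes.

    Hypothetical helper: reads only HETNAM records and assumes the
    original code is stored in columns 72-79 (1-based).
    """
    aliases = {}
    for line in pdb_lines:
        if not line.startswith("HETNAM"):
            continue
        het_id = line[11:14].strip()     # columns 12-14: 3-character name
        long_code = line[71:79].strip()  # columns 72-79: original CCD code
        if het_id.startswith("~") and long_code:
            aliases[het_id] = long_code
    return aliases


def residue_name(atom_line, aliases):
    """Return the residue name of an ATOM/HETATM line, expanding aliases."""
    name = atom_line[17:20].strip()      # columns 18-20: residue name
    return aliases.get(name, name)
```

Names without a tilde (EDO, ZN, ...) are returned unchanged, so files
that don't use the extension are read as before.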

Marcin

On Thu, Nov 20, 2025 at 3:26 PM Paul Bond
<[email protected]> wrote:
>
> Hi Harry,
>
> If you need to produce PDB files from CIF files with 5-letter ligand codes 
> you can use gemmi convert, e.g.:
>
> gemmi convert 8XFM.cif 8XFM.pdb --shorten-tlc
>
> The original has a ligand called A1LU6. In the PDB file it gets a HET and 
> HETNAM entry:
>
> HET    ~U6  A 401      25     real CCD code: A1LU6
> HET    EDO  A 402       4
> HET     ZN  A 403       1
> HETNAM     ~U6                                                         A1LU6
>
> Then in the atom lines it is referred to by the new ID:
>
> HETATM 2209 CL   ~U6 A 401      38.210  38.396  16.715  1.00 77.22          CL
> HETATM 2210  C17 ~U6 A 401      38.401  39.551  15.377  1.00 54.65           C
> HETATM 2211  C16 ~U6 A 401      37.332  40.370  15.083  1.00 52.11           C
> HETATM 2212  C15 ~U6 A 401      37.438  41.271  14.038  1.00 51.08           C
> HETATM 2213  C14 ~U6 A 401      38.616  41.345  13.308  1.00 48.95           C
>
> If you convert it back to CIF with gemmi:
>
> gemmi convert 8XFM.pdb new.cif
>
> Then the resulting mmCIF file just uses the original A1LU6 naming.
>
> Cheers,
> Paul
>
> On Thu, 20 Nov 2025 at 12:54, Harry Powell 
> <[email protected]> wrote:
>>
>> Hi folks
>>
>> The real answer (before I ask the question) is, of course, to use mmCIF. But 
>> quite a few programs out there don’t read mmCIF files properly (if at all), 
>> so I’ll ask the question anyway.
>>
>> The venerable PDB format allows 3 columns (18-20) for the residue or ligand 
>> name, and this was fine as long as there were only chemical components 
>> allowed with 3 characters - but now there are loads that have 5 characters 
>> (e.g. A1B20 - see  
>> https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/A1B20).
>>
>> If I wanted to write a PDB format file for the ligand that had the 
>> 5-character code, what would be the “best” way to do it (of course, there 
>> are different definitions of “best”…)?
>>
>> What dirty trick would cause the least damage (at least, to common programs)?
>>
>> Allowing the ligand name to spread over columns 16-20 *might* work, but that 
>> would mean encroaching on the atom name column, and might cause confusion 
>> between things like calcium (“CA ”) and carbons labelled with an “A” (“ CA”).
>>
>> Thoughts?
>>
>> Harry
>> ########################################################################
>>
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>>
>> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing 
>> list hosted by www.jiscmail.ac.uk, terms & conditions are available at 
>> https://www.jiscmail.ac.uk/policyandsecurity/
