First off, many thanks to those who responded on- and off-BB.

This kind of approach seems to be the best way forward. I’ll probably avoid 
using “~” in the names, because I suspect that many older programs will barf on 
it, but using HETNAM & HET should make my life relatively easy (and is 
documented at https://www.wwpdb.org/documentation/file-format). 
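
A minimal Python sketch of how such a HET/HETNAM pair could be written, assuming the column layout visible in the gemmi output quoted further down (het ID right-justified in columns 8-10 of HET and 12-14 of HETNAM, full code starting at column 72 of HETNAM). The helper name het_records is hypothetical, and the "real CCD code:" comment text is simply copied from that example; an alias without the tilde can be passed in the same way.

```python
def het_records(alias, full_code, chain, seq_num, n_atoms):
    """Return (HET, HETNAM) lines mapping a 3-character alias to a longer code.

    Hypothetical helper; the layout is copied from the gemmi example
    quoted later in this thread, not from any library API.
    """
    if len(alias) > 3:
        raise ValueError("alias must fit in 3 columns")
    if len(full_code) > 8:
        raise ValueError("full code must fit in columns 72-79")
    # HET: het ID in cols 8-10, chain in col 13, sequence number in cols 14-17,
    # atom count in cols 21-25, free text from col 31.
    het = (f"HET    {alias:>3}  {chain}{seq_num:>4}   {n_atoms:>5}"
           f"     real CCD code: {full_code}")
    # HETNAM: het ID in cols 12-14; pad to col 71 so the full code starts at col 72.
    hetnam = f"HETNAM     {alias:>3}".ljust(71) + full_code
    return het, hetnam


het, hetnam = het_records("~U6", "A1LU6", "A", 401, 25)
print(het)
print(hetnam)
```

With these arguments the two printed lines should match the HET and HETNAM records for ~U6 in the example below.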

Since I’m not working from mmCIF files I won’t use Gemmi (or is that “gemmi”?) 
but this does give me what should be a robust way forward.
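
For reading such a file back without gemmi, the lookup could be sketched roughly as follows. This assumes the tilde-hetnam layout described in Marcin's message below (alias in the HETNAM het ID field, original code in columns 72-79); the function names are hypothetical. If the aliases do not start with a tilde, the startswith check would need to be dropped or replaced.

```python
def tilde_aliases(pdb_lines):
    """Map 3-character aliases to the full codes stored in HETNAM cols 72-79."""
    aliases = {}
    for line in pdb_lines:
        if not line.startswith("HETNAM"):
            continue
        het_id = line[11:14].strip()      # columns 12-14
        full_code = line[71:79].strip()   # columns 72-79 (empty on ordinary HETNAM lines)
        if het_id.startswith("~") and full_code:
            aliases[het_id] = full_code
    return aliases


def residue_name(atom_line, aliases):
    """Return the residue name from columns 18-20, expanded if it is an alias."""
    name = atom_line[17:20].strip()
    return aliases.get(name, name)


example = [
    "HETNAM     ~U6".ljust(71) + "A1LU6",
    "HETATM 2209 CL   ~U6 A 401      38.210  38.396  16.715  1.00 77.22          CL",
]
aliases = tilde_aliases(example)
print(aliases)                            # {'~U6': 'A1LU6'}
print(residue_name(example[1], aliases))  # A1LU6
```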

best wishes

Harry

> On 20 Nov 2025, at 14:51, Marcin Wojdyr 
> <[email protected]> wrote:
> 
> Thanks Paul,
> 
> BTW, I call it the tilde-hetnam extension of the PDB format:
> https://gemmi.readthedocs.io/en/stable/mol.html#tilde-hetnam
> "The tilde-hetnam extension addresses this issue: a long CCD code is
> substituted with a 3-character alias that starts with a tilde (~); the
> original code is stored in columns 72-79 of the HETNAM record."
> But I don't think any other project decided to implement it, although
> programs that use the gemmi library will handle such PDB files
> automatically.
> 
> Marcin
> 
> On Thu, Nov 20, 2025 at 3:26 PM Paul Bond
> <[email protected]> wrote:
>> 
>> Hi Harry,
>> 
>> If you need to produce PDB files from CIF files with 5-letter ligand codes 
>> you can use gemmi convert, e.g.:
>> 
>> gemmi convert 8XFM.cif 8XFM.pdb --shorten-tlc
>> 
>> The original has a ligand called A1LU6. In the PDB file it gets a HET and 
>> HETNAM entry:
>> 
>> HET    ~U6  A 401      25     real CCD code: A1LU6
>> HET    EDO  A 402       4
>> HET     ZN  A 403       1
>> HETNAM     ~U6                                                         A1LU6
>> 
>> Then in the atom lines it is referred to by the new ID:
>> 
>> HETATM 2209 CL   ~U6 A 401      38.210  38.396  16.715  1.00 77.22          CL
>> HETATM 2210  C17 ~U6 A 401      38.401  39.551  15.377  1.00 54.65           C
>> HETATM 2211  C16 ~U6 A 401      37.332  40.370  15.083  1.00 52.11           C
>> HETATM 2212  C15 ~U6 A 401      37.438  41.271  14.038  1.00 51.08           C
>> HETATM 2213  C14 ~U6 A 401      38.616  41.345  13.308  1.00 48.95           C
>> 
>> If you convert it back to CIF with gemmi:
>> 
>> gemmi convert 8XFM.pdb new.cif
>> 
>> Then the resulting mmCIF file just uses the original A1LU6 naming.
>> 
>> Cheers,
>> Paul
>> 
>> On Thu, 20 Nov 2025 at 12:54, Harry Powell 
>> <[email protected]> wrote:
>>> 
>>> Hi folks
>>> 
>>> The real answer (before I ask the question) is, of course, to use mmCIF. 
>>> But quite a few programs out there don’t read mmCIF files properly (if at 
>>> all), so I’ll ask the question anyway.
>>> 
>>> The venerable PDB format allows 3 columns (18-20) for the residue or ligand 
>>> name, and this was fine as long as chemical component codes were limited 
>>> to 3 characters - but now there are loads that have 5 characters 
>>> (e.g. A1B20 - see 
>>> https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/A1B20).
>>> 
>>> If I wanted to write a PDB format file for the ligand that had the 
>>> 5-character code, what would be the “best” way to do it (of course, there 
>>> are different definitions of “best”…)?
>>> 
>>> What dirty trick would cause the least damage (at least, to common 
>>> programs)?
>>> 
>>> Allowing the ligand name to spread over columns 16-20 *might* work, but 
>>> that would mean encroaching on the atom name column, and might cause 
>>> confusion between things like calcium (“CA ”) and carbons labelled with an 
>>> “A” (“ CA”).
>>> 
>>> Thoughts?
>>> 
>>> Harry
>>> ########################################################################
>>> 
>>> To unsubscribe from the CCP4BB list, click the following link:
>>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>>> 
>>> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing 
>>> list hosted by www.jiscmail.ac.uk, terms & conditions are available at 
>>> https://www.jiscmail.ac.uk/policyandsecurity/