Thanks Paul,

BTW, I call it the tilde-hetnam extension of the PDB format:
https://gemmi.readthedocs.io/en/stable/mol.html#tilde-hetnam

"The tilde-hetnam extension addresses this issue: a long CCD code is
substituted with a 3-character alias that starts with a tilde (~); the
original code is stored in columns 72-79 of the HETNAM record."

But I don't think any other project has decided to implement it.
Programs that use the gemmi library will handle such PDB files
automatically, though.
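
For a program that does not use gemmi, reading the aliases back could look
roughly like the sketch below. This is only an illustration, not gemmi's
implementation: the function name tilde_hetnam_aliases is made up, the hetID
position (columns 12-14) is taken from the standard PDB HETNAM layout, and the
sample record is built by hand rather than copied from a real file.

```python
def tilde_hetnam_aliases(pdb_lines):
    """Map 3-character tilde aliases to the long CCD codes they stand for."""
    aliases = {}
    for line in pdb_lines:
        if not line.startswith("HETNAM"):
            continue
        het_id = line[11:14]             # columns 12-14: hetID (standard PDB layout)
        long_code = line[71:79].strip()  # columns 72-79: original code (tilde-hetnam)
        # Only tilde aliases that actually carry a long code are of interest.
        if het_id.startswith("~") and long_code:
            aliases[het_id] = long_code
    return aliases


# Hand-built (hypothetical) HETNAM record: pad to column 71, then put the
# original code at column 72.
record = "HETNAM     ~U6 A1LU6".ljust(71) + "A1LU6"
print(tilde_hetnam_aliases([record]))
```

Ordinary HETNAM records (no tilde, nothing in columns 72-79) are skipped, so
files without long codes simply give an empty mapping.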

Marcin

On Thu, Nov 20, 2025 at 3:26 PM Paul Bond <[email protected]> wrote:
>
> Hi Harry,
>
> If you need to produce PDB files from CIF files with 5-letter ligand codes
> you can use gemmi convert, e.g.:
>
> gemmi convert 8XFM.cif 8XFM.pdb --shorten-tlc
>
> The original has a ligand called A1LU6. In the PDB file it gets a HET and
> HETNAM entry:
>
> HET ~U6 A 401 25 real CCD code: A1LU6
> HET EDO A 402 4
> HET ZN A 403 1
> HETNAM ~U6 A1LU6
>
> Then in the atom lines it is referred to by the new ID:
>
> HETATM 2209 CL ~U6 A 401 38.210 38.396 16.715 1.00 77.22 CL
> HETATM 2210 C17 ~U6 A 401 38.401 39.551 15.377 1.00 54.65 C
> HETATM 2211 C16 ~U6 A 401 37.332 40.370 15.083 1.00 52.11 C
> HETATM 2212 C15 ~U6 A 401 37.438 41.271 14.038 1.00 51.08 C
> HETATM 2213 C14 ~U6 A 401 38.616 41.345 13.308 1.00 48.95 C
>
> If you convert it back to CIF with gemmi:
>
> gemmi convert 8XFM.pdb new.cif
>
> Then the resulting mmCIF file just uses the original A1LU6 naming.
>
> Cheers,
> Paul
>
> On Thu, 20 Nov 2025 at 12:54, Harry Powell
> <[email protected]> wrote:
>>
>> Hi folks
>>
>> The real answer (before I ask the question) is, of course, to use mmCIF. But
>> quite a few programs out there don’t read mmCIF files properly (if at all),
>> so I’ll ask the question anyway.
>>
>> The venerable PDB format allows 3 columns (18-20) for the residue or ligand
>> name, and this was fine as long as there were only chemical components
>> allowed with 3 characters - but now there are loads that have 5 characters
>> (e.g. A1B20 - see
>> https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/A1B20).
>>
>> If I wanted to write a PDB format file for the ligand that had the
>> 5-character code, what would be the “best” way to do it (of course, there
>> are different definitions of “best”…)?
>>
>> What dirty trick would cause the least damage (at least, to common programs)?
>>
>> Allowing the ligand name to spread over columns 16-20 *might* work, but that
>> would mean encroaching on the atom name column, and might cause confusion
>> between things like calcium (“CA “) and carbons labelled with an “A” (“ CA”).
>>
>> Thoughts?
>>
>> Harry
>> ########################################################################
>>
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>>
>> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing
>> list hosted by www.jiscmail.ac.uk, terms & conditions are available at
>> https://www.jiscmail.ac.uk/policyandsecurity/
