On 3/21/13 4:43 PM, Mark Abraham wrote:
On Thu, Mar 21, 2013 at 4:30 PM, Anna MARABOTTI <amarabo...@unisa.it> wrote:

Dear Mark,

thank you for your message. I'm happy to be on the right track; unfortunately the end point seems to be very far away...

I tried to make the CFY hydrogens and the protein hydrogens all match the aminoacids.rtp entries, in order to avoid dealing with aminoacids.hdb. This is what I did:

- starting from the pdb file of the protein, I removed the CFY entry (prot_noCFY.pdb)
- I used pdb2gmx to add H to the protein only:
  pdb2gmx -f prot_noCFY.pdb -o prot_noCFY_H.pdb -p topol.top
- I inserted CFY_H.pdb (obtained in a previous step, in which I added H with Pymol to the protein, including CFY) into prot_noCFY_H.pdb, obtaining prot_CFY_H.pdb. In this way, the H atoms bound to "regular" residues have been added using Amber99SB, so they are compatible with this ff, and the atoms of CFY (previously added with Pymol) follow the same naming convention as in aminoacids.rtp (which I edited using the atom types, charges etc. calculated with Antechamber on this molecule coming from Pymol). Obviously, the atom numbering is not sequential: the last atom of V63 (the last "regular" residue before CFY) is numbered 938, the first atom of H68 (the first "regular" residue after CFY) is numbered 939, and the atoms of CFY66 are numbered from 1 to 70. Moreover, since the sequence of atoms in aminoacids.rtp is not the same as in the coordinates of CFY (I adapted the sequence of atoms following the format of the other residues in aminoacids.rtp), the numbering of CFY in prot_CFY_H.pdb is not ordered (1-2-3-....-69-70) but disordered (19-54-20-55...49-50-24-25).

Seems fine. pdb2gmx is mostly about atom/residue naming. grompp is mostly about atom/residue/moleculetype ordering.

- At this stage, I used pdb2gmx again to create the topol.top file with all coordinates correct:
  pdb2gmx -f prot_CFY_H.pdb -o prot_complete.gro -p topol.top
  (selecting the amber99sb forcefield and tip3p for water, as the recommended option)

This is the error message from pdb2gmx:

Read 'FLUORESCENT PROTEIN', 3346 atoms
Analyzing pdb file
Splitting PDB chains based on TER records or changing chain id.
There are 1 chains and 0 blocks of water and 218 residues with 3346 atoms

  chain  #res #atoms
  1 'A'   213   3346

I'd be concerned about the difference in residue count here, but 4.5.4 is so old I've no idea whose fault this is.

All occupancies are one
Opening force field file ./amber99sb.ff/atomtypes.atp
Atomtype 1
Reading residue database... (amber99sb)
Opening force field file ./amber99sb.ff/aminoacids.rtp
Residue 94
Sorting it all out...
Opening force field file ./amber99sb.ff/dna.rtp
Residue 110
Sorting it all out...
Opening force field file ./amber99sb.ff/rna.rtp
Residue 126
Sorting it all out...
Opening force field file ./amber99sb.ff/aminoacids.hdb
Opening force field file ./amber99sb.ff/dna.hdb
Opening force field file ./amber99sb.ff/rna.hdb
Opening force field file ./amber99sb.ff/aminoacids.n.tdb
Opening force field file ./amber99sb.ff/aminoacids.c.tdb
Processing chain 1 'A' (3346 atoms, 213 residues)
There are 327 donors and 319 acceptors
There are 539 hydrogen bonds
Will use HISE for residue 22
Will use HISD for residue 38
Will use HISE for residue 62
Will use HISE for residue 68
Will use HISD for residue 109
Will use HISE for residue 119
Will use HISE for residue 172
Will use HISH for residue 193
Will use HISH for residue 197
Will use HISE for residue 217
Identified residue SER3 as a starting terminus.
Identified residue SER218 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Special Atom Distance matrix:
                    MET9   MET11   MET15   HIS22   HIS38   MET41   MET47
                   SD110   SD149   SD232  NE2317  NE2549   SD596   SD700
   MET11   SD149   0.807
   MET15   SD232   2.279   1.627
   HIS22  NE2317   3.707   2.983   1.466
   HIS38  NE2549   1.401   0.928   2.127   3.254
   MET41   SD596   1.458   0.665   1.144   2.384   1.001
   MET47   SD700   3.059   2.324   0.995   0.801   2.656   1.761
   MET53   SD777   2.786   1.999   0.990   1.171   2.160   1.373   0.603
   HIS62  NE2917   2.340   1.733   0.833   1.797   1.988   1.236   1.583
   HIS68 NE21002   0.884   0.597   1.466   2.916   1.356   0.885   2.347
  HIS109 NE21638   2.061   1.886   1.380   2.614   2.661   1.862   2.279
  HIS119 NE21803   1.459   0.967   0.923   2.372   1.617   0.812   1.870
  MET135  SD2041   3.480   2.751   1.316   0.606   2.919   2.121   0.993
  MET162  SD2439   2.521   1.976   1.656   2.412   1.855   1.543   2.264
  HIS172 NE22588   3.632   2.949   1.894   1.657   2.872   2.338   1.945
  CYS174  SG2623   2.968   2.372   1.452   1.861   2.428   1.848   1.924
  MET189  SD2891   2.167   2.379   2.736   4.000   2.754   2.569   3.722
  HIS193 NE22942   2.003   2.001   2.490   3.686   2.049   2.075   3.396
  HIS197 NE23011   2.012   1.634   1.830   2.896   1.554   1.426   2.614
  HIS217 NE23329   2.545   2.376   2.831   3.805   2.039   2.305   3.575
                   MET53   HIS62   HIS68  HIS109  HIS119  MET135  MET162
                   SD777  NE2917 NE21002 NE21638 NE21803  SD2041  SD2439
   HIS62  NE2917   1.363
   HIS68 NE21002   2.107   1.482
  HIS109 NE21638   2.365   1.568   1.372
  HIS119 NE21803   1.688   0.976   0.584   1.078
  MET135  SD2041   1.057   1.365   2.661   2.490   2.119
  MET162  SD2439   1.878   0.871   1.805   2.246   1.520   1.861
  HIS172 NE22588   1.721   1.401   2.829   2.860   2.359   1.067   1.342
  CYS174  SG2623   1.694   0.725   2.140   2.152   1.681   1.297   0.745
  MET189  SD2891   3.547   2.310   1.858   1.893   1.980   3.627   2.290
  HIS193 NE22942   3.076   1.890   1.639   2.197   1.760   3.221   1.547
  HIS197 NE23011   2.229   1.149   1.407   2.078   1.323   2.401   0.676
  HIS217 NE23329   3.146   2.112   2.205   2.935   2.272   3.263   1.402
                  HIS172  CYS174  MET189  HIS193  HIS197
                 NE22588  SG2623  SD2891 NE22942 NE23011
  CYS174  SG2623   0.826
  MET189  SD2891   3.417   2.599
  HIS193 NE22942   2.831   2.079   1.020
  HIS197 NE23011   2.011   1.324   1.766   0.939
  HIS217 NE23329   2.629   2.068   1.936   0.946   1.003
Opening force field file ./amber99sb.ff/aminoacids.arn
Opening force field file ./amber99sb.ff/dna.arn
Opening force field file ./amber99sb.ff/rna.arn
Checking for duplicate atoms....
Now there are 3345 atoms. Deleted 1 duplicates.

That also looks suspicious.

Now there are 213 residues with 3345 atoms
Making bonds...
Warning: Long Bond (988-989 = 0.453624 nm)

That seems like it might be a peptide bond bridging a "gap" where pdb2gmx was unable to recognize the intervening content as a peptide residue.

WARNING: atom O1 is missing in residue CFY 66 in the pdb file

-------------------------------------------------------
Program pdb2gmx_d, VERSION 4.5.4
Source code file: pdb2top.c, line: 1463

Fatal error:
There were 1 missing atoms in molecule Protein_chain_A, if you want to use this incomplete topology anyhow, use the option -missing
For more information and tips for troubleshooting, please check the GROMACS website at http://www.gromacs.org/Documentation/Errors

The strange thing is that I checked for this error, but atom O1 in residue CFY66 is present BOTH in the starting .pdb file (the one I used for pdb2gmx) AND in the aminoacids.rtp file!!!! I checked 4 or 5 times, every time erasing the old file and checking the file IMMEDIATELY BEFORE submitting it to pdb2gmx. All atoms present in aminoacids.rtp for the CFY residue are also present in the .pdb file and vice versa, and I am sure I did not make the stupid error of naming the atom 01 (zero-one) instead of O1 (o-one). I suspect that this atom is the one that is deleted because it is recognized as a duplicate, but I'm not sure about it and I don't know how to check it. I am sure there are no duplicated atoms in CFY. I feel like this is a "fake" error message (i.e. there is an error in my files, but it is not the one reported in the message: probably a problem occurs around this atom, but not exactly ON this atom). However, I am not able to find errors.

Hmm that seems weird.
Justin's theory sounds plausible, but I haven't seen someone stumble on that before. Also plausible is that pdb2gmx thinks your CFY is a disconnected part of the chain and needs terminating (which might happen with an oxygen named O1?).
I stumbled across it when working with the GFP chromophore a while back :)

http://redmine.gromacs.org/issues/567

Still technically an open "bug," though I agree that it's really expected behavior, provided one knows how pdb2gmx works, which involves lots of steps, of course.

-Justin
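
For the "I don't know how to check it" part above, here is a minimal Python sketch (not part of GROMACS) that lists the atom names of one residue in a PDB file and reports any name that occurs more than once. It only assumes the standard fixed-column PDB layout (atom name in columns 13-16, residue name in 18-20, residue number in 23-26); the helper pdb_atom_line is hypothetical and exists only to build demo input. Whether pdb2gmx's own duplicate check uses exactly the same criterion is not something this sketch establishes.

```python
from collections import Counter


def pdb_atom_line(serial, name, resname, chain, resnum, x=0.0, y=0.0, z=0.0):
    """Hypothetical demo helper: build one fixed-column ATOM record."""
    return "ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, resname, chain, resnum, x, y, z)


def atom_names(pdb_lines, resname, resnum):
    """Return the atom names of one residue, in file order."""
    names = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed columns: residue name 18-20, residue number 23-26.
        if line[17:20].strip() == resname and int(line[22:26]) == resnum:
            names.append(line[12:16].strip())  # atom name, columns 13-16
    return names


def duplicate_names(names):
    """Return the sorted atom names that appear more than once."""
    return sorted(n for n, count in Counter(names).items() if count > 1)


if __name__ == "__main__":
    # Made-up records for illustration only; read the real file instead, e.g.
    # lines = open("prot_CFY_H.pdb").read().splitlines()
    lines = [
        pdb_atom_line(1, "N", "CFY", "A", 66),
        pdb_atom_line(2, "O1", "CFY", "A", 66),
        pdb_atom_line(3, "O1", "CFY", "A", 66),
    ]
    print(duplicate_names(atom_names(lines, "CFY", 66)))
```

Names are compared after stripping whitespace, so two records that differ only in how the name is aligned inside the four-character field are reported as duplicates too.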
It's possible there's buggy behaviour here that has been fixed in the two years since that code was released. There certainly has been an upgrade of the "is this really a new chain" machinery. Unless you have a strong scientific reason to keep 4.5.4, I'd switch to 4.6.1 (or 4.5.6 if you really have to keep 4.5). If Justin's fix doesn't work, and you have problems with a more recent version, then we can look closer.

BTW the "long bond" of the other warning message does not involve residue CFY.

Yeah, but my bet is those atoms are the C-terminus and N-terminus of the fragments that should form peptide bonds to CFY.

Mark

Any help is welcome.
Thank you so much.
Anna

On 21.03.2013 12:00, gmx-users-requ...@gromacs.org wrote:

Dear gmx-users,

I have been trying to solve this problem for about two weeks and I can't, so I'm asking for your help. I want to do some MD simulations on a protein of the green fluorescent protein family. This protein, as you know, has a chromophore (CFY) derived from four residues of the protein (F64-C65-Y66-G67) and covalently bound to the rest of the protein chain. How can this object be parametrized, since it is not recognized by pdb2gmx?
I looked at the gmx-users list and the suggestion was to create a new entry in the .rtp file of the selected forcefield.

Indeed, this kind of problem is most easily solved by making a new "residue" that contains the whole chromophore, such that it links to its neighbours with normal peptide links.

------------------------------

Message: 5
Date: Thu, 21 Mar 2013 11:46:12 +0100
From: Mark Abraham <mark.j.abra...@gmail.com [2]>
Subject: Re: [gmx-users] help with chromophore of a GFP
To: Discussion list for GROMACS users <gmx-users@gromacs.org [3]>
Message-ID: <camnumasicymgivb_x5sy1yb44th8vknioqvhzdqq-tam9tn...@mail.gmail.com [4]>
Content-Type: text/plain; charset=ISO-8859-1

On Wed, Mar 20, 2013 at 6:01 PM, Anna MARABOTTI <amarabo...@unisa.it [5]> wrote:

I decided to use Amber99SB since it seemed the best for my purpose, then I started trying to parameterize the chromophore. This is what I did:

* I used Pymol to add H to my pdb file, since I want to use an all-H forcefield and since Antechamber (see below) does not work without H
* I extracted the segment V63-CFY-H68 from my .pdb file. I did this since, when I extracted CFY only, I had problems with the termini
* Following the Antechamber tutorial, I used Antechamber (using the traditional Amber force field, not GAFF) to calculate charges and to assign atom types to this segment.
* I used these calculated parameters to add the CFY residue to aminoacids.rtp in the amber99sb.ff directory.
* I also tried to modify aminoacids.hdb, but since it seemed too complicated to me, I decided to keep it unchanged and to give pdb2gmx the protein with H already present
* No need to add new atom/bond types to ffbonded.itp and ffnonbonded.itp: they all seem to be present. Since CFY is bound to the rest of the protein with common peptide bonds, I did not change specbond.dat either.
* I added CFY to residuetypes.dat with the specification "Protein"

In my opinion, all was ready to go, instead...
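
Since the steps above depend on the new CFY entry in aminoacids.rtp and the coordinate file agreeing on atom names, a small cross-check can help. The following is a rough Python sketch, not a GROMACS tool. It assumes the usual .rtp layout (a "[ RESNAME ]" header followed by "[ atoms ]", "[ bonds ]" and similar subsections, with the atom name as the first field of each atom line and ";" starting a comment) and the fixed-column PDB layout. The function names and the sample entry in the demo are made up for illustration.

```python
# Assumed names of sub-blocks inside an .rtp file; any other bracketed
# header is treated as the start of a new residue entry.
RTP_SUBSECTIONS = {"bondedtypes", "atoms", "bonds", "angles", "dihedrals",
                   "impropers", "exclusions", "cmap"}


def rtp_atom_names(rtp_text, resname):
    """Return the atom names listed under [ atoms ] for one residue entry."""
    names = []
    in_residue = False
    in_atoms = False
    for raw in rtp_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("["):
            header = line.strip("[] \t")
            if header in RTP_SUBSECTIONS:
                in_atoms = in_residue and header == "atoms"
            else:
                in_residue = header == resname
                in_atoms = False
            continue
        if in_atoms:
            names.append(line.split()[0])
    return names


def pdb_atom_names(pdb_text, resname, resnum):
    """Return the atom names of one residue from fixed-column PDB records."""
    names = []
    for line in pdb_text.splitlines():
        if (line.startswith(("ATOM", "HETATM"))
                and line[17:20].strip() == resname
                and int(line[22:26]) == resnum):
            names.append(line[12:16].strip())
    return names


def compare_names(rtp_names, pdb_names):
    """Return (names only in the rtp entry, names only in the PDB residue)."""
    only_rtp = sorted(set(rtp_names) - set(pdb_names))
    only_pdb = sorted(set(pdb_names) - set(rtp_names))
    return only_rtp, only_pdb


if __name__ == "__main__":
    # Hypothetical two-atom entry; types and charges are placeholders.
    rtp = "[ CFY ]\n [ atoms ]\n   N   N  -0.4157  1\n   O1  O  -0.5000  2\n"
    pdb = "ATOM  %5d %-4s %3s %s%4d" % (1, "N", "CFY", "A", 66)
    print(compare_names(rtp_atom_names(rtp, "CFY"),
                        pdb_atom_names(pdb, "CFY", 66)))
```

With the real files, the two arguments would be the contents of aminoacids.rtp and of the coordinate file given to pdb2gmx; two empty lists mean the name sets agree, which says nothing about duplicates or ordering.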
When I ran pdb2gmx on my protein with H added by PyMol, I immediately got an error:

Fatal error:
Atom H01 in residue SER 3 was not found in rtp entry NSER with 13 atoms while sorting atoms.

For a hydrogen, this can be a different protonation state, or it might have had a different number in the PDB file and was rebuilt (it might for instance have been H3, and we only expected H1 & H2). Note that hydrogens might have been added to the entry for the N-terminus. Remove this hydrogen or choose a different protonation state to solve it. Option -ignh will ignore all hydrogens in the input.
For more information and tips for troubleshooting, please check the GROMACS website at http://www.gromacs.org/Documentation/Errors [1]

From this error I understand that:

* the code for H in PyMol is different from the code for H in Amber (read from aminoacids.rtp); in order to correct this error, I should add -ignh in order to ignore H in input.

pdb2gmx has to be able to make sense of the atom naming. There are lots of different conventions for how to name atoms, particularly hydrogen atoms. pdb2gmx can't possibly encode the logic to convert all of those conventions. So the path of least resistance can be to ignore hydrogens and regenerate them according to the generation rules.

However, you can just rename them in the input file so that pdb2gmx understands your meaning. The NSER entry in the .rtp file shows you the names pdb2gmx expects. If you edit the names of those hydrogen atoms (probably H01, H02, H03) in your input coordinate file accordingly (to H1, H2, H3), things will be fine.
Be sure you don't break the required column formatting of the coordinate file!

Links:
------
[1] http://www.gromacs.org/Documentation/Errors
[2] mailto:mark.j.abra...@gmail.com
[3] mailto:gmx-users@gromacs.org
[4] mailto:camnumasicymgivb_x5sy1yb44th8vknioqvhzdqq-tam9tn...@mail.gmail.com
[5] mailto:amarabo...@unisa.it

--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
* Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org.
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
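
On the hydrogen renaming and the column-formatting caveat discussed above, one way to rename atoms without shifting any columns is to overwrite only the four-character name field (columns 13-16) of the matching ATOM records. This is a minimal Python sketch under that assumption. The function name and file names are hypothetical, and the H01/H02/H03 to H1/H2/H3 mapping is only the "probably" guess from the thread, to be confirmed against the NSER entry in the .rtp file.

```python
def rename_atoms(pdb_lines, resname, resnum, mapping):
    """Rename atoms of one residue in place, keeping every column position."""
    out = []
    for line in pdb_lines:
        if (line.startswith(("ATOM", "HETATM"))
                and line[17:20].strip() == resname
                and int(line[22:26]) == resnum):
            old = line[12:16].strip()
            if old in mapping:
                new = mapping[old]
                if not 1 <= len(new) <= 4:
                    raise ValueError("atom name must be 1-4 characters: %r" % new)
                # Names shorter than 4 characters start in column 14, the
                # common convention for one-letter elements.
                field = new if len(new) == 4 else " " + new.ljust(3)
                line = line[:12] + field + line[16:]
        out.append(line)
    return out


if __name__ == "__main__":
    # File names below are placeholders, not taken from the thread.
    with open("input.pdb") as handle:
        lines = handle.read().splitlines()
    fixed = rename_atoms(lines, "SER", 3, {"H01": "H1", "H02": "H2", "H03": "H3"})
    with open("renamed.pdb", "w") as handle:
        handle.write("\n".join(fixed) + "\n")
```

Because only characters 13-16 of each matching line are replaced, the coordinates and all other fields stay in their original columns.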
--
========================================

Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================