Re: [gmx-users] help with chromophore of a GFP

Justin Lemkul Thu, 21 Mar 2013 13:53:38 -0700


On 3/21/13 4:43 PM, Mark Abraham wrote:

On Thu, Mar 21, 2013 at 4:30 PM, Anna MARABOTTI <amarabo...@unisa.it> wrote:



Dear Mark,

thank you for your message. I'm happy to be on the right track;
unfortunately the end point seems to be very far away...


I tried to make sure that the CFY hydrogens and the protein hydrogens
all match the aminoacids.rtp entries, in order to avoid dealing with
aminoacids.hdb. This is what I did:

- starting from the pdb file of the protein, I removed the CFY entry
(prot_noCFY.pdb)

- I used pdb2gmx to add H to the protein only:
pdb2gmx -f prot_noCFY.pdb -o prot_noCFY_H.pdb -p topol.top

- I inserted CFY_H.pdb (obtained with PyMOL in a previous step, in
which I added H with PyMOL to the whole protein, including CFY) into
prot_noCFY_H.pdb, obtaining prot_CFY_H.pdb.

In this way, the H atoms bound to "regular" residues have been added
using Amber99SB and are therefore compatible with this ff, while the
atoms of CFY (previously added with PyMOL) follow the same naming
convention as the CFY entry in aminoacids.rtp (which I edited using the
atom types, charges etc. calculated with Antechamber on this molecule as
it came from PyMOL). Obviously, the atom numbering is not sequential:
the last atom of V63 (the last "regular" residue before CFY) is numbered
938, the first atom of H68 (the first "regular" residue after CFY) is
numbered 939, and the atoms of CFY66 are numbered from 1 to 70.
Moreover, since the order of atoms in aminoacids.rtp is not the same as
in the coordinates of CFY (I arranged the atoms following the format of
the other residues in aminoacids.rtp), the numbering of CFY in
prot_CFY_H.pdb is not ordered (1-2-3-....-69-70) but scrambled
(19-54-20-55...49-50-24-25).


Seems fine. pdb2gmx is mostly about atom/residue naming. grompp is mostly
about atom/residue/moleculetype ordering.

- At this stage, I used pdb2gmx again to create the topol.top file with
all coordinates correct:

pdb2gmx -f prot_CFY_H.pdb -o prot_complete.gro -p topol.top

(selecting the amber99sb force field and tip3p for water, as the
recommended option)

This is the error message from pdb2gmx:

Read 'FLUORESCENT PROTEIN', 3346 atoms
Analyzing pdb file
Splitting PDB chains based on TER records or changing chain id.
There are 1 chains and 0 blocks of water and 218 residues with 3346 atoms

  chain  #res #atoms
  1 'A'   213   3346


I'd be concerned about the difference in residue count here, but 4.5.4 is
so old I've no idea whose fault this is.

All occupancies are one
Opening force field file ./amber99sb.ff/atomtypes.atp
Atomtype 1
Reading residue database... (amber99sb)
Opening force field file ./amber99sb.ff/aminoacids.rtp
Residue 94
Sorting it all out...
Opening force field file ./amber99sb.ff/dna.rtp
Residue 110
Sorting it all out...
Opening force field file ./amber99sb.ff/rna.rtp
Residue 126
Sorting it all out...
Opening force field file ./amber99sb.ff/aminoacids.hdb
Opening force field file ./amber99sb.ff/dna.hdb
Opening force field file ./amber99sb.ff/rna.hdb
Opening force field file ./amber99sb.ff/aminoacids.n.tdb
Opening force field file ./amber99sb.ff/aminoacids.c.tdb

Processing chain 1 'A' (3346 atoms, 213 residues)
There are 327 donors and 319 acceptors
There are 539 hydrogen bonds
Will use HISE for residue 22
Will use HISD for residue 38
Will use HISE for residue 62
Will use HISE for residue 68
Will use HISD for residue 109
Will use HISE for residue 119
Will use HISE for residue 172
Will use HISH for residue 193
Will use HISH for residue 197
Will use HISE for residue 217
Identified residue SER3 as a starting terminus.
Identified residue SER218 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Special Atom Distance matrix:
                    MET9   MET11   MET15   HIS22   HIS38   MET41   MET47
                   SD110   SD149   SD232  NE2317  NE2549   SD596   SD700
   MET11   SD149   0.807
   MET15   SD232   2.279   1.627
   HIS22  NE2317   3.707   2.983   1.466
   HIS38  NE2549   1.401   0.928   2.127   3.254
   MET41   SD596   1.458   0.665   1.144   2.384   1.001
   MET47   SD700   3.059   2.324   0.995   0.801   2.656   1.761
   MET53   SD777   2.786   1.999   0.990   1.171   2.160   1.373   0.603
   HIS62  NE2917   2.340   1.733   0.833   1.797   1.988   1.236   1.583
   HIS68 NE21002   0.884   0.597   1.466   2.916   1.356   0.885   2.347
  HIS109 NE21638   2.061   1.886   1.380   2.614   2.661   1.862   2.279
  HIS119 NE21803   1.459   0.967   0.923   2.372   1.617   0.812   1.870
  MET135  SD2041   3.480   2.751   1.316   0.606   2.919   2.121   0.993
  MET162  SD2439   2.521   1.976   1.656   2.412   1.855   1.543   2.264
  HIS172 NE22588   3.632   2.949   1.894   1.657   2.872   2.338   1.945
  CYS174  SG2623   2.968   2.372   1.452   1.861   2.428   1.848   1.924
  MET189  SD2891   2.167   2.379   2.736   4.000   2.754   2.569   3.722
  HIS193 NE22942   2.003   2.001   2.490   3.686   2.049   2.075   3.396
  HIS197 NE23011   2.012   1.634   1.830   2.896   1.554   1.426   2.614
  HIS217 NE23329   2.545   2.376   2.831   3.805   2.039   2.305   3.575
                   MET53   HIS62   HIS68  HIS109  HIS119  MET135  MET162
                   SD777  NE2917 NE21002 NE21638 NE21803  SD2041  SD2439
   HIS62  NE2917   1.363
   HIS68 NE21002   2.107   1.482
  HIS109 NE21638   2.365   1.568   1.372
  HIS119 NE21803   1.688   0.976   0.584   1.078
  MET135  SD2041   1.057   1.365   2.661   2.490   2.119
  MET162  SD2439   1.878   0.871   1.805   2.246   1.520   1.861
  HIS172 NE22588   1.721   1.401   2.829   2.860   2.359   1.067   1.342
  CYS174  SG2623   1.694   0.725   2.140   2.152   1.681   1.297   0.745
  MET189  SD2891   3.547   2.310   1.858   1.893   1.980   3.627   2.290
  HIS193 NE22942   3.076   1.890   1.639   2.197   1.760   3.221   1.547
  HIS197 NE23011   2.229   1.149   1.407   2.078   1.323   2.401   0.676
  HIS217 NE23329   3.146   2.112   2.205   2.935   2.272   3.263   1.402
                  HIS172  CYS174  MET189  HIS193  HIS197
                 NE22588  SG2623  SD2891 NE22942 NE23011
  CYS174  SG2623   0.826
  MET189  SD2891   3.417   2.599
  HIS193 NE22942   2.831   2.079   1.020
  HIS197 NE23011   2.011   1.324   1.766   0.939
  HIS217 NE23329   2.629   2.068   1.936   0.946   1.003
Opening force field file ./amber99sb.ff/aminoacids.arn
Opening force field file ./amber99sb.ff/dna.arn
Opening force field file ./amber99sb.ff/rna.arn
Checking for duplicate atoms....
Now there are 3345 atoms. Deleted 1 duplicates.


That also looks suspicious.

Now there are 213 residues with 3345 atoms
Making bonds...
Warning: Long Bond (988-989 = 0.453624 nm)


That seems like it might be a peptide bond bridging a "gap" where pdb2gmx
was unable to recognize the intervening content as a peptide residue.


WARNING: atom O1 is missing in residue CFY 66 in the pdb file

-------------------------------------------------------
Program pdb2gmx_d, VERSION 4.5.4
Source code file: pdb2top.c, line: 1463

Fatal error:
There were 1 missing atoms in molecule Protein_chain_A, if you want to
use this incomplete topology anyhow, use the option -missing
For more information and tips for troubleshooting, please check the
GROMACS website at http://www.gromacs.org/Documentation/Errors

The strange thing is that I checked for this error, and atom O1 of
residue CFY66 is present BOTH in the starting .pdb file (the one I gave
to pdb2gmx) AND in the aminoacids.rtp file! I checked 4 or 5 times, each
time deleting the old file and checking the new one IMMEDIATELY BEFORE
submitting it to pdb2gmx. All atoms present in aminoacids.rtp for the
CFY residue are also present in the .pdb file and vice versa, and I am
sure I did not make the stupid error of naming the atom 01 (zero-one)
instead of O1 (o-one).
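
A check like this can also be done mechanically rather than by eye. The following is only a minimal sketch under stated assumptions: the helper names (rtp_atom_names, pdb_atom_names, compare_names) are made up for illustration, the parser assumes the usual "[ RESIDUE ]" / "[ atoms ]" layout of an .rtp file and fixed-column PDB records, and it does not reproduce any renaming that pdb2gmx applies internally.

```python
# Sub-sections that can appear inside one residue entry of an .rtp file.
SUBSECTIONS = {"atoms", "bonds", "angles", "dihedrals",
               "impropers", "cmap", "exclusions"}


def rtp_atom_names(rtp_text, residue):
    """Return the atom names listed under [ atoms ] of one .rtp entry."""
    names = []
    in_residue = in_atoms = False
    for raw in rtp_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line:
            continue
        if line.startswith("["):
            section = line.strip("[] \t")
            if section == residue:
                in_residue, in_atoms = True, False
            elif section == "atoms" and in_residue:
                in_atoms = True
            elif section in SUBSECTIONS:
                in_atoms = False
            else:
                # Any other header starts a different top-level entry.
                in_residue = in_atoms = False
            continue
        if in_atoms:
            names.append(line.split()[0])
    return names


def pdb_atom_names(pdb_text, resname, resseq):
    """Return atom names (PDB columns 13-16) of one residue."""
    names = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            if line[17:20].strip() == resname and int(line[22:26]) == resseq:
                names.append(line[12:16].strip())
    return names


def compare_names(rtp_names, pdb_names):
    """Return (names only in the .rtp entry, names only in the PDB residue)."""
    missing = [n for n in rtp_names if n not in pdb_names]
    extra = [n for n in pdb_names if n not in rtp_names]
    return missing, extra


# Example usage with the file names from this thread (paths are assumptions):
# rtp = open("amber99sb.ff/aminoacids.rtp").read()
# pdb = open("prot_CFY_H.pdb").read()
# missing, extra = compare_names(rtp_atom_names(rtp, "CFY"),
#                                pdb_atom_names(pdb, "CFY", 66))
# print([repr(n) for n in missing], [repr(n) for n in extra])
```

Printing the names with repr() makes a zero/letter-O mix-up or stray whitespace visible. If both lists come back empty, the names agree and the problem most likely lies elsewhere in the pdb2gmx processing.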

I suspect that this atom is the one that was deleted because it was
recognized as a duplicate, but I'm not sure about it and I don't know
how to check it. I am sure there are no duplicated atoms in CFY.
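
One rough way to look for such duplicates in the input file is sketched below. It assumes that a "duplicate" means the same atom name occurring twice within the same residue (same chain, residue number and residue name); the exact criterion pdb2gmx uses may differ, and the function name duplicate_atoms is hypothetical.

```python
from collections import Counter


def duplicate_atoms(pdb_text):
    """List (chain, resSeq, resName, atomName) keys that occur more than once."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            key = (
                line[21:22],          # chain identifier (column 22)
                line[22:27].strip(),  # residue number plus insertion code
                line[17:20].strip(),  # residue name
                line[12:16].strip(),  # atom name
            )
            counts[key] += 1
    # Counter keeps first-seen order, so results follow the file order.
    return [key for key, n in counts.items() if n > 1]


# Example usage (file name taken from this thread):
# for key in duplicate_atoms(open("prot_CFY_H.pdb").read()):
#     print(key)
```

Running it over the whole file, not only over CFY, also shows whether the deleted atom belongs to a different residue altogether.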


I feel this is a "fake" error message (i.e. there is an error in my
files, but it is not the one reported in the message: probably a problem
occurs around this atom, but not exactly ON this atom). However, I am
not able to find any errors.


Hmm that seems weird. Justin's theory sounds plausible, but I haven't seen
someone stumble on that before. Also plausible is that pdb2gmx thinks your
CFY is a disconnected part of the chain and needs terminating (which might
happen with an oxygen named O1?).


I stumbled across it when working with the GFP chromophore a while back :)

http://redmine.gromacs.org/issues/567

Still technically an open "bug," though I agree that it's really expected
behavior, provided one knows how pdb2gmx works, which involves lots of
steps, of course.


-Justin

It's possible there's buggy behaviour here that has been fixed in the two
years since that code was released. There certainly has been an upgrade of
the "is this really a new chain" machinery. Unless you have a strong
scientific reason to keep 4.5.4, I'd switch to 4.6.1 (or 4.5.6 if you
really have to keep 4.5). If Justin's fix doesn't work, and you have
problems with a more recent version, then we can look closer.

BTW the "long bond" in the other warning message does not involve
residue CFY.


Yeah, but my bet is those atoms are the C-terminus and N-terminus of the
fragments that should form peptide bonds to CFY.

Mark

Any help is welcome.

Thank you so much.

Anna

On 21.03.2013 12:00, gmx-users-requ...@gromacs.org wrote:

Dear gmx-users, I have been trying to solve this problem for about two
weeks without success, so I'm asking for your help. I want to run some
MD simulations on a protein of the green fluorescent protein family.
This protein, as you know, has a chromophore (CFY) derived from four
residues of the protein (F64-C65-Y66-G67) and covalently bound to the
rest of the protein chain. How can I parametrize this object, since it
is not recognized by pdb2gmx? I looked at the gmx-users list and the
suggestion was to create a new entry in the .rtp file of the selected
force field.


Indeed, this kind of problem is most easily solved by making a new
"residue" that contains the whole chromophore, such that it links to its
neighbours with normal peptide links.

------------------------------

Message: 5
Date: Thu, 21 Mar 2013 11:46:12 +0100
From: Mark Abraham <mark.j.abra...@gmail.com [2]>
Subject: Re: [gmx-users] help with chromophore of a GFP
To: Discussion list for GROMACS users <gmx-users@gromacs.org [3]>
Message-ID:
<camnumasicymgivb_x5sy1yb44th8vknioqvhzdqq-tam9tn...@mail.gmail.com [4]>
Content-Type: text/plain; charset=ISO-8859-1

On Wed, Mar 20, 2013 at 6:01 PM, Anna MARABOTTI <amarabo...@unisa.it [5]> wrote:

decided to use Amber99SB since it seemed the best for my purpose, then I
started trying to parameterize it. This is what I did:

* I used PyMOL to add H to my pdb file, since I want to use an all-H
  force field and since Antechamber (see below) does not work without H.
* I extracted the segment V63-CFY-H68 from my .pdb file. I did this
  since, when I extracted CFY only, I had problems with the termini.
* Following the Antechamber tutorial, I used Antechamber (using the
  traditional Amber force field, not GAFF) to calculate charges and to
  assign atom types to this segment.
* I used these calculated parameters to add the CFY residue to
  aminoacids.rtp in the amber99sb.ff directory.
* I also tried to modify aminoacids.hdb, but since it seemed too
  complicated to me, I decided to keep it unchanged and to give pdb2gmx
  the protein with H already present.
* No need to add new atom/bond types to ffbonded.itp and
  ffnonbonded.itp: they all seem to be present. Since CFY is bound to
  the rest of the protein with common peptide bonds, I did not change
  specbond.dat either.
* I added CFY to residuetypes.dat with the specification "Protein".

In my opinion, all was ready to go, instead... When I ran pdb2gmx on my
protein with H added by PyMOL, I immediately got an error:

Fatal error:
Atom H01 in residue SER 3 was not found in rtp entry NSER with 13 atoms
while sorting atoms.

For a hydrogen, this can be a different protonation state, or it
might have had a different number in the PDB file and was rebuilt
(it might for instance have been H3, and we only expected H1 & H2).
Note that hydrogens might have been added to the entry for the N-terminus.
Remove this hydrogen or choose a different protonation state to solve it.
Option -ignh will ignore all hydrogens in the input.
For more information and tips for troubleshooting, please check the
GROMACS website at http://www.gromacs.org/Documentation/Errors [1]

From this error I understand that:

* the code for H in PyMOL is different from the code for H in Amber
  (read from aminoacids.rtp); in order to correct this error, I should
  add -ignh in order to ignore H in input.


pdb2gmx has to be able to make sense of the atom naming. There are lots
of different conventions for how to name atoms, particularly hydrogen
atoms. pdb2gmx can't possibly encode the logic to convert all of those
conventions. So the path of least resistance can be to ignore hydrogens
and regenerate them according to the generation rules.


However, you can just rename them in the input file so that pdb2gmx
understands your meaning. The NSER entry in the .rtp file shows you the
names pdb2gmx expects. If you edit the names of those hydrogen atoms
(probably H01, H02, H03) in your input coordinate file accordingly (to
H1, H2, H3), things will be fine. Be sure you don't break the required
column formatting of the coordinate file!
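
A renaming of this kind can be scripted so that the fixed PDB columns stay intact. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the H01/H02/H03 to H1/H2/H3 mapping is the guess made above and should be checked against the actual file and the NSER entry, and the name placement follows the common convention that names shorter than four characters start in column 14.

```python
def pdb_name_field(name):
    """Format an atom name for PDB columns 13-16 (string indices 12:16)."""
    if not 1 <= len(name) <= 4:
        raise ValueError(f"atom name {name!r} does not fit in 4 columns")
    # Common convention: 4-character names fill the field, shorter names
    # start in column 14 (true for one-letter elements such as H, C, N, O).
    return name if len(name) == 4 else (" " + name).ljust(4)


def rename_atoms(pdb_text, resname, resseq, mapping):
    """Rename atoms of one residue without shifting any other column."""
    out = []
    for line in pdb_text.splitlines():
        if (line.startswith(("ATOM", "HETATM"))
                and line[17:20].strip() == resname
                and int(line[22:26]) == resseq):
            old = line[12:16].strip()
            if old in mapping:
                # Replace exactly four characters, so the line length and
                # every later column stay where they were.
                line = line[:12] + pdb_name_field(mapping[old]) + line[16:]
        out.append(line)
    return "\n".join(out)


# Example usage (input and output file names are assumptions):
# text = open("protein_with_H.pdb").read()
# fixed = rename_atoms(text, "SER", 3, {"H01": "H1", "H02": "H2", "H03": "H3"})
# open("protein_renamed.pdb", "w").write(fixed + "\n")
```

Because only the four name columns are rewritten, coordinates and residue fields keep their positions, which is what the column formatting warning above is about.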




Links:
------
[1] http://www.gromacs.org/Documentation/Errors
[2] mailto:mark.j.abra...@gmail.com
[3] mailto:gmx-users@gromacs.org
[4] mailto:camnumasicymgivb_x5sy1yb44th8vknioqvhzdqq-tam9tn...@mail.gmail.com
[5] mailto:amarabo...@unisa.it
--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
* Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
* Please don't post (un)subscribe requests to the list. Use the
www interface or send it to gmx-users-requ...@gromacs.org.
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


--
========================================

Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================