Dear all!

1) I'd also like to know more about how the reference structure is chosen.

I usually run

    g_covar -f md.trr -s md.tpr

for PCA of an MD trajectory (here md.tpr is the protein topology and md.trr is the protein-only trajectory), and

    g_covar -f ensemble.pdb -s ref.pdb

for PCA of an X-ray data set, where ensemble.pdb holds all of my PDB structures in an NMR-like format and ref.pdb is a randomly picked structure. (I know that such a choice is wrong by definition, but I don't really know how I could calculate an average structure for my 'pdb trajectory'.)

Sometimes this produces a 'broken geometry' of my protein when I fit the trajectory to a NEW reference (or to the old reference, md.tpr) with g_anaeig to produce filtered.xtc:

    g_anaeig -v eigenvec.trr -f md.trr -s md.tpr -filt filtered.xtc

During visual analysis I noticed that my protein looks 'compressed'. I don't know why this occurs, because the problem does not show up in every case, so I suppose the cause is either the initial choice of reference structure in g_covar or the fitting in g_anaeig.

2) Sometimes I want to project (fit) one MD trajectory (md_1) onto the eigenvectors calculated from the X-ray data set (or from another MD trajectory, md_2) of the same protein. I'd like to examine which conformations of md_1 correspond to which positions in the conformational space of md_2. Which reference structure should be chosen for such a PCA?

3) I'm looking for a tutorial that explains step by step how to perform PCA in dihedral space for an average-sized protein (800 backbone atoms). As I understand it, I would have to define each dihedral angle by hand in the ndx file to provide it as input to g_covar. Does anyone have a script to automate this?
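
Regarding the automation asked about in question 3, a minimal sketch of what such a script could look like is given here in Python. It only builds the atom-index quadruples for the backbone phi (C(i-1), N(i), CA(i), C(i)) and psi (N(i), CA(i), C(i), N(i+1)) dihedrals and formats them as an index group. The function names, the group name phi_psi and the input layout (tuples of atom index, atom name, residue number) are assumptions made for illustration, not part of GROMACS. The sketch also assumes a single chain with consecutive residue numbering and no chain breaks.

```python
def phi_psi_quadruples(backbone):
    """Build phi/psi atom-index quadruples.

    backbone: iterable of (atom_index, atom_name, residue_number) tuples
    with 1-based atom indices (hypothetical input layout).
    Assumes one chain with consecutive residue numbers.
    """
    by_res = {}
    for idx, name, resnr in backbone:
        if name in ("N", "CA", "C"):
            by_res.setdefault(resnr, {})[name] = idx

    quads = []
    for resnr in sorted(by_res):
        cur = by_res[resnr]
        if len(cur) < 3:
            continue  # skip residues with an incomplete backbone
        prev = by_res.get(resnr - 1)
        nxt = by_res.get(resnr + 1)
        if prev and "C" in prev:
            # phi: C(i-1) - N(i) - CA(i) - C(i)
            quads.append((prev["C"], cur["N"], cur["CA"], cur["C"]))
        if nxt and "N" in nxt:
            # psi: N(i) - CA(i) - C(i) - N(i+1)
            quads.append((cur["N"], cur["CA"], cur["C"], nxt["N"]))
    return quads


def format_ndx_group(name, quads):
    """Format the quadruples as one index group, four atom indices per line."""
    lines = ["[ %s ]" % name]
    for q in quads:
        lines.append(" ".join(str(i) for i in q))
    return "\n".join(lines) + "\n"


# Toy example: backbone atoms of three residues.
backbone = [
    (1, "N", 1), (2, "CA", 1), (3, "C", 1),
    (4, "N", 2), (5, "CA", 2), (6, "C", 2),
    (7, "N", 3), (8, "CA", 3), (9, "C", 3),
]
quads = phi_psi_quadruples(backbone)
ndx_text = format_ndx_group("phi_psi", quads)
print(ndx_text)
```

In practice the (index, name, residue) tuples would be read from the ATOM records of a structure file. Whether the downstream tool expects exactly this group layout should be checked against the manual.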

Thanks for suggestions,
James

2013/2/11 <bapti...@itqb.unl.pt>:
> Hi Vivek,
>
> There are two distinct steps involved: (1) the fit of your trajectory to a
> reference structure, which corresponds to choose a conformation space; (2)
> the use of the PCA method, which corresponds to find in that space a new
> basis set whose ordered axes sequentially maximize dispersion (hopefully
> capturing the distribution main features with only a few of the new
> coordinates). The two steps just happen to be done by the same program. The
> structure chosen for fitting is related to step 1, while the average
> structure used to compute the covariance matrix is related to step 2 -- as
> already pointed by Tjerk, the two structures are generally not the same.
>
> The aim of the fit is to get rid of the global translation and rotation of
> your protein in the simulation box, trying to place all the sampled
> structures in a single 3D space that reflects "only" the conformational
> differences. But this is necessarily approximate, because the
> superimposition of any pair of structures after the global fit will be
> always worse than you would get by making a pairwise fit of the two. Thus,
> you want to get a final dispersion around the reference as small as
> possible. So, of the two average structures that you tried, you should
> choose the one computed from the last 30 ns (it's not surprising that it
> gives a smaller dispersion, because it refers to the segment you are
> analyzing). Still, using an average structure as a reference is a somewhat
> illusory solution, because that average must itself be obtained after
> fitting the trajectory to some reference... In a study of a small flexible
> peptide (where the choice of reference may have drastic effects), we found
> that a good reference seems to be the "central structure" of your sample,
> defined as the one that, when taken as a reference, leads to the lowest
> overall dispersion (http://dx.doi.org/10.1021/jp902991u).
> The article discusses the issues pointed above, so you may want to give it
> a look.
>
> You can also avoid the need of a reference by choosing a different
> conformation space for PCA, a popular alternative being the phi and psi
> dihedrals (look in the manual). Note that this dihedral space is a bit
> different from the more usual one discussed above, each reflecting a
> different kind of conformational proximity (this is also discussed in the
> article). It's up to you to decide which one better suits your problem.
>
> Hope this helps.
> Cheers,
> Antonio
>
> On Sat, 9 Feb 2013, Tsjerk Wassenaar wrote:
>
>> Hi,
>>
>> The commands would certainly help, including the commands for getting the
>> reference structure. Do note that the reference is the reference for
>> fitting, which is 'external', i.e. provided by the user. This is not the
>> same as the structure used to calculate the deviations, which is the
>> average structure of the frames selected.
>>
>> Cheers,
>>
>> Tsjerk
>>
>> On Sat, Feb 9, 2013 at 7:06 PM, bipin singh <bipinel...@gmail.com> wrote:
>>
>>> Hi vivek,
>>>
>>> I have few questions related to your query:
>>>
>>> During covariance matrix calculation, g_covar by default takes average
>>> structure of the trajectory as a reference structure then why you are
>>> giving it average structure of your trajectory (0-100ns) manually.
>>> Moreover without looking at your commands which you have used, it would be
>>> difficult for anyone that why are you getting these surprising results.
>>>
>>> On Thu, Feb 7, 2013 at 1:26 PM, vivek modi <modi.vivek2...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have troubled you with a similar question before also, but I guess I need
>>>> some more clarification. My question is about the reference structure in
>>>> PCA analysis.
>>>> I have 100ns long protein simulation which I want to analyze using PCA.
>>>> The RMSD shows fluctuations upto initial 25-30ns and then becomes very
>>>> stable.
>>>> I have performed PCA on the last 30ns window of the simulation where I
>>>> assume the simulation has converged (I also did on other time windows as
>>>> well).
>>>>
>>>> The question is this:
>>>> I did the analysis on the last 30ns window in two ways by taking two
>>>> different reference structures.
>>>>
>>>> a. I take the average structure of the trajectory (0-100ns) as
>>>> the reference and then do the fitting and calculate covariance matrix for
>>>> last 30ns. This is done because I suspect that the average structure over
>>>> full trajectory will reflect all the changes occurring in the protein. It
>>>> also gives me low cosines (<0.1). The PCs show movement occurring in
>>>> certain regions of the protein.
>>>>
>>>> b. I take the average structure from the same window (last 30ns) then do
>>>> the fitting and calculate covariance matrix for the same. This is done with
>>>> an assumption that the reference structure must reflect the
>>>> equilibriated/stable part of the trajectory unlike the previous case.
>>>> Surprisingly it gives me high cosines (>0.5). Unlike the previous case,
>>>> this method shows very small movement in the protein (very low RMSF).
>>>>
>>>> Both of these methods give me different RMSF for the PCs although they are
>>>> done on the same part of the trajectory but the reference structure is
>>>> influencing the output.
>>>>
>>>> Which protocol among the two is appropriate ? And how can we explain high
>>>> cosines in second case where the reference structure is the average of the
>>>> same time window (there must not be large deviation) while I get low cosine
>>>> for the first case where deviations are calculated from the full trajectory
>>>> average (large deviation) ?
>>>>
>>>> Any help is appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> -Vivek Modi
>>>> Graduate Student
>>>> IITK.
>>>> --
>>>> gmx-users mailing list gmx-users@gromacs.org
>>>> http://lists.gromacs.org/mailman/listinfo/gmx-users
>>>> * Please search the archive at
>>>> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
>>>> * Please don't post (un)subscribe requests to the list. Use the
>>>> www interface or send it to gmx-users-requ...@gromacs.org.
>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>
>>> --
>>> *-----------------------
>>> Thanks and Regards,
>>> Bipin Singh*
>>
>> --
>> Tsjerk A. Wassenaar, Ph.D.
>>
>> post-doctoral researcher
>> Biocomputing Group
>> Department of Biological Sciences
>> 2500 University Drive NW
>> Calgary, AB T2N 1N4
>> Canada
>
> --
> Antonio M. Baptista
> Instituto de Tecnologia Quimica e Biologica, Universidade Nova de Lisboa
> Av. da Republica - EAN, 2780-157 Oeiras, Portugal
> phone: +351-214469619 email: bapti...@itqb.unl.pt
> fax: +351-214411277 WWW: http://www.itqb.unl.pt/~baptista