Hmm, perplexing. How about we try something simple? Instead of using real molecules that may be proprietary, how about constructing a simple input that has 10 copies of CCN(CC)CC and running that? Then you can safely send the output. It would also help if you could run the function on just one machine (not using the PP stuff) to see if you can reproduce the problem there.
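The test suggested above could be set up roughly as follows. This is only a sketch using the Python standard library: the helper names (write_test_input, run_serially), the file name, the "tea_N" record names, and the one-record-per-line "SMILES name" layout are all assumptions made for illustration, not part of the original code.

```python
import os
import tempfile


def write_test_input(path, smiles="CCN(CC)CC", copies=10):
    """Write `copies` identical records, one '<SMILES> <name>' pair per line.

    The layout and the 'tea_N' names are hypothetical choices for this sketch.
    """
    with open(path, "w") as handle:
        for index in range(copies):
            handle.write("%s tea_%d\n" % (smiles, index))


def run_serially(path, process):
    """Call process(smiles, name) for every record, in this one process.

    No job server is involved, so every object stays inside a single
    interpreter and nothing is passed between machines.
    """
    results = []
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            smiles, name = line.split()
            results.append(process(smiles, name))
    return results


if __name__ == "__main__":
    input_path = os.path.join(tempfile.mkdtemp(), "triethylamine_x10.smi")
    write_test_input(input_path)
    # Stand-in callable that just echoes each record back.
    for record in run_serially(input_path, lambda smiles, name: (name, smiles)):
        print(record)
```

For the real check, the stand-in callable would be replaced by a thin wrapper that parses each SMILES with Chem.MolFromSmiles and calls the function under test directly, with no ParallelPython in between. If the formulas come out right here but not under PP, that points at how the molecule is handed between processes.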
-greg

On Wed, Aug 31, 2016 at 6:24 AM, Bennion, Brian <[email protected]> wrote:

> Hello Greg,
>
> The source that I am using is shown below. Also, I need to clarify that all
> this code is wrapped around the ParallelPython job control code. It allows
> me to send each reaction to a separate cpu on my large clusters.
>
> I have been able to use the steps in your email to check my rdkit install
> from the python interpreter.
> Next I manually input my compound as a smiles string and performed your
> set of commands and things work as expected.
> However, when wrapped within the PP code, UpdatePropertyCache() has no
> effect. My only thought is that I have not properly passed the molecule
> between python modules (not sure if that makes any sense).
>
> This is the log output for one cycle of the code. The smiles string has
> been clipped to not reveal proprietary data. The important thing here is
> that the formal charge is correctly assigned but that the implicit hydrogen
> atoms are not updated.
>
> LOG
> Tertiary nitrogen found in oxime: ((5, 6, 7, 8),)
> This is the symbol and charge for the tertiary nitrogen before: N 0 C(=O)N([H])C([H])([H])C([H])([H])C1(C([H])([H])N(C([H])([H])[H])C([H])([H])
>
> This is the symbol and charge for the tertiary nitrogen after: N 1
> test3-10: SANITIZE_NONE C14H27N3O3 C14H27N3O3+ C14H27N3O3+ 3
>
>
> def tertNitrogenProt(molecule,molName1,w_sdf,w_smi):
>     patt=rdkit.Chem.MolFromSmarts('[#6]-[#7]([#6])-[#6]')
>     matches=molecule.GetSubstructMatches(patt)
>     tertNHnum=0
>     if matches:
>         print "Tertiary nitrogen found in: ", matches
>         for i in matches:
>             moleculeStrings=rdkit.Chem.MolToSmiles(molecule, isomericSmiles=True)
>             atomSymbol9=molecule.GetAtomWithIdx(i[1]).GetSymbol()
>             formalCharge9=molecule.GetAtomWithIdx(i[1]).GetFormalCharge()
>             print "This is the symbol and charge for the tertiary nitrogen before: ",atomSymbol9,formalCharge9,moleculeStrings
>             #set the formal charge on the protonated tertiary nitrogen to zero
>             test7=rdkit.Chem.AllChem.CalcMolFormula(molecule)
>             molecule.GetAtomWithIdx(i[1]).SetFormalCharge(1)
>             atomSymbol9=molecule.GetAtomWithIdx(i[1]).GetSymbol()
>             formalCharge9=molecule.GetAtomWithIdx(i[1]).GetFormalCharge()
>             test8=rdkit.Chem.AllChem.CalcMolFormula(molecule)
>             print "This is the symbol and charge for the tertiary nitrogen after: ",atomSymbol9,formalCharge9
>             #update property cache and check for nonsense
>             molecule.UpdatePropertyCache()
>             moleculeH=rdkit.Chem.AddHs(molecule)
>             test3=rdkit.Chem.SanitizeMol(moleculeH)
>             test9=rdkit.Chem.AllChem.CalcMolFormula(moleculeH)
>             test10=moleculeH.GetAtomWithIdx(i[1]).GetDegree()
>             print "test3-10: ",test3,test7,test8,test9,test10
>             #start generating 3 coordinates and optimize the conformation
>             rdkit.Chem.AllChem.EmbedMolecule(moleculeH)
>             rdkit.Chem.AllChem.UFFOptimizeMolecule(moleculeH,1500)
>             molName6=molName1+'NH+_'+str(tertNHnum)+'_XOH'
>             #find molecular formal charge
>             moleculeCharge=rdkit.Chem.GetFormalCharge(moleculeH)
>             moleculeH.SetProp('i_user_TOTAL_CHARGE',repr(moleculeCharge))
>             moleculeH.SetProp('_Name',molName6)
>             w_sdf.write(moleculeH)
>             w_smi.write(moleculeH)
>             molName3=molName1+'NH+_'+str(tertNHnum)+'_XO'
>             totalMolecules=oximeSubStructSearch(moleculeH,molName3,w_sdf,w_smi)
>             tertNHnum += 1
>     else:
>         print "No tertiary nitrogen matches"
>         return(molecule,tertNHnum)
>     return (moleculeH,tertNHnum)
> ############################################################
> ##########################################
>
>
> ------------------------------
> *From:* Greg Landrum [[email protected]]
> *Sent:* Monday, August 29, 2016 10:41 PM
> *To:* Bennion, Brian
> *Cc:* [email protected]
> *Subject:* Re: [Rdkit-discuss] protonating proper tertiary amines
>
> Hi Brian,
>
> On Tue, Aug 30, 2016 at 6:41 AM, Bennion, Brian <[email protected]> wrote:
>
>>
>> I have seemed to hit a wall with what seems like a simple task.
>>
>> First, I have ~9800 compounds that have a primary amine for a reaction
>> that I am completing in rdkit.
>> About 250 of those compounds have a tertiary alkylamine that is most
>> likely protonated at pH 7.4.
>>
>> The dataset is a set of smiles strings for which the tertiary amine is
>> not protonated. I thought this would be easy enough to fix, just use a
>> smarts substructure search, set the formal charge on any hits to one and
>> then AddHs, sanitize, embed, and then minimize.
>>
>> Well, what I get is [N+] with all the other carbons with explicit atoms
>> in the resulting smiles files, and if output to sdf I get a positively
>> charged diradical positioned at the tertiary nitrogen.
>>
>
> Yes, what's happening here is that AddHs() is using the implicit valence
> on the N atoms to determine how many Hs to add. Since the implicit valence
> is not recomputed when you set the formal charge, you end up with the wrong
> number of Hs attached to the N.
> A call to UpdatePropertyCache() will fix this:
>
> In [16]: m = Chem.MolFromSmiles('CN')
>
> In [17]: AllChem.CalcMolFormula(m)
> Out[17]: 'CH5N'
>
> In [18]: m.GetAtomWithIdx(1).SetFormalCharge(1)
>
> In [19]: AllChem.CalcMolFormula(m)
> Out[19]: 'CH5N+'
>
> In [20]: m.UpdatePropertyCache()
>
> In [21]: AllChem.CalcMolFormula(m)
> Out[21]: 'CH6N+'
>
> In [22]: mh = Chem.AddHs(m)
>
> In [24]: mh.GetAtomWithIdx(1).GetDegree()
> Out[24]: 4
>
>> Thank you for such a great tool
>>
>
> You're welcome! Thanks for saying thanks. :-)
>
> Hope this helps,
> -greg
>
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

