Hi,

Thanks for this. The clue I needed was that there's a method:

    matches = mol.GetSubstructMatches(pat)

This should work fine for what I need.

Cheers,
Steve.

-----Original Message-----
From: Andrew Dalke [mailto:[email protected]]
Sent: 07 September 2016 12:10
To: Stephen O'hagan <[email protected]>
Cc: [email protected]
Subject: Re: [Rdkit-discuss] MolWt of substructure hit?

On Sep 7, 2016, at 11:53 AM, Stephen O'hagan wrote:
> How would I find the molecular weight (fraction) of that substructure within
> a compound expressed as a SMILES string, e.g.:

I don't know of a built-in function which does this. It's possible to write one.

Here's a function which will compute the molecular weight given the molecule and the atom indices for the fragment.

def get_fragment_molwt(mol, atom_indices):
    assert len(atom_indices) == len(set(atom_indices)) # quick duplicate check
    molwt = 0.0
    for atom_index in atom_indices:
        atom = mol.GetAtomWithIdx(atom_index)
        molwt += atom.GetMass()
    return molwt

If you want to include the hydrogen mass, then use this variant:

from rdkit import Chem
_H_mass = Chem.Atom(1).GetMass()

def get_fragment_molwt(mol, atom_indices):
    assert len(atom_indices) == len(set(atom_indices)) # quick duplicate check
    molwt = 0.0
    for atom_index in atom_indices:
        atom = mol.GetAtomWithIdx(atom_index)
        # heavy-atom mass plus the mass of its attached (implicit and explicit) hydrogens
        molwt += atom.GetMass() + atom.GetTotalNumHs() * _H_mass
    return molwt

Here's an example of how to use the function:

#======
from rdkit import Chem

def get_fragment_molwt():
    ... as above ...
# MolWt is not provided by "from rdkit import Chem"; import it explicitly
from rdkit.Chem.Descriptors import MolWt

smiles = "CC(=O)O[C@H]1CC[C@@]2(C)C(=CCC3C4CC=C(c5cccnc5)[C@@]4(C)CCC32)C1"
smarts = "[#6](:,-[#6]:,-[#6](-[#6]):,-[#6]-[#6](:[#6]:[#7]):[#6]:[#6]):,-[#6]:,-[#6]"

mol = Chem.MolFromSmiles(smiles)
assert mol is not None, smiles
pat = Chem.MolFromSmarts(smarts)
assert pat is not None, smarts

matches = mol.GetSubstructMatches(pat)
molwt = MolWt(mol)
for match_no, match in enumerate(matches, 1):
    fragment_molwt = get_fragment_molwt(mol, match)
    print("#{}: {:.2%}".format(match_no, fragment_molwt/molwt))
#======

If I don't include the hydrogens in the fragment weight calculation then I get:

#1: 37.32%
#2: 37.32%
#3: 37.32%
 ...

If I include the hydrogens, then I get:

#1: 40.15%
#2: 39.64%
#3: 40.15%
 ...

Cheers,

Andrew
[email protected]
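The reported percentages can be cross-checked by hand without RDKit. The sketch below is only an arithmetic check, not part of the original script: the atomic weights are hand-entered standard values (assumed to be close to what RDKit uses), the molecular formula C26H33NO2 is read off the example SMILES, and the heavy-atom count (eleven carbons, one nitrogen) is read off the SMARTS pattern. The hydrogen counts of 11 and 9 are an assumption inferred from the reported percentages, not taken from RDKit output.

```python
# Hand-entered standard atomic weights (assumption: close to RDKit's values).
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def formula_weight(counts):
    """Sum atomic weights for a mapping of element symbol -> atom count."""
    return sum(ATOMIC_WEIGHT[symbol] * n for symbol, n in counts.items())


# Whole molecule from the example SMILES: C26H33NO2.
mol_wt = formula_weight({"C": 26, "H": 33, "N": 1, "O": 2})

# Heavy atoms matched by the SMARTS pattern: eleven [#6] and one [#7].
heavy_wt = formula_weight({"C": 11, "N": 1})


def fragment_fraction(num_hydrogens):
    """Fragment weight fraction with the given number of attached hydrogens."""
    return (heavy_wt + num_hydrogens * ATOMIC_WEIGHT["H"]) / mol_wt


print("{:.2%}".format(fragment_fraction(0)))   # 37.32%
# Hydrogen counts below are inferred from the reported output (assumption).
print("{:.2%}".format(fragment_fraction(11)))  # 40.15%
print("{:.2%}".format(fragment_fraction(9)))   # 39.64%
```

Every match covers the same set of element types, so the heavy-atom fraction is identical for all of them. The hydrogen-inclusive fraction varies because different matches map the pattern onto atoms carrying different numbers of hydrogens.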

