On 24/04/2013 16:01, Pascal Muller wrote:
Hi,

I have a little pybel script for generating Murcko scaffolds, which
works well with 2.3.1, but fails on a few molecules ("Segmentation
fault") with 2.3.2.

Depending on code modifications, the fault raises at different points,
but I have no clue on why the bug is happening.

Below is my code shortened as much as I could.

Regards,
Pascal



#!/usr/bin/env python
# encoding: utf-8
import sys
import os, pybel
import openbabel as ob

def GetMurckoFramework(mol):
     haveAtomWithOneNeighbor,atomsWithOneNeighborList =
checkIfIsOnlyOneNeighbor(mol)
     while (haveAtomWithOneNeighbor):
         for atomToDel in atomsWithOneNeighborList:
             mol.OBMol.DeleteAtom(atomToDel) # delete atom which are
not part of a cycle or a linker (terminal atoms)
         haveAtomWithOneNeighbor,atomsWithOneNeighborList =
checkIfIsOnlyOneNeighbor(mol)
     return mol

def checkIfIsOnlyOneNeighbor(mol):
     isAnAtomWithOneNeighbor = 0
     atomWithOneNeighborList = []
     for atom in ob.OBMolAtomIter(mol.OBMol):
         if not atom.IsInRing():
             neighborsCount = 0
             for neighbor in ob.OBAtomAtomIter(atom):
                 neighborsCount = neighborsCount + 1
             if (neighborsCount <= 1): # It is a terminal atom (only
one neighbor) double bond
                 atomWithOneNeighborList.append(atom) # will be deleted
                 isAnAtomWithOneNeighbor = 1
     return isAnAtomWithOneNeighbor, atomWithOneNeighborList


def main():
     if len(sys.argv) < 2:
         print "No input file provided: Murcko.py filetosprocess.smi"
         sys.exit(1)

     mol1 = pybel.readstring("smi", "C1(=Cc2cn(nc2)CC(=O)N)C(=O)NC(=O)S1")
     mol2 = pybel.readstring("smi", "C1(=CC2=C(N=C(C2)C(=O)N)C)C(=O)NC(=O)S1")
     li = [mol1, mol2]

     for mol in li:
         mol = GetMurckoFramework(mol)
         print mol

if __name__ == '__main__':
     main()

The problem seems to be in the stereo perception code called during smiles output. It possibly arises because the atom deletion somehow messes up the molecule and the stereo perception routines don't cope. I've tried deleting the stereochemical data in the scaffold to little effect, but most effective was to suppress stereo perception in smilesformat output. The output option -xi (C++ pConv->AddOption("i");) nearly does this but needs line 3608 in smilesformat.cpp to be
bool is_chiral = isomeric && AtomIsChiral(atom);

But this is only a partial solution and the stereo perception needs to be more robust. The bigger problem is how to handle OBMol objects sometimes as patterns rather than as molecules.

Remember also that atom indices can be invalidated when another atom is deleted, although that doesn't seem to be an issue in your code.

Attached is an op based on your method used from obabel (or the gui) for debugging.

Chris

/**********************************************************************
murcko.cpp - A OBOp for generation Murcko scaffolds

Copyright (C) 2013 by Chris Morley

This file is part of the Open Babel project.
For more information, see <http://openbabel.org/>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation version 2 of the License.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
***********************************************************************/
#include <openbabel/babelconfig.h>
#include <openbabel/op.h>
#include <openbabel/obiter.h>
#include <openbabel/obconversion.h>
#include <openbabel/mol.h>

namespace OpenBabel
{

class OpMurcko : public OBOp
{
public:
  OpMurcko(const char* ID) : OBOp(ID, false){};
  const char* Description()
  { return "Generate Murcko scaffold\n"
           "e.g. obabel mols.xxx -O scaffolds.yyy --murcko"; }

  virtual bool WorksWith(OBBase* pOb)const{ return dynamic_cast<OBMol*>(pOb)!=NULL; }
  virtual bool Do(OBBase* pOb, const char* OptionText=NULL, OpMap* pOptions=NULL, OBConversion* pConv=NULL);
};

/////////////////////////////////////////////////////////////////
OpMurcko theOpMurcko("murcko"); //Global instance

/////////////////////////////////////////////////////////////////
bool OpMurcko::Do(OBBase* pOb, const char* OptionText, OpMap* pOptions, OBConversion* pConv)
{
  OBMol* pmol = dynamic_cast<OBMol*>(pOb);
  if(!pmol)
    return false;
  pmol->DeleteHydrogens();
  pmol->DeleteData("OBTetrahedralStereo");
  pmol->DeleteData("OBCisTransStereo");
  bool done=false;
  while(!done)
  {
    done=true;
    for(OBMolAtomIter a(pmol);a;++a)
    {
      if(a->GetValence()==1
         && (!a->HasDoubleBond() || OBAtomAtomIter(&*a)->GetValence()<3))
      {
        pmol->DeleteAtom(&*a);
        done=false;
        break;
      }
    }
  }
  pmol->DeleteData("StereoData");
  pConv->AddOption("i"); //for smiles output to supress stereo detection
  return pmol->NumAtoms()>2;
}

}//namespace
------------------------------------------------------------------------------
Try New Relic Now & We'll Send You this Cool Shirt
New Relic is the only SaaS-based application performance monitoring service 
that delivers powerful full stack analytics. Optimize and monitor your
browser, app, & servers with just a few lines of code. Try New Relic
and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_apr
_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Reply via email to