Hi Eric,

glad I could help.

The trick assumes that the indices of the atoms in the residue are
contiguous.

I'm just thinking whether or not it'd be easy to get around this issue somehow.
Why do you subtract the starting Idx instead of just using the content of your
variable indices? This variable contains a mapping between the old Idx (the
actual entry) and the new Idx (index of the entry). So you would "just" have to
invert this mapping, which might be stored in the variable inverted.  Now, you
could just use

copied.AddBond(inverted[begin_atom_idx], inverted[end_atom_idx] ...further 
things...)

et voilĂ , the indices no longer have to be contiguous.

The inversion could be performed in the following way (not tested but shoudl
work):

//N is the number of entries in the original geometry (will be the length of 
"inverted")
void invert_mapping( int N, std::vector<int>* indices, std::vector<int>* 
inverted ) {
    inverted->reserve(N+1);
    //initialize vector to 0 (since all Idx in OpenBabel start at 1 if I'm not 
mistaken)
    for (int i=0; i<=N; ++i){
        inverted->push_back(0);
    }
    //this function assumes that the atoms will be added to the new molecule in
    //the same order as they appear in indices
    int count = 1; //this will go through all the new Idx
    for (std::vector<int>::iterator it=indices->begin(); it!=indices->end(); 
++it, ++count){
        inverted->at(*it) = count;
    }
    //this function will leave a useless element at index 0 in inverted
}

Be aware that an index shift by 1 might be necessary here and there but I don't
believe it is.

Cheers,
Torsten

On Wed, 23 Sep 2015, Yun Ding wrote:

Hi Torsten,

You are correct about copying the bonds; they should be *manually **deep
copied.*

For the atom types, since only a residue of the molecule is copied in my
case, the *begin_atom_idx *and* end_atom_idx *should start from 1, 2 ... in
the new molecule.
For each atom index from the source molecule, I subtract them by an offset,
which is the 'smallest atom index in the residue minus 1'.

This black magic seems to work! I print the residue into mol2 format, and
the atom types are exactly the same as what they are as in a whole complex.
Since mol2 format uses the bonds to determine the SYBYL atom type, if the
atom type maintains, the bonds should also have been maintained.

The trick assumes that the indices of the atoms in the residue are
contiguous.

Below is the code:
- - -
unsigned int FindSmallestIndex(OBMol &mol,
                               char selected_chain,
                               string selected_resn,
                               unsigned int selected_res_num)
{
  OBAtom *atom;
  OBResidue *res;
  vector<unsigned int> indices;

  FOR_ATOMS_OF_MOL(atom, mol) {

    res = atom->GetResidue(true);
    string resn = res->GetName();
    unsigned int res_num = res->GetNum();
    char chain = res->GetChain();

    if ((resn == selected_resn)
        && (res_num == selected_res_num)
        && (chain == selected_chain)) {
      indices.push_back(atom->GetIdx());
    }
  }
  assert(indices.size() > 0);
  return *min_element(indices.begin(), indices.end());
}

OBMol CopySelected(OBMol &mol,
                   char selected_chain,
                   string selected_resn,
                   unsigned int selected_res_num)
{

  unsigned int starting_idx = FindSmallestIndex(mol, selected_chain,
selected_resn, selected_res_num) - 1;

  OBMol copied;
  OBAtom * atom;
  OBResidue *res;

  FOR_ATOMS_OF_MOL(atom, mol) {
    res = atom->GetResidue(true);

    string resn = res->GetName();
    unsigned int res_num = res->GetNum();
    char chain = res->GetChain();

    if ((resn == selected_resn)
        && (res_num == selected_res_num)
        && (chain == selected_chain)) {

      copied.InsertAtom(*atom);

      copied.BeginModify();
      for (auto bond_itr = atom->BeginBonds(); bond_itr !=
atom->EndBonds(); bond_itr++) {

        unsigned int begin_atom_idx = (*bond_itr)->GetBeginAtomIdx();
        unsigned int end_atom_idx = (*bond_itr)->GetEndAtomIdx();
        unsigned int bo = (*bond_itr)->GetBondOrder();
        unsigned int flag = (*bond_itr)->GetFlags();

        // *black magic here*
        copied.AddBond(begin_atom_idx - starting_idx, end_atom_idx -
starting_idx, bo, flag, -1);
      }
      copied.EndModify();
    }
  }

  return copied;
}

- - -

 Thanks!

Eric

On Wed, Sep 23, 2015 at 7:51 AM, Torsten Sachse <torsten.sac...@uni-jena.de>
wrote:

Hi Eric,

some time ago, I wrote a method to append one molecule to another. You
might be
able to adapt that method by creating an empty molecule and adding the
atoms and
bonds you want to it. At the beginning, I had the same problems with bonds
and
atom types not being correct but it's working now, at least printing out
the
internal file format shows perfect agreement.

I did not manage to copy the bonds directly as you try right now. This is
probably because the bond directly points to the atoms (i.e., the data
structure) it connects, but since you're creating new atoms (new instances
of
the type OBAtom) you'd be re-creating an already existing bond instead of
creating a new bond between those newly created atoms which messes up the
bond-order and the likes since the atoms the bond points to are not part
of your
new molecule.

Here's the source after changing some bits to better fit your use case (not
tested directly but should work, please excuse typos):

void OBMol::AppendMolecule(const OBMol &source)
{
    OBMol &src = (OBMol &)source;
    vector<OBAtom*>::iterator i;
    vector<OBBond*>::iterator j;
    OBAtom *atom, *a;
    OBBond *bond;
    unsigned int at_count=1;
    unsigned int begin_atom_idx, end_atom_idx, order, flags;

    _vatom.reserve(src.NumAtoms());
    _atomIds.reserve(src.NumAtoms());
    _vbond.reserve(src.NumBonds());
    _bondIds.reserve(src.NumBonds());

    for (atom = src.BeginAtom(i);atom;atom = src.NextAtom(i))
    {
        AddAtom(*atom,true);
        a=GetAtom(at_count);
        a->Duplicate(atom);
        ++at_count;
    }

    for (bond = src.BeginBond(j);bond;bond = src.NextBond(j))
    {
        begin_atom_idx=bond->GetBeginAtomIdx();
        end_atom_idx=bond->GetEndAtomIdx();
        order=bond->GetBondOrder();
        flags=bond->GetFlags();
        AddBond(begin_atom_idx, end_atom_idx, order, flags, -1);
    }
}

Again, you'd be using this by creating an empty molecule and adding all the
atoms and bond in the residue to your empty molecule by calling the above
method
(or some variant of it). The important bit probably is the lower for-loop
that
copies the bonds over by using each atom's Idx, the proper bond-order and
flags.
You probably have to adjust the each atom's Idx properly since the
original Idx
might not be the same as in your new molecule.

Hope this helps,
Torsten

PS: Sorry if it's the wrong name, I was a bit confused by your saying
"Eric" in
combination with your email address.

Dear all,

I want to create a new OBMol object for the ligand within a protein-ligand
complex.
The atoms can be copied successfully, but I have some trouble with the
bonds.

Since the residue name (ATP for example), residue number and chain id of
the ligand are known, I iterate all the atoms of the complex and insert
the
selected ones into a new OBMol.

Codes:
- - -
void CopySelected(OBMol& new_mol, OBMol select_from, char selected_chain,
string selected_resn, unsigned int selected_res_num)
{
  OBAtom *atom;
  OBResidue *res;

  FOR_ATOMS_OF_MOL(atom, select_from) {
    res = atom->GetResidue(true);

    string resn = res->GetName();
    int res_num = res->GetNum();
    char chain = res->GetChain();

    if ((resn == selected_resn)
        && (res_num == selected_res_num)
        && (chain == selected_chain)) {

      bool is_success = new_mol.InsertAtom((*atom));

      for (auto bond_itr = atom->BeginBonds();
           bond_itr != atom->EndBonds();
           bond_itr++) {
        new_mol.AddBond(*(*bond_itr));
      }
    }
  }
}
- - -

However, when the new molecule is written into sdf or mol2 format, some of
the bonds are missing; the atom types are incorrect in mol2 file.

Is there any established method to create a new molecule from part of the
existing one while maintaining *all *the information? How can I improve my
function here?

 Thanks!


Eric




--

Department of Physics and Astronomy
Louisiana State University
Baton Rouge, LA 70803-4001
ydi...@tigers.lsu.edu
------------------------------------------------------------------------------
_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Reply via email to