Sv, 2012-05-13 20:49 -0700, inviso rakstīja: > As a disclaimer, I'm an amazingly poor excuse for a chemist. In this > case, I'm simply a programming resource in a research project so > please forgive any amateur mistakes or abuses of terminology. The > project is being done in Python using pybel. > > As part of our process, we will be doing a scf calculation in NWChem > of both a metal ligand complex and the host portion (the ligand in > this case). I'm getting a file with the guest metal and another file > with a series of metal ligand complexes, but I am not provided the > host portion directly. The goal is to extract the metal portion from > the complex and then generate NWChem input files. > > For example, I might have RhCl3 as the metal and C9H21Cl3N3Rh as the > complex. I'm reading those two out of .xyz files so I have coordinate > data as well. I would like to end up with C9H21N3. Assuming that the > ligand might also contain Cl, it's probably not sufficient to just > randomly remove a couple Cl and a Rh. My somewhat naive approach was > to find atoms that shared both atomic number and coordinates. Is that > an accurate approach? > > When doing that, I discovered that the coordinates of the data read in > from file would change when I made a copy of the molecule. I need to > hold onto the complex as well as extract/calculate the host so I > copied the molecule first by doing something like copy_of_complex = > pybel.Molecule(complex_molecule). I then iterated over the guest > atoms and removed the matching atoms in the copy of the complex. > Because the coordinates of the copy did not match what I read in from > the file, it couldn't find the matching atoms. Done against the > original metal ligand complex read from file, it found matching atoms. > In the example code/output below, note that the coordinates of the > copy are different for the first atom. > > Is this occurring because openbabel is storing the coordinates in > floating point numbers instead of "decimal" style numbers? If I > remember correctly from looking at the API earlier, they are stored as > doubles. Any way around this? Or is this unexpected behavior? > > Also, is my entire approach flawed? Should I be trying to split the > complex prior to loading it into openbabel? Should I be reading the > complex in twice as it seems to be consistent on read? Is there a > better method available in openbabel to accomplish this? > > Finally, one other oddity I've noticed. Even when removing the atoms > from the original complex, the formula attribute of the pybel molecule > does not update even tthough the atoms are indeed gone. I'm removing > them using complex_molecule.OBMol.DeleteAtom(atom_from_complex_to_remove, > True). In the output below, note that the formula does not change, > but the number of atoms show that 4 have been removed (as expected). > > Thanks in advance for any insights you can provide, > > Greg Fortune > > -------------- > Code > -------------- > import pybel > > def molecule_diff(minuend, subtrahend): > for sub_atom in subtrahend.atoms: > for min_atom in minuend.atoms: > if(sub_atom.atomicnum == min_atom.atomicnum and > sub_atom.coords == min_atom.coords): > t = minuend.OBMol.DeleteAtom(min_atom.OBAtom, True) > > def print_atom(atom, indent='\t'): > print(indent + '%s: %s'% (pybel.ob.etab.GetSymbol(atom.atomicnum), > str(atom.coords))) > > guest = [mol for mol in pybel.readfile('xyz', 'guest.xyz')][0] > comp = [mol for mol in pybel.readfile('xyz', 'amine.xyz')][0] > host = pybel.Molecule(comp) > > print('guest: ' + guest.formula) > print_atom(guest.atoms[0], '') > print('complex: ' + comp.formula) > print_atom(comp.atoms[0], '') > print('host: ' + host.formula) > print_atom(host.atoms[0], '') > > > molecule_diff(comp, guest) > > print(comp.formula) #This should give back the formula without RhCl3, > but it gives the full formula of the original complex > print(len(comp.atoms)) #The original total is 37 atoms so 33 makes > sense assuming we removed the 4 atoms from the guest. > > > -------------- > Output > -------------- > guest: Cl3Rh > Rh: (0.06029, -2.22815, 0.03786) > complex: C9H21Cl3N3Rh > Rh: (0.06029, -2.22815, 0.03786) > host: C9H21Cl3N3Rh > Rh: (0.0603, -2.2281, 0.0379) > > C9H21Cl3N3Rh > 33
You could try to align the molecules and then get the coordinates which are close enough. Or if the bond perception works correctly and there are only those Cl bonded to Rh which have to go away, then you could iterate trough the bonded neighbors of Rh to find the atoms to remove (Hint, OBAtomAtomIter). Anyway, final structure will have to be checked to make sure that everything went as expected. If you have to remove just four atoms then it would also not be unthinkable to just provide atom indexes to the script and remove those atoms. And if it is just one structure you have to edit, then the easiest way probably is to open it in Avogadro, delete the atoms you want to go away and save the new structure. Added bonus for this is that you directly see which atoms you are removing. Reinis ------------------------------------------------------------------------------ Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ OpenBabel-discuss mailing list OpenBabel-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openbabel-discuss