Hi Jennifer,
The sample code in the documentation is there to generate a distance matrix
(which is what one needs for clustering). So it calculates the similarities
with this line:
sims = DataStructs.BulkTanimotoSimilarity(fps[i],fps[:i])
and then converts them to distances by subtracting from 1:
dists.extend([1-x for x in sims])
If you want the similarity matrix, you'd just do:
dists.extend(sims)
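To make the relationship between the two lists concrete, here is a minimal pure-Python sketch of the same loop. It does not use the RDKit: the fingerprints are made-up sets of "on" bit indices, and tanimoto() and bulk_tanimoto() are hypothetical stand-ins written only for this illustration.

```python
# Stand-in fingerprints: sets of "on" bit indices (made-up data, not RDKit output).
fps = [{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3, 4}]


def tanimoto(a, b):
    """Tanimoto similarity of two non-empty bit sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)


def bulk_tanimoto(probe, others):
    """Hypothetical stand-in for a bulk call: one similarity per entry of others."""
    return [tanimoto(probe, other) for other in others]


sims_flat = []   # what you get from dists.extend(sims)
dists_flat = []  # what you get from dists.extend([1-x for x in sims])
for i in range(1, len(fps)):
    sims = bulk_tanimoto(fps[i], fps[:i])
    sims_flat.extend(sims)
    dists_flat.extend([1 - x for x in sims])

print(sims_flat)
print(dists_flat)
```

Identical stand-in fingerprints (the first and third here) give a similarity of 1 and therefore a distance of 0; the only difference between the two lists is the 1-x conversion.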
Here's a little demo to show that BulkTanimotoSimilarity is actually
returning similarities:
In [3]: fp1 = Chem.RDKFingerprint(Chem.MolFromSmiles('CCc1ccccc1'))
In [4]: fp2 = Chem.RDKFingerprint(Chem.MolFromSmiles('CCCc1ccccc1'))
In [5]: DataStructs.TanimotoSimilarity(fp1,fp2)
Out[5]: 0.7590361445783133
In [6]: DataStructs.BulkTanimotoSimilarity(fp1,(fp1,fp2))
Out[6]: [1.0, 0.7590361445783133]
You can see that the values are what you'd expect.
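One thing to keep in mind if you want a full square matrix: the cookbook loop only produces a flat list holding the lower triangle (row i contributes its i values against fps[:i]), with no diagonal. A minimal sketch of expanding that flat list, assuming plain Python lists are good enough; square_from_flat() is a hypothetical helper, not an RDKit function, and the input values are made up:

```python
def square_from_flat(flat, n, diagonal):
    """Expand a flat lower triangle (row by row, no diagonal) into an n x n matrix."""
    if len(flat) != n * (n - 1) // 2:
        raise ValueError("flat list length does not match n")
    matrix = [[diagonal] * n for _ in range(n)]
    k = 0
    for i in range(1, n):
        for j in range(i):
            # The matrix is symmetric, so fill both halves from one value.
            matrix[i][j] = matrix[j][i] = flat[k]
            k += 1
    return matrix


# Made-up similarities for three fingerprints, in the order the loop produces them:
# (1,0), (2,0), (2,1)
flat_sims = [0.6, 1.0, 0.6]
sim_matrix = square_from_flat(flat_sims, 3, diagonal=1.0)
for row in sim_matrix:
    print(row)
```

For a similarity matrix the diagonal is 1.0; for a distance matrix built from the 1-x values you would pass diagonal=0.0 instead.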
I hope this helps,
-greg
On Wed, Jun 6, 2018 at 12:26 PM Jennifer Hemmerich <
[email protected]> wrote:
> I was trying to calculate a Similarity Matrix with Morgan Fingerprints and
> TanimotoSimilarity.
>
> If I use DataStructs.TanimotoSimilarity(fp,fp) I get a Similarity of 1,
> which I would expect. If I do the same with the
> DataStructs.BulkTanimotoSimilarity(fps[i], fps[i]) I get a Similarity of 0,
> which actually is not a Similarity but a distance. I took this from the
> cookbook (http://www.rdkit.org/docs/Cookbook.html) which states at the
> clustering example:
>
> # first generate the distance matrix:
> dists = []
> nfps = len(fps)
> for i in range(1,nfps):
>     sims = DataStructs.BulkTanimotoSimilarity(fps[i],fps[:i])
>     dists.extend([1-x for x in sims])
>
> Did I misunderstand something or is the dists list in the example
> actually a list of Similarities, and the BulkTanimotoSimilarity actually
> calculates the Tanimoto distance?
>
> It would be great to get some clarification.
>
> Thank you in advance,
>
> Jennifer
>
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>