Hi Jennifer,

The sample code in the documentation is there to generate a distance matrix
(which is what one needs for clustering). So it calculates the similarities
with this line:

        sims = DataStructs.BulkTanimotoSimilarity(fps[i],fps[:i])

and then converts them to distances by subtracting from 1:

        dists.extend([1-x for x in sims])
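
To make the relationship between the two lists concrete, here is a minimal plain-Python sketch of the same loop. It does not use the RDKit: fingerprints are modelled as sets of on-bit indices, and the names tanimoto, bulk_tanimoto, and the toy fingerprints are hypothetical stand-ins for illustration, not the RDKit implementation.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        # Convention chosen for this sketch only; not necessarily what the RDKit does.
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def bulk_tanimoto(probe, others):
    """Similarity of one fingerprint to each fingerprint in a sequence."""
    return [tanimoto(probe, other) for other in others]


# Toy fingerprints (made-up data); the first and third are identical.
fps = [{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3, 4}]

sims = []   # flattened lower triangle of similarities
dists = []  # the same triangle, converted to distances
for i in range(1, len(fps)):
    row = bulk_tanimoto(fps[i], fps[:i])
    sims.extend(row)
    dists.extend(1 - x for x in row)

print(sims)
print(dists)
```

Identical fingerprints give a similarity of 1 and therefore a distance of 0, so a 0 in dists is what you would expect for a pair of identical fingerprints.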


If you want the similarity matrix, you'd just do:

        dists.extend(sims)

Here's a little demo to show that BulkTanimotoSimilarity is actually
returning similarities:

In [3]: fp1 = Chem.RDKFingerprint(Chem.MolFromSmiles('CCc1ccccc1'))

In [4]: fp2 = Chem.RDKFingerprint(Chem.MolFromSmiles('CCCc1ccccc1'))

In [5]: DataStructs.TanimotoSimilarity(fp1,fp2)
Out[5]: 0.7590361445783133

In [6]: DataStructs.BulkTanimotoSimilarity(fp1,(fp1,fp2))
Out[6]: [1.0, 0.7590361445783133]

You can see that the values are what you'd expect.
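
The loop in the documentation produces a flat list holding only the lower triangle (row 1, then row 2, and so on). If a square matrix is more convenient, something along these lines should work. This is only a sketch: full_matrix is a hypothetical helper, not part of the RDKit, and the input values are made up.

```python
def full_matrix(tri, n, diagonal=1.0):
    """Expand a flat lower triangle (row 1, then row 2, ...) into an n x n symmetric matrix."""
    if len(tri) != n * (n - 1) // 2:
        raise ValueError("expected n*(n-1)/2 values in the lower triangle")
    matrix = [[diagonal] * n for _ in range(n)]
    k = 0
    for i in range(1, n):
        for j in range(i):
            # Mirror each value across the diagonal.
            matrix[i][j] = matrix[j][i] = tri[k]
            k += 1
    return matrix


# Made-up similarities for three fingerprints: (1,0), (2,0), (2,1)
tri = [0.76, 0.5, 0.25]
matrix = full_matrix(tri, 3)
for row in matrix:
    print(row)
```

For similarities the diagonal is 1.0 (each fingerprint compared with itself); for a distance matrix you would pass diagonal=0.0 instead.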

I hope this helps,
-greg

On Wed, Jun 6, 2018 at 12:26 PM Jennifer Hemmerich <
[email protected]> wrote:

> I was trying to calculate a Similarity Matrix with Morgan Fingerprints and
> TanimotoSimilarity.
>
> If I use DataStructs.TanimotoSimilarity(fp,fp) I get a Similarity of 1,
> which I would expect. If I do the same with the
> DataStructs.BulkTanimotoSimilarity(fps[i], fps[i]) I get a Similarity of 0,
> which actually is not a Similarity but a distance. I took this from the
> cookbook (http://www.rdkit.org/docs/Cookbook.html) which states at the
> clustering example:
>
>  # first generate the distance matrix:
>     dists = []
>     nfps = len(fps)
>     for i in range(1,nfps):
>         sims = DataStructs.BulkTanimotoSimilarity(fps[i],fps[:i])
>         dists.extend([1-x for x in sims])
>
>  Did I misunderstand something or is the dists list in the example
> actually a list of Similarities, and the BulkTanimotoSimilarity actually
> calculates the Tanimoto distance?
>
> It would be great to get some clarification.
>
> Thank you in advance,
>
> Jennifer
>
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>