A search algorithm that, say, proposes a prefix or a suffix to a SMILES string 
would need to have a way to autocomplete candidates before it could use these 
descriptors to guide an optimization because the parsing step is non-trivial, 
never mind the sanitization step (mentioned on that web page).
I will deflect blame onto Jon for changing the topic from music to chemicals, but 
presumably, with enough debate, their aesthetic preferences in music could be 
formalized in theory or in some rule-based way, as is manifest here.
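
To make the autocomplete point concrete, here is a rough pure-Python sketch of what a search would have to track before a SMILES prefix could be completed: open branches, unclosed ring labels, and an unfinished bracket atom. The function name open_structures is hypothetical, and the scan is deliberately simplified; it is not a SMILES parser and says nothing about valence or sanitization.

```python
def open_structures(fragment):
    """Return (open_branches, open_ring_labels, inside_bracket_atom).

    Simplified scan (an assumption of this sketch): it tracks only
    '(' / ')' nesting, ring-bond labels (single digits and %nn), and
    bracket atoms '[...]'. Everything else is ignored.
    """
    depth = 0
    open_rings = set()
    in_bracket = False
    i = 0
    while i < len(fragment):
        ch = fragment[i]
        if in_bracket:
            # Digits inside [...] are isotopes, H counts or charges, not rings.
            if ch == "]":
                in_bracket = False
        elif ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')'")
        elif ch == "%":
            label = fragment[i + 1:i + 3]
            if len(label) != 2 or not label.isdigit():
                raise ValueError("malformed %nn ring label")
            open_rings ^= {int(label)}  # first use opens, second use closes
            i += 2
        elif ch.isdigit():
            open_rings ^= {int(ch)}
        i += 1
    return depth, sorted(open_rings), in_bracket


# A prefix with one open branch and ring label 1 still unclosed.
print(open_structures("c1ccc("))
```

A candidate is only worth handing to the real parser once all three values are back to zero, empty, and False; even then, parsing and sanitization can still reject it.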

From: Friam <friam-boun...@redfish.com> On Behalf Of Roger Critchlow
Sent: Tuesday, October 12, 2021 2:10 PM
To: The Friday Morning Applied Complexity Coffee Group <friam@redfish.com>
Subject: Re: [FRIAM] Schwill Rock?

Hmm, when I was in the drug discovery canal, the "descriptors" that you could 
calculate from a SMILES string were legion.

Here's the list for RDKit:
https://www.rdkit.org/docs/GettingStartedInPython.html#list-of-available-descriptors
There is one bunch that depends entirely on the formula and molecular 
structure.  Then there's a whole other bunch you can compute if you generate 3d 
structures for the molecules, possibly multiplied by the number of low energy 
structures the molecule can adopt.
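
As an illustration of the two bunches, the following sketch computes a few graph-only descriptors and then one shape descriptor per embedded conformer. It assumes RDKit is installed; ethanol, the seed, and the number of conformers are arbitrary choices for the example, and the embedded conformers are not energy-minimized, so they only stand in for "low energy structures".

```python
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors, Descriptors3D

mol = Chem.MolFromSmiles("CCO")  # ethanol, chosen only for illustration

# Bunch 1: descriptors that need only the formula and molecular graph.
graph_descriptors = {
    "MolWt": Descriptors.MolWt(mol),
    "MolLogP": Descriptors.MolLogP(mol),
    "TPSA": Descriptors.TPSA(mol),
}

# The full list of (name, function) pairs behind the documentation page.
n_available = len(Descriptors._descList)

# Bunch 2: descriptors that need 3D coordinates. Hydrogens are added
# before embedding; each conformer gets its own descriptor value.
mol3d = Chem.AddHs(mol)
conf_ids = list(AllChem.EmbedMultipleConfs(mol3d, numConfs=3, randomSeed=42))
radius_by_conformer = {
    cid: Descriptors3D.RadiusOfGyration(mol3d, confId=cid) for cid in conf_ids
}

print(graph_descriptors)
print(n_available, "graph descriptors available")
print(radius_by_conformer)
```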

What kind of plausibility were you looking for?  Does the SMILES string specify 
a real molecule?  That's hard.  There are syntax errors in SMILES, failures to 
close rings, valency errors, charge errors.  But there are lots of 
syntactically valid SMILES that won't match any known molecule, either because 
they're impossible or as yet to be determined.  The pharmas all have their own 
lists of molecules of interest, but those are proprietary.  Looks like there 
are various online databases, none that I'm familiar with.  If the SMILES 
parses, you can try generating a 2d depiction and a 3d structure.  Those will 
throw exceptions if things get too weird.
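
The sequence of checks described above can be sketched as a staged test, assuming RDKit is installed. The function name validation_stage and the 0 to 3 stage numbering are made up for this example. The result is only a coarse ordinal of how far a string gets, not a measure of whether the molecule is real. Note that in RDKit the parser and the embedder signal failure by return value (None and -1 respectively), while sanitization raises.

```python
from rdkit import Chem, RDLogger
from rdkit.Chem import AllChem

RDLogger.DisableLog("rdApp.*")  # silence parser messages for bad inputs


def validation_stage(smiles):
    """Return (stage, detail): how far the string gets before failing."""
    # Stage 0: syntax, including unclosed rings and branches.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return 0, "syntax"
    # Stage 1: sanitization (valence, aromaticity and similar checks).
    try:
        Chem.SanitizeMol(mol)
    except Exception as exc:  # exception classes vary across RDKit versions
        return 1, type(exc).__name__
    # Stage 2: 2D depiction coordinates, then a 3D embedding with hydrogens.
    AllChem.Compute2DCoords(mol)
    mol3d = Chem.AddHs(mol)
    if AllChem.EmbedMolecule(mol3d, randomSeed=42) == -1:
        return 2, "embedding failed"
    return 3, "ok"


for candidate in ["CCO", "C(", "C(C)(C)(C)(C)C"]:
    print(candidate, validation_stage(candidate))
```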

-- rec --

On Tue, Oct 12, 2021 at 3:22 PM Marcus Daniels <mar...@snoutfarm.com> wrote:
I was playing with RDKit the other day, and it wasn’t obvious how to get a 
scalar measure of a molecule's plausibility.   It seems a SMILES string is 
right or wrong, and then maybe there are some warnings that can be trapped.   
However, the benefits for search or fair sampling are different from the needs 
of correctness checks, which is a deeper property.   That isn’t quite a fit to 
the music example where aesthetic considerations are subjective.

From: Friam <friam-boun...@redfish.com> On Behalf Of Jon Zingale
Sent: Tuesday, October 12, 2021 12:11 PM
To: friam@redfish.com
Subject: Re: [FRIAM] Schwill Rock?

"I mean from the perspective of aesthetics. Understanding why Pandora is 
messing it up means sampling the deep wells."

Yes, but not more than one has to. This is why I am advocating for methods like 
a weighted ensemble. The working analogy for me comes from drug discovery. It 
doesn't make a lot of sense to probe the same old sites and the same old 
conformations.

.-- .- -. - / .- -.-. - .. --- -. ..--.. / -.-. --- -. .--- ..- --. .- - .
FRIAM Applied Complexity Group listserv
Zoom Fridays 9:30a-12p Mtn UTC-6  bit.ly/virtualfriam
un/subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives:
 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
 1/2003 thru 6/2021  http://friam.383.s1.nabble.com/