On 8/9/24 01:10, Zara Parst wrote: ...
One idea that came to my mind was to encode these into other searchable sequences when storing them, but this creates problems of collisions and a massive mapping. Also, even small changes in a SMILES code completely change the compound, which defeats the purpose of the search. Have you tried something like this before? Any ideas are welcome.
What are you trying to achieve? SMILES strings are not that great to begin with, and if you want your searches to be chemically meaningful, I doubt you can do that with Solr -- unless you write your own Lucene searcher.
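
To make the point about text matching concrete, here is a minimal Python sketch. It is only an illustration: the helper edit_distance and the constant names are made up for this example, and plain edit distance is used as a stand-in for text-based matching in general, not for what Solr actually does. Ethanol can be written as CCO or OCC, while CCN is ethylamine, a different compound.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical names, chosen for this illustration only.
ETHANOL = "CCO"
ETHANOL_ALT = "OCC"   # the same molecule, written from the other end
ETHYLAMINE = "CCN"    # a different compound, one character away

same_molecule = edit_distance(ETHANOL, ETHANOL_ALT)
different_molecule = edit_distance(ETHANOL, ETHYLAMINE)
print(same_molecule, different_molecule)
```

By this measure the two spellings of ethanol are further apart than ethanol and ethylamine, so string closeness says little about chemical closeness.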
Dima