Actually, I have a small app called MassiveMark, where people insert different kinds of content: Markdown (MathML, LaTeX, chemical formulas, etc.), code blocks, images, and plain text. Later on we figured out that it is mainly used by students and professors to create lecture notes, exam papers, and so on. I guess they are mostly converting text from ChatGPT and downloading it as docx. However, a few users have asked whether we can let them store their documents and fetch them later. We are also planning to let them search within a document. I have no clue how we are going to search organic chemistry content: compounds are mainly represented as SMILES codes, which are converted when the document is displayed or exported to docx.
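To make the problem a bit more concrete, here is a rough sketch of what I mean by turning a SMILES code into something a text index could hold. Everything in it is an assumption for illustration: the `tokenize_smiles` name and the regex are made up and are not how MassiveMark works today. It only splits the string into atom, bond, branch, and ring-closure tokens. It does not canonicalize, so the same compound written two ways still gives different token sequences, and it knows nothing about chemistry.

```python
import re

# Hypothetical token pattern (an assumption, not a full SMILES grammar):
# bracket atoms, two-letter halogens, organic-subset atoms, aromatic atoms,
# two-digit ring closures, single-digit ring closures, bonds and branches.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[=#$:/\\\-+.()@~*])"
)


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens that could be indexed as terms."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Reject input the pattern could not fully cover, rather than
    # silently indexing a partial token list.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    print(tokenize_smiles("CC(=O)O"))      # acetic acid
    print(tokenize_smiles("[Na+].[Cl-]"))  # sodium chloride
    # Ethanol and acetaldehyde differ by one token but are different compounds.
    print(tokenize_smiles("CCO"), tokenize_smiles("CC=O"))
```

Even in this toy form the weakness shows up: `CCO` and `CC=O` are almost identical as token sequences, so a plain text match would treat them as near neighbours even though they are different compounds.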
In case you want to look at the app, here is the playground (it is still experimental):
https://www.assignmenthelp.net/massivemark

On Fri, Aug 9, 2024 at 7:16 PM Dmitri Maziuk <dmitri.maz...@gmail.com> wrote:

> On 8/9/24 01:10, Zara Parst wrote:
> ...
> > One idea that came to my mind was to encode these into other searchable
> > sequences while storing but this creates problems of collusion and massive
> > mapping. Also just small changes in smile codes completely changes the
> > compound which loses the meaning of search. Have you tried something
> > like this before? Any ideas are welcome.
>
> What are you trying to achieve? SMILES strings are not that great to
> begin with, and if you want your searches chemically meaningful, I doubt
> you can do that with Solr -- unless you write your own Lucene searcher.
>
> Dima