OK, thanks Jeff. I'm glad I wasn't missing something important.

There is already a blacklist text mechanism that suppresses identification of specific text by clinical domain. Looking at the code, it collects entries like

cTakesSemanticCode,texta,textb,textc
NE_TYPE_ID_DRUG, jasmine, coriander, bleach

There's a case-sensitive list and a case-insensitive one.
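
A minimal sketch of how entries in that format might be interpreted, written in Python rather than cTAKES's Java. The function names `parse_blacklist` and `is_suppressed` are hypothetical and are not taken from the cTAKES code; the `NE_TYPE_ID_DISORDER` entry is an assumed example modeled on the `NE_TYPE_ID_DRUG` line above.

```python
from collections import defaultdict


def parse_blacklist(lines, case_sensitive):
    """Map each semantic code to the set of texts suppressed for it.

    Each line is assumed to look like: semanticCode,texta,textb,textc
    """
    blocked = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        code, *texts = [field.strip() for field in line.split(",")]
        for text in texts:
            if text:
                # Case-insensitive entries are stored lower-cased.
                blocked[code].add(text if case_sensitive else text.lower())
    return blocked


def is_suppressed(code, text, sensitive, insensitive):
    """True if `text` is blacklisted for `code` in either list."""
    return (text in sensitive.get(code, ())
            or text.lower() in insensitive.get(code, ()))


# Hypothetical entries: 'bed' is suppressed as a disorder only in lower case.
sensitive = parse_blacklist(["NE_TYPE_ID_DISORDER, bed"], case_sensitive=True)
insensitive = parse_blacklist(
    ["NE_TYPE_ID_DRUG, jasmine, coriander, bleach"], case_sensitive=False
)
```

With these two lists, 'bed' is suppressed as a disorder while 'BED' is still allowed through, and 'Bleach' is suppressed as a drug regardless of case.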
So I will try that. In one of my examples, I'll say that 'bed' is not a disorder, while 'BED' could be one.

On Tue, Aug 4, 2020 at 2:12 PM Jeffrey Miller <jeff...@gmail.com> wrote:

> Hi Peter,
>
> To your question about sno_rx_16ab I suspect that the CUI is new since
> 2016, or if it existed in UMLS back then, it was not associated with a term
> in snomed or rxnorm at that time.
>
> To those solutions, if you are able to use the trunk I know Sean said there
> was a suppression text feature, otherwise in the past I have removed the
> lines from the .script file
>
> I definitely think the acronym case sensitive feature would be great.
>
> Jeff
>
> On Tue, Aug 4, 2020 at 3:28 PM Peter Abramowitsch <pabramowit...@gmail.com>
> wrote:
>
> > Hi Jeff et al
> >
> > To take up the thread from a few days ago where a simple english word such
> > as bed, soft, shop also maps into a legitimate but rarely used acronym and
> > shows up in the same POS as a potentially interesting entity, what is the
> > mechanism you would use to disambiguate?
> >
> > This problem only started since I constructed a SNO+RX+HGNC dictionary
> > from the 2020A UMLS dump. Adding more TUIS where a more conventional
> > word-sense of the target word occurs, does not fix this problem.
> >
> > For instance, why does the sno_rx dictionary not contain this disease which
> > aliases to "bed" ?
> >
> > ucsf_dict_v1 $ grep 3159311 *.script
> > *INSERT INTO CUI_TERMS VALUES(3159311,0,1,'bed','bed')*
> > INSERT INTO CUI_TERMS VALUES(3159311,5,8,'myopia , high , with nonprogressive cone dysfunction','nonprogressive')
> > INSERT INTO CUI_TERMS VALUES(3159311,0,3,'bornholm eye disease','bornholm')
> > INSERT INTO CUI_TERMS VALUES(3159311,5,6,'x-linked cone dysfunction syndrome with myopia','myopia')
> > INSERT INTO TUI VALUES(3159311,47)
> > *INSERT INTO PREFTERM VALUES(3159311,'BORNHOLM EYE DISEASE')*
> > INSERT INTO SNOMEDCT_US VALUES(3159311,718718009)
> >
> > sno_rx_16ab $ grep 3159311 *.script
> > nada
> >
> > Solutions good or evil?
> >
> >    - Strip the relevant lines out of the dict.script file?
> >    - Blacklist the text?
> >    - Add to my stopCUI list (a little feature I added)?
> >    - Some other configuration I don't know about?
> >      For instance, is there a CUI:ACRONYM table?
> >      I'm tempted to create one. This would require the matching term to be
> >      present in upper case.
> >
> > Peter
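
The CUI:ACRONYM idea raised in the quoted message could be sketched roughly as follows. This is a sketch in Python under stated assumptions, not an existing cTAKES feature: the table `ACRONYM_TERMS` and the function `keep_match` are hypothetical names, and the single entry is taken from the grep output above (CUI 3159311, alias 'bed').

```python
# Hypothetical CUI:ACRONYM table: CUI -> aliases that should only match
# when the text in the note is written entirely in upper case.
ACRONYM_TERMS = {3159311: {"bed"}}


def keep_match(cui, matched_text, acronym_terms=ACRONYM_TERMS):
    """Decide whether a dictionary hit should be kept.

    A hit on an acronym alias is kept only if the matched text is upper
    case; hits on any other term for the same CUI are always kept.
    """
    acronyms = acronym_terms.get(cui, set())
    if matched_text.lower() not in acronyms:
        return True  # not an acronym alias, e.g. 'bornholm eye disease'
    return matched_text.isupper()
```

Under this rule 'BED' would still resolve to Bornholm eye disease, while 'bed' and 'Bed' would be dropped, and the longer synonyms for the same CUI would be unaffected.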