Where in the source code is this feature implemented?

On Tue, Aug 4, 2020 at 7:30 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:
> Blacklist format
>
> Actually I got it inverted, it's:
>
> semantic_code1, semantic_code2,...|text1
> semantic_code1, semantic_code2,...|text2
>
> Peter
>
> On Tue, Aug 4, 2020 at 4:16 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:
>
> > Ok Thanks Jeff. I'm glad I wasn't missing something important.
> >
> > There already is a blacklist text mechanism which suppresses
> > identification of specific text by clinical domain.
> > Looking at the code it collects entries like
> > cTakesSemanticCode,texta,textb,textc
> > NE_TYPE_ID_DRUG, jasmine, coriander, bleach
> > There's a case sensitive list and a case insensitive one.
> >
> > So I will try that.
> > In one of my examples, I'll say that 'bed' is not a disorder, while 'BED'
> > could be one.
> >
> > On Tue, Aug 4, 2020 at 2:12 PM Jeffrey Miller <jeff...@gmail.com> wrote:
> >
> >> Hi Peter,
> >>
> >> To your question about sno_rx_16ab I suspect that the CUI is new since
> >> 2016, or if it existed in UMLS back then, it was not associated with a
> >> term in snomed or rxnorm at that time.
> >>
> >> To those solutions, if you are able to use the trunk I know Sean said
> >> there was a suppression text feature, otherwise in the past I have
> >> removed the lines from the .script file
> >>
> >> I definitely think the acronym case sensitive feature would be great.
> >>
> >> Jeff
> >>
> >> On Tue, Aug 4, 2020 at 3:28 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:
> >>
> >> > Hi Jeff et al
> >> >
> >> > To take up the thread from a few days ago where a simple English word
> >> > such as bed, soft, shop also maps into a legitimate but rarely used
> >> > acronym and shows up in the same POS as a potentially interesting
> >> > entity, what is the mechanism you would use to disambiguate?
> >> >
> >> > This problem only started since I constructed a SNO+RX+HGNC dictionary
> >> > from the 2020A UMLS dump. Adding more TUIS where a more conventional
> >> > word-sense of the target word occurs, does not fix this problem.
> >> >
> >> > For instance, why does the sno_rx dictionary not contain this disease
> >> > which aliases to "bed" ?
> >> >
> >> > ucsf_dict_v1 $ grep 3159311 *.script
> >> > *INSERT INTO CUI_TERMS VALUES(3159311,0,1,'bed','bed')*
> >> > INSERT INTO CUI_TERMS VALUES(3159311,5,8,'myopia , high , with nonprogressive cone dysfunction','nonprogressive')
> >> > INSERT INTO CUI_TERMS VALUES(3159311,0,3,'bornholm eye disease','bornholm')
> >> > INSERT INTO CUI_TERMS VALUES(3159311,5,6,'x-linked cone dysfunction syndrome with myopia','myopia')
> >> > INSERT INTO TUI VALUES(3159311,47)
> >> > *INSERT INTO PREFTERM VALUES(3159311,'BORNHOLM EYE DISEASE')*
> >> > INSERT INTO SNOMEDCT_US VALUES(3159311,718718009)
> >> >
> >> > sno_rx_16ab $ grep 3159311 *.script
> >> > nada
> >> >
> >> > Solutions good or evil?
> >> >
> >> > - Strip the relevant lines out of the dict.script file?
> >> > - Blacklist the text?
> >> > - Add to my stopCUI list (a little feature I added)?
> >> > - Some other configuration I don't know about?
> >> >   For instance, is there a CUI:ACRONYM table?
> >> >   I'm tempted to create one. This would require the matching term to
> >> >   be present in upper case.
> >> >
> >> > Peter
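
For readers following the thread, the corrected blacklist format (`semantic_code1, semantic_code2,...|text`) and the case-sensitive versus case-insensitive lists can be illustrated with a small Python sketch. This is not the cTAKES implementation and does not answer where the feature lives in the source. The function names `parse_blacklist` and `is_suppressed` are hypothetical, and `NE_TYPE_ID_DISORDER` is used only as an illustrative label alongside the `NE_TYPE_ID_DRUG` code quoted above.

```python
from collections import defaultdict


def parse_blacklist(lines, fold_case=False):
    """Map each blocked text to the set of semantic codes it is blocked for.

    Each line is assumed to look like 'code1, code2,...|text'.
    With fold_case=True the text key is lowercased (case-insensitive list).
    """
    blocked = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        codes, text = line.split("|", 1)
        key = text.strip().lower() if fold_case else text.strip()
        for code in codes.split(","):
            code = code.strip()
            if code:
                blocked[key].add(code)
    return dict(blocked)


def is_suppressed(text, semantic_code, case_sensitive, case_insensitive):
    """Return True if `text` should not be reported as `semantic_code`."""
    if semantic_code in case_sensitive.get(text, ()):
        return True
    return semantic_code in case_insensitive.get(text.lower(), ())


# Case-sensitive list: lowercase 'bed' is not a disorder, 'BED' may still be one.
case_sensitive = parse_blacklist(["NE_TYPE_ID_DISORDER|bed"])

# Case-insensitive list: any casing of these words is not a drug.
case_insensitive = parse_blacklist(
    ["NE_TYPE_ID_DRUG|jasmine", "NE_TYPE_ID_DRUG|Bleach"],
    fold_case=True,
)
```

With these two lists, `is_suppressed("bed", "NE_TYPE_ID_DISORDER", case_sensitive, case_insensitive)` is true while the same call with `"BED"` is false, which is the behaviour Peter describes wanting for the acronym case.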