From codebase 4.0.1 org.apache.ctakes.dictionary.lookup2.consumer.DefaultTermConsumer line 98, but you'll see it referenced everywhere in the file.
Oddly enough it is not abstracted out so it could be used in the
PrecisionTermConsumer. I'm just testing it.

Peter

On Tue, Aug 4, 2020 at 4:41 PM Jeffrey Miller <jeff...@gmail.com> wrote:

> Where in the source code is this feature implemented?
>
> On Tue, Aug 4, 2020 at 7:30 PM Peter Abramowitsch <pabramowit...@gmail.com>
> wrote:
>
> > Blacklist format
> > Actually I got it inverted, it's:
> >
> > semantic_code1, semantic_code2,...|text1
> > semantic_code1, semantic_code2,...|text2
> >
> > Peter
> >
> > On Tue, Aug 4, 2020 at 4:16 PM Peter Abramowitsch <pabramowit...@gmail.com>
> > wrote:
> >
> > > Ok Thanks Jeff. I'm glad I wasn't missing something important.
> > >
> > > There already is a blacklist text mechanism which suppresses
> > > identification of specific text by clinical domain.
> > > Looking at the code it collects entries like
> > > cTakesSemanticCode,texta,textb,textc
> > > NE_TYPE_ID_DRUG, jasmine, coriander, bleach
> > > There's a case sensitive list and a case insensitive one.
> > >
> > > So I will try that.
> > > In one of my examples, I'll say that 'bed' is not a disorder, while 'BED'
> > > could be one.
> > >
> > > On Tue, Aug 4, 2020 at 2:12 PM Jeffrey Miller <jeff...@gmail.com> wrote:
> > >
> > >> Hi Peter,
> > >>
> > >> To your question about sno_rx_16ab I suspect that the CUI is new since
> > >> 2016, or if it existed in UMLS back then, it was not associated with a term
> > >> in snomed or rxnorm at that time.
> > >>
> > >> To those solutions, if you are able to use the trunk I know Sean said there
> > >> was a suppression text feature, otherwise in the past I have removed the
> > >> lines from the .script file.
> > >>
> > >> I definitely think the acronym case sensitive feature would be great.
> > >>
> > >> Jeff
> > >>
> > >> On Tue, Aug 4, 2020 at 3:28 PM Peter Abramowitsch <pabramowit...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Jeff et al
> > >> >
> > >> > To take up the thread from a few days ago where a simple English word such
> > >> > as bed, soft, shop also maps into a legitimate but rarely used acronym and
> > >> > shows up in the same POS as a potentially interesting entity, what is the
> > >> > mechanism you would use to disambiguate?
> > >> >
> > >> > This problem only started since I constructed a SNO+RX+HGNC dictionary
> > >> > from the 2020A UMLS dump. Adding more TUIS where a more conventional
> > >> > word-sense of the target word occurs, does not fix this problem.
> > >> >
> > >> > For instance, why does the sno_rx dictionary not contain this disease which
> > >> > aliases to "bed" ?
> > >> >
> > >> > ucsf_dict_v1 $ grep 3159311 *.script
> > >> > *INSERT INTO CUI_TERMS VALUES(3159311,0,1,'bed','bed')*
> > >> > INSERT INTO CUI_TERMS VALUES(3159311,5,8,'myopia , high , with nonprogressive cone dysfunction','nonprogressive')
> > >> > INSERT INTO CUI_TERMS VALUES(3159311,0,3,'bornholm eye disease','bornholm')
> > >> > INSERT INTO CUI_TERMS VALUES(3159311,5,6,'x-linked cone dysfunction syndrome with myopia','myopia')
> > >> > INSERT INTO TUI VALUES(3159311,47)
> > >> > *INSERT INTO PREFTERM VALUES(3159311,'BORNHOLM EYE DISEASE')*
> > >> > INSERT INTO SNOMEDCT_US VALUES(3159311,718718009)
> > >> >
> > >> > sno_rx_16ab $ grep 3159311 *.script
> > >> > nada
> > >> >
> > >> > Solutions good or evil?
> > >> >
> > >> > - Strip the relevant lines out of the dict.script file?
> > >> > - Blacklist the text?
> > >> > - Add to my stopCUI list (a little feature I added)?
> > >> > - Some other configuration I don't know about?
> > >> > For instance, is there a CUI:ACRONYM table?
> > >> > I'm tempted to create one. This would require the matching term to be
> > >> > present in upper case.
> > >> >
> > >> > Peter
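
For anyone following along, here is a rough sketch of how blacklist lines in the `semantic_code1, semantic_code2,...|text` format described in this thread could be parsed and checked, with separate case-sensitive and case-insensitive lists. This is not the cTAKES code from DefaultTermConsumer. The class name `BlacklistSketch`, its methods, and the plain-string codes `DISORDER` and `DRUG` are all made up for illustration, and the real consumer may store and match entries differently.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; NOT the cTAKES DefaultTermConsumer implementation. */
final class BlacklistSketch {
    // semantic code -> texts suppressed only when the case matches exactly
    private final Map<String, Set<String>> caseSensitive = new HashMap<>();
    // semantic code -> lower-cased texts suppressed regardless of case
    private final Map<String, Set<String>> caseInsensitive = new HashMap<>();

    /** Parses one line of the form "code1, code2,...|text" into the chosen list. */
    void addLine(String line, boolean matchCase) {
        int bar = line.indexOf('|');
        if (bar < 0) {
            throw new IllegalArgumentException("Expected 'codes|text' but got: " + line);
        }
        String text = line.substring(bar + 1).trim();
        for (String rawCode : line.substring(0, bar).split(",")) {
            String code = rawCode.trim();
            if (code.isEmpty()) {
                continue; // tolerate stray commas
            }
            if (matchCase) {
                caseSensitive.computeIfAbsent(code, k -> new HashSet<>()).add(text);
            } else {
                caseInsensitive.computeIfAbsent(code, k -> new HashSet<>())
                        .add(text.toLowerCase(Locale.ROOT));
            }
        }
    }

    /** Returns true if the text should not be reported under the given semantic code. */
    boolean isSuppressed(String code, String text) {
        Set<String> exact = caseSensitive.get(code);
        if (exact != null && exact.contains(text)) {
            return true;
        }
        Set<String> folded = caseInsensitive.get(code);
        return folded != null && folded.contains(text.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        BlacklistSketch blacklist = new BlacklistSketch();
        // Placeholder codes; a real setup would use the cTAKES semantic codes.
        blacklist.addLine("DISORDER|bed", true);          // case-sensitive entry
        blacklist.addLine("DRUG, DISORDER|jasmine", false); // case-insensitive entry

        System.out.println(blacklist.isSuppressed("DISORDER", "bed"));  // true
        System.out.println(blacklist.isSuppressed("DISORDER", "BED"));  // false
        System.out.println(blacklist.isSuppressed("DRUG", "Jasmine"));  // true
    }
}
```

Putting 'bed' in the case-sensitive list is what lets the lowercase word be dropped as a disorder while the upper-case acronym 'BED' can still be reported.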