Blacklist format
Actually, I got it inverted. It's:

semantic_code1, semantic_code2,...|text1
semantic_code1, semantic_code2,...|text2
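
As a rough illustration only, here is a minimal Python sketch of reading lines in that shape and checking a term against them. It is not the cTAKES parser: the function names `parse_blacklist` and `is_blocked` are hypothetical, NE_TYPE_ID_DISORDER is used as a placeholder code, and the case-insensitive switch is just my reading of the two lists mentioned in the quoted message below.

```python
from collections import defaultdict


def parse_blacklist(lines):
    """Map each blacklisted text to the set of semantic codes it is suppressed for.

    Assumes one entry per line: 'code1, code2,...|text'.
    """
    blocked = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed rows
        codes, text = line.split("|", 1)
        for code in codes.split(","):
            code = code.strip()
            if code:
                blocked[text.strip()].add(code)
    return dict(blocked)


def is_blocked(blocked, code, text, case_sensitive=True):
    """Return True if `text` is blacklisted for semantic code `code`."""
    if case_sensitive:
        return code in blocked.get(text, set())
    wanted = text.lower()
    return any(code in codes for t, codes in blocked.items() if t.lower() == wanted)


# Hypothetical sample entries, not taken from a real cTAKES configuration.
sample = [
    "NE_TYPE_ID_DRUG, NE_TYPE_ID_DISORDER|bed",
    "NE_TYPE_ID_DRUG|jasmine",
]
blocked = parse_blacklist(sample)
```

With the case-sensitive lookup, 'bed' is suppressed as a disorder while 'BED' is left alone, which is the behaviour I am after.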

Peter

On Tue, Aug 4, 2020 at 4:16 PM Peter Abramowitsch <pabramowit...@gmail.com>
wrote:

> Ok Thanks Jeff.  I'm glad I wasn't missing something important.
>
> There already is a blacklist text mechanism which suppresses
> identification of specific text by clinical domain.
> Looking at the code it collects entries like
> cTakesSemanticCode,texta,textb,textc
> NE_TYPE_ID_DRUG, jasmine, coriander, bleach
> There's a case sensitive list and a case insensitive one.
>
> So I will try that.
> In one of my examples, I'll say that 'bed' is not a disorder, while 'BED'
> could be one.
>
>
>
> On Tue, Aug 4, 2020 at 2:12 PM Jeffrey Miller <jeff...@gmail.com> wrote:
>
>> Hi Peter,
>>
>> To your question about sno_rx_16ab, I suspect that the CUI is new since
>> 2016, or if it existed in UMLS back then, it was not associated with a
>> term in SNOMED or RxNorm at that time.
>>
>> As for those solutions: if you are able to use the trunk, I know Sean
>> said there was a suppression text feature; otherwise, in the past I have
>> removed the lines from the .script file.
>>
>> I definitely think the case-sensitive acronym feature would be great.
>>
>> Jeff
>>
>> On Tue, Aug 4, 2020 at 3:28 PM Peter Abramowitsch <
>> pabramowit...@gmail.com>
>> wrote:
>>
>> > Hi Jeff et al
>> >
>> > To take up the thread from a few days ago, where a simple English word
>> > such as bed, soft, or shop also maps to a legitimate but rarely used
>> > acronym and shows up in the same POS as a potentially interesting entity:
>> > what mechanism would you use to disambiguate?
>> >
>> > This problem only started once I constructed a SNO+RX+HGNC dictionary
>> > from the 2020A UMLS dump.   Adding more TUIs in which a more conventional
>> > word sense of the target word occurs does not fix this problem.
>> >
>> > For instance, why does the sno_rx dictionary not contain this disease,
>> > which aliases to "bed"?
>> >
>> > ucsf_dict_v1 $ grep 3159311 *.script
>> > *INSERT INTO CUI_TERMS VALUES(3159311,0,1,'bed','bed')*
>> > INSERT INTO CUI_TERMS VALUES(3159311,5,8,'myopia , high , with nonprogressive cone dysfunction','nonprogressive')
>> > INSERT INTO CUI_TERMS VALUES(3159311,0,3,'bornholm eye disease','bornholm')
>> > INSERT INTO CUI_TERMS VALUES(3159311,5,6,'x-linked cone dysfunction syndrome with myopia','myopia')
>> > INSERT INTO TUI VALUES(3159311,47)
>> > *INSERT INTO PREFTERM VALUES(3159311,'BORNHOLM EYE DISEASE')*
>> > INSERT INTO SNOMEDCT_US VALUES(3159311,718718009)
>> >
>> >
>> > sno_rx_16ab $ grep 3159311 *.script
>> > nada
>> >
>> > Solutions good or evil?
>> >
>> >    - Strip the relevant lines out of the dict.script file?
>> >    - Blacklist the text?
>> >    - Add to my stopCUI list (a little feature I added)?
>> >    - Some other configuration I don't know about?
>> >    For instance, is there a CUI:ACRONYM table?
>> >    I'm tempted to create one.  This would require the matching term to
>> >    be present in upper case.
>> >
>> > Peter
>> >
>>
>
