Hi Peter,

To my knowledge, there isn't any drastic difference between the behavior of
the dictionary creator GUI and the way the sno_rx dictionary was created. I
originally thought there was, but I realized the difference was that I had
not installed all of UMLS to my machine (just the vocabularies I was
interested in) and I was missing synonyms. The first thing I would check is
whether you can find a matching entry in the .script file for your cTAKES
dictionary when you run this:

grep -i ,\'soft\', *.script

That would confirm whether or not you have a term in your dictionary made
up only of 'soft' and whether it is mapped to the CUI for "SHORT
STATURE, ONYCHODYSPLASIA, FACIAL DYSMORPHISM, AND HYPOTRICHOSIS SYNDROME".
If it's not in there, something else is going on. You could do the same for
'bed'.
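
If it helps, here is a rough sketch for checking both of the words you
reported ('soft' and 'bed') in one go. The count_exact_term function is just
a hypothetical helper name of mine, and it assumes the term text is stored
as a single-quoted, comma-delimited field in the .script file, the same
assumption the grep above makes:

```shell
# Hypothetical helper: count rows whose quoted term field is exactly the word.
# Usage: count_exact_term WORD FILE...
# Assumption: terms appear as ,'term', in the dictionary .script file.
count_exact_term() {
    term="$1"
    shift
    # cat the files together so grep -c reports a single total;
    # "|| true" keeps a zero-match result from looking like a failure.
    cat "$@" | grep -i -c -- ",'${term}'," || true
}

# Example call (run from the directory holding your dictionary):
#   for t in soft bed; do
#       echo "$t: $(count_exact_term "$t" *.script)"
#   done
```

A count of 0 for a word would mean the dictionary has no term made up of
that word alone, which would point away from the dictionary itself.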

If the entries aren't there, another thing I might check is the annotator:
I noticed in your prior e-mail that you are using
the OverlapJCasTermAnnotator. I don't have much
experience with it, and I don't think it should cause this behavior, but I
wonder if that could be making the difference (as compared
to DefaultJCasTermAnnotator).

Jeff

On Sat, Aug 1, 2020 at 5:27 PM Peter Abramowitsch <pabramowit...@gmail.com>
wrote:

>
> Hi All
>
> Having created a new dictionary from the 2020AA UMLS and added Genes and
> Receptors to the dictionary-creator's default selections, I have a curious
> problem where cTakes now assigns the most bizarre acronyms to ordinary
> words used in POS contexts where it shouldn't find <XXX>Mentions.
>
> Here are two examples:
>
> 1.   soft (in "soft tissue...")
> becomes   "SHORT STATURE, ONYCHODYSPLASIA, FACIAL DYSMORPHISM, AND
> HYPOTRICHOSIS SYNDROME",
>
> 2.   bed (in "The wound bed was...")
> becomes  "BORNHOLM EYE DISEASE"
>
> I have not changed the TermConsumer type in the descriptor XML.
>
> Are the DictionaryCreator's defaults equivalent to the default sno_rx
> that's delivered with the app?
>
> Attached is the vocab subsets list I used
>
>
> Peter
>
>
>
