Thanks Sean. Glad to know there wasn't any special behavior with prefterms that I hadn't known about all these years.
Peter

On Tue, Aug 23, 2022 at 4:31 PM Finan, Sean <sean.fi...@childrens.harvard.edu.invalid> wrote:

> Hi Peter,
>
> the "blood, urine"... in the example did work when I originally tested,
> but the default settings (window size, etc.) may have been changed since
> then.
>
> Everything in preftext is simple string literal. It is likely that
> certain things will not appear in raw text. The UMLS has some interesting
> synonym sources.
>
> Sean
>
> ________________________________
> From: Peter Abramowitsch <pabramowit...@gmail.com>
> Sent: Tuesday, August 23, 2022 6:00 PM
> To: dev@ctakes.apache.org <dev@ctakes.apache.org>
> Subject: Two Questions about OverlapJcasTermAnnotator [EXTERNAL]
>
> * External Email - Caution *
>
> Hi Sean (or whoever has some historical knowledge)
>
> I'm trying to improve the term annotators for speed and have noticed that
> the overlap term annotator does not seem to pass even the most rudimentary
> use cases suggested in the code comments:
>
> // things like "blood, urine, sputum cultures" should pick up "blood
> culture" and "urine culture"
>
> I'm happy to fix this, but my question is whether anyone can attest to
> whether it ever has worked, or what use cases you have to indicate that it
> does today.
>
> The other question is about the conventions in the term dictionary. When a
> PREFTERM has symbols embedded in its text - like so:
>
> *'electrocardiogram ; 24 hour'*
> or so
> *'us . doppler . cw'*
> or so
> *'angioscopies , microscopic'*
>
> Do the symbols have any implied meaning or behavior somewhere in the
> pipeline, or are they literally part of the text? (which is usually an
> impossibility in real notes)
>
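
For anyone following the thread, here is a minimal sketch of the behavior that code comment describes: a dictionary term matches if its tokens appear in order, with gaps allowed, inside a fixed token window. This is not the cTAKES implementation. The class name OverlapSketch, the methods tokenize and matchesInWindow, the window values, and the crude plural stripping are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the cTAKES code base.
class OverlapSketch {

    // Lowercase, split on anything that is not a letter or digit, and strip a
    // trailing "s" from longer tokens (assumption: a stand-in for real lemmatization).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase().split("[^a-z0-9]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            boolean plural = raw.length() > 3 && raw.endsWith("s");
            tokens.add(plural ? raw.substring(0, raw.length() - 1) : raw);
        }
        return tokens;
    }

    // True if the term tokens occur in order (gaps allowed) within `window`
    // consecutive text tokens, starting at an occurrence of the first term token.
    static boolean matchesInWindow(List<String> text, List<String> term, int window) {
        if (term.isEmpty()) {
            return false;
        }
        for (int start = 0; start < text.size(); start++) {
            if (!text.get(start).equals(term.get(0))) {
                continue;
            }
            int next = 1; // index of the next term token still to be found
            int end = Math.min(text.size(), start + window);
            for (int i = start + 1; i < end && next < term.size(); i++) {
                if (text.get(i).equals(term.get(next))) {
                    next++;
                }
            }
            if (next == term.size()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> text = tokenize("blood, urine, sputum cultures");
        String[] terms = {"blood culture", "urine culture", "culture blood"};
        for (String term : terms) {
            System.out.println(term + " -> " + matchesInWindow(text, tokenize(term), 4));
        }
    }
}
```

With a window of 4 tokens, "blood culture" and "urine culture" both match the example text and the reversed "culture blood" does not. Shrinking the window to 3 makes "blood culture" stop matching, which is the kind of sensitivity to default settings Sean mentions.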