Two Questions about OverlapJcasTermAnnotator
Hi Sean (or whoever has some historical knowledge),

I'm trying to improve the term annotators for speed and have noticed that the overlap term annotator does not seem to pass even the most rudimentary use cases suggested in the code comments:

// things like "blood, urine, sputum cultures" should pick up "blood culture" and "urine culture"

I'm happy to fix this, but my question is whether anyone can attest to whether it ever has worked, or what use cases you have to indicate that it does today.

The other question is about the conventions in the term dictionary. When a PREFTERM has symbols embedded in its text, like so:

*'electrocardiogram ; 24 hour'*
or so
*'us . doppler . cw'*
or so
*'angioscopies , microscopic'*

do the symbols have any implied meaning or behavior somewhere in the pipeline, or are they literally part of the text (which is usually an impossibility in real notes)?
Re: Two Questions about OverlapJcasTermAnnotator [EXTERNAL]
Hi Peter,

the "blood, urine"... in the example did work when I originally tested, but the default settings (window size, etc.) may have been changed since then.

Everything in preftext is simple string literal. It is likely that certain things will not appear in raw text. The UMLS has some interesting synonym sources.

Sean

From: Peter Abramowitsch
Sent: Tuesday, August 23, 2022 6:00 PM
To: dev@ctakes.apache.org
Subject: Two Questions about OverlapJcasTermAnnotator [EXTERNAL]
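For readers following the thread, here is a minimal Python sketch of the kind of matching being discussed. It is not the cTAKES implementation: `normalize`, `find_overlap_spans`, and the `max_skips` parameter are hypothetical names, with `max_skips` standing in loosely for a "window size" style setting. It assumes dictionary terms are whitespace-tokenized string literals, so embedded symbols such as `;` are ordinary tokens that must appear in the text.

```python
def normalize(token):
    """Toy normalization (an assumption, not cTAKES behavior):
    lowercase and strip a trailing 's' from tokens longer than 3 chars."""
    token = token.lower()
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def find_overlap_spans(term_tokens, text_tokens, max_skips):
    """Return (start, end) token-index spans where every term token occurs
    in order in the text, with at most max_skips intervening tokens."""
    term = [normalize(t) for t in term_tokens]
    text = [normalize(t) for t in text_tokens]
    spans = []
    if not term:
        return spans
    for start, token in enumerate(text):
        if token != term[0]:
            continue
        next_term = 1   # index of the next term token still to be found
        skips = 0       # text tokens skipped so far inside the candidate span
        end = start
        pos = start + 1
        while next_term < len(term) and pos < len(text) and skips <= max_skips:
            if text[pos] == term[next_term]:
                end = pos
                next_term += 1
            else:
                skips += 1
            pos += 1
        if next_term == len(term) and skips <= max_skips:
            spans.append((start, end))
    return spans


text = ["blood", ",", "urine", ",", "sputum", "cultures"]
wide = find_overlap_spans(["blood", "culture"], text, max_skips=4)
narrow = find_overlap_spans(["blood", "culture"], text, max_skips=2)
urine = find_overlap_spans(["urine", "culture"], text, max_skips=2)

# A term whose literal text contains ';' cannot match text lacking that token.
ecg_term = "electrocardiogram ; 24 hour".split()
ecg = find_overlap_spans(ecg_term, ["electrocardiogram", "24", "hour"], max_skips=4)
```

In this sketch "blood culture" is only found when the skip allowance is large enough to span the four intervening tokens, which is one way a changed default could make the code-comment example stop matching. The last case shows why a literal `;` in a term would rarely match real note text.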
Re: Two Questions about OverlapJcasTermAnnotator [EXTERNAL]
Thanks Sean. Glad to know there wasn't any special behavior with prefterms that I hadn't known about all these years.

Peter

On Tue, Aug 23, 2022 at 4:31 PM Finan, Sean wrote:

> Hi Peter,
>
> the "blood, urine"... in the example did work when I originally tested,
> but the default settings (window size, etc.) may have been changed since
> then.
>
> Everything in preftext is simple string literal. It is likely that
> certain things will not appear in raw text. The UMLS has some interesting
> synonym sources.
>
> Sean