Hi Sreejith,

Without seeing an example of text I can't say whether my next words will help 
you or not.

If you are using trunk then you should have access to two 'new' annotation 
engines in ctakes-core:
ListAnnotator     - Annotates formatted List Sections by detecting them using 
                    Regular Expressions provided in an input File.
ListEntryNegator  - Checks List Entries for negation, which may be exhibited 
                    differently from unstructured negation.

ListAnnotator can use any list of regular expressions in a file.  The default 
file is in ctakes-core-res, called DefaultListRegex.bsv
The format for each line in the regex list is 
NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX   where
NAME                   - name of list type.  Can be anything.
LIST_REGEX             - some regular expression for which a block of text 
                         will match a list in its entirety.
ENTRY_SEPARATOR_REGEX  - some regular expression for which text within the 
                         entire list will match a single list entry.
For instance, given the List 
Smoker Status: N
Drinking Status: Y
Pregnant: N/A
a -simple- line in the regex file could be
Colonized List||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n){2,}||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n)
Notice that each item is separated by two bar characters "||".
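
As a rough illustration of how the two expressions in that example line behave, 
here is a minimal standalone sketch in plain java.util.regex.  It does not use 
ListAnnotator itself.  The class and method names are made up for the example, 
and compiling with Pattern.MULTILINE (so that "^" matches at each line start) 
is an assumption about how the patterns need to be applied, not something 
confirmed from the ListAnnotator source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of cTAKES.
public class ListRegexSketch {
    // The two regexes from the example line, escaped for a Java string literal.
    static final String LIST_REGEX = "(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n){2,}";
    static final String ENTRY_REGEX = "(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n)";

    // Finds each block matching LIST_REGEX, then splits it into entries with ENTRY_REGEX.
    static List<String> findEntries(String text) {
        // Assumption: MULTILINE is required so that '^' matches after every newline.
        Pattern listPattern = Pattern.compile(LIST_REGEX, Pattern.MULTILINE);
        Pattern entryPattern = Pattern.compile(ENTRY_REGEX, Pattern.MULTILINE);
        List<String> entries = new ArrayList<>();
        Matcher listMatcher = listPattern.matcher(text);
        while (listMatcher.find()) {
            Matcher entryMatcher = entryPattern.matcher(listMatcher.group());
            while (entryMatcher.find()) {
                // Trim the trailing line break that the entry regex consumes.
                entries.add(entryMatcher.group().trim());
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Each entry must end with a line break for the regex to match it.
        String note = "Smoker Status: N\nDrinking Status: Y\nPregnant: N/A\n";
        findEntries(note).forEach(System.out::println);
    }
}
```

Note that every entry, including the last one, has to end with a line break for 
these particular expressions to match.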

The file of regular expressions can be changed using the LIST_TYPES_PATH 
parameter.

ListEntryNegator will iterate through each ListEntry in the CAS and use a 
regular expression to determine whether or not items in the list should be 
negated.
Right now that regex is hard-coded in the class.  There should probably be a 
mechanism to override it.  ": N" is not in there, so your example would not be 
negated as is.   Also, only Disease/Disorder and Sign/Symptom mentions in the 
ListEntry are negated.   You would need to add SmokingStatusAnnotation as a 
negatable.
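
To give an idea of what a pattern covering ": N" might look like, here is a 
small standalone sketch.  It is NOT the regex that is hard-coded in 
ListEntryNegator.  The pattern, the list of negative values, and the class and 
method names are all hypothetical and only meant as a starting point.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of cTAKES.
public class ListEntryNegationSketch {
    // Hypothetical pattern: a colon followed by a short negative value at the end of the entry.
    // The set of values (N, No, Neg, Negative, None) is an assumption chosen for illustration.
    static final Pattern NEGATIVE_VALUE =
            Pattern.compile(":\\s*(?:N|No|Neg|Negative|None)\\s*$", Pattern.CASE_INSENSITIVE);

    // Returns true when the entry text ends in one of the negative values above.
    static boolean isNegatedEntry(String entryText) {
        return NEGATIVE_VALUE.matcher(entryText).find();
    }

    public static void main(String[] args) {
        String[] entries = {"Smoker Status: N", "Drinking Status: Y", "Pregnant: N/A"};
        for (String entry : entries) {
            System.out.println(entry + " -> negated? " + isNegatedEntry(entry));
        }
    }
}
```

Anchoring the value to the end of the entry keeps "N/A" from being read as a 
negative "N".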

I don't know if any of this is helpful, but I thought that I would throw it out 
there.

Sean
________________________________________
From: Sreejith Pk <sreji...@gmail.com>
Sent: Friday, July 24, 2020 4:09 AM
To: dev@ctakes.apache.org
Subject: Re: Clarification regarding NegationFSM [EXTERNAL]


Hi Peter, Thanks a lot for the reply.

Let me elaborate more on the changes I have made so far. I have added
KuRuleBasedClassifierAnnotator to the pipeline in order to fetch smoking-related
keywords from the document. I have modified KuRuleBasedClassifierAnnotator so
that it iterates through the identified tokens and checks whether each token
matches any of the smoking-related words configured in the keywords.txt file.
The matching tokens are then set as SmokerNamedEntityAnnotation and can thus
be read from the output XMI.
In my scenario, the sentence I am passing to cTAKES is "Smoking
status: N". As "Smoking" is configured in keywords.txt, it comes out as
a SmokerNamedEntityAnnotation node in the output. My parser logic reads only
its polarity. The polarity of the SmokerNamedEntityAnnotation for the
"Smoking" token is coming out as 1 instead of the expected -1.
(NB: I have removed ":" from the boundary words set in
NamedEntityContextAnalyzer.java)

Thanks and Regards,
Sreejith


On Thu, Jul 23, 2020 at 11:20 PM Peter Abramowitsch <pabramowit...@gmail.com>
wrote:

> Check and see if the identified annotation you get for "Smoking status: N"
> without your change is actually "Non Smoker" with polarity 1.
> Nonsmoker is a separate concept, from a Smoker with polarity -1.  Instead
> of looking at range text, check the canonical text for the concept you
> have.
> Having said that, there are many issues with negation in all of the
> negation annotators.  Some are too eager, others are too cautious.
>
> Peter
>
> On Thu, Jul 23, 2020 at 10:17 AM Sreejith Pk <sreji...@gmail.com> wrote:
>
> > Hi Team,
> >
> > We are using cTAKES 4.0.0 as the NLP engine in our application. I have
> > added ContextAnnotator to the pipeline to get the correct polarity for the
> > tokens.
> > After analysing the ContextAnnotator code, I understand that the condition
> > that determines negation is written in the NegationFSM class.
> > In my requirement, I have a sentence "Smoking status: N"  and I want to set
> > polarity -1 to the token "Smoking" because of the occurrence of "N". To
> > achieve the same, I have tried adding "N" to the existing HashSet
> > in NegationFSM constructor like iv_negVerbsSet.add("N"); But it seems,
> > polarity of the word token "Smoking" is still  coming as 1.
> > With the same configuration set if I pass "Smoking status: denies", I am
> > getting the polarity of token "Smoking" as -1. Kindly help.
> >
> > Thanks & Regards
> > Sreejith
> >
>
