Thanks Sean. I didn't know about that annotator.

On Fri, Jul 24, 2020, 3:51 AM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
> Hi Sreejith,
>
> Without seeing an example of text I can't say whether my next words will
> help you or not.
>
> If you are using trunk then you should have access to two 'new' annotation
> engines in ctakes-core.
>   ListAnnotator - Annotates formatted List Sections by detecting them
>     using Regular Expressions provided in an input File.
>   ListEntryNegator - Checks List Entries for negation, which may be
>     exhibited differently from unstructured negation.
>
> ListAnnotator can use any list of regular expressions in a file. The
> default file is in ctakes-core-res, called DefaultListRegex.bsv
> The format for each line in the regex list is
> NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX where
>   NAME - name of list type. Can be anything.
>   LIST_REGEX - some regular expression for which a block of text will
>     match a list in its entirety.
>   ENTRY_SEPARATOR_REGEX - some regular expression for which text within
>     the entire list will match a single list entry.
> For instance, the List
>   Smoker Status: N
>   Drinking Status: Y
>   Pregnant: N/A
> A -simple- line in the regex file could be
>   Colonized List||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n){2,}||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n)
> Notice that each item is separated by two bar characters "||".
>
> The file of regular expressions can be changed using the LIST_TYPES_PATH
> parameter.
>
> ListEntryNegator will iterate through each ListEntry in the cas and use a
> regular expression to determine whether or not items in the list should be
> negated.
> Right now that regex is hard-coded in the class. There should probably be
> a mechanism to overwrite it. ": N" is not in there. Also, only
> Disease/Disorders and Sign/Symptom mentions in the ListEntry are negated.
> You would need to add SmokingStatusAnnotation as a negatable.
>
> I don't know if any of this is helpful, but I thought that I would throw
> it out there.
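
The NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX format described above can be tried outside cTAKES with plain java.util.regex. The following is only a sketch: the class and method names are made up for illustration, it does not call ListAnnotator, and it assumes the patterns are compiled with Pattern.MULTILINE so that "^" matches at the start of every line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of cTAKES.
class ListRegexSketch {
    // The example line from the message: NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX.
    static final String BSV_LINE = "Colonized List"
            + "||(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n){2,}"
            + "||(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n)";

    // Split on the two-bar separator into at most three fields.
    static String[] parseLine(String line) {
        return line.split("\\|\\|", 3);
    }

    // Collect every match of regex in text.
    // MULTILINE is an assumption made here so that '^' matches at each line start.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String note = "Smoker Status: N\nDrinking Status: Y\nPregnant: N/A\n";
        String[] fields = parseLine(BSV_LINE);
        // First find whole lists, then split each list into its entries.
        for (String list : findAll(fields[1], note)) {
            System.out.println("List type: " + fields[0]);
            for (String entry : findAll(fields[2], list)) {
                System.out.println("  entry: " + entry.trim());
            }
        }
    }
}
```

Both patterns require every entry to end in a newline, so a final line without one would be left out of the match.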
>
> Sean
> ________________________________________
> From: Sreejith Pk <sreji...@gmail.com>
> Sent: Friday, July 24, 2020 4:09 AM
> To: dev@ctakes.apache.org
> Subject: Re: Clarification regarding NegationFSM [EXTERNAL]
>
> Hi Peter, Thanks a lot for the reply.
>
> Let me elaborate more on the changes I have done so far. I have added
> KuRuleBasedClassifierAnnotator to the pipeline in order to fetch smoking
> related keywords from the document. I have modified
> KuRuleBasedClassifierAnnotator so that it iterates through the identified
> tokens and checks whether each token matches any smoking related word
> configured inside a keyword.txt file. The matching tokens are then set to
> SmokerNamedEntityAnnotation and thus can be read from the output XMI.
> Here in my scenario, the sentence I am passing to cTAKES is "Smoking
> status: N". As Smoking is configured inside keywords.txt, it comes out as
> the output node in SmokerNamedEntityAnnotation. My parser logic reads only
> its polarity. Here the polarity of the SmokerNamedEntityAnnotation for the
> "Smoking" token is coming as 1 instead of the expected -1.
> (NB: I have removed ":" from the NamedEntityContextAnalizer.java - boundary
> words set)
>
> Thanks and Regards,
> Sreejith
>
>
> On Thu, Jul 23, 2020 at 11:20 PM Peter Abramowitsch <pabramowit...@gmail.com>
> wrote:
>
> > Check and see if the identified annotation you get for "Smoking status: N"
> > without your change is actually "Non Smoker" with polarity 1.
> > Nonsmoker is a separate concept from a Smoker with polarity -1. Instead
> > of looking at range text, check the canonical text for the concept you
> > have.
> > Having said that, there are many issues with negation in all of the
> > negation annotators. Some are too eager, others are too cautious.
> >
> > Peter
> >
> > On Thu, Jul 23, 2020 at 10:17 AM Sreejith Pk <sreji...@gmail.com> wrote:
> >
> > > Hi Team,
> > >
> > > We are using cTAKES 4.0.0 as the NLP engine in our application. I have
> > > added ContextAnnotator to the pipeline to get the correct polarity for
> > > the tokens.
> > > After analysing the ContextAnnotator code, I understand that the
> > > negation-determining condition is written in the NegationFSM class.
> > > In my requirement, I have a sentence "Smoking status: N" and I want to
> > > set polarity -1 on the token "Smoking" because of the occurrence of "N".
> > > To achieve this, I have tried adding "N" to the existing HashSet
> > > in the NegationFSM constructor, like iv_negVerbsSet.add("N"); But it
> > > seems the polarity of the word token "Smoking" is still coming as 1.
> > > With the same configuration, if I pass "Smoking status: denies", I am
> > > getting the polarity of the token "Smoking" as -1. Kindly help.
> > >
> > > Thanks & Regards
> > > Sreejith
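
For the "Smoking status: N" case discussed in this thread, a list-entry check might look roughly like the sketch below. It is purely illustrative: the class name, the pattern, and the method are hypothetical, and the pattern is not the regular expression hard-coded in ListEntryNegator. The return values 1 and -1 follow the polarity values quoted in the thread.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not cTAKES code.
class ListEntryPolaritySketch {
    // Assumed pattern: the entry value after the colon is exactly "N" or "No".
    // "N/A" does not match because only whitespace may follow the value.
    static final Pattern NEGATED_ENTRY =
            Pattern.compile(":\\s*(?:No|N)\\s*$", Pattern.CASE_INSENSITIVE);

    // Return -1 for a negated entry, 1 otherwise.
    static int polarityFor(String entryText) {
        return NEGATED_ENTRY.matcher(entryText).find() ? -1 : 1;
    }

    public static void main(String[] args) {
        String[] entries = {"Smoking status: N", "Drinking Status: Y", "Pregnant: N/A"};
        for (String entry : entries) {
            System.out.println(entry + " -> " + polarityFor(entry));
        }
    }
}
```

This handles only the list-style ": N" form; free-text negation such as "Smoking status: denies" would still be left to the context and negation annotators.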