Hi Sreejith,

Without seeing an example of text I can't say whether my next words will help you or not.

If you are using trunk then you should have access to two 'new' annotation engines in ctakes-core.

ListAnnotator - Annotates formatted List Sections by detecting them using Regular Expressions provided in an input File.
ListEntryNegator - Checks List Entries for negation, which may be exhibited differently from unstructured negation.

ListAnnotator can use any list of regular expressions in a file. The default file is in ctakes-core-res, called DefaultListRegex.bsv.

The format for each line in the regex list is
NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX
where
NAME - name of list type. Can be anything.
LIST_REGEX - a regular expression that matches a block of text containing a list in its entirety.
ENTRY_SEPARATOR_REGEX - a regular expression that matches a single list entry within the text of the entire list.

For instance, for the list

Smoker Status: N
Drinking Status: Y
Pregnant: N/A

a -simple- line in the regex file could be

Colonized List||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n){2,}||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n)

Notice that each item is separated by two bar characters "||".

The file of regular expressions can be changed using the LIST_TYPES_PATH parameter.

ListEntryNegator will iterate through each ListEntry in the CAS and use a regular expression to determine whether or not items in the list should be negated. Right now that regex is hard-coded in the class. There should probably be a mechanism to override it. ": N" is not in there. Also, only Disease/Disorder and Sign/Symptom mentions in the ListEntry are negated. You would need to add SmokingStatusAnnotation as a negatable.

I don't know if any of this is helpful, but I thought that I would throw it out there.

Sean

________________________________________
From: Sreejith Pk <sreji...@gmail.com>
Sent: Friday, July 24, 2020 4:09 AM
To: dev@ctakes.apache.org
Subject: Re: Clarification regarding NegationFSM [EXTERNAL]

Hi Peter,

Thanks a lot for the reply.
Let me elaborate more on the changes I have made so far. I have added KuRuleBasedClassifierAnnotator to the pipeline in order to fetch smoking-related keywords from the document. I have modified KuRuleBasedClassifierAnnotator so that it iterates through the identified tokens and checks whether each token matches any of the smoking-related words configured inside a keyword.txt file. The matching tokens are then set as SmokerNamedEntityAnnotation and thus can be read from the output XMI.

In my scenario, the sentence I am passing to cTAKES is "Smoking status: N". As "Smoking" is configured inside keywords.txt, it comes out as the output node in SmokerNamedEntityAnnotation. My parser logic reads only its polarity. Here the polarity of the SmokerNamedEntityAnnotation "Smoking" token is coming as 1 instead of the expected -1.
(NB: I have removed ":" from the boundary words set in NamedEntityContextAnalizer.java)

Thanks and Regards,
Sreejith

On Thu, Jul 23, 2020 at 11:20 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:

> Check and see if the identified annotation you get for "Smoking status: N"
> without your change is actually "Non Smoker" with polarity 1.
> Nonsmoker is a separate concept, from a Smoker with polarity -1. Instead
> of looking at range text, check the canonical text for the concept you
> have.
> Having said that, there are many issues with negation in all of the
> negation annotators. Some are too eager, others are too cautious.
>
> Peter
>
> On Thu, Jul 23, 2020 at 10:17 AM Sreejith Pk <sreji...@gmail.com> wrote:
>
> > Hi Team,
> >
> > We are using cTAKES 4.0.0 as the NLP engine in our application. I have
> > added ContextAnnotator to the pipeline to achieve correct Polarity to the
> > tokens.
> > After analysing the ContextAnnotator code, I understand that negation
> > determining condition is written in NegationFSM class.
> > In my requirement, I have a sentence "Smoking status: N" and I want to set
> > polarity -1 to the token "Smoking" because of the occurrence of "N". To
> > achieve the same, I have tried adding "N" to the existing HashSet
> > in NegationFSM constructor like iv_negVerbsSet.add("N"); But it seems,
> > polarity of the word token "Smoking" is still coming as 1.
> > With the same configuration set if I pass "Smoking status: denies", I am
> > getting the polarity of token "Smoking" as -1. Kindly help.
> >
> > Thanks & Regards
> > Sreejith
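
For anyone who wants to try the NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX line from Sean's reply outside cTAKES, here is a minimal sketch in Python. It only exercises the two regular expressions against the three-line sample list; it is not the ListAnnotator or ListEntryNegator code. Assumptions: the patterns are compiled in multiline mode so that "^" matches at each line start (the thread does not say which flags cTAKES uses), the helper names are hypothetical, and ENTRY_NEGATION is a made-up pattern for the ": N" case that Sean says is missing from the hard-coded regex.

```python
import re

# Line copied verbatim from the email: NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX
BSV_LINE = (
    r"Colonized List"
    r"||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n){2,}"
    r"||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n)"
)

# Hypothetical negation pattern: an entry whose value is exactly "N".
# This is NOT the regex hard-coded in ListEntryNegator.
ENTRY_NEGATION = re.compile(r":\s*N\s*$")


def parse_list_type(line):
    """Split one line of the regex file into its three '||'-separated parts."""
    name, list_regex, entry_regex = line.split("||")
    # Assumption: multiline mode, so '^' anchors at every line start.
    return (
        name,
        re.compile(list_regex, re.MULTILINE),
        re.compile(entry_regex, re.MULTILINE),
    )


def find_list_entries(text, list_pattern, entry_pattern):
    """Return the text of every entry inside every list block found in text."""
    entries = []
    for list_match in list_pattern.finditer(text):
        for entry_match in entry_pattern.finditer(list_match.group()):
            entries.append(entry_match.group().strip())
    return entries


def is_negated_entry(entry):
    """True when the entry matches the hypothetical ': N' negation pattern."""
    return ENTRY_NEGATION.search(entry) is not None


# Sample list from the email, surrounded by invented non-list lines.
NOTE = (
    "Vitals reviewed.\n"
    "Smoker Status: N\n"
    "Drinking Status: Y\n"
    "Pregnant: N/A\n"
    "Plan follows.\n"
)

if __name__ == "__main__":
    name, list_pattern, entry_pattern = parse_list_type(BSV_LINE)
    for entry in find_list_entries(NOTE, list_pattern, entry_pattern):
        print(name, "|", entry, "| negated:", is_negated_entry(entry))
```

Only lines containing a single "key: value" colon are picked up, so the surrounding sentences are skipped, and "Pregnant: N/A" is not treated as negated because the value is not exactly "N". Whether the same holds inside cTAKES depends on how ListAnnotator compiles the patterns and on the regex ListEntryNegator actually uses.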