Sean,

When I use cTAKES, I'd like to be able to cite a version number for reproducibility. If I run the latest trunk (to get access to a new feature), there is no easy way to reference it. How is it decided when to make a new cTAKES release? Do you think there will be any future releases, or would it be better to start referring to cTAKES by svn commit rather than by version?

Also, unrelatedly, I am not sure when this happened, but the GitHub mirror for cTAKES (https://github.com/apache/ctakes) doesn't seem to be updating. It doesn't have dockhand (as an example).

Thanks,
Jeff

On Wed, Jul 29, 2020 at 1:31 PM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Hi Tomasz,
>
> As far as I know there aren't any upcoming releases planned.
>
> Sean
> ________________________________________
> From: Tomasz Oliwa <ol...@uchicago.edu>
> Sent: Wednesday, July 29, 2020 1:17 PM
> To: dev@ctakes.apache.org
> Subject: Re: Clarification regarding NegationFSM [EXTERNAL]
>
> * External Email - Caution *
>
> Sean,
>
> Since you mention a new release, is there any expected time for a new stable cTAKES release? An up-to-date stable release for the user installation would be appreciated I think.
>
> Regards,
> Tomasz
>
> ________________________________________
> From: Finan, Sean <sean.fi...@childrens.harvard.edu>
> Sent: Friday, July 24, 2020 10:45 AM
> To: dev@ctakes.apache.org
> Subject: Re: Clarification regarding NegationFSM [EXTERNAL]
>
> I don't think that anybody does. It is not in the release, not documented, not necessarily ready for widespread use, etc. Everything associated with types List and ListEntry is new.
>
> Hopefully when ctakes 4.0.1 (should be 5.0 at this point) is released these types will be much more usable.
>
> Sean
> ________________________________________
> From: Peter Abramowitsch <pabramowit...@gmail.com>
> Sent: Friday, July 24, 2020 10:50 AM
> To: dev@ctakes.apache.org
> Subject: Re: Clarification regarding NegationFSM [EXTERNAL]
>
> Thanks Sean. I didn't know about that annotator.
>
> On Fri, Jul 24, 2020, 3:51 AM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Sreejith,
> >
> > Without seeing an example of text I can't say whether my next words will help you or not.
> >
> > If you are using trunk then you should have access to two 'new' annotation engines in ctakes-core.
> >   ListAnnotator - Annotates formatted List Sections by detecting them using Regular Expressions provided in an input File.
> >   ListEntryNegator - Checks List Entries for negation, which may be exhibited differently from unstructured negation.
> >
> > ListAnnotator can use any list of regular expressions in a file. The default file is in ctakes-core-res, called DefaultListRegex.bsv
> > The format for each line in the regex list is
> >   NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX
> > where
> >   NAME - name of list type. Can be anything.
> >   LIST_REGEX - some regular expression for which a block of text will match a list in its entirety.
> >   ENTRY_SEPARATOR_REGEX - some regular expression for which text within the entire list will match a single list entry.
> > For instance, the List
> >   Smoker Status: N
> >   Drinking Status: Y
> >   Pregnant: N/A
> > A -simple- line in the regex file could be
> >   Colonized List||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n){2,}||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n)
> > Notice that each item is separated by two bar characters "||".
> >
> > The file of regular expressions can be changed using the LIST_TYPES_PATH parameter.
> >
> > ListEntryNegator will iterate through each ListEntry in the cas and use a regular expression to determine whether or not items in the list should be negated.
> > Right now that regex is hard-coded in the class. There should probably be a mechanism to overwrite it. ": N" is not in there. Also, only Disease/Disorders and Sign/Symptom mentions in the ListEntry are negated. You would need to add SmokingStatusAnnotation as a negatable.
> >
> > I don't know if any of this is helpful, but I thought that I would throw it out there.
> >
> > Sean
> > ________________________________________
> > From: Sreejith Pk <sreji...@gmail.com>
> > Sent: Friday, July 24, 2020 4:09 AM
> > To: dev@ctakes.apache.org
> > Subject: Re: Clarification regarding NegationFSM [EXTERNAL]
> >
> > Hi Peter, Thanks a lot for the reply.
> >
> > Let me elaborate more on the changes I have done so far. I have added KuRuleBasedClassifierAnnotator to the pipeline in order to fetch smoking-related keywords from the document. I have modified KuRuleBasedClassifierAnnotator so that it iterates through the identified tokens and checks whether each token matches any of the smoking-related words configured in a keywords.txt file. The matching tokens are then set as SmokerNamedEntityAnnotation and can thus be read from the output XMI.
> > In my scenario, the sentence I am passing to cTAKES is "Smoking status: N". As "Smoking" is configured in keywords.txt, it comes out as the output node in SmokerNamedEntityAnnotation. My parser logic reads only its polarity. The polarity of the SmokerNamedEntityAnnotation "Smoking" token is coming out as 1 instead of the expected -1.
> > (NB: I have removed ":" from the boundary words set in NamedEntityContextAnalizer.java)
> >
> > Thanks and Regards,
> > Sreejith
> >
> >
> > On Thu, Jul 23, 2020 at 11:20 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:
> >
> > > Check and see if the identified annotation you get for "Smoking status: N" without your change is actually "Non Smoker" with polarity 1.
> > > Nonsmoker is a separate concept from a Smoker with polarity -1. Instead of looking at range text, check the canonical text for the concept you have.
> > > Having said that, there are many issues with negation in all of the negation annotators. Some are too eager, others are too cautious.
> > >
> > > Peter
> > >
> > > On Thu, Jul 23, 2020 at 10:17 AM Sreejith Pk <sreji...@gmail.com> wrote:
> > >
> > > > Hi Team,
> > > >
> > > > We are using cTAKES 4.0.0 as the NLP engine in our application. I have added ContextAnnotator to the pipeline to assign the correct polarity to the tokens.
> > > > After analysing the ContextAnnotator code, I understand that the condition that determines negation is written in the NegationFSM class.
> > > > In my requirement, I have a sentence "Smoking status: N" and I want to set polarity -1 on the token "Smoking" because of the occurrence of "N". To achieve this, I tried adding "N" to the existing HashSet in the NegationFSM constructor, like iv_negVerbsSet.add("N"); but the polarity of the word token "Smoking" is still coming out as 1.
> > > > With the same configuration, if I pass "Smoking status: denies", I get the polarity of the token "Smoking" as -1. Kindly help.
> > > >
> > > > Thanks & Regards
> > > > Sreejith
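
For anyone who wants to try the NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX line quoted earlier in this thread without running a full pipeline, here is a minimal standalone sketch using only java.util.regex. It is not the ListAnnotator or ListEntryNegator code. The class and method names are made up for illustration, compiling the patterns with Pattern.MULTILINE is an assumption (the "^" in the example regex needs it to match at each line start), and the ": N" negation pattern is hypothetical, since Sean notes that the regex hard-coded in ListEntryNegator does not cover that case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a standalone check of the regex line from the thread.
// None of these names exist in cTAKES.
final class ListRegexSketch {

    // One line in NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX format, as quoted above.
    static final String BSV_LINE =
        "Colonized List||(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n){2,}"
        + "||(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n)";

    // The example list from the thread; every entry ends with a newline.
    static final String NOTE =
        "Smoker Status: N\nDrinking Status: Y\nPregnant: N/A\n";

    // Hypothetical negation pattern: an entry whose value after the colon is just "N".
    static final Pattern ENTRY_NEGATION = Pattern.compile(":\\s*N\\s*$");

    // Finds every list in the text, then every entry inside each list.
    static List<String> entries(String bsvLine, String text) {
        String[] parts = bsvLine.split("\\|\\|");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX");
        }
        // Assumption: MULTILINE, so that ^ matches at the start of each line.
        Pattern listPattern = Pattern.compile(parts[1], Pattern.MULTILINE);
        Pattern entryPattern = Pattern.compile(parts[2], Pattern.MULTILINE);

        List<String> found = new ArrayList<>();
        Matcher listMatcher = listPattern.matcher(text);
        while (listMatcher.find()) {
            Matcher entryMatcher = entryPattern.matcher(listMatcher.group());
            while (entryMatcher.find()) {
                found.add(entryMatcher.group().trim());
            }
        }
        return found;
    }

    static boolean isNegated(String entry) {
        return ENTRY_NEGATION.matcher(entry).find();
    }

    public static void main(String[] args) {
        for (String entry : entries(BSV_LINE, NOTE)) {
            System.out.println(entry + " -> negated=" + isNegated(entry));
        }
    }
}
```

Under these assumptions the three lines of the example come back as three separate entries, and only "Smoker Status: N" is flagged by the hypothetical pattern; "Pregnant: N/A" is not, because the value continues past the "N".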