Hi Everybody, I wanted to clarify something in connection with this thread of emails:
The relation extractor is a machine learning system. Thus, it is going to be pretty hard to reverse engineer it to determine why a specific mistake has been made... However, if there is a certain pattern, say, of location_of that it cannot seem to handle, it should be fixable by adding more training examples that annotate this pattern.

Dima

On Mar 24, 2014, at 15:21, Masanz, James J. <masanz.ja...@mayo.edu> wrote:

> For the text "aneurysm in the middle cerebral artery"
> 3.1 creates 3 location of relations
> aneurysm : middle cerebral artery
> aneurysm : cerebral artery
> aneurysm : artery
>
> trunk also creates 3 location of relations
> aneurysm : middle cerebral artery
> aneurysm : cerebral artery
> aneurysm : artery
>
> bodyLocation is not set for the IdentifiedAnnotation for "aneurysm" in either
> case.
>
> I've created CTAKES-290
>
> That jira issue will not address the fact the relation extractor does not
> create annotations for location of relations in "Rash on arm and leg".
> I'll leave that determination to someone more familiar with the relation
> extractor
>
> -- James
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Monday, March 24, 2014 12:52 PM
> To: dev@ctakes.apache.org
> Subject: RE: getSeverity etc. for relation extractor
>
> Hi James, I don't have an exact phrase to use. We used the location_of with
> a brain aneurysm project, but the corpus is elsewhere now. However, it would
> tag things such as [aneurysm] : [middle cerebral artery] and [aneurysm] :
> [cerebral artery] - which is different from arm/leg, but an example of 2
> locations for one entity.
> ________________________________________
> From: Masanz, James J. [masanz.ja...@mayo.edu]
> Sent: Monday, March 24, 2014 11:05 AM
> To: 'dev@ctakes.apache.org'
> Subject: RE: getSeverity etc. for relation extractor
>
> I ran 3.1 against "pain in arm and leg" and I get just one location_of
> relation.
> And again no location_of relations for "rash on arm and leg"
>
> Sean, what was the exact phrase you used with the incubator version? (or was
> that a while ago and lost)
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Friday, March 21, 2014 3:59 PM
> To: dev@ctakes.apache.org
> Subject: RE: getSeverity etc. for relation extractor
>
> Hi James,
>
> It is starting to resemble a row of falling dominoes ...
>
> I ran with an incubator version of the "location of" extractor and it did
> seem to find multiple locations for a single d/d. Functionality may have
> changed since then.
>
> Thanks for all of your attention to this topic.
>
> Sean
>
> -----Original Message-----
> From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
> Sent: Friday, March 21, 2014 4:34 PM
> To: 'dev@ctakes.apache.org'
> Subject: RE: getSeverity etc. for relation extractor
>
> Running from trunk, I don't get any relations for "Rash on arm and leg" :(
>
> If I change the text to "pain in arm and leg" I get one
> LocationOfTextRelation annotation with arg1=SignSymptomMention (pain) and
> arg2=AnatomicalSiteMention (arm)
>
> Does the relation extractor support creating a 2nd relation involving pain -
> the one between pain and leg (is this just an unfortunate choice of example)
> or does the relation extractor need enhancement before it would create
> multiple location_of for a single SignSymptomMention or DiseaseDisorderMention?
>
> BTW, I will have to debug the setting of bodyLocation in the code because
> even for "pain in arm", when running from trunk, the LocationOfTextRelation
> annotation is being created, but the bodyLocation within the
> SignSymptomMention is not being set because the code in
> TemplateFillerAnnotator expects arg1 and arg2 to be swapped from what they
> currently are. I'll take a look at what it was in cTAKES 3.1 and find out if
> this is a bug in TemplateFillerAnnotator or something else.
>
> -- James
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Friday, March 21, 2014 12:30 PM
> To: dev@ctakes.apache.org
> Subject: RE: getSeverity etc. for relation extractor
>
>> until we have a definite, well-defined need (from a user).
>
> "Rash on arm and leg"
>
>> I don't follow what you mean by your item B) below
>
> [Rash].getLocationRelation()
> [Rash : Arm]
> [Rash].getLocation()
> [Arm]
>
> -----Original Message-----
> From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
> Sent: Friday, March 21, 2014 12:58 PM
> To: 'dev@ctakes.apache.org'
> Subject: RE: getSeverity etc. for relation extractor
>
> Yes, if there is more than one severity or location relation for a given
> identified annotation, currently the template filler does just take the last
> severity and/or last location.
>
> I suggest not changing the type system to allow a list (FSArray), or at least
> holding off until we have a definite, well-defined need (from a user).
>
> I think instead, ideally, we would make the template filler smarter at
> picking which severity / which location when there is more than one for the
> given identified annotation. Therefore I'd rather not make it a list now,
> when in the long run I think it should be a single value. And in the meantime
> if someone has a need, they can look through the relations.
>
> Pei, I don't follow what you mean by your item B) below
>
> -- James
>
> -----Original Message-----
> From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
> Sent: Thursday, March 20, 2014 2:03 PM
> To: dev@ctakes.apache.org
> Subject: RE: getSeverity etc. for relation extractor
>
> Awesome!
> Thanks James...
>
> On Sean's point about many-to-one relationships. I think the current type
> system only supports 1 degree_of and severity_of for each
> IdentifiedAnnotation?
> Does the TemplateFiller component currently just take the last one in the
> list?
> Should we modify the type system to support this in the future, something
> like the below?
> A) Support many-to-one
> B) Separate out getting the relations and getting the actual identified
> annotations.
>
> One suggestion would be:
> IdentifiedAnnotation.getBodyLocations(): FSArray<IdentifiedAnnotation>
> IdentifiedAnnotation.getBodyLocationRelations(): FSArray<LocationOfTextRelation>
> IdentifiedAnnotation.getSeverity(): FSArray<Modifier>
> IdentifiedAnnotation.getSeverityRelations(): FSArray<DegreeOfTextRelation>
>
> What do others think?
> --Pei
>
>> -----Original Message-----
>> From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
>> Sent: Thursday, March 20, 2014 2:50 PM
>> To: 'dev@ctakes.apache.org'
>> Subject: RE: getSeverity etc. for relation extractor
>>
>> I saw the jira was assigned to me and had a few minutes so I
>> implemented a fix and committed.
>> It was more than just the one line.
>> The name of the index in which the binary text relations are stored has
>> changed (now separate indexes instead of one for all binary text
>> relations) so I had to change which index was searched.
>>
>> -----Original Message-----
>> From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
>> Sent: Thursday, March 20, 2014 9:28 AM
>> To: dev@ctakes.apache.org
>> Subject: RE: getSeverity etc. for relation extractor
>>
>> Thanks for confirming, James. It seems like a bug...
>> Chase,
>> if you can confirm that adding ddm.setSeverity(degreeOfTextRelation); works
>> for you, I can commit the changes in trunk.
>>
>> Which also brings up some interesting points:
>> 1) Should we populate IdentifiedAnnotation.severity() and
>> bodylocationof() directly in RelationExtractorAnnotator instead of the
>> template filler?
>> It would seem more intuitive and faster than iterating through the
>> relations afterwards again.
>> 2) Chase brought up a good point, should we add some of the commonly
>> used components to the default pipeline? (DrugNER, RelationExtractor,
>> TemplateFiller)?
>> Seems easier to get onboard I think.
>>
>> --Pei
>>
>>> -----Original Message-----
>>> From: Chen, Pei
>>> Sent: Wednesday, March 19, 2014 5:58 PM
>>> To: dev@ctakes.apache.org
>>> Subject: RE: getSeverity etc. for relation extractor
>>>
>>> Chase,
>>> I am not sure why or the reasoning behind this, but it might explain
>>> why Severity is null for your DiseaseDisorderMention example:
>>> Line 319 in TemplateFillerAnnotator.java:
>>>
>>> If I'm reading this logic correctly, it will only populate severity for
>>> SignSymptomMention.... Can't think of why not to populate it if it exists
>>> in the BinaryTextRelations-
>>> have you tried adding: ddm.setSeverity(degreeOfTextRelation);
>>> instead of logging the error ???
>>>
>>> if (eventMention instanceof DiseaseDisorderMention) {
>>>     DiseaseDisorderMention ddm = (DiseaseDisorderMention) eventMention;
>>>     logger.error("Need to implement attr for " + relation + " for DiseaseDisorderMention");
>>> } else if (eventMention instanceof SignSymptomMention) {
>>>     SignSymptomMention ssm = (SignSymptomMention) eventMention;
>>>     ssm.setSeverity(degreeOfTextRelation);
>>>
>>> Would you mind opening a Jira and attaching a patch/test if it works for you?
>>> -Pei
>>>
>>>> -----Original Message-----
>>>> From: Chase Master [mailto:chasemast...@gmail.com]
>>>> Sent: Wednesday, March 19, 2014 4:09 PM
>>>> To: dev@ctakes.apache.org
>>>> Subject: Re: getSeverity etc. for relation extractor
>>>>
>>>> Thanks,
>>>> I tried using the AggregateTemplateFiller.xml from the
>>>> template-filler module, and I specified the relation extractor
>>>> pipeline that I was using before from the relation-extractor
>>>> project (there is also a different one in the template-filler
>>>> project called "RelationExtractorAggregateWithoutOrangeBook").
>>>> However, I don't see a difference, the severity is still null.
>>>>
>>>> Just wondering - is there some reason that the TemplateFiller is
>>>> not included by default?
>>>> It seems confusing that there are
>>>> getters for properties that aren't set in general ...even when one
>>>> runs the default clinical pipeline instead of the
>>>> RelationExtractorAggregate, these getters are there, but there are no
>>>> relations.
>>>>
>>>> Thanks
>>>> Chase
>>>>
>>>> On Wed, Mar 19, 2014 at 1:04 PM, Chen, Pei
>>>> <pei.c...@childrens.harvard.edu> wrote:
>>>>
>>>>> If I remember correctly, I think those attributes were set in
>>>>> IdentifiedAnnotation via:
>>>>> ctakes-template-filler/desc/analysis_engine/TemplateFillerAnnotator.xml
>>>>> One can look at the logic in:
>>>>> org.apache.ctakes.template.filler.ae.TemplateFillerAnnotator [1]
>>>>>
>>>>> Have you tried adding that to the pipeline?
>>>>>
>>>>> [1]
>>>>> http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-template-filler/src/main/java/org/apache/ctakes/template/filler/ae/TemplateFillerAnnotator.java
>>>>>
>>>>> --Pei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Chase Master [mailto:chasemast...@gmail.com]
>>>>>> Sent: Wednesday, March 19, 2014 1:56 PM
>>>>>> To: dev@ctakes.apache.org
>>>>>> Subject: getSeverity etc. for relation extractor
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to output the relations associated with
>>>>>> DiseaseDisorderMentions and other types. But I want to start by
>>>>>> iterating over DiseaseDisorderMention, not BinaryTextRelations, since
>>>>>> I want to be sure to find them all, even if they have no associated
>>>>>> relation.
>>>>>>
>>>>>> I always get null when using any of the getters like
>>>>>> "getSeverity()". I am using the example text "He had a slight
>>>>>> fracture in the proximal right fibula".
>>>>>> When I iterate over BinaryTextRelations, I see the following
>>>>>> valid values:
>>>>>> BinaryTextRelation slightFracture = iterator.next();
>>>>>> slightFracture.getArg1().getArgument().getCoveredText() is "fracture"
>>>>>> slightFracture.getArg2().getArgument().getCoveredText() is "slight".
>>>>>> However, for the "fracture" DiseaseDisorderMention,
>>>>>> getSeverity() is null.
>>>>>> If it wasn't, I would then grab
>>>>>> disease.getSeverity().getArg1().getArgument().getCoveredText(),
>>>>>> or for Arg2.
>>>>>>
>>>>>> Thanks,
>>>>>> Chase
>>>>>
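
The thread above contrasts two behaviours: the template filler keeping only the last location per annotation, and a caller looking through the relations to collect all of them. Below is a minimal sketch of that difference in plain Java. It does not use cTAKES or UIMA classes: `Relation`, `allLocations` and `lastLocation` are hypothetical names made up for illustration, and entities are keyed by covered text only to keep the example short (real code would presumably key on the annotation object itself).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocationRelationSketch {

    /** Hypothetical stand-in for one location_of relation; NOT a cTAKES type. */
    static final class Relation {
        final String entity;   // covered text of the sign/symptom or disease/disorder
        final String location; // covered text of the anatomical site

        Relation(String entity, String location) {
            this.entity = entity;
            this.location = location;
        }
    }

    /** Many-to-one view: every location found for each entity, in relation order. */
    static Map<String, List<String>> allLocations(List<Relation> relations) {
        Map<String, List<String>> byEntity = new LinkedHashMap<>();
        for (Relation r : relations) {
            byEntity.computeIfAbsent(r.entity, k -> new ArrayList<>()).add(r.location);
        }
        return byEntity;
    }

    /** Single-value view: a later relation overwrites an earlier one ("last one wins"). */
    static Map<String, String> lastLocation(List<Relation> relations) {
        Map<String, String> byEntity = new LinkedHashMap<>();
        for (Relation r : relations) {
            byEntity.put(r.entity, r.location);
        }
        return byEntity;
    }

    public static void main(String[] args) {
        // The three relations reported in the thread for
        // "aneurysm in the middle cerebral artery".
        List<Relation> relations = new ArrayList<>();
        relations.add(new Relation("aneurysm", "middle cerebral artery"));
        relations.add(new Relation("aneurysm", "cerebral artery"));
        relations.add(new Relation("aneurysm", "artery"));

        System.out.println(allLocations(relations));
        System.out.println(lastLocation(relations));
    }
}
```

With the aneurysm example, the single-value view ends up holding whichever relation happened to come last, which is the case a smarter template filler would need to handle.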