What would be the benefit of storing a single string with a separator character (which may mean that any separator character occurring inside the elements has to be escaped), over using a feature of type FSArray<Token> that points to the original segments?
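A minimal sketch of the escaping concern, in plain Java rather than the UIMA/cTAKES API. The class name DelimitedSegments, the backslash escape character, and both helper methods are assumptions made for illustration; only the vertical-bar delimiter comes from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of cTAKES: shows what a delimited-string
// representation needs in order to survive a delimiter inside a segment.
class DelimitedSegments {
    static final char DELIM = '|';  // delimiter proposed in the thread
    static final char ESC = '\\';   // assumed escape character

    // Join segments into one string, escaping DELIM and ESC inside each segment.
    static String join(List<String> segments) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < segments.size(); i++) {
            if (i > 0) {
                sb.append(DELIM);
            }
            for (char c : segments.get(i).toCharArray()) {
                if (c == DELIM || c == ESC) {
                    sb.append(ESC);
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Split a joined string back into segments, honoring the escapes.
    static List<String> split(String joined) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean escaped = false;
        for (char c : joined.toCharArray()) {
            if (escaped) {
                cur.append(c);          // escaped character is taken literally
                escaped = false;
            } else if (c == ESC) {
                escaped = true;         // next character is literal
            } else if (c == DELIM) {
                out.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());
        return out;
    }

    public static void main(String[] args) {
        List<String> plain = Arrays.asList("Acute", "Disease");
        List<String> tricky = Arrays.asList("a|b", "c");
        System.out.println(join(plain));                        // Acute|Disease
        System.out.println(join(tricky));                       // a\|b|c
        System.out.println(split(join(tricky)).equals(tricky)); // true
    }
}
```

Every consumer of the joined string has to know both the delimiter and the escape rule, and the character offsets of the segments are not recoverable from the string alone. A feature holding references to the original segments needs no join or split step, so nothing has to be escaped.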
Not sure if that is what Karthik meant when referring to fetching the matched atom.

-- Richard

On 02.10.2013, at 01:46, Karthik Sarma <ksa...@ksarma.com> wrote:

> Hmm, couldn't you just fetch the matched atom and use that? Should be the
> same information (without, I suppose, the original ordering and split).
>
> --
> Karthik Sarma
> UCLA Medical Scientist Training Program Class of 20??
> Member, UCLA Medical Imaging & Informatics Lab
> Member, CA Delegation to the House of Delegates of the American Medical
> Association
> ksa...@ksarma.com
> gchat: ksa...@gmail.com
> linkedin: www.linkedin.com/in/ksarma
>
>
> On Tue, Oct 1, 2013 at 12:37 PM, Masanz, James J.
> <masanz.ja...@mayo.edu> wrote:
>
>> Yes, this would help address that multiple permutations example. The new
>> getOriginalText method would return something like "Acute|Disease". Right
>> now I'm thinking of just using vertical bar as delimiter, to start with at
>> least, but think it should be configurable.
>>
>> -----Original Message-----
>> From: dev-return-2067-Masanz.James=mayo....@ctakes.apache.org [mailto:dev-return-2067-Masanz.James=mayo....@ctakes.apache.org] On Behalf Of Chen, Pei
>> Sent: Tuesday, October 01, 2013 9:38 AM
>> To: dev@ctakes.apache.org
>> Subject: CTAKES-248- include original covered text of NEs which can't be recovered post if NE is from a disjoint span
>>
>> This sounds pretty cool.
>> James, will this address the multiple permutations lookup example:
>> "Acute alcoholic liver disease." There is a cui: C0001314: Acute Disease,
>> but if you getCoveredText(), on the UMLSConcept, you would actually get the
>> same "Acute alcoholic liver disease" instead of "Acute Disease".
>> So, there is a new field called getOriginalText() that matched the hit?
>>
>>> -----Original Message-----
>>> From: james-mas...@apache.org [mailto:james-mas...@apache.org]
>>> Sent: Monday, September 30, 2013 5:49 PM
>>> To: comm...@ctakes.apache.org
>>> Subject: svn commit: r1527792 - /ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>>>
>>> Author: james-masanz
>>> Date: Mon Sep 30 21:48:01 2013
>>> New Revision: 1527792
>>>
>>> URL: http://svn.apache.org/r1527792
>>> Log:
>>> CTAKES-248 - for named entities, since the annotation just has the begin and
>>> end offset, it is requested to have a way to get the original covered text
>>> (especially for disjoint spans) so it is possible to know which words in the
>>> covered text were actually used in the matching to the dictionary entry
>>>
>>> Modified:
>>> ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>>>
>>> Modified: ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>>> URL: http://svn.apache.org/viewvc/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml?rev=1527792&r1=1527791&r2=1527792&view=diff
>>> ==============================================================================
>>> Binary files - no diff available.