> We also had issues in cTAKES 2.5. > Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his > changes.
Thanks for verifying, Sean ________________________________________ From: Kim Ebert [kim.eb...@perfectsearchcorp.com] Sent: Friday, September 05, 2014 12:28 PM To: dev@ctakes.apache.org; Finan, Sean Subject: Re: Permutations Hi Pei and Sean, Sean, any thoughts about this would be helpful. We also had issues in cTAKES 2.5. Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his changes. === modified file 'src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java' --- src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java 2012-11-28 01:56:50 +0000 +++ src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java 2013-02-06 16:39:37 +0000 @@ -294,14 +294,16 @@ Iterator mdhIterator = mdhSet.iterator(); while (mdhIterator.hasNext()) { - MetaDataHit mdh = (MetaDataHit) mdhIterator.next(); + MetaDataHit mdh = (MetaDataHit) mdhIterator.next(); + + List permutationSorted = (List) ((ArrayList)permutation).clone(); // figure out start and end offsets - Collections.sort(permutation); + Collections.sort(permutationSorted); int startOffset; - if (permutation.size() > 0) + if (permutationSorted.size() > 0) { - int firstIdx = ((Integer) permutation.get(0)).intValue(); + int firstIdx = ((Integer) permutationSorted.get(0)).intValue(); if (firstIdx <= firstTokenIndex.intValue()) { firstIdx--; @@ -322,9 +324,9 @@ } int endOffset; - if (permutation.size() > 0) + if (permutationSorted.size() > 0) { - int lastIdx = ((Integer) permutation.get(permutation.size() - 1)).intValue(); + int lastIdx = ((Integer) permutationSorted.get(permutationSorted.size() - 1)).intValue(); if (lastIdx <= firstTokenIndex.intValue()) { lastIdx--; Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 09/05/2014 10:17 AM, Pei Chen wrote: > Hi Kim, > Thanks for pointing that out. > https://issues.apache.org/jira/browse/CTAKES-310 has been opened for > this. > If you commit the changes, we can see if we can include in the 3.2.1 > patch release. > I was looking at the changelist for this file, and it may look like > some of these optimizations may have been intentional by Sean so he > may have some more insight in this bit of the logic. > > On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert > <kim.eb...@perfectsearchcorp.com> wrote: >> Hi All, >> >> I was reviewing the use of permutations, and I noticed that we sorted >> the permutation list before creating the string to do the concept lookup >> with. It also appears that we were sorting the object that was stored in >> the parent list. >> >> I've made a few changes, and now it appears I can discover some >> additional concepts based upon the permutations. >> >> Let me know what you think of the following changes. >> >> Thanks, >> >> Kim >> >> === modified file >> 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java' >> --- >> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java >> 2014-07-31 22:00:48 +0000 >> +++ >> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java >> 2014-09-04 18:39:59 +0000 >> @@ -210,11 +210,12 @@ >> final List<List<Integer>> permutationList = iv_permCacheMap.get( >> permutationIndex ); >> for ( List<Integer> permutations : permutationList ) { >> // Moved sort and offset calculation from inner (per >> MetaDataHit) iteration 2-21-2013 spf >> - Collections.sort( permutations ); >> + List<Integer> permutationsSorted = (List) >> ((ArrayList)permutations).clone(); >> + Collections.sort( permutationsSorted ); >> int startOffset = firstWordStartOffset; >> int endOffset = firstWordEndOffset; >> - if ( !permutations.isEmpty() ) { >> - int firstIdx = permutations.get( 0 ); >> + if ( !permutationsSorted.isEmpty() ) { >> + int firstIdx = permutationsSorted.get( 0 ); >> if ( firstIdx <= firstTokenIndex ) { >> firstIdx--; >> } >> @@ -222,7 +223,7 @@ >> if ( firstToken.getStartOffset() < firstWordStartOffset ) { >> startOffset = firstToken.getStartOffset(); >> } >> - int lastIdx = permutations.get( permutations.size() - 1 ); >> + int lastIdx = permutationsSorted.get( >> permutationsSorted.size() - 1 ); >> if ( lastIdx <= firstTokenIndex ) { >> lastIdx--; >> } >> >> >> -- >> Kim Ebert >> 1.801.669.7342 >> Perfect Search Corp >> http://www.perfectsearchcorp.com/ >>