Re: Permutations

2014-09-05 Thread Pei Chen
Hi Kim,
Thanks for pointing that out.
https://issues.apache.org/jira/browse/CTAKES-310 has been opened for
this.
If you commit the changes, we can see if we can include in the 3.2.1
patch release.
I was looking at the changelist for this file, and it may look like
some of these optimizations may have been intentional by Sean so he
may have some more insight in this bit of the logic.

On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert
 wrote:
> Hi All,
>
> I was reviewing the use of permutations, and I noticed that we sorted
> the permutation list before creating the string to do the concept lookup
> with. It also appears that we were sorting the object that was stored in
> the parent list.
>
> I've made a few changes, and now it appears I can discover some
> additional concepts based upon the permutations.
>
> Let me know what you think of the following changes.
>
> Thanks,
>
> Kim
>
> === modified file
> 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
> ---
> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
> 2014-07-31 22:00:48 +
> +++
> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
> 2014-09-04 18:39:59 +
> @@ -210,11 +210,12 @@
>final List> permutationList = iv_permCacheMap.get(
> permutationIndex );
>for ( List permutations : permutationList ) {
>   // Moved sort and offset calculation from inner (per
> MetaDataHit) iteration 2-21-2013 spf
> - Collections.sort( permutations );
> + List permutationsSorted = (List)
> ((ArrayList)permutations).clone();
> + Collections.sort( permutationsSorted );
>   int startOffset = firstWordStartOffset;
>   int endOffset = firstWordEndOffset;
> - if ( !permutations.isEmpty() ) {
> -int firstIdx = permutations.get( 0 );
> + if ( !permutationsSorted.isEmpty() ) {
> +int firstIdx = permutationsSorted.get( 0 );
>  if ( firstIdx <= firstTokenIndex ) {
> firstIdx--;
>  }
> @@ -222,7 +223,7 @@
>  if ( firstToken.getStartOffset() < firstWordStartOffset ) {
> startOffset = firstToken.getStartOffset();
>  }
> -int lastIdx = permutations.get( permutations.size() - 1 );
> +int lastIdx = permutationsSorted.get(
> permutationsSorted.size() - 1 );
>  if ( lastIdx <= firstTokenIndex ) {
> lastIdx--;
>  }
>
>
> --
> Kim Ebert
> 1.801.669.7342
> Perfect Search Corp
> http://www.perfectsearchcorp.com/
>


Re: Permutations

2014-09-05 Thread Kim Ebert
Hi Pei and Sean,

Sean, any thoughts about this would be helpful.

We also had issues in cTAKES 2.5.

Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his
changes.

=== modified file
'src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java'
--- src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java  
2012-11-28 01:56:50 +
+++ src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java  
2013-02-06 16:39:37 +
@@ -294,14 +294,16 @@
 Iterator mdhIterator = mdhSet.iterator();
 while (mdhIterator.hasNext())
 {
-MetaDataHit mdh = (MetaDataHit)
mdhIterator.next();
+MetaDataHit mdh = (MetaDataHit)
mdhIterator.next();
+   
+List permutationSorted = (List)
((ArrayList)permutation).clone();
 // figure out start and end offsets
-Collections.sort(permutation);
+Collections.sort(permutationSorted);
 
 int startOffset;
-if (permutation.size() > 0)
+if (permutationSorted.size() > 0)
 {
-int firstIdx = ((Integer)
permutation.get(0)).intValue();
+int firstIdx = ((Integer)
permutationSorted.get(0)).intValue();
 if (firstIdx <= firstTokenIndex.intValue())
 {
 firstIdx--;
@@ -322,9 +324,9 @@
 }
 
 int endOffset;
-if (permutation.size() > 0)
+if (permutationSorted.size() > 0)
 {
-int lastIdx = ((Integer)
permutation.get(permutation.size() - 1)).intValue();
+int lastIdx = ((Integer)
permutationSorted.get(permutationSorted.size() - 1)).intValue();
 if (lastIdx <= firstTokenIndex.intValue())
 {
 lastIdx--;


Kim Ebert
1.801.669.7342
Perfect Search Corp
http://www.perfectsearchcorp.com/

On 09/05/2014 10:17 AM, Pei Chen wrote:
> Hi Kim,
> Thanks for pointing that out.
> https://issues.apache.org/jira/browse/CTAKES-310 has been opened for
> this.
> If you commit the changes, we can see if we can include in the 3.2.1
> patch release.
> I was looking at the changelist for this file, and it may look like
> some of these optimizations may have been intentional by Sean so he
> may have some more insight in this bit of the logic.
>
> On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert
>  wrote:
>> Hi All,
>>
>> I was reviewing the use of permutations, and I noticed that we sorted
>> the permutation list before creating the string to do the concept lookup
>> with. It also appears that we were sorting the object that was stored in
>> the parent list.
>>
>> I've made a few changes, and now it appears I can discover some
>> additional concepts based upon the permutations.
>>
>> Let me know what you think of the following changes.
>>
>> Thanks,
>>
>> Kim
>>
>> === modified file
>> 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
>> ---
>> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
>> 2014-07-31 22:00:48 +
>> +++
>> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
>> 2014-09-04 18:39:59 +
>> @@ -210,11 +210,12 @@
>>final List> permutationList = iv_permCacheMap.get(
>> permutationIndex );
>>for ( List permutations : permutationList ) {
>>   // Moved sort and offset calculation from inner (per
>> MetaDataHit) iteration 2-21-2013 spf
>> - Collections.sort( permutations );
>> + List permutationsSorted = (List)
>> ((ArrayList)permutations).clone();
>> + Collections.sort( permutationsSorted );
>>   int startOffset = firstWordStartOffset;
>>   int endOffset = firstWordEndOffset;
>> - if ( !permutations.isEmpty() ) {
>> -int firstIdx = permutations.get( 0 );
>> + if ( !permutationsSorted.isEmpty() ) {
>> +int firstIdx = permutationsSorted.get( 0 );
>>  if ( firstIdx <= firstTokenIndex ) {
>> firstIdx--;
>>  }
>> @@ -222,7 +223,7 @@
>>  if ( firstToken.getStartOffset() < firstWordStartOffset ) {
>> startOffset = firstToken.getStartOffset();
>>  }
>> -int lastIdx = permutations.get( permutations.size() - 1 );
>> +int lastIdx = permutationsSorted.get(
>> permutation

RE: Permutations

2014-09-05 Thread Finan, Sean
Hi Kim, Pei,

I don't think that I changed anything to which Kim is referring, just a couple 
of other things that happen to be in the same segment.  From the attached it 
looks like Kim's change is to copy a list and sort the copy, while mine were 
moving the sort from an inner to outer loop.  At any rate, whatever I did I did 
a year and a half ago and I'm concentrating on the new lookup these days.

Sean

From: Pei Chen [chen...@apache.org]
Sent: Friday, September 05, 2014 12:17 PM
To: dev@ctakes.apache.org
Subject: Re: Permutations

Hi Kim,
Thanks for pointing that out.
https://issues.apache.org/jira/browse/CTAKES-310 has been opened for
this.
If you commit the changes, we can see if we can include in the 3.2.1
patch release.
I was looking at the changelist for this file, and it may look like
some of these optimizations may have been intentional by Sean so he
may have some more insight in this bit of the logic.

On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert
 wrote:
> Hi All,
>
> I was reviewing the use of permutations, and I noticed that we sorted
> the permutation list before creating the string to do the concept lookup
> with. It also appears that we were sorting the object that was stored in
> the parent list.
>
> I've made a few changes, and now it appears I can discover some
> additional concepts based upon the permutations.
>
> Let me know what you think of the following changes.
>
> Thanks,
>
> Kim
>
> === modified file
> 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
> ---
> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
> 2014-07-31 22:00:48 +
> +++
> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
> 2014-09-04 18:39:59 +
> @@ -210,11 +210,12 @@
>final List> permutationList = iv_permCacheMap.get(
> permutationIndex );
>for ( List permutations : permutationList ) {
>   // Moved sort and offset calculation from inner (per
> MetaDataHit) iteration 2-21-2013 spf
> - Collections.sort( permutations );
> + List permutationsSorted = (List)
> ((ArrayList)permutations).clone();
> + Collections.sort( permutationsSorted );
>   int startOffset = firstWordStartOffset;
>   int endOffset = firstWordEndOffset;
> - if ( !permutations.isEmpty() ) {
> -int firstIdx = permutations.get( 0 );
> + if ( !permutationsSorted.isEmpty() ) {
> +int firstIdx = permutationsSorted.get( 0 );
>  if ( firstIdx <= firstTokenIndex ) {
> firstIdx--;
>  }
> @@ -222,7 +223,7 @@
>  if ( firstToken.getStartOffset() < firstWordStartOffset ) {
> startOffset = firstToken.getStartOffset();
>  }
> -int lastIdx = permutations.get( permutations.size() - 1 );
> +int lastIdx = permutationsSorted.get(
> permutationsSorted.size() - 1 );
>  if ( lastIdx <= firstTokenIndex ) {
> lastIdx--;
>  }
>
>
> --
> Kim Ebert
> 1.801.669.7342
> Perfect Search Corp
> http://www.perfectsearchcorp.com/
>


RE: Permutations

2014-09-05 Thread Finan, Sean
> We also had issues in cTAKES 2.5.
> Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his 
> changes.

Thanks for verifying,

Sean

From: Kim Ebert [kim.eb...@perfectsearchcorp.com]
Sent: Friday, September 05, 2014 12:28 PM
To: dev@ctakes.apache.org; Finan, Sean
Subject: Re: Permutations

Hi Pei and Sean,

Sean, any thoughts about this would be helpful.

We also had issues in cTAKES 2.5.

Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his
changes.

=== modified file
'src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java'
--- src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java
2012-11-28 01:56:50 +
+++ src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java
2013-02-06 16:39:37 +
@@ -294,14 +294,16 @@
 Iterator mdhIterator = mdhSet.iterator();
 while (mdhIterator.hasNext())
 {
-MetaDataHit mdh = (MetaDataHit)
mdhIterator.next();
+MetaDataHit mdh = (MetaDataHit)
mdhIterator.next();
+
+List permutationSorted = (List)
((ArrayList)permutation).clone();
 // figure out start and end offsets
-Collections.sort(permutation);
+Collections.sort(permutationSorted);

 int startOffset;
-if (permutation.size() > 0)
+if (permutationSorted.size() > 0)
 {
-int firstIdx = ((Integer)
permutation.get(0)).intValue();
+int firstIdx = ((Integer)
permutationSorted.get(0)).intValue();
 if (firstIdx <= firstTokenIndex.intValue())
 {
 firstIdx--;
@@ -322,9 +324,9 @@
 }

 int endOffset;
-if (permutation.size() > 0)
+if (permutationSorted.size() > 0)
 {
-int lastIdx = ((Integer)
permutation.get(permutation.size() - 1)).intValue();
+int lastIdx = ((Integer)
permutationSorted.get(permutationSorted.size() - 1)).intValue();
 if (lastIdx <= firstTokenIndex.intValue())
 {
 lastIdx--;


Kim Ebert
1.801.669.7342
Perfect Search Corp
http://www.perfectsearchcorp.com/

On 09/05/2014 10:17 AM, Pei Chen wrote:
> Hi Kim,
> Thanks for pointing that out.
> https://issues.apache.org/jira/browse/CTAKES-310 has been opened for
> this.
> If you commit the changes, we can see if we can include in the 3.2.1
> patch release.
> I was looking at the changelist for this file, and it may look like
> some of these optimizations may have been intentional by Sean so he
> may have some more insight in this bit of the logic.
>
> On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert
>  wrote:
>> Hi All,
>>
>> I was reviewing the use of permutations, and I noticed that we sorted
>> the permutation list before creating the string to do the concept lookup
>> with. It also appears that we were sorting the object that was stored in
>> the parent list.
>>
>> I've made a few changes, and now it appears I can discover some
>> additional concepts based upon the permutations.
>>
>> Let me know what you think of the following changes.
>>
>> Thanks,
>>
>> Kim
>>
>> === modified file
>> 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
>> ---
>> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
>> 2014-07-31 22:00:48 +
>> +++
>> ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java
>> 2014-09-04 18:39:59 +
>> @@ -210,11 +210,12 @@
>>final List> permutationList = iv_permCacheMap.get(
>> permutationIndex );
>>for ( List permutations : permutationList ) {
>>   // Moved sort and offset calculation from inner (per
>> MetaDataHit) iteration 2-21-2013 spf
>> - Collections.sort( permutations );
>> + List permutationsSorted = (List)
>> ((ArrayList)permutations).clone();
>> + Collections.sort( permutationsSorted );
>>   int startOffset = firstWordStartOffset;
>>   int endOffset = firstWordEndOffset;
>> - if ( !permutations.isEmpty() ) {
>> -int firstIdx = permutations.get( 0 );
>> + if ( !permutationsSorted.isEmpty() ) {
>> +int firstIdx = permutationsSorted.get( 0 );
>>  if ( firstIdx <= firstTokenIndex ) {
>> firstIdx--;
>>  }
>> @@