Re: Performance of the cleartk history module [EXTERNAL]

2022-01-04 Thread Finan, Sean
Hi Peter, I created a second engine that just used text matching or regular expressions given the discovered events. It also uses covering section types, formatted text and other things, but the text match might be the most impactful item. You are an accomplished developer so the email scratch

Re: Performance of the cleartk history module [EXTERNAL]

2022-01-04 Thread Peter Abramowitsch
Hi Sean, OK. I was confused about whether I was meant to find it in the sources. But while you're reading this, is there a brief way to describe the difference between the older package org.apache.ctakes.assertion.medfacts.cleartk and org.apache.ctakes.assertion.medfacts.cleartk.windowed? Peter O

Re: Performance of the cleartk history module [EXTERNAL]

2022-01-04 Thread Finan, Sean
Great question. The package name "windowed" isn't helpfully self-descriptive. It contains yet another bit of code that I wrote as quickly as possible to help somebody in real time with a problem. * There is only a 'procedural' difference between the two. The models and methods are the same.

Re: Performance of the cleartk history module [EXTERNAL]

2022-01-04 Thread Peter Abramowitsch
Thank you for the fulsome and humorous response. Yes, I understand perfectly. We definitely think along the same lines. One of the drawbacks of static, simple-to-understand utility functions like JCasUtil's is that one can just slap things together without getting to grips with the wastage o

Re: Performance of the cleartk history module [EXTERNAL]

2022-01-04 Thread Miller, Timothy
Peter, That sounds really useful! Were you able to benchmark it for runtime on a reasonably sized sample of your notes? Just curious because I wouldn't have expected regex to be that much of a bottleneck. Tim On Tue, 2022-01-04 at 17:36 -0800, Peter Abramowitsch wrote:

Re: Performance of the cleartk history module [EXTERNAL]

2022-01-04 Thread Peter Abramowitsch
Hi Tim, The performance boost was the frosting on the cake: I had to make changes (at least for our team) because Negex was not working correctly in sentences with multiple identified annotations, only some of which were meant to be negated. Negex became over-eager, applying negation when it shou