Hi Steve,

Thanks for the link to the Javadoc! At a glance, it looks like indexCovering(..) would be a better fit than a ContainmentIndex for the assertion module's purposes, since a reverse map isn't required. It's excellent that uimaFIT has that class; I can probably make use of it in the future.
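
To illustrate the idea, here is a minimal, library-free sketch of a covering index: the map from each entity to its covering sentences is built once, and the per-entity loop then does only a map lookup. The Span class and this indexCovering method are hypothetical stand-ins written for illustration; they are not the uimaFIT API, and uimaFIT's own implementation may work differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CoveringIndexSketch {

    /** Hypothetical stand-in for a UIMA annotation: a label plus character offsets. */
    public static final class Span {
        final String label;
        final int begin;
        final int end;

        public Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Builds a map from each covered span to the covering spans that fully
     * contain it, visiting each covering span once.
     */
    public static Map<Span, List<Span>> indexCovering(List<Span> covered, List<Span> covering) {
        // Sort a copy by begin offset so each covering span can stop scanning early.
        List<Span> sorted = new ArrayList<>(covered);
        sorted.sort(Comparator.comparingInt((Span s) -> s.begin));

        // Every covered span gets an entry, even if nothing covers it.
        Map<Span, List<Span>> index = new LinkedHashMap<>();
        for (Span inner : sorted) {
            index.put(inner, new ArrayList<>());
        }

        for (Span outer : covering) {
            // Skip straight to the first candidate that starts inside this covering span.
            int i = firstAtOrAfter(sorted, outer.begin);
            for (; i < sorted.size() && sorted.get(i).begin <= outer.end; i++) {
                Span inner = sorted.get(i);
                if (inner.end <= outer.end) {
                    index.get(inner).add(outer);
                }
            }
        }
        return index;
    }

    /** Binary search for the first span whose begin offset is >= the given offset. */
    private static int firstAtOrAfter(List<Span> sorted, int offset) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).begin < offset) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        List<Span> sentences = List.of(new Span("sentence-1", 0, 40), new Span("sentence-2", 41, 90));
        List<Span> entities = List.of(new Span("fever", 12, 17), new Span("cough", 50, 55));

        // Build the index once, outside the per-entity loop.
        Map<Span, List<Span>> coveringSentences = indexCovering(entities, sentences);
        for (Span entity : entities) {
            // A map lookup replaces a per-entity scan over all sentences.
            for (Span sentence : coveringSentences.get(entity)) {
                System.out.println(entity.label + " is in " + sentence.label);
            }
        }
    }
}
```

The trade-off is the one mentioned in this thread: the index costs memory proportional to the number of entity/sentence pairs, in exchange for not rescanning the sentences for every entity.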

Dima, you now have a couple of options! I don't think anybody is parsing a Bio 101 textbook, so the memory used by the indexCovering(..) maps shouldn't be a problem.

Cheers,
Sean

-----Original Message-----
From: Steven Bethard [mailto:steven.beth...@gmail.com]
Sent: Thursday, June 22, 2017 12:32 PM
To: dev@ctakes.apache.org
Cc: Miller, Timothy
Subject: Re: negation/uncertainty: pipeline runs very slowly [EXTERNAL]

On Thu, Jun 22, 2017 at 9:04 AM Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> There is some suboptimal looping in AssertionCleartkAnalysisEngine. For
> instance:
>
>   Collection<IdentifiedAnnotation> entities =
>       JCasUtil.select(identifiedAnnotationView, IdentifiedAnnotation.class);
>   for (IdentifiedAnnotation identifiedAnnotation : entities) {
>     ...
>     List<Sentence> sents = new ArrayList<>(JCasUtil.selectCovering(jCas, Sentence.class,
>         entityOrEventMention.getBegin(), entityOrEventMention.getEnd()));

This should definitely never be done. According to the UimaFIT documentation for selectCovering:

  Note: this is REALLY SLOW! You don't want to use this. Instead, consider
  using indexCovering(JCas, Class, Class) or a ContainmentIndex.

https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/JCasUtil.html#selectCovering-java.lang.Class-org.apache.uima.cas.text.AnnotationFS-

Steve