Any preference for separate factory classes:

    class SentenceDetectorAnnotatorFactory:
        static AnalysisEngineDescription getSentenceDetectorAnnotator()

VS static methods added to primitive annotators:

    class SentenceDetector (existing)
        static AnalysisEngineDescription getSentenceDetectorAnnotator()

?

The former can clutter up the class space while the latter extends the
length of classes, especially if there are multiple versions
(getUMLSDictionaryAnnotator(), getICD9DictionaryAnnotator(),
getMeshDictionaryAnnotator(), etc.)

Tim

On 04/16/2014 04:48 AM, Richard Eckart de Castilho wrote:
> It would be nice if uimaFIT provided a Maven plugin to automatically
> generate descriptors for aggregates. Maybe if we come up with a
> convention for factories, e.g. a "class with static methods that do
> not take any parameters and that return descriptors", or "methods
> that bear a specific Java annotation, e.g. @AutoGenerateDescriptor)"
> it should be possible to implement such a Maven plugin.
>
> Cheers,
>
> -- Richard
>
> On 16.04.2014, at 05:21, Steven Bethard <steven.beth...@gmail.com> wrote:
>
>> +1. And note that once you have a descriptor, you can generate the
>> XML, so we should arrange to replace the current XML descriptors with
>> ones generated automatically from the uimaFIT code. That should reduce
>> some synchronization problems when the Java code was changed but the
>> XML descriptor was not.
>>
>> Steve
>>
>> On Tue, Apr 15, 2014 at 8:52 AM, Miller, Timothy
>> <timothy.mil...@childrens.harvard.edu> wrote:
>>> The discussion in the other thread with Abraham Tom gave me an idea I
>>> wanted to float to the list. We have been using some UIMAFit pipeline
>>> builders in the temporal project that maybe could be moved into
>>> clinical-pipeline. For example, look to this file:
>>>
>>> http://svn.apache.org/viewvc/ctakes/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/pipelines/TemporalExtractionPipeline_ImplBase.java?view=markup
>>>
>>> with the static methods getPreprocessorAggregateBuilder() and
>>> getLightweightPreprocessorAggregateBuilder() [no umls].
>>>
>>> So my idea would be to create a class in clinical-pipeline
>>> (CTakesPipelines) with static methods for some standard pipelines (to
>>> return AnalysisEngineDescriptions instead of AggregateBuilders?):
>>>
>>> getStandardUMLSPipeline() -- builds pipeline currently in
>>> AggregatePlaintextUMLSProcessor.xml
>>> getFullPipeline() -- same as above but with SRL, constituency parsing,
>>> etc., every component in ctakes
>>>
>>> We could then potentially merge our entry points -- I think Abraham's
>>> experience points out that this is currently confusing, as well as
>>> probably not implemented optimally. For example, either
>>> ClinicalPipelineWithUmls or BagOfCUIsGenerator would use that static
>>> method to run a uimafit-style pipeline. Maybe we can slowly deprecate
>>> our xml descriptors too unless people feel strongly about keeping those
>>> around.
>>>
>>> Another benefit is that the cTAKES API is then trivial -- if you import
>>> ctakes into your pom file getting a UIMA pipeline is one UimaFit call:
>>>
>>> builder.add(CTAKESPipelines.getStandardUMLSPipeline());
>>>
>>>
>>> I think this would actually be pretty easy to implement, but hoping to
>>> get some feedback on whether this is a good direction.
>>>
>>> Tim
>

-- 
Tim Miller
Instructor
Boston Children's Hospital and Harvard Medical School
timothy.mil...@childrens.harvard.edu
617-919-1223
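
For reference, the two layouts asked about at the top of this message might look roughly like the sketch below. It is only an illustration under stated assumptions: DescriptorStub is a made-up stand-in for UIMA's AnalysisEngineDescription so the code compiles without UIMA on the classpath, and DictionaryAnnotator is a hypothetical class name. In real code each method would presumably return an AnalysisEngineDescription, for example one built with uimaFIT's AnalysisEngineFactory.createEngineDescription(...).

```java
// Stand-in for UIMA's AnalysisEngineDescription so this compiles without
// UIMA on the classpath. Hypothetical type, for illustration only.
final class DescriptorStub {
    final String component;
    final String variant;

    DescriptorStub(String component, String variant) {
        this.component = component;
        this.variant = variant;
    }

    @Override
    public String toString() {
        return component + "[" + variant + "]";
    }
}

// Option A: a separate factory class alongside the existing annotator.
// Adds one more class per annotator to the class space.
final class SentenceDetectorAnnotatorFactory {
    private SentenceDetectorAnnotatorFactory() {}

    static DescriptorStub getSentenceDetectorAnnotator() {
        return new DescriptorStub("SentenceDetector", "default");
    }
}

// Option B: static methods on the annotator class itself.
// DictionaryAnnotator is a hypothetical name; the class grows by one
// method per variant.
class DictionaryAnnotator {
    // The annotator's own processing logic would live here.

    static DescriptorStub getUMLSDictionaryAnnotator() {
        return new DescriptorStub("DictionaryAnnotator", "UMLS");
    }

    static DescriptorStub getICD9DictionaryAnnotator() {
        return new DescriptorStub("DictionaryAnnotator", "ICD9");
    }

    static DescriptorStub getMeshDictionaryAnnotator() {
        return new DescriptorStub("DictionaryAnnotator", "Mesh");
    }
}

class FactoryStyleDemo {
    public static void main(String[] args) {
        // Both styles are called the same way from pipeline-building code.
        System.out.println(SentenceDetectorAnnotatorFactory.getSentenceDetectorAnnotator());
        System.out.println(DictionaryAnnotator.getMeshDictionaryAnnotator());
    }
}
```

Either way the call site is a single static call, so the choice mostly affects where the code lives. Both layouts also fit the convention Richard describes (static methods that take no parameters and return descriptors).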