On 10/01/2013 02:38 AM, Pei Chen wrote:
Richard, I, and a few others had an interesting bar conversation...
In the spirit of interoperability, what if we had a baseline common type
system that could be reused across UIMA-compatible NLP systems?
Imagine for a moment that OpenNLP, ClearTK, ClearNLP, DKPro, cTAKES, etc.
could all reuse a common baseline type system. It
may sound like a dream, but it could be doable if we could factor out and
find the common ground. Perhaps we could start with the syntactic
features... and then extend it for more domain-specific use cases?
The OpenNLP UIMA AEs don't depend on a specific type system. I believe
they can just be configured to work with the cTAKES type system out of
the box. There is a Jira issue for doing that with descriptors for the
POS Tagger, Sentence Detector, and Chunker:
https://issues.apache.org/jira/browse/CTAKES-98
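
To illustrate what "configured" means here, below is a rough sketch, not the actual descriptors from the Jira issue. The idea is that the AE is told by name which annotation types and features to read and write, so the same tagger can be pointed at different type systems. The parameter names are the ones I recall from the OpenNLP UIMA integration, and the cTAKES type and feature names are assumptions; both should be checked against the real descriptors and type system. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the name/value settings a POS tagger
// descriptor would carry. It does not touch UIMA itself.
class PosTaggerConfigSketch {

    // Parameter names as recalled from the OpenNLP UIMA integration (assumption).
    static final String SENTENCE_TYPE = "opennlp.uima.SentenceType";
    static final String TOKEN_TYPE = "opennlp.uima.TokenType";
    static final String POS_FEATURE = "opennlp.uima.POSFeature";

    // Maps each configuration parameter to a type or feature name
    // taken from the target type system.
    static Map<String, String> posTaggerSettings(
            String sentenceType, String tokenType, String posFeature) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(SENTENCE_TYPE, sentenceType);
        settings.put(TOKEN_TYPE, tokenType);
        settings.put(POS_FEATURE, posFeature);
        return settings;
    }

    public static void main(String[] args) {
        // cTAKES type and feature names below are assumptions for illustration.
        Map<String, String> ctakes = posTaggerSettings(
                "org.apache.ctakes.typesystem.type.textspan.Sentence",
                "org.apache.ctakes.typesystem.type.syntax.BaseToken",
                "partOfSpeech");
        ctakes.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Switching to another type system would then only mean passing different names, with no change to the tagger code.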
The Sentence Detector might need a bit more work to support the newline
modification you need for cTAKES, but I guess that can easily be done
for the next OpenNLP release if there is a need.
Jörn