Hi, and thanks for taking the time to read all the emails on this.
Here are some answers to your questions, below.
Otis Gospodnetic wrote:
Having finally read all the emails related to this proposal, I'm very much for this
"puppy" entering ASF and eventually getting it going with Lucene and friends.
A few questions.
1. What you are proposing for ASF is the UIMA 2.0 code that currently lives on
SF, correct?
Yes, that is correct.
2. What about the SDK, and could you tell me/us what's in the SDK that is not
in the SF code? (I'm confused, because your proposal includes references to
tools for development and design of UIMA components, but doesn't that typically
live in an SDK?)
The only other thing in the SDK that is not coming to Apache is a
version of a semantic search engine (and some associated components)
that can index both keywords and labeled spans containing those
keywords. We are leaving it out because Apache already has Lucene, which
is a good candidate for extension in this manner. The SDK also includes
tooling and examples; those are coming. In addition, we're bringing the
framework test cases.
3. I'm a bit puzzled why something that sounds like a framework/pipeline for
hooking up components with pre-defined input/output adapters ends up with
a 400 page user guide/book. Perhaps I should present this as a question. How
come? Or is that user guide for the SDK only?
There are several reasons for this. One reason is that the book's first
part is actually a general introduction to the rationale behind the
framework, followed by a tutorial (chapters 4-7). Our target audience
was mainly researchers who worked deep in analytic algorithms and who
didn't necessarily spend much time keeping up to date with newer
technologies for building software applications. So we
found ourselves giving tutorials, and decided it would be good to
include those in the big book.
Besides the framework, we have some tooling (both Eclipse-based and
standalone); there are chapters on these tools and how to use them.
The architecture includes the idea of specifying lots of meta-data about
the components, in XML, and our early users had a lot of trouble getting
the XML right. So we built an Eclipse editor for the XML that
does a whole bunch of consistency checking and presents a visual model
to the user describing the component meta-data in a friendlier way than
just XML. The chapter describing this tool is one of the larger ones.
Finally, when you get into the details, you'll find there's more to this
than it first appears :-).
Does that help explain the manual length?
-Marshall Schor
Otis