Re: Questions about XML Parser for Java

keshlam Wed, 01 Aug 2007 08:36:30 -0700

>Would you be so kind as to provide me a rough estimate of the man hours
>that expended in developing the XML Parser


Probably not possible, but it's a significant number of man-years.

Xerces started off as an early prototype of IBM's XML4J parser, which went
through several complete redesigns and reimplementations, API changes,
changes in the validation scheme... Heck, the DOM implementation alone is
probably multiple man-years during that stage, since the first DOM
implementation was discarded in favor of one I wrote, which then underwent
a lot of further evolution. Work on that was done across multiple IBM
groups from Tokyo to California to New York to Toronto to wherever. I
really doubt anyone was attempting to track total time investment.

And of course once Xerces hit Apache, and we started getting contributions
from the open source community, any pretense of time tracking would have
gone right out the window.

Could a parser be written in less time? Sure; a lot of the time was spent
in helping the standards to evolve, and a lot was spent in performance
tuning, and Xerces supports things that your particular application may not
need (the downside of being a generally useful tool is that one has to
invest in being general). And the requirements for an XML parser are better
understood these days. But writing a parser that you'll be happy using is
still not a trivial exercise; the devil really is in the details.


>We have noted that saving an XML file as an Excel file gets you an Excel
>file that seems to have been parsed in some manner. [...] I wonder if you
>would be willing to comment on the differences between what XML4J would
>provide and what Excel provides for some particular XML file.

I'm sorry, but that question really doesn't make a lot of sense. It's like
asking what the difference is between a motor and a washing machine.

Excel is a particular application. It supports a particular XML-based
markup language as one of its file export/import syntaxes, and therefore
must contain at least a limited XML serializer and parser. (It may not be
fully general, since its developers know a priori exactly what kind of XML
they intend to generate and process.)

XML4J/Xerces is a general-purpose XML parser for invocation from
applications. It converts between XML syntax and the standard APIs for
working with XML (DOM, SAX, etc.), as well as performing validation against
DTDs and/or schemas that describe the particular XML-based markup language
you are working with. Xerces can be used as a building block for any
application which needs to read or write data represented in XML.
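
To make the "building block" point concrete, here is a minimal sketch of an
application invoking a parser through the standard JAXP entry points and
getting the same document back as a DOM tree and as a stream of SAX events.
It is an illustration only: the class name, method names, and sample
document are hypothetical, and it runs against whichever JAXP implementation
the runtime supplies (Xerces itself if it is on the classpath).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ParserSketch {
    // Hypothetical sample document, invented for this illustration.
    public static final String SAMPLE =
        "<orders><order id=\"1\"/><order id=\"2\"/></orders>";

    // DOM style: the parser builds an in-memory tree that the application queries.
    public static int countWithDom(String xml, String tag) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX style: the parser calls back into the application as it reads each element.
    public static List<String> elementNamesWithSax(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        names.add(qName);
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(countWithDom(SAMPLE, "order"));   // prints 2
        System.out.println(elementNamesWithSax(SAMPLE));     // prints [orders, order, order]
    }
}
```

Note that nothing in the sketch knows what an "order" means; that knowledge
lives in the calling application, which is where something like Excel differs
from the parser underneath it.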



______________________________________
"... Three things see no end: A loop with exit code done wrong,
A semaphore untested, And the change that comes along. ..."
  -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish
(http://www.ovff.org/pegasus/songs/threes-rev-11.html)
