DJ Ortley wrote:
> Looking through the source code, it seems to me (which are key words
> that indicate this is only an opinion, one which may not be worth
> much) that using a library such as Xerces or some sort of XML DOM like
> library would be of benefit.
>
> I was wondering if any thought had been given to that previously?

This is the approach that JSword uses. We actually use JAXP, which is an interface layer over a pluggable XML parser implementation. So in some cases we use Crimson and in others we use Xerces; it all depends on what is bundled with the user's JDK. SAX is a better model than DOM for most processing, as most processing does not need an object representation of the document.

That said, I think there are significant advantages and also disadvantages to using it.

Advantages: To me the most significant are that it is a full implementation of an XML parser and that we don't need to maintain it.

Disadvantages: It is a full implementation of an XML parser. Sword doesn't need a full implementation. Our documents have a well-defined vocabulary (i.e. the DTD specs) and we only need a parser sufficient to parse that vocabulary.

Parsing serves two purposes: search/indexing, i.e. stripping out only the text from the "verse"; and display, i.e. converting the raw module source into some kind of presentation source. The former benefits from being very fast. Sword's "stripping" routines are built for speed, and it would be a huge performance loss to use a true XML parser there. Converting to a display representation can tolerate a slower parser, because it will likely still be fast enough.

The other thing is that the Sword library has taken a least-common-denominator approach to its requirements. It is targeted at small handhelds (phones, PDAs and the like) and at computers of all ages, colors and creeds. A fairly large library would have to be optional (like curl, icu4c and lucene), and that would still leave the need for the current custom parsing.

Earlier I submitted a patch to make the parser more accurate, and it was rejected as a performance hit and too big/risky a change. These were the reasons I was given.
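
To make the trade-off a bit more concrete, here is a rough sketch in Java (the language JSword is written in) of the two ways of stripping text out of a verse. The class name VerseText, both method names and the sample markup are made up for illustration; this is neither Sword's stripping code nor JSword's actual code. The first method goes through JAXP to whatever SAX parser the JDK bundles; the second is a naive single pass that only drops whatever sits between '<' and '>'.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of Sword or JSword.
class VerseText {

    // Full-parser route: JAXP selects the SAX implementation at runtime.
    static String stripWithSax(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            StringBuilder text = new StringBuilder();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    // Collect character data only; elements and attributes are ignored.
                    text.append(ch, start, length);
                }
            });
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse verse", e);
        }
    }

    // Minimal route: one pass, no entity decoding, no well-formedness checks.
    static String stripFast(String xml) {
        StringBuilder text = new StringBuilder(xml.length());
        boolean inTag = false;
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag) {
                text.append(c);
            }
        }
        return text.toString();
    }
}
```

On an input such as `<verse>In the <w lemma="x">beginning</w> &amp; end</verse>`, the SAX version decodes the entity and rejects malformed input, while the naive version leaves `&amp;` as it is and would be confused by a '>' inside an attribute value. That is roughly the accuracy-for-speed bargain described above, though a sketch like this says nothing about the real performance numbers.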