Package: wnpp
Severity: wishlist

Package name: tagsoup
Version: 1.0rc3
Upstream Author: John Cowan <[EMAIL PROTECTED]>
URL: http://mercury.ccil.org/~cowan/XML/tagsoup/
License: GPL or AFL
Description: SAX-compliant HTML parser for Java
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.

TagSoup is free and Open Source software, licensed under the Academic Free License, a cleaned-up and patent-safe BSD-style license which allows proprietary re-use. It is also licensed under the GNU GPL, since unfortunately the GPL and the AFL are incompatible. Users may choose to license TagSoup under either the GPL or the AFL.
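To illustrate what "providing a SAX interface" buys in practice, here is a minimal sketch using only the JDK's own SAX classes. The class name `SaxElementCounter` and its helper methods are hypothetical names made up for this example. It shows that a handler is written against `org.xml.sax.XMLReader`, and that the JDK's strict XML parser rejects typical tag soup. The upstream documentation describes TagSoup's parser class, `org.ccil.cowan.tagsoup.Parser`, as an `XMLReader` implementation; assuming that holds for the packaged version, it could be passed to `countElements` in place of the strict reader without changing the handler. That substitution is an assumption and is not exercised here.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of TagSoup or of the JDK.
public class SaxElementCounter {

    /** SAX handler that counts start-element events by qualified name. */
    public static class CountingHandler extends DefaultHandler {
        final Map<String, Integer> counts = new TreeMap<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            counts.merge(qName, 1, Integer::sum);
        }
    }

    /**
     * Runs any XMLReader over the markup and returns the element counts.
     * The handler depends only on the SAX interfaces, so the reader can be
     * swapped (for example for TagSoup's parser, assuming it is on the
     * classpath) without touching this code.
     */
    public static Map<String, Integer> countElements(XMLReader reader, String markup)
            throws Exception {
        CountingHandler handler = new CountingHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.counts;
    }

    /** The JDK's default, strict XML reader. */
    public static XMLReader strictReader() throws Exception {
        return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        // Well-formed markup is accepted by the strict reader.
        String wellFormed = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println(countElements(strictReader(), wellFormed));

        // HTML "as found in the wild": unclosed <p> and <br>, missing </html>.
        String messy = "<html><body><p>one<p>two<br></body>";
        try {
            countElements(strictReader(), messy);
        } catch (SAXException e) {
            // The strict reader reports a fatal well-formedness error here.
            System.out.println("strict parser rejected the input: " + e.getMessage());
        }
    }
}
```

The design point is that the handler never names a concrete parser. Any tool written against SAX in this way should be reusable on messy HTML once a tolerant `XMLReader` such as TagSoup's is supplied instead of the strict one.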