I would recommend using the open source project HTMLParser (http://htmlparser.sourceforge.net/). It provides an excellent API for parsing HTML files and extracting the relevant text.

-drj
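To illustrate the general idea of pulling indexable text out of a page, here is a minimal sketch. It does not use the HTMLParser project linked above; it uses Python's standard-library html.parser module instead, which happens to have a similar name. The names TextExtractor and extract_text are made up for this example, and the sketch assumes that skipping script and style content is enough for your pages.

```python
from html.parser import HTMLParser  # Python stdlib, not the SourceForge project


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        # Tags such as <img> or <object> carry no text, so they are simply ignored.
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the text of an HTML string as one space-joined line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The returned string can then be handed to whatever indexer you use; images and embedded Flash objects contribute nothing because only text nodes are collected.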
On 8/29/06, James liu <[EMAIL PROTECTED]> wrote:
I want to index HTML, but the pages contain images, Flash, and JavaScript, and I want indexing to be fast. I don't know how to get the plain-text content. Can anyone help me?