Re: Lightweight lib/way to strip html from text

2012-09-11 Thread Denis Labaye
Hi,
This thread on the Enlive mailing list may be of some interest to you:
[enlive] How to select all user visible text from webpage?
https://groups.google.com/forum/#!msg/enlive-clj/rrY08JdI4Tc/FmDuNjc6w_oJ
Denis
On Thu, Sep 6, 2012 at 7:41 PM, jamieorc wrote:
> Hey all, I'm looking for a

Re: Lightweight lib/way to strip html from text

2012-09-10 Thread Timo Mihaljov
On 06.09.2012 20:41, jamieorc wrote:
> Hey all, I'm looking for a lightweight way to strip html from a long
> String of text and leave just the text. I've come across JSoup, but at
> over 300kb for the lib, not quite lightweight.
>
> Suggestions?
I've found Jericho HTML Parser to be fast, robust

Re: Lightweight lib/way to strip html from text

2012-09-06 Thread Richard Lyman
On Thu, Sep 6, 2012 at 11:41 AM, jamieorc wrote:
> Hey all, I'm looking for a lightweight way to strip html from a long String
> of text and leave just the text. I've come across JSoup, but at over 300kb
> for the lib, not quite lightweight.
>
> Suggestions?
>
> Cheers,
> Jamie
>
When you say 'ht

Re: Lightweight lib/way to strip html from text

2012-09-06 Thread Michael Klishin
2012/9/6 jamieorc
> Hey all, I'm looking for a lightweight way to strip html from a long
> String of text and leave just the text. I've come across JSoup, but at over
> 300kb for the lib, not quite lightweight.
>
> Suggestions?
>
JSoup is a good way to do it. If you need to identify the "main" par