Re: Parsing HTML in clojure

2011-06-06 Thread Mukul

Hi, I have worked on a similar project before and found the following link useful: http://blog.prashanthellina.com/2009/07/27/extracting-relevant-text-from-html-pages/ Best regards ~ Mukul Joshi Director & CEO, SpotOn Software Pvt. Ltd. _SpotOn : One stop spot for your mobile development

Re: Parsing HTML in clojure

2011-06-06 Thread Rasmus Svensson

2011/6/6 Base:
> hi all,
>
> I am working on an app that will parse web pages to do some NLP and
> statistics. I am able to parse the HTML using several different tools
> (enlive, HTML parser, etc.). However, I would like to discard all the
> rest of the junk in the web page that is not pertinent

Re: Parsing HTML in clojure

2011-06-06 Thread Base

Hi All - Thanks for your help! I found this last night and it looks pretty promising. It is apparently part of Apache Tika (which I had never heard of until now), which has a lot of interesting functionality! https://boilerpipe-web.appspot.com/ Thanks! On Jun 5, 11:14 pm, Bruce Williams wrot

Re: Parsing HTML in clojure

2011-06-06 Thread Bruce Williams

I looked at HtmlCleaner and it pretty much cleans up the 'syntax' of the HTML but does nothing with the 'semantics' - ads, etc. Bruce Williams Concepts, like individuals, have their histories and are just as incapable of withstanding the ravages of time as are individuals. But in and through all this

Re: Parsing HTML in clojure

2011-06-05 Thread Myriam Abramson

Me too, starting in October. I still need to get up to speed with Clojure however.
On Sun, Jun 5, 2011 at 11:04 PM, Andreas Kostler <andreas.koestler.le...@gmail.com> wrote:
> There's a Java library called HtmlCleaner. You might wanna give that a
> shot.
> Btw, I'm working on quite a similar pro

Re: Parsing HTML in clojure

2011-06-05 Thread Andreas Kostler

There's a Java library called HtmlCleaner. You might wanna give that a shot. Btw, I'm working on quite a similar project so if you like email me and we can maybe join forces.
Andreas
On 06/06/2011, at 11:01 AM, Base wrote:
> hi all,
>
> I am working on an app that will parse web pages to do so
