Andy,

On 10 January 2012 at 22:46, Andy Wingo <wi...@pobox.com> wrote:
> Hi Catonano,
>
> On Fri 30 Dec 2011 23:58, Catonano <caton...@gmail.com> writes:
>
> > I'm a beginner, I never wrote a single line of LISP or Scheme in my life
> > and I'm here for asking for directions and suggestions.
>
> Welcome! :-)
>

Thank you so much for your reply. I had been eagerly waiting for a signal from the list and I had missed it! I'm sorry.

Gmail's learning mechanism still hasn't learned enough about my interest in this topic, so it didn't promptly flag your reply. I had to dig through the folder structure I had laid out to find it.

As for me, I haven't learned enough about the woes of Gmail's learning mechanism. I guess we're both learning now.

Well, I was attempting a joke ;-)

> > my boldness is such that I'd ask you to write for me an example
> > skeleton code.
>
> Hey, it's fair, I think; that is a new part of Guile, and there is not a
> lot of example code.
>

Thanks, Andy, I'm grateful for this. Actually, I managed to set up Geiser, load a file, and get to a prompt in which that file is loaded. Cool ;-)

But there were still some things I didn't know that your post made clear.

> Generally, we figure out how to solve problems at the REPL, so fire up
> your Guile:
>
> $ guile
> ...
> scheme@(guile-user)>
>
> (Here I'm assuming you have guile 2.0.3.)
>
> Use the web modules. Let's assume we're grabbing http://www.gnu.org/,
> for simplicity:
>
> > (use-modules (web client) (web uri))
> > (http-get (string->uri "http://www.gnu.org/software/guile/"))
> [here the text of the web page gets printed out]
>

OK, I had managed to get this far (thanks to the help I received in the Guile channel on IRC).

> Actually there are two return values: the response object, corresponding
> to the headers, and the body. If you scroll your terminal up, you'll
> see that they get labels like $1 and $2.
>

I didn't know there were two values, thanks.

> Now you need to parse the HTML.
> The best way to do this is with the
> pragmatic HTML parser, htmlprag. It's part of guile-lib. So download
> and install guile-lib (it's at http://www.non-gnu.org/guile-lib/), and
> then, assuming the html is in $2:
>

I had seen those $i things, but I hadn't understood that stuff was "inside" them and that I could use them, so I was using a lot of (define this that). And this is probably why I missed the two values returned by http-get. Thanks!

> > (use-modules (htmlprag))
> > (define the-web-page (html->sxml $2))
>

And I didn't know about htmlprag, thanks.

> That parses the web page to s-expressions. You can print the result
> nicely:
>
> > ,pretty-print the-web-page
>

Thanks, I didn't know this either.

> Now you need to get something out of the web page. The hackiest way to
> do it is just to match against the entire page. Maybe someone else can
> come up with an example, but I'm short on time, so I'll proceed to The
> Right Thing -- the problem is that whitespace is significant, and maybe
> all you want is the contents of "the <title> in the <head> in the
> <html>."
>
> So in XML you'd use XPATH. In SXML you'd use SXPATH. It's hard to use
> right now; we really need to steal
> http://www.neilvandyke.org/webscraperhelper/ from Neil van Dyke. But
> you can see from his docs that the thing would be
>
> > (use-modules (sxml xpath))
> > (define matcher (sxpath '(// html head title)))
> > (matcher the-web-page)
> $3 = ((title "GNU Guile (About Guile)"))
>

I was going to attempt something along these lines:

  (sxml-match (xml->sxml page)
    [(div (@ (id "real_player") (rel ,url))) (str

but I'm going to explore your lines too.

I wasn't there yet: I had stumbled on something I thought was a bug, but I also had something else to do (this is a pet project), so this had to wait.

But I'll surely let you know.

Thanks again for your help

Bye
Cato
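
P.S. For anyone following the thread without guile-lib or a network connection at hand, the SXPath step quoted above can be tried on a hand-written tree. This is only a sketch under stated assumptions: `sample-page` is a made-up stand-in for what (html->sxml $2) would return, and `matches` and `title-text` are names invented here for illustration. Only the `sxpath` call and its path come from the quoted example.

```scheme
(use-modules (sxml xpath))

;; Hypothetical stand-in for the tree that (html->sxml $2) would return.
;; The real page is much larger; only the shape matters here.
(define sample-page
  '(*TOP*
    (html
     (head (title "GNU Guile (About Guile)"))
     (body (h1 "About Guile")
           (p "Some text.")))))

;; Same path as in the quoted example: find any html element,
;; then its head child, then the title child of that head.
(define matcher (sxpath '(// html head title)))

;; The matcher returns a list of matching nodes,
;; here ((title "GNU Guile (About Guile)")).
(define matches (matcher sample-page))

;; Take the first match and drop the tag to get the text.
;; Assumption: the title element carries no attribute list;
;; if it did, the second item would be the (@ ...) form instead.
(define title-text
  (if (null? matches)
      #f
      (cadr (car matches))))

(display title-text)
(newline)
```

Saved to a file and run with guile, this should print the title string. With the real page, `sample-page` would be replaced by the result of html->sxml on the body returned by http-get.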