In case no one offers a better library, enclosed is a small one that I recently created for a web-scraping task.
Start a simulation of a browser with `make-connection`, use `goto!` to follow a link to a relative URL (following redirects), and use `back!` to go back. The `goto!` function returns two values: the headers as a string and the page content as bytes.

Beware: My application accessed a single site, so this library doesn't attempt to do the right thing with cookies across sites.

At Wed, 08 Jan 2014 03:48:44 -0800, Duncan Bayne wrote:
> Hi All,
>
> I'm trying to re-write some Common Lisp web-scraping code in Racket.
>
> In Common Lisp, I'm POSTing a login request, and storing the cookie-jar
> for subsequent GETs:
>
> (defun login (username password)
>   "Logs in to www.example.com. Returns a cookie-jar containing
>   authentication details."
>   (let ((cookie-jar (make-instance 'drakma:cookie-jar)))
>     (drakma:http-request "http://www.example.com/login"
>                          :method :post
>                          :parameters `(("username" . ,username) ("password" . ,password))
>                          :cookie-jar cookie-jar)
>     cookie-jar))
>
> ; snip
>
> (defun get-page (page-num cookie-jar)
>   "Downloads a potentially invalid HTML page containing data to scrape.
>   Returns a string containing the HTML."
>   (let ((url (concatenate 'string "http://www.example.com/data/"
>                           (write-to-string page-num))))
>     (let ((body (drakma:http-request url :cookie-jar cookie-jar)))
>       (if (search "No data found." body)
>           nil
>           body))))
>
> However, I can't find an equivalent in Racket. The latest HTTP
> library[1] makes no mention of cookies at all, and AFAICT the cookie
> library[2] seems more about correctly serializing and deserializing
> them.
>
> Can anyone suggest a way of re-writing the above CL in Racket without
> having to implement a bunch of header-parsing stuff?
>
> TIA for any help ...
>
> [1] https://github.com/plt/racket/blob/master/racket/collects/net/http-client.rkt
> [2] http://docs.racket-lang.org/net/cookie.html
>
> --
> Duncan Bayne
> ph: +61 420817082 | web: http://duncan-bayne.github.com/ | skype: duncan_bayne
>
> I usually check my mail every 24 - 48 hours. If there's something
> urgent going on, please send me an SMS or call me.
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users
connection.rkt
Description: Binary data
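
The attachment's contents do not appear in this message. As a rough illustration only, an interface with the three names mentioned above (`make-connection`, `goto!`, `back!`) might be sketched as follows. Everything beyond those three names is an assumption: the struct layout, the `#:fetch` keyword, the helper functions, the redirect limit, and the behavior of `back!` are hypothetical and are not taken from connection.rkt. The sketch issues GET requests only, so a POST login like the one in the quoted Common Lisp code is not covered, and it keeps a single cookie table for the whole connection, which matches the single-site caveat.

```racket
#lang racket
;; Hypothetical sketch; NOT the contents of the attached connection.rkt.
(require net/url)

;; A connection holds a fetch procedure, the current URL, a history stack,
;; and one name -> value cookie table shared by every request (single site).
(struct connection (fetch [current #:mutable] [history #:mutable] cookies))

;; Default fetcher (assumption): plain GET via net/url, returning the raw
;; header string and the body bytes.
(define (http-fetch u extra-headers)
  (define in (get-impure-port u extra-headers))
  (define headers (purify-port in))
  (define body (port->bytes in))
  (close-input-port in)
  (values headers body))

(define (make-connection start-url #:fetch [fetch http-fetch])
  (connection fetch (string->url start-url) '() (make-hash)))

;; Store the name=value part of each Set-Cookie header; attributes such as
;; Path, Domain, and Expires are ignored in this sketch.
(define (remember-cookies! conn headers)
  (for ([m (in-list (regexp-match* #rx"(?mi:^set-cookie: *([^=;\r\n]+)=([^;\r\n]*))"
                                   headers #:match-select cdr))])
    (hash-set! (connection-cookies conn) (first m) (second m))))

;; Build the Cookie request header (names sorted for a stable order).
(define (cookie-headers conn)
  (define cookies (connection-cookies conn))
  (define names (sort (hash-keys cookies) string<?))
  (if (null? names)
      '()
      (list (string-append
             "Cookie: "
             (string-join (for/list ([n (in-list names)])
                            (format "~a=~a" n (hash-ref cookies n)))
                          "; ")))))

;; Follow a link relative to the current page, following up to 10 redirects.
;; Returns two values: the headers as a string and the content as bytes.
(define (goto! conn relative)
  (define-values (final headers body)
    (let loop ([target (combine-url/relative (connection-current conn) relative)]
               [redirects-left 10])
      (define-values (hdrs content)
        ((connection-fetch conn) target (cookie-headers conn)))
      (remember-cookies! conn hdrs)
      (define location (regexp-match #rx"(?mi:^location: *([^\r\n]*))" hdrs))
      (if (and location
               (regexp-match? #rx"^HTTP/[0-9.]+ 30[12378]" hdrs)
               (positive? redirects-left))
          (loop (combine-url/relative target (cadr location)) (sub1 redirects-left))
          (values target hdrs content))))
  (set-connection-history! conn (cons (connection-current conn)
                                      (connection-history conn)))
  (set-connection-current! conn final)
  (values headers body))

;; Step back one page. Here this only resets the current URL; it does not
;; re-fetch anything.
(define (back! conn)
  (match (connection-history conn)
    ['() (error 'back! "no previous page")]
    [(cons previous older)
     (set-connection-history! conn older)
     (set-connection-current! conn previous)]))

;; Possible usage (performs a real request):
;;   (define conn (make-connection "http://www.example.com/"))
;;   (define-values (headers body) (goto! conn "data/1"))
```

The fetch procedure is passed in as a keyword argument so that the cookie and redirect logic can be exercised without network access.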