hi guys! I try to write a module for http download tools. When I finished it, I realized some of the procedures could be the part of module/web/client.scm. So I format a patch. The function of these procedures listed below: http-get-uri-head ==> get the response struct under http method 'HEAD', this head is useful to get information of the remote file. http-client-get-block-from-uri ==> get response body whose size is 'block' which can be specified by user. If you don't specify the 'block', it will be the length of the target file. http-client-get-ready-to-continue ==> returns (values pos fd) If the download target has a part in local space, pos will point to it's broken pointer. fd is the file port of the local target. The rest procedures maybe useful. To avoid download a file incorrectly, I think a checksum or md5 is useful. So I just used the ETag which contains in HTTP protocol. Each target file would generate a "filename.etag" contains the last time ETag. When continue to download file, one may use these procedures to checkout the ETag. http-client-get-check-string ==> get the ETag string from "filename.etag" http-client-checkout-etag ==> checkout if the ETag of HEAD and ETag of "filename.etag" are equal. http-client-remove-check-file ==> when you get a different ETag, you may delete the "filename.etag" http-client-etag-stamp ==> when you first download a file, this procedure could generate "filename.etag" and after you finished your downloading, you need to delete "filename.etag"
Here's an simple 'continue-to-download' example to show how to use these procedures: ----------------------------------code begin--------------------------------------- (define* (http-client-retrive-file-continue uri #:key (path (uri-path uri)) (try 5)) (let ([head (http-get-uri-head uri)]) (call-with-values (lambda () (http-client-get-ready-to-continue uri #:path path #:head head)) (lambda (pos port) (catch #t (lambda () (if (zero? pos) (begin (display "download from beginning") (http-client-etag-stamp uri #:path path)) (format #t "continue from position ~a~%" pos)) (let lp ([data (http-client-get-block-from-uri uri #:start pos #:head head #:block 4096)] [pos pos]) (if data (let* ([dl (bytevector-length data)] [new-pos (+ pos dl)] ) (put-bytevector port data) (force-output port) (format #t "~a-~a~%" pos new-pos) (lp (http-client-get-block-from-uri uri #:start new-pos #:head head #:block 4096) new-pos)) (format #t "~a has already been done!~%" path)))) (lambda e (case (car e) ((system-error) (let ([E (system-error-errno e)]) (if (or (= E ECONNABORTED) (= E ECONNREFUSED) (= E ECONNRESET)) (begin (format #t "~a, try again!~%left ~a times to try~%" (car (cadddr e)) try) (close port) (http-client-retrive-file-continue uri #:path path #:try (1- try)))))) (else (display "some error occured!\n")(newline) (format #t "~a : ~a~%" (car e) (cdr e))))) ))))) ---------------------code end----------------------------- And you may try this: (http-client-retrive-file-continue (string->uri " http://mirrors.kernel.org/gnu/gcc/gcc-4.6.2/gcc-4.6.2.tar.bz2") #:path "mmr.tar.bz2" #:try 10) #:path could be used to specify the local target. If you ignore it, it would be the original file name. #:try is times to try. When try decrease to 0 but downloading has unfinished, it'll quit anyway. Besides, one may use these procedures to build his/her own threads based downloading tools, say, split the remote file into blocks and use 10 threads to down them separably. Any comments?
From ab03251a6c9a476753c5498ba3a75009c37db272 Mon Sep 17 00:00:00 2001 From: NalaGinrut <nalagin...@gmail.com> Date: Thu, 23 Feb 2012 17:46:56 +0800 Subject: [PATCH] add some useful procedures for http client --- module/web/client.scm | 72 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 71 insertions(+), 1 deletions(-) diff --git a/module/web/client.scm b/module/web/client.scm index b035668..fc96284 100644 --- a/module/web/client.scm +++ b/module/web/client.scm @@ -37,9 +37,18 @@ #:use-module (ice-9 rdelim) #:use-module (web request) #:use-module (web response) + #:use-module (web http) #:use-module (web uri) + #:autoload (rnrs io ports) (get-string-all) #:export (open-socket-for-uri - http-get)) + http-get + http-get-uri-head + http-client-get-block-from-uri + http-client-get-check-string + http-client-checkout-etag + http-client-remove-check-file + http-client-etag-stamp + http-client-get-ready-to-continue)) (define (open-socket-for-uri uri) (let* ((ai (car (getaddrinfo (uri-host uri) @@ -114,3 +123,64 @@ (if decode-body? (decode-response-body res body) body))))) + +(define* (http-get-uri-head uri #:key (sock (open-socket-for-uri uri))) + (let ((rq (build-request uri #:method 'HEAD))) + (write-request rq sock) + (force-output sock) + (let ((head (read-response sock))) + (close sock) + head))) + +(define* (http-client-get-block-from-uri uri #:key (block #f) (start 0) + (head (http-get-uri-head uri))) + (let* ((s (open-socket-for-uri uri)) + (end (if block (+ start block) (response-content-length head))) + (range-str (format #f "bytes=~a-~a" start end)) + (range (parse-header 'range range-str)) + (rq (build-request uri #:headers `((range ,@range))))) + (write-request rq s) + (force-output s) + (read-response-body (read-response s)))) + +(define (http-client-get-check-string path) + (let ((target (string-append path ".etag"))) + (if (file-exists? target) + (call-with-input-file target + (lambda (port) + (get-string-all port))) + ""))) + +(define* (http-client-checkout-etag uri #:key (path (uri-path uri)) + (head (http-get-uri-head uri))) + (let* ((etag (car (response-etag head))) + (chk-str (http-client-get-check-string path)) + ) + (string=? etag chk-str))) ;; checkout ETag + +(define* (http-client-remove-check-file path #:key (ext ".etag")) + (let ((chk-file (string-append path ext))) + (and (file-exists? chk-file) (delete-file chk-file)))) + +(define* (http-client-etag-stamp uri #:key (head (http-get-uri-head uri)) + (path (uri-path uri)) + (ext ".etag")) + (let ((chk-file (string-append path ext)) + (etag (car (response-etag head)))) + (and (file-exists? chk-file) (delete-file chk-file)) + (call-with-output-file chk-file + (lambda (port) + (format port "~a" etag) + (close port))))) + +(define* (http-client-get-ready-to-continue uri #:key (path (uri-path uri)) + (head (http-get-uri-head uri))) + (if (http-client-checkout-etag uri #:path path #:head head) ;; checkout ETag + (let* ((fp (open-file path "a")) + (pos (stat:size (stat path)))) + (seek fp pos SEEK_SET) + (http-client-remove-check-file path) + (values pos fp)) + (let* ((fp (open-file path "w")) + (pos 0)) + (values pos fp)))) -- 1.7.0.4