Sorry I took so long to respond, but I've been assigned to a project in a
place far, far away.
The readn function is admittedly primitive code, but it is also low-level
code that will work w/ protocols other than HTTP. The content-length header
is an HTTP-specific header, and if this code looked for such a value, it
wouldn't work w/ other protocols. So, issues like Keep-alive, HTTP protocol
versions, and headers are meaningless as far as it's concerned.
But I won't defend it any further. I was never particularly happy w/ it (in
fact, I stole it from someone else in my younger days). I'm disavowing it
now, as it does fail under conditions that are uncommon for any single
message but quite likely to turn up eventually, such as message lengths that
are exactly divisible by 4096, or whatever the specified buffer size is.
Here I'll present much better low-level code for reading from a socket
without blocking indefinitely: it waits in select() with a time-out before
each read. This is C++, but only because I wanted to use the C++ string
object to dynamically size the read results (it also throws exceptions).
Set the time-out value for the read by modifying the timeval struct.
Please let me know if this works any better for you.
John
// BEGIN SAMPLE CODE
// -----------------------------
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>   // strlen, memset, memcpy
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <string>
#include <iostream>
using std::string;
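int writeToSocket(int sock, const char *request) {
    // write() may send fewer bytes than requested, so the caller should
    // compare the return value with strlen(request).
    int bytesWritten = write(sock, request, strlen(request));
    return bytesWritten;
}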
std::string read_socket(int sock) {
    char res[1024];
    std::string s;
    for (;;) {
        // select() overwrites the fd_set, and on some systems the timeval
        // too, so both are rebuilt before every call.
        struct timeval tv;
        tv.tv_sec = 10;   // time out after 10 seconds without data
        tv.tv_usec = 0;
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(sock, &readfds);
        if (select(sock + 1, &readfds, NULL, NULL, &tv) <= 0) break;  // time-out or error
        ssize_t r = read(sock, res, sizeof(res));
        if (r < 1) break;   // 0 = peer closed, -1 = error
        s.append(res, r);   // append exactly r bytes, so embedded NULs survive
    }
    return s;
}
int openSocket(const char *host, int port) {
    struct sockaddr_in sockInfo;
    memset(&sockInfo, 0, sizeof(sockInfo));
    sockInfo.sin_family = AF_INET;
    sockInfo.sin_port = htons(port);
    // inet_addr() reports failure with INADDR_NONE, not a negative value
    in_addr_t ipAddress = inet_addr(host);
    if (ipAddress == INADDR_NONE) {
        // Not a dotted-quad address, so try a name lookup
        struct hostent *hostInfo = gethostbyname(host);
        if (hostInfo == NULL || hostInfo->h_addrtype != AF_INET) {
            int i = -3;   // lookup failed (error code added for this case)
            throw i;
        }
        memcpy(&ipAddress, hostInfo->h_addr_list[0], sizeof(ipAddress));
    }
    sockInfo.sin_addr.s_addr = ipAddress;
    // Open the socket
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) {
        int i = -1;
        throw i;
    }
    // And connect
    if (connect(sock, (const struct sockaddr *)&sockInfo, sizeof(sockInfo)) == -1) {
        close(sock);   // don't leak the descriptor
        int i = -2;
        throw i;
    }
    return sock;
}
std::string talk(int sock, std::string in) {
    // Don't wait for a reply if the request could not be sent
    if (writeToSocket(sock, in.c_str()) < 0) return "";
    return read_socket(sock);
}
// ------------------------------------
// END SAMPLE CODE
At 03:37 PM 4/28/00 -0700, Michael Wojcik wrote:
>> -----Original Message-----
>> From: Brian Snyder [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, April 28, 2000 1:21 PM
>>
>> This is a snippet of code from the darkspell gadgets.
>
>[A typical Stevens-style "readn" function.]
>
>> (Replace read with SSL_read and you have a reader for an SSL enabled
>> connection).
>>
>> My question is why does this print out some code that is less than 4096
>> bytes...ever?
>> It seems that this would not return and print the 'buf' until its read
>> 'readSize' (4096) worth of characters.
>>
>> However, short of user interaction to kill the server, how does this work,
>> so that it will receive an arbitrary reply, without hanging forever until
>> waiting on 4096 bytes? IE: Where is it reading the 'content-length' field
>> so it really knows how much data to get?
>
>I haven't looked at the darkspell code, but I'll comment on the behavior of
>the Unix read(2) system call with sockets, which is what we seem to have
>here. You may know all of this already, of course, but it may be helpful
>for others.
>
>When read is called for N bytes on a TCP socket:
>
>- If there is any application data available for reading, read() copies that
>data into the caller's buffer and sets the return value to the number of
>bytes copied.
>
>- If there is no application data available and a TCP FIN has been received
>from the peer for this conversation, read() returns 0, its EOF indicator. A
>FIN (usually) means the remote application has closed the conversation
>cleanly.
>
>- If there is no data available and any of various error conditions occur,
>read() returns -1 and sets errno appropriately.
>
>- If there is neither application data nor a FIN nor an error, read() will
>block until one of those things is received, or it's interrupted by a
>signal. (The latter behavior is somewhat system-dependent.)
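
The first three cases in the list above can be sketched as follows. This is only an illustration under assumptions: it assumes a POSIX system, uses a Unix-domain socket pair in place of a real TCP connection, and describe_read and demo_read_cases are names made up for the example. The fourth (blocking) case is deliberately left out, since it would hang the demo.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>

// Classify the result of a single read() call (hypothetical helper).
// Returns "data:<n>", "eof", or "error:<errno>".
std::string describe_read(int fd) {
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0) return "data:" + std::to_string(n);  // whatever was available
    if (n == 0) return "eof";                        // peer closed cleanly
    return "error:" + std::to_string(errno);         // -1: errno says why
}

// Hypothetical demo: fills out[0..2] with the three outcomes that do not block.
bool demo_read_cases(std::string out[3]) {
    assert(out != nullptr);
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;
    bool sent = (write(sv[1], "hello", 5) == 5);
    if (sent) out[0] = describe_read(sv[0]);  // 5 bytes, not the 4096 asked for
    close(sv[1]);                             // peer closes its end
    if (sent) out[1] = describe_read(sv[0]);  // no data left: EOF
    close(sv[0]);
    if (sent) out[2] = describe_read(sv[0]);  // closed descriptor: EBADF
    return sent;
}
```

The point to notice is the first call: read() is asked for 4096 bytes but returns as soon as it has the 5 that were available.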
>
>The purpose of readn() is to loop calling read() until it reports EOF or an
>error, or it receives a certain amount of data. In effect, readn() is like
>read() except for a more stringent condition for returning data. So yes,
>it'll block until it gets its 4096 bytes - *unless* it gets EOF or an error
>from read().
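
For reference, a readn() in the style described above might look roughly like this. It is a sketch written from the description, not the darkspell code, and demo_short_message is a made-up helper that uses a local socket pair to show the "unless it gets EOF" case.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Keep calling read() until n bytes have arrived, the peer closes (EOF),
// or a real error occurs. Returns the byte count stored, or -1 on error.
ssize_t readn(int fd, void *buf, size_t n) {
    assert(buf != nullptr);
    char *p = static_cast<char *>(buf);
    size_t left = n;
    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;                     // genuine error
        }
        if (r == 0) break;                 // EOF: return the short count
        p += r;
        left -= static_cast<size_t>(r);
    }
    return static_cast<ssize_t>(n - left);
}

// Hypothetical demo: the writer sends 10 bytes and then closes its end.
ssize_t demo_short_message() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    char buf[4096];
    ssize_t sent = write(sv[1], "0123456789", 10);
    close(sv[1]);  // without this close, readn() below would block forever
    ssize_t got = (sent == 10) ? readn(sv[0], buf, sizeof(buf)) : -1;
    close(sv[0]);
    return got;
}
```

Here readn() is asked for 4096 bytes and returns 10, but only because the writer closed the connection. Take out the close() and the call never returns.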
>
>Perhaps the gadget was written for HTTP/1.0, and so doesn't take into
>account persistent connections? If the connection isn't persistent, the
>server will close it after it sends the response. readn() will loop calling
>read() until it's gotten all of the data; then it will get a 0 on the next
>call to read() and return.
>
>If the connection is persistent, and the server sends less than 4096 bytes
>and doesn't close the connection, readn() will indeed block forever (unless
>an error kicks it out).
>
>readn() is meant to be used when the caller knows the peer is supposed to be
>sending that much data. Calling it with a fixed buffer size when you could
>be receiving variable-length data is simply wrong. If the caller wants to
>receive data until the remote application closes its end, readn() will work,
>but it hides what the caller really wants to do. In any case, a more robust
>application would probably employ some kind of timeout processing using
>poll() or its equivalent (or multithreading, though that design has certain
>drawbacks) to get out of the read() if data isn't received in a certain
>timeframe - regardless of whether it knows how much data to receive.
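
The poll()-based timeout mentioned above could be sketched like this, assuming a POSIX system. The name read_with_timeout and the -2 return code for a timeout are conventions invented for the example.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Wait up to timeout_ms for fd to become readable, then read once.
// Returns bytes read (>0), 0 on EOF, -1 on error, -2 on timeout.
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms) {
    assert(buf != nullptr);
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    int rc;
    do {
        // Simplification: a retry after a signal restarts the full timeout.
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc < 0) return -1;
    if (rc == 0) return -2;     // nothing arrived within the timeout
    return read(fd, buf, len);  // data, EOF, or error: read() reports which
}
```

A caller would typically loop on this, appending each chunk to its buffer and giving up when it sees 0, -1, or -2.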
>
>And, of course, a good HTTP/1.1 application should be paying attention to
>the Content-Length header if present, or the Transfer-Encoding, or
>whatever's applicable to that particular flow. (Content-Length isn't
>present if the "chunked" Transfer-Encoding is being used. See RFC 2616.)
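
As a rough illustration of paying attention to Content-Length, the helper below pulls the value out of a block of response headers that has already been read. The function name content_length is hypothetical, and the parsing is deliberately simplistic: it ignores folded headers, duplicate headers, and chunked encoding.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: return the Content-Length value found in a block of
// HTTP headers, or -1 if the header is absent or has no numeric value.
long content_length(const std::string &headers) {
    const std::string name = "content-length:";
    // Header names are case-insensitive, so search a lower-cased copy.
    std::string lower(headers);
    for (char &c : lower) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    std::string::size_type pos = 0;
    while ((pos = lower.find(name, pos)) != std::string::npos) {
        // Only accept a match at the start of a line.
        if (pos == 0 || lower[pos - 1] == '\n') {
            std::string::size_type i = pos + name.size();
            while (i < lower.size() && (lower[i] == ' ' || lower[i] == '\t')) ++i;
            if (i < lower.size() && std::isdigit(static_cast<unsigned char>(lower[i]))) {
                return std::strtol(lower.c_str() + i, nullptr, 10);
            }
            return -1;  // header present but value is not a number
        }
        pos += name.size();  // matched inside another header name; keep looking
    }
    return -1;
}
```

With the length in hand, the caller knows exactly how many body bytes to ask for, which is the situation readn() was designed for.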
>
>Michael Wojcik [EMAIL PROTECTED]
>MERANT
>Department of English, Miami University
>______________________________________________________________________
>OpenSSL Project http://www.openssl.org
>User Support Mailing List [EMAIL PROTECTED]
>Automated List Manager [EMAIL PROTECTED]