Re: need some help with tcp/ip programming

Amos Shapira Mon, 14 May 2007 19:50:22 -0700

On 15/05/07, guy keren <[EMAIL PROTECTED]> wrote:

Amos Shapira wrote:
> On 15/05/07, guy keren <[EMAIL PROTECTED]> wrote:
>
>      > I think you are tinkering with semantics and so miss the real
>      > issue (do you work as a consultant? :).
>
>     did you write that to rafi or to me? i'm not dealing with semantics -
>     i am dealing with a real problem, that stable applications have to
>     deal with - when the network breaks, and you never get the close
>     from the other side.
>
>
> I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically
> meant that the TCP connection went down from the other side while you
> seemed to hang on "disconnect" being defined as "cable eaten by an
> alligator" :).

let's leave this subject. i brought it up, because many programmers new
to socket programming are surprised by the fact that a network
disconnection does not cause the socket to close, and that the
connection may stay there for hours.

> As long as Rafi feels happy about the replies that's not relevant any
> more, IMHO.
>
>      > Alas - I think that I've just read not long
>      > ago that there is a bug in Linux' select in implementing just
>      > that and it might miss the close from the other side sometimes
>
>     what you are describing here sounds astonishing - that such a basic
>     feature of the sockets implementation is broken? i find this hard to
>     believe, without clear evidence.
>
>
> Here is something about what I read before, it's the other way around,
> and possibly only relevant to UDP but I'm not sure - if a packet arrives
> with bad CRC, it's possible that the FD will be marked as "ready to
> read" by select but then the packet will be discarded (because of the
> CRC error) and when the process reads the socket it won't get anything.
> That would make the process get a "0 read right after select" which does
> NOT indicate a close from the other side.
>
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html
>
> I don't know what would be a select(2)-based work-around, if required at
> all.

first, it does not return a '0 read'. this situation could have two
different effects, depending on the blocking-mode of the socket.

if the socket is in blocking mode (the default mode) - select() might
state there's data to be read, but recvmsg (or read) will block.

if the socket is in non-blocking mode - select() might state there's
data to be read, but recvmsg (or read) will return with -1, and errno
set to EAGAIN.

in neither case will read return 0. the only time that read is allowed
to return 0, is when it encounters an EOF. for a socket, this happens
ONLY if the other side closed the sending-side of the connection.
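
A minimal sketch of the three outcomes described above, written in Python for brevity (the thread itself is about the C API). The helper names `read_ready` and `wait_and_read` are made up for illustration. Python reports EAGAIN/EWOULDBLOCK as `BlockingIOError` and EOF as an empty bytes object.

```python
import select
import socket

def read_ready(sock, bufsize=4096):
    """Read once from a non-blocking socket.

    Returns ("data", bytes), ("eof", b"") or ("again", b"").
    """
    try:
        data = sock.recv(bufsize)
    except BlockingIOError:
        # EAGAIN/EWOULDBLOCK: nothing to read right now, even if select()
        # reported the socket as readable. This is NOT a close.
        return ("again", b"")
    if data == b"":
        # A zero-length read is EOF: the peer closed its sending side.
        return ("eof", b"")
    return ("data", data)

def wait_and_read(sock, timeout=1.0):
    """Wait with select() until sock is readable, then read once."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return ("timeout", b"")
    return read_ready(sock)
```

A caller would loop on `wait_and_read`, go back to select() on "again", and close its own end only on "eof".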



Is there an on-line reference (or a manual page) to support this?

From what I remember about select, the definition of it returning a "ready
to read" bit set is "the next read won't block", which is trivially true for
non-blocking sockets at any time, and therefore non-blocking sockets weren't
encouraged together with select.

of course, whenever i did select-based socket programming, i always set
the sockets to non-blocking mode. this requires some careful
programming, to avoid busy-waits, but it's the only way to guarantee
fully non-blocking behaviour. and people should also note that the
socket should be set to non-blocking mode before calling connect, and be
ready to handle the peculiar way that the connect call works for
non-blocking sockets.
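
The connect handling mentioned above could look roughly like the following sketch. It is in Python rather than C and assumes POSIX-style behaviour, where a failed connect also makes the socket writable. The function name `connect_nonblocking` and the timeout value are illustrative only.

```python
import errno
import os
import select
import socket

def connect_nonblocking(addr, timeout=5.0):
    """Start a TCP connect on a non-blocking socket and wait for the result.

    Returns the connected socket; raises OSError on failure or timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)        # set non-blocking BEFORE calling connect
    err = sock.connect_ex(addr)    # returns an errno value instead of raising
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))
    if err != 0:
        # The connect is still in progress. The socket becomes writable
        # once the attempt has finished, whether it succeeded or failed.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            sock.close()
            raise OSError(errno.ETIMEDOUT, "connect timed out")
        # SO_ERROR holds the real outcome of the connect attempt.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            sock.close()
            raise OSError(err, os.strerror(err))
    return sock
```

The point of the sketch is that "in progress" is not an error, and that writability alone does not mean success: SO_ERROR has to be checked.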



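Also there is the issue of signals. If you want robust programs then you'll
have to use pselect, which swaps in a signal mask atomically for the duration
of the wait and so closes the race between checking for a signal and blocking
in select.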

doing socket programming without referencing stevens' latest TCP/IP book
is foolish.



Sorry for being foolish, I learned TCP/IP from RFCs and socket programming
from BSD4.2 sources in '86, Stevens' book wasn't available then. :^)
I have since read the early editions of his books (circa early 90's, I
remember reading a volume while the later ones were still "in the making"),
but it's been a while since I had to write a complete C socket program with
select in earnest, and I accept that some interfaces may have changed over
the years.

These days, with pthreads being mainstream, I'd consider using multiple
threads. select() is nice when you absolutely *must* use a single thread
(which was the case back when pthreads hadn't been invented yet, or later
when the various UNIX versions had their own ideas about thread APIs) but if
you have so many connections that multiple threads will become a problem then
a single thread having to cycle through all these connections one by one will
also slow things down. Not to mention the signal problem and just generally
the fact that one connection taking too much time to handle will slow the
handling of other connections.
A possible middle ground might be to select/poll on multiple FDs and then
hand the work to threads from a thread pool, but such a job would be
justifiable only for a large number of connections, IMHO.
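
A rough sketch of that select-plus-thread-pool arrangement, in Python for brevity. The `handle` function is a hypothetical placeholder for the per-connection work (here it just echoes one message back in upper case), and `dispatch_ready` is an illustrative name.

```python
import select
import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn):
    """Hypothetical per-connection work: echo one message back upper-cased."""
    data = conn.recv(4096)
    if data:
        conn.sendall(data.upper())
    return len(data)

def dispatch_ready(socks, pool, timeout=1.0):
    """Wait once on all sockets and submit each readable one to the pool.

    Returns a dict mapping socket -> Future. The caller should leave a
    socket out of the next select() until its future has completed, so
    two workers never read from the same connection at once.
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    return {s: pool.submit(handle, s) for s in readable}
```

Only one thread sits in select(). A slow connection ties up one worker rather than stalling the loop, which is the trade-off discussed above.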

If you insist on using a single thread then select seems to be the underdog
today - poll is just as portable (AFAICT), and Boost ASIO (and I'd expect
ACE) allows making portable code which uses the superior APIs such as
epoll, kqueue and /dev/poll.
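
To illustrate the idea of one portable interface over those APIs, here is a small sketch using Python's standard `selectors` module as a stand-in (it is not Boost ASIO or ACE). Its `DefaultSelector` is an alias for the most efficient mechanism available on the platform, such as epoll, kqueue, /dev/poll, poll or select. The function name `wait_readable` is illustrative.

```python
import selectors
import socket

def wait_readable(socks, timeout=1.0):
    """Return the sockets in socks that are readable within timeout seconds.

    The same code runs on top of epoll, kqueue, /dev/poll, poll or select,
    depending on what the platform provides.
    """
    with selectors.DefaultSelector() as sel:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        return [key.fileobj for key, _ in sel.select(timeout)]
```

A long-running server would keep one selector alive and register sockets once, rather than building a new selector per call as this short example does.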

--Amos
