RE: "select" returns error "Bad file descriptor" when called with a copy of "svc_fdset" (defined at "rpc.h") as it's readfds argument

Dave Korn Mon, 28 Jun 2004 07:16:21 -0700

> -----Original Message-----
> From: cygwin-owner On Behalf Of Dave Korn
> Sent: 28 June 2004 14:13


> > -----Original Message-----
> > From: cygwin-owner On Behalf Of Dave Korn
> > Sent: 28 June 2004 13:24
> 
> > > -----Original Message-----
> > > From: cygwin-owner On Behalf Of Lev Pliner
> > > Sent: 28 June 2004 12:38
> > 
> > > I once again ask you to help me to solve my problem. I 
> > > attached an easy
> > > example that works under Linux and FreeBSD.
> > 
> > 
> >   Your code doesn't even compile.  There's no such include 
> > file as rpc/rpc.h
> > on cygwin.
> 
>   Not by default, anyway.  Downloading and installing the 
> sunrpc package might help me try out your testcase......

Right.  Done it and I think I've found the problem.  Basically, it's these
lines:

---------->SNIP!<----------
  int size = getdtablesize ( );
        ......
  if ( argc > 1 ) while ( 1 )
  {
    readfds = svc_fdset;

    switch ( select ( size, &readfds, NULL, NULL, NULL ) )
---------->SNIP!<----------

  An fd_set is a variably-sized array.  Each entry is a 32-bit integer
that holds one bit for each of (up to) 32 file handles.  So the first
32-bit integer contains a single bit for each of the file handles 0-31;
the second 32-bit integer contains a bit for each of file handles 32-63;
and so on.
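
  To make that layout concrete, here's a rough sketch.  The toy_* names are
made up for illustration, and I'm assuming the usual convention that fd N
lives in bit (N % 32) of word (N / 32), counting from the least significant
bit; the real FD_SET/FD_ISSET macros are what you'd actually use.

```c
#include <stdint.h>

#define TOY_WORD_BITS 32   /* bits per array entry, as described above */

/* Which array entry holds the bit for descriptor fd. */
int toy_word_index(int fd)
{
    return fd / TOY_WORD_BITS;
}

/* Mask that selects fd's bit inside that entry (assumed LSB-first). */
uint32_t toy_bit_mask(int fd)
{
    return UINT32_C(1) << (fd % TOY_WORD_BITS);
}

/* Toy equivalents of FD_SET and FD_ISSET on a plain word array. */
void toy_fd_set(int fd, uint32_t *words)
{
    words[toy_word_index(fd)] |= toy_bit_mask(fd);
}

int toy_fd_isset(int fd, const uint32_t *words)
{
    return (words[toy_word_index(fd)] & toy_bit_mask(fd)) != 0;
}
```

  So fds 0 and 31 both land in the first word, and fd 32 is the lowest bit
of the second.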

  In order to save space, however, the system doesn't require you to
allocate any more 32-bit ints in your fd_set array than the minimum you need
for the highest file descriptor you want to handle.  So if you know your app
is never going to have more than 32 open files you could get away with using
a single 32-bit int; up to 64, two of them; or maybe if you knew you'd never
have more than 256 files you could make do with only 8 32-bit words in each
fd_set.

The struct defined in sys/types.h as an fd_set is a default:

---------->SNIP!<----------
/*
 * Select uses bit masks of file descriptors in longs.
 * These macros manipulate such bit fields (the filesystem macros use chars).
 * FD_SETSIZE may be defined by the user, but the default here
 * should be >= NOFILE (param.h).
 */
#  ifndef       FD_SETSIZE
#       define  FD_SETSIZE      64
#  endif

typedef long    fd_mask;
#  define       NFDBITS (sizeof (fd_mask) * NBBY)       /* bits per mask */
#  ifndef       howmany
#       define  howmany(x,y)    (((x)+((y)-1))/(y))
#  endif

/* We use a macro for fd_set so that including Sockets.h afterwards
   can work.  */
typedef struct _types_fd_set {
        fd_mask fds_bits[howmany(FD_SETSIZE, NFDBITS)];
} _types_fd_set;

#define fd_set _types_fd_set
---------->SNIP!<----------

  In other words, the default fd_set has only two 32-bit integers making it
up; it can represent file descriptors in the range 0-63.  Other sizes of
fd_set are possible; this is just a default for the sake of making the
header files work.

  Now, given that an fd_set can be any size, and you only pass a pointer to
the first word of that fd_set, how does select (or any other function) know
how many 32-bit words there actually are at the address you passed?  Well,
that's what the size argument to select is for.  Normally you'd make your
fd_sets big enough to handle the maximum number of open files you'll ever
have in your application, and you pass that number to select as well.  For
instance if you're dealing with up to 64 files, you'd need two 32-bit ints
per fd_set; you'd pass 64 as the 'size' argument to select, and then select
knows that each fd_set has two ints.  If you passed 256 as the 'size'
argument to select, it'd know that your fd_set had eight 32-bit words in it.
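
  The arithmetic is just a rounding-up division, the same thing the howmany
macro in the header does.  A tiny sketch (toy_words_for is a made-up name,
and 32 bits per word is assumed as above):

```c
#define TOY_WORD_BITS 32   /* assumed width of each fd_set word */

/* Number of 32-bit words needed to hold bits for fds 0..nfds-1.
   Same rounding-up division as howmany(nfds, NFDBITS). */
int toy_words_for(int nfds)
{
    return (nfds + (TOY_WORD_BITS - 1)) / TOY_WORD_BITS;
}
```

  With that, toy_words_for (64) is 2 and toy_words_for (256) is 8, so the
'size' argument is what tells select how far past your pointer it may read.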

  Now, the default fd_set size as we've seen above is 64, or two words.  So
when we come to the line

    readfds = svc_fdset;

you're copying two 32-bit words from svc_fdset to readfds.  But the value of
size returned from getdtablesize is 256!  So when you pass that value as
size along to select, along with the pointer to readfds, select thinks that
readfds has eight 32-bit words; it reads all of them to find out which file
descriptors you want to select on.  But you only copied _two_ 32-bit words
from svc_fdset!  So the remaining six 32-bit words (which select expects to
contain bitmasks for fds 64-255) are just random stuff from the stack.  Some
of that will be 1 bits, so select thinks you've specified a whole bunch
of other file handles you want to do a select on, and when it finds out
they're not valid file handles it complains.
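
  You can see the effect without touching select at all.  The sketch below
is only a simulation: the names are invented, the "garbage" is a value you
pass in rather than real stack contents, and 32-bit words are assumed.  It
overwrites the first two words of an eight-word region and then counts the
bits a scanner would find if told the set covers 256 fds.

```c
#include <stdint.h>
#include <string.h>

#define TOY_WORD_BITS 32

/* Count set bits beyond the words that were really copied, scanning
   up to the number of descriptors the caller claimed. */
int toy_stray_bits(const uint32_t *set, int valid_words, int claimed_nfds)
{
    int stray = 0;
    for (int fd = valid_words * TOY_WORD_BITS; fd < claimed_nfds; fd++) {
        if (set[fd / TOY_WORD_BITS] & (UINT32_C(1) << (fd % TOY_WORD_BITS)))
            stray++;
    }
    return stray;
}

/* Simulate the short copy: the region starts out holding 'garbage' in
   every word, then only two words are overwritten by the assignment. */
int toy_simulate_short_copy(uint32_t garbage)
{
    uint32_t src[2] = { UINT32_C(1) << 3, 0 };   /* only fd 3 requested */
    uint32_t region[8];

    for (int i = 0; i < 8; i++)
        region[i] = garbage;
    memcpy(region, src, sizeof src);

    return toy_stray_bits(region, 2, 256);
}
```

  Every stray bit is an fd you never asked for, and it only takes one of
them not being an open file for select to fail.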

  The answer is that you can't copy an unknown fd_set from a library or
other application by simply assigning one variable to another, as you've
written above.  You're dealing with a variably-sized array; you have to know
how big it is, allocate an equivalent one, and copy it.  Something like this
should work: (note that 'howmany' and 'NFDBITS' are from sys/types.h above)

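// needs <stdlib.h> and <string.h> for malloc and memcpy
// find out how many fds there is space for in svc_fdset
int size = getdtablesize ( );
// calculate how many bytes that is: howmany() gives the number of
// fd_mask words, so multiply by the size of each word.
size_t fd_set_size = howmany (size, NFDBITS) * sizeof (fd_mask);
// allocate an fd_set of the same size
fd_set *readfds = (fd_set *) malloc (fd_set_size);
// copy the entire contents across
memcpy (readfds, &svc_fdset, fd_set_size);
// readfds is now a pointer, so pass it to select without the '&'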
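
  For something you can compile and poke at, here's a self-contained sketch
of the same idea.  Treat it as an illustration under assumptions rather than
a drop-in fix: fdset_bytes, fdset_dup and demo_select_on_copy are names I
made up, the local 'master' set stands in for svc_fdset, and instead of
howmany/NFDBITS it rounds up to whole default-sized fd_sets so that only
POSIX names are needed.  It also zero-fills the tail of the copy rather than
copying the library's full array, which avoids the garbage bits either way.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Bytes needed for a set covering fds 0..nfds-1, rounded up to a whole
   number of default-sized fd_sets (hypothetical helper). */
size_t fdset_bytes(int nfds)
{
    size_t sets = ((size_t)nfds + FD_SETSIZE - 1) / FD_SETSIZE;
    return (sets ? sets : 1) * sizeof(fd_set);
}

/* Allocate a zeroed set big enough for nfds descriptors and copy the
   first src_bytes bytes of src into it.  The caller frees the result. */
fd_set *fdset_dup(const fd_set *src, size_t src_bytes, int nfds)
{
    size_t bytes = fdset_bytes(nfds);
    fd_set *copy = calloc(1, bytes);

    if (copy != NULL && src != NULL)
        memcpy(copy, src, src_bytes < bytes ? src_bytes : bytes);
    return copy;
}

/* Demo: put one byte in a pipe, select on a correctly sized copy, and
   return select's result (1 when the pipe's read end is reported ready). */
int demo_select_on_copy(int nfds)
{
    int p[2];
    int ready = -1;
    struct timeval tv = { 1, 0 };   /* don't block forever in a demo */
    fd_set master;                  /* stands in for svc_fdset */

    if (pipe(p) != 0)
        return -1;
    if (p[0] < nfds && p[0] < FD_SETSIZE && write(p[1], "x", 1) == 1) {
        fd_set *readfds;

        FD_ZERO(&master);
        FD_SET(p[0], &master);
        readfds = fdset_dup(&master, sizeof master, nfds);
        if (readfds != NULL) {
            ready = select(nfds, readfds, NULL, NULL, &tv);
            if (ready == 1 && !FD_ISSET(p[0], readfds))
                ready = -1;
            free(readfds);
        }
    }
    close(p[0]);
    close(p[1]);
    return ready;
}
```

  In your program you'd pass the value from getdtablesize as nfds; the
point is only that the buffer select reads is as big as the size you tell
it about.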


  Another thing that makes your example work is to leave the assignment as
it is:

    readfds = svc_fdset;

  Since we know that line only copies the default fd_set size from the
definition above, we know that it will have copied a maximum of FD_SETSIZE
bits-worth of integers.  So we could just limit the value of size that we
pass to select:

    switch ( select ( FD_SETSIZE, &readfds, NULL, NULL, NULL ) )

because at root, you were originally copying a big array into a smaller one
(truncating the end), but then passing select the size of the larger one.
This way you're at least giving select the correct size of the array you're
actually passing to select, rather than the size of the array from which you
originally copied the data but aren't passing to select.  However that has
the problem that if your service fd was ever >= FD_SETSIZE (64) then the
truncation of the array would leave readfds with no bits set at all!  So
allocating readfds to be the correct size based on the known size of
svc_fdset (that you get from getdtablesize) is the better solution.
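
  The truncation problem is easy to demonstrate with a toy model.  As
before this is only a simulation with invented names, assuming 32-bit words,
an eight-word (256 fd) source set and a two-word (64 fd) destination:

```c
#include <stdint.h>
#include <string.h>

#define TOY_WORD_BITS   32
#define TOY_BIG_WORDS   8   /* source set: 256 descriptors */
#define TOY_SMALL_WORDS 2   /* destination set: 64 descriptors */

/* Set one fd in the big set, copy only the first two words, and report
   whether any bit survived in the truncated copy (1) or not (0). */
int toy_survives_truncation(int fd)
{
    uint32_t big[TOY_BIG_WORDS] = { 0 };
    uint32_t small[TOY_SMALL_WORDS];

    if (fd < 0 || fd >= TOY_BIG_WORDS * TOY_WORD_BITS)
        return 0;
    big[fd / TOY_WORD_BITS] |= UINT32_C(1) << (fd % TOY_WORD_BITS);

    /* This is what a struct assignment of the smaller type amounts to. */
    memcpy(small, big, sizeof small);

    for (int i = 0; i < TOY_SMALL_WORDS; i++) {
        if (small[i] != 0)
            return 1;
    }
    return 0;
}
```

  An fd of 63 survives the copy, but 64 or anything higher is silently
dropped, and select would then wait on an empty set.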

  A third solution would be to "#define FD_SETSIZE 256" in your code before
#including sys/types.h, which means that the default fd_set struct it
defines would match the ones being used by the rpc library.  But that's not
a robust solution; the rpc library might decide to use a different
descriptor table size one day.


Summary:

  The problem is that you pass select an fd_set of one size, but tell it
you've passed it one of a different size.  It doesn't help any that in your
application the size of fd_set is defined at compile time, but you don't
know what size the library uses until runtime.  The three possibilities are:

A) #define FD_SETSIZE 256 
 - has the disadvantage of being a compile time constant; what if one day
getdtablesize returns a larger value, perhaps because a newer version of the
library uses more fds?

B) size = FD_SETSIZE;
 - has the advantage of telling select the true size of the data you're
passing it, but doesn't deal with the problems caused if the data gets
truncated.

C) Get size from getdtablesize and then allocate your fd_sets according to
that size.
 - Should work correctly for all default system fd_set sizes and for all rpc
library dtable sizes.

  


    cheers, 
      DaveK
-- 
Can't think of a witty .sigline today....


