Re: SCTP using questions (API etc.)

Randall Stewart Wed, 16 Apr 2008 07:27:29 -0700

Vadim:

Sorry I have not chimed in earlier.. I tend to
"not look" often at some of my boxes :-)


Glad Michael helped out here .. thanks Michael



Vadim Goncharov wrote:

Hi Michael Tuexen!
On Thu, 6 Mar 2008 09:34:13 +0100; Michael Tuexen wrote about 'Re: SCTP using 
questions (API etc.)':
"substreams". SCTP can do it for me, it's wonderful, but in practice
there
are some questions.

How long can be one particular SCTP message? Can I rely on the fact that it
can be unbounded, e.g. I want to emulate a stream with transfer of
6 Gig-sized file?

Protocol wise there is no limitation of the message size. API wise, for
this size of a message you need to use the explicit EOR mode to be able
to pass this large message using multiple sequential send() calls.

And how should I determine from my/remote stack an optimal size for
message parts when entire message is guaranteed to not fit into
buffers/windows of both peers?

If the sendbuffer is too small for the message to fit, the send call
will return -1 and errno being set to EMSGSIZE. Or you do it in the
application by inspecting the sendbuffer size. You do not have to deal
with the recv buffer of the peer.

So this means I need no subscription to unsent messages and simply can try
to resend message in several steps without EOR, after getting EMSGSIZE ?


So, if you have put your socket into EEOR mode, then
you could send multiple sends down the socket (to the same stream)
until you get back an EWOULDBLOCK. You would only get an
EMSGSIZE if the value you are sending is larger than the entire
size of the send socket buffer..

So let's say the send buffer for the socket is 100k

You could do
while (ok) {
    ret = send(1k[index]);
    if (ret == -1 && errno == EWOULDBLOCK) {
        /* hit full buffer (100k is in queue) */
        go do wait or other thing
        resume send(1k[index])
    } else {
        index++;
    }
}

Either all of the buffer you pass to a send() is taken or none of it is,
so index only moves on after a successful send.
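
A rough C version of that loop, as a sketch only. It assumes a non-blocking,
message-oriented socket where each send() is all or nothing, as described
above. The names set_nonblocking and send_pieces are made up for this
example, and marking the final piece with EOR in EEOR mode is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a socket into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Send fixed-size pieces of buf, starting at piece number *index.
 * Returns 0 when every piece has been queued, 1 when the send buffer is
 * full (call again later with the same *index), -1 on any other error.
 * Hypothetical helper: assumes send() takes a whole piece or nothing.
 */
static int send_pieces(int fd, const char *buf, size_t total,
                       size_t piece, size_t *index)
{
    assert(piece > 0);
    size_t npieces = (total + piece - 1) / piece;

    while (*index < npieces) {
        size_t off = *index * piece;
        size_t len = (total - off < piece) ? total - off : piece;
        ssize_t ret = send(fd, buf + off, len, 0);

        if (ret == -1 && (errno == EWOULDBLOCK || errno == EAGAIN))
            return 1;    /* buffer full: this piece was not queued at all */
        if (ret == -1)
            return -1;   /* real error, e.g. EMSGSIZE */
        (*index)++;      /* the whole piece was accepted */
    }
    return 0;
}
```

When send_pieces() returns 1 the caller can go wait for the socket to become
writable (or do other work) and then call it again with the same index.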

Can a message be of zero-length data (only headers) ?

Empty user messages, i.e. a DATA chunk without payload is not allowed.
An empty SCTP message, i.e. only the common header without any chunks
is allowed and processed by FreeBSD when received, but never sent (well,
I do not know a way to force the FreeBSD implementation to send it).

OK, understood. So I should include at least 1 byte of my own headers into
data and do receive into *iov with at least two parts to ensure good
alignment for non-header part?

What header are you talking about? An application header or any SCTP header?
You will never receive any SCTP header as part of a user message via
a recv() call. SCTP will give you as much of a message that fits into
the buffer you provide or it has, if the partial delivery API has been
invoked.

My application-protocol header, of course. Does this mean also that I should
always enable partial delivery on receiving? Or what will happen if received
msg is too big and doesn't fit into my buffers?


Well, you have no control over this per se. You can get partial delivery
events.. there is only one: the partial delivery was aborted.. you probably
need this if you are going to do EEOR mode.

Basically the kernel will start a partial delivery when 1/2 of the recv buffer
is in use. Note there is a socket option to control this value, so you can
change it if you like...
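
On the receive side this means one message can arrive in several pieces. A
minimal sketch of putting the pieces back together is below. It assumes you
call recvmsg() yourself, pass each piece along with the msg_flags value that
call filled in, and that pieces of different messages are not interleaved on
the socket. MSG_EOR marks the last piece of a message. struct reasm,
reasm_add and reasm_reset are names invented for this example, and
notifications have to be filtered out before calling reasm_add.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>   /* MSG_EOR */

/* Accumulates the pieces of one message (hypothetical helper type). */
struct reasm {
    char  *data;
    size_t len;
    size_t cap;
};

/*
 * Append one piece as returned by recvmsg().
 * Returns 1 when the message is complete (MSG_EOR was set), 0 when more
 * pieces are expected, -1 if memory runs out.
 */
static int reasm_add(struct reasm *r, const void *piece, size_t n,
                     int msg_flags)
{
    assert(r != NULL);
    if (r->len + n > r->cap) {
        size_t cap = r->cap ? r->cap : 4096;
        while (cap < r->len + n)
            cap *= 2;
        char *p = realloc(r->data, cap);
        if (p == NULL)
            return -1;
        r->data = p;
        r->cap = cap;
    }
    if (n > 0)
        memcpy(r->data + r->len, piece, n);
    r->len += n;
    return (msg_flags & MSG_EOR) ? 1 : 0;
}

/* Forget the current message but keep the allocation for the next one. */
static void reasm_reset(struct reasm *r)
{
    r->len = 0;
}
```

If a "partial delivery aborted" event shows up, the pieces collected so far
belong to a message that will never complete, so reasm_reset() is the right
reaction there too.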

What is the relation between SCTP streams in both directions? Can streams
be opened and closed on-demand, like SSH port forwarding (yet again
multiplexing example) or they are preallocated at connection setup all
together? What is the minimum number of streams application can rely upon
(or it just one stream 0 in general case) ?

If you restrict to protocols being in RFC status, there is no way of
modifying the number of streams during the lifetime of an association.
The number of streams in each direction is negotiated during the
association setup. The streams in both directions are completely
independent. There is always at least one stream in each direction,
which is stream 0.
However, there is an extension (currently specified in an Internet Draft)
which allows the addition of streams during the lifetime of an association.
The ID is at least partially supported by the FreeBSD implementation.
https://datatracker.ietf.org/drafts/draft-stewart-sctpstrrst/

OK. Are there recommended defaults for various stacks about how many
streams they are creating by default / what maximum of them application
can ever request?

The maximum number to request is 2^16 - 1. It is controllable by the
applications via socket options. Defaults in OSes are in the order of
10, 16, 32...


Can I be sure that every OS can give me maximum number of streams if I
request it?


The ceiling for the number of streams is actually a defined constant
in the BSD stack (and in most).

For BSD it's defined in
sctp_constants.h

and is

#define MAX_SCTP_STREAMS 2048

I probably should make this a configurable item ... hmm..

Each stream outbound costs about 16 bytes.
Each stream inbound costs about 16 bytes...

Thus my desire to limit to some extent resources used.. I think
most kernels do this as well.

You of course can twiddle the define, and I think for 8.x I will see about
making it an option.
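
To put numbers on that, here is a small back-of-the-envelope sketch. The 16
bytes per stream is the approximate figure quoted above, not an exact
structure size, and stream_state_bytes is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SCTP_STREAMS   2048  /* ceiling quoted above from sctp_constants.h */
#define STREAM_STATE_BYTES 16    /* "about 16 bytes" per stream, per direction */

/* Stream ids are 16 bits wide, so the ceiling must fit in that range. */
static_assert(MAX_SCTP_STREAMS <= 65535, "stream ids are 16 bits wide");

/* Approximate per-association memory for stream state, both directions. */
static size_t stream_state_bytes(uint16_t out_streams, uint16_t in_streams)
{
    return ((size_t)out_streams + in_streams) * STREAM_STATE_BYTES;
}
```

With these figures, stream_state_bytes(2048, 2048) comes to 65536 bytes
(64k) per association, while the protocol maximum of 65535 streams each way
would be 2097120 bytes, roughly 2 MB per association.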

How can I put request to kernel for a connect, for example, and then sleep
until connect will complete or event in some another descriptor will occur?

If you use the 1-to-1 style API, it should be similar to using TCP.
Put the socket in non-blocking mode, enable notifications,
call connect() and wait until the fd becomes readable. You should get an
indication that the association has been established or could not
start.

Yes, that's possible, as I see after reading draft-ietf-tsvwg-sctpsocket.

But several more questions arise. What notifications do I really need
on 1-to-1 non-blocking socket API mode? What use is 'context' in this
1-to-1 context and why after a failed send I must receive entire failed
sent message (which can be very long) instead of just an error code?

The context is something you provide in the send call and is given
back to you. So you can use it to find some state/buffer/whatever again.


It was unclear from draft whether context is one per SCTP association or per
send call. And what the hell are all that unsent messages, why I must
retrieve entire unsent message - can I fire-and-forget a 2M msg and receive
only context of it instead of all 2 megs? And on which condition such event
can ever occur - with TCP it's simple, I either do write() a number of bytes
successfully or receive an error from write() - be that EAGAIN for just
blocking of peer's recv() or connection termination error. What concept is
under unsent msgs?



The idea is that you can see the message that did not get sent. And
you can know if it was ever sent .. i.e. put on the wire but
unack'd or never put on the wire.

We don't currently have a way to not get the entire message up (sorry
no one ever asked for that)...

The context is kept per message if I remember right.. It's copied from
the sinfo_context field and then carried with the queued data
until it's acked and freed.

I believe you can set a default context as well..
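
One way an application might use a per-message context is as a key into its
own table of outstanding messages. This is only a sketch: the table and the
pending_* names are invented here, the SCTP calls themselves are left out,
and it assumes the value you put in sinfo_context (a 32-bit field) comes
back unchanged with the failed message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64   /* arbitrary table size for this sketch */

/* What the application wants to find again when a send fails. */
struct pending {
    int         in_use;
    const void *buf;     /* the application's own copy of the message */
    size_t      len;
};

static struct pending table[MAX_PENDING];

/*
 * Remember a message and return the value to store in sinfo_context.
 * Contexts are slot + 1 so that 0 can mean "no context".
 * Returns 0 if the table is full.
 */
static uint32_t pending_add(const void *buf, size_t len)
{
    assert(buf != NULL);
    for (uint32_t i = 0; i < MAX_PENDING; i++) {
        if (!table[i].in_use) {
            table[i] = (struct pending){ 1, buf, len };
            return i + 1;
        }
    }
    return 0;
}

/* Map a context handed back by the stack to its entry, or NULL. */
static struct pending *pending_find(uint32_t context)
{
    if (context == 0 || context > MAX_PENDING || !table[context - 1].in_use)
        return NULL;
    return &table[context - 1];
}

/* Release an entry once the application no longer needs it. */
static void pending_release(uint32_t context)
{
    struct pending *p = pending_find(context);
    if (p != NULL)
        p->in_use = 0;
}
```

The caller would put the return value of pending_add() into sinfo_context
for the send, and on a send-failed event look the entry up with
pending_find() instead of digging through the returned message data.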

In usual FSM I can use kqueue()/kevent() with arbitrary void* to my
data, also telling me how many bytes I can read from or write to
the socket (RCVLOWAT etc.), as well as indicating error/EOF conditions
so I don't need to do additional syscalls. Is this working with SCTP?

Haven't tried it... Sounds like it would make sense to make sure that
it works.


Oh, can you please check it?.. Would be good to support all features
described in kqueue(2).


I rather doubt this works, since we don't use socket buffers.. per se.

I will have to go take a look at it and will probably need to add
that to my TODO list.

Michael just finished getting it to work INET only.. (no v6).. good work
Michael :-D

If I can't write to TCP socket (due to window shortage from peer),
I leave data in my own application buffers, but SCTP tells something
about unsent messages delivered later, looks somewhat weird, do I really
need this? Also, all that msg*/cmsg* stuff is too complex, and I see
there are simpler sctp_send()/sctp_sendx() interfaces, will they be
enough and really simpler for me?..

sctp_sendx() purpose is to use the multiple addresses provided during
the implicit setup of the association. So I think you are not looking for

Ok.

this. sctp_send() can be used to provide the stream id, payload protocol
identifier and so on with using the CMSG stuff. So you might be looking
for this function.


With CMSG? May be you wanted to say 'without' ?..


Yep,

The sctp_xxx send calls are true function calls so they do not
have the intense overhead of the app encoding ancillary data
and the kernel un-encoding it.. much better :-)

How can I put each client to its fd and then do a kqueue()/kevent() on a
set of those fd's (and other items) ? It is very handy to have this
architecture as kevent() allows to store an arbitrary void* in its
structure which I can later use to quickly dispatch events.

And, of course, all this usual C10K-problem-solving-TCP-server tricks I want
with basic SCTP SEQPACKET benefits: multiple streams and message record
separation (I don't need other SCTP features currently). Where can I find
answers to these questions, like it was with W.R.Stevens books for TCP ?..

Have you looked at the third edition of 'Unix Network Programming'?
Randall Stewart wrote a couple of sections covering SCTP...

Unfortunately, I have only 2nd edition currently available here, though
heard about 3rd, yes. May be several months later...

It is really worth buying if you are interested in SCTP socket programming...


I know, but in my province it is currently unavailable for some months...
you know, Siberia, bears walking on the streets :) but it is not critical
for actual SCTP programming (TCP version will be debugged first), but I need
to take architectural decisions now.

Also, are there some examples of real-world SCTP applications with source
code available? May be something is getting to integrate into our base
system?..

I could probably find some of my test code and send it to you..
I have a pretty intensive test app that we use, sctp_test_app, that
does about every socket option etc.. it's not pretty.. (it grew organically)..
but it does cover lots of stuff..

R



--
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 803-317-4952 (cell)
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"