RE: renegotiating problem - connection hanging?

David Schwartz Wed, 21 Jun 2006 15:37:33 -0700

> David you are bringing completely unrelated issues into the situation.


        No, you are failing to understand my argument.

> David Schwartz wrote:
> ...SNIP...
> >     One other point, I didn't mention threads to argue that if another
> > thread steals your data, the operation will clearly block. I mentioned it
> > to show that it's impossible for 'select' to guarantee even that the next
> > operation will block without breaking valid code. (Because that would
> > require kernel omniscience to divine the intent of the programmer.)
> >
> >     Consider:
>
> Yes we are all aware of that and its just another unrelated side track.

        No, it's not.

> To disarm this point too it is possible to know for sure that no other
> process or threads have access to your file descriptor.  Duh!  Also do
> you take threaded programming so lightly that you think you can nick and
> borrow file descriptors in your application willy nilly.  Duh!

        That has nothing to do with anything. The point is not what the
application can know or do, the point is what the implementation of 'read' or
'write' can know or do.

        One thread detecting that a socket is writable and then asking another
thread to do a write is hardly borrowing file descriptors willy nilly.

> >     Should that write block or not? If you really think 'select' could ever
> > guarantee that a future operation will not block, then the kernel should
> > remember the 'select' hit and return immediately from that 'write'. However,
> > the implementation has no way to know that the call to write from thread B
> > had anything to do with the call to 'select' from thread A. Perhaps the code
> > is unrelated and thread B needs normal blocking behavior. (Think of the
> > bizarre race conditions this would cause.)

> Err you do when you are a competent multi-threaded programmer, this
> stuff is all basic schooling.  You can not use the same SSL context from
> two threads at the same time anyway, the high level API calls are not
> thread safe.

        What does that have to do with my point?!

        My point is that 'select' cannot guarantee that a future operation will
not block because there is no way to tell whether a given operation is supposed
to be a "future operation" that must not block or a normal operation that
should block because the socket is blocking.

> And YES select can guarantee the next operation would not block in the
> circumstances we are talking about.  Otherwise those applications would
> be broken by design and they are not.

        No, it cannot. I *SHOWED* *WHY*. Because there is no way that either
'select' or the subsequent operation can pair themselves (without kernel
omniscience).

        The kernel sees a 'select', then it sees an operation on a blocking
socket. The kernel has no way to know that you are thinking of that operation
as subsequent to the select and I demonstrated cases where they can be
incorrectly paired. So the kernel has no way to assure that that next
operation will not block without breaking normal blocking semantics.

> The situation is the _NORMAL_ single process, single thread has created
> a file descriptor associated with a network socket which is set in
> blocking mode.  Nothing else on the host has access to that file
> descriptor because we created it since execve()/fork() was last called.
>
> There is no point complicating matters by side tracking issues concerning:
>
> * What is another thread does something with the fd
> * What is another process has access to the same fd (dup across
> fork()/exec())
>
> All of these issues are non-starters and unrelated to the problem being
> discussed, we are all aware of those issues.

        The same problem occurs with one thread. Consider the following code,
assume blocking sockets:

1) do some stuff
2) do a huge write, don't check for short writes since our socket is
blocking

        Now you come along and say "the kernel can ensure that a select hit
ensures a subsequent operation will not block". I say, what happens if I do the
following:

1) do some stuff
1.5) do a 'select' to log which sockets are readable and writable for
statistical purposes
2) do a huge write, don't check for short writes since our socket is
blocking

        Your proposal, having 'select' ensure the subsequent 'write' does not
block, will *break* my code. And my reply to you would be, "what should I do?
request blocking sockets again with an 'I REALLY MEAN IT THIS TIME' flag?"
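
        A minimal sketch of the three steps above, using Python's 'socket' and
'select' modules as stand-ins for the C calls. The function name
'log_then_bulk_write' and the use of a local socket pair are assumptions made
for illustration. The 'select' call is there only to log status; the bulk write
on the blocking socket must still block until the peer has drained the data.

```python
import select
import socket
import threading


def log_then_bulk_write(payload):
    """Poll for statistics, then do one large blocking write.

    Returns (was_writable, bytes_received_by_peer).
    """
    a, b = socket.socketpair()  # both ends are blocking by default
    received = bytearray()

    def drain():
        # Peer side: read until the writer closes its end.
        while True:
            chunk = b.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()

    # Step 1.5: a zero-timeout 'select' used purely for logging.
    _, writable, _ = select.select([], [a], [], 0)
    was_writable = a in writable

    # Step 2: the huge write. The socket is blocking, so this call is
    # expected to wait for buffer space however 'select' answered above.
    # sendall() stands in for the "don't check for short writes" write.
    a.sendall(payload)
    a.close()

    reader.join()
    b.close()
    return was_writable, len(received)
```

The payload is chosen larger than a typical socket buffer so that the write
cannot complete in one step. If the 'select' hit had silently turned the write
into a non-blocking one, the code would have lost data.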

> The problem at hand is that ideally we want the two parallel blocking
> modes of the SSL layer to be direct equivalents to the host machines two
> blocking modes at the socket layer.  This is allows transparency which
> means you application doesn't need any design change.

        I agree. 'SSL_read' should block until application data is available,
just as a TCP 'read' does.

> Of course there is a well defined "subsequent" since the poll/select
> event system in the kernel and the file descriptor io buffers which
> drive those triggers have appropriate locking in place to make it well
> defined behaviour.

        They do not, see my example above. What if 'select' gives me both a
'read' hit and a 'write' hit? I do a 'read'. Then later there's a 'write'. Is
that 'write' subsequent? What if another 'select' just for 'read' intervenes?
Is the 'write' still subsequent?

        This is madness. The fact that you asked for blocking semantics simply
*has* to override the fact that the operation might be 'subsequent' for some
imprecise definition of subsequent.

> If you believe what you say is true please point at the kernel
> implementation that works the way you say it does.  Linux does not work
> this way, it works the way Mikhail and I have explained.

        I'm not talking about any particular implementation, I'm talking about
the relevant standards and what guarantees you actually have.

> No its not that the next read is non-blocking.  Its that the next read()
> has data to read or EOF or error condition to report.  Because of that
> the next invocation of a related system call will behave not blocking.
> The select indicates that event is ready waiting and pending inside the
> kernel for the application to pull from the socket.

        What if the error went away by the time I call 'read'? You *must* mean
that the kernel is *obligated* to keep the error until a subsequent 'read'
occurs. There is nothing else you can possibly mean. Your interpretation
requires 'select' to make the next operation non-blocking, which requires
some way to determine when that next operation occurs.

        Consider a 'select' read hit on a UDP socket. Your interpretation
*requires* the kernel not to discard that packet or a subsequent 'read' can
block. When precisely can the kernel go back to its normal discard process?
Your interpretation absolutely does require the kernel to unambiguously pair
'select' operations to "subsequent" operations.

> As Mikhail pointed out in another email, you have not explained what
> scenario can exist where that pending event disappears ?  If another
> process or thread issues a read()/recvfrom()/recvmsg()/recv() on that
> file descriptor after the select returns its a given that it may clear
> the pending event.  Again we all know this but its not related to the
> problem being discuessed.

        That I cannot think of a way it can break does not mean it's guaranteed
to work. The guarantees you have are the ones in POSIX.



> As I said before you start off with a working app that uses poll/select
> event loop for timeouts but sometimes goes off and wants to do bulk
> blocking io.  Whatever the io paragim is for the platform the
> application runs within that.  So when you convert it over to SSL you
> want to keep your IO layer driven from your poll/select loop.  If the
> SSL layer only did one syscall per high-level call the design of your
> app stays the same and the SSL layer gets invoked when there is
> work to do.
>
> This is a standard way of writing an app.  If you take OpenSSL out of
> the app the app works.

        The app does not work. Consider a case where the kernel detects an error
and gives a read hit on select. By the time the app calls 'read' the error
has cleared, so the read blocks forever. Please tell me where the standards
for 'select' and 'read' guarantee that this will not happen.

> Again one last call to please prove your point by code or
> implementation.  Lets see your system where a select event can vanish
> when no application layer call has been made relating to it.

        See the paragraph above. A 'select' event can vanish due to timeouts or
network packets being received. (Please show me where the standard
guarantees it cannot, you cannot do it.)

        You are the one arguing from folklore: that because you have never seen
it happen, it cannot happen.

> But if you take OpenSSL out of the application, the application works
> fine.

        It works fine until it doesn't. Suppose someone added OpenSSL by a
preload or other intercept. Now it doesn't work. You blame that on OpenSSL. I
blame it on the application.

> Not one solid technical reason has been given to explain why it
> has to be like.

        The solid technical reason is this simple:

        1) A blocking read should block until application data is available.

        2) Nothing guarantees that an operation after a select will not block.

> I am agreed with Mikhail that it makes the application
> programers view of OpenSSL mode complicated than it needs to be.

        I totally disagree. It makes it *precisely* the same as TCP. You make
blocking calls if you want to block, non-blocking ones if you don't.
Simplicity itself.
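
        As a hedged sketch of the non-blocking side of that rule (Python's
'socket' and 'select' modules standing in for the C calls; 'read_if_ready' is a
hypothetical helper name): if the code must never block after a 'select' hit,
it puts the socket in non-blocking mode and treats "would block" as an ordinary
outcome rather than relying on the hit.

```python
import select
import socket


def read_if_ready(sock, timeout):
    """Wait up to 'timeout' seconds, then try one read that cannot block.

    Returns the bytes read, or None if nothing could be read right now.
    """
    sock.setblocking(False)  # ask for non-blocking semantics explicitly
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # timed out, nothing reported
    try:
        return sock.recv(4096)
    except BlockingIOError:
        # The 'select' hit was only a hint. Whatever made the socket
        # look readable is gone, so go back to waiting.
        return None
```

A caller would loop on this from its event loop. A blocking caller would
skip the 'select' and simply call 'recv' on a blocking socket.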

> I am
> lucky in that all my development with OpenSSL is non-blocking all the
> time but I fully understand the other IO programming model.  You don't
> appear to, you have incorrect beliefs on how things work and give far
> too much weight to unrelated issues that have no impact in the real
> world.  Maybe you are in academia :).

        No, I have to fix people's code when it breaks. It breaks because they
ASSUME that because it has always worked, even though it was never guaranteed
to work, it will continue to work. You are specifically endorsing that
broken assumption.

> >     Okay, I'll shut up now. This is just one of my pet peeves because it's a
> > bug I have to frequently track down and fix and I'm getting tired of people
> > evangelizing *for* the bug and encouraging people to make it.

> That paragraph went over my head.  What bug ?  Which bug ?

        The bug of assuming that a 'select' assures that a subsequent operation
will not block and then blocking because something unexpected happened
between the 'select' and the operation.

        The instant case is just such an example. The unexpected thing was that
the data cleared because it turned out not to be application data.
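
        A toy model of that case, not OpenSSL itself: the record format, the
type codes and the 'ToyRecordReader' class below are all invented for
illustration, loosely modelled on how a TLS-like layer frames data. 'select'
reports the socket readable because bytes arrived, but those bytes are a
handshake record, so the application-level read has nothing to return.

```python
import select
import socket
import struct

# Hypothetical record types for this toy protocol.
HANDSHAKE, APP_DATA = 0x16, 0x17


def make_record(rtype, payload):
    # 1-byte type, 2-byte big-endian length, then the payload.
    return struct.pack("!BH", rtype, len(payload)) + payload


class ToyRecordReader:
    """Non-blocking reader that hands only application data to the caller."""

    def __init__(self, sock):
        self.sock = sock
        self.sock.setblocking(False)
        self.buf = b""

    def read_app_data(self):
        """Return one application payload, or None if none is available.

        EOF handling is omitted to keep the sketch short.
        """
        try:
            self.buf += self.sock.recv(65536)
        except BlockingIOError:
            pass  # no bytes at the socket layer right now
        while len(self.buf) >= 3:
            rtype, length = struct.unpack("!BH", self.buf[:3])
            if len(self.buf) < 3 + length:
                break  # incomplete record, wait for more bytes
            payload = self.buf[3:3 + length]
            self.buf = self.buf[3 + length:]
            if rtype == APP_DATA:
                return payload
            # Any other record is consumed internally and never
            # reaches the application.
        return None
```

With a blocking socket, the same read would have to go back to waiting after
consuming the handshake record, which is exactly the hang that surprises code
relying on the 'select' hit.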

        DS



______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           [EMAIL PROTECTED]
