On Wed, Feb 17, 2016 at 9:12 PM, Roy Marples <r...@marples.name> wrote:
> On 17/02/2016 09:02, Ryota Ozaki wrote:
>>> So what events would you choose to skip, if not the scheme that Roy
>>> described?
>>
>> (I think I confused you, sorry...)
>>
>> I rather want to not skip anything as much as possible
>> (except for repeating same events (e.g., up/up/up) because
>> keeping them all changes the original behavior).
>>
>> I intend to skip/eliminate events only if there are too many
>> events happen in a short period (i.e., need queuing) to protect
>> the system from overloading. In that case (it's a very rare case
>> I think), we just drop an earliest event first.
>
> How much is too many and what is a short period?

We can choose a number of events that applications are unlikely to be
able to handle (10 or so). A short period means the interval between
the first interrupt for a link state event and the moment the softint
for link state changes starts running.

> Once you start skipping/eliminating events, how is your solution any
> better? How do you measure some lossage vs some lossage?

Mine doesn't drop events if there are only a few of them, while yours
drops one event even if there are just two. I suppose that a few
events can happen in "a short period" more easily than a dozen or more
can. The latter implies some hardware trouble (or VMM defects?) and
needs special care to protect the system; for example, we give up on
delivering every event. For the former, we shouldn't skip/eliminate
events.

>
> Also, we can't just drop the earliest event first - we have to ensure
> that each state is left in the queue.
> Consider starting in UP:
> DOWN/UNKNOWN/UP/UNKNOWN/UP/UNKNOWN/UP
>
> We cannot just discard the fact it went down because important events
> attached to DOWN won't trigger.

We can preserve DOWN specially if we need to.

>
> Lastly, have we considered the system could be overloaded due to so many
> link state change events? A longer queue or more complicated would only
> make this worse.

Of course I care about that, and so I accept dropping events, but do
you really think just two events cause overload?

>
> From an earlier post of yours:
>> Even if a UP state is transient, it's an event that may provide us a
>> hint of network conditions for diagnostic. We may be able to get it
>> from the console output, but it's not so convenient; we need to
>> track events via two different facilities.
>
> If you're skipping/eliminating events as well then you would also need a
> second facility to record this. Other than scribbling on the console,
> what did you have in mind? Could this be used elsewhere in the system
> where equvialent network assertations are recorded?

I don't plan to provide another facility for notifying events (even if
we provided one, I think nobody would want to use it). Yes, it's a
limitation that we cannot always deliver every event; no objection on
that. But we can still tell that something bad is happening by sending
a bunch of events at once.

  ozaki-r
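
For illustration only, the queuing policy described above could be
sketched in userland C roughly as follows. This is not kernel code and
not a proposed patch: every name here (struct lq, lq_push, LQ_MAX and
so on) is hypothetical, the limit of 10 is just the "10 or so" figure
from above, and the rule for preserving DOWN is one possible reading
of "preserve DOWN specially".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the NetBSD kernel API. */
enum lq_state { LQ_UNKNOWN, LQ_DOWN, LQ_UP };

#define LQ_MAX 10	/* assumed limit, the "10 or so" from the text */

struct lq {
	enum lq_state ev[LQ_MAX];	/* pending events, oldest first */
	size_t len;			/* number of pending events */
	size_t dropped;			/* events discarded on overflow */
};

static void
lq_init(struct lq *q)
{
	memset(q, 0, sizeof(*q));
}

/* Count pending events with state s. */
static size_t
lq_count(const struct lq *q, enum lq_state s)
{
	size_t i, n = 0;

	for (i = 0; i < q->len; i++)
		if (q->ev[i] == s)
			n++;
	return n;
}

/* Discard the event at index i and shift the later events forward. */
static void
lq_drop(struct lq *q, size_t i)
{
	assert(i < q->len);
	memmove(&q->ev[i], &q->ev[i + 1],
	    (q->len - i - 1) * sizeof(q->ev[0]));
	q->len--;
	q->dropped++;
}

static void
lq_push(struct lq *q, enum lq_state s)
{
	/* Skip a repeat of the most recently queued state (up/up/up). */
	if (q->len > 0 && q->ev[q->len - 1] == s)
		return;

	if (q->len == LQ_MAX) {
		/*
		 * Overflow: drop the earliest event, unless it is the
		 * only DOWN in the queue, in which case drop the next
		 * earliest one so that the DOWN is still delivered.
		 */
		size_t victim = 0;

		if (q->ev[0] == LQ_DOWN && lq_count(q, LQ_DOWN) == 1)
			victim = 1;
		lq_drop(q, victim);
	}
	q->ev[q->len++] = s;
}
```

With this sketch, the seven-event sequence quoted above
(DOWN/UNKNOWN/UP/UNKNOWN/UP/UNKNOWN/UP) fits within the limit, so
nothing is dropped. Events are discarded only once more than LQ_MAX
distinct transitions pile up before the queue is drained.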