Re: [9fans] notes and traps

2008-08-30 Thread Steve Simon
purely as an aside

this detailed conversation on how notes are handled in the
kernel is very interesting, I was trying to understand this myself
recently and gave up; I will now try again.

Thanks,

-Steve



Re: [9fans] notes and traps

2008-08-30 Thread erik quanstrom
> I'm not claiming anyone to be wrong. I want to solve
> this problem.

i'm not sure there is a problem with notes.  but i really can't
answer that question.  someone with a better understanding
of what notes are supposed to be could answer this.

> I also noted that i think this is the correct behavior in
> the previous mail. My patch is on the other side inpostnote(),
> where the note is put in the up->note[] array, not where
> it decides to kill the proc.

i'm not convinced that you've explained why this change is
correct or why it could solve the problem. after all, the
NNOTED array is only 5 entries long.  what if one gets 5
user notes before the system note?

do you kill the process (isn't this how it works now?),
make notes unreliable or block the sender?  none of these
options seems like an improvement to me.

> >  sleeping in a note handler seems
> > like it's pushing the limits to me. 
> 
> It doesnt matter if we sleep() here. You could do a
> write() or any other syscall instead. Or do the looping 
> and wait for the timer interrupt.

i know.  but sleep gives other processes a large window
to deliver more notes.  what is the external note, by the
way?  a keyboard interrupt?

> Being curious, why is sleep() in a note handler a bad
> idea?

as i see it (please chime in if you disagree), a process in
a note handler is in a funny state.  the kernel treats this
state as funny as it keeps up->notified around
to note (ha!) this fact.

as a general rule, i would think that anything prolonging
this funny state would tend to cause problems.

> Dont tell me... But i cant think of any other way to catch
> linux syscalls for now.

without changing your mechanism, you could move the
work out of the note handler and let the note handler
just set state.

for example, byron's rc ^c handler just set a global
interrupted variable.  there were only three or four
places where this variable was checked.  these locations
were carefully vetted for longjmp safety. and lonjmp'ed
back to the command loop (if interactive).  it doesn't
sound like it would work well, but it worked perfectly.

about the same time byron was working on rc, i wrote
my own shell.  i wasn't convinced that byron's technique
would work or would be more reliable than doing things
from the signal handler.  ... experience says that i was
wrong.

there's also a special dodge of a note continuation handler
for ape available.  this hints that the note interface might
be pushed too far by ape.

- erik



Re: [9fans] notes and traps

2008-08-30 Thread cinap_lenrek
> > I'm not claiming anyone to be wrong. I want to solve
> > this problem.
> 
> i'm not sure there is a problem with notes.  but i really can't
> answer that question.  someone with a better understanding
> of what notes are supposed to be could answer this.
>
> > I also noted that i think this is the correct behavior in
> > the previous mail. My patch is on the other side inpostnote(),
> > where the note is put in the up->note[] array, not where
> > it decides to kill the proc.
> 
> i'm not convinced that you've explained why this change is
> correct or why it could solve the problem. after all, the
> NNOTED array is only 5 entries long.  what if one gets 5
> user notes before the system note?

assuming now that up->note[] is filled up and up->nnote = 5.
the trap happens... we are in sys/src/9/pc/trap.c trpa():
we detect the fault and do a postnote(up, 1, "sys: trap: bla", NDebug)
postnote() will see that p->notified is 0 becasue we are not in the
note handler yet, so it will skip the:

if(flag != NUser && (p->notify == 0 || p->notified))
p->nnote = 0;

then it checks if here is any room to store the note, which is not
the case. then we set p->notepending and exit.

So now, our trap note got lost! We will *not* get killed then... and happily
process the pending 5 user notes.

With my suggested patch, it would have discarded all the notes
and *have* the NDebug note posted. Notify will pick it up and kill
us if here is no handler installed or if it happend while inside the
note handler.

> do you kill the process (isn't this how it works now?),

no, it would lose the note with the current implementation.

> make notes unreliable or block the sender?

yes, trowing usernotes away when something important as an
internal note happend to make sure it gets put in the note array
so we have a chance to process it:

if(n->flag!=NUser && (up->notified || up->notify==0)){
if(n->flag == NDebug)
pprint("suicide: %s\n", n->msg);
qunlock(&up->debug);
pexit(n->msg, n->flag!=NDebug);
}

This was already in the code (see postnote()).  The only thing that i
claim to change is that it doesnt matter if we are notified or not
when posting the note.  We check for it in notify() ayway so the
process *will* be terminated if it causes another trap while inside
its note handler. 

> none of these
> options seems like an improvement to me.

losing the trap note and *not* handling it at all looks like
a bug to me.

> > >  sleeping in a note handler seems
> > > like it's pushing the limits to me. 
> > 
> > It doesnt matter if we sleep() here. You could do a
> > write() or any other syscall instead. Or do the looping 
> > and wait for the timer interrupt.
> 
> i know.  but sleep gives other processes a large window
> to deliver more notes.  what is the external note, by the
> way?  a keyboard interrupt?

External notes are the ones posted to /proc/pid/note[pg].
Like if you hit [DEL], rio writes "interrupt" the the programs
note file running in the window.

> > Being curious, why is sleep() in a note handler a bad
> > idea?
> 
> as i see it (please chime in if you disagree), a process in
> a note handler is in a funny state.  the kernel treats this
> state as funny as it keeps up->notified around
> to note (ha!) this fact.

As i see it, up->notified just means we have put the process
into its note handler, and any further notes that are in the
up->note[] array (that are NUser) get not handled until the
user process calls noted().

Thats exactly how it is documented in notify(2):

Until the program calls noted, any further externally-generated notes
(e.g., hangup or alarm) *[1] will be held off, and any further notes
generated by erroneous behavior by the program (such as divide by
zero) *[2] will kill the program.

[1] that are, NUser notes
[2] NDebug for traps

> as a general rule, i would think that anything prolonging
> this funny state would tend to cause problems.

> > Dont tell me... But i cant think of any other way to catch
> > linux syscalls for now.
> 
> without changing your mechanism, you could move the
> work out of the note handler and let the note handler
> just set state.

Not in this case. I need both notes to interrupt the linux
machine code (which i cant change) and to catch the
syscalls.

With the current handling, the only place where it would be save to
send (external) notes to a process from another is where i'm not in
linux user code (which can trigger a trap when its doing a syscall).
I dont generate traps in the linuxemu code.  (Which would kill me as
documneted in the manpage)

> for example, byron's rc ^c handler just set a global
> interrupted variable.  there were only three or four
> places where this variable was checked.  these locations
> were carefully vetted for longjmp safety. and lonjmp'ed
> back to the command loop (if interactive).  it doesn't
> sound like it would work well, but it worked

Re: [9fans] notes and traps

2008-08-30 Thread cinap_lenrek
> i'm not convinced that you've explained why this change is
> correct or why it could solve the problem. after all, the
> NNOTED array is only 5 entries long.  what if one gets 5
> user notes before the system note?
> do you kill the process (isn't this how it works now?),
> make notes unreliable or block the sender? none of these
> options seems like an improvement to me.

Hm, i think here would be a better way to handle it without
losing user notes while make sure internal system notes
get posted (and handled) in any case.

What if we change postnote to:

int
postnote(Proc *p, int dolock, char *n, int flag)
{
int x, s, ret;
Rendez *r;
Proc *d, **l;

if(dolock)
qlock(&p->debug);

if(flag != NUser){
x = 0;
if(p->nnote < NNOTE){
if(p->nnote)
memmove(&p->note[0], &p->note[1], p->nnote * 
sizeof(p->note[0]));
p->nnote++;
}
} else {
x = -1;
if(p->nnote+1 < NNOTE)
x = p->nnote++;
}

ret = 0;
if(x >= 0) {
strcpy(p->note[x].msg, n);
p->note[x].flag = flag;
ret = 1;
}
p->notepending = 1;
if(dolock)
qunlock(&p->debug);
...

In that case, if a external note is posted, we make sure here is
always room for one further internal note in the array. (see the
p->nnote+1 < NNOTE)

On arrival of a internal note, we move the rest down the array
and put our note in the first entry.

So the following conditions hold true:

- internal notes get *always* posted (so the process gets terminated
  if its not handled or while in the note handler)
- we do not drop external notes on internal note arrival and the sender
  can detect if posting the note failed.
- internal notes get not queued after external notes so they will be
  handled by the next call to notify() and not tick notify() to think
  that while processing a previous user note, that this caused an
  internal note and kill the process before it has a chance to see the
  that note.

Anything wrong here?

--
cinap