Dan Nelson wrote:
In the last episode (Dec 03), Support (Rudy) said:
Below is part of the cron...  Seems like any random cronjob can get
clogged up... load varies from 0.2 to 1.0 on this dual-core box.  I
rebooted the box, but the cron processes continue to slowly pile up.

One of the cronjobs that is 'stuck' is this one: /root/bin/raid-status.sh
which can be found here:
 http://www.monkeybrains.net/~rudy/example/raid_status.html

Forgot to mention, I am running:
  6.2-STABLE FreeBSD 6.2-STABLE #3: Thu May 31 01:18:15 PDT 2007

OH, ps shows this:
58383  ??  D      0:00.00 cron: running job (cron)
58384  ??  IVs    0:00.00 cron: running job (cron)

In general, when troubleshooting, "ps axlw" is a more useful command.
It adds, among other columns, MWCHAN, which shows exactly what a
process stuck in the D state is waiting on.
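
As a minimal sketch (not something from the original exchange), the D-state
rows and their wait channels can be pulled out of "ps axlw" output with a
small awk filter.  The function name stuck_procs is hypothetical.  The filter
assumes the header row names the PID, MWCHAN and STAT columns, and that no
column before STAT is empty or contains spaces.

```shell
# Hypothetical helper: read "ps axlw" output on stdin and print the PID
# and wait channel of every process whose state starts with D.
stuck_procs() {
    awk '
        NR == 1 {
            # Locate the columns by header name instead of hard-coding
            # positions, since the layout differs between ps versions.
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")    pid = i
                if ($i == "MWCHAN") wchan = i
                if ($i == "STAT")   stat = i
            }
            next
        }
        # Print only when all three columns were found in the header.
        pid && wchan && stat && $stat ~ /^D/ { print $pid, $wchan }
    '
}
```

It would be used as "ps axlw | stuck_procs".  If the header has no MWCHAN
column, the filter prints nothing rather than guessing a position.
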
Anyway, cron does a fork and then a vfork, creating a child (process 2)
and a grandchild (process 3).  I'm sort of surprised at the amount of
code between vfork and exec in the grandchild in
/src/usr.sbin/cron/cron/do_command.c .  Since process 3 is actually
using process 2's address space, one must be extremely careful not to
modify static variables or change other global state that would affect
the parent once it resumes execution.  All the logging,
environment-setting, and user-context calls are certain to mess with
the parent's state, especially with nss modules in the mix.  I'd
personally recompile cron with all vforks replaced with fork and see
what happens.

It couldn't hurt to update to a newer kernel version along the RELENG_6
branch as a test, I guess.  Note that your uname will change to
6.3-PRERELEASE, but apart from causing lsof to complain, you should be
okay.

/var/log/cron has this entry:
Dec  3 20:16:00 pita /usr/sbin/cron[58384]: (root) CMD (/root/bin/raid-status.sh CRON)

BUT there is no 'raid-status.sh' stuck in the "ps axw" output.  Seems like
the vfork set off the cronjob, it ran, but then cron didn't 'stop'
executing.  Any debugging tips?

Can you tell if raid-status.sh ever ran?  I.e., is process 2
stuck at the start of vfork or at the end?

I added this line to the top of my cronjob:
 logger -t DEBUG "$0: $$"
and cron seems stuck BEFORE the script is ever run. Whether it sticks or not appears random, as plenty of log lines are showing up with the output of the logger command in my /var/log/messages.

# tail /var/log/messages
Dec 13 11:16:00 pita DEBUG: /root/bin/raid-status.sh: 64414
Dec 13 12:00:00 pita DEBUG: /root/bin/raid-status.sh: 80115
Dec 13 12:00:00 pita DEBUG: /root/bin/raid-status.sh: 80119
Dec 13 12:11:00 pita DEBUG: /root/bin/raid-status.sh: 84283
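
The single logger line at the top of the script only shows that a run
started.  To separate "never started" from "started but never finished", one
option is to log the exit as well.  The following is an illustrative sketch
only, not what was run on this box: traced and LOG_CMD are hypothetical names,
and LOG_CMD exists solely so the logger call can be swapped out when trying
the helper by hand.

```shell
# Hypothetical helper: log a line before and after running a command.
# By default the lines go to syslog via "logger -t DEBUG", matching the
# one-line version already added to the cronjob.
traced() {
    ${LOG_CMD:-logger -t DEBUG} "$1: start, shell pid $$"
    "$@"
    rc=$?
    ${LOG_CMD:-logger -t DEBUG} "$1: exit status $rc"
    return "$rc"
}
```

A wrapper script could then call "traced /root/bin/raid-status.sh CRON".  A
cron CMD entry with no matching "start" line would point at a hang before the
script ran, while a "start" line without an "exit status" line would point at
the script itself.
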

Here is the ps output:
# ps axlw
  UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
    0 85939 82253   0   8  0  2148  1560 ppwait D     ??    0:00.00 cron: running job (cron)
    0 85940 85939   0   4  0  2148  1560 sbwait IVs   ??    0:00.00 cron: running job (cron)
# grep 85940 /var/log/cron
Dec 13 12:16:00 pita /usr/sbin/cron[85940]: (root) CMD (/root/bin/raid-status.sh CRON)

- Rudy
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions