So I ran truss on the master smbd process.
It looks as though something (pid 22086) sent it a SIGTERM and then
followed up with a SIGCONT, perhaps trying to cancel the SIGTERM?
Received signal #18, SIGCLD, in pollsys() [caught]
siginfo: SIGCLD CLD_KILLED pid=22085 status=0x0006
pollsys(0x08047840, 7, 0x080479C8, 0x00000000) Err#4 EINTR
lwp_sigmask(SIG_SETMASK, 0x00030080, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
write(7, "\0", 1) = 1
setcontext(0x08047320)
waitid(P_ALL, 0, 0x08047900, WEXITED|WTRAPPED|WNOHANG) = 0
getuid() = 0 [0]
fstat64(8, 0x080476D0) = 0
write(8, " [ 2 0 1 0 / 1 0 / 2 6 ".., 53) = 53
getuid() = 0 [0]
write(8, " s m b d / s e r v e".., 50) = 50
getuid() = 0 [0]
write(8, " [ 2 0 1 0 / 1 0 / 2 6 ".., 53) = 53
getuid() = 0 [0]
write(8, " S c h e d u l e d ".., 68) = 68
waitid(P_ALL, 0, 0x08047900, WEXITED|WTRAPPED|WNOHANG) = 0
pollsys(0x08047840, 7, 0x080479C8, 0x00000000) = 1
read(6, "\0", 16) = 1
Received signal #15, SIGTERM, in pollsys() [caught]
siginfo: SIGTERM pid=22086 uid=0
pollsys(0x08047840, 7, 0x080479C8, 0x00000000) Err#4 EINTR
lwp_sigmask(SIG_SETMASK, 0x00014080, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
Received signal #25, SIGCONT [default]
siginfo: SIGCONT pid=22086 uid=0
write(7, "\0", 1) = 1
setcontext(0x08047320)
getuid() = 0 [0]
write(8, " [ 2 0 1 0 / 1 0 / 2 6 ".., 54) = 54
getuid() = 0 [0]
write(8, " s e t t i n g s e".., 49) = 49
getuid() = 0 [0]
getgid() = 0 [0]
setgroups(0, 0x00000000) = 0
setregid(-1, 0) = 0
getgid() = 0 [0]
setreuid(-1, 0) = 0
getuid() = 0 [0]
getuid() = 0 [0]
write(8, " [ 2 0 1 0 / 1 0 / 2 6 ".., 56) = 56
getuid() = 0 [0]
write(8, " Y i e l d i n g c".., 26) = 26
fcntl(12, F_SETLKW64, 0x08047690) = 0
fcntl(12, F_SETLK64, 0x080477A0) = 0
fcntl(12, F_SETLK64, 0x080477A0) = 0
fcntl(12, F_SETLKW64, 0x080476E0) = 0
fcntl(12, F_SETLKW64, 0x08047720) = 0
fcntl(12, F_SETLKW64, 0x080477D0) = 0
munmap(0xFE540000, 53248) = 0
close(13) = 0
munmap(0xFE490000, 40200) = 0
close(14) = 0
sigaction(SIGUSR1, 0x080477D0, 0x00000000) = 0
munmap(0xFE6F8000, 86016) = 0
close(5) = 0
close(6) = 0
close(7) = 0
sigaction(SIGCLD, 0x08047870, 0x00000000) = 0
sigaction(SIGHUP, 0x08047870, 0x00000000) = 0
sigaction(SIGTERM, 0x08047870, 0x00000000) = 0
close(31) = 0
close(30) = 0
close(29) = 0
close(28) = 0
getuid() = 0 [0]
write(8, " [ 2 0 1 0 / 1 0 / 2 6 ".., 53) = 53
getuid() = 0 [0]
write(8, " S e r v e r e x i".., 35) = 35
unlink("/var/samba/locks/smbd.pid") = 0
fcntl(17, F_SETLKW64, 0x08047880) = 0
fcntl(17, F_SETLKW64, 0x080478A0) = 0
fstat64(17, 0x08047850) = 0
fcntl(18, F_SETLKW64, 0x08047880) = 0
fcntl(18, F_SETLKW64, 0x080478A0) = 0
fstat64(18, 0x08047850) = 0
time() = 1288063842
fcntl(17, F_SETLKW64, 0x08047810) = 0
fcntl(17, F_SETLKW64, 0x08047850) = 0
fdsync(17, FSYNC) = 0
memcntl(0xFE3EF000, 4132, MC_SYNC, MS_SYNC, 0, 0) = 0
fdsync(17, FSYNC) = 0
memcntl(0xFE3EF000, 24, MC_SYNC, MS_SYNC, 0, 0) = 0
fdsync(17, FSYNC) = 0
memcntl(0xFE37C000, 733184, MC_SYNC, MS_SYNC, 0, 0) = 0
utimensat(AT_FDCWD, "/var/samba/locks/gencache.tdb", 0x00000000, 0) = 0
fdsync(17, FSYNC) = 0
memcntl(0xFE3EF000, 24, MC_SYNC, MS_SYNC, 0, 0) = 0
fcntl(17, F_SETLKW64, 0x08047840) = 0
fcntl(17, F_SETLKW64, 0x08047840) = 0
fcntl(17, F_SETLKW64, 0x08047820) = 0
fcntl(18, F_SETLKW64, 0x08047810) = 0
fcntl(18, F_SETLKW64, 0x08047850) = 0
fdsync(18, FSYNC) = 0
memcntl(0xFE370000, 4132, MC_SYNC, MS_SYNC, 0, 0) = 0
fdsync(18, FSYNC) = 0
memcntl(0xFE370000, 24, MC_SYNC, MS_SYNC, 0, 0) = 0
fdsync(18, FSYNC) = 0
memcntl(0xFE35C000, 126976, MC_SYNC, MS_SYNC, 0, 0) = 0
utimensat(AT_FDCWD, "/var/samba/locks/gencache_notrans.tdb", 0x00000000, 0) = 0
fdsync(18, FSYNC) = 0
memcntl(0xFE370000, 24, MC_SYNC, MS_SYNC, 0, 0) = 0
fcntl(18, F_SETLKW64, 0x08047840) = 0
fcntl(18, F_SETLKW64, 0x08047840) = 0
fcntl(18, F_SETLKW64, 0x08047820) = 0
time() = 1288063842
fcntl(18, F_SETLKW64, 0x08047800) = 0
fcntl(18, F_SETLKW64, 0x08047840) = 0
kill(0, SIGTERM) = 0
Received signal #15, SIGTERM [default]
siginfo: SIGTERM pid=21338 uid=0
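To make the signal sequence in the trace above easier to follow, here is a minimal Python sketch that pulls out each "Received signal" line together with the pid from the siginfo line that follows it. The function name summarize_signals and the regular expressions are my own; they only assume the truss line layout shown in this trace. Note that for SIGCLD the pid in siginfo is the child that changed state, not a sender.

```python
import re

# Matches truss lines such as:
#   Received signal #15, SIGTERM, in pollsys() [caught]
#   Received signal #25, SIGCONT [default]
RECEIVED = re.compile(r"Received signal #(\d+), (\w+)(?:, in (\w+)\(\))? \[(\w+)\]")
# Matches the siginfo line that follows, e.g. "siginfo: SIGTERM pid=22086 uid=0"
SIGINFO = re.compile(r"siginfo: (\w+)(?: (CLD_\w+))? pid=(\d+)")


def summarize_signals(trace):
    """Return one (signal, disposition, pid) tuple per signal seen in the trace."""
    events = []
    pending = None  # (signal name, disposition) waiting for its siginfo line
    for line in trace.splitlines():
        match = RECEIVED.search(line)
        if match:
            pending = (match.group(2), match.group(4))
            continue
        match = SIGINFO.search(line)
        if match and pending and match.group(1) == pending[0]:
            events.append((pending[0], pending[1], int(match.group(3))))
            pending = None
    return events


# Signal lines copied from the trace above; the system calls are left out.
SAMPLE = """\
Received signal #18, SIGCLD, in pollsys() [caught]
siginfo: SIGCLD CLD_KILLED pid=22085 status=0x0006
Received signal #15, SIGTERM, in pollsys() [caught]
siginfo: SIGTERM pid=22086 uid=0
Received signal #25, SIGCONT [default]
siginfo: SIGCONT pid=22086 uid=0
Received signal #15, SIGTERM [default]
siginfo: SIGTERM pid=21338 uid=0
"""

if __name__ == "__main__":
    for event in summarize_signals(SAMPLE):
        print(event)
```

Read this way, the first SIGTERM and the SIGCONT both carry pid 22086, while the last SIGTERM arrives right after smbd's own kill(0, SIGTERM) call.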
On Tuesday, October 26, 2010 10:02 AM, Christopher Chan wrote:
Hi all,
I have a Samba installation that is part of an AD domain and uses OpenDS
as its LDAP backend for both winbind and Samba.
I keep having problems with smbd, although winbindd is fine. On
a separate note, why is nmbd not started? Anyway, back to smbd.
The logs have entries such as this:
[2010/10/26 09:41:05.325972, 3] smbd/server.c:259()
smbd/server.c:258 Unclean shutdown of pid 20181
and this:
[2010/10/26 09:41:05.326256, 1] smbd/server.c:267()
Scheduled cleanup of brl and lock database after unclean shutdown
and stuff like this:
[2010/10/26 09:41:25.327120, 2] lib/messages_local.c:289()
message to process 19991 failed - No such process
[2010/10/26 09:41:25.327219, 2] lib/messages_local.c:379()
pid 19991 doesn't exist - deleting messages record
[2010/10/26 09:41:25.327318, 2] lib/messages.c:127()
pid 19991 doesn't exist - deleting connections -1 []
Core dumps can be found for some of these pids.
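The "No such process" entries are what a pid probe reports when the target has already gone away. As a rough illustration only (this is a generic POSIX technique, not a claim about how Samba's messaging code is written), the sketch below checks a pid by sending signal 0, which delivers nothing and only runs the existence and permission checks. The helper name pid_exists is hypothetical.

```python
import errno
import os


def pid_exists(pid):
    """Probe a pid with signal 0: nothing is delivered, only the checks run."""
    if pid <= 0:
        # Zero and negative values address process groups, not a single process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)
    except OSError as exc:
        if exc.errno == errno.ESRCH:  # "No such process"
            return False
        if exc.errno == errno.EPERM:  # process exists but belongs to someone else
            return True
        raise
    return True


if __name__ == "__main__":
    print(pid_exists(os.getpid()))
```

Running it against the pids named in the log (19991 here) would show whether the process is really gone or merely unreachable.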
Finally, all smbd processes disappear and I find entries such as these in
the SMF logs:
[ Oct 26 09:13:10 Leaving maintenance because clear requested. ]
[ Oct 26 09:13:10 Enabled. ]
[ Oct 26 09:13:10 Executing start method ("/usr/sbin/smbd -D"). ]
[ Oct 26 09:13:10 Method "start" exited with status 0. ]
[ Oct 26 09:32:19 Stopping because process dumped core. ]
[ Oct 26 09:32:19 Executing stop method ("/usr/bin/kill `cat /var/samba/locks/smbd.pid`"). ]
[ Oct 26 09:32:19 Method "stop" exited with status 0. ]
[ Oct 26 09:32:20 Executing start method ("/usr/sbin/smbd -D"). ]
[ Oct 26 09:32:20 Method "start" exited with status 0. ]
[ Oct 26 09:32:27 Stopping because process dumped core. ]
[ Oct 26 09:32:27 Executing stop method ("/usr/bin/kill `cat /var/samba/locks/smbd.pid`"). ]
[ Oct 26 09:32:27 Method "stop" exited with status 0. ]
[ Oct 26 09:32:33 Executing start method ("/usr/sbin/smbd -D"). ]
[ Oct 26 09:32:33 Method "start" exited with status 0. ]
[ Oct 26 09:33:10 Stopping because process dumped core. ]
[ Oct 26 09:33:10 Executing stop method ("/usr/bin/kill `cat /var/samba/locks/smbd.pid`"). ]
[ Oct 26 09:33:10 Method "stop" exited with status 0. ]
[ Oct 26 09:33:17 Executing start method ("/usr/sbin/smbd -D"). ]
[ Oct 26 09:33:17 Method "start" exited with status 0. ]
[ Oct 26 09:33:24 Stopping because process dumped core. ]
[ Oct 26 09:33:24 Executing stop method ("/usr/bin/kill `cat /var/samba/locks/smbd.pid`"). ]
[ Oct 26 09:33:24 Method "stop" exited with status 0. ]
[ Oct 26 09:33:31 Restarting too quickly, changing state to maintenance. ]
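To see how closely the core dumps follow one another before SMF gives up, a small Python sketch like the one below can pull the "dumped core" timestamps out of the service log and print the gaps between them. The function names are mine, and the year is an assumption passed in by hand because the SMF log lines do not carry one.

```python
import re
from datetime import datetime

# One SMF log entry: "[ Oct 26 09:32:19 message text ]".
# DOTALL lets an entry span a wrapped line, as the stop-method entries do.
ENTRY = re.compile(r"\[ (\w{3} +\d+ \d\d:\d\d:\d\d) (.*?) \]", re.DOTALL)


def core_dump_times(log_text, year=2010):
    """Return the timestamps of every 'dumped core' entry, in log order."""
    times = []
    for stamp, message in ENTRY.findall(log_text):
        if message.startswith("Stopping because process dumped core"):
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def gaps_between_core_dumps(log_text, year=2010):
    """Return the seconds elapsed between consecutive core dumps."""
    times = core_dump_times(log_text, year)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


# A shortened copy of the log above, including one wrapped entry.
SAMPLE = """\
[ Oct 26 09:32:19 Stopping because process dumped core. ]
[ Oct 26 09:32:19 Executing stop method ("/usr/bin/kill `cat
/var/samba/locks/smbd.pid`"). ]
[ Oct 26 09:32:27 Stopping because process dumped core. ]
[ Oct 26 09:33:10 Stopping because process dumped core. ]
[ Oct 26 09:33:24 Stopping because process dumped core. ]
[ Oct 26 09:33:31 Restarting too quickly, changing state to maintenance. ]
"""

if __name__ == "__main__":
    print(gaps_between_core_dumps(SAMPLE))
```

On the excerpt above, the last four core dumps fall within roughly a minute of each other, just before the "Restarting too quickly" entry.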
Having to go and run svcadm clear samba is not fun for a production
system...
Backtrace of sample core dump:
pstack smbd.19991.bradsuper1
core 'smbd.19991.bradsuper1' of 19991: /usr/sbin/smbd -D
feb72297 _lwp_kill (1, 6, 8046ee8, feb19f5e) + 7
feb19f6a raise (6, 0, 8046f38, feaf19fa) + 22
feaf1a1a abort (86f02b9, 8992a14, 8046f78) + f2
083a6059 dump_core (13, 8a59dd0, 0) + 211
083b8e9a smb_panic (8719318, 8992a14, 8046f98, 819b07c) + 12e
0819b08f set_unix_security_ctx (c399, c357, 13, 8a59dd0) + 47
0819b15d set_sec_ctx (c399, c357, 13, 8a59dd0, 8a599d0, 64) + b5
0818b69e change_to_user (8a594f8, 64, 8a53780, 871fc28) + 2ee
081b1728 make_connection_snum (8a353b0, 15, 8a437e8, 8a5cb78, 1, 80471f0) + a8c
081b22ef make_connection (8a353b0, 8a8b17d, 8a5cb78, 1, 8a8b210, 64) + 4cf
08167867 reply_tcon_and_X (8a8b050, 8992a14, 80475f8, 81adbd8) + 23f
081adcba switch_message (75, 8a8b050, 5a, 0) + 3ca
081ade86 construct_reply (0, 5a, 0, 0, 0, 0) + de
081ae0ad process_smb (8a353b0, 8a8af70, 5a, 0, 0, 0) + 135
081aeedc smbd_server_connection_read_handler (8a353b0, 871dc40, 0, 81aeef4) + 9c
081aef2d smbd_server_connection_handler (8a35338, 8a37e30, 1, 8a353b0) + 45
083c923c run_events (8a35338, 1, 8047770, 80477f0) + 1e8
081ad56f smbd_server_connection_loop_once (8a353b0, 8992a14, 8047948, 81af85e) + 10f
081af86c smbd_process (0, 0, 40, 10, 2b060002, 1366a8c0) + 6c4
086dd67d smbd_accept_connection (8a35338, 8a85bc8, 1, 8a85b78) + 209
083c923c run_events (8a35338, 1, 8047af0, 8047b70) + 1e8
083c943d s3_event_loop_once (8a35338, 898033c, 8821ffc, 83c9e9a) + 111
083c9f05 _tevent_loop_once (8a35338, 898033c, 8047c78, 86ddf16) + 79
086ddf22 smbd_parent_loop (8a856c0) + 82
086df0ce main (2, 8047e54, 8047e60, 8133baf) + bea
08133c0d _start (2, 8047ef8, 8047f07, 0, 8047f0a, 8047f1e) + 7d
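The hexadecimal words pstack prints for each frame are easier to reason about in decimal. The Python sketch below extracts them for a named function. It rests on two assumptions that the backtrace itself cannot confirm: that the words shown for set_sec_ctx really are its arguments (on 32-bit x86 pstack simply prints stack words), and that the first three are the uid, the gid and the number of supplementary groups. The helper name frame_args is hypothetical.

```python
import re

# One pstack frame: "<address> <function> (<hex words>) + <offset>"
FRAME = re.compile(r"^[0-9a-f]{8} (\w+) \(([^)]*)\)")


def frame_args(pstack_text, function):
    """Return the hex words of the first frame for `function` as integers."""
    for line in pstack_text.splitlines():
        match = FRAME.match(line.strip())
        if match and match.group(1) == function:
            return [int(word, 16)
                    for word in match.group(2).split(",") if word.strip()]
    return None  # the function does not appear in the backtrace


# Frames copied from the backtrace above.
SAMPLE = """\
083b8e9a smb_panic (8719318, 8992a14, 8046f98, 819b07c) + 12e
0819b08f set_unix_security_ctx (c399, c357, 13, 8a59dd0) + 47
0819b15d set_sec_ctx (c399, c357, 13, 8a59dd0, 8a599d0, 64) + b5
"""

if __name__ == "__main__":
    # Assumed meaning of the first three words: uid, gid, group count.
    uid, gid, ngroups = frame_args(SAMPLE, "set_sec_ctx")[:3]
    print(uid, gid, ngroups)
```

If that reading of the arguments holds, the numbers identify which account smbd was switching to when it panicked, which may help narrow down the affected users.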
Please let me know what I can do to help identify the problem and get
this fixed. BTW, this also keeps appearing every time Samba is restarted:
rlimit_max: rlimit_max (256) below minimum Windows limit (16384)
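For anyone wanting to check the limit behind that warning, here is a minimal Python sketch. It assumes the message refers to the per-process open file descriptor limit (RLIMIT_NOFILE), which is my reading of the warning rather than something the log states. The 16384 figure is taken from the message itself, and the helper names are hypothetical.

```python
import resource

WINDOWS_MINIMUM = 16384  # value quoted in the smbd warning above


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def below_minimum(limit, minimum=WINDOWS_MINIMUM):
    """True when a limit is lower than the minimum named in the warning."""
    # RLIM_INFINITY means "no limit", so it can never be too low.
    if limit == resource.RLIM_INFINITY:
        return False
    return limit < minimum


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft:", soft, "below minimum:", below_minimum(soft))
    print("hard:", hard, "below minimum:", below_minimum(hard))
```

The values reported are those of the shell the script runs in, so they only match what smbd sees if it is run from the same environment the service starts in.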
_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss