[OpenAFS-devel] fileserver crash on Solaris 2.6 with 1.2.7

Martin MOKREJŠ mmokrejs@natur.cuni.cz
Fri, 13 Dec 2002 03:58:08 +0100 (CET)


After running "bos restart " on one of those two Solaris machines, I attached
truss to bosserver to see what it is doing.

319:    poll(0x0009B080, 1, 60000)                      = 1
319:    recvmsg(3, 0x000CA298, 0)                       = 32
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    sendmsg(3, 0x000CA030, 0)                       = 44
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    poll(0x000C82B0, 1, 0)                          = 1
319:    recvmsg(3, 0x000CA298, 0)                       = 140
319:    stat("/usr/afs/etc/CellServDB", 0x000C9CC4)     = 0
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    poll(0x000C82B0, 1, 0)                          = 0
319:    time()                                          = 1039745780
319:    time()                                          = 1039745780
319:    access("/usr/afs/local/NoAuth", 0)              Err#2 ENOENT
319:    time()                                          = 1039745780
319:    open("/usr/afs/etc/UserList", O_RDONLY)         = 5
319:    read(5, " m m o k r e j s . a d m".., 4096)     = 24
319:    close(5)                                        = 0
319:    kill(4908, SIGQUIT)                             = 0
319:    kill(4909, SIGTERM)                             = 0
319:        Received signal #18, SIGCLD [caught]
319:          siginfo: SIGCLD CLD_KILLED pid=4909 status=0x000F
319:    setcontext(0xEFFFF318)
319:    kill(4910, SIGTERM)                             = 0
319:        Received signal #18, SIGCLD [caught]
319:          siginfo: SIGCLD CLD_KILLED pid=4910 status=0x000F
319:    setcontext(0xEFFFF388)
319:    kill(4911, SIGTERM)                             = 0
319:        Received signal #18, SIGCLD [caught]
319:          siginfo: SIGCLD CLD_KILLED pid=4911 status=0x000F
319:    setcontext(0xEFFFF388)
319:    waitid(P_ALL, 0, 0x000A90F8, WEXITED|WTRAPPED|WNOHANG) = 0
319:    time()                                          = 1039745780
319:    open("/usr/afs/logs/BosLog", O_WRONLY|O_APPEND|O_CREAT, 0666) = 5
319:    llseek(5, 0, SEEK_END)                          = 29960
319:    fstat64(5, 0x000A82A8)                          = 0
319:    ioctl(5, TCGETA, 0x000A8234)                    Err#25 ENOTTY
319:    write(5, " F r i   D e c   1 3   0".., 55)      = 55
319:    close(5)                                        = 0
319:    stat("/usr/afs/logs/core", 0x000A8FF0)          Err#2 ENOENT
319:    waitid(P_ALL, 0, 0x000A90F8, WEXITED|WTRAPPED|WNOHANG) = 0
319:    time()                                          = 1039745780
319:    open("/usr/afs/logs/BosLog", O_WRONLY|O_APPEND|O_CREAT, 0666) = 5
319:    llseek(5, 0, SEEK_END)                          = 30015
319:    fstat64(5, 0x000A82A8)                          = 0
319:    ioctl(5, TCGETA, 0x000A8234)                    Err#25 ENOTTY
319:    write(5, " F r i   D e c   1 3   0".., 55)      = 55
319:    close(5)                                        = 0
319:    stat("/usr/afs/logs/core", 0x000A8FF0)          Err#2 ENOENT
319:    waitid(P_ALL, 0, 0x000A90F8, WEXITED|WTRAPPED|WNOHANG) = 0
319:    time()                                          = 1039745780
319:    open("/usr/afs/logs/BosLog", O_WRONLY|O_APPEND|O_CREAT, 0666) = 5
319:    llseek(5, 0, SEEK_END)                          = 30070
319:    fstat64(5, 0x000A82A8)                          = 0
319:    ioctl(5, TCGETA, 0x000A8234)                    Err#25 ENOTTY
319:    write(5, " F r i   D e c   1 3   0".., 53)      = 53
319:    close(5)                                        = 0
319:    stat("/usr/afs/logs/core", 0x000A8FF0)          Err#2 ENOENT
319:    waitid(P_ALL, 0, 0x000A90F8, WEXITED|WTRAPPED|WNOHANG) = 0
319:        Received signal #18, SIGCLD [caught]
319:          siginfo: SIGCLD CLD_EXITED pid=4908 status=0x0000
319:    setcontext(0x0009CD70)
319:    waitid(P_ALL, 0, 0x000A90F8, WEXITED|WTRAPPED|WNOHANG) = 0
319:    time()                                          = 1039745780
319:    open("/usr/afs/logs/BosLog", O_WRONLY|O_APPEND|O_CREAT, 0666) = 5
319:    llseek(5, 0, SEEK_END)                          = 30123
319:    fstat64(5, 0x000A82A8)                          = 0
319:    ioctl(5, TCGETA, 0x000A8234)                    Err#25 ENOTTY
319:    write(5, " F r i   D e c   1 3   0".., 53)      = 53
319:    close(5)                                        = 0
319:    unlink("/usr/afs/local/SALVAGE.fs")             = 0
319:    waitid(P_ALL, 0, 0x000A90F8, WEXITED|WTRAPPED|WNOHANG) Err#10 ECHILD
319:    fork()                                          = 4938
4938:   fork()          (returning as child ...)        = 319
4938:   close(3)                                        = 0
4938:   close(4)                                        = 0
4938:   close(5)                                        Err#9 EBADF
4938:   close(6)                                        Err#9 EBADF
4938:   close(7)                                        Err#9 EBADF
4938:   close(8)                                        Err#9 EBADF
4938:   close(9)                                        Err#9 EBADF
4938:   close(10)                                       Err#9 EBADF
4938:   close(11)                                       Err#9 EBADF

Is the child process trying to close file descriptors that were already
closed by the parent?

4938:   close(12)                                       Err#9 EBADF
4938:   close(13)                                       Err#9 EBADF
4938:   close(14)                                       Err#9 EBADF
4938:   close(15)                                       Err#9 EBADF
4938:   close(16)                                       Err#9 EBADF
4938:   close(17)                                       Err#9 EBADF
4938:   close(18)                                       Err#9 EBADF
4938:   close(19)                                       Err#9 EBADF
4938:   close(20)                                       Err#9 EBADF
4938:   close(21)                                       Err#9 EBADF
4938:   close(22)                                       Err#9 EBADF
4938:   close(23)                                       Err#9 EBADF
4938:   close(24)                                       Err#9 EBADF
4938:   close(25)                                       Err#9 EBADF
4938:   close(26)                                       Err#9 EBADF
4938:   close(27)                                       Err#9 EBADF
4938:   close(28)                                       Err#9 EBADF
4938:   close(29)                                       Err#9 EBADF
4938:   close(30)                                       Err#9 EBADF
4938:   close(31)                                       Err#9 EBADF
4938:   close(32)                                       Err#9 EBADF
4938:   close(33)                                       Err#9 EBADF
4938:   close(34)                                       Err#9 EBADF
4938:   close(35)                                       Err#9 EBADF
4938:   close(36)                                       Err#9 EBADF
4938:   close(37)                                       Err#9 EBADF
4938:   close(38)                                       Err#9 EBADF
4938:   close(39)                                       Err#9 EBADF
4938:   close(40)                                       Err#9 EBADF
4938:   close(41)                                       Err#9 EBADF
4938:   close(42)                                       Err#9 EBADF
4938:   close(43)                                       Err#9 EBADF
4938:   close(44)                                       Err#9 EBADF
4938:   close(45)                                       Err#9 EBADF
4938:   close(46)                                       Err#9 EBADF
4938:   close(47)                                       Err#9 EBADF
4938:   close(48)                                       Err#9 EBADF
4938:   close(49)                                       Err#9 EBADF
4938:   close(50)                                       Err#9 EBADF
4938:   close(51)                                       Err#9 EBADF
4938:   close(52)                                       Err#9 EBADF
4938:   close(53)                                       Err#9 EBADF
4938:   close(54)                                       Err#9 EBADF
4938:   close(55)                                       Err#9 EBADF
4938:   close(56)                                       Err#9 EBADF
4938:   close(57)                                       Err#9 EBADF
4938:   close(58)                                       Err#9 EBADF
4938:   close(59)                                       Err#9 EBADF
4938:   close(60)                                       Err#9 EBADF
4938:   close(61)                                       Err#9 EBADF
4938:   close(62)                                       Err#9 EBADF
4938:   close(63)                                       Err#9 EBADF
319:    open("/usr/afs/local/SALVAGE.fs", O_RDWR|O_CREAT|O_TRUNC, 0666) = 5
319:    close(5)                                        = 0
319:    fork()                                          = 4940
4940:   fork()          (returning as child ...)        = 319
4940:   close(3)                                        = 0
4940:   close(4)                                        = 0
4940:   close(5)                                        Err#9 EBADF
4940:   close(6)                                        Err#9 EBADF
4940:   close(7)                                        Err#9 EBADF
4940:   close(8)                                        Err#9 EBADF
4940:   close(9)                                        Err#9 EBADF
4940:   close(10)                                       Err#9 EBADF
4940:   close(11)                                       Err#9 EBADF
4940:   close(12)                                       Err#9 EBADF
4940:   close(13)                                       Err#9 EBADF
4940:   close(14)                                       Err#9 EBADF
4940:   close(15)                                       Err#9 EBADF
4940:   close(16)                                       Err#9 EBADF
4938:   execve("/usr/afs/bin/fileserver", 0xEFFFF328, 0xEFFFFF14)  argc = 1
4938:   resolvepath("/usr/lib/ld.so.1", "/usr/lib/ld.so.1", 1023) = 16
4938:   open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
4938:   open("/usr/lib/libpthread.so.1", O_RDONLY)      = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
4938:   mmap(0x00000000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF7B0000
4938:   mmap(0x00000000, 81920, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF790000
4938:   mmap(0xEF7A3000, 1404, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 12288) = 0xEF7A3000
4938:   munmap(0xEF794000, 61440)                       = 0
4938:   open("/dev/zero", O_RDONLY)                     = 4
4938:   mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0xEF780000
4938:   memcntl(0xEF790000, 12160, MC_ADVISE, 0x0003, 0, 0) = 0
4938:   close(3)                                        = 0
4938:   open("/usr/lib/libsocket.so.1", O_RDONLY)       = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
4938:   mmap(0xEF7B0000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xEF7B0000
4938:   mmap(0x00000000, 102400, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF760000
4938:   mmap(0xEF777000, 4089, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 28672) = 0xEF777000
4938:   mmap(0xEF778000, 388, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 4, 0) = 0xEF778000
4938:   munmap(0xEF768000, 61440)                       = 0
4938:   memcntl(0xEF760000, 12072, MC_ADVISE, 0x0003, 0, 0) = 0
4938:   close(3)                                        = 0
4938:   open("/usr/lib/libresolv.so.2", O_RDONLY)       = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
4938:   mmap(0xEF7B0000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xEF7B0000
4938:   mmap(0x00000000, 139264, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF730000
4938:   mmap(0xEF74D000, 5979, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 53248) = 0xEF74D000
4940:   close(17)                                       Err#9 EBADF
4940:   close(18)                                       Err#9 EBADF
4940:   close(19)                                       Err#9 EBADF
4940:   close(20)                                       Err#9 EBADF
4940:   close(21)                                       Err#9 EBADF
4940:   close(22)                                       Err#9 EBADF
4940:   close(23)                                       Err#9 EBADF
4940:   close(24)                                       Err#9 EBADF
4940:   close(25)                                       Err#9 EBADF
4940:   close(26)                                       Err#9 EBADF
4940:   close(27)                                       Err#9 EBADF
4940:   close(28)                                       Err#9 EBADF
4940:   close(29)                                       Err#9 EBADF
4940:   close(30)                                       Err#9 EBADF
4940:   close(31)                                       Err#9 EBADF
4940:   close(32)                                       Err#9 EBADF
4940:   close(33)                                       Err#9 EBADF
4940:   close(34)                                       Err#9 EBADF
4940:   close(35)                                       Err#9 EBADF
4940:   close(36)                                       Err#9 EBADF
4940:   close(37)                                       Err#9 EBADF
4938:   mmap(0xEF74F000, 10684, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 4, 0) = 0xEF74F000
4938:   munmap(0xEF73E000, 61440)                       = 0
4938:   memcntl(0xEF730000, 11548, MC_ADVISE, 0x0003, 0, 0) = 0
4938:   close(3)                                        = 0
4938:   open("/usr/lib/libnsl.so.1", O_RDONLY)          = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
4938:   mmap(0xEF7B0000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xEF7B0000
4938:   mmap(0x00000000, 581632, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF680000
4938:   mmap(0xEF700000, 33204, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 458752) = 0xEF700000
4938:   mmap(0xEF709000, 20368, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 4, 0) = 0xEF709000
4938:   munmap(0xEF6F1000, 61440)                       = 0
4938:   memcntl(0xEF680000, 70240, MC_ADVISE, 0x0003, 0, 0) = 0
4938:   close(3)                                        = 0
4938:   open("/usr/lib/libintl.so.1", O_RDONLY)         = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
319:    fork()                                          = 4942
4942:   fork()          (returning as child ...)        = 319
4942:   close(3)                                        = 0
4942:   close(4)                                        = 0
4942:   close(5)                                        Err#9 EBADF
4942:   close(6)                                        Err#9 EBADF
4942:   close(7)                                        Err#9 EBADF
4942:   close(8)                                        Err#9 EBADF
4942:   close(9)                                        Err#9 EBADF
4942:   close(10)                                       Err#9 EBADF
4942:   close(11)                                       Err#9 EBADF
4942:   close(12)                                       Err#9 EBADF
4942:   close(13)                                       Err#9 EBADF
4942:   close(14)                                       Err#9 EBADF
4942:   close(15)                                       Err#9 EBADF
4942:   close(16)                                       Err#9 EBADF
4938:   mmap(0xEF7B0000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xEF7B0000
4938:   close(3)                                        = 0
4938:   open("/usr/lib/libdl.so.1", O_RDONLY)           = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
4938:   mmap(0x00000000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF720000
4938:   mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0xEF670000
4938:   close(3)                                        = 0
4938:   open("/usr/lib/libc.so.1", O_RDONLY)            = 3
4938:   fstat(3, 0xEFFFF5D4)                            = 0
4938:   mmap(0x00000000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF660000
4938:   mmap(0x00000000, 712704, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xEF580000
4938:   mmap(0xEF625000, 26300, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 610304) = 0xEF625000
4938:   mmap(0xEF62C000, 4304, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 4, 0) = 0xEF62C000
4940:   close(38)                                       Err#9 EBADF
4940:   close(39)                                       Err#9 EBADF
4940:   close(40)                                       Err#9 EBADF
4940:   close(41)                                       Err#9 EBADF
4940:   close(42)                                       Err#9 EBADF
4940:   close(43)                                       Err#9 EBADF
4940:   close(44)                                       Err#9 EBADF
4940:   close(45)                                       Err#9 EBADF
4940:   close(46)                                       Err#9 EBADF
4940:   close(47)                                       Err#9 EBADF
4940:   close(48)                                       Err#9 EBADF
4940:   close(49)                                       Err#9 EBADF
4940:   close(50)                                       Err#9 EBADF
4940:   close(51)                                       Err#9 EBADF
4940:   close(52)                                       Err#9 EBADF
4940:   close(53)                                       Err#9 EBADF
4940:   close(54)                                       Err#9 EBADF
4940:   close(55)                                       Err#9 EBADF
4940:   close(56)                                       Err#9 EBADF
4940:   close(57)                                       Err#9 EBADF
4940:   close(58)                                       Err#9 EBADF
4940:   close(59)                                       Err#9 EBADF
4940:   close(60)                                       Err#9 EBADF
4940:   close(61)                                       Err#9 EBADF
4940:   close(62)                                       Err#9 EBADF
4940:   close(63)                                       Err#9 EBADF
4942:   close(17)                                       Err#9 EBADF
4942:   close(18)                                       Err#9 EBADF
4942:   close(19)                                       Err#9 EBADF
4942:   close(20)                                       Err#9 EBADF
4942:   close(21)                                       Err#9 EBADF
4942:   close(22)                                       Err#9 EBADF
4942:   close(23)                                       Err#9 EBADF
4942:   close(24)                                       Err#9 EBADF
4942:   close(25)                                       Err#9 EBADF
4942:   close(26)                                       Err#9 EBADF
[...]
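
On the close() question: the long runs of close(n) returning Err#9 EBADF
are consistent with a common pattern in which a freshly forked child closes
every descriptor number from 3 up to a fixed limit before calling exec,
without checking which ones are actually open. Numbers that were never open
then fail with EBADF, which is harmless. The sketch below reproduces that
pattern in Python. It is an illustration only, not taken from the bosserver
source: the limit of 64 is inferred from the trace (close(3) through
close(63)), and the names close_inherited_fds and count_closed_in_child are
hypothetical.

```python
import errno
import os

MAX_FD = 64  # assumption: inferred from the trace, which shows close(3)..close(63)


def close_inherited_fds(lowest=3, limit=MAX_FD):
    """Blindly close every descriptor in [lowest, limit).

    Returns (closed, not_open): how many close() calls succeeded and how
    many failed with EBADF because that number was not an open descriptor.
    """
    closed = 0
    not_open = 0
    for fd in range(lowest, limit):
        try:
            os.close(fd)
            closed += 1
        except OSError as exc:
            if exc.errno != errno.EBADF:
                raise
            not_open += 1  # same result truss reports as "Err#9 EBADF"
    return closed, not_open


def count_closed_in_child():
    """Fork, run the close loop in the child, return the child's 'closed' count.

    The count is passed back through the exit status, so the parent's own
    descriptors are never touched.
    """
    pid = os.fork()
    if pid == 0:
        status = 255  # reported if the loop raises unexpectedly
        try:
            closed, _not_open = close_inherited_fds()
            status = closed  # at most MAX_FD - 3, so it fits in an exit status
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


if __name__ == "__main__":
    baseline = count_closed_in_child()
    # Open two extra descriptors in the parent; the child inherits them.
    extra = [os.open(os.devnull, os.O_RDONLY) for _ in range(2)]
    with_extra = count_closed_in_child()
    for fd in extra:
        os.close(fd)
    print("closed in child without extra fds:", baseline)
    print("closed in child with extra fds:   ", with_extra)
```

In the trace only close(3) and close(4) return 0 in each child, which would
fit a parent that had just two descriptors above stderr open at fork time.
Every other number in the range yields EBADF simply because nothing was open
there, not because the parent closed it first.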

-- 
Martin Mokrejs <mmokrejs@natur.cuni.cz>, <m.mokrejs@gsf.de>
PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany
tel.: +49-89-3187 3683 , fax: +49-89-3187 3585