[OpenAFS] Fileserver process hung on startup
John Morris
openafs@butchwax.com
29 Mar 2004 01:44:22 -0600
Hi! See what y'all can do with this.
OpenAFS 1.2.11, custom SMP kernel 2.4.23.
Three-fileserver cell; one fileserver, kug, suddenly stopped serving
files; clients see 'connection timed out'.
AFS server processes seem to be running normally as reported by bos
status.
# bos status kug -long -local
Instance ptserver, (type is simple) currently running normally.
Process last started at Sun Mar 28 16:51:58 2004 (2 proc starts)
Last exit at Sun Mar 28 16:51:55 2004
Command 1 is '/usr/afs/bin/ptserver'
Instance vlserver, (type is simple) currently running normally.
Process last started at Sun Mar 28 16:51:58 2004 (2 proc starts)
Last exit at Sun Mar 28 16:51:55 2004
Command 1 is '/usr/afs/bin/vlserver'
Instance fs, (type is fs) currently running normally.
Auxiliary status is: file server running.
Process last started at Mon Mar 29 01:29:36 2004 (11 proc starts)
Last exit at Mon Mar 29 01:29:36 2004
Last error exit at Mon Mar 29 01:29:36 2004, by vol, by exiting
with code 1
Command 1 is '/usr/afs/bin/fileserver'
Command 2 is '/usr/afs/bin/volserver'
Command 3 is '/usr/afs/bin/salvager'
#
Nothing is listening on port 2040:
# netstat -tl | grep 2040
#
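One caveat on that check: without -n, netstat may print a service name from /etc/services instead of the number, so a grep for 2040 could come up empty even if the port were open. A minimal sketch of a numeric check, assuming the Linux net-tools netstat column layout; port_listening is a made-up helper name, not an OpenAFS tool:

```shell
# Hypothetical helper: exit 0 if the given TCP port shows up in LISTEN state
# in `netstat -tln`-style output read from stdin, exit 1 otherwise.
port_listening() {
    awk -v port="$1" '
        # Column 4 is the local address (e.g. 127.0.0.1:2040 or :::2040);
        # the last column is the socket state.
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would be something like `netstat -tln | port_listening 2040 && echo listening || echo "not listening"`.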
With 2040 closed, I get these errors:
FSYNC_clientInit temporary failure (will retry): Connection refused
Any fs commands on kug's filesystems hang for a long time before timing
out.
strace on the fileserver process finds it in a seemingly hung state, i.e.
no system calls until the process is killed.
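A process that makes no system calls could be either spinning in user space or blocked inside a single call, and the kernel's state letter might help tell the two apart. A rough sketch, assuming a Linux /proc filesystem; stat_state is a hypothetical helper name:

```shell
# Hypothetical helper: print the one-letter process state from a
# /proc/PID/stat line read on stdin.
stat_state() {
    read -r line || return 1
    # The line looks like "1234 (fileserver) S ..."; the command name may
    # itself contain spaces, so drop everything up to the last ") ".
    rest=${line##*') '}
    # The state letter is the first remaining field.
    printf '%s\n' "${rest%%' '*}"
}
```

For example, `stat_state < /proc/$(pidof fileserver)/stat`. R would mean runnable (possibly looping in user space), S sleeping, D uninterruptible sleep.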
Haven't noticed anything else funny about /vicepa; salvages complete
with no errors.
Volume DB is frozen as long as fileserver process is running; once
fileserver is killed, voldb comes back online.
lsof shows that kug's fileserver process has much the same files open
as a normally running fileserver process on another server, except for
localhost:2040 and, of course, the /vicepa files.
Restarts and reboots don't help.
That's all I can think of. Any ideas? Thanks for any suggestions! My
home directory is on this fileserver, so help will be appreciated extra!
John