1.2.9 unstable? (was [OpenAFS] inaccessible volume -
please help)
Rudolph T Maceyko
rtm@cert.org
Thu, 22 May 2003 13:34:02 -0400
--On Wednesday, May 21, 2003 23:56:08 -0400 Derrick J Brashear
<shadow@dementia.org> wrote:
>> FTR, your "umount.diff" patch has fixed the problems we were seeing
>> with shutdown hanging. We saw it mostly under Red Hat 7.3, but have
>> also applied the patch to Red Hat 9 boxes. (We aren't running any 8
>> boxes.)
>
> Perchance do you know if "umount /afs" succeeded or failed?
> If it failed, afsd -shutdown would return EACCES and never trigger the
> code path which I believe is suspect.
I'm going back over the console logs now to see whether there have been any
shutdown/umount problems without hangs.
/afs busy + stock 1.2.9 = hung system at shutdown:
Stopping AFS services.....
umount: /afs: device is busy
libafs-2.4.18-27.7.x-i686: Device or resource busy
.
.
.
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Starting killall: [ OK ]
Sending all processes the TERM signal...
Sending all processes the KILL smd: recovery thread got woken up ...
ignal... md: recovery thread finished ...
Syncing hardware clock to system time afs: Lost contact with file
server a.b.c.14 in cell cert.org (all multi-homed ip addresses down for
the server)
afs: Lost contact with file server a.b.c.14 in cell cert.org (all
multi-homed ip addresses down for the server)
Turning off swap:
Turning off quotas:
Unmounting file systems: umount2: Device or resource busy
umount: AFS: not found
umount: /afs: Illegal seek
afs: Lost contact with file server a.b.c.15 in cell cert.org (all
multi-homed ip addresses down for the server)
afs: Lost contact with file server a.b.c.15 in cell cert.org (all
multi-homed ip addresses down for the server)
afs: Lost contact with volume location server a.b.c.11 in cell
cert.org
afs: Lost contact with volume location server a.b.c.11 in cell
cert.org
afs: Lost contact with volume location server a.b.c.13 in cell
cert.org
afs: Lost contact with volume location server a.b.c.13 in cell
cert.org
afs: Lost contact with volume location server a.b.c.12 in cell
cert.org
afs: Lost contact with volume location server a.b.c.12 in cell
cert.org
Unmounting file systems (retry): WARM shutting down of: CB... afs...
BkG... CTrunc... AFSDB... RxEvent... RxListener...
(system hung at this point)
/afs busy + 1.2.9 patched with umount.diff = system not hung at
shutdown:
Stopping AFS services.....
umount: /afs: device is busy
libafs-2.4.20-13.7-i686: Device or resource busy
.
.
.
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Starting killall: [ OK ]
Sending all processes the TERM signal...
Sending all processes the KILL smd: recovery thread got woken up ...
ignal...
Syncing hardware clock to system time
Turning off swap:
Turning off quotas:
Unmounting file systems: afs_cacheDp 1 at stop
Please stand by while rebooting the system...
flushing ide devices: hdc
Restarting system.
Normal shutdown:
Stopping AFS services.....
WARM shutting down of: CB... afs... BkG... CTrunc... AFSDB...
RxEvent... RxListener...
afs_cacheDp 1 at stop
.
.
.
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Starting killall: [ OK ]
Sending all processes the TERM signal...
Sending all processes the KILL smd: recovery thread got woken up ...
ignal...
Syncing hardware clock to system time
Turning off swap:
Turning off quotas:
Unmounting file systems:
Please stand by while rebooting the system...
flushing ide devices: hdc
Restarting system.
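For anyone who wants to sort their own console logs into the three cases
above, here is a minimal sketch. It is not part of OpenAFS. The function
name is made up for illustration, and it assumes each shutdown's console
output is available as one text string, that a busy /afs always shows the
"umount: /afs: device is busy" line, and that a missing "Restarting system."
line means the box hung (a halt instead of a reboot would be misread as a
hang).

```python
# Hypothetical helper: classify one shutdown's console output using the
# marker strings seen in the excerpts above.
BUSY_MARKER = "umount: /afs: device is busy"
REBOOT_MARKER = "Restarting system."


def classify_shutdown(log_text):
    """Return 'normal', 'busy, not hung', 'busy, hung', or 'unknown'."""
    busy = BUSY_MARKER in log_text
    restarted = REBOOT_MARKER in log_text
    if busy:
        # /afs could not be unmounted; did the machine still reboot?
        return "busy, not hung" if restarted else "busy, hung"
    # No busy message: treat as a normal shutdown only if the reboot
    # line is present, otherwise we cannot tell what happened.
    return "normal" if restarted else "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python classify.py console1.log console2.log ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            print(path, "->", classify_shutdown(f.read()))
```

This only looks at whether the two marker lines are present, so it will not
tell you why a given shutdown hung.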
Rudy