1.2.9 unstable? (was [OpenAFS] inaccessible volume - please help)

Derrick J Brashear shadow@dementia.org
Fri, 6 Jun 2003 15:35:16 -0400 (EDT)


On Thu, 8 May 2003, Derrick J Brashear wrote:

> > I'll let you know...  I'm building a box to do some testing right now.
> > We backed out 1.2.9 for Red Hat 7.3 boxes once this problem was
> > discovered.  Because we autoreboot our Red Hat boxes, it frequently
> > happens that folks leave processes running that are accessing AFS.  So
> > the shutdown problem shows itself pretty well here...  (Older versions
> > of AFS/OpenAFS used to have similar problems.)
>
> If you can reproduce it, you might try this:
> /afs/andrew.cmu.edu/usr16/shadow/umount.diff

While that patch seems to fix the problem for the wrong reason, I believe
I have found the correct fix, and it has been applied as the delta
STABLE12-linux-rx-listener-flush-signals-20030605

The problem was that if, e.g., /etc/init.d/halt ran before afsd was shut
down, the listener pid was sent both a kill -15 and a kill -9. Those
signals were what tripped up the shutdown.
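
For anyone who wants to see the general idea outside the kernel, here is a
rough user-space sketch. It is NOT the code from the delta; it only
illustrates what "flushing" a pending signal means. The names flush_pending
and demo are made up for this illustration, and only SIGTERM is shown
because user space cannot block SIGKILL.

```python
import signal
import threading


def flush_pending(signums):
    """Discard pending signals from `signums`; return the ones discarded.

    Hypothetical helper for illustration only. The signals must already
    be blocked in the calling thread.
    """
    flushed = []
    while True:
        pending = signal.sigpending() & set(signums)
        if not pending:
            return flushed
        # A timeout of 0 polls: consume one pending signal without waiting.
        info = signal.sigtimedwait(pending, 0)
        if info is None:
            return flushed
        flushed.append(info.si_signo)


def demo():
    watched = {signal.SIGTERM}
    # Block SIGTERM so it stays pending instead of killing the process.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, watched)
    try:
        # Stand-in for the "kill -15" sent to the listener at shutdown.
        signal.pthread_kill(threading.get_ident(), signal.SIGTERM)
        before = signal.SIGTERM in signal.sigpending()
        flushed = flush_pending(watched)
        after = signal.SIGTERM in signal.sigpending()
        return before, flushed, after
    finally:
        # Safe to restore the mask: nothing is left pending after the flush.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

demo() reports whether SIGTERM was pending before the flush, which signals
were discarded, and whether anything is still pending afterwards.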

If anyone wants to play with this a bit, apply the aforementioned delta to
1.2.9, arrange for a file in /afs to be busy, and try shutting down.

The 1.2.10 release candidates shouldn't have the problem, for other reasons,
but I prefer to know we fixed the underlying bug, and it appears from my
testing that this does so.