[OpenAFS-devel] fssync changes to deal with volserver timeouts

Derrick J Brashear shadow@dementia.org
Wed, 4 Dec 2002 12:29:58 -0500 (EST)


This was beaten to death a while back with no resolution at the time:

The volserver would give a volume back to the fileserver, which would
promptly break volume callbacks. However, if breaking them was slow, the
volserver sat blocked waiting for the reply, and other things it was
doing (including talking to clients) could time out.

We tried changing the fileserver to ack the giveback and then do the work.
The problem was that the volserver would then try another fssync call
while the fssync thread was still busy breaking callbacks, and the effect
was similar.

I'm led to believe IBM dealt with this another way (possibly by changing
the fssync listener to a "hot thread" and acking before going into break
callback land) but the potential for using up all the threads just
breaking fssync callbacks was disturbing and so I pursued another route.

There's a patch on the head, not directly portable to 1.2.x, but I have a
version for 1.2.x also, more on that shortly. 

It adds a thread to the fileserver just for breaking callbacks for fssync
callers. The way this works is as follows:
-fileserver fssync handler calls BreakVolumeCallbacksLater
--file entries whose callbacks need to be broken are marked FE_LATER
--callbacks to be broken are marked CB_DELAYED
--hosts with callbacks we just marked CB_DELAYED are marked HFE_LATER
--the new callback-breaking lwp gets a wakeup from the fssync handler
--fileserver acks volserver

-the new callback-breaking thread wakes up from this, or every 5 minutes in
 case a wakeup is missed.
--the thread calls BreakLaterCallBacks until it finds that no callbacks
 needed to be broken
--BreakLaterCallBacks finds file entries set FE_LATER, unchains them, and
 breaks all callbacks they represent. It works just like
 BreakVolumeCallbacks, which means that if a host is VENUSDOWN we toss its
 callbacks and end up forcing it to InitCallBackState later. If there were
 callbacks to break, it returns 1, suggesting the caller call it again.
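
For anyone who wants the shape of the above without reading the diff, here
is a rough sketch in plain C. It is a toy model, not code from the patch:
the toy_* structs and functions, the flag values and the breaks_sent
counter are all made up for illustration, and the wakeup, the ack and the
VENUSDOWN handling are left out. It only shows the split between "mark now"
in the handler and "break later" in the extra thread.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values only; the real definitions are in the patch. */
#define FE_LATER   0x1   /* file entry has callbacks to break later */
#define CB_DELAYED 0x2   /* this callback is waiting to be broken */
#define HFE_LATER  0x4   /* host holds at least one delayed callback */

/* Hypothetical stand-ins for the fileserver's host/callback/file entry. */
struct toy_host { int flags; int breaks_sent; };
struct toy_cb   { struct toy_host *host; int flags; struct toy_cb *next; };
struct toy_fe   { int volume; int flags; struct toy_cb *cbs; };

/* Handler side: mark only, so the ack to the volserver is not held up.
 * Returns the number of file entries marked. */
static int toy_mark_volume_later(struct toy_fe *fes, size_t n, int volume)
{
    int marked = 0;
    assert(fes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (fes[i].volume != volume)
            continue;
        fes[i].flags |= FE_LATER;
        for (struct toy_cb *cb = fes[i].cbs; cb != NULL; cb = cb->next) {
            cb->flags |= CB_DELAYED;
            cb->host->flags |= HFE_LATER;
        }
        marked++;
    }
    return marked;
}

/* Thread side: one pass over the entries. Returns 1 if it found work,
 * which tells the caller to try again, and 0 otherwise. */
static int toy_break_later_pass(struct toy_fe *fes, size_t n)
{
    int found = 0;
    for (size_t i = 0; i < n; i++) {
        if (!(fes[i].flags & FE_LATER))
            continue;
        found = 1;
        for (struct toy_cb *cb = fes[i].cbs; cb != NULL; cb = cb->next)
            cb->host->breaks_sent++;   /* stand-in for the break RPC */
        fes[i].cbs = NULL;             /* "unchain" the callbacks */
        fes[i].flags &= ~FE_LATER;     /* note: HFE_LATER is left set */
    }
    return found;
}

/* Body of the thread after each wakeup: repeat until a pass finds
 * nothing. Returns the number of passes that did work. */
static int toy_drain(struct toy_fe *fes, size_t n)
{
    int passes = 0;
    while (toy_break_later_pass(fes, n))
        passes++;
    return passes;
}
```

In the real thing the handler would wake the thread and ack the volserver
right after the marking step, and the thread would run the drain loop on
each wakeup or on the 5 minute timeout.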

If a caller with a callback to be broken calls in before we break it,
CallPreamble notices HFE_LATER and calls BreakDelayedCallbacks.
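
A rough sketch of that check, again as a toy model in plain C rather than
code from the patch. The toy_* names, the flag values and the broken field
are made up. It also assumes that the call made from the preamble is what
finally clears HFE_LATER; that is just one reading of the flow, so check
the diff for what really happens.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values only; the real definitions are in the patch. */
#define CB_DELAYED 0x2   /* this callback is waiting to be broken */
#define HFE_LATER  0x4   /* host may hold delayed callbacks */

/* Hypothetical stand-ins for the fileserver's callback and host. */
struct toy_cb   { int flags; int broken; struct toy_cb *next; };
struct toy_host { int flags; struct toy_cb *cbs; };

/* Break whatever is still marked CB_DELAYED for this host.
 * Returns how many callbacks were broken. */
static int toy_break_delayed(struct toy_host *h)
{
    int n = 0;
    for (struct toy_cb *cb = h->cbs; cb != NULL; cb = cb->next) {
        if (cb->flags & CB_DELAYED) {
            cb->broken = 1;            /* stand-in for the break RPC */
            cb->flags &= ~CB_DELAYED;
            n++;
        }
    }
    h->flags &= ~HFE_LATER;            /* assumption: cleared here */
    return n;
}

/* Start of every call from a host: deal with pending breaks first. */
static int toy_call_preamble(struct toy_host *h)
{
    assert(h != NULL);
    if (!(h->flags & HFE_LATER))
        return 0;                      /* common case: nothing pending */
    return toy_break_delayed(h);
}
```

If the extra thread got to the callbacks first, the host still has
HFE_LATER set, so the preamble makes one call that finds nothing to break.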

edge cases:
-HFE_LATER is not unset, so BreakDelayedCallbacks gets called once
 unnecessarily for each host that had a "Later" callback to break.
 The overhead of this is negligible.
-If a new caller gets a callback on a file whose file entry we have set
 FE_LATER but not yet dealt with, its callback will be broken too,
 despite no changes to the file. Overhead: one FetchStatus per client
 this happens to, but it's not very likely to happen anyway.

http://www.dementia.org/~shadow/fssync.diff 
a.k.a. /afs/andrew.cmu.edu/usr16/shadow/www/fssync.diff
is applicable to 1.2.x if anyone wishes to try it there.

if you do try it, please report your findings.