[OpenAFS] Server crash
Rob Banz
rob@nofocus.org
Fri, 7 Dec 2007 14:09:20 -0500
>
>
> Look at the FileLog and see what failed to attach.
>
> This is one reason I dislike that optimization.
>
For the most part, it's been a win for me. With a decent filesystem on
the back-end, I haven't had volume attachment problems running with a
fast-restart fileserver. If I ever did end up with a multitude of
volumes that needed salvaging, it's not too hard to either write a
little script to troll your FileLog and run salvager on the
appropriate volumes -- or stop the fileserver and salvage the whole
partition.
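A rough sketch of the kind of script I mean, in Python. Treat it as an
illustration under assumptions, not a tested tool: the regular
expression assumes FileLog lines that name the volume header as
/vicepX/V<id>.vol and mention "salvage" later on the same line, so
check it against what your fileserver actually logs. The function
names and the server name are made up for the example. It only builds
"bos salvage" command lines; it does not run anything.

```python
import re

# ASSUMPTION: a failed attach is logged with the volume header path
# (/vicepX/V<id>.vol) followed by some mention of "salvage" on the same
# line. Adjust this pattern to match your server's real FileLog wording.
NEEDS_SALVAGE = re.compile(r"(/vicep[a-z]+)/V(\d+)\.vol\b.*salvage", re.IGNORECASE)


def volumes_needing_salvage(log_lines):
    """Return sorted, de-duplicated (partition, volume_id) pairs from the log."""
    found = set()
    for line in log_lines:
        match = NEEDS_SALVAGE.search(line)
        if match:
            # int() drops the zero padding used in the header file name.
            found.add((match.group(1), int(match.group(2))))
    return sorted(found)


def salvage_commands(log_lines, server):
    """Build one 'bos salvage' argument list per flagged volume (nothing is run)."""
    return [
        ["bos", "salvage", "-server", server,
         "-partition", partition, "-volume", str(volume_id)]
        for partition, volume_id in volumes_needing_salvage(log_lines)
    ]
```

To use it, feed it the lines of your FileLog (for example
open("/usr/afs/logs/FileLog"), if that is where your logs live), look
over the commands it returns, and only then hand each one to
subprocess.run() or paste it into a shell.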
In the environment I was responsible for, the only time I had to
resort to drastic measures (kill -9'ing the fileserver) was in the
case of those dreaded clogged RX calls due to (usually) connection
table lockups -- and I never had a problem with using the fast-restart
fileserver; it brought us back into service in a few minutes
rather than the hour+ that a salvage would take. Even in the couple
instances where we did have storage go offline, at least since we used
ZFS, everything would come up fine in the fast-restart environment...
I think your success or failure with it is very dependent on the
behavior of your backing filesystem and how it orders transactions...
-rob