[OpenAFS] Re: proper way to bring down a file server?
Wed, 23 Feb 2011 13:20:17 -0800
>> This was fixed with "fs flushmount". Is this supposed to be necessary
>> after you "vos move" a volume?
> No. At what point did this problem occur? While volumes were being
> moved, after they had been moved, after you turned off the server, ... ?
The "Connection timed out" error occurred after I had turned off the old server.
But we didn't notice the problem until today, even though the volumes were moved
and the old server was shut down yesterday afternoon.
> Clients cache the location of volumes for about 2 hours. This is usually
> fine, since if they get the location wrong the fileserver will tell the
> client, and the client will look up the fresh location. But if you move
> a volume off of a server and then immediately turn it off, a client may
> be slow to find the new location, since it tries to contact the old,
> downed, fileserver first. So you may benefit from leaving the old,
> empty, fileserver online for a few hours, and then turning it off.
Thanks for the explanation; I'll leave "old" servers online for a few hours in
the future. Not sure if that was the problem this time though, since it had
been closer to 14 hours since the volume was moved when we noticed the problem.
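
The timing argument can be checked with simple arithmetic. This is a minimal sketch, assuming the roughly 2-hour location cache lifetime quoted above; location_cache_expired is a hypothetical helper name, not an OpenAFS command.

```shell
# Hypothetical helper: succeeds if the elapsed time since the "vos move"
# exceeds the client's volume-location cache lifetime.
# $1 = whole hours since the move
# $2 = cache lifetime in hours (assumed default: 2, per "about 2 hours")
location_cache_expired() {
    hours_since_move="$1"
    cache_hours="${2:-2}"
    [ "$hours_since_move" -gt "$cache_hours" ]
}
```

With the 14 hours reported here, `location_cache_expired 14` succeeds, which is consistent with the suspicion that a stale cached location alone may not explain the timeout.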
> However, the client should recover from this. I would expect a 'fs
> checkv' may help resolve things more quickly, but a flushm may also
> help, as you've found.
The client may have eventually recovered on its own... but I'll run both
"fs flushm" and "fs checkv" next time.
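
The two client-side commands discussed above could be wrapped roughly as follows. This is only a sketch: refresh_volume_location and AFS_DRY_RUN are made-up names for illustration, and by default the function just prints the commands instead of running them. "fs checkv" and "fs flushm" are the abbreviated forms of "fs checkvolumes" and "fs flushmount".

```shell
# Hypothetical wrapper around the two cache manager commands from this thread.
# $1 = path to the mount point whose cached information should be flushed.
# AFS_DRY_RUN (assumed name) defaults to 1, so nothing is executed unless
# it is explicitly set to 0 on an AFS client.
refresh_volume_location() {
    mountpoint="$1"
    if [ -z "$mountpoint" ]; then
        echo "usage: refresh_volume_location <path-to-mount-point>" >&2
        return 1
    fi
    if [ "${AFS_DRY_RUN:-1}" = "1" ]; then
        # Print the commands without touching the cache manager.
        echo "fs checkvolumes"
        echo "fs flushmount -path $mountpoint"
    else
        # Discard cached volume mappings, then flush the mount point itself.
        fs checkvolumes && fs flushmount -path "$mountpoint"
    fi
}
```

Running it in dry-run mode first shows exactly what would be sent to the cache manager; the path passed in is whatever mount point returned "Connection timed out".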
>> Am I supposed to remove athens from the VLDB with "vos changeaddr
>> -oldaddr <athens IP> -remove"? I will build a new "athens" server, but
>> am waiting for new hardware to arrive, so it may be a few weeks.
> You don't need to do that, but it shouldn't hurt anything. When you
> bring the new "athens" server online, it will tell the VLDB that it is
> the new "athens" server, and will replace the existing entry.
Okay, good to know. I'll leave it in there.
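
For anyone who does want to remove the stale entry, the optional cleanup mentioned above might look like the sketch below. print_vldb_cleanup is a hypothetical helper that only prints the commands, and the address in the usage note is a placeholder from the documentation range, not the real "athens" IP.

```shell
# Hypothetical dry-run helper: prints the VLDB cleanup commands for a
# retired file server address. It does not run them.
# $1 = IP address of the old server as registered in the VLDB
print_vldb_cleanup() {
    old_ip="$1"
    if [ -z "$old_ip" ]; then
        echo "usage: print_vldb_cleanup <old server IP>" >&2
        return 1
    fi
    # List the server addresses currently registered in the VLDB.
    echo "vos listaddrs"
    # Remove the old server's address entry.
    echo "vos changeaddr -oldaddr $old_ip -remove"
}
```

For example, `print_vldb_cleanup 192.0.2.10` prints the two commands for review. As noted above, this step is not required, since the rebuilt server replaces the existing entry when it comes online.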