[OpenAFS] Lost RW Volume Recovery?
Hartmut Reuter
reuter@rzg.mpg.de
Wed, 20 Feb 2008 09:47:15 +0100
Robert Sturrock wrote:
> Hi all.
>
> I'm not sure how, but we have lost the RW volume for our cell.user (a
> structural volume under which the user home areas live). After a bit of
> searching, I found this thread that describes a possible recovery
> method involving dump/restoring from an RO and then salvaging:
>
> http://www.openafs.org/pipermail/openafs-info/2002-December/007228.html
>
> I tried this method and it _seemed_ to work, but I'm still having
> problems accessing the volume after remounting it. A quick rundown on
> what I did:
>
>
> $ vos dump cell.user.readonly > cell.user.dump
>
> $ vos restore hermes2 a cell.user -verbose < cell.user.dump
> Restoring volume cell.user Id 536870918 on server hermes2.its.unimelb.edu.au partition /vicepa .. done
> Updating the existing VLDB entry
> ------- Old entry -------
>
> cell.user
> ROnly: 536870919
> number of sites -> 2
> server hermes1.its.unimelb.edu.au partition /vicepa RO Site
> server telos.its.unimelb.edu.au partition /vicepa RO Site
> ------- New entry -------
>
> cell.user
> RWrite: 536870918 ROnly: 536870919
> number of sites -> 3
> server hermes1.its.unimelb.edu.au partition /vicepa RO Site
> server telos.its.unimelb.edu.au partition /vicepa RO Site
> server hermes2.its.unimelb.edu.au partition /vicepa RW Site
> Restored volume cell.user on hermes2 /vicepa
>
> $ bos salvage hermes2 a cell.user
> Starting salvage.
> bos: salvage completed
>
> $ bos salvage hermes2 a cell.user
> Starting salvage.
> bos: salvage completed
>
> .. but now the problem is as follows:
>
> $ fs mkmount /afs/.athena.unimelb.edu.au/user cell.user
>
> [ so far, so good .. but .. ]
>
> $ ls -ld user
> ls: user: Connection timed out
>
> $ ls -l
> total 14
> drwxrwxrwx 5 root root 2048 Nov 14 15:36 arch
> drwxrwxrwx 5 root root 2048 Feb 19 09:33 devlp
> drwxrwxrwx 2 root root 2048 Jan 23 14:44 group
> drwxrwxrwx 3 root root 2048 Oct 25 12:15 project
> drwxrwxrwx 4 root root 2048 Oct 15 11:13 pub
> drwxrwxrwx 2 root root 2048 Feb 20 12:38 tmp
> ?--------- ? ? ? ? ? user
> drwxrwxrwx 2 root root 2048 Oct 4 21:06 www
>
> Any pointers as to where I go from here?
>
> The only thing I can think of is that there may be some caching going on
> which in some way is still looking for the old RW volume.
>
> One alternative might be to "vos convertROtoRW", but I suspect that would
> leave me with the same problem to solve.
>
> Regards,
>
> Robert.
You need to run "fs checkvol" on the client, because the disappearance of
the old volume didn't trigger the callbacks needed to provoke a new VLDB
lookup on the clients. You have the same problem after a
"vos convertROtoRW ...".
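
As a rough sketch of that client-side step: "fs checkvol" is short for
"fs checkvolumes", which makes the cache manager discard its cached volume
name/ID mappings so that the next access looks the volume up in the VLDB
again. The helper name refresh_afs_mount below is hypothetical (it is not
part of OpenAFS), and the check for the fs binary is only an assumption
about how one might guard the call.

```shell
# Hypothetical helper, not an OpenAFS command: refresh the client's
# cached volume mappings, then retry access to the given mount point.
refresh_afs_mount() {
    mount_point="$1"

    # The OpenAFS client tools must be installed for "fs" to exist.
    if ! command -v fs >/dev/null 2>&1; then
        echo "fs command not found; is the OpenAFS client installed?" >&2
        return 127
    fi

    # Discard cached volume name/ID mappings on this client.
    fs checkvolumes || return 1

    # Retry the access that previously timed out.
    ls -ld "$mount_point"
}
```

On an affected client this would be called with the mount point from the
message above, e.g. refresh_afs_mount /afs/.athena.unimelb.edu.au/user,
and it has to be repeated on every client that still has the old mapping
cached.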
Hartmut
--
-----------------------------------------------------------------
Hartmut Reuter e-mail reuter@rzg.mpg.de
phone +49-89-3299-1328
RZG (Rechenzentrum Garching) fax +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------