[OpenAFS] $HOME Directory suddenly contains only orphaned entries

Friedrich Delgado Friedrichs 6delgado@informatik.uni-hamburg.de
Sat, 25 May 2002 13:52:49 +0200


Hiho!

Excuse me if my explanations are too verbose, but I am pretty new to AFS and I cannot
really judge what is relevant for you.

I've just started entrusting my $HOME to OpenAFS 1.2.3 on Debian woody on my box at home,
called abrasax.

I usually keep the machine and some programs running while I sleep, but I occasionally
forget to refresh my tokens before leaving the machine. I intended to set a longer lifetime
for my tickets, but so far doing "kinit; aklog" after unlocking xscreensaver seemed to
cause no problems.
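
A minimal sketch of that refresh step as a shell function. The name
renew_tokens is hypothetical; the only change from plain "kinit; aklog" is
that "&&" stops the chain if kinit fails, so aklog is not attempted with a
missing or stale ticket.

```shell
#!/bin/sh
# Hypothetical helper: refresh the Kerberos ticket, then the AFS token.
renew_tokens() {
    # aklog only runs if kinit succeeded; tokens then lists what we hold.
    kinit && aklog && tokens
}
```

Calling renew_tokens after unlocking the screen would prompt for the
password once and print the current tokens on success.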

There is one process running that does not write to /afs, only to an ext3 partition.

Occasionally procmail delivers some mail into my incoming box in the /afs $HOME. I set
this up via a small wrapper script that reads my password from a file in the regular
filesystem, owned by me, with permissions 600 (I know, that's a little insecure, but
I have not had the time to investigate the proper solution yet, probably running
procmail authenticated as "postman" or something. Besides, I am the only person
having a login on my machine).

After unlocking the screen and running "kinit; aklog", I could not access my $HOME.

I have a volume user.friedel that is mounted on /afs/<cellname>/user/friedel.
It contains the directories "home" (my $HOME) and "public", which contains readable or
writable directories, and a mountpoint "backup" for the user.friedel.backup volume.

I do regular backups of all volumes at 5:30 with "vos backupsys". This backup
seems to have run without problems last night.

"vos listvol abrasax" reported that it was unable to attach my home volume. Curiously, I was
still able to access the files in the "public" directory on the same volume.
After a restart of the OpenAFS server, the volume was reported offline.

After a quick dive into the OpenAFS administration manual, I decided to do a
bos salvage abrasax /vicepa user.friedel /afs/.taupan/salvage.user.friedel.log

In the logfile, about 80000 (!) orphaned entries were reported, taking up about
2.5 GB of space (the size of all my data in the $HOME! Fortunately I have a backup
on a different hard drive, which is unfortunately a few days old...)

After that I tried
bos salvage abrasax /vicepa user.friedel /afs/.taupan/salvage.user.friedel.log -orphans attach

which brought me all my data back.
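
For reference, the same two passes can be spelled out with named options,
which makes the role of each argument easier to see. This is only a sketch:
the salvage_cmd helper and the AFS_RUN variable are hypothetical names
introduced here, and -showlog is used in place of the log file from the
commands above. By default the function just prints the command; it only
contacts the server when AFS_RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical helper: build (and optionally run) a bos salvage call.
# $1 = orphans mode: ignore (report only), attach, or remove.
salvage_cmd() {
    mode=$1
    case "$mode" in
        ignore|attach|remove) ;;
        *) echo "unknown -orphans mode: $mode" >&2; return 1 ;;
    esac
    # Server, partition and volume are the ones from this post.
    set -- bos salvage -server abrasax -partition /vicepa \
        -volume user.friedel -orphans "$mode" -showlog
    if [ "${AFS_RUN:-0}" = 1 ]; then
        "$@"        # really run the salvage
    else
        echo "$*"   # dry run: only show what would be executed
    fi
}

salvage_cmd ignore   # first pass: report orphans, change nothing
salvage_cmd attach   # second pass: reattach orphans to the volume root
```

With "attach", the salvager hangs the orphaned files and directories off the
root of the volume under generated names, which is why the data comes back
but the original directory layout does not.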

After identifying my mail directory, I read my INBOX and found out that the last mail
had actually been received at 9:30 am. So it seems the disaster must have happened well
after the backup, after 9:30 and before 10:56, when I logged in. I could not find
anything helpful in the logs. There are only some entries about the home volume needing
salvaging that date from well before the disaster happened.

Another thing: I tried mounting the backup volume somewhere else, and it seems to be empty, so
I'm wondering whether it's possible to restore any data from the backup volume.
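
A sketch of how the backup volume could be inspected, assuming a scratch
mount point inside /afs. The helper name inspect_backup_cmds and the path
/afs/.taupan/tmp.backup-check are made up for illustration. The function
only prints the commands and does not execute anything.

```shell
#!/bin/sh
# Hypothetical helper: print the commands one might use to look at a
# backup volume. Nothing is executed here.
inspect_backup_cmds() {
    vol=$1      # read/write volume name, e.g. user.friedel
    dir=$2      # scratch mount point inside /afs (an assumption)
    echo "vos examine $vol.backup"                 # size and dates of the clone
    echo "fs mkmount -dir $dir -vol $vol.backup"   # temporary mount point
    echo "ls -la $dir"                             # look at the contents
    echo "fs rmmount -dir $dir"                    # clean up afterwards
}

inspect_backup_cmds user.friedel /afs/.taupan/tmp.backup-check
```

Note that each run of "vos backupsys" replaces the existing backup volumes
with fresh clones, so whatever user.friedel.backup holds now would be
overwritten by the next 5:30 run.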

My /vicepa partition is ext3; I keep my log and data partitions on raw partitions of the
same disk.

My most important issue here is that I'd like to find out how this happened and how to
prevent it from ever happening again.

And of course I'd like to have my old directory structure back.
-- 
		Friedrich Delgado Friedrichs