[OpenAFS] PTS sadness

Brian Gallew geek+afs@cmu.edu
Mon, 25 May 2009 14:07:02 +0300


So, my PTS database is unhappy.  prdb_check complains about 120
unreferenced entries, two entry membership counts being off, and an
inconsistent group count (should be 147, header claims 69).  Further,
I'm starting to see zombie groups, where pts examine|delete|members all
claim the group doesn't exist, but pts create says that it does.
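
For anyone wanting to reproduce the prdb_check complaints, here is a rough sketch of running it against a copy of the database rather than the live file. The path /usr/afs/db/prdb.DB0 is an assumption (a Transarc-style layout, not stated in this post), and the run() dry-run wrapper is purely illustrative. A copy taken while ptserver is writing may itself be inconsistent, so treat the output as a hint only.

```shell
#!/bin/sh
# Sketch: run prdb_check on a copy of the PTS database.
# PRDB is an ASSUMED Transarc-style path; adjust for your install.
PRDB="${PRDB:-/usr/afs/db/prdb.DB0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

check_prdb() {
    copy="${TMPDIR:-/tmp}/prdb-check.$$"
    run cp "$PRDB" "$copy"                     # work on a copy, not the live file
    run prdb_check -database "$copy" -verbose  # report inconsistencies
    run rm -f "$copy"
}

check_prdb
```

With the default DRY_RUN=1 this only prints the three commands; setting DRY_RUN=0 on a DB server actually runs them.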

I've looked at the web documentation and it's ... a little sparse.  I see
that if there are errors in my PTS database, I shouldn't perform any
PTS operations until they are fixed.  But there does not seem to be any
reference to which tools I could use to fix them.  Looking at
what is available, I'm using my best guess and wondering if someone
could comment on the correctness of the following plan:

1) Use pt_util to dump user, group, and membership information this way:
     for d in user group members; do
         pt_util -$d -name -xtra -datafile pts-$d
     done
2) Shut down ptserver on all of my DB servers.
3) Delete the PTS database on all of my DB servers.
4) On the master server, use pt_util to re-create everything:
     for d in user group members; do
         pt_util -w -datafile pts-$d
     done
5) Restart ptserver on the master server and wait until it elects itself
   as the sync site.
6) Restart ptserver on the other DB servers, waiting after each start
   for quorum to settle before starting the next one.
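
The six steps above could be strung together roughly as follows. This is a dry-run sketch, not a tested procedure: it only prints commands unless DRY_RUN=0. Everything beyond the two pt_util loops is an assumption on my part: that ptserver runs as a bosserver instance named "ptserver", that the database files are /usr/afs/db/prdb.DB0 and prdb.DBSYS1, that the caller has admin tokens for bos and root ssh to each server, and that the script runs on the master. The run() wrapper, the function names, and SETTLE_SECS are illustrative only. Whether kdc-01 is a DB server isn't stated, so it is left out of the default OTHERS list.

```shell
#!/bin/sh
# Dry-run sketch of the rebuild plan; prints commands unless DRY_RUN=0.
# ASSUMPTIONS (not from the post): bos instance name "ptserver",
# Transarc-style DB_DIR, admin tokens, root ssh, run from the master.
DRY_RUN="${DRY_RUN:-1}"
DB_DIR="${DB_DIR:-/usr/afs/db}"
MASTER="${MASTER:-afs1.qatar.cmu.edu}"
OTHERS="${OTHERS:-afs2.qatar.cmu.edu afs3.qatar.cmu.edu}"
SETTLE_SECS="${SETTLE_SECS:-120}"   # arbitrary placeholder pause

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

wait_for_sync_site() {
    # Poll udebug on the ptserver port (7002) until the host reports
    # that it is the Ubik sync site. Skipped in dry-run mode.
    echo "+ wait until $1 reports 'I am sync site'"
    [ "$DRY_RUN" = "0" ] || return 0
    until udebug -server "$1" -port 7002 | grep -q 'I am sync site'; do
        sleep 10
    done
}

dump_db() {
    # Step 1: the dump loop from the post.
    for d in user group members; do
        run pt_util "-$d" -name -xtra -datafile "pts-$d"
    done
}

stop_and_clear() {
    # Steps 2 and 3: stop ptserver everywhere, then remove the old database.
    for s in $MASTER $OTHERS; do
        run bos shutdown -server "$s" -instance ptserver -wait
    done
    for s in $MASTER $OTHERS; do
        run ssh "root@$s" rm -f "$DB_DIR/prdb.DB0" "$DB_DIR/prdb.DBSYS1"
    done
}

reload_db() {
    # Step 4: the reload loop from the post.
    for d in user group members; do
        run pt_util -w -datafile "pts-$d"
    done
}

restart_all() {
    # Step 5: master first, wait for it to become sync site.
    run bos startup -server "$MASTER" -instance ptserver
    wait_for_sync_site "$MASTER"
    # Step 6: the rest, one at a time, pausing between starts.
    for s in $OTHERS; do
        run bos startup -server "$s" -instance ptserver
        run sleep "$SETTLE_SECS"
    done
}

dump_db && stop_and_clear && reload_db && restart_all
```

Run as-is, it just lists what it would do, which is a cheap way to eyeball the ordering before touching anything.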

Comments?  Questions?  General laughter and pointing?

Here's my system information:
afs1.qatar.cmu.edu (master server)
afs2.qatar.cmu.edu
afs3.qatar.cmu.edu
kdc-01.qatar.cmu.edu (don't ask)

All systems are RHEL-5.3 with heimdal-1.2.1 and openafs-1.4.8.