[OpenAFS] Client IPs not being inserted into the server's host CPS table

William Setzer William_Setzer@ncsu.edu
Tue, 21 Jul 2009 15:38:01 -0400

We use IP ACLs to serve web content out of AFS via a pool of load-
balanced servers.  We've had a problem lately where after a reboot,
the IP ACL for the rebooted web server/AFS client sometimes stops
working on a random AFS file server.  It works for all the rest of the
AFS servers, just not the one deciding to be difficult.

While tracing this down, I noticed that in every server/client case
where there was a problem (verified for three or four cases), the
client IP address was not in the "hosts.dump" file (generated via
"kill -XCPU <fileserver process>").
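
For anyone wanting to repeat this check across several file servers,
here is a minimal Python sketch of the lookup.  It makes assumptions:
I have not relied on the exact layout of "hosts.dump", so it simply
searches the dump text for the client address as a dotted quad or as
hex in either byte order.  The function names are made up for
illustration, and a hit only means the address text appears somewhere
in the dump.

```python
import ipaddress
import re


def address_forms(ip):
    """Return text forms an IPv4 address might take in a dump file.

    Assumption: the dump prints addresses either as a dotted quad or as
    hex; both byte orders are tried because the order is not verified here.
    """
    packed = ipaddress.IPv4Address(ip).packed
    return {
        ip,
        packed.hex().lstrip("0"),        # network byte order, no leading zeros
        packed[::-1].hex().lstrip("0"),  # reversed byte order
    }


def client_in_dump(dump_text, ip):
    """Return True if any assumed form of `ip` appears as a token in dump_text."""
    # Split on anything that is not a hex digit or a dot, so a field such as
    # "ip:c0a80105" or "0xc0a80105" yields the bare token "c0a80105".
    raw_tokens = re.split(r"[^0-9a-f.]+", dump_text.lower())
    tokens = set()
    for tok in raw_tokens:
        tokens.add(tok)
        tokens.add(tok.lstrip("0"))  # tolerate zero-padded hex
    return any(form in tokens for form in address_forms(ip))
```

To use it, read the dump into a string (for example with
open(path).read(), where path is wherever your fileserver wrote
"hosts.dump") and call client_in_dump(text, "<client IP>").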

I'm pretty sure this is why the IP ACL fails to work.  What I can't
figure out is how to get the client back in that table.  I thought it
was automatically added when the client contacted the server, but this
isn't happening.  I tried changing the cache uuid via "fs uuid
-generate" (a shot in the dark), wondering if perhaps there was some
internal uuid caching going on.  I tried using the "flushcps" program
to tickle the table, but that didn't help either.

Here are the particulars for a current client/server problem:

  Example Server: Solaris 8, OpenAFS v1.4.7
  Example Client: RHEL    5, OpenAFS v1.4.10

So far, all of the problems have been on v1.4.7 servers, but I don't
have a large enough sample size to know whether that's a coincidence.

Is there something I'm just missing, or is this possibly a bug?  It's
really starting to cause us problems, so I'd appreciate any hints as
to which direction to explore.  Workarounds to let us bring our web
servers back online would be most welcome as well.

Thanks in advance for your help.

William Setzer
OIT Systems and Hosted Services
NC State University