[OpenAFS] NetRestrict'd interfaces still talk to AFS (Linux)

Lin Osborne lin_osborne@ncsu.edu
Mon, 28 Jun 2004 16:00:47 -0400


Folk,

I manage a set of web servers that serve SSL-enabled,
IP-based virtual hosts whose document roots reside in AFS
and are accessed via an IP ACL. Each vhost has a unique IP
that is aliased to a single physical interface, but only
one IP (the "real" eth0 IP) is given ACL access per machine.

To date, I've been running this setup trouble-free on
Red Hat 7.3 (2.4.18 kernel/1.2.7 client/2.2.5 glibc). Recent efforts
to migrate the sites to Debian "stable" with a 2.4.26/1.2.11/2.3.2
kernel/client/glibc combination have revealed unexpected behavior
not seen on the RH machines.

Basically, about every 20 to 25 days, the Debian machine loses contact
with AFS, taking the sites down and requiring a reboot to reset the client.
Each time, I find the following message in /var/log/messages:

kernel: afs: Tokens for user of AFS id -1 for cell <my_cell_name> have expired

I first set up NetInfo to restrict AFS to the IP that's in the
ACL. 'fs gc' indicated that this IP was the only one registered with the
client, but the problem reappeared. So I added a NetRestrict list
of the IPs that shouldn't register. Again, 'fs gc' indicates that only the
desired IP is registered. However, tcpdump shows that a
restricted IP is still talking to AFS. Output looks like this:

15:14:17.919232 restricted-IP.afs3-callback > our-fileserver.afs3-fileserver:
    rx data fs call fetch-status fid 537307550/10054/8232 (44) (DF)
15:14:17.919656 our-fileserver.afs3-fileserver > restricted-IP.afs3-callback:
    rx data cb call whoareyou (32) (DF)
15:14:17.919783 restricted-IP.afs3-callback > our-fileserver.afs3-fileserver:
    rx data cb reply whoareyou (460) (DF)
15:14:17.920681 our-fileserver.afs3-fileserver > restricted-IP.afs3-callback:
    rx data fs reply fetch-status (148) (DF)
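
To check a longer capture for this pattern, the record header lines
can be scanned for any host other than the machine IP sending from the
afs3-callback port. This is only a rough sketch: it assumes tcpdump text
output in the format shown above, and the function names and the
"machine-IP" hostname are placeholders of mine, not anything provided by
OpenAFS or tcpdump.

```python
import re

# Header line of each record in the capture format shown above:
#   HH:MM:SS.usec  src-host.src-port > dst-host.dst-port:
HEADER = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d+)\s+(\S+)\.([\w-]+)\s+>\s+(\S+)\.([\w-]+):"
)


def cache_manager_sources(lines, callback_port="afs3-callback"):
    """Return the set of hosts seen sending from the callback port."""
    sources = set()
    for line in lines:
        match = HEADER.match(line)
        # Continuation lines ("rx data ...") do not match and are skipped.
        if match and match.group(3) == callback_port:
            sources.add(match.group(2))
    return sources


def unexpected_sources(lines, allowed_host):
    """Return sorted source hosts other than the one allowed to talk to AFS."""
    return sorted(cache_manager_sources(lines) - {allowed_host})


# The four records from the capture above, header and detail lines together.
SAMPLE = [
    "15:14:17.919232 restricted-IP.afs3-callback > our-fileserver.afs3-fileserver:",
    "rx data fs call fetch-status fid 537307550/10054/8232 (44) (DF)",
    "15:14:17.919656 our-fileserver.afs3-fileserver > restricted-IP.afs3-callback:",
    "rx data cb call whoareyou (32) (DF)",
    "15:14:17.919783 restricted-IP.afs3-callback > our-fileserver.afs3-fileserver:",
    "rx data cb reply whoareyou (460) (DF)",
    "15:14:17.920681 our-fileserver.afs3-fileserver > restricted-IP.afs3-callback:",
    "rx data fs reply fetch-status (148) (DF)",
]

if __name__ == "__main__":
    # "machine-IP" stands in for the address that is in the AFS ACL.
    print(unexpected_sources(SAMPLE, "machine-IP"))
```

If tcpdump is run with -n, the ports appear as numbers instead of
service names, so callback_port would need to be passed as the numeric
port string.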


Again, I haven't seen this problem on Red Hat 7.3 in production. Based on
tests, RH-9 and RHEL-3 appear to only talk to AFS via the "machine" IP.
On both Debian and RH, the machine IP (i.e., the one in the AFS ACL, the
one *not* assigned to any vhost) is the lowest numbered IP.

My questions are:
1) Why is the client losing access?
2) Why does NetInfo/NetRestrict not limit AFS conversation to the allowed IP?
3) What can I do to solve 1) and 2)?
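
For reference on question 2), the intent of the two files can be
sketched as follows: NetInfo lists the one address that should register,
and NetRestrict lists every other address on the interface, one dotted
address per line. The addresses below are documentation placeholders
(192.0.2.0/24), not my real ones, and the helper name is made up for
illustration; where the files live depends on how the client is packaged.

```python
def netinfo_netrestrict(all_ips, allowed_ip):
    """Build the text of NetInfo and NetRestrict for a single allowed IP.

    all_ips:    every address configured on the interface (placeholders here)
    allowed_ip: the one address that is in the AFS ACL
    """
    if allowed_ip not in all_ips:
        raise ValueError("allowed_ip must be one of the interface addresses")
    # NetInfo: only the address the client should register.
    netinfo = allowed_ip + "\n"
    # NetRestrict: every other address, one per line, in the original order.
    netrestrict = "".join(ip + "\n" for ip in all_ips if ip != allowed_ip)
    return netinfo, netrestrict


if __name__ == "__main__":
    # Placeholder addresses: the machine IP plus two vhost aliases.
    ips = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
    netinfo, netrestrict = netinfo_netrestrict(ips, "192.0.2.10")
    print("NetInfo:\n" + netinfo)
    print("NetRestrict:\n" + netrestrict)
```

With both files in place and the client restarted, 'fs gc' is what I
use to confirm which addresses actually registered.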

Many thanks!
Lin Osborne