[OpenAFS] Tokens discarded during large file transfer

Christopher D. Clausen cclausen@acm.org
Mon, 12 Feb 2007 21:20:10 -0600

W. Mark Smith <online+lists.afs-info@coffeefreak.net> wrote:
> Most people who get this error have the problem immediately after
> they log in. In my case, it consistently happens during large file
> transfers. The problem occurs when I am copying a large (>1GB) file
> to an AFS directory. At about the 500MB point in the transfer, I
> get a "permission denied". I have to unlog then aklog to get my AFS
> tokens again. Here is the error code that shows up in my message
> log:

Can you provide the exact syntax of the command you are using to do the 
copy?

Can you run "id" as well and include it here?

If "id" shows two high-numbered groups, the command is probably running 
inside a PAG.  Could you try running the copy outside a PAG?  (It might 
be hard to get a session without a PAG, but an su command should work, 
provided you've disabled any AFS-related PAM modules.)  I suspect the 
issue isn't PAG-related though...
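
As a rough aid for reading the "id" output, here is a minimal sketch that 
lists the numeric group IDs at or above a cutoff.  The cutoff of 30000 and 
the sample line are assumptions made up for illustration; they are not 
OpenAFS constants, so adjust the threshold to whatever counts as 
"high-numbered" on your system.

```python
import re

def high_numbered_groups(id_output, threshold=30000):
    """Return group IDs from `id` output that are >= threshold.

    threshold is an assumed cutoff for "high-numbered", not an OpenAFS value.
    """
    match = re.search(r"groups=(\S+)", id_output)
    if not match:
        return []
    # Each entry looks like "100(users)" or a bare number such as "33554".
    gids = [int(g) for g in re.findall(r"(?:^|,)(\d+)", match.group(1))]
    return [g for g in gids if g >= threshold]

# Hypothetical sample line, not taken from the original report.
sample = "uid=1000(mark) gid=100(users) groups=100(users),33554,34212"
print(high_numbered_groups(sample))
```

Two entries in the result would match the "two high-numbered groups" 
pattern described above.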

> Feb 10 11:40:43 [hostname] kernel: afs: Tokens for user of AFS id
> [id] for cell [cell] are discarded (rxkad error=19270410)
> And the corresponding error is:
> # translate_et 19270410
> 19270410 (rxk).10 = sealed data inconsistent
> The server is compiled with large file support. When I do the same
> thing on my (slower) home network, I do not have this problem, and
> I can write files larger than 2GB.
> My configuration is RHEL4, kernel 2.6.9-42.0.8.ELsmp, and I have
> tried both openafs 1.4.1 and 1.4.2.

On the server?  Or client(s)?

What does rxdebug <server> 7000 -version return?
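
For reference, the code in the quoted log line can be split by hand to 
match the translate_et output.  This is a small sketch assuming the usual 
com_err layout, where the low 8 bits are the message index within the 
error table and the rest identifies the table; the function name is 
made up for illustration.

```python
def split_error_code(code):
    """Split a com_err-style code into (table base, offset in the table)."""
    offset = code & 0xFF   # low 8 bits: index of the message in the table
    base = code - offset   # remaining bits: identify the error table
    return base, offset

base, offset = split_error_code(19270410)
print(base, offset)  # 19270400 10
```

The offset of 10 agrees with the "(rxk).10" shown by translate_et.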

> Does anyone have any suggestions?

Is this "home network" test using the exact same client and server?  Or 
just ones with a similar configuration?

Is it possible that there is an actual error in your networking hardware 
that is breaking some of the packets under higher loads?

Can you try forcing the link speed on your client's ethernet adapter to 
10BASE, 100BASE, 1000BASE (if applicable) and see if the commands 
complete at the slower speed?  Or otherwise verify the speed and duplex 
settings between server and switch, client and switch, and on any other 
network links in between.  (I'd say to plug your client into the server 
directly using a crossover cable, or at least into the same physical 
switch, just to test, if possible.)
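
To compare runs at different link speeds, it may help to record how far 
each copy gets before it fails.  The following is a minimal sketch, not 
a replacement for the command you normally use: the function name and 
the 1 MiB chunk size are arbitrary choices for illustration.  The 
reported offset is only approximate, since a caching client may report 
an error on a later write or when the file is closed.

```python
import sys

CHUNK = 1024 * 1024  # 1 MiB per write; arbitrary size chosen for illustration

def copy_with_offset(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst in chunks.

    Returns (bytes_written, error); error is None if the copy succeeded.
    """
    written = 0
    try:
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            while True:
                data = src.read(chunk)
                if not data:
                    break
                dst.write(data)
                written += len(data)
    except OSError as exc:
        # Errors raised on write or on close both end up here.
        return written, exc
    return written, None

if __name__ == "__main__" and len(sys.argv) == 3:
    count, err = copy_with_offset(sys.argv[1], sys.argv[2])
    print(f"{count} bytes written, error: {err}")
```

Running it once per link speed and noting the byte count at failure 
would show whether the failure point moves with the transfer rate.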

Do you happen to have another OS version that you can test with? 
Windows, Solaris, or even another Linux kernel version?  Have you tried 
not using an SMP kernel (just to test)?