[OpenAFS] Problem with openafs-1.3.79 on kernel 2.6.11

Dr A V Le Blanc <LeBlanc@mcc.ac.uk>
Thu, 10 Mar 2005 09:25:14 +0000


I (Dr A V Le Blanc) wrote:
> After some careful testing on various smaller machines, we decided
> to move one of our web servers to OpenAFS 1.3.79 with kernel 2.6.11;
> with the exception of the kernel, we are using a fairly standard
> Debian sarge system, with glibc 2.3.2.

On Tue, Mar 08, 2005 at 12:50:48PM -0800, Mike Fedyk wrote:
> Why aren't you using 2.6.8-13 that is in sarge?

Not that it's relevant, or of any interest, but we need a kernel
patch that isn't in the standard kernel.

> In about three hours, and well before our normal daily peak stress,
> AFS became partially unusable.  Attempts to write produced a message
> that the quota was exhausted, which wasn't the case; I tried
> stopping AFS, but it wouldn't stop cleanly, claiming that the
> filesystem was still in use, even though no processes were running
> that could be using it.  My conclusion: 1.3.79 isn't quite up to
> production quality, regrettably.

Mike Fedyk wrote:
> Exact error messages would be more helpful than a summary.

I realise that my summary can't help diagnose the problem (which looks
similar to one reported for a different platform for 1.3.79), but I
think people need to know that there are still problems with 1.3.79.
It apparently ran without problems for three hours on a heavily loaded
machine, after which the logs begin to show this sort of message:

Mar  7 11:05:36 xxxx kernel: afs: failed to store file (partition full)

Note that there was no full partition anywhere in the system.  Some
while later, we see messages like this:

Mar  7 11:25:51 xxxx kernel: AFS_VMA_CLOSE(1553): Skipping Already locked vcp=e62d5f38 vmap=e62d5f48
Mar  7 11:27:46 xxxx kernel: AFS_VMA_CLOSE(31284): Skipping Already locked vcp=e9299f38 vmap=e9299f48
Mar  7 11:27:57 xxxx kernel: AFS_VMA_CLOSE(31282): Skipping Already locked vcp=e9351f38 vmap=e9351f48
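
For anyone who wants to check their own logs for the same two kinds of
message, here is a minimal sketch in Python.  It assumes the lines look
exactly like the ones quoted above; the function name scan and the field
labels (num, vcp, vmap) are made up for this example, and the number in
parentheses is only presumed to be a process ID.

```python
import re

# Matches the "failed to store file" message quoted above.
STORE_FAIL = re.compile(r"kernel: afs: failed to store file \((?P<reason>[^)]*)\)")

# Matches the AFS_VMA_CLOSE message quoted above. The group names are
# labels chosen for this sketch, not names taken from OpenAFS.
VMA_CLOSE = re.compile(
    r"kernel: AFS_VMA_CLOSE\((?P<num>\d+)\): Skipping Already locked "
    r"vcp=(?P<vcp>[0-9a-f]+) vmap=(?P<vmap>[0-9a-f]+)"
)


def scan(lines):
    """Return (store_failures, vma_skips) found in an iterable of syslog lines."""
    store_failures = []
    vma_skips = []
    for line in lines:
        m = STORE_FAIL.search(line)
        if m:
            store_failures.append(m.group("reason"))
            continue
        m = VMA_CLOSE.search(line)
        if m:
            vma_skips.append((int(m.group("num")), m.group("vcp"), m.group("vmap")))
    return store_failures, vma_skips


# Two of the lines quoted in this message, used as sample input.
SAMPLE = [
    "Mar  7 11:05:36 xxxx kernel: afs: failed to store file (partition full)",
    "Mar  7 11:25:51 xxxx kernel: AFS_VMA_CLOSE(1553): Skipping Already locked "
    "vcp=e62d5f38 vmap=e62d5f48",
]

if __name__ == "__main__":
    failures, skips = scan(SAMPLE)
    print(len(failures), "store failures;", len(skips), "skipped VMA closes")
```

To run it against a real log, pass an open file instead of SAMPLE, for
example scan(open(path)) with path set to wherever your kernel messages
are written.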

When I discovered the problem, I managed to get afsd to shut down, but
I was then unable to restart AFS because the kernel module would not
load.  So I rebooted the machine.

     -- Owen
     LeBlanc@mcc.ac.uk