[OpenAFS] Memory leak with 1.4.1 modules on Linux 2.6.16
Caskey L. Dickson
caskey@technocage.com
Fri, 21 Jul 2006 23:32:24 -0700
Jose Calhariz wrote:
> I have found a situation that can lead to a big memory leak with the
> OpenAFS modules on Linux: memory is consumed until all RAM is exhausted
> and the machine starts to thrash and swap until it dies. Is this a
> known problem?
>
> I have mounted the volume root.afs inside a directory,
> "fs mkmount -dir /afs/cell/dir/new_cells -vol root.afs",
> so I could create a mount point for the root.afs of a foreign cell.
> But I forgot to remove it afterwards with
> "fs rmmount /afs/cell/dir/new_cells".
You have created a cycle in your filesystem tree. Unix tools assume
that the structure of the file system is a tree in the ADT sense, namely
a directed acyclic graph. (Hence the default prohibition on making hard
links to directories.)
The path /afs/cell/dir/new_cells/cell/dir/new_cells/cell... produces an
infinitely deep tree.
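
To make the cycle concrete, here is a minimal sketch in Python. It is a
toy model, not anything OpenAFS provides: the GRAPH table, the node
names "root.cell" and "dir", and the walk() helper are all hypothetical
and chosen only to mirror the path above. A naive recursive walker never
runs out of entries, so it stops only when an artificial depth limit
cuts it off.

```python
# Toy model of the layout above. Every name here is hypothetical.
# Each node maps an entry name to the node that entry leads to.
GRAPH = {
    "root.afs": {"cell": "root.cell"},
    "root.cell": {"dir": "dir"},
    "dir": {"new_cells": "root.afs"},  # the stray mount point closes the loop
}


def walk(graph, node, path, max_depth):
    """Yield every path reachable from node, descending at most max_depth levels."""
    yield path
    if max_depth == 0:
        return  # without this limit the recursion would never end
    for name, child in sorted(graph.get(node, {}).items()):
        yield from walk(graph, child, path + "/" + name, max_depth - 1)


# Six levels down from /afs the walker is already on its second lap.
paths = list(walk(GRAPH, "root.afs", "/afs", 6))
for p in paths:
    print(p)
```

Raising max_depth only lengthens the output; there is no depth at which
the walker finishes on its own.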
> When my backup system (Amanda using GNU tar) started backing up my
> AFS cell, it backed up /afs/cell/dir/new_cells by mistake. The
> kernel on the machine that ran tar on the /afs/cell/dir/new_cells
> directory started to eat all the available memory. If I reboot or
> stop the OpenAFS client, everything is OK: all the memory is reclaimed
> by the kernel for use by normal programs. I don't have privileges
> to back up the foreign cell, only my local cell.
>
> After doing "fs rmmount /afs/cell/dir/new_cells" everything went
> OK, and the backups ran as usual, without memory leaks.
At this point you removed the cycle. There are a few tools that can
cope with file system cycles by various means, but I can't recommend
any. I believe I saw a script a few days ago that wraps mount point
creation and walks back up the path, verifying that the same volume
doesn't appear anywhere between the new mount point and the root. That
is one option. Another is to do such work inside a directory that does
not grant system:backup access, and to ensure that your automated
backup tools run with the correct permissions.
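
The first option, the check that walks back up the path, could look
roughly like the Python sketch below. This is not the script I saw, only
an illustration of the idea. The function names and the volume_of
callback are assumptions: volume_of stands in for whatever you use to
ask which volume a directory lives in (for instance by parsing the
output of an fs subcommand), and here it is faked with a plain
dictionary whose entries are hypothetical.

```python
import posixpath


def ancestors(path):
    """Yield each directory above path, from its parent up to the top."""
    current = posixpath.normpath(path)
    while True:
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top ("/" or ""), so stop
            return
        current = parent
        yield current


def creates_cycle(mount_dir, new_volume, volume_of):
    """Return True if new_volume already holds a directory above mount_dir.

    volume_of is a caller-supplied (hypothetical) lookup that maps a
    directory path to the name of the volume containing it.
    """
    return any(volume_of(d) == new_volume for d in ancestors(mount_dir))


# Stand-in lookup table for the layout in this thread (illustrative only).
layout = {
    "/": "local-disk",
    "/afs": "root.afs",
    "/afs/cell": "root.cell",
    "/afs/cell/dir": "root.cell",
}

# Mounting root.afs below /afs would loop back to the top of the tree.
print(creates_cycle("/afs/cell/dir/new_cells", "root.afs", layout.get))
# A volume that does not appear above the mount point is fine.
print(creates_cycle("/afs/cell/dir/new_cells", "foreign.root", layout.get))
```

A wrapper would call creates_cycle() first and only run the real
"fs mkmount" when it returns False.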
CLD