[OpenAFS-devel] fileserver loop

Thomas Mueller thomas.mueller@hrz.tu-chemnitz.de
Thu, 22 Aug 2002 06:51:11 +0200 (MEST)


On Wed, 21 Aug 2002, Russ Allbery wrote:

> Could you let me know some more about the symptoms of this problem?  We've
> been seeing some significant difficulties periodically in our cell since
> we upgraded to OpenAFS 1.2.6 file servers, and I'm wondering if this is
> the same thing.

As I can see from the other answers to your question, the problem we saw
is a different one. It has existed for a long time; it was introduced
somewhere between AFS 3.4a and 3.6.

The patch

http://www.openafs.org/cgi-bin/wdelta/viced-callback-avoid-potential-looping-problem-20020201

was already an attempt to solve it.

In our case the fileserver was up and running just fine, but then
stopped responding to all requests while the fileserver process
consumed all CPU cycles.
If you increase the loglevel of the fileserver you will see
thousands of lines such as

Wed Jul 10 17:24:59 2002 GSS: Delete longest inactive host <ipaddr>

in /usr/afs/logs/FileLog.

These lines were produced by several threads of the fileserver.
Each thread tries to get space in the callback list, which has no free
entries.
The problem is that the function GetSomeSpace_r() in viced/callback.c runs
in an infinite loop if at least two client hosts are holding callbacks
while their entries in the hostList are locked.
The enumeration function that is called within a do ... while
loop to traverse the hostList (enumerate_r() in viced/host.c) starts at
the beginning of the hostList each time and therefore always returns one
of those two hosts. So the end of the hostList is never reached.
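
The looping behaviour described above can be reproduced in a small toy
model. This is only a sketch under assumptions, not the code from
viced/callback.c or viced/host.c: the struct toy_host, the functions
pick_from_head() and toy_get_some_space(), and the rule that a pass only
avoids the host picked in the previous pass are all hypothetical and
chosen for illustration. The max_passes cap exists only so that the
sketch terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a host entry; NOT the real viced struct. */
struct toy_host {
    int locked;             /* nonzero: entry is locked, callbacks cannot be freed */
    int callbacks;          /* number of callbacks this host still holds */
    struct toy_host *next;  /* next entry in the toy host list */
};

/* Always scans from the head of the list and returns the first host that
 * still holds callbacks and is not `skip`; NULL means the end was reached. */
static struct toy_host *pick_from_head(struct toy_host *head,
                                       const struct toy_host *skip)
{
    for (struct toy_host *h = head; h != NULL; h = h->next)
        if (h->callbacks > 0 && h != skip)
            return h;
    return NULL;
}

/* Repeats the scan until it reaches the end of the list.
 * Returns the number of passes needed, or -1 if max_passes was exhausted,
 * which corresponds to the endless loop in this toy model. */
int toy_get_some_space(struct toy_host *head, int max_passes)
{
    struct toy_host *last = NULL;  /* host picked in the previous pass */

    assert(max_passes > 0);
    for (int pass = 1; pass <= max_passes; pass++) {
        struct toy_host *h = pick_from_head(head, last);
        if (h == NULL)
            return pass;           /* end of list reached, loop can stop */
        if (!h->locked)
            h->callbacks = 0;      /* callbacks freed */
        last = h;                  /* a locked host stays unchanged */
    }
    return -1;
}
```

With two locked hosts that both hold callbacks, every pass returns one of
the two in turn, so toy_get_some_space() returns -1 for any cap. With a
single locked host the second pass reaches the end of the list and the
function returns 2.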

My patch makes sure that a thread traverses the hostList only once,
so that the surrounding loop has a chance to stop.
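
The single-traversal idea can be sketched as follows. This is a minimal
illustration and not the patch itself: struct toy_host and
toy_reclaim_single_pass() are hypothetical names, and the `wanted`
parameter is an assumption standing in for "enough free entries". The
point is only that a cursor which moves forward and never restarts
guarantees termination even when hosts are locked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a host entry; NOT the real viced struct. */
struct toy_host {
    int locked;             /* nonzero: entry is locked, must be skipped */
    int callbacks;          /* number of callbacks this host still holds */
    struct toy_host *next;  /* next entry in the toy host list */
};

/* Walks the list exactly once, freeing callbacks of unlocked hosts until
 * `wanted` callbacks were freed or the end of the list is reached.
 * Returns the number of callbacks freed (possibly 0). */
int toy_reclaim_single_pass(struct toy_host *head, int wanted)
{
    int freed = 0;

    assert(wanted > 0);
    for (struct toy_host *h = head; h != NULL && freed < wanted; h = h->next) {
        if (h->locked || h->callbacks == 0)
            continue;              /* skip, but keep moving forward */
        freed += h->callbacks;
        h->callbacks = 0;
    }
    return freed;
}
```

If every host is locked the function returns 0 after one pass, so the
caller can decide what to do next instead of spinning.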

Thomas.
-- 
-------------------------------------------------
Thomas Müller, TU Chemnitz, URZ, D-09107 Chemnitz
Tel: +49 (0)371 5311755   Fax: +49 (0)371 5311629
-------------------------------------------------