[OpenAFS] cache manager locked under heavy load?

Alena Manova nymano@seznam.cz
Sat, 06 Feb 2010 20:11:05 +0100 (CET)


Hello,

We have Apache web servers (with fairly high traffic) serving content from AFS. Normally the system runs fine, but at a certain point (probably related to I/O load) AFS stops responding and the system load rises massively: all of the Apache processes are stuck in the "sending reply" state. Restarting Apache restores normal operation.

The cmdebug output at that time shows messages similar to:
Lock afs_xvcache status: (writer_waitingupgrade_waiting, upgrade_locked(pid:18571 at:5), 1 read_locks(pid:16782), 954 waiters)
Lock afs_xvcache status: (writer_waitingupgrade_waiting, upgrade_locked(pid:16639 at:5), 713 waiters)
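To watch how the waiter count develops, lines like the two above can be parsed with a small script. This is only a rough sketch: the line format is assumed from these two samples alone, the function name parse_lock_line is made up for illustration, and none of it is part of cmdebug itself.

```python
import re

# Patterns inferred only from the two sample lines quoted above (an assumption,
# not a documented cmdebug output format).
LOCK_RE = re.compile(r"^Lock (?P<name>\S+) status: \((?P<body>.*)\)\s*$")
WAITERS_RE = re.compile(r"(\d+) waiters")
HOLDER_RE = re.compile(r"upgrade_locked\(pid:(\d+)")


def parse_lock_line(line):
    """Return lock name, waiter count and upgrade-lock holder pid, or None.

    Hypothetical helper: returns None for any line that does not look like
    a 'Lock <name> status: (...)' line.
    """
    m = LOCK_RE.match(line.strip())
    if m is None:
        return None
    body = m.group("body")
    waiters = WAITERS_RE.search(body)
    holder = HOLDER_RE.search(body)
    return {
        "lock": m.group("name"),
        "waiters": int(waiters.group(1)) if waiters else 0,
        "upgrade_holder_pid": int(holder.group(1)) if holder else None,
    }


# The two lines from this message, used as sample input.
SAMPLES = [
    "Lock afs_xvcache status: (writer_waitingupgrade_waiting, "
    "upgrade_locked(pid:18571 at:5), 1 read_locks(pid:16782), 954 waiters)",
    "Lock afs_xvcache status: (writer_waitingupgrade_waiting, "
    "upgrade_locked(pid:16639 at:5), 713 waiters)",
]

PARSED = [parse_lock_line(s) for s in SAMPLES]
```

Feeding it periodic cmdebug output and logging the waiter count together with the holder pid should show whether the stall builds up gradually or appears all at once.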

The cache manager has a 1 GB cache (we tried larger sizes as well, with no improvement). The AFS fileservers are fine at that time, and other clients can access them.

Does anyone have any advice on how to resolve this issue?

Thank you, Nick.

Am I right that the whole issue is related to