[OpenAFS] 'vos' command does not finish, file service works ok (sort of)

Andreas Hirczy ahi@itp.tugraz.at
Wed, 23 Jul 2008 19:57:53 +0200


Hi all!

My AFS cell works ok in most scenarios, but since a reboot of one DB server
last Friday no vos command besides "vos help" finishes, e.g. "vos exa
root.afs -localauth -verbose" hangs indefinitely and does not produce any
output. Log files are also basically empty. File access works perfectly, but I
cannot create or move volumes and, of course, cannot run backups.
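
To make the hang reproducible without blocking a shell, the command can be
wrapped in a timeout. This is only a minimal sketch, assuming Python 3 is
available on the server; the helper name run_with_timeout and the 30 second
default are my own choices for illustration and are not part of OpenAFS.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=30.0):
    """Run cmd (a list of arguments) and report whether it finished in time.

    Returns a tuple (finished, returncode, output). If the command is still
    running after timeout_s seconds it is killed, finished is False and
    returncode is None.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # merge stderr into the captured output
            universal_newlines=True,    # decode output as text
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # Keep whatever was printed before the hang, if anything.
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return False, None, partial
    return True, proc.returncode, proc.stdout
```

Calling run_with_timeout(["vos", "exa", "root.afs", "-localauth", "-verbose"])
on an affected machine should then return with finished set to False instead
of hanging, together with any partial output that was produced.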

check_bos, check_udebug on ports 7002 and 7003, check_afsspace and
check_rxdebug from <http://www.eyrie.org/~eagle/software/afs-monitor/> all
tell me everything is ok. Output from "udebug" also looks fine.

After a restart of all our servers later today (we had a scheduled power
outage for a few hours anyway), one of the file servers salvages at every
restart and check_rxdebug tells me about 66 blocked connections. There is no
sign of a hardware problem with the disks (SW raid1, active and synced), and
the system load is very low.

Please provide some tips on how I can find more information about our systems'
behaviour.

I am running 1.4.7 from Debian unstable, recompiled on Debian stable with a
custom Linux kernel 2.6.25.11. As far as I can tell, the only unusual setup in
this cell is that the DB servers are configured on a secondary IP address
provided by fake (ARP poisoning), since I have no direct control over DNS
aliases.

Best regards,
Andreas
-- 
Andreas Hirczy <ahi@itp.tugraz.at>                   http://itp.tugraz.at/~ahi/
Graz University of Technology                        phone: +43/316/873-   8190
Institute of Theoretical and Computational Physics     fax: +43/316/873-10 8190
Petersgasse 16, A-8010 Graz                         mobile: +43/664/859 23 57