Juha Jäykkä
Thu, 25 May 2006 12:05:21 +0300

> Do you by chance have the File Alteration Monitor installed?

FAM produced more problems (2) than it solved (0), so it went out the
nearest window already months ago. Thanks for the suggestion, though.

As for the other replies I've got since I last posted...

Derrick Brashear seems to think I'm banging my head against a wall by not
upgrading to 1.4.1; well, he may be correct, but upgrading a live
environment is never something I look forward to.

Derrick: you said "good luck" if I tried client 1.4.1 with 1.3.81
servers; why? I'll go for 1.4.1 on the clients if it works with older
servers, and we'll be installing a new fileserver during the summer, so I
think we can also upgrade the servers at that time. Is the upgrade
nothing more than shutting down all the server processes, upgrading the
binaries, and restarting? With no fear of anything going wrong?

Also, what I'm seeing is NOT .parentlock getting stuck. It's access to a
(seemingly random) file descriptor getting stuck with stuff like this in
strace (long lines for you, once again =) ):

15766 read(5, "\1\0\v\0\0\0c\0", 8)     = 8
15766 read(5,"\200\35,\4\0\0\300\5\377\377\37\0\0\1\0\0\24\0\377\377"..., 396) = 396
15766 write(5,"7\0\5\0\0\0\300\5D\0\0\0\10\0\0\0\377\377\377\0b\0\5\0"..., 64) = 64
15766 read(5, 0xbffff0d0, 32)           = -1 EAGAIN (Resource temporarily

While this does not prevent firefox from starting, it makes it take the
five minutes I mentioned.
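
For anyone reading the last strace line above: EAGAIN from read() means
the descriptor is in non-blocking mode and no data was available at that
moment, so the caller is expected to wait and retry. A minimal Python
sketch of that behaviour follows. It uses a local pipe as a stand-in for
descriptor 5, and the helper name read_nonblocking is hypothetical; it
illustrates the error code only and says nothing about what firefox is
actually waiting for.

```python
import errno
import os


def read_nonblocking(fd, size):
    """Return the bytes read, or None if the read would block (EAGAIN)."""
    try:
        return os.read(fd, size)
    except BlockingIOError as exc:
        # Raised for EAGAIN / EWOULDBLOCK: no data is available right now.
        assert exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        return None


# A pipe stands in for the descriptor seen in the trace (an assumption).
r, w = os.pipe()
os.set_blocking(r, False)

first = read_nonblocking(r, 32)    # nothing has been written yet
os.write(w, b"reply")
second = read_nonblocking(r, 32)   # data is available now

os.close(r)
os.close(w)
```

A process that sees this result normally goes back to poll() or select()
until the peer answers, so the long delay would be spent waiting for the
other end rather than in the read() call itself.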

However, I have started to think this may not be related to AFS after
all. While doing my latest tests just five minutes ago, I copied a
350 kB file over the network from the machine in question. It took over
two minutes on 100BaseTX-FD! There was no other load (network or CPU) on
either machine, and the file was not even in AFS space on either of
them. So I tried this over an ssh connection (the file is some 200 kB):

~> time cat <same file> > /dev/null
real    0m0.026s
user    0m0.003s
sys     0m0.008s

Notice: this should now come from cache!
~> time cat <a short file>
real    2m12.788s
user    0m0.000s
sys     0m0.021s
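
To put the numbers above in perspective, here is a rough throughput
calculation in Python. It is a sketch under stated assumptions: 1 kB is
taken as 1000 bytes, "over two minutes" is rounded down to 120 s (so the
copy rate is an upper bound), the "some 200 kB" figure is assumed to
describe the file printed by the second cat, and protocol overhead on
the link is ignored.

```python
def throughput_bytes_per_s(size_bytes, seconds):
    """Average transfer rate in bytes per second."""
    return size_bytes / seconds


# Nominal 100 Mbit/s line rate in bytes per second, ignoring overhead.
LINE_RATE = 100e6 / 8

# 350 kB copy that took "over two minutes": 120 s gives an upper bound.
copy_rate = throughput_bytes_per_s(350_000, 120.0)

# ~200 kB printed over ssh in 2m12.788s (file size is an assumption).
cat_rate = throughput_bytes_per_s(200_000, 2 * 60 + 12.788)

print(f"copy: at most {copy_rate:.0f} B/s")
print(f"cat over ssh: about {cat_rate:.0f} B/s")
print(f"line rate is roughly {LINE_RATE / copy_rate:.0f}x the copy rate")
```

Either way the observed rate is a few kB/s at best, several thousand
times below what the link should deliver.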

Something is wrong with the network, definitely. I'm sorry I thought this
was AFS-related; it now seems not to be the case. I'm not acquitting AFS
yet, though. =)

Thanks for all the input, also. This is by far the most useful mailing
list I've ever stumbled across.


		Juha Jäykkä, juolja@utu.fi
		home: http://www.utu.fi/~juolja/

