[OpenAFS] Problems in the last 2 days

Klaas Hagemann kerberos@northsailor.de
Wed, 29 Jan 2003 17:45:18 +0100


Hi,

Sorry for replying to myself, but I think these crashes are caused
by the vos move command.
At the time of the last crash I had two moves running at the same time.
They were started from the same client, but as different AFS users and
on different volumes.

Are there any known problems with running several vos move commands at
the same time?

Klaas


Klaas Hagemann wrote:
> Hi,
> 
> I have had lots of problems with my file servers in the past few days
> and posted a lot of, let's say, not very useful messages or #bullshit#.
> 
> Sorry for that and thanks to everyone who tried to help me.
> 
> Now I think I can describe the error a bit better:
> 
> Before Thursday I had 2 management servers and 1 file server. The
> management servers only hosted read-only replicas for the
> subdirectories. All the volumes for the home directories (20000) were
> on the one file server. The system was running very stably with this
> configuration.
> 
> On Thursday evening I moved about 500 volumes with user home
> directories to another, new file server using the vos move command.
> After that I got lots of problems with the processes on my file server.
> 
> First I used the pthread fileserver. There the fileserver processes
> simply stopped working from time to time, so that the volumes were no
> longer reachable.
> The volserver kept working, so "vos examine <volume>" still returned
> successfully. The system could no longer be shut down and the
> processes had to be killed by hand.
> 
> Then I switched over to the LWP fileserver. There the fileserver process 
> itself worked fine, but after 5-6 hours the kernel was not able to
> allocate any more memory. So the whole system crashed and had to be
> rebooted with the reset button.
> Another error was that the volserver stopped working while the
> fileserver was still running. So "vos examine <volume>" returned a
> failure, but the volume was still reachable.
> 
> Then I moved all the volumes back to the first file server and now
> everything is stable again. I am quite sure I can reproduce this
> error, because nothing else happened on the network.
> 
> What may cause this problem?
> I thought of problems in the synchronization of the 2 database
> management servers. Are there any known problems when moving lots of
> volumes?
> 
> Please let me know if you need any further information.
> I use openafs-1.2.7 on SuSE Linux 7.3; /vicepa is on ext3 with LVM.
> 
> Thanks
> Klaas
> 
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>