[OpenAFS] Weird volserver problem
Derrick J Brashear
shadow@dementia.org
Sat, 28 Jul 2007 17:18:43 -0400 (EDT)
You probably want the volserver clone locking patch, committed (I'm guessing) to
src/vol/clone.c since 1.4.4.
On Sat, 28 Jul 2007, Brian Sebby wrote:
> We're having a strange problem that just started happening this afternoon
> on one of our fileservers that appears to be related to the volserver.
>
> We have a number of jobs that perform vos release commands, and today we
> started getting error messages from them indicating that they were timing
> out, etc. Trying to run various "vos" commands takes forever, and although
> they eventually return the information, they sit there for several minutes
> before they succeed.
>
> I'm seeing a number of messages like this in the VolserLog file:
>
> Sat Jul 28 16:02:11 2007 trans 60 on volume 1818569609 has been idle for more than 570 seconds
> Sat Jul 28 16:02:11 2007 trans 55 on volume 1818569660 has been idle for more than 600 seconds
> Sat Jul 28 16:02:11 2007 trans 55 on volume 1818569660 has timed out
> Sat Jul 28 16:02:41 2007 trans 60 on volume 1818569609 has been idle for more than 600 seconds
> Sat Jul 28 16:02:41 2007 trans 60 on volume 1818569609 has timed out
>
> and
>
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
> Sat Jul 28 15:59:41 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
>
> These volumes are on SAN storage, using ZFS as the backend fileserver.
> We're running the 1.4.4 namei fileserver on Solaris with the -nofsync patch.
>
> Here are the bos parameters we're using:
>
> Instance fs, (type is fs) currently running normally.
> Auxiliary status is: file server running.
> Process last started at Sat Jul 28 15:50:38 2007 (3 proc starts)
> Last exit at Sat Jul 28 15:50:38 2007
> Command 1 is '/usr/afs/bin/fileserver -nojumbo -nofsync'
> Command 2 is '/usr/afs/bin/volserver -nojumbo -nofsync'
> Command 3 is '/usr/afs/bin/salvager'
>
> Any help would be greatly appreciated.
>
>
> Brian
>
> --
> Brian Sebby (sebby@anl.gov) | Unix and Operation Services
> Phone: +1 630.252.9935 | Computing and Information Systems
> Fax: +1 630.252.4601 | Argonne National Laboratory
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
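
For anyone trying to see which transactions are stuck, the idle and timed-out
messages quoted above can be grouped by transaction and volume. What follows is
only a rough sketch: it assumes the VolserLog lines look exactly like the ones
quoted in this thread, and the function name and sample list are made up for
illustration, not part of OpenAFS.

```python
import re
from collections import defaultdict

# Matches the two message shapes quoted in this thread:
#   "trans N on volume V has been idle for more than S seconds"
#   "trans N on volume V has timed out"
# The pattern is an assumption based only on those quoted lines.
TRANS_RE = re.compile(
    r"trans (?P<trans>\d+) on volume (?P<volume>\d+) has "
    r"(?:been idle for more than (?P<idle>\d+) seconds|timed out)"
)


def summarize_volser_log(lines):
    """Group idle/timeout messages by (transaction id, volume id).

    Returns a dict mapping (trans, volume) to a dict holding the largest
    idle time reported and whether a timeout was logged.
    """
    summary = defaultdict(lambda: {"max_idle": 0, "timed_out": False})
    for line in lines:
        match = TRANS_RE.search(line)
        if match is None:
            continue  # not an idle/timeout message, e.g. a DumpVolume error
        key = (int(match.group("trans")), int(match.group("volume")))
        idle = match.group("idle")
        if idle is not None:
            summary[key]["max_idle"] = max(summary[key]["max_idle"], int(idle))
        else:
            summary[key]["timed_out"] = True
    return dict(summary)


# Sample input copied from the log excerpt quoted above.
SAMPLE = [
    "Sat Jul 28 16:02:11 2007 trans 60 on volume 1818569609 has been idle for more than 570 seconds",
    "Sat Jul 28 16:02:11 2007 trans 55 on volume 1818569660 has been idle for more than 600 seconds",
    "Sat Jul 28 16:02:11 2007 trans 55 on volume 1818569660 has timed out",
    "Sat Jul 28 16:02:41 2007 trans 60 on volume 1818569609 has been idle for more than 600 seconds",
    "Sat Jul 28 16:02:41 2007 trans 60 on volume 1818569609 has timed out",
]

if __name__ == "__main__":
    for (trans, volume), info in sorted(summarize_volser_log(SAMPLE).items()):
        print(trans, volume, info["max_idle"], info["timed_out"])
```

To run it against a real log, pass an open file object instead of SAMPLE; the
log's location depends on the installation, so no path is assumed here.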