[OpenAFS] vos dump has timeout 700 second if vlserver down

Hartmut Reuter reuter@rzg.mpg.de
Thu, 14 Aug 2008 11:58:11 +0200


Michal Svamberg wrote:
> Hi,
> I have 3 vlservers. When one of these servers is down, 'vos dump'
> waits for a long time.
> The timeout is defined in the function DumpVolume() at volser/vos.c:
> rx_SetRxDeadTime(60 * 10);
> With this parameter, the timeout is exactly 700 seconds (measured
> with wireshark).
> Changing the parameter to 10*10 leads to a timeout of 112 seconds.
> 
> In the attachment, I send the wireshark dump of communications of
> 'vos dump' with vlserver (147.228.10.17 is down).
> 
> Why do other OpenAFS commands have a smaller timeout (approx. 12 seconds)?

Because the old (non-pthreaded) volserver, when it asked the fileserver 
for a volume, blocked in the read on the socket and had no chance to 
serve Rx requests in the meantime.

> Why does 'vos dump' have such a big timeout?
> Is there any option to change it?

If you know one of the vlservers is dead, take it out of the CellServDB 
on the machine where you run the vos dump.
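
If you have to do this often, the edit can be scripted. The following is 
only a rough sketch, assuming the usual CellServDB layout (a ">cellname" 
header line followed by one "IP #hostname" line per database server). The 
function name drop_server, the cell name example.org, the hostnames and 
the 192.0.2.x addresses are made up for illustration; only 147.228.10.17 
comes from your report. Work on a copy and check the result before 
replacing the real file (often /usr/vice/etc/CellServDB, but the location 
depends on the installation).

```python
def drop_server(cellservdb_text: str, cell: str, dead_ip: str) -> str:
    """Return the CellServDB text without the line for dead_ip in `cell`.

    Hypothetical helper: it only filters text and does not touch any file.
    """
    kept = []
    in_cell = False
    for line in cellservdb_text.splitlines():
        if line.startswith(">"):
            # Cell header, e.g. ">example.org   #description"
            name = line[1:].split("#", 1)[0].strip()
            in_cell = (name == cell)
        elif in_cell:
            # Server line, e.g. "192.0.2.1   #afsdb1.example.org"
            fields = line.split("#", 1)[0].split()
            if fields and fields[0] == dead_ip:
                continue  # skip the dead vlserver
        kept.append(line)
    return "\n".join(kept) + "\n"


# Made-up sample entry; 147.228.10.17 is the server reported as down.
SAMPLE = (
    ">example.org            #Hypothetical cell\n"
    "192.0.2.1               #afsdb1.example.org\n"
    "147.228.10.17           #afsdb2.example.org\n"
    "192.0.2.3               #afsdb3.example.org\n"
)

if __name__ == "__main__":
    print(drop_server(SAMPLE, "example.org", "147.228.10.17"), end="")
```

Entries of other cells are left alone, so the same address listed under 
a different cell header would not be removed.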

> 
> I have big problems when one vlserver is down and I am creating dumps
> of thousands of volumes.
> I use bacula for creating backups.
> 
> Thanks for responses.
> Michal Svamberg


-- 
-----------------------------------------------------------------
Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
			   	phone 		 +49-89-3299-1328
			   	fax   		 +49-89-3299-1301
RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------