[OpenAFS] Re: backup dump suddenly started failing - (failed - partially dumped, possible communication error)

Bastian dea1306@melvex.xs4all.nl
Sat, 09 Jan 2010 00:31:04 +0100


Andrew Deason wrote:
> On Fri, 08 Jan 2010 17:10:12 +0100
> Bastian <dea1306@melvex.xs4all.nl> wrote:
> 
>> Andrew Deason wrote:
>>> On Fri, 08 Jan 2010 14:21:17 +0100
>>> Bastian <dea1306@melvex.xs4all.nl> wrote:
>>>
>>>> Fri Jan  8 12:05:43 2010: Task 2: End of pass 1: Volumes remaining = 1
>>>> Fri Jan  8 12:05:43 2010: Task 2: Starting pass 2
>>>> Fri Jan  8 12:44:07 2010: Task 2: Volume <x> failed - partially
>>>> dumped Possible communication failure
>>> Anything in VolserLog on the server that volume <x> is on?
>>>
>>> Try 'vos dump -verbose'ing the backup volume <x>; what does it say?
>>>
>> Thanks. You are right. It seems to hang. vos dump hangs after 982M of
>> the volume (which is about one fourth) has been dumped.
> 
> This looks like bug 20727, which was fixed in 1.4.8. The following
> patches fix it:
> 
> 05369aa5551ac6913a4079be7d85ef04c2b76f52
> 41122a94a24e2e5b9389f89badc8f67b37a5312f
> 
> I don't know if 1.4.7.dfsg1-6+lenny2 contains these patches; perhaps
> someone else can say whether it does.
> 

Judging by the source code of rx.c, 1.4.7.dfsg1-6+lenny2 does not
contain these patches.

I'm not sure how this bug explains why the problem suddenly appeared
after months of successful 'backup dump' runs.

But I will try the patches and see if they solve the problem. Am I right
to assume that they need to be applied only on the file server holding
these volumes (not on the db servers or the tape coordinator)?
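
For re-testing after patching, the manual 'vos dump' could be wrapped in
a small watchdog so that a hang gets reported instead of waiting on it
indefinitely. This is only a rough sketch: the helper names, the stall
threshold and the polling interval are made up for illustration, and it
assumes that 'vos' is on the PATH, that tokens (or -localauth) are
already in place, and that the dump file keeps growing while the dump is
healthy.

```python
import os
import subprocess
import time


def build_vos_dump_cmd(volume, outfile):
    """Build a full 'vos dump' command line for one volume."""
    # -time 0 requests a full dump; -verbose prints progress details.
    return ["vos", "dump", "-id", volume, "-time", "0",
            "-file", outfile, "-verbose"]


class StallDetector:
    """Report a stall once the size has not grown for `limit` seconds."""

    def __init__(self, limit):
        self.limit = limit
        self.last_size = -1
        self.last_change = None

    def update(self, size, now):
        # Remember the time of the most recent growth of the dump file.
        if self.last_change is None or size > self.last_size:
            self.last_size = size
            self.last_change = now
        return (now - self.last_change) >= self.limit


def dump_with_watchdog(volume, outfile, limit=300, poll=5):
    """Run vos dump; return its exit code, or None if it stalled."""
    # limit and poll are arbitrary example values (seconds).
    proc = subprocess.Popen(build_vos_dump_cmd(volume, outfile))
    detector = StallDetector(limit)
    while proc.poll() is None:
        size = os.path.getsize(outfile) if os.path.exists(outfile) else 0
        if detector.update(size, time.monotonic()):
            # No growth for `limit` seconds: treat as hung and stop it.
            proc.kill()
            proc.wait()
            return None
        time.sleep(poll)
    return proc.returncode
```

Calling dump_with_watchdog("<x>.backup", "/tmp/x.dump") with the real
volume name would then return None if the dump stops growing, which is
what I see now at roughly 982M.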