[OpenAFS] vos move failures?

Wed, 21 Oct 2009 11:20:56 -0400

On Wed, Oct 21, 2009 at 11:06 AM, Eric Chris Garrison
<ecgarris@iupui.edu> wrote:
> Derrick Brashear wrote:
>> On Wed, Oct 21, 2009 at 10:39 AM, Eric Chris Garrison
>> <ecgarris@iupui.edu> wrote:
>>> I've been having trouble with moving some of our larger volumes (>200GB in
>>> size):
>>>
>>> [root@rfs3 qla2xxx]# vos move public.sudoc_01 rfs3 h rfs3 e -localauth
>>>
>>> Failed to move data for the volume 536878704
>>>   Possible communication failure
>>> vos move: operation interrupted, cleanup in progress...
>>> clear transaction contexts
>>> move incomplete - attempt cleanup of target partition - no guarantee
>>> cleanup complete - user verify desired result
>>>
>>> The above took an hour or two, and I've seen it take longer to fail.
>>> Large volume moves don't always fail, just quite often.
>>>
>>> Another question is, if it's local to the same machine from one partition
>>> to another, why does 200GB take hours to "vos move"? Any ideas?
>>>
>>> I'm using openafs-server-1.4.11-1.1.1 on RHEL 4.
>>
>> What's in the VolserLog?
>
> Wed Oct 21 10:05:19 2009 1 Volser: Clone: Cloning volume 536878704 to new
> volume 536890205
> Wed Oct 21 10:05:19 2009 VAttachVolume: Failed to open
> /vicepe/V0536878704.vol (errno 2)
>
> Also... on the retry that's running right now, I see a lot of these:
>
> Wed Oct 21 11:03:17 2009 trans 8 on volume 536890205 is older than 3450
> seconds
>

If a volserver crashes and gets restarted, you'll want to look at the
previous log.
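
For picking the stale-transaction messages out of a log, a rough sketch in Python follows. The regex is modeled only on the single "is older than" line quoted above, and the function name is made up for illustration. Where the previous log lives (commonly a VolserLog.old next to VolserLog) depends on how the packages were built, so that part is an assumption too.

```python
import re
from collections import namedtuple

# Hypothetical record type, for illustration only.
StaleTrans = namedtuple("StaleTrans", "timestamp trans volume seconds")

# Modeled on the one log line quoted in this thread, e.g.:
#   Wed Oct 21 11:03:17 2009 trans 8 on volume 536890205 is older than 3450 seconds
# Other volserver versions may word this differently (assumption).
STALE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) "
    r"trans (?P<trans>\d+) on volume (?P<vol>\d+) "
    r"is older than (?P<secs>\d+) seconds"
)


def find_stale_transactions(log_text):
    """Return one StaleTrans per 'is older than' line found in log_text."""
    hits = []
    for line in log_text.splitlines():
        m = STALE_RE.match(line.strip())
        if m:
            hits.append(
                StaleTrans(
                    m.group("ts"),
                    int(m.group("trans")),
                    int(m.group("vol")),
                    int(m.group("secs")),
                )
            )
    return hits
```

Read the old log into a string and pass it to find_stale_transactions(). Note that the log lines quoted in this mail are wrapped by the mail client; in the actual file each message should be on a single line.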

Steven