[OpenAFS] vos move & vos rel failing on large volumes
Todd DeSantis
atd@us.ibm.com
Fri, 11 Apr 2003 07:44:32 -0400
Hi -
I'm glad you found at least one fileserver machine with a faulty
duplex setting. Over the years we have found that auto-negotiation
between cards and switches often misbehaves, and that you need to
set the speed and duplex explicitly.
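As an illustration, here is one way to hard-set a Linux NIC to 100 Mbit
full duplex (the interface name eth0 is a placeholder, and the matching
switch port has to be forced to the same settings or you just move the
mismatch):

```shell
# Force 100baseTx full duplex instead of relying on auto-negotiation.
# eth0 is a placeholder; the switch port must be configured to match.
ethtool -s eth0 autoneg off speed 100 duplex full

# On older systems without ethtool, mii-tool can do the same:
# mii-tool -F 100baseTx-FD eth0
```

Note these settings do not survive a reboot; they normally also go into
the distribution's network configuration.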
Maybe now that this one problem is cleared up, it will help with the
overall health of the network, and maybe the long-running "vos move"
will move along. Do you know if it hits the problem in the same time
span or the same spot? I know this would be very difficult to catch.
Maybe you could run scripts against the source and destination
volservers every 30 seconds to see how the transaction is proceeding:
# csh loop; substitute the volserver hostname for <server>
touch vollog.<server>
while (1)
    date >> vollog.<server>
    vos status <server> >> vollog.<server>
    sleep 30
end
Maybe this will show that the transaction stops sending or receiving
data at some point while the timer continues for 10-plus hours.
The Meltdown output for your destination volserver will not show any
real resends; only the source volserver will show those.
The rxkad error shows that the packet does not have a proper length,
so it takes the error path.
Thanks
Todd
Lee Damon <nomad@crow.ee.washington.edu>
To: Todd DeSantis/Pittsburgh/IBM@ibmus, Ray Pasetes <rayp@fnal.gov>, shadow@dementia.org
cc: openafs-info@openafs.org, nikola@crow.ee.washington.edu
04/10/2003 12:52 PM
Subject: Re: [OpenAFS] vos move & vos rel failing on large volumes
I've cut and pasted a bunch of different suggestions into this email and am
replying to each one in turn. I would like to thank everyone who has
replied with suggestions or timing information.
> been able to move them. You may want to issue your vos move command
> again with the -verbose option to see if there is more info on what
[root@harrier root]# time vos move -verbose -fromserver kite -frompartition /vicepc -toserver potential -topartition /vicepf -id v.s8.cadence.2002a
Starting transaction on source volume 536872898 ... done
Cloning source volume 536872898 ... done
Ending the transaction on the source volume 536872898 ... done
Starting transaction on the cloned volume 1443364917 ... done
Deleting pre-existing volume 536872898 on destination ... done
Creating the destination volume 536872898 ... done
Dumping from clone 1443364917 on source to volume 536872898 on destination ...
Failed to move data for the volume 536872898
rxk: sealed data inconsistent
vos move: operation interrupted, cleanup in progress...
clear transaction contexts
access VLDB
FATAL: VLDB access error: abort cleanup
cleanup complete - user verify desired result
real 648m58.951s
user 0m0.010s
sys 0m0.006s
>me either. i'd sort of like to see volserver strace/truss/ptrace/whatever
>output while such a slow thing is happening.
Derrick: I'll send this directly, it's *very* verbose, as one would expect
(22MB of output from running strace for about 4 minutes.)
> Meltdown.pl -s <src volserver> -p 7005 -t 30 (seconds)
>
> and I want to pay attention to the "resend" column.
This looks telling...
/tmp/Meltdown.pl -s kite -p 7005 -t 30
Server: kite, Port: 7005, Interval 30 seconds
Thu Apr 10 09:23:30 PDT 2003
hh:mm:ss wproc nobufs wpack fpack calls delta data     resends idle
09:23:30     0      0     0   464  8358     0 12387116  686973   11
09:24:00     0      0     0   464  8358     0 12387116  686973   11
09:24:30     0      0     0   464  8358     0 12387116  686973   11
09:25:00     0      0     0   464  8369    11 12387140  686974   11
09:25:30     0      0     0   432  8369     0 12389368  687103   10
09:26:00     0      0     0   432  8369     0 12392574  687283   10
09:26:30     0      0     0   432  8369     0 12395914  687468   10
09:27:00     0      0     0   432  8369     0 12398900  687639   10
09:27:30     0      0     0   432  8369     0 12401784  687806   10
09:28:00     0      0     0   432  8369     0 12405380  687998   10
09:28:30     0      0     0   432  8369     0 12408306  688166   10
09:29:00     0      0     0   432  8369     0 12411452  688340   10
And, for the fileserver we're trying to move to:
/tmp/Meltdown.pl -s potential -p 7005 -t 30
Server: potential, Port: 7005, Interval 30 seconds
Thu Apr 10 09:26:52 PDT 2003
hh:mm:ss wproc nobufs wpack fpack calls delta data     resends idle
09:26:52     0      0     0   462  2310     0  3812638  253710   10
09:27:22     0      0     0   462  2310     0  3812638  253710   10
09:27:52     0      0     0   462  2310     0  3812638  253710   10
09:28:22     0      0     0   462  2310     0  3812638  253710   10
09:28:52     0      0     0   462  2310     0  3812638  253710   10
09:29:22     0      0     0   462  2310     0  3812638  253710   10
09:29:52     0      0     0   462  2310     0  3812638  253710   10
09:30:22     0      0     0   462  2310     0  3812638  253710   10
09:30:52     0      0     0   462  2310     0  3812638  253710   10
09:31:22     0      0     0   462  2310     0  3812638  253710   10
09:31:52     0      0     0   462  2310     0  3812638  253710   10
09:32:22     0      0     0   462  2310     0  3812638  253710   10
09:32:52     0      0     0   462  2310     0  3812638  253710   10
> If this column is increasing while the release/move is in
> progress then I would want to look at the routes between
> these 2 volserver machines.
The routes look clean, but they are something I have very little control
over. I'm trying to get networking involved; we'll see. One thing I
discovered yesterday is that the new fileserver (potential) was ordered
with 100baseT cards instead of gig, but I don't think that is sufficient
to explain the problem.
The thing is, we were seeing slow moves before we brought potential in.
Our first two fileservers, kite & harrier, are connected directly to
the same gig switch, and large releases were taking hours there too. I'm
going to do the same set of tests (with a much smaller volume :) between
them and see if resends climb while that is happening. Since I
don't want to leave the move in an "indeterminate state", I'll have to wait
for that to time out and die of its own unnatural causes first.
> Are the duplex settings set properly on the network card and
> switch port? "netstat -ni" can show collisions if this is
> a problem.
netstat -ni shows no collisions or other errors on any interface on any of
our fileservers.
> Can the network handle fragmented packets ?
yes, I believe so.
> How FULL is the destination vice partition ? Does it have
> ample free inodes around ?
Darn near empty. It's a new fileserver.
> I would also look at the volserver options
> - using -udpsize 1048576
I'm a bit lost as to how to reset BosConfig to pass this option on to
volserver without trashing my entire cell. Any hints for a lost soul?
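For reference, that option would normally go on the volserver parms line
of the fs bnode in BosConfig on the fileserver. A sketch of what the
entry would look like, assuming a standard Transarc-style install under
/usr/afs (verify the paths and the rest of the entry against your own
BosConfig before changing anything, and restart the fs instance
afterwards for it to take effect):

```
bnode fs fs 1
parms /usr/afs/bin/fileserver
parms /usr/afs/bin/volserver -udpsize 1048576
parms /usr/afs/bin/salvager
end
```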
> Are there any other vos transactions happening on the src or
> destination server during these times?
Over the duration of a 10+ hour move, a few other vos releases
happen.
> vos status <src volserver>
> vos status <dst volserver>
Here's a snapshot from just now:
[root@harrier root]# vos status kite
Total transactions: 1
--------------------------------------
transaction: 7376 created: Thu Apr 10 09:25:01 2003
attachFlags: offline
volumeStatus:
volume: 1443364918 partition: /vicepc procedure: Forward
packetRead: 1 lastReceiveTime: 96171503 packetSend: 41257 lastSendTime: 96171503
--------------------------------------
[root@harrier root]# vos status potential
Total transactions: 1
--------------------------------------
transaction: 1815 created: Thu Apr 10 09:25:11 2003
volumeStatus:
volume: 536872898 partition: /vicepf procedure: Restore
packetRead: 41366 lastReceiveTime: 96171459 packetSend: 1 lastSendTime: 96171459
--------------------------------------
Another interesting tidbit is in the VolserLog on kite. Volume 1443364918
is the clone kite is forwarding for the move; 536872898 is the source
volume (on kite) and the volume being restored on potential.
Thu Apr 10 09:38:01 2003 trans 7376 on volume 1443364918 is older than 780 seconds
Thu Apr 10 09:38:31 2003 trans 7376 on volume 1443364918 is older than 810 seconds
Thu Apr 10 09:39:01 2003 trans 7376 on volume 1443364918 is older than 840 seconds
Thu Apr 10 09:39:31 2003 trans 7376 on volume 1443364918 is older than 870 seconds
Thu Apr 10 09:40:01 2003 trans 7376 on volume 1443364918 is older than 900 seconds
Thu Apr 10 09:40:31 2003 trans 7376 on volume 1443364918 is older than 930 seconds
Thu Apr 10 09:41:01 2003 trans 7376 on volume 1443364918 is older than 960 seconds
The previous move said that for a very long time, then started saying the
trans was idle:
Thu Apr 10 01:06:01 2003 trans 7236 on volume 1443364917 is older than 38730 seconds
Thu Apr 10 01:06:31 2003 trans 7236 on volume 1443364917 is older than 38760 seconds
Thu Apr 10 01:07:01 2003 trans 7236 on volume 1443364917 is older than 38790 seconds
Thu Apr 10 01:12:31 2003 trans 7236 on volume 1443364917 has been idle for more than 300 seconds
and finally:
Thu Apr 10 01:16:01 2003 trans 7236 on volume 1443364917 has been idle for more than 510 seconds
Thu Apr 10 01:16:31 2003 trans 7236 on volume 1443364917 has been idle for more than 540 seconds
Thu Apr 10 01:17:01 2003 trans 7236 on volume 1443364917 has been idle for more than 570 seconds
Thu Apr 10 01:17:31 2003 trans 7236 on volume 1443364917 has been idle for more than 600 seconds
Thu Apr 10 01:17:31 2003 trans 7236 on volume 1443364917 has timed out
thanks again for all the hints,
nomad
----------- - Lee "nomad" Damon -
work: nomad@ee.washington.edu
play: nomad@castle.org or castle!nomad
Sr. Systems Admin, UWEE SSLI Lab
"Celebrate Diversity"