[OpenAFS] OpenAFS in the ISP environment. A good idea...

Thu, 06 Mar 2003 13:36:40 -0500 (EST)

On Thu, 6 Mar 2003, Derrick J Brashear wrote:

> Also, I should note, you can do a parallel salvage on partitions. If said
> does not become I/O bound, partitioning instead of "one big chunk" can be
> a win. I don't know offhand if it becomes I/O bound.

Some time ago, I did scads of salvager benchmarks using Transarc AFS.  I
don't know if OpenAFS changed the behaviour or not (I guess I could look
if I felt like it), but a few observations on this point:

1. Salvager had a notion of trying not to kill a single disk with I/O, so
it would ignore the -parallel option if all the partitions appeared to be
on the same disk.  If I recall correctly, it figured this out by calling
stat() on each /vice partition mountpoint and then looking at the result
of the integer division st_dev / PartsPerDisk (which I think was defined
as 8).  If every partition appeared to be on the same disk, it would just
serialize the salvages.  At the time, I was using a Sun A1000 RAID array
and salvager thought it was one big disk.  Looking at the salvager
sources, I found an "all" flag, undocumented at the time, that would
allow one to override this notion.  That is to say, if you wanted to
force the salvager to parallelize 7 partitions on the same disk, you could
say "-parallel all7" and it would work for you.

2. I have about a million pages of benchmark results that I'll not bore
you with, but a quick summary is pretty much like this.  We were
benchmarking a Sun Ultra 1, Netra T1-105 and E450.  We were testing the
effects of parallel salvaging across 7 partitions of a RAID device.

From memory:

On the Ultra 1, it performed best when salvages were serialized.  The I/O
bus was just hammered by anything more than 1, and the overall salvage
time for 7 partitions increased each time we forced another salvager to
run in parallel.

The Netra performed better each time we forced another parallel salvager
process up to 4 in parallel.  At 5 and above, overall salvage times began
to increase.

The E450 was an I/O beast compared to the other two boxes.  Salvage times
decreased each time we added another parallel salvage process.
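
For anyone who wants to see why a RAID array can look like "one big disk",
the same-disk check from point 1 could be sketched roughly as below.  This
is Python rather than the salvager's C, and it is reconstructed from my
recollection above, not from the sources.  The value 8 for PartsPerDisk is
the remembered one, and every function name here is hypothetical.

```python
import os
from collections import defaultdict

# Assumption: 8 is the value recalled for PartsPerDisk; not checked against the sources.
PARTS_PER_DISK = 8


def disk_id(st_dev, parts_per_disk=PARTS_PER_DISK):
    """Guess a physical-disk id from a device number by integer division."""
    return st_dev // parts_per_disk


def group_by_disk(devices, parts_per_disk=PARTS_PER_DISK):
    """Group mountpoints by guessed disk id; `devices` maps mountpoint -> st_dev."""
    groups = defaultdict(list)
    for mountpoint, st_dev in sorted(devices.items()):
        groups[disk_id(st_dev, parts_per_disk)].append(mountpoint)
    return dict(groups)


def should_serialize(devices, parts_per_disk=PARTS_PER_DISK):
    """True when every partition appears to sit on the same disk."""
    return len(group_by_disk(devices, parts_per_disk)) <= 1


def stat_devices(mountpoints):
    """Hypothetical helper: read st_dev for real mountpoints such as /vicepa."""
    return {m: os.stat(m).st_dev for m in mountpoints}
```

With made-up device numbers 32, 33 and 39, all three partitions divide down
to the same id and the salvages would be serialized; with 32 and 40 they
fall into two groups and could run in parallel.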

Not sure if you actually care about this kind of info...  If so, hope it
helps.

Dave
--
Dave McMurtrie, Systems Programmer
University of Pittsburgh
Computing Services and Systems Development,
Development Services -- UNIX and VMS Services
717P Cathedral of Learning
(412)-624-6413