[OpenAFS] OpenAFS benchmark improvements

anne salemme anne.salemme@Dartmouth.EDU
Wed, 12 Dec 2007 17:56:40 -0500


This is great practical advice. Another useful thing to look at is the
cron jobs running on the AFS servers, or cron jobs that affect the AFS
servers. You can find things like backup volumes being recreated very
inefficiently relative to the actual backups you run, really inefficient
volume replication (multiple jobs trying to release the same volume,
etc.), or AFS restarts scheduled in the middle of other jobs that depend
on AFS. In other words, you want to make sure the cron jobs aren't
fighting with each other.
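
For example, a staggered schedule might look something like this (the
paths, script names, and exact times are just made-up illustrations;
vos backupsys is the real command for refreshing .backup volumes):

   # hypothetical crontab on an AFS file server
   # 1. recreate the .backup volumes well before the nightly dump reads them
   0 1 * * *  vos backupsys -prefix user -localauth
   # 2. run the actual site backup only after the backup volumes exist
   0 3 * * *  /usr/local/sbin/run-nightly-afs-dump
   # 3. do all vos releases from one job, so two jobs never race to
   #    release the same volume
   0 5 * * *  /usr/local/sbin/release-replicated-volumes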

I did this professionally for a year... you see a lot of improvements
that can be made that way. OK, it's not as exciting as finding low-level
bugs in the code, but if it suits your personality... cleaning up can't
hurt, and it might help.

anne


Steve Simmons wrote:
> I'm going to second a big chunk of what Jerry wrote. About five years 
> ago I inherited an AFS cell that had been through some rough times and 
> spent more than a little time cleaning it up. The end result was much 
> faster service. Our performance was never as bad as Jerry's, but it was 
> still nothing to write home about.
>
> I did some of the same things, didn't have to do others. Of the things 
> done, two surprised me in that they made a difference. One was the 
> same as Jerry's: getting rid of all the bogus values returned by vos 
> listaddrs. It didn't seem to make much difference to the users, but by 
> god, anything that tried to look at all the servers got orders of 
> magnitude better.
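>
> For what it's worth, that kind of cleanup usually boils down to something
> along these lines (the address here is just an example):
>
>    vos listaddrs                        # every server address the vldb knows about
>    vos changeaddr 192.0.2.17 -remove    # drop an entry for a server that no longer exists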
>
> The other was to salvage every single volume in the cell, attaching 
> the orphans:
>
>    bos salvage <volume> -orphans attach
>
> Sonofagun if my salvages on reboot didn't stop peppering me with 
> complaints. We deleted all the dead files it found, thus reclaiming 
> some of the 'missing' disk space and giving it back to the users. As a 
> side effect, now if I get a salvage message, I know it's something to 
> look at. On the other hand, note that if you restore a volume from 
> before you forced the attach, it will need a salvage.
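>
> Spelled out in full, bos salvage also wants the server and partition; for
> a single volume (all the names below are placeholders) that is roughly:
>
>    bos salvage -server fs1.example.com -partition /vicepa \
>        -volume user.jdoe -orphans attach
>
>    # or sweep every partition on a server in one pass:
>    bos salvage -server fs1.example.com -all -orphans attach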
>
> Another thing which helped (and unlike the other two, I expected this 
> to help) was to get the vldb and the on-server volumes back into sync. 
> I don't recall the precise steps I had to go through, but it took more 
> than just doing a set of vos syncvldb/vos syncserv commands on the 
> various machines involved. I do recall generating a vldb list and 
> comparing that to the output of vos listvol from all the servers, and 
> that had to be followed by some vos zap, vos remove, and vos remsite 
> commands. The whole cell got a lot snappier after that.
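>
> In outline (the server name, partition, and volume ids below are made
> up), that sort of comparison goes something like:
>
>    vos listvldb > vldb.txt                        # what the vldb thinks exists
>    vos listvol -server fs1.example.com > fs1.txt  # what the server actually holds
>    # diff the two lists, then for each mismatch something like:
>    vos syncvldb -server fs1.example.com   # create/fix vldb entries for volumes on disk
>    vos syncserv -server fs1.example.com   # check that volumes the vldb puts here exist
>    vos zap -server fs1.example.com -partition /vicepa -id 536870999
>    vos remsite -server fs1.example.com -partition /vicepa -id user.jdoe.readonly
>
> vos zap deletes a volume from the server's disk without consulting the
> vldb, and vos remsite drops a read-only site record from the vldb without
> touching the server, so between them they cover both directions a
> mismatch can go.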
>
> Once we were confident the cell was stable, we upgraded from Transarc 
> to OpenAFS. That made a big difference too, but it wasn't done until 
> after the cleanup.
>
> On Dec 12, 2007, at 9:52 AM, Jerry Normandin wrote:
>>
>> So.. any of you out there that are experiencing AFS slowness, do a
>> sanity check to see what you come up with.  You might just say, WTF!
>
> Absolutely.
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info