[OpenAFS-devel] tweaking openafs configuration - chunksize, cache expiration and cache renewal
Jeffrey Hutzelman
jhutz@cmu.edu
Mon, 19 Apr 2010 15:34:30 -0400
--On Saturday, April 17, 2010 03:56:49 PM -0400 Jeffrey Altman
<jaltman@secure-endpoints.com> wrote:
> (3) The default configuration of the file server is not very good.
> I recommend:
>
> -L -udpsize 131071 -sendsize 131071 -rxpck 700 -p 128 -b 600
> -nojumbo -cb 1500000
>
> You can remove -nojumbo if you know it is safe to send large udp
> packets and not have them be fragmented between your various sites.
Unless you have a fairly small cell in terms of numbers of volumes, files,
and clients, even the values provided by -L are way too small. I think
better values are perhaps "-vc N -s N*10 -l N*3 -cb M", where N is a guess
at how many volumes the server has, and M is somewhere between 10 and 100
times the total number of vnode cache entries (N*13), but in no case more
than would consume about 10% of the machine's RAM at 72 bytes per callback
entry. At some point soon I'll post a more detailed message on fileserver
tuning which explains how I came up with these numbers (which,
unfortunately, is mostly guesswork).
In any case, the values provided by -L, except the ones overridden by other
options, are probably not the cause of the problems the OP is seeing in
testing.
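For the curious, here's that rule of thumb as a throwaway shell sketch;
the volume count and RAM figure are made-up placeholders, not measured
values, and 64-bit shell arithmetic is assumed:

  # Back-of-the-envelope fileserver sizing, per the rule of thumb above.
  # N and RAM_MB are placeholder guesses; substitute your own numbers.
  N=5000                      # rough number of volumes on this server
  RAM_MB=8192                 # physical memory in the machine
  VC=$N                       # -vc: volume cache entries
  S=$((N * 10))               # -s: small vnode cache entries
  L=$((N * 3))                # -l: large vnode cache entries
  VNODES=$((N * 13))          # total vnode cache entries (s + l)
  CB=$((VNODES * 100))        # -cb: 10x to 100x the vnode cache...
  CAP=$((RAM_MB * 1024 * 1024 / 10 / 72))  # ...capped at ~10% of RAM, 72 bytes/entry
  [ "$CB" -gt "$CAP" ] && CB=$CAP
  echo "fileserver -vc $VC -s $S -l $L -cb $CB"

With those placeholder inputs it prints "fileserver -vc 5000 -s 50000
-l 15000 -cb 6500000"; treat the output as a starting point, not gospel.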
I agree that -p 128 -b 600 -cb 1500000 all seem reasonable. With modern
chunk sizes, IMHO a -sendsize of at least 256K (262144) is advisable, but
your suggestion of 128K is still much better than the 16K default. As with
the chunk size, the send size is relevant only on the initial fetch or if
the data is changing.
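For illustration, that's your suggested line from above with only the
send size raised:

  fileserver -L -udpsize 131071 -sendsize 262144 -rxpck 700 -p 128 -b 600 -nojumbo -cb 1500000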
I currently recommend care in setting -udpsize. It's not clear to me that
larger values are actually better, and while I'm currently using 128K or
256K on my own servers, I'm considering revisiting that decision. The
larger this buffer is, the more UDP packets the kernel will hold onto when
we can't handle them. This is fine and desirable in the event of a very
short, very heavy traffic burst, but will make our performance worse if the
fileserver rx stack (not the network) is congested for some reason. In any
case, take care to avoid a -udpsize setting larger than the kernel will
permit, as doing so will cause the fileserver to fall back to a 32K
buffer, which is smaller than the default. And again, the reported problem
is likely not the result of a poor UDP buffer size.
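On Linux, for instance, the kernel ceiling I mentioned is
net.core.rmem_max; it's worth checking before you pick a -udpsize (the
262144 below is just an example value):

  # Check the kernel's maximum socket receive buffer (Linux).
  sysctl net.core.rmem_max
  # If it's below the -udpsize you intend to use, raise it first (as
  # root); otherwise the fileserver falls back to a 32K buffer.
  sysctl -w net.core.rmem_max=262144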
> Changing the chunksize will not have a dramatic impact if you already
> have the data cached. If the data isn't changing, it's the status info
> that is expiring.
Or, it's not expiring at all, and the client simply needs the server to
make access control decisions, as was already described.
>> Lastly, what do you guys think about the general plan - setting up an
>> AFS configuration that caches entire AFS volumes from a designated cell
>> once a day or something... is this really practical, or rather not
>> recommended due to performance issues?
>
> If you controlled the cell and could set up a file server at the remote
> location, I think you would be better off setting up a remote file
> server and storing readonly volume instances on it. If you don't
> control the cell or don't trust the remote location to place a file
> server there, then your approach is reasonable.
I agree that if you can do it, a local fileserver replica is a good idea.
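For reference, placing a readonly instance on such a server looks roughly
like this; the server, partition, and volume names below are placeholders:

  # Add a replication site on the remote fileserver, then push the
  # current readonly snapshot of the volume to it.
  vos addsite remote-fs.example.com /vicepa proj
  vos release proj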
I disagree that the prefetching approach is reasonable. The cache manager
manages its caches of both data and metadata based on actual usage
patterns, and a "just fetch everything" task mostly destroys that
information. Unless you have overprovisioned both your data cache and the
various metadata cache parameters, or have very detailed knowledge of your
usage patterns, a job that prefetches everything is likely to result in
worse overall performance than just letting the CM do its job.
> I'm not comfortable that you have determined the actual cause of
> the performance delays. You may want to monitor the traffic flows
> with wireshark and see what is actually being sent and received on
> the wire.
With this I absolutely agree.
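A first pass could be as simple as capturing the fileserver's Rx traffic
and opening it in wireshark, which has an Rx/AFS dissector; the interface
name here is a placeholder:

  # AFS fileserver traffic is Rx over UDP port 7000 (7001 is the
  # client's callback port).
  tshark -i eth0 -w afs.pcap -f "udp port 7000 or udp port 7001"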
-- Jeff