[OpenAFS] Re: bonnie++ on OpenAFS

Hartmut Reuter reuter@rzg.mpg.de
Tue, 23 Nov 2010 12:02:04 +0100

Simon Wilkinson wrote:
>> Yep, this is what's happening in the trace Achim provided, too. Every 4k
>> we write the chunk. I'm not sure how that's possible unless something is
>> closing the file a lot, or the cache is full of stuff we can't kick out.
> Actually, it's entirely possible. Here's how it all goes wrong...
> When the cache is full, every call to write results in us attempting to
> empty the cache. On Linux the page cache means that we only call write
> once for each 4k chunk. However, our attempts to empty the cache are a
> little pathetic. We just attempt to store all of the chunks of the file
> currently being written back to the fileserver. If it's a new file there
> is only one such chunk - the one that we are currently writing. As
> chunks are much larger than pages, and when a chunk is dirty we flush
> the whole thing to the server, this is why we see repeated writes of the
> same data. The process goes something like this:
> *) Write page at 0k, dirties first chunk of file.
> *) Discover cache is full, flush first chunk (0->1024k) to the file server
> *) Write page at 4k, dirties first chunk of file
> *) Cache is still full, flush first chunk to file server
> *) Write page at 8k, dirties first chunk of file
> ... and so on.
> The problem is that we don't make good decisions when we decide to flush
> the cache. However, any change to flush items which are less active will
> be a behaviour change - in particular, on a multi-user system it would
> mean that one user could break write-on-close for other users simply by
> filling the cache.
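
To put rough numbers on the sequence quoted above, here is a small model of it. It is only a sketch, not OpenAFS code: the function name is made up, and it assumes that each flush resends just the data written into the chunk so far. If the full 1024k were sent on every flush, the result would be worse still.

```python
PAGE_KIB = 4      # Linux page size, as in the quoted description
CHUNK_KIB = 1024  # chunk size, as in the quoted description


def kib_sent_to_server(cache_full: bool) -> int:
    """KiB stored to the file server while one chunk is written page by page.

    Hypothetical model, not OpenAFS code.
    """
    sent = 0
    valid = 0  # KiB of the chunk that hold data so far
    for _ in range(CHUNK_KIB // PAGE_KIB):
        valid += PAGE_KIB  # the page write dirties the chunk
        if cache_full:
            # Assumption: the flush resends everything in the chunk so far.
            sent += valid
    if not cache_full:
        # With room in the cache the chunk is stored once, on close.
        sent += valid
    return sent


if __name__ == "__main__":
    normal = kib_sent_to_server(cache_full=False)
    full = kib_sent_to_server(cache_full=True)
    print(normal, full, full / normal)
```

Under these assumptions, writing a single 1024k chunk with a full cache sends 131584k to the server instead of 1024k, roughly 128 times the data.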

The problem here is that afs_DoPartialWrite is called with each write. Normally 
it returns without doing anything, but if the percentage of dirty chunks is too 
high it triggers a background store. However, this can happen multiple times 
before the background job starts executing. Therefore I introduced in AFS/OSD a 
new flag bit, CStoring, which is switched on when the background task is 
submitted and switched off when it is done. During that time no new background 
stores are scheduled for this file.
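
The idea can be sketched as follows. This is a simplified model in Python, not the AFS/OSD source: the bit value of CSTORING, the File class, the queue, and both function names are assumptions made for illustration. Only the behaviour (set the flag on submit, clear it when the job is done, skip scheduling while it is set) follows the description above.

```python
from collections import deque

CSTORING = 0x1  # hypothetical bit value; only the name comes from AFS/OSD


class File:
    """Stand-in for the per-file state that carries the flag bits."""

    def __init__(self):
        self.flags = 0


def do_partial_write(f, queue, dirty_too_high, use_flag=True):
    """Called on every write; returns True if a background store was queued."""
    if not dirty_too_high:
        return False  # the normal case: nothing to do
    if use_flag and (f.flags & CSTORING):
        return False  # a store for this file is already pending
    if use_flag:
        f.flags |= CSTORING  # switched on when the task is submitted
    queue.append(f)
    return True


def run_background_store(queue):
    """Run one queued store and clear the flag when it is done."""
    f = queue.popleft()
    # ... the dirty chunks would be stored to the server here ...
    f.flags &= ~CSTORING


if __name__ == "__main__":
    for use_flag in (False, True):
        f, queue = File(), deque()
        for _ in range(10):  # ten writes before the background job runs
            do_partial_write(f, queue, dirty_too_high=True, use_flag=use_flag)
        print(use_flag, len(queue))
```

In this model, ten writes that arrive before the background job runs queue ten stores without the flag and only one with it.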

> Cheers,
> Simon.

Hartmut Reuter                  e-mail  reuter@rzg.mpg.de
                                phone   +49-89-3299-1328
                                fax     +49-89-3299-1301
RZG (Rechenzentrum Garching)    web     http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)