[OpenAFS] vos release

Rainer Toebbicke rtb@pclella.cern.ch
Fri, 09 Aug 2002 15:06:44 +0200


This is a multi-part message in MIME format.
--------------070102060006000900040906
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit

'vos release' involves telling the fileserver to break all volume callbacks. 
This is done using the fssync interface: FSYNC_askfs and thereabouts.

The volserver is single-threaded. After sending the BreakVolumeCallbacks 
request to the fileserver it issues a blocking read() (no IOMGR_select() or 
the like) for a one-byte response. In the case of BreakVolumeCallbacks it 
does not even care about the response.
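
To make the stall concrete, here is a rough sketch of the pattern, not the 
actual volserver or fssync code. The names blocking_request() and slow_peer() 
and the use of socketpair() are assumptions made purely for illustration: one 
side sends a one-byte request and sits in read() until the other side, which 
is busy for delay_s seconds, answers.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for the fileserver side of the connection. */
struct peer {
    int fd;            /* its end of the socket pair */
    unsigned delay_s;  /* how long it is "busy" before answering */
};

static void *slow_peer(void *arg)
{
    struct peer *p = arg;
    char req, ack = 0;

    if (read(p->fd, &req, 1) != 1)
        return NULL;
    sleep(p->delay_s);              /* pretend to break callbacks */
    if (write(p->fd, &ack, 1) != 1)
        return NULL;
    return NULL;
}

/* Hypothetical stand-in for the single-threaded caller: send the request,
 * then block in read() for the one-byte response.  Returns the number of
 * seconds the caller was stalled, or -1.0 on error. */
double blocking_request(unsigned peer_delay_s)
{
    int sv[2];
    pthread_t tid;
    struct peer p;
    struct timespec t0, t1;
    char req = 1, resp;
    double stalled = -1.0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1.0;
    p.fd = sv[1];
    p.delay_s = peer_delay_s;
    if (pthread_create(&tid, NULL, slow_peer, &p) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* Nothing else can be served by this thread until read() returns. */
    if (write(sv[0], &req, 1) == 1 && read(sv[0], &resp, 1) == 1) {
        clock_gettime(CLOCK_MONOTONIC, &t1);
        stalled = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    close(sv[0]);                   /* unblocks the peer if we failed early */
    pthread_join(tid, NULL);
    close(sv[1]);
    return stalled;
}
```

Calling blocking_request(n) returns only after roughly n seconds. A process 
whose only thread does this cannot do anything else in the meantime.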

BreakVolumeCallbacks is executed in the fileserver. The fileserver issues the 
break callback RPCs to all clients in batches of 10 and waits up to 4 (?) 
seconds for everybody in the batch to respond.

For a typical volume such as root.cell at our site, with a modest 1500 
clients on each replica in a mix of on-site and off-site machines, chances 
are high that this thread is busy for several minutes. During all this time 
the BreakVolumeCallbacks request hangs, and the volserver's only thread sits 
in the read(). It cannot even keep the RX connection alive, so the 'vos' 
command times out after the usual 50 seconds (although I saw somewhere that 
some versions of vos use a longer timeout). In any case, the volserver 
accepts no other request while it waits.
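
As a back-of-the-envelope check of the "several minutes" figure, the 
following sketch computes an upper bound on the time spent. It assumes that 
every batch waits for its full timeout and uses the uncertain 4-second value 
quoted above. The function names are made up for this illustration.

```c
/* Number of batches needed to cover all clients (ceiling division). */
unsigned batches_needed(unsigned clients, unsigned batch_size)
{
    return (clients + batch_size - 1) / batch_size;
}

/* Upper bound, in seconds, on the time spent breaking callbacks when
 * every batch has to wait for its full timeout. */
unsigned worst_case_seconds(unsigned clients, unsigned batch_size,
                            unsigned batch_timeout_s)
{
    return batches_needed(clients, batch_size) * batch_timeout_s;
}
```

With 1500 clients, batches of 10 and 4 seconds per batch this gives 150 
batches and up to 600 seconds. The real figure depends on how many batches 
actually contain an unresponsive client.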

Worse, I strongly suspect this mechanism to be responsible for

The above scenario might not be responsible for all the volserver hang-ups, 
but on the servers that have been running the attached patch for a couple of 
months we have not seen any.

The patch simply starts BreakVolumeCallbacks in a separate thread (useful 
only in the pthread fileserver) and increases the batch size from 10 to 50, 
so that the fileserver responds immediately. That way the volserver is happy 
and goes on doing the rest of the job, while the callbacks will eventually be 
broken once the fileserver is through the lot. I considered whether several 
BreakVolumeCallbacks running at the same time could run into each other, but 
decided that there was no danger: all structures are protected by locks.
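
For readers who want to try the idea outside the fileserver, here is a 
minimal, self-contained sketch of handing a slow job to a detached thread. It 
is not the attached patch: break_callbacks_async(), struct break_job and 
demo_break() are hypothetical names, and the volume number travels in a 
heap-allocated job rather than being cast to a pointer. Note that 
pthread_create() reports failure through its return value, not through errno.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical signature of the slow, synchronous routine. */
typedef void (*break_fn)(uint32_t volume);

struct break_job {
    break_fn fn;
    uint32_t volume;
};

/* Thread body: run the job, then release its memory. */
static void *break_job_main(void *arg)
{
    struct break_job *job = arg;

    job->fn(job->volume);
    free(job);
    return NULL;
}

/* Start fn(volume) in a detached thread and return at once.
 * Returns 0 on success, otherwise an errno-style error code. */
int break_callbacks_async(break_fn fn, uint32_t volume)
{
    pthread_attr_t attr;
    pthread_t tid;
    struct break_job *job;
    int rc;

    job = malloc(sizeof *job);
    if (job == NULL)
        return ENOMEM;
    job->fn = fn;
    job->volume = volume;

    rc = pthread_attr_init(&attr);
    if (rc == 0) {
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        if (rc == 0)
            rc = pthread_create(&tid, &attr, break_job_main, job);
        pthread_attr_destroy(&attr);
    }
    if (rc != 0)
        free(job);      /* the thread never started, so we still own it */
    return rc;
}

/* Demo stand-in for the real work: wait 50 ms, then record the volume. */
atomic_uint demo_last_volume;

void demo_break(uint32_t volume)
{
    struct timespec ts = { 0, 50 * 1000 * 1000 };

    nanosleep(&ts, NULL);
    atomic_store(&demo_last_volume, volume);
}
```

A caller would use break_callbacks_async(demo_break, volume) and carry on 
immediately. The trade-off is that the caller no longer learns when, or 
whether, the work finished.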

I looked into doing the job in the volserver (somewhat more logical place) but 
did not come up with a simple solution.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke        http://cern.ch/~rtb         rtb@mail.cern.ch  O__
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland   > |
Phone: +41 22 767 8985       Fax: +41 22 767 7155                     ( )\( )

--------------070102060006000900040906
Content-Type: text/plain; charset=US-ASCII;
 name="BreakVolumeCallBacks-in-own-thread"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="BreakVolumeCallBacks-in-own-thread"

*** openafs/src/viced/callback.c	Sat Oct 13 06:22:10 2001
--- openafs/src/viced/callback.c	Tue Mar 19 12:39:20 2002
***************
*** 122,128 ****
  /* cache managers will all send in their responses simultaneously, */
  /* thereby swamping the file server.  As a result, something like */
  /* 10 or 15 might be a better bet. */
! #define MAX_CB_HOSTS	10
  
  /* max time to break a callback, otherwise client is dead or net is hosed */
  #define MAXCBT 25  
--- 122,128 ----
  /* cache managers will all send in their responses simultaneously, */
  /* thereby swamping the file server.  As a result, something like */
  /* 10 or 15 might be a better bet. */
! #define MAX_CB_HOSTS	50
  
  /* max time to break a callback, otherwise client is dead or net is hosed */
  #define MAXCBT 25  
***************
*** 1259,1265 ****
   * a delayed callback.  Resets will be forced if the host is
   * determined to be down before the RPC is executed.
   */
! BreakVolumeCallBacks(volume)
      afs_uint32 volume;
  
  {
--- 1259,1265 ----
   * a delayed callback.  Resets will be forced if the host is
   * determined to be down before the RPC is executed.
   */
! BreakVolumeCallBacks_s(volume)
      afs_uint32 volume;
  
  {
***************
*** 1322,1327 ****
--- 1322,1350 ----
      H_UNLOCK
  
  return 0;
+ } /*BreakVolumeCallBacks_s*/
+ 
+ 
+ /* 	BreakVolumeCallBacks now (2002) happens in a thread since it can take a substantial amount
+ 	of time during which the volserver simply waits
+ */
+ BreakVolumeCallBacks(volume)
+     afs_uint32 volume;
+ 
+ {
+ #ifdef AFS_PTHREAD_ENV
+ 	pthread_t cb_thread;
+ 	pthread_attr_t cb_attr;
+ 
+ 	pthread_attr_init(&cb_attr);
+ 	pthread_attr_setdetachstate(&cb_attr, PTHREAD_CREATE_DETACHED);
+ 
+ 	if (pthread_create(&cb_thread, &cb_attr, (void *)BreakVolumeCallBacks_s, (void *) volume)) {
+ 		ViceLog(0, ("BreakVolumeCallBacks thread create failed errno=%d\n", errno));
+ 	}
+ #else
+ 	BreakVolumeCallBacks_s(volume);
+ #endif 
  } /*BreakVolumeCallBacks*/
  
  
--------------070102060006000900040906--