[OpenAFS-devel] deadlock in host.c (fileserver)
Rainer Toebbicke
rtb@pclella.cern.ch
Tue, 20 Jun 2006 17:06:13 +0200
This is a multi-part message in MIME format.
--------------090902000102060704030305
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Jeffrey Altman wrote:
> Rainer:
>
> Thank you for the patch. The second block of your patch is indeed a
> deadlock that must be fixed. The first block of your patch affects
> code that has been changed on the 1.4.x branch. Please examine the
> attached patch.
>
No, I'm afraid that's not sufficient: after dropping and re-acquiring
the locks it may well be that the rxcon_client has changed again.
Therefore you need the
while (... && ... && ...) / continue instead of the if (...) construct.
I came up with the patch from the 1.4.1 base and wasn't aware
of the rxr_CidOf() changes.
Attached is the patch against 1.57.2.36, but re-applied through cut &
paste and neither tested nor even compiled.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155
--------------090902000102060704030305
Content-Type: text/plain;
name="patch_client_deadlock"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="patch_client_deadlock"
--- /tmp/host.c.o141 2006-06-20 16:55:02.000000000 +0200
+++ /tmp/host.c 2006-06-20 17:00:30.000000000 +0200
@@ -1758,9 +1758,12 @@
* required). So, before setting the RPC's rock, we should disconnect
* the RPC from the other client structure's rock.
*/
- oldClient = (struct client *)rx_GetSpecific(tcon, rxcon_client_key);
- if (oldClient && oldClient->sid == rxr_CidOf(tcon)
- && oldClient->VenusEpoch == rxr_GetEpoch(tcon)) {
+ while (
+ (oldClient = (struct client *)rx_GetSpecific(tcon, rxcon_client_key))
+ && oldClient != client
+ && oldClient->sid == rxr_CidOf(tcon)
+ && oldClient->VenusEpoch == rxr_GetEpoch(tcon)
+ ) {
char hoststr[16];
if (!oldClient->deleted) {
/* if we didn't create it, it's not ours to put back */
@@ -1782,9 +1785,12 @@
FreeCE(client);
created = 0;
}
- ObtainWriteLock(&oldClient->lock);
oldClient->refCount++;
+ H_UNLOCK;
+ ObtainWriteLock(&oldClient->lock);
+ H_LOCK;
client = oldClient;
+ continue; /* all locks dropped, re-check */
} else {
ViceLog(0, ("FindClient: deleted client %x(%x) already had conn %x (host %s:%d), stolen by client %x(%x)\n",
oldClient, oldClient->sid, tcon,
--------------090902000102060704030305--