[OpenAFS-devel] Remediating Stale Host Info: Revving Afsint: Was: extended callbacks RPC modification

Tom Keiser tkeiser@sinenomine.net
Fri, 13 Mar 2009 16:33:02 -0400


On Fri, Mar 13, 2009 at 3:01 PM, Matt Benjamin <matt@linuxbox.com> wrote:
> Hi Tom,
>
> I think I get this generally, but I had the following thoughts
> (some that we discussed in IM):
>
> 1. would it make sense to use an RPC union to package the
> ClientAssertion (maybe with some other info you've also suggested we may
> wish to be sending to clients 'just FYI' to support heuristics for
> callback discard (which I also want to propose))?
>

Hi Matt,

Ok.  Encapsulating in a union makes perfect sense.  The simple
heuristics that I find obvious are:

1) a means of communicating xcb "load" to cache managers
2) a means of mandating that a cm give up at least N callbacks


The reason I'd like to see these extensions is the intractability of
having good global knowledge on the fileserver.  One of the major
problems with mandating on the server the fid list for xcb
notification cancellation is that low temporal access frequency to a
read-mostly fid does not tell us anything about access patterns on the
client nodes.  The CM is in a far better position to predict local
access patterns, and thus I think a combination of hinting/mandating
from the fileserver, and cm access knowledge could yield a better
global result.  Granted, even with these augmentations it is possible
to devolve into oscillations due to revocation of coherence messaging
for active objects.  But, I think with judicious use of heuristics and
some damping mechanism (hysteresis loop, perhaps?) we can do a lot
better at smoothing out the instabilities.
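
To make the damping idea concrete, here is a minimal sketch in C of a
two-threshold hysteresis loop that a cm might apply to a server-supplied
xcb load hint.  Everything in it is an assumption for illustration: the
struct xcb_damper type, the xcb_damper_update function, the 0..100 load
scale, and the threshold values are hypothetical and not part of any
existing OpenAFS interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical damper state; not an OpenAFS structure. */
struct xcb_damper {
    unsigned high_water;  /* begin giving up callbacks at or above this load */
    unsigned low_water;   /* stop only once load falls to or below this */
    int shedding;         /* current state: 1 = giving up callbacks */
};

/*
 * Feed one load sample (assumed scale 0..100) and return the new state.
 * The gap between the two thresholds is what damps oscillation: a load
 * that hovers between them never flips the state.
 */
static int xcb_damper_update(struct xcb_damper *d, unsigned load)
{
    assert(d != NULL);
    assert(d->low_water <= d->high_water);

    if (!d->shedding && load >= d->high_water)
        d->shedding = 1;
    else if (d->shedding && load <= d->low_water)
        d->shedding = 0;
    return d->shedding;
}
```

With thresholds of 80 and 50, a load sequence of 60, 85, 70, 55, 45, 60
turns shedding on at 85, keeps it on through 70 and 55, and turns it
off at 45; a single threshold at 80 would instead flip on every
crossing.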


> 2. Biting the bullet, is it appropriate that we perhaps re-open the
> discussion of revving afsint now?
>
> I think we collectively must know most of what's going to be in it?
> From Felix we have a proposal for AFSFetchStatus64 which
> is probably close to ready for proposal to AFS3-Standardization.  Your
> proposal to add client UUID to operations seems well motivated to me. Is
> there something else obvious?  Access control information was mentioned
> by Derrick (though I don't know if anything is changing there).
>

I'll go out on a limb.  I'd like to see AFSFid become a union.  I'm
not proposing that we add any new fid types now, but I'd like to see
us have the option of doing so at a lower cost in the future.  The
immediately obvious use case to me would be adding a cell uuid field
to fid in order to support multi-cell fileservers.
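
As a rough sketch of what a discriminated fid could look like on the C
side, assuming the existing (Volume, Vnode, Unique) triple becomes one
arm of the union: the type names, the discriminator values, and the
cell-uuid arm below are all hypothetical, chosen only to illustrate the
shape, and are not a proposed wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical discriminator values; no such enum exists in afsint. */
enum fid_type {
    FID_TYPE_CLASSIC = 0,  /* the existing (Volume, Vnode, Unique) triple */
    FID_TYPE_CELL    = 1   /* illustrative future arm carrying a cell uuid */
};

struct classic_fid {
    uint32_t Volume;
    uint32_t Vnode;
    uint32_t Unique;
};

struct cell_fid {
    uint8_t cell_uuid[16];     /* assumed 16-byte uuid */
    struct classic_fid fid;
};

struct fid_union {
    enum fid_type type;        /* selects which arm of u is valid */
    union {
        struct classic_fid classic;
        struct cell_fid cell;
    } u;
};

static int classic_equal(const struct classic_fid *a,
                         const struct classic_fid *b)
{
    return a->Volume == b->Volume && a->Vnode == b->Vnode
        && a->Unique == b->Unique;
}

/* Returns 1 if both fids use the same arm and match, 0 otherwise. */
static int fid_equal(const struct fid_union *a, const struct fid_union *b)
{
    assert(a != NULL && b != NULL);
    if (a->type != b->type)
        return 0;
    switch (a->type) {
    case FID_TYPE_CLASSIC:
        return classic_equal(&a->u.classic, &b->u.classic);
    case FID_TYPE_CELL:
        return memcmp(a->u.cell.cell_uuid, b->u.cell.cell_uuid, 16) == 0
            && classic_equal(&a->u.cell.fid, &b->u.cell.fid);
    }
    return 0;  /* unknown arm: treat as not equal */
}
```

The point of the discriminator is that code which only understands the
classic arm can reject or ignore anything else explicitly, which is what
keeps the cost of adding a new fid type low later.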

-Tom


> If we gave some thought to other widening of interfaces we certainly
> need--especially, if we could work out some future-proofing (unions
> again?  perhaps this is being overused, but I'm not sure)--would we have
> something that could be proposed?
>
> Matt
>
> Tom Keiser wrote:
>> Matt, et al.
>>
>> <Developer/strategists>, Steven and I had a long conversation this morning about
>> the host package.  We were discussing potential ways to mitigate the
>> problems caused by using ephemeral, potentially stale (host,port) tuples
>> to map onto host objects.  Until we can revise afsint to pass the cache
>> manager's uuid as an additional IN parameter to every stateful
>> fileserver RPC, I'd like to propose a stop-gap.  Namely, that we pass a
>> cm uuid assertion as an additional IN parameter to
>> RXAFSCB_ExtendedCallBack:
>>
>>  proc ExtendedCallBack(
>>    IN HostIdentifier *Server,
>> +  IN afsUUID *ClientAssertion,
>>    IN AFSCBFids *Fids_Array,
>>    IN AFSExtendedCallBackSeq *CallBacks_Array,
>>    OUT AFSExtendedCallBackRSeq *CallBack_Result_Array
>>  ) multi = 65540;
>>
>>
>> Furthermore, I propose addition of a new et code which notifies the
>> fileserver that the callback RPC failed due to a cm uuid mismatch.
>>
>> These additions will allow the fileserver to quickly detect staleness in
>> its struct Interface cache.  I consider this to be a rather important
>> addition, as the current implementation can lead to loss of cache
>> coherence when a (host,port) tuple drifts between hosts.
>>
>> If we can reach internal consensus, I'll volunteer to post a new xcb
>> draft to afs3-standardization.
>>
>> Thoughts?
>>
>> -Tom
>
> --
>
> Matt Benjamin
>
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI  48104
>
> http://linuxbox.com
>
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309
>



-- 
Tom Keiser
tkeiser@gmail.com