[OpenAFS] Busy call channel when communicating with server ...

Jeffrey Altman jaltman@secure-endpoints.com
Wed, 05 Sep 2012 08:27:49 -0400


The SMB-to-AFS gateway interface has serious architectural issues that
can lead to kernel deadlocks in the SMB redirector on Vista and above,
and these cannot be fixed in the AFS code.  Second, SMB timeouts and AFS
timeouts are out of sync, which can lead to application failures and
data loss ("delayed write failures").  Third, it forces SMB protocol
semantics on the AFS cache manager.  For all of these reasons, more than
five years and $1.4M were spent on the development of the AFS redirector.

If you tell me that you are suffering from a problem that the community
has been suffering from for a decade and which the AFS redirector was
developed to solve, I'm going to tell you to use the AFS redirector.

If you tell me that you are having problems with the AFS redirector, I'm
going to tell you to support the development of the AFS redirector so
that the problems can be fixed.

Of your list:

1. Printing issues: were fixed in 1.7.15

2. Adobe issues: bug in Acrobat Reader -- file bug report with Adobe

3. Various hangs: deadlocks with Sophos and other AV products are fixed
as they are reported

You are of course free to pick your poison, but the SMB interface is no
longer under development because the problems with timeouts and the
deadlocks in the SMB redirector simply cannot be fixed.
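
To make the timeout mismatch concrete, here is a toy model in Python.
It is not OpenAFS code.  The two constants are taken from the figures
quoted further down in this thread (an SMB request timeout of a bit more
than 45 seconds, and an idle dead timeout longer than 60 seconds); the
exact values, the function name and the result strings are assumptions
made purely for illustration.

```python
# Toy model of the two timers; all names and values are illustrative.
SMB_TIMEOUT_S = 45.0        # assumed: "a bit more than 45 seconds"
IDLE_DEAD_TIMEOUT_S = 60.0  # assumed lower bound: "longer than 60 seconds"


def outcome(stall_s, via_smb):
    """Return which layer reacts first when the file server stalls.

    stall_s -- seconds the file server sends only keep-alives
    via_smb -- True for the SMB gateway path, False for the AFS redirector
    """
    # On the SMB path the SMB client gives up first, before the cache
    # manager's idle dead processing ever has a chance to retry.
    if via_smb and stall_s > SMB_TIMEOUT_S:
        return "smb_timeout"
    # Without the SMB timer, only the idle dead timeout applies and the
    # cache manager retries the request on its own.
    if stall_s > IDLE_DEAD_TIMEOUT_S:
        return "afs_idle_dead_retry"
    return "completed"


if __name__ == "__main__":
    for stall in (30, 50, 70):
        print(stall, outcome(stall, True), outcome(stall, False))
```

In this model any stall between the two thresholds fails on the SMB path
but completes normally through the AFS redirector, and a longer stall
is surfaced by the SMB client before the cache manager can retry.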

Jeffrey Altman



On Wednesday, September 05, 2012 2:28:05 AM, Michael Richter wrote:
> We tried several versions of 1.7 up to 1.7.8 and had some issues
> (printing of smilie pages, Acrobat crashes, hang on shutdown/logout).
> So we went back to 1.6.1 for our XP machines. Currently we use 1.7.15
> on our Vista/7 machines only (fewer than 20 PCs and notebooks).
>
> So you mean we should try again?
>
> Jeffrey Altman wrote:
>> An rx peer such as an AFS file service will return an
>> RX_PACKET_TYPE_BUSY response to an RPC if the RPC was issued on a call
>> channel which the file server believes is still in use.  The most
>> frequent cause of this condition is when the client's idle dead
>> timeout processing triggers because the file service accepted an RPC
>> request and has replied with nothing but keep alive messages for the
>> configured period of time.  The specific time depends on whether the
>> volume the request was sent to is replicated or not.  In all cases,
>> the idle dead timeout period will be longer than 60 seconds.
>>
>> As the event log message indicates, the client will automatically
>> retry the request when this situation occurs.  However, the 1.6.1
>> client on Windows relies upon the SMB redirector and the SMB protocol
>> has an RPC request timeout period of a bit more than 45 seconds.  If
>> the SMB timeout period is reached without a response from the SMB
>> server (aka the AFS cache manager, \\AFS), then the SMB client will
>> terminate the connection and it too will begin its own process of
>> attempting retries and finally reporting errors to the application.
>>
>> If client side idle dead timeouts are being triggered, it means that
>> the file service is not responding to clients in a timely fashion and
>> an effort should be expended to determine why.
>>
>> The OpenAFS 1.7.17 client does not rely on the SMB redirector.  It
>> should be used on all Windows platforms as the AFS redirector does not
>> impose an arbitrary 45 second timeout period on file operations.
>>
>> Jeffrey Altman
>>
>>
>>
>> On 9/4/2012 2:45 AM, Michael Richter wrote:
>>> Hi,
>>>
>>> For about a month we have had an infrequent problem on our
>>> workstations. The PCs become unresponsive, programs crash, and the
>>> machines are not usable. While this happens, OpenAFS posts messages
>>> like this at 5 s intervals:
>>>
>>> Type:    Warning
>>> Source:    AFS Client
>>> Event-ID:    4145
>>> Eventtime:    03.09.2012 16:06:24
>>> User:    n/z
>>> Computer:    UBWSTMAPC185
>>> Description:
>>> Busy call channel when communicating with server 130.149.204.70,
>>> retrying ...
>>>
>>> This does not happen on all PCs. But all PCs use the same software:
>>> Windows XP Prof SP3, OpenAFS 1.6.1
>>>
>>> I asked our AFS server administrators but it seems that they don't
>>> know what's wrong.
>>> Does someone know what I can do to fix it?
>>>
>>> Michael
>>> _______________________________________________
>>> OpenAFS-info mailing list
>>> OpenAFS-info@openafs.org
>>> https://lists.openafs.org/mailman/listinfo/openafs-info
>>
>

