[OpenAFS] Re: Connection Timed Out errors occasionally when accessing openafs drive

Adam Megacz megacz@hcoop.net
Sun, 07 Jun 2009 11:21:00 -0700


FWIW, we are still experiencing this problem as well after upgrading
to 1.4.10, although it seems to occur less often than it did before.

  - a

Ken Elkabany <Ken@Elkabany.com> writes:
> I upgraded our server and client to 1.4.10. Unfortunately, I am still
> receiving Connection Timed Out errors. They rarely occur, but when
> they do they are a severe hindrance. My use case is as follows:
>
> Three different unix user accounts (root, www-data, aux) are all
> running multiple background processes (~9 total) which access the afs
> mount. They each automatically acquire, or re-acquire tickets and
> tokens, and then proceed to read, copy, and write files. Occasionally,
> upon creating a directory using a python os command similar to "mkdir
> -p" (os.makedirs), I receive a "Connection Timed Out" error. The
> processes must then be restarted.
>
> Any other suggestions?
>
> Ken
>
> On Sun, May 10, 2009 at 7:41 PM, Derrick Brashear <shadow@gmail.com> wrote:
>> it probably matters in the server here, but both.
>>
>> Derrick
>>
>>
>> On May 10, 2009, at 10:35 PM, Ken Elkabany <Ken@Elkabany.com> wrote:
>>
>>> Is this bug fixed in the client or the server? Thanks.
>>>
>>> Ken
>>>
>>> On Sun, May 10, 2009 at 7:22 PM, Derrick Brashear <shadow@gmail.com>
>>> wrote:
>>>>
>>>> I'd venture this is a bug fixed in 1.4.10, with idle dead time
>>>> computation
>>>> in rx.
>>>>
>>>> Derrick
>>>>
>>>>
>>>> On May 10, 2009, at 9:53 PM, Ken Elkabany <Ken@Elkabany.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have openafs 1.4.9 client and server running on two separate
>>>>> machines across a WAN. The client has scripts that access the
>>>>> /afs/our.cell/ directory. Occasionally, the script will fail to
>>>>> complete, and the logs will report "Connection Timed Out" on a
>>>>> "mkdir -p /afs/our.cell/x/y/z" command. The frequency of the errors
>>>>> is approximately 1 in 100, small enough to not be easily reproducible
>>>>> manually, but enough to hamper our project. The scripts run as the
>>>>> root user, and are guaranteed to have the proper ticket and token. It's
>>>>> also important to note that these scripts often run in parallel (4 at
>>>>> a time, all root, modifying our cell). When one fails, all scripts
>>>>> running concurrently will fail with the same error, and I typically
>>>>> either unlog;kdestroy or restart the openafs-client (I am unsure which
>>>>> of those solutions is necessary or sufficient). I will soon have an
>>>>> additional LAN setup, and will determine if the same error occurs. Has
>>>>> anyone dealt with this issue before?
>>>>>
>>>>> Thank you for the assistance,
>>>>>
>>>>> Ken
>>>>
>>
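
For anyone wanting to narrow this down from the script side, here is a
minimal Python sketch of wrapping the os.makedirs call described above so
that a "Connection Timed Out" (ETIMEDOUT) error is logged and retried a few
times before giving up. The function name, the attempt count, and the delay
are all hypothetical choices for illustration. Nothing in this thread
establishes that a retry actually succeeds once the client is in this state
(the reports above suggest a restart may be needed), so treat it as a
diagnostic aid rather than a fix.

```python
import errno
import logging
import os
import time

log = logging.getLogger("afs_mkdir")


def _mkdir_p(path):
    # Equivalent of "mkdir -p": create parents, tolerate an existing directory.
    os.makedirs(path, exist_ok=True)


def makedirs_with_retry(path, attempts=3, delay=1.0, mkdir=_mkdir_p, sleep=time.sleep):
    """Create `path`, retrying only when the error is ETIMEDOUT.

    Returns the number of attempts used. `attempts` and `delay` are
    illustrative defaults (assumptions), not values recommended by OpenAFS.
    `mkdir` and `sleep` are injectable so the logic can be tested off AFS.
    """
    for attempt in range(1, attempts + 1):
        try:
            mkdir(path)
            return attempt
        except OSError as exc:
            # Anything other than a timeout is re-raised immediately,
            # as is a timeout on the final attempt.
            if exc.errno != errno.ETIMEDOUT or attempt == attempts:
                raise
            log.warning("mkdir %s timed out (attempt %d of %d), retrying",
                        path, attempt, attempts)
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
```

The logged timestamps of each timeout can then be lined up against the
server and client logs, which may help show whether the failures cluster
around idle periods.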
