[OpenAFS-devel] flock Input/output error

Derrick Brashear shadow@gmail.com
Wed, 11 Aug 2010 11:53:16 -0400


On Wed, Aug 11, 2010 at 11:14 AM, Hans-Werner Paulsen
<hans@mpa-garching.mpg.de> wrote:
> On Tue, Aug 10, 2010 at 04:59:32PM -0400, Derrick Brashear wrote:
>> What did fstrace show? Does tcpdump show aborts from the server or are
>> the errors locally generated?
>
> When running the fileserver with -auditlog the output is:
> Wed Aug 11 16:27:25 2010 [4] EVENT AFS_SRX_SetLock CODE 0 NAME ...
> ...
> there are no lines with CODE not 0.

How many SetLock lines are there compared to the number of flocks attempted?
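
One rough way to get that count is sketched below. It assumes the audit log is a plain text file with one event per line, shaped like the line quoted above ("... EVENT AFS_SRX_SetLock CODE 0 NAME ..."). The function name count_setlock_codes is made up for illustration and is not part of OpenAFS.

```python
import re
from collections import Counter

# Matches the event name and result code in lines shaped like the quoted one:
#   "... EVENT AFS_SRX_SetLock CODE 0 NAME ..."
# Assumption: the code is a decimal integer, possibly negative.
SETLOCK_RE = re.compile(r"EVENT AFS_SRX_SetLock CODE (-?\d+)")


def count_setlock_codes(lines):
    """Return a Counter mapping result code -> number of SetLock events."""
    counts = Counter()
    for line in lines:
        match = SETLOCK_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Passing an open file object for the audit log as lines works, since it iterates line by line. The sum of the returned counts is the figure to compare against the number of flock calls the test program made, and any key other than 0 would point at a server-side failure.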
>
> fstrace on the client for a process with an error executing flock
> shows the following lines:
> time 961.188544, pid 5152: Analyze RPC op 13 conn 0xffffffffc53572c0 code 0x0 user 0x41629d76
> time 961.188552, pid 5152: StoreAll vp 0xffffffff807a7400 len (0x0, 0x0)
> time 961.188554, pid 5152: StoreAll Done vp 0xffffffff807a7400 length 0x0 (returns 0x0)
> time 961.188554, pid 5152: Vnode Lock vp 0xffffffff807a7400 wait 0x0 excl 0x2
> time 961.188554, pid 5152: Vnode Lock vp 0xffffffff807a7400 wait 0x0 readers 0 waiters 5

As with the above: are you seeing as many attempts here as you expect,
or is it not even getting to this point?
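
A minimal sketch for tallying this from a saved fstrace dump follows. It assumes each flock attempt produces exactly one "Vnode Lock ... excl" line in the form shown above, which is a guess from this excerpt rather than something checked against the cache manager source. The name count_lock_attempts is hypothetical.

```python
import re
from collections import Counter

# Matches only the "excl" variant of the Vnode Lock trace line, e.g.
#   "time 961.188554, pid 5152: Vnode Lock vp 0x... wait 0x0 excl 0x2"
# so the companion "readers ... waiters ..." line is not counted twice.
VNODE_LOCK_RE = re.compile(
    r"pid (\d+): Vnode Lock vp 0x[0-9a-fA-F]+ wait \S+ excl "
)


def count_lock_attempts(lines):
    """Return a Counter mapping pid -> number of 'Vnode Lock ... excl' lines."""
    per_pid = Counter()
    for line in lines:
        match = VNODE_LOCK_RE.search(line)
        if match:
            per_pid[int(match.group(1))] += 1
    return per_pid
```

If the per-pid totals fall short of the number of flock calls each process made, the failing calls are presumably returning before they reach this trace point.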

> time 961.188660, pid 5152: Analyze RPC op 15 conn 0xffffffffc53572c0 code 0x0 user 0x41629d76
> time 962.189581, pid 5152: Close 0xffffffff80c7eb80 flags 0x8020
>
> (I added a "sleep(1)" when the flock is in error)
>
> Shall I add some afs_Trace calls to the code?