[OpenAFS-devel] 1.3.71 on AIX 5.2 with VAC 6.0

Horst Birthelmer horst@riback.net
Thu, 14 Oct 2004 16:15:24 +0200


On Oct 14, 2004, at 2:47 PM, Harald Barth wrote:

>
>> I traced the problem a little further and I think you end up in
>> AFSVolEndTrans because of an error in getting the file.
>> If you take a look at src/usd/usd_file.c, where the
>> USD_IOCTL_GETBLKSIZE call is implemented, you will see some really
>> weird stuff done for AIX because AIX lacks some data in the stats.
>
> It should be 4K because of the pipe.

OK, then it would have been right.
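
For anyone who wants to check what block size a pipe reports on their own platform, here is a minimal sketch. It assumes plain POSIX fstat() and its st_blksize field, not the usd code path, and pipe_blksize() is a hypothetical helper name. The value is platform dependent, so it may or may not be 4K.

```c
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of OpenAFS: returns the st_blksize
 * that fstat() reports for a freshly created pipe, or -1 on failure. */
long pipe_blksize(void)
{
    int fds[2];
    struct stat st;
    long result = -1;

    if (pipe(fds) != 0)
        return -1;

    /* Ask about the write end, since a dump would write into the pipe. */
    if (fstat(fds[1], &st) == 0)
        result = (long)st.st_blksize;

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Printing the return value of pipe_blksize() from a small main() is enough to compare platforms.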

>
>> In the meantime ... can you make sure that you're able to write more
>> than 4GB in the filesystem you're doing your dumps on?
>> So that I'm not tracing a ghost. ;-)
>
> See above - I'm writing to a pipe. The receiving process takes big
> chunks of data, but may have died for some reason so that the pipe
> stalls. Can that be a problem? In the meantime I'm trying to get
> Arla's vos dump in shape and I'm finding some strange code in
> OpenAFS' vos dump, but not related to this problem. Patch will follow.
>

The receiving process in that case is vos, isn't it?
I think your data fetching does something wrong and you end up in the 
ERROR_EXIT part.
Then your UV_VolumeDump gets stuck and you end up in the "fromtid" part 
of the error handling code.
Now your AFSVolEndTrans gets an error (maybe for the same reason) and
you end up with a core.
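
The control flow in that theory can be sketched roughly as follows. This is only an illustration under assumptions: fake_receive_file(), fake_end_trans() and fake_volume_dump() are hypothetical stand-ins, not the real OpenAFS functions, and the error values are made up. It shows a failed fetch jumping to the error exit, where ending the open transaction fails as well, while the first error is the one handed back to the caller.

```c
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the OpenAFS functions. */
static int end_trans_calls = 0;

/* Pretend that fetching the dump data fails. */
static int fake_receive_file(void)
{
    return -1;
}

/* Pretend that ending the transaction fails too. */
static int fake_end_trans(int tid, int *rcode)
{
    (void)tid;
    end_trans_calls++;
    *rcode = 0;
    return -2;
}

/* Returns the first error seen; cleanup errors are only reported. */
int fake_volume_dump(void)
{
    int fromtid = 42;   /* made-up id of an open transaction */
    int error = 0;
    int code;
    int rcode = 0;

    code = fake_receive_file();
    if (code) {
        error = code;
        goto error_exit;
    }

error_exit:
    if (fromtid) {
        /* The "fromtid" part of the cleanup: end the open transaction. */
        code = fake_end_trans(fromtid, &rcode);
        if (code || rcode) {
            fprintf(stderr, "could not end transaction (code %d)\n", code);
            if (!error)
                error = code ? code : rcode;
        }
    }
    return error;
}
```

In this sketch the second failure is only logged. If the real cleanup code did something less careful with a failed end-of-transaction call, that would be one place where a core could come from.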

That's my theory so far...
That's why I was looking into the VolumeDump and DumpFunction code.
Since you said you won't get the dump (I just assumed this), I thought
it had to be the USD_IOCTL call. But the more I think about it, the more
I'm convinced it gets an error in ReceiveFile and then propagates that
error code back.

These are all assumptions because I still can't test anything to track
that error down. :-(

BTW, forget the patch I sent you. It's not related to this at all. I
mean, that code isn't correct either, but it's not our problem.

Horst