[OpenAFS-devel] CopyOnWrite failure
Derrick J Brashear
shadow@dementia.org
Tue, 12 Mar 2002 23:46:37 -0500 (EST)
On Tue, 5 Mar 2002, Derrick J Brashear wrote:
> On Tue, 5 Mar 2002, Matthew N. Andrews wrote:
>
> > does anyone have any suggestions about where I might proceed with
> > respect to tracking this down, and fixing it?
>
> Due to other things which have arisen, I suspect the pthread fileserver. I
> can't prove it and I'm not sure how to narrow it down other than "try the
> lwp fileserver for whatever window necessary to prove it doesn't happen,
> then start debugging"
More thoughts. Clearly I was (somewhat) wrong. If it were merely pthreads,
we'd see it on Solaris, and as far as I know we don't. So I'll narrow my
theory to the namei fileserver with pthreads, or just the namei fileserver.
This can be narrowed further in two ways:
-trying an lwp fileserver on Linux (it gets built but not installed)
-trying a namei fileserver on Solaris
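To spell out what each of those two tests would tell us, here is a rough sketch in Python. The function name and the wording of the conclusions are purely illustrative (nothing here comes from the OpenAFS tree), and a run without the failure only counts for as long as the window it was observed over.

```python
from typing import List, Optional


def interpret(lwp_on_linux_fails: Optional[bool],
              namei_on_solaris_fails: Optional[bool]) -> List[str]:
    """Map experiment outcomes to the hypothesis each one supports.

    Hypothetical helper for illustration only. Pass None for an
    experiment that has not been run yet.
    """
    notes: List[str] = []

    # Experiment 1: lwp fileserver on Linux (storage stays namei).
    if lwp_on_linux_fails is True:
        notes.append("pthreads not required: failure seen with lwp on Linux")
    elif lwp_on_linux_fails is False:
        notes.append("consistent with pthreads being involved: no failure with lwp")

    # Experiment 2: namei fileserver on Solaris (normally inode-based).
    if namei_on_solaris_fails is True:
        notes.append("namei implicated: failure follows namei to Solaris")
    elif namei_on_solaris_fails is False:
        notes.append("namei alone not sufficient: may be Linux-specific")

    return notes
```

For example, interpret(False, True) would point at the combination of namei and pthreads, while interpret(True, True) would point at namei alone.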
As yet I haven't seen this problem on my Linux machine, so I'd need some
help to track it down, and this week I have other
(non-computing-related) problems anyway. But, thanks to a donation from
MIT to replace virtue.openafs.org, the current virtue hardware should be
available for the latter test as soon as I have time to set up the new
machine.
Some questions for those of you with this problem:
-does it happen only with Linux servers?
-is it always with non-replicated volumes that have a .backup?
-if so, was the backup being recreated at the time? (the VolserLog
may be helpful here, as well as the vos examine info)
-is there anything pertinent about the access patterns?
-can you try an lwp fileserver for some period of time on your hardware?
-D