[OpenAFS-devel] Problems with 1.4.8pre2 on Fedora 8

Derrick Brashear shadow@gmail.com
Tue, 14 Oct 2008 11:54:59 -0400


Servers blacklisted once are not marked down. The goal is to handle a
case like this: your server is, for instance, still Transarc, and you
are trying an authenticated operation with krb5-style tokens. That
operation would work with one of the three replicas you have of
something, but not with the one you're trying. In that case you want
neither to mark that server down (it's not), nor to try it again.
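
To make the distinction concrete, here is a minimal sketch in C. It is
not the OpenAFS code: every name in it (MAX_HOSTS, SERVER_DOWN,
struct server, struct request, blacklist_once, pick_server) is
hypothetical, and it only models the idea of a per-request skip list
(like areq->skipserver) kept separate from the global down flag (like
SRVR_ISDOWN).

```c
#include <assert.h>

#define MAX_HOSTS   3     /* hypothetical: number of replicas considered */
#define SERVER_DOWN 0x1   /* hypothetical flag, modelled on SRVR_ISDOWN */

struct server {
    const char *name;
    unsigned flags;       /* global state, seen by every request */
};

struct request {
    /* per-request state, modelled on areq->skipserver */
    unsigned char skip[MAX_HOSTS];
};

/* Skip server idx for the rest of this request only.
 * The server's global flags are deliberately left untouched. */
void blacklist_once(struct request *req, int idx)
{
    assert(idx >= 0 && idx < MAX_HOSTS);
    req->skip[idx] = 1;
}

/* Return the index of the first server that is neither marked down
 * globally nor skipped by this request, or -1 if none is left. */
int pick_server(const struct server *hosts, const struct request *req)
{
    for (int i = 0; i < MAX_HOSTS; i++) {
        if (!(hosts[i].flags & SERVER_DOWN) && !req->skip[i])
            return i;
    }
    return -1;
}
```

In this model, blacklisting index 0 for one request makes pick_server
move on to the next usable server for that request, while a fresh
request still gets index 0 and the server's flags stay as they were.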

Is this Linux? Are you willing to apply patches live with ksplice?



On Tue, Oct 14, 2008 at 11:43 AM, Harald Barth <haba@kth.se> wrote:
>> I suppose these "Connection timed out" events must have to do with
>> new code in afs_analyze.c and rx.c. Looking in afs_analyze.c at the
>> new routine afs_BlackListOnce I do not understand the following
>> lines:
>>
>>                  if (tvp->serverHost[i] &&
>>                      !(tvp->serverHost[i]->addr->sa_flags &
>>                        SRVR_ISDOWN)) {
>>                      areq->skipserver[i] = 1;
>>                  }
>>
>>
>> I think the "!" is wrong. Why should we skip a server which is not down and use a server which is down?
>
> I applied the patch
> http://www.openafs.org/cgi-bin/wdelta/cachemgr-blacklist-down-servers-20081010?diff=1&f=u
>
> But still no cigar :( How are your results?
>
> $ ./sob -n 30000 -o 1000 -s 1k -b 1k -w
> Writing 30000 files of size 0.001MB, blocksize 1kB
> Failed to create file testfile.12344 : Connection timed out
> [Exit 1 ]
> $ /usr/openafs/sbin/rxdebug localhost 7001 -v
> Trying 127.0.0.1 (port 7001):
> AFS version:  OpenAFS 1.4.8pre2-pdc51 built  2008-10-14
>
> :( :( What is the logic behind the BlackListOnce()?
>
> http://www.openafs.org/cgi-bin/cvsweb.cgi/openafs/src/afs/afs_analyze.c.diff?r1=1.25&r2=1.31
>
> And I think I want logging of servers marked up/down to syslog in the
> usual way before I want this near production.
>
> Been staring too long at the diff now.