[OpenAFS-devel] Problems with 1.4.8pre2 on Fedora 8

Harald Barth haba@kth.se
Tue, 21 Oct 2008 01:11:54 +0200 (CEST)


> I looked at the patch in CVS.  Seems like we should either also
> initialize idleError and tokenError in InitReq(), or make the check
> for these values in afs_Conn depend on initd == 1.

I think you have good eyes. I argued in jabber to initialize ALL
variables in the struct in InitReq() to zero as they might or might
not be used in the future. And you showed that two of them (idleError
tokenError) are dereferenced, but I imagine Derrick arguing that it
does not matter in the codepath in question....

So I am for a)

# diff -u /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_osi_pag.c{.orig,}
--- /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_osi_pag.c.orig  2008-10-20 18:12:07.000000000 +0200
+++ /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_osi_pag.c       2008-10-21 00:36:22.000000000 +0200
@@ -437,6 +437,12 @@
       av->skipserver[i] = 0;
       i++;
     }
+    av->accessError = 0;
+    av->volumeError = 0;
+    av->networkError = 0;
+    av->permWriteError = 0;
+    av->tokenError = 0;
+    av->idleError = 0;
     av->uid = PagInCred(acred);
     if (av->uid == NOPAG) {
        /* Afs doesn't use the unix uid for anuthing except a handle

even if the values *should* not be used yet. Another strategy would be
to set them to MAXINT and write asserts.

> As it stands, the test still relies on 2 potentially uninitialized
> variables, and servercheck=1 will have an effect or not depending on
> random values.

The alternatives are 

b) Think that skipserver[i] will allways be 0 and therefore 
   "protect" ((areq->idleError > 0) || (areq->tokenError > 0))
   The following patch is only cosmetic but does show that
   reasoning by not evaluating an unset variable:

# diff -u /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_conn.c{.orig,}
--- /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_conn.c.orig     2008-06-29 05:26:04.000000000 +0200
+++ /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_conn.c  2008-10-21 00:54:11.000000000 +0200
@@ -84,8 +84,8 @@
     /* First is always lowest rank, if it's up */
     if ((tv->status[0] == not_busy) && tv->serverHost[0]
        && !(tv->serverHost[0]->addr->sa_flags & SRVR_ISDOWN) &&
-       !(((areq->idleError > 0) || (areq->tokenError > 0))
-         && (areq->skipserver[0] == 1)))
+       !((areq->skipserver[0] == 1)
+         && ((areq->idleError > 0) || (areq->tokenError > 0))))
        lowp = tv->serverHost[0]->addr;
 
     /* Otherwise we look at all of them. There are seven levels of
@@ -97,8 +97,8 @@
      */
     for (notbusy = not_busy; (!lowp && (notbusy <= end_not_busy)); notbusy++) {
        for (i = 0; i < MAXHOSTS && tv->serverHost[i]; i++) {
-           if (((areq->tokenError > 0)||(areq->idleError > 0)) 
-               && (areq->skipserver[i] == 1))
+           if ((areq->skipserver[i] == 1)
+               && ((areq->tokenError > 0)||(areq->idleError > 0)))
                continue;
            if (tv->status[i] != notbusy) {
                if (tv->status[i] == rd_busy || tv->status[i] == rdwr_busy) {

or make it with asserts:


# diff -u /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_conn.c{.orig,}
--- /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_conn.c.orig     2008-06-29 05:26:04.000000000 +0200
+++ /usr/src/redhat/BUILD/openafs-1.4.8pre2/src/afs/afs_conn.c  2008-10-21 01:05:35.000000000 +0200
@@ -84,9 +84,11 @@
     /* First is always lowest rank, if it's up */
     if ((tv->status[0] == not_busy) && tv->serverHost[0]
        && !(tv->serverHost[0]->addr->sa_flags & SRVR_ISDOWN) &&
-       !(((areq->idleError > 0) || (areq->tokenError > 0))
-         && (areq->skipserver[0] == 1)))
+       !((areq->skipserver[0] == 1)
+         && ((areq->idleError > 0) || (areq->tokenError > 0)))) {
+        osi_Assert((areq->skipserver[0] == 1) && areq->initd);
        lowp = tv->serverHost[0]->addr;
+    }
 
     /* Otherwise we look at all of them. There are seven levels of
      * not_busy. This means we will check a volume seven times before it
@@ -97,9 +99,11 @@
      */
     for (notbusy = not_busy; (!lowp && (notbusy <= end_not_busy)); notbusy++) {
        for (i = 0; i < MAXHOSTS && tv->serverHost[i]; i++) {
-           if (((areq->tokenError > 0)||(areq->idleError > 0)) 
-               && (areq->skipserver[i] == 1))
+           if ((areq->skipserver[i] == 1)
+               && ((areq->tokenError > 0)||(areq->idleError > 0))){
+                osi_Assert((areq->skipserver[0] == 1) && areq->initd);
                continue;
+           }
            if (tv->status[i] != notbusy) {
                if (tv->status[i] == rd_busy || tv->status[i] == rdwr_busy) {
                    if (!areq->busyCount)

So which one?!
Harald.