[OpenAFS-devel] stability problems, and interesting symptoms. ..

Neulinger, Nathan nneul@umr.edu
Wed, 30 May 2001 12:26:48 -0500


Something else that looks odd:

from VAttachVolumeByName_r:

    *ec = 0;
    strcpy(path, VPartitionPath(partp));
    strcat(path, "/");
    strcat(path, name);
    VOL_UNLOCK
    if ((fd = open(path, O_RDONLY)) == -1 || fstat(fd,&status) == -1) {
        close(fd);
        VOL_LOCK
        Log("VAttachVolume: Error opening/statting volume header file (%s)\n", path);
        *ec = VNOVOL;
        goto done;
    }
    n = read(fd, &diskHeader, sizeof (diskHeader));


Whenever I move a volume off the server, that open/fstat error case ALWAYS
triggers:

May 30 12:19:44 afs4 fileserver[2915]: VAttachVolume: Error opening/statting volume header file (/vicepb//V0536897242.vol)
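
One way to narrow this down would be to record which call fails and with
which errno: ENOENT would suggest the header file is simply gone (as expected
after a move), while something like EMFILE would point at descriptor
exhaustion. The following is a minimal sketch, not the OpenAFS code;
probe_header is a hypothetical helper name, and it writes to stderr rather
than using Log(). Unlike the snippet above, it only calls close() when open()
actually succeeded.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of OpenAFS: open and fstat a file.
 * Returns 0 on success, or the errno value of the call that failed. */
static int probe_header(const char *path, struct stat *st)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int err = errno;        /* save errno before any other call */
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(err));
        return err;             /* nothing to close here */
    }
    if (fstat(fd, st) == -1) {
        int err = errno;
        fprintf(stderr, "fstat(%s) failed: %s\n", path, strerror(err));
        close(fd);              /* fd is valid, so closing it is safe */
        return err;
    }
    close(fd);
    return 0;
}
```

Calling probe_header("/vicepb//V0536897242.vol", &st) right after a move and
comparing the result against ENOENT would show whether the attach attempt is
just hitting an already-deleted header.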

I'm not sure whether that log message reflects a real error condition, but
the double slash in the path looks odd. The double slash itself appears to
work fine in the open() call. It is still strange that this case triggers
every time, unless this is some kind of "try to attach the volume after it
was deleted, just to make sure it was deleted" check.
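
On the double slash: POSIX pathname resolution treats a run of slashes in the
middle of a path as a single slash, which matches open() accepting it. If the
goal were only to tidy the logged path, a join helper along these lines would
do it. This is a sketch under assumptions: join_path is a hypothetical name,
and it has not been established here whether the extra slash comes from
VPartitionPath() or from the name argument, so the helper handles both.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join dir and name with exactly one '/' between them.
 * Returns 0 on success, -1 if the result does not fit in buf. */
static int join_path(char *buf, size_t buflen, const char *dir, const char *name)
{
    size_t dlen = strlen(dir);

    /* Drop trailing slashes from dir, but keep a lone "/" (root). */
    while (dlen > 1 && dir[dlen - 1] == '/')
        dlen--;

    /* Skip leading slashes on name. */
    while (*name == '/')
        name++;

    /* No separator needed if dir is empty or is just "/". */
    const char *sep = (dlen == 0 || dir[dlen - 1] == '/') ? "" : "/";

    int n = snprintf(buf, buflen, "%.*s%s%s", (int)dlen, dir, sep, name);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Unlike the strcpy/strcat sequence, this variant also reports when the joined
path would overflow the buffer instead of writing past it.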

-- Nathan

> -----Original Message-----
> From: Neulinger, Nathan [mailto:nneul@umr.edu]
> Sent: Wednesday, May 30, 2001 12:10 PM
> To: 'openafs-devel@openafs.org'
> Subject: RE: [OpenAFS-devel] stability problems, and interesting
> symptoms. ..
> 
> 
> I added a pile of debugging to volume.c and volprocs.c and came to this:
> 
> 
>     fdP = IH_OPEN(h);
>     if (fdP == NULL) {
>         Log("ReadHeader: %s:%d\n", __FILE__, __LINE__);
>         *ec = VSALVAGE;
>         return;
>     }
> 
> in ReadHeader in volume.c... The IH_OPEN is failing. I'm trying to bump up
> inode-max and file-max on the box in question - we'll see if that makes any
> difference.
> 
> -- Nathan
> 
> > -----Original Message-----
> > From: Neulinger, Nathan [mailto:nneul@umr.edu]
> > Sent: Wednesday, May 30, 2001 10:46 AM
> > To: 'openafs-devel@openafs.org'
> > Subject: [OpenAFS-devel] stability problems, and interesting 
> > symptoms...
> > 
> > 
> > I've got two problems and one interesting symptom, though the symptom is
> > probably not related to the first problem.
> > 
> > First, on a couple of my servers (this started about a month ago, with no
> > apparent changes to server hardware or software): if I move volumes off
> > the server en masse, one at a time, one after another, then at some point
> > in the process, after 50-100 volumes have been moved, I get a volserver
> > error complaining about being unable to attach a volume. Once that
> > happens, any listvol or volserver activity against the server fails from
> > then on. Usually bos status indicates that vol exited with signal 6,
> > although not necessarily immediately (I haven't seen that with openafs
> > yet, but that was typically what I saw with 3.6-2.3). I have no error
> > messages from the volserver other than this, and basically no indication
> > that anything is wrong.
> > 
> > I get the error both with transarc 3.6-2.3 and openafs-cvs. 
> > 
> > Syslogs look like this:
> > ----
> > (lots and lots of stuff like the next few lines for the other volumes that
> > moved ok.)
> > May 30 10:30:18 afs4 fileserver[511]: fssync: volume 537013509 moved to 63019783; breaking all call backs
> > May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537013509 deleted
> > May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537013511 deleted
> > May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537020173 deleted
> > May 30 10:30:20 afs4 volserver[483]: 1 Volser: Clone: Cloning volume 536897629 to new volume 537020174
> > May 30 10:30:20 afs4 fileserver[511]: fssync: volume 536897629 moved to 63019783; breaking all call backs
> > May 30 10:30:20 afs4 volserver[483]: 1 Volser: Delete: volume 536897629 deleted
> > May 30 10:30:20 afs4 volserver[483]: 1 Volser: Delete: volume 536897631 deleted
> > May 30 10:30:22 afs4 volserver[483]: 1 Volser: Delete: volume 537020174 deleted
> > May 30 10:30:23 afs4 volserver[483]: VAttachVolume: Error attaching volume /vicepd/V0536906941.vol; volume needs salvage
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906941
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536985904
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536889228
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536924071
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536896750
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536897341
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536983233
> > May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906834
> > (tons of that for every volume on the server, and it happens again if you
> > do a vos listvol against the server.)
> > -----
> > 
> > The other symptom - when clearing off a server, I happened to notice that
> > the volserver seemed to hang (and not respond to any new client requests
> > such as vos partinfo) if I started a vos release against it. Once the vos
> > release (in particular the ForwardMulti) completed, the volserver
> > responded again. I'm not talking about a huge volume - maybe 5-10 megs
> > with a few thousand files in it.
> > 
> > I'm running volserver with no options in both cases. 
> > 
> > -- Nathan
> > 
> > ------------------------------------------------------------
> > Nathan Neulinger                       EMail:  nneul@umr.edu
> > University of Missouri - Rolla         Phone: (573) 341-4841
> > Computing Services                       Fax: (573) 341-4216
> > _______________________________________________
> > OpenAFS-devel mailing list
> > OpenAFS-devel@openafs.org
> > https://lists.openafs.org/mailman/listinfo/openafs-devel
> > 