[OpenAFS] Odd vfsck behavior under Solaris 9

Dale Ghent daleg@umbc.edu
Wed, 14 Jul 2004 18:01:05 -0400


On Jul 14, 2004, at 3:02 PM, Derrick Brashear wrote:

> Why do you need to use fsck on a raw device to run a fileserver on 
> Solaris 9?
>
> The only fileserver caveat I know is "don't use it on a logging ufs 
> partition". I don't have any Solaris 9 machines, period, but there's 
> no other reason I know it shouldn't work.

Because on Solaris, fsck(1M) has always taken the raw (aka "special") 
device as its argument (on Solaris, these are the /dev/rdsk/* character 
devices rather than their /dev/dsk/* block device counterparts.)

So, we have our UFS-formatted /vicep* file systems, which have their 
entries in /etc/vfstab with the fs-type field set to "afs" rather than 
"ufs". When the server boots, the init scripts fsck the file systems, 
exec'ing vfsck for the ones marked "afs" in /etc/vfstab. vfsck then 
barfs[1] on these file systems because the init script, per Solaris 
convention, passes the *raw* (/dev/rdsk) device as the argument. vfsck 
appears to be happy only when passed the block device (/dev/dsk) path. 
This is broken.
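
To make the path difference concrete, here is a minimal sketch of 
mapping a raw path to its non-raw counterpart before handing it to 
vfsck by hand. The helper name raw_to_block is hypothetical, and it 
assumes the only difference between the two paths is the "rdsk" versus 
"dsk" directory component. It is not part of OpenAFS or of the Solaris 
init scripts.

```shell
#!/bin/sh
# Hypothetical helper (not from OpenAFS or Solaris): derive the non-raw
# device path from a raw one by swapping the "rdsk" directory for "dsk".
# Assumes the two paths differ only in that one component.
raw_to_block() {
    printf '%s\n' "$1" | sed 's|/rdsk/|/dsk/|'
}

# Show the mapping for the metadevice from footnote [1].
raw_to_block /dev/md/rdsk/d1
```

This only shows the path mapping; it does nothing about the fact that 
the boot-time fsck hands vfsck the raw device.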

If one answers "y" in response to the error[1] vfsck throws, the same 
error message is repeated. However, when hitting "y" again, vfsck 
proceeds to check the fs, and at "Phase 5", throws another error[2]. 
Re-running vfsck on the same fs again produces the same results... that 
is, the first vfsck run never really fixes anything.

To recap, this is with unpatched OpenAFS 1.2.11 on Solaris 9 4/04, 
running with the latest kernel patch (112233-12) and UFS-related 
patches.

I compiled OpenAFS 1.3.65 and used the vfsck binary produced from that 
on this same system to see if there was any change in behaviour. 
Unfortunately, there wasn't, but prior to the usual "CANNOT READ: BLK 
0" error a new error was observed[3].

/dale

[1]
----Open AFS (R) openafs 1.2.11 fsck----
** /dev/md/rdsk/d1

CANNOT READ: BLK 0
CONTINUE? [yn]



[2]
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN CYL GROUP (SUPERBLK)
SALVAGE? [yn] y

2 files, 9 used, 35270803 free (11 frags, 4408849 blocks, 0.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****



[3]
[root@hfs6]/tmp>./vfsck /dev/md/rdsk/d11
----Open AFS (R) openafs 1.3.65 fsck----
** /dev/md/rdsk/d11
IMPOSSIBLE INTERLEAVE=0 IN SUPERBLOCK
SET TO DEFAULT? [yn] y


CANNOT READ: BLK 0
CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ:

CANNOT READ: BLK 0
CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ:
** Last Mounted on /vicepa
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN CYL GROUP (SUPERBLK)
SALVAGE? [yn] y

2 files, 9 used, 35270803 free (11 frags, 4408849 blocks, 0.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****
[root@hfs6]/tmp>