[OpenAFS] vos examine oddity

Jeffrey Hutzelman jhutz@cmu.edu
Fri, 28 Jan 2005 19:00:45 -0500


On Friday, January 28, 2005 01:21:43 PM -0800 Russ Allbery 
<rra@stanford.edu> wrote:

> Dexter 'Kim' Kimball <dhk@ccre.com> writes:
>
>> In this case I'm explicitly examining an RO.  The ID of the RO is
>> correctly reported 536901459 on the <volume name> line -- but in that
>> same record two lines down the RO volID is "0"
>
>> So apparently vos exa does know the RO volID (first line) but it doesn't
>> get printed (third line) because (beats me ... called from different
>> routine?  gets reset somewhere after it's first populated? elves?)
>
> Yeah, this is the same issue.  The reason why the last part is correct is
> that that information is coming from the VLDB instead of the volserver.
> It looks like the volserver sometimes doesn't know that there's a RO.
>
> I'm going to file a bug with the OpenAFS bug tracker.  It's not a big deal
> for us, as I can still do what I needed to do by inverting the sense of my
> query (look for ROs and then find their RWs via the RWrite part, which
> appears to always be okay).


This is not a bug; you are just confused.  For illustrative purposes, let
me quote the results of 'vos examine root.cell.readonly -c cs.cmu.edu':

> root.cell.readonly                218175898 RO      26817 K  On-line
>     APRICOT.SRV.CS.CMU.EDU /vicepa
>     RWrite   16846699 ROnly          0 Backup          0
>     MaxQuota      30000 K
>     Creation    Fri Jan 28 13:44:44 2005
>     Last Update Fri Jan 28 13:44:44 2005
>     1118222 accesses in the past day (i.e., vnode references)

This part is information about a particular replica, from the volserver
on which the replica is located (in this case, APRICOT).  It is the same
information you would get from 'vos listvol -long' against that partition.
The same goes for the next three sections, one from each replica.



> root.cell.readonly                218175898 RO      26817 K  On-line
  XXXXXXXXXXXXXXXXXX                YYYYYYYYYYYY      ZZZZZZZZZZZZZZZZ

  As noted, this line gives you basic information about the specific volume
  you are looking at.  It is exactly the same information you get from
  'vos listvol'.

  X: The volume's name, as recorded in the volume header on disk.  In some
     situations this may not agree with the name in the VLDB.  For example,
     if the volume header is destroyed, the salvager will recreate it; in
     this case it uses a name of 'bogus.NNNNN' where NNNNN is the volume ID.
     The only way to change this is to rename or dump/restore the volume.

  Y: The ID and type of the specific volume you are looking at.  This ID is
     the one used by the fileserver and volserver to refer to the volume on
     disk.  The "RO" here does not mean "this is the RO ID"; it means "this
     volume has type RO".

  Z: The size and status of the replica you are looking at.
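
  If you are scripting against this output, the X/Y/Z fields can be pulled
  apart roughly as follows.  This is only a sketch: the regular expression
  is inferred from the sample line quoted above, not from any documented
  format, and parse_header_line is a hypothetical helper, not part of vos.

```python
import re

# Assumed layout, based only on the sample line above:
#   <name> <volume ID> <type> <size> K <status>
FIRST_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<id>\d+)\s+(?P<type>RW|RO|BK)\s+"
    r"(?P<size_k>\d+)\s+K\s+(?P<status>\S+)\s*$"
)

def parse_header_line(line):
    """Return the X/Y/Z fields of the first line, or None if it does not match."""
    m = FIRST_LINE.match(line)
    if m is None:
        return None
    return {
        "name": m["name"],            # X: name from the volume header
        "id": int(m["id"]),           # Y: ID of this specific volume
        "type": m["type"],            # Y: type of this specific volume
        "size_k": int(m["size_k"]),   # Z: size in kilobytes
        "status": m["status"],        # Z: status, e.g. On-line
    }

sample = "root.cell.readonly                218175898 RO      26817 K  On-line"
info = parse_header_line(sample)
```

  Note that info["type"] describes the volume itself; it says nothing about
  which ID the VLDB considers to be the RO ID.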


>     RWrite   16846699 ROnly          0 Backup          0
              AAAAAAAAA        BBBBBBBBB         CCCCCCCCC

  This is where the confusion comes into play.  This line _also_ reports
  information about the particular volume you are looking at, as recorded
  on disk.  None of these numbers is necessarily the same as Y above.

  A: The parent volume for the volume group containing the volume you are
     looking at.  Volume groups are the unit of replication; all other
     volumes in a group are read-only, copy-on-write replicas of the parent
     volume.  With an inode fileserver, all inodes belonging to every volume
     in the group are labelled with the volume ID of the parent.  With a
     namei fileserver, all files related to one volume group are kept under
     the same top-level directory, named by the parent volume's ID.  This
     does _not_ necessarily have to be a read/write volume (*).

  B: The volume ID of the read-only replica(s), if any, of the specific
     volume you are looking at.  This will normally be non-zero only if you
     are looking at the parent volume of a volume group -- other volumes do
     not normally have clones made from them.

  C: The volume ID of the backup clone, if any, of the specific volume you
     are looking at.  Again, this will normally be non-zero only if you are
     looking at the parent volume of a volume group.



>     RWrite: 16846699      ROnly: 218175898     Backup: 218177190
>     number of sites -> 5
>        server DATE.SRV.CS.CMU.EDU partition /vicepa RW Site
>        server APRICOT.SRV.CS.CMU.EDU partition /vicepa RO Site
>        server FIG.SRV.CS.CMU.EDU partition /vicepa RO Site
>        server DATE.SRV.CS.CMU.EDU partition /vicepa RO Site
>        server PLUM.SRV.CS.CMU.EDU partition /vicepa RO Site

  THIS part does come from the VLDB.  The ID's reported here are the
  RW, RO, and BK ID's for this volume set as reported in the VLDB.  Note
  that any volume set has only one VLDB entry.

If you want to associate RW and RO volume ID's, use the data from the VLDB, 
not the data from the volume headers.  The latter record relationships 
between physical volumes on a particular partition.  Note that the volume 
ID's in the VLDB section are reported with a colon (:) after each label, 
while the ones from volume headers are not.  Also bear in mind that you can 
get the VLDB information using 'vos listvl', and with considerably less 
fileserver load (actually, none!) than is incurred by 'vos examine'.
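
For anyone parsing this output in a script, the colon is enough to tell the 
two kinds of ID line apart.  The following is a rough sketch under that 
assumption only; classify_id_line is a hypothetical helper, and the pattern 
is based on the sample output quoted in this message rather than on any 
documented format.

```python
import re

# Matches "RWrite 123" (volume header) and "RWrite: 123" (VLDB) alike,
# capturing whether the label was followed by a colon.
ID_FIELD = re.compile(r"\b(RWrite|ROnly|Backup)(:?)\s+(\d+)")

def classify_id_line(line):
    """Return (source, ids) for an RWrite/ROnly/Backup line, else None.

    source is "vldb" when the labels carry a colon and "header" otherwise.
    """
    fields = ID_FIELD.findall(line)
    if not fields:
        return None
    source = "vldb" if fields[0][1] == ":" else "header"
    ids = {label: int(value) for label, _colon, value in fields}
    return source, ids

header_line = "    RWrite   16846699 ROnly          0 Backup          0"
vldb_line = "    RWrite: 16846699      ROnly: 218175898     Backup: 218177190"

header_result = classify_id_line(header_line)
vldb_result = classify_id_line(vldb_line)
```

With the sample lines from this message, only the "vldb" result carries a 
non-zero ROnly ID, which is the one to use when mapping RW to RO.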


(*) For an example of a case where the parent volume of a group is not an
    RW volume, take a look at this partial output from
    'vos listvol orange.srv.cs.cmu.edu vicepa -long':

> d.class.15410.f04.restored        688593058 RO    1280711 K  On-line
>     ORANGE.SRV.CS.CMU.EDU /vicepa
>     RWrite  688593058 ROnly          0 Backup          0
>     MaxQuota    1500000 K
>     Creation    Mon Jan 10 00:14:54 2005
>     Last Update Mon Jan 10 00:14:54 2005
>     662770 accesses in the past day (i.e., vnode references)

This is a read-only restore from our backup system.  Because it was 
restored directly as a read-only volume, its _type_ on the server is RO, 
but it is still the parent of its own volume group (and is thus listed in 
the RWrite field).  It does not have any clones, so its ROnly and Backup 
fields are zero.  In the VLDB, this volume is recorded with only an RO ID 
and a single RO site.  For various reasons, 'vos examine' is unable to deal 
with this construction and will report only the VLDB data.


-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA