[OpenAFS-Doc] man page updates - full changes against cvs

Jason Edgecombe jason@rampaginggeek.com
Sun, 02 Sep 2007 22:55:35 -0400


This is a multi-part message in MIME format.
--------------080705000500070403000805
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

Here is a dump of all the latest versions of my updates against cvs.

* vos_copy -live should be checked.
* read_tape is incomplete, but I cannot proceed further because I don't 
have a tape drive for testing.

vos_copy is the only new file. I also fixed the example syntax and the 
capitalization in vos_convertROtoRW.

As always, comments are welcome.

Sincerely,
Jason

--------------080705000500070403000805
Content-Type: text/plain;
 name="diff.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="diff.txt"

? doc/man-pages/pod1/vos_convertROtoRW.pod
? doc/man-pages/pod1/vos_copy.pod
? doc/man-pages/pod8/read_tape.pod
Index: doc/man-pages/README
===================================================================
RCS file: /cvs/openafs/doc/man-pages/README,v
retrieving revision 1.21
diff -u -r1.21 README
--- doc/man-pages/README	19 Aug 2007 22:04:31 -0000	1.21
+++ doc/man-pages/README	3 Sep 2007 02:47:18 -0000
@@ -209,6 +209,7 @@
        vos clone
        vos convertROtoRW
        vos copy
+       vos setfields
        vos shadow
        vsys
 
Index: doc/man-pages/pod1/vos.pod
===================================================================
RCS file: /cvs/openafs/doc/man-pages/pod1/vos.pod,v
retrieving revision 1.5
diff -u -r1.5 vos.pod
--- doc/man-pages/pod1/vos.pod	18 Aug 2007 02:28:54 -0000	1.5
+++ doc/man-pages/pod1/vos.pod	3 Sep 2007 02:47:19 -0000
@@ -237,7 +237,8 @@
 L<vos_backup(1)>,
 L<vos_backupsys(1)>,
 L<vos_changeaddr(1)>,
-L<vos_changeloc(1)>,
+L<vos_convertROtoRW(1)>,
+L<vos_copy(1)>,
 L<vos_create(1)>,
 L<vos_delentry(1)>,
 L<vos_dump(1)>,
Index: doc/man-pages/pod8/fileserver.pod
===================================================================
RCS file: /cvs/openafs/doc/man-pages/pod8/fileserver.pod,v
retrieving revision 1.7
diff -u -r1.7 fileserver.pod
--- doc/man-pages/pod8/fileserver.pod	12 Jun 2007 03:49:56 -0000	1.7
+++ doc/man-pages/pod8/fileserver.pod	3 Sep 2007 02:47:19 -0000
@@ -17,7 +17,7 @@
     S<<< [B<-cb> <I<number of call backs>>] >>> [B<-banner>] [B<-novbc>]
     S<<< [B<-implicit> <I<admin mode bits: rlidwka>>] >>> [B<-readonly>]
     S<<< [B<-hr> <I<number of hours between refreshing the host cps>>] >>>
-    [B<-busyat> <I<< redirect clients when queue > n >>>]
+    S<<< [B<-busyat> <I<< redirect clients when queue > n >>>] >>>
     [B<-nobusy>] S<<< [B<-rxpck> <I<number of rx extra packets>>] >>>
     [B<-rxdbg>] [B<-rxdbge>] S<<< [B<-rxmaxmtu> <I<bytes>>] >>>
     S<<< [B<-rxbind> <I<address to bind the Rx socket to>>] >>>
@@ -48,9 +48,9 @@
 
 The File Server creates the F</usr/afs/logs/FileLog> log file as it
 initializes, if the file does not already exist. It does not write a
-detailed trace by default, but use the B<-d> option to increase the amount
-of detail. Use the B<bos getlog> command to display the contents of the
-log file.
+detailed trace by default, but the B<-d> option may be used to
+increase the amount of detail. Use the B<bos getlog> command to
+display the contents of the log file.
 
 The command's arguments enable the administrator to control many aspects
 of the File Server's performance, as detailed in L<OPTIONS>.  By default
@@ -68,7 +68,7 @@
 
 The maximum number of lightweight processes (LWPs) the File Server uses to
 handle requests for data; corresponds to the B<-p> argument. The File
-Server always uses a minimum of 32 KB for these processes.
+Server always uses a minimum of 32 KB of memory for these processes.
 
 =item *
 
@@ -168,6 +168,16 @@
 that it can take that long for changed group memberships to become
 effective. To change this frequency, use the B<-hr> argument.
 
+The File Server stores volumes in partitions. A partition is a
+filesystem or directory on the server machine that is named C</vicepX>
+or C</vicepXX>, where XX is "a" through "z" or "aa" through "zz". The
+File Server expects each C</vicepXX> directory to be on a dedicated
+filesystem. The File Server will only use a C</vicepXX> directory if
+it is a mountpoint for another filesystem, unless the file
+C</vicepXX/AlwaysAttach> exists. The data in the partition is stored
+in a special format that can only be accessed using OpenAFS commands
+or an OpenAFS client.
+
 The File Server generates the following message when a partition is nearly
 full:
 
@@ -178,12 +188,12 @@
 
 =head1 CAUTIONS
 
-Do not use the B<-k> and -w arguments, which are intended for use by the
-AFS Development group only. Changing them from their default values can
-result in unpredictable File Server behavior.  In any case, on many
-operating systems the File Server uses native threads rather than the LWP
-threads, so using the B<-k> argument to set the number of LWP threads has
-no effect.
+Do not use the B<-k> and B<-w> arguments, which are intended for use
+by the AFS Development group only. Changing them from their default
+values can result in unpredictable File Server behavior.  In any case,
+on many operating systems the File Server uses native threads rather
+than the LWP threads, so using the B<-k> argument to set the number of
+LWP threads has no effect.
 
 Do not specify both the B<-spare> and B<-pctspare> arguments. Doing so
 causes the File Server to exit, leaving an error message in the
@@ -398,6 +408,163 @@
                 -cmd "/usr/afs/bin/fileserver -pctspare 10 \
                 -L" /usr/afs/bin/volserver /usr/afs/bin/salvager
 
+
+=head1 TROUBLESHOOTING
+
+Sending process signals to the File Server Process can change its
+behavior in the following ways:
+
+
+   Process          Signal       OS     Result
+   ---------------------------------------------------------------------
+
+   File Server      XCPU        Unix    Prints a list of client IP
+                                        Addresses.
+
+   File Server      USR2      Windows   Prints a list of client IP
+                                        Addresses.
+
+   File Server      POLL        HPUX    Prints a list of client IP
+                                        Addresses.
+
+   Any server       TSTP        Any     Increases Debug level by a power
+                                        of 5 -- 1,5,25,125, etc.
+                                        This has the same effect as the
+                                        -debug <XXX>
+                                        command-line option.
+
+   Any Server       HUP         Any     Resets Debug level to 0
+
+   File Server      TERM        Any     Run minor instrumentation over
+                                        the list of descriptors.
+
+   Other Servers    TERM        Any     Causes the process to quit.
+
+   File Server      QUIT        Any     Causes the File Server to Quit.
+                                        Bos Server knows this.
+
+
+The basic metric of whether an AFS file server is doing well is its
+blocked connection count, which can be found by running the following
+command:
+
+   /usr/afsws/etc/rxdebug <server> | grep waiting_for | wc -l
+
+Each line returned by C<rxdebug> that contains the text "waiting_for"
+represents a blocked connection.
+
+If the blocked connection count is ever above 0, the server is having
+problems replying to clients in a timely fashion.  If it gets above
+roughly 10, users will notice the slowness. The total number of
+connections is a mostly irrelevant number that increases essentially
+monotonically for as long as the server has been running and then
+goes back down to zero when it's restarted.
+
+The most common cause of blocked connections rising on a server is
+some process somewhere performing an abnormal number of accesses to
+that server and its volumes.  If multiple servers have a blocked
+connection count, the most likely explanation is that there is a
+volume replicated between those servers that is absorbing an
+abnormally high access rate.
+
+To get an access count on all the volumes on a server, run:
+
+   vos listvol <server> -long
+
+and save the output in a file.  The results will look like a bunch of
+B<vos examine> output for each volume on the server.  Look for lines
+like:
+
+   40065 accesses in the past day (i.e., vnode references)
+
+and look for volumes with an abnormally high number of accesses.
+Anything over 10,000 is fairly high, but some core infrastructure
+volumes like root.cell and other volumes close to the root of the cell
+will have that many hits routinely.  Anything over 100,000 is
+generally abnormally high.  The count resets about once a day.
+
+Another approach that can be used to narrow the possibilities for a
+replicated volume, when multiple servers are having trouble, is to
+find all replicated volumes for that server.  Run:
+
+   % vos listvldb -server <server>
+
+where <I<server>> is one of the servers having problems, to refresh the
+VLDB cache, and then run:
+
+   % vos listvldb -server <server> -part <partition>
+
+to get a list of all volumes on that server and partition, including
+every other server with replicas.
+
+Once the volume causing the problem has been identified, the best way to
+deal with the problem is to move that volume to another server with a low
+load.  Often the volume name is enough to tell what's going on: scan the
+cluster for scripts run by that user (if it's a user volume) or using
+that program (if it's a non-user volume).
+
+If you still need additional information about who's hitting that
+server, sometimes you can guess at that information from the failed
+callbacks in the F<FileLog> log in F</usr/afs/logs> on the server, or
+from the output of:
+
+   /usr/afsws/etc/rxdebug <server> -rxstats
+
+but the best way is to turn on debugging output from the file server.
+(Warning: This generates a *lot* of output into FileLog on the AFS
+server.)  To do this, log on to the AFS server, find the PID of the
+fileserver process, and do:
+
+    kill -TSTP <pid of file server process>
+
+This will raise the debugging level so that you'll start seeing what
+people are actually doing on the server.  You can do this up to three
+more times to get even more output if needed.  To reset the debugging
+level back to normal, use the following command (it will NOT terminate
+the file server):
+
+    kill -HUP <pid of file server process>
+
+The debugging setting on the File Server should be reset back to
+normal when debugging is no longer needed, otherwise the AFS
+server may well fill its disks with debugging output.
+
+The lines of the debugging output that are most useful for debugging
+load problems are:
+
+    SAFS_FetchStatus,  Fid = 2003828163.77154.82248, Host 171.64.15.76
+    SRXAFS_FetchData, Fid = 2003828163.77154.82248
+
+(The example above is partly truncated to highlight the interesting
+information.)  The Fid identifies the volume and vnode within the
+volume; the volume is the first long number.  In this example, the
+volume is:
+
+   % vos examine 2003828163
+   pubsw.matlab61                   2003828163 RW    1040060 K  On-line
+       afssvr5.Stanford.EDU /vicepa 
+       RWrite 2003828163 ROnly 2003828164 Backup 2003828165 
+       MaxQuota    3000000 K 
+       Creation    Mon Aug  6 16:40:55 2001
+       Last Update Tue Jul 30 19:00:25 2002
+       86181 accesses in the past day (i.e., vnode references)
+
+       RWrite: 2003828163    ROnly: 2003828164    Backup: 2003828165
+       number of sites -> 3
+          server afssvr5.Stanford.EDU partition /vicepa RW Site 
+          server afssvr11.Stanford.EDU partition /vicepd RO Site 
+          server afssvr5.Stanford.EDU partition /vicepa RO Site 
+
+and from the Host information one can tell what system is accessing that
+volume.
+
+Note that the output of L<vos_examine(1)> also includes the access count,
+so once the problem has been identified, B<vos examine> can be used to
+see if the access count is still increasing.  Also remember that you
+can run B<vos examine> on the read-only replica (e.g.,
+C<pubsw.matlab61.readonly>) to see the access counts on the read-only
+replica on all of the servers that it's located on.
+
 =head1 PRIVILEGE REQUIRED
 
 The issuer must be logged in as the superuser C<root> on a file server
@@ -413,7 +580,8 @@
 L<bos_getlog(8)>,
 L<fs_setacl(1)>,
 L<salvager(8)>,
-L<volserver(8)>
+L<volserver(8)>,
+L<vos_examine(1)>
 
 =head1 COPYRIGHT
 

--------------080705000500070403000805
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
 name="vos_copy.pod"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="vos_copy.pod"

=head1 NAME

vos copy - Make a copy of a volume

=head1 SYNOPSIS

=for html
<div class="synopsis">

B<vos copy> S<<< [B<-id>] <I<volume name or ID of source>> >>>
   S<<< [B<-fromserver>] <I<machine name for source>> >>>
   S<<< [B<-frompartition>] <I<partition name for source>> >>>
   S<<< [B<-toname>] <I<volume name for new copy>> >>>
   S<<< [B<-toserver>] <I<machine name for destination>> >>>
   S<<< [B<-topartition>] <I<partition name for destination>> >>>
   [B<-offline>] [B<-readonly>] [B<-live>] S<<< [B<-cell> <I<cell name>>] >>>
   [B<-noauth>] [B<-localauth>] [B<-verbose>] [B<-encrypt>] [B<-help>]

=for html
</div>

=head1 DESCRIPTION

The B<vos copy> command makes a copy of a volume with a new name. This is
equivalent to doing a B<vos dump> and then a B<vos restore>.
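
As a rough illustration of that equivalence, the following shell sketch
prints a two-step dump-and-restore sequence for the same arguments. The
helper name C<vos_copy_equivalent> and the F</tmp> dump file path are
assumptions made for this example, and the commands are only printed, not
run. It is not a description of how B<vos copy> works internally.

```shell
# Dry-run sketch: print a dump-and-restore sequence for copying a volume.
# vos_copy_equivalent is a hypothetical helper and the /tmp dump path is
# an assumption. Nothing is executed; the commands are only printed.
vos_copy_equivalent() {
    src_vol=$1; from_server=$2; from_part=$3
    new_vol=$4; to_server=$5; to_part=$6
    dump_file="/tmp/${src_vol}.dump"

    # Step 1: dump the source volume to a file.
    echo "vos dump -id $src_vol -server $from_server -partition $from_part -file $dump_file"
    # Step 2: restore that dump under the new name at the destination.
    echo "vos restore -server $to_server -partition $to_part -name $new_vol -file $dump_file"
}

vos_copy_equivalent test server1 a test2 server1 a
```

The intermediate dump file is the main practical difference in this
sketch; the verbose output of B<vos copy> suggests that it transfers the
data to the destination without one.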

=head1 OPTIONS

=over 4

=item [B<-id>] <I<volume name or ID of source>>

The name or ID of the source volume for the copy.

=item [B<-fromserver>] <I<machine name for source>>

The server machine on which the source volume resides.

=item [B<-frompartition>] <I<partition name for source>>

The partition on which the source volume resides.

=item [B<-toname>] <I<volume name for new copy>>

The name of the new copy.

=item [B<-toserver>] <I<machine name for destination>>

The server machine on which the new volume will reside.

=item [B<-topartition>] <I<partition name for destination>>

The partition on which the new volume will reside.

=item B<-offline>

Leaves the new volume flagged as off-line in the volume database.

=item B<-readonly>

Flags the new volume as read-only in the volume database.

=item B<-live>

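Copies the live volume without cloning.

=item B<-cell> <I<cell name>>

Names the cell in which to run the command. Do not combine this argument
with the B<-localauth> flag. For more details, see L<vos(1)>.

=item B<-noauth>

Assigns the unprivileged identity C<anonymous> to the issuer. Do not
combine this flag with the B<-localauth> flag. For more details, see
L<vos(1)>.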

=item B<-localauth>

Constructs a server ticket using a key from the local
F</usr/afs/etc/KeyFile> file. The B<vos> command interpreter presents it
to the Volume Server and Volume Location Server during mutual
authentication. Do not combine this flag with the B<-cell> argument or
B<-noauth> flag. For more details, see L<vos(1)>.

=item B<-verbose>

Produces on the standard output stream a detailed trace of the command's
execution. If this argument is omitted, only warnings and error messages
appear.

=item B<-encrypt>

Encrypts the command so that the operation's results are
not transmitted across the network in clear text.

=item B<-help>

Prints the online help for this command. All other valid options are
ignored.

=back

=head1 OUTPUT

This command has no output unless B<-verbose> is specified or there is
an error.

=head1 EXAMPLES

The following example copies the C<test> volume to a new volume named
C<test2> in the cell C<localcell>, with verbose output. The volume and
its copy both reside on C</vicepa> of C<server1>.

   % vos copy test server1 a test2 server1 a -cell localcell -verbose
   Starting transaction on source volume 536870921 ... done
   Allocating new volume id for clone of volume 536870921 ... done
   Allocating new volume id for copy of volume 536870921 ... done
   Cloning source volume 536870921 ... done
   Ending the transaction on the source volume 536870921 ... done
   Starting transaction on the cloned volume 536870926 ... done
   Setting flags on cloned volume 536870926 ... done
   Getting status of cloned volume 536870926 ... done
   Creating the destination volume 536870927 ... done
   Setting volume flags on destination volume 536870927 ... done
   Dumping from clone 536870926 on source to volume 536870927 on destination ... done
   Ending transaction on cloned volume 536870926 ... done
   Starting transaction on source volume 536870921 ... done
   Doing the incremental dump from source to destination for volume 536870921 ...  done
   Setting volume flags on destination volume 536870927 ... done
   Ending transaction on destination volume 536870927 ... done
   Ending transaction on source volume 536870921 ... done
   Starting transaction on the cloned volume 536870926 ... done
   Deleting the cloned volume 536870926 ... done
   Ending transaction on cloned volume 536870926 ... done
   Created the VLDB entry for the volume test2 536870927
   Volume 536870921 copied from server1 /vicepa to test2 on server1 /vicepa 


=head1 PRIVILEGE REQUIRED

The issuer must be listed in the F</usr/afs/etc/UserList> file on the
machines specified with the B<-toserver> and B<-fromserver> arguments and
on each database server machine. If the B<-localauth> flag is included,
the issuer must instead be logged on to a server machine as the local
superuser C<root>.

=head1 SEE ALSO

L<vos(1)>,
L<vos_clone(1)>,
L<vos_dump(1)>,
L<vos_restore(1)>

=head1 COPYRIGHT

Copyright 2007 Jason Edgecombe <jason@rampaginggeek.com>

This documentation is covered by the IBM Public License Version 1.0. This
man page was written by Jason Edgecombe for OpenAFS.

--------------080705000500070403000805
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
 name="vos_convertROtoRW.pod"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="vos_convertROtoRW.pod"

=head1 NAME

vos convertROtoRW - Converts a Read-Only volume into a Read/Write volume

=head1 SYNOPSIS

=for html
<div class="synopsis">

B<vos convertROtoRW> S<<< [B<-server>] <I<machine name>> >>>
   S<<< [B<-partition>] <I<partition name>> >>>
   S<<< [B<-id>] <I<volume name or ID>> >>> [B<-force>]
   S<<< [B<-cell> <I<cell name>>] >>> [B<-noauth>] [B<-localauth>]
   [B<-verbose>] [B<-encrypt>] [B<-help>]

=for html
</div>

=head1 DESCRIPTION

B<vos convertROtoRW> converts a Read-Only volume into a Read/Write
volume when the original Read/Write volume is no longer
available. This could be the result of many events, such as a failed
disk, a failed server, or an accidental deletion.

=head1 CAUTIONS

The command name is case-sensitive. It must be issued with the capital
"RO" and "RW".

=head1 OPTIONS

=over 4

=item B<-server> <I<machine name>>

Identifies the file server machine that houses the Read-Only volume
which will be converted.

=item B<-partition> <I<partition name>>

Identifies the partition on the file server machine that houses the
Read-Only volume which will be converted.

=item B<-id> <I<volume name or ID>>

Specifies either the complete name or volume ID number of a read/write
volume.

=item B<-force>

Does not prompt for confirmation.

=item B<-cell> <I<cell name>>

Names the cell in which to run the command. Do not combine this argument
with the B<-localauth> flag. For more details, see L<vos(1)>.

=item B<-noauth>

Assigns the unprivileged identity C<anonymous> to the issuer. Do not
combine this flag with the B<-localauth> flag. For more details, see
L<vos(1)>.

=item B<-localauth>

Constructs a server ticket using a key from the local
F</usr/afs/etc/KeyFile> file. The B<vos> command interpreter presents it
to the Volume Server and Volume Location Server during mutual
authentication. Do not combine this flag with the B<-cell> argument or
B<-noauth> flag. For more details, see L<vos(1)>.

=item B<-verbose>

Produces on the standard output stream a detailed trace of the command's
execution. If this argument is omitted, only warnings and error messages
appear.

=item B<-encrypt>

Encrypts the command so that the operation's results are
not transmitted across the network in clear text.

=item B<-help>

Prints the online help for this command. All other valid options are
ignored.

=back

=head1 EXAMPLES

The following example converts the read-only volume C<test3.readonly> in
partition C</vicepb> on C<server1> to a read-write volume:

   % vos convertROtoRW server1 b test3.readonly

=head1 PRIVILEGE REQUIRED

The issuer must be listed in the F</usr/afs/etc/UserList> file on the
machine specified with the B<-server> argument and on each database
server machine.  If the B<-localauth> flag is included, the issuer must
instead be logged on to a server machine as the local superuser C<root>.

=head1 SEE ALSO

L<vos(1)>

=head1 COPYRIGHT

Copyright 2007 Jason Edgecombe <jason@rampaginggeek.com>

This documentation is covered by the IBM Public License Version 1.0. This
man page was written by Jason Edgecombe for OpenAFS.

--------------080705000500070403000805
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
 name="read_tape.pod"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="read_tape.pod"

=head1 NAME

read_tape - Reads volume dumps from a backup tape and saves them to a file

=head1 SYNOPSIS

=for html
<div class="synopsis">

B<read_tape> S<<< B<-tape> <I<tape device>> >>>
   S<<< B<-restore> <I<# of volumes to restore>> >>>
   S<<< B<-skip> <I<# of volumes to skip>> >>>
   S<<< B<-file> <I<filename>> >>> [B<-scan>] [B<-noask>] [B<-label>]
   [B<-vheaders>] [B<-verbose>] [B<-help>]

=for html
</div>

=head1 DESCRIPTION

B<read_tape> reads an OpenAFS backup tape and prompts for each dump
file to save. Unlike the L<backup(8)> command, it does not require any
OpenAFS infrastructure: neither an OpenAFS client nor a server needs to
be available.

After saving each dump file, B<vos restore> or B<restorevol> can be
used to restore the volume into AFS or non-AFS space, respectively.

B<read_tape> reads the tape while skipping the specified number of
volumes. After that, it restores the specified number of
volumes. B<read_tape> doesn't rewind the tape, so it can be used
multiple times in succession.

=head1 CAUTIONS



=head1 OPTIONS

=over 4

=item B<-tape> <I<tape device>>

Specifies the tape device from which to restore.

=item B<-restore> <I<# of volumes to restore>>

Specifies the number of volumes to restore from tape.

=item B<-skip> <I<# of volumes to skip>>

Specifies the number of volumes to skip before starting the restore.

=item B<-file> <I<filename>>

Specifies an alternate name for the restored volume dump file.

=item B<-scan>

Scans the tape.

=item B<-noask>

Does not prompt for each volume.

=item B<-label>

Displays the full dump label.

=item B<-vheaders>

Displays the full volume headers.

=item B<-verbose>

Produces on the standard output stream a detailed trace of the command's
execution. If this argument is omitted, only warnings and error messages
appear.

=item B<-help>

Prints the online help for this command. All other valid options are
ignored.

=back

=head1 OUTPUT



=head1 EXAMPLES

The following command will read the third through fifth volumes from
the tape device /dev/tape without prompting:

   % read_tape -tape /dev/tape -skip 2 -restore 3 -noask
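
The B<-skip> and B<-restore> values follow from the positions of the
first and last volumes wanted. A minimal sketch of that arithmetic,
assuming volumes are counted from 1; C<read_tape_range> is a hypothetical
helper that only prints the resulting command and does not run it:

```shell
# Hypothetical helper: turn a 1-based, inclusive range of volume positions
# on the tape into read_tape arguments. It only prints the command.
read_tape_range() {
    tape=$1; first=$2; last=$3
    skip=$((first - 1))              # volumes before the range are skipped
    restore=$((last - first + 1))    # the range is inclusive at both ends
    echo "read_tape -tape $tape -skip $skip -restore $restore -noask"
}

read_tape_range /dev/tape 3 5
```

For the third through fifth volumes this prints the same command as the
example above.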

=head1 PRIVILEGE REQUIRED

The issuer must have read and write access to the specified tape
device.

=head1 SEE ALSO

L<backup(8)>,
L<restorevol(8)>,
L<vos_restore(1)>

IBM AFS Disaster Recovery at
L<http://www-1.ibm.com/support/docview.wss?rs=961&context=SSXMUG&dc=DA400&uid=swg27004312&loc=en_US&cs=UTF-8&lang=en&rss=ct961other#32>.

=head1 COPYRIGHT

Copyright 2007 Jason Edgecombe <jason@rampaginggeek.com>

This documentation is covered by the IBM Public License Version 1.0. This
man page was written by Jason Edgecombe for OpenAFS.

--------------080705000500070403000805--