[OpenAFS] log messages I don't understand
Jeffrey Hutzelman
jhutz@cmu.edu
Mon, 23 Oct 2006 14:28:14 -0400
On Monday, October 23, 2006 10:39:54 AM -0400 seth vidal
<skvidal@linux.duke.edu> wrote:
> hi,
> I've been getting log reports from some new (openafs 1.4.0 centos 4
> x86_64) vol servers we've setup. I'm kind of new at this so I have a few
> questions about what some of the logs mean:
>
>
> volserver[3417]: trans 843 on volume 538013895 is older than 1170
> seconds
> I get a lot of these each day
volume operations like move, release, backup, and so on are done in
"transactions" that link several volserver RPC's into a single operation.
In normal operation, the volserver periodically logs a message about any
transaction that has been in existance for more than 5 minutes. This is
perfectly normal, as some operations on large volumes can take considerably
longer than that.
> fileserver[4701]: CB: Call back connect back failed (in break delayed)
> for
> 192.168.62.100:7001
>
> I think this one is just a communication failure talking back to the
> client from what I've seen in google.
Right; this means the fileserver tried to break a callback with that client
and the client was nonresponsive. That could mean the client has shut down
or left the network, or that there's some network problem.
> fileserver[4496]: FindClient: client 9632c138(14277b94) already had conn
> b62030
> (host 67640398), stolen by client 9632c138(14277b94)
This is normal, but uncommon. It does not itself indicate a problem, but
provides debugging information for openafs developers that may be useful in
debugging a problem with the client-tracking code, if one occurs (note: I
always have trouble describing this, because I'm afraid people will think I
mean there's a known problem, which is not the case. However, we've found
problems in this area to be difficult to track down, and when doing so, the
situation that triggers this message can be confusing if you don't know it
happened).
> fileserver[4680]: fssync: volume 537466995 restored; breaking all call
> backs
This means that the volserver had attached that volume (use 'vos listvl' or
'vos examine' to find out what volume it is) and has now given it back,
possibly with changes. Typically you'll see this on the target of a move,
release, or restore, or for an RO or backup volume which has just been
recloned.
> volserver[4681]: 1 Volser: Clone: Recloning volume 537466996 to volume
> 537466998
And this is the volserver telling you it is recloning an RO or backup
volume; this is what 'vos release', 'vos backup', and 'vos backupsys' do.
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA