[OpenAFS] AFS outage, impact of "moving" root.cell.readonly,
root.afs.readonly
Kim Kimball
dhk@ccre.com
Thu, 26 Apr 2007 16:29:25 -0600
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Oh right, I remember that bug.<br>
<br>
I have, BTW, been enjoying the fruits of your AFS Windows endeavors.<br>
<br>
The VLDB entries were correct with "vos listvl root.afs/root.cell"
during this confusion -- but may have been in an inconsistent state at
some point.<br>
<br>
The only hypothesis I have right now involves clients having bad volume
location info, but why the problems wouldn't start for two hours escapes me. <br>
<br>
The client refresh of the cached volume info is on a 2 hr interval.
Surely some clients would have refreshed prior to the two hour mark at
which the issues began.<br>
<br>
<br>
<br>
Jeffrey Altman wrote:
<blockquote cite="mid28821055.1177625211498.JavaMail.root@m11"
type="cite">
<pre wrap="">Kim Kimball wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Don't know if Windows boxes were affected or not.
I know of at least one that was active during the entire window of
confusion.
I'm analyzing the file server detailed logs (not FileLog, the -auditlog
output) now and should be able to answer the question with some level of
confidence soon.
Kim
</pre>
</blockquote>
<pre wrap="">
The reason I asked about the Windows clients is that there was a bug in
the Windows clients that prevented read-only failover from working. I
believe it was fixed prior to 1.4.0. If your Windows clients were
working and the UNIX clients were not, that could point to a bug in the
UNIX clients.
If, however, the Windows clients are also failing, then it points to
something wrong in one of the databases.
Jeffrey Altman
</pre>
</blockquote>
</body>
</html>