[OpenAFS] HELP! We 've lost our sync site
Mosley, Mike
jmmosley@uncc.edu
Sat, 10 Jan 2004 15:01:50 -0500
I found the notes and I understand now. Thanks to everybody who responded
for your help and for responding so quickly. I'll be looking for the new
binaries.
Thanks again.
Mike
-----Original Message-----
From: Douglas E. Engert [mailto:deengert@anl.gov]
Sent: Saturday, January 10, 2004 2:40 PM
To: Mosley, Mike
Cc: 'openafs-info@openafs.org'
Subject: Re: [OpenAFS] HELP! We 've lost our sync site
"Mosley, Mike" wrote:
>
> I have not seen the notes about the Ubik time overflow problem.
Sorry that was on openafs-dev.
> I'm
> currently running 1.2.10 under Solaris 9. I can drop back to 1.2.8.
As Russ indicated, this looks like a problem in the original Transarc
code too. So dropping back won't help.
> Is
> there any way I can correct the problem in the short term
> while I prepare to back down to the earlier version?
Install new server binaries in /usr/afs/bin. You will need to wait
for the OpenAFS release of 1.2.11 (which might be any time now),
do it yourself, or find someone with a sun4x_59 build.
>
> Thanks,
>
> Mike
>
> -----Original Message-----
> From: Douglas E. Engert [mailto:deengert@anl.gov]
> Sent: Saturday, January 10, 2004 2:16 PM
> To: James M Mosley
> Cc: openafs-info@openafs.org
> Subject: Re: [OpenAFS] HELP! We 've lost our sync site
>
> Did you see the four notes on "Ubik time overflow at 0x40000000"?
> That is the problem.
>
> What version of AFS are you running and on what OS?
> If you can't build it yourself, or can't wait for the OpenAFS people
> to build a release, maybe someone else might have built it by now.
> (I have OpenAFS-1.2.10 running on sun4x_58 for the last 15 minutes.)
>
> James M Mosley wrote:
> >
> > All,
> > We need immediate help! We have been unable to establish a sync
> > site for about 6 hours. All 3 of our database servers are up and appear
> > to be performing the election as expected. However, the server that
> > should become the sync site doesn't. Here is some output from udebug
> > on that server:
> >
> > as-sm1# udebug as-sm1 7002 -long
> > Host's addresses are: 152.15.10.70
> > Host's 152.15.10.70 time is Sat Jan 10 13:59:36 2004
> > Local time is Sat Jan 10 13:59:39 2004 (time differential 3 secs)
> > Last yes vote for 152.15.10.70 was 3 secs ago (not sync site);
> > Last vote started 3 secs ago (at Sat Jan 10 13:59:36 2004)
> > Local db version is 1073480540.254
> > I am not sync site
> > Lowest host 152.15.10.70 was set 3 secs ago
> > Sync host 0.0.0.0 was set 1073761176 secs ago
> > Sync site's db version is 1073480540.254
> > 0 locked pages, 0 of them for write
> >
> > Server (152.15.13.7): (db 0.0)
> > last vote rcvd 5 secs ago (at Sat Jan 10 13:59:34 2004),
> > last beacon sent 3 secs ago (at Sat Jan 10 13:59:36 2004), last vote was yes
> > dbcurrent=0, up=1 beaconSince=1
> >
> > Server (152.15.30.27): (db 0.0)
> > last vote rcvd 4 secs ago (at Sat Jan 10 13:59:35 2004),
> > last beacon sent 3 secs ago (at Sat Jan 10 13:59:36 2004), last vote was yes
> > dbcurrent=0, up=1 beaconSince=1
> > as-sm1#
> >
> > The only strange thing we have noticed is that when we attempted to
> > stop/restart the database servers to see if the condition would clear
> > itself up, we saw as-sm1 become the sync site (as it should), but it
> > claimed it was a sync site for a negative number of seconds. The amount
> > of time seemed to refer back to about the time we started seeing the
> > problem, as evidenced by the last time the local database files were
> > updated.
> >
> > All three database servers are running Solaris 9 and OpenAFS 1.2.10.
> >
> > We need help soon. Thanks.
> >
> > Mike
> >
> > -------------------------------------
> > Mike Mosley Email: jmmosley@uncc.edu
> > Systems Software Developer Phone: (704) 687-3522
> > College of Engineering, UNC-Charlotte Fax: (704) 687-2352
> > _______________________________________________
> > OpenAFS-info mailing list
> > OpenAFS-info@openafs.org
> > https://lists.openafs.org/mailman/listinfo/openafs-info
>
> --
>
> Douglas E. Engert <DEEngert@anl.gov>
> Argonne National Laboratory
> 9700 South Cass Avenue
> Argonne, Illinois 60439
> (630) 252-5444
--
Douglas E. Engert <DEEngert@anl.gov>
Argonne National Laboratory
9700 South Cass Avenue
Argonne, Illinois 60439
(630) 252-5444
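
For anyone reading this thread later, the timing in the udebug output can be checked against the 0x40000000 figure from the notes. The sketch below is only arithmetic on numbers quoted above and does not touch any AFS code. It assumes that 0x40000000 is a count of seconds since the Unix epoch, and that the "Sync host 0.0.0.0 was set 1073761176 secs ago" value is simply the host's clock in epoch seconds (which is what it would be if the sync host time had never been set). The names THRESHOLD, HOST_EPOCH_SECS and to_utc are made up for this example.

```python
from datetime import datetime, timezone

# Threshold named in the "Ubik time overflow at 0x40000000" notes,
# read here (an assumption) as seconds since the Unix epoch.
THRESHOLD = 0x40000000

# Value udebug printed as "Sync host 0.0.0.0 was set ... secs ago".
# Assumption: with no sync host set, this equals the host's current
# time in epoch seconds.
HOST_EPOCH_SECS = 1073761176


def to_utc(secs):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(secs, tz=timezone.utc)


threshold_utc = to_utc(THRESHOLD)
host_utc = to_utc(HOST_EPOCH_SECS)
elapsed = host_utc - threshold_utc

print("threshold :", threshold_utc.isoformat())  # 2004-01-10T13:37:04+00:00
print("udebug run:", host_utc.isoformat())       # 2004-01-10T18:59:36+00:00
print("elapsed   :", elapsed)                    # 5:22:32
```

Under those assumptions the clock passed 0x40000000 at 13:37:04 UTC on 10 Jan 2004, a little under five and a half hours before the udebug run, which is roughly in line with the "about 6 hours" without a sync site reported in the original message.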