[OpenAFS] one solaris client suddenly can't mount
Derek Atkins
warlord@MIT.EDU
31 May 2001 15:04:52 -0400
Oh, right... I think I know what's going on now. I'll see if I can
spend some time to come up with a patch.
In short: you're right, the server caches the fact that a host is
down and it becomes challenging or near-impossible to recover from
that (at least until the AFS Server restarts). I had supplied a
patch that fixed certain cases of this condition, but I bet there
are other cases, too.
Unfortunately I don't know how to reliably reproduce this particular
case, which makes it challenging to test.
-derek
"Stotler, John" <jstotler@quelsys.com> writes:
> Well, the client is up and running again.
>
> I rebooted the AFS server. It seems that it had somehow (?) cached the fact
> that it could not contact 192.168.0.14 from a hub outage earlier in the day,
> even though all indications through command-line utilities showed that the
> connection was fine.
>
> I had already deleted all of the cache files, and when the server came back
> up, the solaris client crashed, dumped, and rebooted. When it came back up
> everything was fine.
>
> Many thanks to all who offered assistance.
>
> -----Original Message-----
> From: Derek Atkins [mailto:warlord@MIT.EDU]
> Sent: Thursday, May 31, 2001 1:36 PM
> To: Stotler, John
> Cc: openafs-info@openafs.org
> Subject: Re: [OpenAFS] one solaris client suddenly can't mount
>
>
> "Stotler, John" <jstotler@quelsys.com> writes:
>
> > Sorry for the personal reply, it was intended for the list.
> >
> > This is what I've got in the FileLog:
> >
> > Thu May 31 09:18:59 2001 CB: RCallBackConnectBack failed for e00a8c0.22811
> > Thu May 31 11:05:23 2001 ProbeUuid failed for host e00a8c0.22811
>
> Is your client's IP Address 192.168.0.14? If not, then what machine
> has that IP Address?
>
> Basically, this log entry is saying that the RCallBackConnectBack RPC
> failed, as did the ProbeUuid callback. This generally means that the
> server cannot contact your client, for one reason or another. The
> "22811" is "7001", but byte-order-swapped.
>
> > This remains greek to me.
> >
> > I'll have to build the utilities for Solaris; at the moment I have only
> > built them for Linux.
>
> The OpenAFS build process should have built all the utilities
> automatically. Look in .../dest/bin/
>
> -derek
>
> > -----Original Message-----
> > From: Derek Atkins [mailto:warlord@MIT.EDU]
> > Sent: Thursday, May 31, 2001 1:23 PM
> > To: Stotler, John
> > Cc: openafs-info@openafs.org
> > Subject: Re: [OpenAFS] one solaris client suddenly can't mount
> >
> >
> > You should have responded to openafs-info, not to me directly.
> >
> > Having bos/vos/pts/fs on clients is usually a Good Thing. If nothing
> > else they can help you debug situations like this. If there are no
> > messages in the FileLog on the fileserver 192.168.0.59, then I don't
> > know what to say.
> >
> > You can try clearing your AFS Cache, but I don't know why that would
> > help. Is socratease.com your home cell?
> >
> > -derek
> >
> > "Stotler, John" <jstotler@quelsys.com> writes:
> >
> > > The server is definitely running, other clients can mount the shared
> > > folders without any problems.
> > >
> > > I don't have bos installed on the clients, they're all just running the
> > > default installation. I'm not sure which utility programs will run on
> > > the clients....
> > >
> > > -----Original Message-----
> > > From: Derek Atkins [mailto:warlord@MIT.EDU]
> > > Sent: Thursday, May 31, 2001 1:06 PM
> > > To: Stotler, John
> > > Cc: 'openafs-info@openafs.org'
> > > Subject: Re: [OpenAFS] one solaris client suddenly can't mount
> > >
> > >
> > > What happens if you 'bos status <server> -no -long'... Is the fileserver
> > > actually running?
> > >
> > > -derek
> > >
> > > "Stotler, John" <jstotler@quelsys.com> writes:
> > >
> > > > I'm stumped on this one.....
> > > >
> > > > > One of my solaris clients (5.8 on sparc) has been working fine for
> > > > > months now. We rebooted the box this morning, and now we're getting:
> > > >
> > > > > May 31 10:45:03 db1 afs: [ID 888289 kern.notice] Starting AFS cache scan...
> > > > > May 31 10:45:11 db1 afs: [ID 215355 kern.notice] found 4730 non-empty cache files (94%).
> > > > > May 31 10:45:12 db1 afs: [ID 446265 kern.notice] afs: Lost contact with file server 192.168.0.59 in cell socratease.com (all multi-homed ip addresses down for the server)
> > > > > May 31 10:45:12 db1 last message repeated 1 time
> > > >
> > > > > I can't get the afs mount to pick up. The AFSLog on the client is
> > > > > empty, and nothing seems to be logged on the server side either.
> > > >
> > > > > The client machine in question can ping, traceroute and DNS resolve
> > > > > the server with no lost packets whatsoever.
> > > >
> > > > Any idea what I need to do?
> > > > _______________________________________________
> > > > OpenAFS-info mailing list
> > > > OpenAFS-info@openafs.org
> > > > https://lists.openafs.org/mailman/listinfo/openafs-info
> > >
> > > --
> > > Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
> > > Member, MIT Student Information Processing Board (SIPB)
> > > URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
> > > warlord@MIT.EDU PGP key available
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
warlord@MIT.EDU PGP key available
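
For anyone else who finds the FileLog entry quoted above hard to read: the
"e00a8c0.22811" pair can be decoded mechanically. What follows is only a
minimal sketch, not anything taken from the OpenAFS sources. It assumes,
based on the explanation in this thread, that the first field is the
client's IPv4 address printed as a byte-swapped 32-bit hex value (with the
leading zero dropped) and that the second field is the port printed as a
byte-swapped 16-bit decimal value. The function name decode_filelog_host
is made up for illustration.

```python
import socket
import struct


def decode_filelog_host(entry):
    """Decode an 'hexaddr.port' pair such as 'e00a8c0.22811'.

    Assumption (inferred from the thread, not from OpenAFS code):
    both fields are printed in swapped byte order.
    """
    hex_addr, dec_port = entry.split(".")

    # Parse the hex address; int() tolerates the missing leading zero,
    # so 'e00a8c0' becomes 0x0e00a8c0.
    addr = int(hex_addr, 16)
    # Packing as little-endian reverses the bytes to c0 a8 00 0e,
    # which inet_ntoa renders as a dotted quad.
    ip = socket.inet_ntoa(struct.pack("<I", addr))

    # Swap the two bytes of the port: pack little-endian, read big-endian.
    port = struct.unpack(">H", struct.pack("<H", int(dec_port)))[0]
    return ip, port


if __name__ == "__main__":
    print(decode_filelog_host("e00a8c0.22811"))
```

With the entry from this thread, the sketch yields 192.168.0.14 and port
7001, matching the address and port discussed above.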