[OpenAFS] VLDB corruption causes mount point to go to another volume.
Benjamin Kaduk
kaduk@mit.edu
Fri, 22 Feb 2019 21:34:43 -0600
On Thu, Feb 21, 2019 at 04:58:54PM +0700, Thossaporn (Pommm) Phetruphant wrote:
> Hi everyone,
>
> I have 3 vldb/pts servers and 13 file servers in my network. All are on
> the same subnet, same location.
> This is the second time we have encountered a corrupted VLDB: when we 'cd'
> into a mount point, it goes to a different volume.
>
> Example:
> live.D1 is mounted at /afs/domain/live/data1
> live.D2 is mounted at /afs/domain/live/data2
> root.cell is at /afs/domain
>
>
> cd /afs/domain/live/data1
>
> 'fs exa .' shows the volume named 'live.D2' mounted at this mount point
>
> 'ls' shows the data in data2
>
> or
>
> cd /afs/domain
>
> 'fs exa .' shows the volume named 'live.D1' mounted at this mount point
>
> 'ls' shows the data in data1
I may be confused -- does the difference show up for all clients or just
one?
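
One way to check, as a rough sketch: run something like the following on
two or more clients and compare the output. `check_mount` is only an
illustrative helper name, and the path and volume name are taken from your
example. `fs lsmount` shows which volume name the mount point refers to,
`fs examine` shows which volume the client actually reached, and
`vos listvldb` shows what the VLDB records for that name.

```shell
# Illustrative helper (hypothetical name): compare what the mount point
# says, what this client resolved, and what the VLDB records.
check_mount() {
    dir=$1   # mount point path, e.g. /afs/domain/live/data1
    vol=$2   # volume name expected there, e.g. live.D1

    echo "== client: $(uname -n)"
    fs lsmount "$dir"          # volume name stored in the mount point
    fs examine "$dir"          # volume this client actually reached
    vos listvldb -name "$vol"  # VLDB entry for the expected volume
}

# Example (run on each client):
# check_mount /afs/domain/live/data1 live.D1
```

If only one client disagrees with the others, running `fs checkvolumes` on
that client makes it discard its cached volume name-to-ID mappings, which
may be worth trying before blaming the VLDB itself.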
-Ben
> At first I thought NTP was getting out of sync, but it is not.
> I have one GPS NTP stratum 1 server and two NTP stratum 2 servers on my
> network; Nagios and Cacti reported no NTP downtime during this event.
>
> 'vldb_check -database /var/lib/openafs/db/vldb.DB0' shows 'root.cell
> (xxxxxxxxxx) has no RW volume', and about 10 other volumes also show
> 'has no RW volume'.
>
> I back up the VLDB hourly, so it can be recovered quickly, but this is
> the second time this has happened.
> Does anyone know why this would happen? How can we prevent it?
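
Since you have hourly backups, it might be worth running vldb_check over
each of them to see in which hour the 'has no RW volume' entries first
appear; that would at least narrow down when the damage happened. A rough
sketch, assuming the backups sit in one directory named `vldb.DB0.*` (a
made-up naming scheme, adjust to yours) and that the message text matches
what you quoted:

```shell
# Hypothetical helper: report how many "has no RW volume" complaints
# vldb_check prints for each backup copy of the database.
scan_vldb_backups() {
    dir=$1   # directory holding the hourly copies (assumed layout)
    for db in "$dir"/vldb.DB0.*; do
        [ -e "$db" ] || continue   # skip if the glob matched nothing
        # Merge stderr into stdout in case the messages go there;
        # "|| true" keeps a zero count from looking like a failure.
        n=$(vldb_check -database "$db" 2>&1 | grep -c 'has no RW volume' || true)
        echo "$db: $n"
    done
}

# Example (the directory is a placeholder):
# scan_vldb_backups /path/to/vldb-backups
```

If the suffix is a sortable timestamp, the output comes out in
chronological order, so the first nonzero count marks the hour to look at.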
>
> Best regards,
>
> Pommm