[OpenAFS] Mount point weirdness: fs lsm X, fs lq X return different volumes for same mount point.

Kim Kimball dhk@ccre.com
Fri, 03 Oct 2008 10:24:36 -0600


Jeffrey Altman wrote:
> Questions that pop into mind:
>
> 1) what versions of the clients were involved?
>
>   
Multiple, Solaris, Linux, Windows, Macintosh.
> 2) what was the output of vos examine on the volume names?
>   
Correct output -- using volume name, returned correct numeric ID and
online status.  Using number, returned correct name and online status.
> 3) same problem after a cache manager shutdown and restart?
>   
Not attempted.  Issue resolved within five minutes of fix -- first fix
was to dump/restore affected volumes to force new volIDs.  This worked.

All clients fine after fix, simultaneously, so don't think restart would 
have helped.

fs checkv was used after each fix effort, along with lots of fs checkv 
just for good measure.
> 4) same problem from Unix and Windows clients?
>
>   
Yes.

Thanks!

Kim

> Jeffrey Altman
>
>
> Kim Kimball wrote:
>   
>> Had a weird one on Thursday, and am looking for any plausible
>> explanation so I can close out the incident report.
>> My best answer right now is NAFC (not an effing clue.)
>>
>> I'm using "X-mounted" to describe "volume named in mountpoint" not equal
>> to "volume accessed at mountpoint"
>>
>> Probably relevant:  We were moving volumes to clear a file server, and
>> noticed an unusual number of orphaned volumes.
>> When I went to start 'vos zapping' the orphans, many of them turned out
>> to be those that incorrectly showed up at a given mount point.
>>
>> Could it be that the 'vos move' failures that created the orphans are
>> the proximate cause of the X-mounts?  If so, how could the two be related?
>>
>> Any FC greatly appreciated.
>>
>> Kim
>>
>> ====================================
>> Synopsis:
>>
>> From any AFS client, the volume named in a mount point was not the
>> volume actually accessed
>>
>>
>> Initial symptom:
>>       web servers start puking when invoking perl modules
>>       cd to path where perl modules are expected, and instead of perl
>> modules see bunch of unrelated png libraries
>>       check mount point to volume containing perl modules, and mount
>> point correctly names perl volume
>>       fs lq on mount point returns name of volume containing png
>> libraries -- not the name of the volume specified in fs lsm
>>
>> The diagnostic:
>>    fs lsm <path/mountpoint>   --> volumeA
>>    fs lq   <path/mountpoint>   --> volumeZ
>>
>> Confirmation:
>>    cd <path/mountpoint>
>>    ls
>>            ----- returns list of files/directories stored in volumeZ
>>
>> The mount point is correct; that is, fs lsm returns the expected volume
>> name.
>> The volume accessed at the mount point is incorrect.
>> The files/directories in the incorrectly accessed volume are correct.
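
The lsm/lq comparison quoted above can be scripted when many mount points need checking. What follows is only a minimal sketch in Python, not something used during the incident. It assumes `fs lsmount` prints a line of the form `'<path>' is a mount point for volume '#<name>'` and that `fs listquota` prints a header line followed by a row whose first column is the volume name. Both formats should be verified against the client version in use. All function names and the example volume names are hypothetical.

```python
import re
import subprocess

# Assumed output shapes (verify against your client version):
#   fs lsmount:   '/afs/cell/x' is a mount point for volume '#perl.modules'
#   fs listquota: header line, then "<volume name> <quota> <used> ..."
# The optional "cell:" prefix inside the quotes is skipped.
LSM_RE = re.compile(r"is a mount point for volume '[#%](?:[^:']+:)?([^']+)'")


def volume_from_lsm(output):
    """Return the volume name recorded in the mount point, or None."""
    match = LSM_RE.search(output)
    return match.group(1) if match else None


def volume_from_lq(output):
    """Return the name of the volume actually accessed, or None."""
    lines = [line for line in output.splitlines() if line.strip()]
    if len(lines) < 2:
        return None
    return lines[1].split()[0]


def base_name(volume):
    """Strip .readonly/.backup so a replica still matches its RW name."""
    for suffix in (".readonly", ".backup"):
        if volume.endswith(suffix):
            return volume[: -len(suffix)]
    return volume


def is_x_mounted(lsm_output, lq_output):
    """True when the mount point names one volume but another is accessed."""
    named = volume_from_lsm(lsm_output)
    accessed = volume_from_lq(lq_output)
    if named is None or accessed is None:
        raise ValueError("could not parse fs output")
    return base_name(named) != base_name(accessed)


def check_path(path):
    """Run both commands against a real mount point (needs an AFS client)."""
    lsm = subprocess.run(["fs", "lsmount", path], capture_output=True, text=True)
    lq = subprocess.run(["fs", "listquota", path], capture_output=True, text=True)
    return is_x_mounted(lsm.stdout, lq.stdout)
```

The suffix stripping is there because a mount point naming the read/write volume will normally resolve to the `.readonly` replica when one exists, and that alone should not be flagged as an X-mount.
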
>>
>> -------------------------------
>> We turned up forty plus instances of X-mounted (for lack of a better
>> word) volumes.
>>
>> The fix:
>>    remove the mount point
>>    release the volume (containing mount point)
>>    create same mount point
>>    release volume again
>>
>>    vos addsite newserver newpart _mounted_ volume (as named in mount point)
>>    vos release _mounted_ volume
>>      fs checkv
>>
>> Then get expected responses.
>>        fs lsm <path/mountpoint>   --> volumeA
>>        fs lq   <path/mountpoint>   --> volumeA
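
The fix sequence quoted above could be written down as a dry-run command list, one per affected mount point. The sketch below is illustrative only: it builds the command lines without executing them. The long-form flags (`-dir`, `-vol`, `-id`, `-server`, `-partition`) are what I understand the `fs` and `vos` suites to accept, so check them against `fs help` and `vos help` before relying on them. The function names, paths, server and volume names are hypothetical.

```python
import shlex


def fix_commands(mount_path, parent_volume, mounted_volume,
                 new_server, new_partition):
    """Return the fix sequence as argv lists, in the order described above."""
    return [
        # Remove the mount point, then release the volume containing it.
        ["fs", "rmmount", "-dir", mount_path],
        ["vos", "release", "-id", parent_volume],
        # Recreate the same mount point and release the parent again.
        ["fs", "mkmount", "-dir", mount_path, "-vol", mounted_volume],
        ["vos", "release", "-id", parent_volume],
        # Add a replication site for the mounted volume and release it.
        ["vos", "addsite", "-server", new_server,
         "-partition", new_partition, "-id", mounted_volume],
        ["vos", "release", "-id", mounted_volume],
        # Ask the cache manager to refresh its volume mappings.
        ["fs", "checkvolumes"],
    ]


def render(commands):
    """Format the argv lists as shell-quoted lines for review."""
    return "\n".join(shlex.join(argv) for argv in commands)


if __name__ == "__main__":
    # Hypothetical values; nothing is executed, the plan is only printed.
    plan = fix_commands("/afs/.example.com/web/perl", "web",
                        "perl.modules", "fs2.example.com", "/vicepa")
    print(render(plan))
```

Printing the plan first makes it possible to review each step before running anything against forty-odd mount points.
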
>>
>> ========================
>>
>> Other efforts:
>>
>> I did restart the fs instances on all file servers, suspecting some sort
>> of off-by-one'ish glitch in some unknown index/table/?
>>
>> The restarts had no impact.
>>
>> 'vos move' of the volume containing the mount point did not help.
>>
>> -----------------
>>
>>
>> _______________________________________________
>> OpenAFS-info mailing list
>> OpenAFS-info@openafs.org
>> https://lists.openafs.org/mailman/listinfo/openafs-info
>>