OpenAFS Master Repository branch, master, updated. openafs-devel-1_9_2-84-ge022864

Gerrit Code Review gerrit@openafs.org
Tue, 21 Jan 2025 15:08:15 -0500


The following commit has been merged in the master branch:
commit e022864c923d4936ed68d01d89eef1df19bbe303
Author: Andrew Deason <adeason@sinenomine.net>
Date:   Thu Oct 17 00:13:23 2024 -0500

    volser: Avoid 'vos release' failure partial unlock
    
    Currently, if UV_ReleaseVolume() in 'vos release' fails to release the
    volume to some sites, we update the vlentry for the volume by calling
    VLDB_ReplaceEntry with LOCKREL_TIMESTAMP. This clears the lock timestamp
    in the vlentry, but not the lock flags (VLOP_RELEASE). We then fully
    unlock the vlentry later during cleanup after the 'rfail' label by
    calling ubik_VL_ReleaseLock().
    
    The timestamp-only unlock via VLDB_ReplaceEntry is unusual; all other
    code paths in OpenAFS clear a vlentry lock by clearing all lock-related
    fields at the same time (LOCKREL_OPCODE | LOCKREL_AFSID |
    LOCKREL_TIMESTAMP).
    
    If only the lock timestamp is cleared and specifically the VLOP_RELEASE
    flag is set, future calls to VL_SetLock() cause the vlserver to return a
    different error code: VL_RERELEASE instead of VL_ENTRYLOCKED. If
    UV_ReleaseVolume() sees VL_RERELEASE, we act as if the vlentry was
    locked successfully; the intention was probably to allow a subsequent
    'vos release' to fix the partially-failed release.
    
    This scheme has a few problems:
    
    If a volume is partially locked, multiple 'vos release' processes could
    see the VL_RERELEASE error for the same volume, and all of them would
    assume they have locked the vlentry. This would bypass the exclusion
    that vlentry locks normally provide, and allows multiple processes to go
    through the 'vos release' process for the same volume at the same time,
    causing various problems.
    
    Similarly, a second 'vos release' process could see the VL_RERELEASE
    error while a first 'vos release' process is still running its cleanup.
    The first process would then fully unlock the vlentry while the second
    process is running the release, causing the release to run with a fully
    unlocked volume. This would allow any 'vos' operation against the volume
    to run at the same time as the release, potentially causing various
    problems.
    
    Furthermore, it's rare for a vlentry to be left in this partially-locked
    state. Since UV_ReleaseVolume() fully unlocks the vlentry during
    cleanup, the 'vos release' process would need to die before running the
    last ubik_VL_ReleaseLock(), or the last ubik_VL_ReleaseLock() would need
    to fail. Even if the VL_RERELEASE scheme worked well, it almost always
    is not used, since the vlentry is usually left completely unlocked after
    'vos release' runs, even after a failure.
    
    It's possible that the details of this scheme used to work differently,
    perhaps avoiding the full unlock after a partial release failure.
    However, the current behavior has existed for quite a long time (since
    before OpenAFS 1.0), so it seems unlikely that anyone still expects
    whatever the original behavior was.
    
    To avoid all of this nonsense, just don't pass LOCKREL_TIMESTAMP during
    the failure-related VLDB_ReplaceEntry() call. Instead, the vlentry
    remains locked until we fully unlock it during the normal cleanup
    processing.
    
    Do the same for the equivalent code path in libadmin's
    UV_ReleaseVolume(). After this commit, all callers that pass any
    LOCKREL_* flags pass all of them at the same time, so partial unlocks
    should no longer be possible.
    
    For now, don't remove the special handling of the VL_RERELEASE error
    code after trying to lock a vlentry. Maybe this can be removed in the
    future, but for now, keep this to retain existing behavior if there are
    existing partially-locked entries in the VLDB.
    
    Thanks for mvitale@sinenomine.net for related work drawing attention to
    this.
    
    Change-Id: I6b053f88c923684bd95df3e2022e9cf125e45d04
    Reviewed-on: https://gerrit.openafs.org/15867
    Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
    Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
    Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
    Reviewed-by: Andrew Deason <adeason@sinenomine.net>
    Tested-by: BuildBot <buildbot@rampaginggeek.com>

 src/libadmin/vos/vsprocs.c | 2 +-
 src/volser/vsprocs.c       | 4 +---
 2 files changed, 2 insertions(+), 4 deletions(-)

-- 
OpenAFS Master Repository