[OpenAFS] - 50% SOLVED - Re: - Locked volumes

ProbaNet info@probanet.it
Fri, 09 Mar 2012 15:46:03 +0100


On Wed, 07/03/2012 at 11:33 -0600, Andrew Deason wrote:

> Can you make any modifications to the vldb or ptdb from this machine?
> Just as an example... 'vos addsite' or 'pts createuser' ?

No, commands hang indefinitely until all dbservers show dbcurrent=1
(in udebug).

> > for server afsrm1 (dbcurrent=0, up=1 beaconSince=1). Recovery state "f".
> > No propagation triggered.. We don't understand why..
> 
> One possible cause may be due to packets not getting through. For an

Exactly, you pointed us in the right direction, thanks a lot! :)

> operation like 'vos lock'/'vos unlock' we should only need to contact
> the sync site, so the db being out of sync should not matter. However,
> both dbcurrent=0 and a hang could be symptoms of not being able to
> communicate with the sync site.

It's really strange because commands (vos unlock, vos addsite, etc) were
issued on afsmn1 (the sync site) and all commands like 'bos status
afsmn1', 'vos status afsmn1', 'udebug afsmn1 ptserver/vlserver',
'rxdebug afsmn1 7000/7001/7002/7003' etc seemed to be fine.

> If you want to quickly test this, you can run udebug against each of the
> dbservers from each of the other dbservers. If beacons are getting
> through but not database updates, though, maybe there could be an issue
> where only packets over a certain size aren't getting through or
> something, hmm...

Yes, I fear so.. We did all the udebug tests (from all servers to all
servers) and everything was ok.
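
One way to test the "only packets over a certain size" idea across the
VPN is to search for the largest packet that still gets through with
the don't-fragment bit set. The following is only a rough sketch: the
function names are made up for illustration, ping_probe assumes the
Linux iputils ping ('-M do' sets don't-fragment), and ICMP only
approximates what Rx does over UDP, so a clean result here does not
rule out a size problem.

```python
import subprocess

def ping_probe(host, payload):
    """Return True if one ICMP echo carrying `payload` data bytes, sent
    with the don't-fragment bit set, gets a reply.
    Assumption: Linux iputils ping (-M do, -W timeout in seconds)."""
    cmd = ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0

def largest_payload(probe, low=0, high=1472):
    """Binary-search the largest size in [low, high] for which probe(size)
    is True. 1472 = 1500-byte Ethernet MTU - 20 (IP) - 8 (ICMP).
    Assumes that if a size passes, every smaller size passes too.
    Returns None if even `low` fails."""
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always shrinks
        if probe(mid):
            low = mid                 # mid works, look for something larger
        else:
            high = mid - 1            # mid fails, the limit is below it
    return low

# Example use from a local server towards the remote one:
#   largest_payload(lambda size: ping_probe("afsrm1", size))
```

A result well below 1472 between the local net and afsrm1 would point
at the VPN path dropping or failing to fragment larger packets.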

> You mention above there are at least _some_ messages in the log. What
> are they?

All messages were of the kind 'remote server X voted "yes" on date Y',
'Received beacon=1 from server X', etc. Nothing strange (debug
level 125).

> Also, who is the sync site, according to 'udebug' ? Would you be willing
> to provide the 'udebug' output?

The sync site is afsmn1. I don't have the old udebug log, and we have
since changed the setup:
- all db servers other than afsmn1 (afsmn2, afsmn3, afsrm1, afsor1) are
now clones
- all BosConfig parameters are now standard (or almost). For example, on
afsmn1 and afsrm1 (no -sendsize/-udpsize/-cb in fileserver):
---------- on afsmn1 ----------
restarttime 16 0 0 0 0
checkbintime 16 0 0 0 0
bnode simple ptserver 1
parm /usr/lib/openafs/ptserver -p 16
end
bnode simple vlserver 1
parm /usr/lib/openafs/vlserver -p 16
end
bnode fs fs 1
parm /usr/lib/openafs/fileserver -L -vattachpar 64 -realm XXX
parm /usr/lib/openafs/volserver -p 16
parm /usr/lib/openafs/salvager -parallel 4
end
-------------------------------

---------- on afsrm1 ----------
restarttime 16 0 0 0 0
checkbintime 16 0 0 0 0
bnode fs fs 1
parm /usr/lib/openafs/fileserver -L -vattachpar 256 -realm XXX
parm /usr/lib/openafs/volserver
parm /usr/lib/openafs/salvager -parallel 4
end
bnode simple ptserver 1
parm /usr/lib/openafs/ptserver
end
bnode simple vlserver 1
parm /usr/lib/openafs/vlserver
end
-------------------------------

----------
udebug afsmn1 vlserver
Host's addresses are: 192.168.10.12 
Host's 192.168.10.12 time is Fri Mar  9 15:41:33 2012
Local time is Fri Mar  9 15:41:36 2012 (time differential 3 secs)
Last yes vote for 192.168.10.12 was 0 secs ago (sync site); 
Last vote started 0 secs ago (at Fri Mar  9 15:41:36 2012)
Local db version is 1331303687.5
I am sync site until 60 secs from now (at Fri Mar  9 15:42:36 2012) (1
server)
Recovery state 1f
Sync site's db version is 1331303687.5
0 locked pages, 0 of them for write
Last time a new db version was labelled was:
	 406 secs ago (at Fri Mar  9 15:34:50 2012)

Server (192.168.12.12): (db 1331303687.5)    is only a clone!
    last vote rcvd 0 secs ago (at Fri Mar  9 15:41:36 2012),
    last beacon sent 0 secs ago (at Fri Mar  9 15:41:36 2012), last vote
was yes
    dbcurrent=1, up=1 beaconSince=1

Server (192.168.11.12): (db 1331303687.5)    is only a clone!
    last vote rcvd 0 secs ago (at Fri Mar  9 15:41:36 2012),
    last beacon sent 0 secs ago (at Fri Mar  9 15:41:36 2012), last vote
was yes
    dbcurrent=1, up=1 beaconSince=1

Server (192.168.10.17): (db 1331303687.5)    is only a clone!
    last vote rcvd 0 secs ago (at Fri Mar  9 15:41:36 2012),
    last beacon sent 0 secs ago (at Fri Mar  9 15:41:36 2012), last vote
was yes
    dbcurrent=1, up=1 beaconSince=1

Server (192.168.10.14): (db 1331303687.5)    is only a clone!
    last vote rcvd 0 secs ago (at Fri Mar  9 15:41:36 2012),
    last beacon sent 0 secs ago (at Fri Mar  9 15:41:36 2012), last vote
was yes
    dbcurrent=1, up=1 beaconSince=1

----------

Scenario:
- afsmn1, afsmn3, afsor1: local KVM virtualizations, dbservers + very
small fileservers (< 100 volumes, iSCSI SAN disks)
- afsmn2: local real server, dbserver + very small fileserver (< 100
volumes)
- afsrm1: remote real server, dbserver + fileserver (around 6000
volumes)
- afsmn4, afsmn5: local KVM virtualizations, dafileservers (30000+
volumes, iSCSI SAN disks)
- local net and remote net (only afsrm1 on remote net) are connected via
OpenVPN
- on local net we use shorewall (frontend to iptables) with no
restrictions between local and remote net (pings are ok,
bos/vos/udebug/rxdebug commands are fine)
- on remote net there is no firewalling

NOW

The situation is quite stable, but we experience frequent timeouts in
AFS operations between local net and afsrm1 (ls, cp, open file, etc.).
SSH connections and the ADSL + VPN link in general are stable and
good. We are also unable to complete 'vos release' commands from
afsrm1 to local servers. Example:

vos examine c.a16pa003l_4
c.a16pa003l_4                     536918990 RW      60466 K  On-line
    afsrm1.xxx /vicepa 
    RWrite  536918990 ROnly  536918991 Backup          0 
    MaxQuota          0 K 
    Creation    Tue Mar 16 13:42:41 2010
    Copy        Wed Mar 17 21:35:19 2010
    Backup      Never
    Last Access Fri Mar  9 15:06:05 2012
    Last Update Fri Mar  9 11:20:01 2012
    740 accesses in the past day (i.e., vnode references)

    RWrite: 536918990     ROnly: 536918991     RClone: 536918991 
    number of sites -> 2
       server afsrm1.xxx partition /vicepa RW Site  -- New release
       server afsrm1.xxx partition /vicepa RO Site  -- New release
       server afsmn5.xxx partition /vicepn RO Site  -- Not released


vos release c.a16pa003l_4 -verbose
c.a16pa003l_4 
    RWrite: 536918990     ROnly: 536918991 
    number of sites -> 3
       server afsrm1.watersoil.net partition /vicepa RW Site 
       server afsrm1.watersoil.net partition /vicepa RO Site 
       server afsmn5.watersoil.net partition /vicepn RO Site  -- Not
released
This is a complete release of volume 536918990
Recloning RW volume 536918991... done
Getting status of RW volume 536918990... done
Ending cloning transaction on RW volume 536918990... done
Starting transaction on RO clone volume 536918991... done
Setting volume flags for volume 536918991... done
Ending transaction on volume 536918991... done
Replacing VLDB entry for c.a16pa003l_4... done
Starting transaction on cloned volume 536918991... done
Creating new volume 536918991 on replication site afsmn5.watersoil.net:
done
Starting ForwardMulti from 536918991 to 536918991 on
afsmn5.watersoil.net (full release).

[it gets stuck there..]


vos status afsmn5
Total transactions: 1
--------------------------------------
transaction: 94  created: Fri Mar  9 14:37:58 2012
lastActiveTime: Fri Mar  9 14:44:05 2012
volumeStatus: 
volume: 536918991  partition: /vicepa  procedure: Restore
packetRead: 40  lastReceiveTime: Fri Mar  9 15:01:27 2012
packetSend: 1  lastSendTime: Fri Mar  9 15:01:27 2012
--------------------------------------

After some minutes, the 'packetRead' count is still 40 (and sometimes
the operation hangs on 'procedure: GetStatus').

Thanks again for your help, any suggestion is appreciated! :)

Stefano
Fabio