From xud@ihep.ac.cn Thu Apr 1 03:33:07 2010
From: xud@ihep.ac.cn (许冬)
Date: Thu, 1 Apr 2010 10:33:07 +0800
Subject: [OpenAFS] volume 536871264 is busy or server is down, recheck
Message-ID: <201004011033061568426@ihep.ac.cn>
Hi,
How many parallel read requests can one volume serve at the same time? And how many parallel read requests can one replicated volume serve at the same time?
In our AFS system, about one hundred people read a volume in parallel, and each person issues about 500 read requests. The AFS client's /var/log/message file often shows error messages such as "volume 536871264 is busy or server is down, recheck".
So I would like to know the reason.
Thank you!
With best regards,
Yours sincerely,
Dong xu
From adeason@sinenomine.net Thu Apr 1 04:58:01 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Wed, 31 Mar 2010 22:58:01 -0500
Subject: [OpenAFS] Re: volume 536871264 is busy or server is down, recheck
References: <201004011033061568426@ihep.ac.cn>
Message-ID: <20100331225801.75bc666f.adeason@sinenomine.net>
On Thu, 1 Apr 2010 10:33:07 +0800
"许冬" wrote:
> Hi,
>
> I want to know how many parallel read requests for one volume at the
> same time? or how many parallel read requests for one replication
> volume at the same time?
There is no real useful answer for that. There is a limit, however, to
the number of outstanding requests on a server (described below).
If 100 requests in parallel is your only load on that server, though,
the server should be able to handle that just fine.
> In our afs system, there are about one hundred people to read a
> volume parallelly, and each people will issus about 500 read
> requests.
Are these on different AFS clients, or all on the same one?
> I found the afs client's /var/log/message file often appear some
> error information, such as "volume 536871264 is busy or server is
> down, recheck ".
>
> so, I want to know its reason.
Do you mean the message "Waiting for busy volume 536871264"? There are a
number of reasons for that; only one I can think of is caused by load.
Is there anything in FileLog or VolserLog around the time that you see
these messages?
I believe it is possible to get this message if the server is overloaded
(possibly for a long enough period of time). If the number of
outstanding calls waiting for a servicing thread ("calls waiting")
exceeds a certain threshold, the client will get an error that can
result in that message. This threshold is by default the -rxpck setting
multiplied by 3/2. The default -rxpck setting is 150 (200 for -L), so
the default threshold is around 225 (or 300 for -L).
You can see how many calls are waiting for a thread by running 'rxdebug'
against the fileserver. For example:
% rxdebug 192.168.1.100
Trying 192.168.1.100 (port 7000):
Free packets: 265, packet reclaims: 0, calls: 0, used FDs: 34
not waiting for packets.
0 calls waiting for a thread
11 threads are idle
Done.
shows that there are "0 calls waiting for a thread".
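To make the arithmetic above concrete, here is a minimal Python sketch. It computes the default threshold as described (the -rxpck value times 3/2) and pulls the two counters out of rxdebug output like the sample shown. The function names and the idea of comparing the two numbers in a script are illustrative assumptions, not part of OpenAFS.

```python
import re

def busy_threshold(rxpck: int) -> int:
    """Default threshold as described above: the -rxpck value times 3/2."""
    return rxpck * 3 // 2

def parse_rxdebug(text: str) -> dict:
    """Extract the waiting-call and idle-thread counts from rxdebug output."""
    waiting = re.search(r"(\d+) calls waiting for a thread", text)
    idle = re.search(r"(\d+) threads are idle", text)
    return {
        "calls_waiting": int(waiting.group(1)) if waiting else None,
        "threads_idle": int(idle.group(1)) if idle else None,
    }

# Sample output copied from the message above.
SAMPLE = """\
Trying 192.168.1.100 (port 7000):
Free packets: 265, packet reclaims: 0, calls: 0, used FDs: 34
not waiting for packets.
0 calls waiting for a thread
11 threads are idle
Done.
"""

stats = parse_rxdebug(SAMPLE)
threshold = busy_threshold(150)   # 150 is the default -rxpck quoted above
print(stats)
print("threshold:", threshold)
print("over threshold:", stats["calls_waiting"] >= threshold)
```

With the default of 150 this gives a threshold of 225, and 300 for the -L default of 200, matching the figures in the text.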
--
Andrew Deason
adeason@sinenomine.net
From jaltman@secure-endpoints.com Thu Apr 1 05:55:27 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Thu, 01 Apr 2010 00:55:27 -0400
Subject: [OpenAFS] volume 536871264 is busy or server is down, recheck
In-Reply-To: <201004011033061568426@ihep.ac.cn>
References: <201004011033061568426@ihep.ac.cn>
Message-ID: <4BB4273F.2090809@secure-endpoints.com>
On 3/31/2010 10:33 PM, 许冬 wrote:
> Hi,
>
> I want to know how many parallel read requests for one volume at the
> same time? or how many parallel read requests for one replication volume
> at the same time?
>
> In our afs system, there are about one hundred people to read a volume
> parallelly, and each people will issus about 500 read requests. I found
> the afs client's /var/log/message file often appear some error
> information, such as "volume 536871264 is busy or server is down, recheck ".
>
> so, I want to know its reason.
>
> Thank you!
>
> With best regards !
>
> Yours sincerely
>
> Dong xu
>
Which operating system?
Which OpenAFS client version and which server version?
Are the clients only reading from the volume or are they writing as well?
Jeffrey Altman
From bbense@slac.stanford.edu Thu Apr 1 20:44:23 2010
From: bbense@slac.stanford.edu (Booker Bense)
Date: Thu, 1 Apr 2010 12:44:23 -0700 (PDT)
Subject: [OpenAFS] volume 536871264 is busy or server is down, recheck
In-Reply-To: <4BB4273F.2090809@secure-endpoints.com>
References: <201004011033061568426@ihep.ac.cn> <4BB4273F.2090809@secure-endpoints.com>
Message-ID:
On Thu, 1 Apr 2010, Jeffrey Altman wrote:
> On 3/31/2010 10:33 PM, 许冬 wrote:
>> Hi,
>>
>> I want to know how many parallel read requests for one volume at the
>> same time? or how many parallel read requests for one replication volume
>> at the same time?
>>
>> In our afs system, there are about one hundred people to read a volume
>> parallelly, and each people will issus about 500 read requests. I found
>> the afs client's /var/log/message file often appear some error
>> information, such as "volume 536871264 is busy or server is down, recheck ".
>>
Our experience is that AFS and a large batch farm are a denial of
service waiting to happen for rw volumes. What happens
is that each batch process registers a callback for the volume it is
writing to, and eventually the server gets starved for available
threads and all the volumes served by that server suffer
performance hits. Essentially, the read requests are limited by
the number of threads on the server for the volume.
We have a constant user education problem with this, especially
since the tipping point doesn't get triggered until the user is
sure everything is working and "scales up" their runs to several
hundred simultaneous batch jobs.
In theory a read only replica volume should not be nearly as
resource intensive. However, we have found this is rarely
the case.
I suspect your real problem is that the jobs are opening dot
files or configuration/logging files in some volume that is also
on the same server as the volume you are reading from. Most
applications have some library that assumes reading/writing to
small files in the home directory will never be a problem.
AFS scales really well under the assumption of many machines each
accessing different volumes; it crashes and burns when the
scenario switches to many machines accessing the same volume.
_ Booker C. Bense
From Richard.Brittain@dartmouth.edu Fri Apr 2 21:17:25 2010
From: Richard.Brittain@dartmouth.edu (Richard Brittain)
Date: Fri, 2 Apr 2010 16:17:25 -0400 (EDT)
Subject: [OpenAFS] Specify size reported by 'df' ?
In-Reply-To:
References: <201004011033061568426@ihep.ac.cn> <4BB4273F.2090809@secure-endpoints.com>
Message-ID:
Hi,
I'm wondering if anyone has tried to customize the (fake) size reported
by 'df', and specifically if anyone has looked into how hard it might be
to make that configurable per-client, with something like a root-only
'fs setdfsize' ?
We occasionally run into problems with the 9000000 k value when some tool
wants to start dumping 10GB into AFS and decides to check first.
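A rough Python sketch of the kind of pre-flight check such a tool might do, and why it refuses. It assumes the "9000000 k" figure is in units of 1024 bytes; the helper names are made up for illustration, and real tools may compute this differently.

```python
import os

def free_bytes(path: str) -> int:
    """Free space as seen by an unprivileged caller (what 'df' reports), Unix only."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize

def has_room(reported_free_kb: int, needed_bytes: int) -> bool:
    """True if the reported free space covers the planned write."""
    return reported_free_kb * 1024 >= needed_bytes

AFS_REPORTED_FREE_KB = 9_000_000   # fixed value quoted above, assumed to be KiB
TEN_GB = 10 * 1000**3              # 10 GB in decimal units
TEN_GIB = 10 * 1024**3             # 10 GiB in binary units

print(has_room(AFS_REPORTED_FREE_KB, TEN_GB))    # False
print(has_room(AFS_REPORTED_FREE_KB, TEN_GIB))   # False
```

Either way the reported value (about 9.2 * 10^9 bytes) falls short of 10GB, so a tool that checks first will give up regardless of the real volume quota.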
Richard
--
Richard Brittain, Research Computing Group,
Kiewit Computing Services, 6224 Baker/Berry Library
Dartmouth College, Hanover NH 03755
Richard.Brittain@dartmouth.edu 6-2085
From fbo2@gmx.net Sat Apr 3 09:06:02 2010
From: fbo2@gmx.net (Frank Burkhardt)
Date: Sat, 3 Apr 2010 10:06:02 +0200
Subject: [OpenAFS] Specify size reported by 'df' ?
In-Reply-To:
References: <201004011033061568426@ihep.ac.cn> <4BB4273F.2090809@secure-endpoints.com>
Message-ID: <20100403080602.GA2978@postman.alpha>
Hi,
On Fri, Apr 02, 2010 at 04:17:25PM -0400, Richard Brittain wrote:
> Hi,
> I'm wondering if anyone has tried to customize the (fake) size
> reported by 'df', and specifically if anyone has looked into how hard it
> might be to make that configurable per-client, with something like a
> root-only
> 'fs setdfsize' ?
>
> We occasionally run into problems with the 9000000 k value when some tool
> wants to start dumping 10GB into AFS and decides to check first.
I've got a similar problem here. For MacOSX, I have to compile AFS myself -
changing the free-space constant before that. Otherwise, our beloved
"Finder" refuses to copy largish data sets (which I have to move around a
lot) into AFS.
However, another fs subcommand might not be necessary - just increasing the
reported free space to 2TiB-1Block should be sufficient since most volumes'
quota is considerably smaller than that.
Are there any programs known to break when reported free space is that high?
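For reference, the arithmetic behind "2TiB-1Block", as a small Python sketch. The 1024-byte block size is an assumption here. Under that assumption the block count is exactly the largest signed 32-bit integer, which may be why this value is a natural ceiling; that reading is a guess, not something stated above.

```python
BLOCK_SIZE = 1024                     # assumed block size in bytes
TWO_TIB = 2 * 1024**4                 # 2 TiB in bytes

blocks = TWO_TIB // BLOCK_SIZE - 1    # "2TiB-1Block" expressed in blocks
INT32_MAX = 2**31 - 1

print(blocks)                         # 2147483647
print(blocks == INT32_MAX)            # True
print(blocks * BLOCK_SIZE > 10 * 1024**3)   # True: a 10 GiB copy would pass a free-space check
```

A program that stores the block count in a signed 32-bit field would still cope with this value, while one block more would overflow it.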
Regards,
Frank
From shadow@gmail.com Sat Apr 3 16:34:19 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Sat, 3 Apr 2010 11:34:19 -0400
Subject: [OpenAFS] Specify size reported by 'df' ?
In-Reply-To: <20100403080602.GA2978@postman.alpha>
References: <201004011033061568426@ihep.ac.cn>
<4BB4273F.2090809@secure-endpoints.com>
<20100403080602.GA2978@postman.alpha>
Message-ID:
On Sat, Apr 3, 2010 at 4:06 AM, Frank Burkhardt wrote:
> Hi,
>
> On Fri, Apr 02, 2010 at 04:17:25PM -0400, Richard Brittain wrote:
>> Hi,
>> I'm wondering if anyone has tried to customize the (fake) size
>> reported by 'df', and specifically if anyone has looked into how hard it
>> might be to make that configurable per-client, with something like a
>> root-only
>> 'fs setdfsize' ?
>>
>> We occasionally run into problems with the 9000000 k value when some tool
>> wants to start dumping 10GB into AFS and decides to check first.
>
> I've got a similiar problem here. For MacOSX, I've to compile AFS myself -
> changing the free-space-constant before that. Otherwise, our beloved
> "Finder" refuses to copy largish data sets (which I have to move around a
> lot) into AFS.
MacOS has reported 1TB free for a while now. If you're trying to copy more
than that in, well, let's just say I don't have a terabyte lying
around to play with.
From sac@cheesecake.org Sun Apr 4 12:00:17 2010
From: sac@cheesecake.org (Sidney Cammeresi)
Date: Sun, 4 Apr 2010 13:00:17 +0200
Subject: [OpenAFS] Vos move/release failure
Message-ID: <20100404110017.GA12901@cheesecake.org>
One of my fileservers is returning errors when I try to move or release
certain volumes to it. For example,
----------
$ vos ex test
test 536871126 RW 2 K On-line
good /vicepa
RWrite 536871126 ROnly 536871133 Backup 0
MaxQuota 5000 K
Creation Sun Apr 4 12:40:27 2010
Copy Sun Apr 4 12:41:53 2010
Backup Never
Last Update Never
RWrite: 536871126
number of sites -> 1
server good partition /vicepa RW Site
$ vos move test good a bad a
Failed to move data for the volume 536871126
: No such file or directory
vos move: operation interrupted, cleanup in progress...
clear transaction contexts
move incomplete - attempt cleanup of target partition - no guarantee
cleanup complete - user verify desired result
----------
All I see in VolserLog on the bad server is
Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)
Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created
Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted
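As a reading aid, here is a small Python sketch that picks the VAttachVolume failures out of log lines like the ones above and decodes the errno. errno 2 is ENOENT, whose usual message text is the same "No such file or directory" that vos move printed, which suggests (but does not prove) that both come from the same failed open. The regular expression and function name are illustrative, not an OpenAFS interface.

```python
import errno
import os
import re

# Log lines copied from the VolserLog excerpt above.
LOG_LINES = [
    "Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)",
    "Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created",
    "Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted",
]

ATTACH_RE = re.compile(r"VAttachVolume: Failed to open (\S+) \(errno (\d+)\)")

def attach_failures(lines):
    """Return (header_path, errno) for each VAttachVolume failure."""
    found = []
    for line in lines:
        m = ATTACH_RE.search(line)
        if m:
            found.append((m.group(1), int(m.group(2))))
    return found

failures = attach_failures(LOG_LINES)
for path, code in failures:
    # errorcode maps the number to its symbolic name; strerror gives the text.
    print(path, errno.errorcode.get(code, "?"), os.strerror(code))
```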
I deleted the volume "test," and then I performed this sequence of
commands to get a different failure:
----------
$ vos create bad a test
Volume 536871134 created on partition /vicepa of bad
$ vos addsite bad a test
Added replication site bad /vicepa for volume test
$ vos addsite good a test
Added replication site good /vicepa for volume test
$ vos release test
Released volume test successfully
$ vos move test bad a good a
WARNING : readOnly copies still exist
Volume 536871134 moved from bad /vicepa to good /vicepa
$ vos release test
Release failed: VOLSER: Problems encountered in doing the dump !
The volume 536871134 could not be released to the following 1 sites:
bad /vicepa
VOLSER: release could not be completed
Error in vos release command.
VOLSER: release could not be completed
----------
Does anyone have any idea what's going on here? Thanks for your
suggestions.
--
Sidney August Cammeresi IV
http://www.cheesecake.org/sac/
From jason@rampaginggeek.com Sun Apr 4 18:14:57 2010
From: jason@rampaginggeek.com (Jason Edgecombe)
Date: Sun, 04 Apr 2010 13:14:57 -0400
Subject: [OpenAFS] Vos move/release failure
In-Reply-To: <20100404110017.GA12901@cheesecake.org>
References: <20100404110017.GA12901@cheesecake.org>
Message-ID: <4BB8C911.5010604@rampaginggeek.com>
Sidney Cammeresi wrote:
> One of my fileservers is returning errors when I try to move or release
> certain volumes to it. For example,
>
> ----------
>
> $ vos ex test
> test 536871126 RW 2 K On-line
> good /vicepa
> RWrite 536871126 ROnly 536871133 Backup 0
> MaxQuota 5000 K
> Creation Sun Apr 4 12:40:27 2010
> Copy Sun Apr 4 12:41:53 2010
> Backup Never
> Last Update Never
>
> RWrite: 536871126
> number of sites -> 1
> server good partition /vicepa RW Site
> $ vos move test good a bad a
>
> Failed to move data for the volume 536871126
> : No such file or directory
> vos move: operation interrupted, cleanup in progress...
> clear transaction contexts
> move incomplete - attempt cleanup of target partition - no guarantee
> cleanup complete - user verify desired result
>
> ----------
>
> All I see in VolserLog on the bad server is
>
> Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)
> Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created
> Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted
>
>
>
> I deleted the volume "test," and then I performed this sequence of
> commands to get a different failure:
>
> ----------
>
> $ vos create bad a test
> Volume 536871134 created on partition /vicepa of bad
> $ vos addsite bad a test
> Added replication site bad /vicepa for volume test
> $ vos addsite good a test
> Added replication site good /vicepa for volume test
> $ vos release test
> Released volume test successfully
> $ vos move test bad a good a
> WARNING : readOnly copies still exist
> Volume 536871134 moved from bad /vicepa to good /vicepa
> $ vos release test
> Release failed: VOLSER: Problems encountered in doing the dump !
> The volume 536871134 could not be released to the following 1 sites:
> bad /vicepa
> VOLSER: release could not be completed
> Error in vos release command.
> VOLSER: release could not be completed
>
> ----------
>
> Does anyone have any idea what's going on here? Thanks for your
> suggestions.
>
What platform, OS, and filesystem do the servers run?
Have you tried salvaging the servers (requires downtime)? If not, run:
bos salvage -server good -part /vicepa
bos salvage -server bad -part /vicepa
Jason
From shadow@gmail.com Sun Apr 4 18:21:07 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Sun, 4 Apr 2010 13:21:07 -0400
Subject: [OpenAFS] Vos move/release failure
In-Reply-To: <20100404110017.GA12901@cheesecake.org>
References: <20100404110017.GA12901@cheesecake.org>
Message-ID:
On Sun, Apr 4, 2010 at 7:00 AM, Sidney Cammeresi wrote:
> One of my fileservers is returning errors when I try to move or release
> certain volumes to it. For example,
>
> ----------
>
> $ vos ex test
> test 536871126 RW 2 K On-line
> good /vicepa
> RWrite 536871126 ROnly 536871133 Backup 0
> MaxQuota 5000 K
> Creation Sun Apr 4 12:40:27 2010
> Copy Sun Apr 4 12:41:53 2010
> Backup Never
> Last Update Never
>
> RWrite: 536871126
> number of sites -> 1
> server good partition /vicepa RW Site
> $ vos move test good a bad a
>
> Failed to move data for the volume 536871126
> : No such file or directory
> vos move: operation interrupted, cleanup in progress...
> clear transaction contexts
> move incomplete - attempt cleanup of target partition - no guarantee
> cleanup complete - user verify desired result
>
> ----------
>
> All I see in VolserLog on the bad server is
>
> Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)
> Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created
> Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted
>
>
>
> I deleted the volume "test," and then I performed this sequence of
> commands to get a different failure:
>
> ----------
>
> $ vos create bad a test
> Volume 536871134 created on partition /vicepa of bad
> $ vos addsite bad a test
> Added replication site bad /vicepa for volume test
> $ vos addsite good a test
> Added replication site good /vicepa for volume test
> $ vos release test
> Released volume test successfully
> $ vos move test bad a good a
> WARNING : readOnly copies still exist
> Volume 536871134 moved from bad /vicepa to good /vicepa
> $ vos release test
> Release failed: VOLSER: Problems encountered in doing the dump !
So you gave us one log. There are 2 servers in play. How about that other log?
From sac@cheesecake.org Mon Apr 5 08:12:57 2010
From: sac@cheesecake.org (Sidney Cammeresi)
Date: Mon, 5 Apr 2010 09:12:57 +0200
Subject: [OpenAFS] Re: Vos move/release failure
In-Reply-To:
References: <20100404110017.GA12901@cheesecake.org>
Message-ID: <20100405071257.GB18095@cheesecake.org>
On Sun, 04 Apr 2010 at 13.21.07 -0400, Derrick Brashear wrote:
> On Sun, Apr 4, 2010 at 7:00 AM, Sidney Cammeresi wrote:
> > One of my fileservers is returning errors when I try to move or release
> > certain volumes to it. For example,
> >
> > ----------
> >
> > $ vos ex test
> > test 536871126 RW 2 K On-line
> > good /vicepa
> > RWrite 536871126 ROnly 536871133 Backup 0
> > MaxQuota 5000 K
> > Creation Sun Apr 4 12:40:27 2010
> > Copy Sun Apr 4 12:41:53 2010
> > Backup Never
> > Last Update Never
> >
> > RWrite: 536871126
> > number of sites -> 1
> > server good partition /vicepa RW Site
> > $ vos move test good a bad a
> >
> > Failed to move data for the volume 536871126
> > : No such file or directory
> > vos move: operation interrupted, cleanup in progress...
> > clear transaction contexts
> > move incomplete - attempt cleanup of target partition - no guarantee
> > cleanup complete - user verify desired result
> >
> > ----------
> >
> > All I see in VolserLog on the bad server is
> >
> > Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)
> > Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created
> > Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted
> >
> >
> >
> > I deleted the volume "test," and then I performed this sequence of
> > commands to get a different failure:
> >
> > ----------
> >
> > $ vos create bad a test
> > Volume 536871134 created on partition /vicepa of bad
> > $ vos addsite bad a test
> > Added replication site bad /vicepa for volume test
> > $ vos addsite good a test
> > Added replication site good /vicepa for volume test
> > $ vos release test
> > Released volume test successfully
> > $ vos move test bad a good a
> > WARNING : readOnly copies still exist
> > Volume 536871134 moved from bad /vicepa to good /vicepa
> > $ vos release test
> > Release failed: VOLSER: Problems encountered in doing the dump !
>
> So you gave us one log. There are 2 servers in play. How about that other log?
If I do the create, addsite, addsite, release sequence that I previously
described, I get the following in the VolserLog on the good server:
Mon Apr 5 09:05:17 2010 1 Volser: CreateVolume: volume 536871141 (test) created
Mon Apr 5 09:06:19 2010 1 Volser: Clone: Cloning volume 536871141 to new volume 536871142
and the following in the VolserLog on the bad server:
Mon Apr 5 09:06:27 2010 VAttachVolume: Failed to open /vicepa/V0536871142.vol (errno 2)
Mon Apr 5 09:06:27 2010 1 Volser: CreateVolume: volume 536871142 (test.readonly) created
Regarding other suggestions I've received, I've tried running bos salvage,
which had no effect, and I have tried running vos release -v, which provided
this output:
test
RWrite: 536871141 ROnly: 536871142 RClone: 536871142
number of sites -> 3
server good partition /vicepa RW Site -- New release
server good partition /vicepa RO Site -- New release
server bad partition /vicepa RO Site -- Old release
This is a completion of a previous release
Starting transaction on cloned volume 536871142... done
Updating existing ro volume 536871142 on bad ...
Starting ForwardMulti from 536871142 to 536871142 on bad (full release).
Failed to dump volume from clone to a ro site: : No such file or directory
The volume 536871141 could not be released to the following 1 sites:
bad /vicepa
VOLSER: release could not be completed
Error in vos release command.
VOLSER: release could not be completed
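For anyone wanting to spot the lagging sites in output like this automatically, a minimal Python sketch follows. It only matches the site lines exactly as they appear above; the pattern and the function name are assumptions for illustration, and other vos versions may format these lines differently.

```python
import re

# Site lines copied from the 'vos release -v' output above.
SAMPLE = """\
number of sites -> 3
server good partition /vicepa RW Site -- New release
server good partition /vicepa RO Site -- New release
server bad partition /vicepa RO Site -- Old release
"""

SITE_RE = re.compile(
    r"server (\S+) partition (\S+) (RW|RO) Site\s+-- (New|Old) release"
)

def stale_sites(output: str):
    """Return (server, partition) for RO sites still marked 'Old release'."""
    return [
        (m.group(1), m.group(2))
        for m in SITE_RE.finditer(output)
        if m.group(3) == "RO" and m.group(4) == "Old"
    ]

print(stale_sites(SAMPLE))   # [('bad', '/vicepa')]
```

Here the only site still on the old release is the RO site on "bad", which is the same site the release fails to dump to.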
--
Sidney August Cammeresi IV
http://www.cheesecake.org/sac/
From shadow@gmail.com Mon Apr 5 12:56:19 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 5 Apr 2010 07:56:19 -0400
Subject: [OpenAFS] Re: Vos move/release failure
In-Reply-To: <20100405071257.GB18095@cheesecake.org>
References: <20100404110017.GA12901@cheesecake.org>
<20100405071257.GB18095@cheesecake.org>
Message-ID:
On Mon, Apr 5, 2010 at 3:12 AM, Sidney Cammeresi wrote:
> On Sun, 04 Apr 2010 at 13.21.07 -0400, Derrick Brashear wrote:
>> On Sun, Apr 4, 2010 at 7:00 AM, Sidney Cammeresi wrote:
>> > One of my fileservers is returning errors when I try to move or release
>> > certain volumes to it. For example,
>> >
>> > ----------
>> >
>> > $ vos ex test
>> > test 536871126 RW 2 K On-line
>> > good /vicepa
>> > RWrite 536871126 ROnly 536871133 Backup 0
>> > MaxQuota 5000 K
>> > Creation Sun Apr 4 12:40:27 2010
>> > Copy Sun Apr 4 12:41:53 2010
>> > Backup Never
>> > Last Update Never
>> >
>> > RWrite: 536871126
>> > number of sites -> 1
>> > server good partition /vicepa RW Site
>> > $ vos move test good a bad a
>> >
>> > Failed to move data for the volume 536871126
>> > : No such file or directory
>> > vos move: operation interrupted, cleanup in progress...
>> > clear transaction contexts
>> > move incomplete - attempt cleanup of target partition - no guarantee
>> > cleanup complete - user verify desired result
>> >
>> > ----------
>> >
>> > All I see in VolserLog on the bad server is
>> >
>> > Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)
>> > Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created
>> > Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted
>> >
>> >
>> >
>> > I deleted the volume "test," and then I performed this sequence of
>> > commands to get a different failure:
>> >
>> > ----------
>> >
>> > $ vos create bad a test
>> > Volume 536871134 created on partition /vicepa of bad
>> > $ vos addsite bad a test
>> > Added replication site bad /vicepa for volume test
>> > $ vos addsite good a test
>> > Added replication site good /vicepa for volume test
>> > $ vos release test
>> > Released volume test successfully
>> > $ vos move test bad a good a
>> > WARNING : readOnly copies still exist
>> > Volume 536871134 moved from bad /vicepa to good /vicepa
>> > $ vos release test
>> > Release failed: VOLSER: Problems encountered in doing the dump !
>>
>> So you gave us one log. There are 2 servers in play. How about that other log?
>
> If I do the create, addsite, addsite, release sequence that I previously
> described, I get the following in the VolserLog on the good server:
>
> Mon Apr 5 09:05:17 2010 1 Volser: CreateVolume: volume 536871141 (test) created
> Mon Apr 5 09:06:19 2010 1 Volser: Clone: Cloning volume 536871141 to new volume 536871142
>
> and the following in the VolserLog on the bad server:
>
> Mon Apr 5 09:06:27 2010 VAttachVolume: Failed to open /vicepa/V0536871142.vol (errno 2)
> Mon Apr 5 09:06:27 2010 1 Volser: CreateVolume: volume 536871142 (test.readonly) created
Yeah, but you're still not telling us about the source server's log
during the release to the "bad" server. What appears in the good
server's log at 9:06:27?
What OpenAFS versions are in play, again?
From sac@cheesecake.org Mon Apr 5 15:10:25 2010
From: sac@cheesecake.org (Sidney Cammeresi)
Date: Mon, 5 Apr 2010 16:10:25 +0200
Subject: [OpenAFS] Re: Re: Vos move/release failure
In-Reply-To:
References: <20100404110017.GA12901@cheesecake.org> <20100405071257.GB18095@cheesecake.org>
Message-ID: <20100405141025.GA24644@cheesecake.org>
On Mon, 05 Apr 2010 at 07.56.19 -0400, Derrick Brashear wrote:
> On Mon, Apr 5, 2010 at 3:12 AM, Sidney Cammeresi wrote:
> > On Sun, 04 Apr 2010 at 13.21.07 -0400, Derrick Brashear wrote:
> >> On Sun, Apr 4, 2010 at 7:00 AM, Sidney Cammeresi wrote:
> >> > One of my fileservers is returning errors when I try to move or release
> >> > certain volumes to it. For example,
> >> >
> >> > ----------
> >> >
> >> > $ vos ex test
> >> > test 536871126 RW 2 K On-line
> >> > good /vicepa
> >> > RWrite 536871126 ROnly 536871133 Backup 0
> >> > MaxQuota 5000 K
> >> > Creation Sun Apr 4 12:40:27 2010
> >> > Copy Sun Apr 4 12:41:53 2010
> >> > Backup Never
> >> > Last Update Never
> >> >
> >> > RWrite: 536871126
> >> > number of sites -> 1
> >> > server good partition /vicepa RW Site
> >> > $ vos move test good a bad a
> >> >
> >> > Failed to move data for the volume 536871126
> >> > : No such file or directory
> >> > vos move: operation interrupted, cleanup in progress...
> >> > clear transaction contexts
> >> > move incomplete - attempt cleanup of target partition - no guarantee
> >> > cleanup complete - user verify desired result
> >> >
> >> > ----------
> >> >
> >> > All I see in VolserLog on the bad server is
> >> >
> >> > Sun Apr 4 12:50:09 2010 VAttachVolume: Failed to open /vicepa/V0536871126.vol (errno 2)
> >> > Sun Apr 4 12:50:09 2010 1 Volser: CreateVolume: volume 536871126 (test) created
> >> > Sun Apr 4 12:50:12 2010 1 Volser: Delete: volume 536871126 deleted
> >> >
> >> >
> >> >
> >> > I deleted the volume "test," and then I performed this sequence of
> >> > commands to get a different failure:
> >> >
> >> > ----------
> >> >
> >> > $ vos create bad a test
> >> > Volume 536871134 created on partition /vicepa of bad
> >> > $ vos addsite bad a test
> >> > Added replication site bad /vicepa for volume test
> >> > $ vos addsite good a test
> >> > Added replication site good /vicepa for volume test
> >> > $ vos release test
> >> > Released volume test successfully
> >> > $ vos move test bad a good a
> >> > WARNING : readOnly copies still exist
> >> > Volume 536871134 moved from bad /vicepa to good /vicepa
> >> > $ vos release test
> >> > Release failed: VOLSER: Problems encountered in doing the dump !
> >>
> >> So you gave us one log. There are 2 servers in play. How about that other log?
> >
> > If I do the create, addsite, addsite, release sequence that I previously
> > described, I get the following in the VolserLog on the good server:
> >
> > Mon Apr 5 09:05:17 2010 1 Volser: CreateVolume: volume 536871141 (test) created
> > Mon Apr 5 09:06:19 2010 1 Volser: Clone: Cloning volume 536871141 to new volume 536871142
> >
> > and the following in the VolserLog on the bad server:
> >
> > Mon Apr 5 09:06:27 2010 VAttachVolume: Failed to open /vicepa/V0536871142.vol (errno 2)
> > Mon Apr =A05 09:06:27 2010 1 Volser: CreateVolume: volume 536871142 (=
test.readonly) created
>=20
> Yeah, but you're still not telling us about the source server's log
> during the release to the "bad" server. What appears in the good
> server's log at 9:06:27?
>=20
> What OpenAFS versions are in play, again?
Sorry for omitting the OpenAFS versions. Initially the good machine
was running 1.4.2, but I have upgraded it to Lenny and the corresponding
package of 1.4.7. The bad machine is also running Lenny's 1.4.7.
There was nothing in the good server's log at 9:06:27.
Testing again, I see on good:
VolserLog
Mon Apr 5 16:02:28 2010 1 Volser: Clone: Recloning volume 536871151 to volume 536871152
FileLog
Mon Apr 5 16:02:28 2010 fssync: volume 536871152 restored; breaking all call backs
And on bad:
VolserLog
Mon Apr 5 16:02:36 2010 VAttachVolume: Failed to open /vicepa/V0536871152.vol (errno 2)
Mon Apr 5 16:02:36 2010 1 Volser: CreateVolume: volume 536871152 (test.readonly) created
There is nothing in the good server's logs at 16:02:36.
--
Sidney August Cammeresi IV
http://www.cheesecake.org/sac/
From shadow@gmail.com Mon Apr 5 15:17:51 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 5 Apr 2010 10:17:51 -0400
Subject: [OpenAFS] Re: Re: Vos move/release failure
In-Reply-To: <20100405141025.GA24644@cheesecake.org>
References: <20100404110017.GA12901@cheesecake.org>
<20100405071257.GB18095@cheesecake.org>
<20100405141025.GA24644@cheesecake.org>
Message-ID:
>> What OpenAFS versions are in play, again?
>
> Sorry for omitting the OpenAFS versions.  Initially the good machine
> was running 1.4.2, but I have upgraded it to Lenny and the corresponding
> package of 1.4.7.  The bad machine is also running Lenny's 1.4.7.
It sounds offhand familiar, like something we fixed, but 1.4.7 is
quite old. Not as elderly as 1.4.2, but...
From sac@cheesecake.org Mon Apr 5 16:11:54 2010
From: sac@cheesecake.org (Sidney Cammeresi)
Date: Mon, 5 Apr 2010 17:11:54 +0200
Subject: [OpenAFS] Re: Vos move/release failure
In-Reply-To:
References: <20100404110017.GA12901@cheesecake.org> <20100405071257.GB18095@cheesecake.org> <20100405141025.GA24644@cheesecake.org>
Message-ID: <20100405151154.GA27158@cheesecake.org>
On Mon, 05 Apr 2010 at 10.17.51 -0400, Derrick Brashear wrote:
> > > What OpenAFS versions are in play, again?
> >
> > Sorry for omitting the OpenAFS versions.  Initially the good
> > machine was running 1.4.2, but I have upgraded it to Lenny and the
> > corresponding package of 1.4.7.  The bad machine is also running
> > Lenny's 1.4.7.
>
> It sounds offhand familiar, like something we fixed, but 1.4.7 is
> quite old. Not as elderly as 1.4.2, but...
Hmm, I upgraded both servers in question to 1.4.11 from backports.org
and am still seeing the same issue.
--
Sidney August Cammeresi IV
http://www.cheesecake.org/sac/
From adeason@sinenomine.net Mon Apr 5 17:49:13 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Mon, 5 Apr 2010 11:49:13 -0500
Subject: [OpenAFS] Re: Vos move/release failure
References: <20100404110017.GA12901@cheesecake.org>
Message-ID: <20100405114913.e22d190e.adeason@sinenomine.net>
On Sun, 4 Apr 2010 13:00:17 +0200
Sidney Cammeresi wrote:
> One of my fileservers is returning errors when I try to move or release
> certain volumes to it. For example,
Try 'vos dump'ing a volume from the good server, and 'vos restore'ing it
to the bad server (restore it as some new volume). What happens? I
assume one of those will fail with 'No such file or directory', probably
at the 'vos restore' step.
Also, neither of these machines has something silly like a 127.foo entry
in /etc/hosts for their hostname, do they?
--
Andrew Deason
adeason@sinenomine.net
From shadow@gmail.com Mon Apr 5 18:31:08 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 5 Apr 2010 13:31:08 -0400
Subject: [OpenAFS] Re: Vos move/release failure
In-Reply-To: <20100405114913.e22d190e.adeason@sinenomine.net>
References: <20100404110017.GA12901@cheesecake.org>
<20100405114913.e22d190e.adeason@sinenomine.net>
Message-ID:
On Mon, Apr 5, 2010 at 12:49 PM, Andrew Deason wrote:
> On Sun, 4 Apr 2010 13:00:17 +0200
> Sidney Cammeresi wrote:
>
>> One of my fileservers is returning errors when I try to move or release
>> certain volumes to it.  For example,
>
> Try 'vos dump'ing a volume from the good server, and 'vos restore'ing it
> to the bad server (restore it as some new volume). What happens? I
> assume one of those will fail with 'No such file or directory', probably
> at the 'vos restore' step.
>
> Also, neither of these machines has something silly like a 127.foo entry
> in /etc/hosts for their hostname, do they?
His last reply was private. That was it.
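
For anyone hitting the same symptom, the /etc/hosts check suggested above can be sketched in a few lines of Python. This is only an illustration, not an OpenAFS tool: the function name, the sample file contents and the example.org names are all hypothetical.

```python
import ipaddress


def loopback_hostname_entries(hosts_text, hostname):
    """Return (address, names) pairs where `hostname` maps into 127.0.0.0/8.

    `hosts_text` is the content of a hosts file; `hostname` is the name the
    server uses for itself. Both are supplied by the caller.
    """
    hits = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # not an address line; ignore it
        if ip.version == 4 and ip.is_loopback and hostname in names:
            hits.append((addr, names))
    return hits


# Hypothetical hosts file: "bad" resolves to a loopback address.
sample = """\
127.0.0.1   localhost
127.0.1.1   bad.example.org bad
192.0.2.10  good.example.org good
"""

print(loopback_hostname_entries(sample, "bad"))
```

On a real server one would pass the contents of /etc/hosts and the machine's own hostname; any hit is worth a closer look before blaming the volserver.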
From sabah.salih@hep.manchester.ac.uk Mon Apr 5 18:35:06 2010
From: sabah.salih@hep.manchester.ac.uk (sabah s. salih)
Date: Mon, 5 Apr 2010 18:35:06 +0100
Subject: [OpenAFS] knfs
Message-ID: <75828F93B3A771439C6F9F4B67CFE58B0477D7108C@exchange.hep.manchester.ac.uk>
Dear All,
Is knfs still supported by OpenAFS?
Thanks, Sabah.
>From Sabah Salih
The School of Physics and Astronomy,
The University of Manchester,
Schuster Laboratory,
Brunswick Street,
Manchester M13 9PL.
Tel: +44 1612754171 or x4171
E-mail: sabah.salih@hep.manchester.ac.uk
From adeason@sinenomine.net Mon Apr 5 18:54:45 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Mon, 5 Apr 2010 12:54:45 -0500
Subject: [OpenAFS] Re: Vos move/release failure
References: <20100404110017.GA12901@cheesecake.org>
<20100405114913.e22d190e.adeason@sinenomine.net>
Message-ID: <20100405125445.096cb61c.adeason@sinenomine.net>
On Mon, 5 Apr 2010 13:31:08 -0400
Derrick Brashear wrote:
> On Mon, Apr 5, 2010 at 12:49 PM, Andrew Deason
> wrote:
>
> > Also, neither of these machines has something silly like a 127.foo
> > entry in /etc/hosts for their hostname, do they?
>
> His last reply was private. That was it.
Would it be feasible for viced to refuse to register addrs in 127/8, or
at least yell at you or something?
--
Andrew Deason
adeason@sinenomine.net
From shadow@gmail.com Mon Apr 5 19:16:29 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 5 Apr 2010 14:16:29 -0400
Subject: [OpenAFS] Re: Vos move/release failure
In-Reply-To: <20100405125445.096cb61c.adeason@sinenomine.net>
References: <20100404110017.GA12901@cheesecake.org>
<20100405114913.e22d190e.adeason@sinenomine.net>
<20100405125445.096cb61c.adeason@sinenomine.net>
Message-ID:
On Mon, Apr 5, 2010 at 1:54 PM, Andrew Deason wrote:
> On Mon, 5 Apr 2010 13:31:08 -0400
> Derrick Brashear wrote:
>
>> On Mon, Apr 5, 2010 at 12:49 PM, Andrew Deason
>> wrote:
>>
>> > Also, neither of these machines has something silly like a 127.foo
>> > entry in /etc/hosts for their hostname, do they?
>>
>> His last reply was private. That was it.
>
> Would it be feasible for viced to refuse to register addrs in 127/8, or
> at least yell at you or something?
we had people using other-than-127.0.0.1 complain once about that.
yelling otoh seems plausible.
From shadow@gmail.com Mon Apr 5 19:17:25 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 5 Apr 2010 14:17:25 -0400
Subject: [OpenAFS] knfs
In-Reply-To: <75828F93B3A771439C6F9F4B67CFE58B0477D7108C@exchange.hep.manchester.ac.uk>
References: <75828F93B3A771439C6F9F4B67CFE58B0477D7108C@exchange.hep.manchester.ac.uk>
Message-ID:
On Mon, Apr 5, 2010 at 1:35 PM, sabah s. salih
wrote:
> Dear All,
>            Is knfs is still supported by openafs.
If it works, it works. You need OpenAFS 1.5 and certain Linux
versions, or any OpenAFS with Solaris.
YMMV.
From crestani@informatik.uni-tuebingen.de Tue Apr 6 08:10:44 2010
From: crestani@informatik.uni-tuebingen.de (Marcus Crestani)
Date: Tue, 06 Apr 2010 09:10:44 +0200
Subject: [OpenAFS] Documentation for MacOSX Preference Pane in 1.4.12
Message-ID:
I have trouble getting the new maintenance release 1.4.12 working
properly on MacOSX 10.6.3. We use Kerberos 5 to authenticate and our
users' home directories are stored in AFS, so we need to get a Token at
login time.
We have a working setup with OpenAFS 1.4.11 that obtains a Kerberos
Ticket during login and uses the loginwindow's LoginHook to convert the
Kerberos Ticket to an AFS Token.
After installing 1.4.12, this no longer seems to work. On the
first login attempt, I do not get a token at login time. To get one, I
have to log out and log in again. But then, the Finder process stalls for
a couple of minutes with a spinning wheel. Something is terribly wrong
here. I have the feeling that the new Preference Pane and our previous
solution do not work together.
To understand what happens, I could use some insight on the new
Preference Pane. Is there some documentation? Especially, it would be
useful to have answers to the following questions:
- What is the purpose of the "Backgrounder" Launch Agent
it.infn.lnf.network.AFSBackgrounder.plist? How is it connected to the
Preference Pane?
- Where does the Preference Pane store its settings?
- What happens in the background when I check/uncheck the several
options like "Get Krb5 credential at login", "Use aklog", "get
credential at login time", "Backgrounder"?
- Where does the stuff under the "Parameter" tab get stored? How does
the AFS startup get these values?
I'd appreciate any help, either ideas concerning the problems described
above or answers to my questions that would help me figure things out.
Thanks in advance!
--
Marcus
From lists@drewstud.com Tue Apr 6 14:11:35 2010
From: lists@drewstud.com (lists@drewstud.com)
Date: Tue, 6 Apr 2010 09:11:35 -0400 (EDT)
Subject: [OpenAFS] servers not establishing a quorum
Message-ID: <1270559495.93977471@192.168.2.230>
We had two afs servers and things were running great, we had a nice quorum and all was happy.
We added an additional afs server over the weekend, and now none of the machines will establish a quorum. All FileLogs show the 5376 error code.
Tue Apr 6 08:59:59 2010 File server starting
Tue Apr 6 08:59:59 2010 /var/openafs/sysid: doesn't exist
Tue Apr 6 08:59:59 2010 Creating new SysID file
Tue Apr 6 08:59:59 2010 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=0)
Tue Apr 6 09:00:00 2010 Set thread id 133 for FSYNC_sync
Tue Apr 6 09:00:00 2010 FSYNC_sync: bind failed with (98), removed bogus /var/openafs/fssync.sock

udebug of 7002 of all three servers:
http://pastebin.com/SZyM4BC7

They all show the sync host as 0.0.0.0 (which is what it gets set to when a quorum cannot be established, right?)

vos listaddrs shows the two original afs servers, but not the current one.

I upped the debug level on the vlserver and get:
Tue Apr 6 09:09:44 2010 beacon: amSyncSite is 0
Tue Apr 6 09:09:44 2010 Received beacon type 0 from host 10.130.8.160
Tue Apr 6 09:09:46 2010 Received beacon from unknown host 172.20.1.26
Tue Apr 6 09:09:48 2010 recovery running in state 0
Tue Apr 6 09:09:48 2010 beacon: amSyncSite is 0
Tue Apr 6 09:09:52 2010 recovery running in state 0
Tue Apr 6 09:09:52 2010 beacon: amSyncSite is 0

repeatedly.
We added the server to the CellServDB file on all afs servers, and restarted them, and we got this. Also the sysid file is not being created on the new server (which iirc is because no quorum can be established).
I checked time, and they are all synced within ~1 second of each other.

What else could I be missing or need to check? I am sure it is something very simple.

Thank you.
From nymano@seznam.cz Tue Apr 6 15:05:51 2010
From: nymano@seznam.cz (=?us-ascii?Q?Alena=20Manova?=)
Date: Tue, 06 Apr 2010 16:05:51 +0200 (CEST)
Subject: [OpenAFS] =?us-ascii?Q?kmod=20packages=20for=20rh=20kernel=202=2E6=2E18=2D164=2E15=2E1=2Eel5?=
In-Reply-To:
Message-ID: <969.804.1039-30045-1078124667-1270562751@seznam.cz>
Hi,
any idea when this is available in the official OpenAFS repository? the latest I can see is kmod-openafs-xen-1.4.11-1.1.2.6.18_164.11.1.el5.
thanks, Nick.
From adeason@sinenomine.net Tue Apr 6 16:02:43 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Tue, 6 Apr 2010 10:02:43 -0500
Subject: [OpenAFS] Re: servers not establishing a quorum
References: <1270559495.93977471@192.168.2.230>
Message-ID: <20100406100243.6dd186ad.adeason@sinenomine.net>
On Tue, 6 Apr 2010 09:11:35 -0400 (EDT)
lists@drewstud.com wrote:
> udebug of 7002 of all three servers:
> http://pastebin.com/SZyM4BC7
7003 is vlserver, 7002 is ptserver. But I expect the output for either
would be similar.
> They all show the sync host as 0.0.0.0 (which is what it gets set to
> when a quorum cannot be established right?)
It says they have not yet established quorum. What tells you that quorum
will _not_ be established is that all of the hosts are voting for
themselves, so they will never agree on a leader.
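
To make the point concrete, here is a deliberately simplified model of the vote count. It assumes a plain strict-majority rule and ignores the finer points of Ubik's real election (tie-breaking, vote timeouts and so on); the function name and server labels are made up for illustration.

```python
from collections import Counter


def elected_sync_site(votes):
    """Return the server holding a strict majority of the votes, or None.

    `votes` maps each voting server to the server it currently votes for.
    This is a simplified model, not Ubik's actual algorithm.
    """
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > len(votes) / 2 else None


servers = ["db1", "db2", "db3"]  # hypothetical dbserver labels

# Every server votes for itself: nobody reaches 2 of 3 votes.
self_votes = {s: s for s in servers}
print(elected_sync_site(self_votes))

# Two servers agree on db1: db1 holds a majority.
agreed_votes = {"db1": "db1", "db2": "db1", "db3": "db3"}
print(elected_sync_site(agreed_votes))
```

With three servers each holding one vote for itself, no candidate can pass the majority threshold, which matches the stuck state described here.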
> vos listaddrs shows the two original afs servers, but not the current
> one.
vos listaddrs lists fileservers.
> What else could I be missing or need to check? I am sure it is
> something very simple.
'udebug 7003 -long' on each one.
It looks like they may disagree on what the dbservers are, but 'udebug
-long' will say what they think they are.
--
Andrew Deason
adeason@sinenomine.net
From shadow@gmail.com Tue Apr 6 16:14:42 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Tue, 6 Apr 2010 11:14:42 -0400
Subject: [OpenAFS] kmod packages for rh kernel 2.6.18-164.15.1.el5
In-Reply-To: <969.804.1039-30045-1078124667-1270562751@seznam.cz>
References:
<969.804.1039-30045-1078124667-1270562751@seznam.cz>
Message-ID:
On Tue, Apr 6, 2010 at 10:05 AM, Alena Manova wrote:
> Hi,
>
> any idea when this is available in the official OpenAFS repository? the latest I can see is kmod-openafs-xen-1.4.11-1.1.2.6.18_164.11.1.el5.
if you upgrade to 1.4.12 that kmod is already provided. i will copy
new kmods to 1.4.11 later today and it will probably show up, but you
probably are better off with 1.4.12.
--
Derrick
From lists@drewstud.com Tue Apr 6 18:09:09 2010
From: lists@drewstud.com (lists@drewstud.com)
Date: Tue, 6 Apr 2010 13:09:09 -0400 (EDT)
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To: <20100406100243.6dd186ad.adeason@sinenomine.net>
References: <1270559495.93977471@192.168.2.230>
<20100406100243.6dd186ad.adeason@sinenomine.net>
Message-ID: <1270573749.811611352@192.168.2.230>
Here is what I get from 7003 -long on each server
http://pastebin.com/L8BJtNst

Two of our servers have the same local db version and sync site db.
The other one has v 1.1 for both. They all have Sync host showing as 0.0.0.0
What we have tried is removing the new afs server from CellServDB, and stopping the openafs processes on that server (putting it back to what it was before we tried adding it), and the two original servers were able to reach quorum.
Once we added this new server back, no more quorum.
We verified that the hosts files are correct and match, and they do.
We also stopped the vlserver process on two of the servers, and left it up on one (this is with the new server in CellServDB) and it never established a quorum, even though it was the only one alive.
We were however able to get vos listaddrs to show the new fileserver, which is good, but we are still stuck.
What has to be in place for a quorum to take place? I am certain we are missing something simple here.

Thanks!
From stephen@physics.unc.edu Tue Apr 6 18:32:27 2010
From: stephen@physics.unc.edu (Stephen Joyce)
Date: Tue, 6 Apr 2010 13:32:27 -0400 (EDT)
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To: <1270573749.811611352@192.168.2.230>
References: <1270559495.93977471@192.168.2.230> <20100406100243.6dd186ad.adeason@sinenomine.net>
<1270573749.811611352@192.168.2.230>
Message-ID:
On Tue, 6 Apr 2010, lists@drewstud.com wrote:
> ...snip...
> I am certain we are missing something simple here.
>
> Thanks!
Simple? Do you have a firewall on any of the servers? Have you configured
it to allow packets to and from the other servers on the relevant ports?
(Remember udp).
... Just a shot in the dark.
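
As a rough aid for the firewall check, the sketch below lists the UDP ports commonly used by AFS server processes and reports which of them are absent from an allow-list. The thread itself only confirms 7002 (ptserver) and 7003 (vlserver); the other entries are the conventional assignments and should be checked against your own installation. The function name and the allow-list are hypothetical.

```python
# Conventional AFS UDP ports. Only 7002 and 7003 are confirmed in this thread;
# treat the rest as assumptions to verify locally.
AFS_UDP_PORTS = {
    7000: "fileserver",
    7001: "client callback (cache manager)",
    7002: "ptserver",
    7003: "vlserver",
    7005: "volserver",
    7007: "bosserver",
}


def missing_ports(allowed):
    """Return the AFS UDP ports that are not present in `allowed`, sorted."""
    return sorted(set(AFS_UDP_PORTS) - set(allowed))


# Hypothetical firewall allow-list that only opens the database servers.
for port in missing_ports([7002, 7003]):
    print(port, AFS_UDP_PORTS[port])
```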
From lists@drewstud.com Tue Apr 6 18:39:28 2010
From: lists@drewstud.com (lists@drewstud.com)
Date: Tue, 6 Apr 2010 13:39:28 -0400 (EDT)
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To:
References: <1270559495.93977471@192.168.2.230>
<20100406100243.6dd186ad.adeason@sinenomine.net>
<1270573749.811611352@192.168.2.230>
Message-ID: <1270575568.078330632@192.168.2.230>
There is no firewall in between or iptables running on these machines. We can also udebug to each server from the other server with no problems.
Forgot to mention these are RHEL 5.4 w/openafs 1.4.11
What is interesting in udebug is that they all say that each machine is the lowest host....
From VLLog:
Tue Apr 6 13:20:52 2010 Ubik: vote 'yes' for 10.130.8.160 (NOT in quorum)
So they are all voting for themselves....
From adeason@sinenomine.net Tue Apr 6 18:41:08 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Tue, 6 Apr 2010 12:41:08 -0500
Subject: [OpenAFS] Re: servers not establishing a quorum
References: <1270559495.93977471@192.168.2.230>
Message-ID: <20100406124108.0631ce22.adeason@sinenomine.net>
On Tue, 6 Apr 2010 09:11:35 -0400 (EDT)
lists@drewstud.com wrote:
> Tue Apr 6 09:09:46 2010 Received beacon from unknown host 172.20.1.26
What is this address? Your udebug output shows that it expects one of
the dbservers to be 172.20.125.226, but this log shows that you're
getting pinged from 172.20.1.26.
I could ask a few more questions, but if you just show me the new
updated CellServDB for your cell and indicate which server is the new
one, it would answer them.
--
Andrew Deason
adeason@sinenomine.net
From kula@tproa.net Tue Apr 6 18:46:08 2010
From: kula@tproa.net (Thomas Kula)
Date: Tue, 6 Apr 2010 13:46:08 -0400
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To: <20100406124108.0631ce22.adeason@sinenomine.net>
References: <1270559495.93977471@192.168.2.230>
<20100406124108.0631ce22.adeason@sinenomine.net>
Message-ID: <20100406174608.GD17256@mcketrick.tproa.net>
On Tue, Apr 06, 2010 at 12:41:08PM -0500, Andrew Deason wrote:
> On Tue, 6 Apr 2010 09:11:35 -0400 (EDT)
> lists@drewstud.com wrote:
>
> > Tue Apr 6 09:09:46 2010 Received beacon from unknown host 172.20.1.26
>
> What is this address? Your udebug output shows that it expects one of
> the dbservers to be 172.20.125.226, but this log shows that you're
> getting pinged from 172.20.1.26.
>
> I could ask a few more questions, but if you just show me the new
> updated CellServDB for your cell and indicate which server is the new
> one, it would answer them.
>
And, importantly, since you are dealing with dbservers, it should
be the server side CellServDB, (I tend to find it in /etc/openafs/server
around here, but your path may vary).
Also, you are trying to add a dbserver, right? Not just a simple fileserver?
--
Thomas L. Kula | kula@tproa.net | http://kula.tproa.net/
From lists@drewstud.com Tue Apr 6 18:56:50 2010
From: lists@drewstud.com (lists@drewstud.com)
Date: Tue, 6 Apr 2010 13:56:50 -0400 (EDT)
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To: <20100406124108.0631ce22.adeason@sinenomine.net>
References: <1270559495.93977471@192.168.2.230>
<20100406124108.0631ce22.adeason@sinenomine.net>
Message-ID: <1270576610.396627792@192.168.2.230>
awesome.
This may help as well:
we have afs "pairs" at each location. We are syncing them with heartbeat/drbd.
172.20.125.226 is the floating VIP for the afs server at one location.
172.20.1.26 is the local ip of one of the "pairs" of the servers (currently the active node). We have the db, sysid, configs etc... identical on each pair, and before adding this extra server, failing over each pair worked great.
We have tried to get it to only "show" the one floating vip via NetInfo, however it seemed that the fileserver loved to use the ip for the hostname no matter what, and to be honest, since it worked when we failed over to the other server, we did not think it was a problem.
FileLog
FileServer afs1.afs.dfw1a has address 172.20.1.26
VLLog
Tue Apr 6 13:23:37 2010 ubik: primary address 172.20.1.26 does not exist
Tue Apr 6 13:23:37 2010 Using 172.20.125.226 as my primary address
Contents of NetInfo:
172.20.125.226

CellServDB:
>wm.mlsrvr.com #Cell name
172.20.125.226 #afs.dfw1a.rsapps.net
192.168.125.102 #afs.iad1a.rsapps.net
10.130.8.160 #store.afs.ord1a.rsapps.net

the 10.138.8.160 is the server we are trying to add
From lists@drewstud.com Tue Apr 6 18:58:04 2010
From: lists@drewstud.com (lists@drewstud.com)
Date: Tue, 6 Apr 2010 13:58:04 -0400 (EDT)
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To: <20100406174608.GD17256@mcketrick.tproa.net>
References: <1270559495.93977471@192.168.2.230>
<20100406124108.0631ce22.adeason@sinenomine.net>
<20100406174608.GD17256@mcketrick.tproa.net>
Message-ID: <1270576684.121710186@192.168.2.230>
Yes, we are adding both a Fileserver and a DBserver. We have our afs server setup not to be running the client for a variety of reasons, so the only one to edit on our afs servers is the server side CellServDB
From adeason@sinenomine.net Tue Apr 6 19:37:49 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Tue, 6 Apr 2010 13:37:49 -0500
Subject: [OpenAFS] Re: servers not establishing a quorum
References: <1270559495.93977471@192.168.2.230>
<20100406124108.0631ce22.adeason@sinenomine.net>
<1270576610.396627792@192.168.2.230>
Message-ID: <20100406133749.7172ede8.adeason@sinenomine.net>
On Tue, 6 Apr 2010 13:56:50 -0400 (EDT)
lists@drewstud.com wrote:
> awesome.
> This may help as well:
> we have afs "pairs" at each location. We are syncing them with
> heartbeat/drbd.
Trying to do that with dbservers seems overkill, but okay. So you have a
hot-spare that starts up bosserver when the other node goes down, I
assume?
> We have tried to get it to only "show" the one floating vip via
> NetInfo
I haven't been thinking about the cluster-HA AFS thing recently, but I'm
not sure how necessary that is. Fileservers will register what addresses
they have on startup, so if the local IP is registered in the VLDB on
one fileserver, and it goes down and the other server comes up, the old
local IP should go away. If/when clients re-read VLDB information, they
won't get the IP for the downed fileserver.
> VLLog
> Tue Apr 6 13:23:37 2010 ubik: primary address 172.20.1.26 does not exist
> Tue Apr 6 13:23:37 2010 Using 172.20.125.226 as my primary address
> Contents of NetInfo:
> 172.20.125.226
That will work for fileservers, but I think for dbservers that's going
to cause problems like the one you're seeing. When 10.138.8.160 gets a
ping from 172.20.1.26, it doesn't know which site in the quorum it
corresponds to, since you told 172.20.1.26 not to advertise the
172.20.1.26 address. Preferably for dbservers you would not specify
anything in that file.
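
The mismatch can be illustrated with a small sketch that parses a CellServDB-style listing and checks whether a beacon's source address belongs to the configured dbservers. It is a simplified model of the check, not Ubik's actual code; the function names are hypothetical, and the sample data is the CellServDB posted earlier in this thread.

```python
def parse_cellservdb(text):
    """Parse CellServDB text into {cell_name: [dbserver addresses]}."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        body = line.split("#", 1)[0].strip()  # drop the trailing comment
        if line.startswith(">"):
            current = body[1:].strip()  # ">cellname" starts a new cell
            cells[current] = []
        elif current is not None and body:
            cells[current].append(body)
    return cells


def is_known_dbserver(cells, cell, source_addr):
    """True if `source_addr` is listed as a dbserver for `cell`."""
    return source_addr in cells.get(cell, [])


cellservdb = """\
>wm.mlsrvr.com #Cell name
172.20.125.226 #afs.dfw1a.rsapps.net
192.168.125.102 #afs.iad1a.rsapps.net
10.130.8.160 #store.afs.ord1a.rsapps.net
"""

cells = parse_cellservdb(cellservdb)
print(is_known_dbserver(cells, "wm.mlsrvr.com", "172.20.1.26"))     # node's local IP
print(is_known_dbserver(cells, "wm.mlsrvr.com", "172.20.125.226"))  # floating VIP
```

Packets leaving the active node carry its local address, which is not in the list, while the floating VIP is; that is the gap behind the "unknown host" beacon message.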
Alternatively, the easiest way for you to solve this would probably be
to just route outgoing packets such that they originate from
172.20.125.226 instead of 172.20.1.26 (enabled with some heartbeat
script). Would that be possible?
--
Andrew Deason
adeason@sinenomine.net
From lists@drewstud.com Tue Apr 6 20:02:24 2010
From: lists@drewstud.com (lists@drewstud.com)
Date: Tue, 6 Apr 2010 15:02:24 -0400 (EDT)
Subject: [OpenAFS] Re: servers not establishing a quorum
In-Reply-To: <20100406133749.7172ede8.adeason@sinenomine.net>
References: <1270559495.93977471@192.168.2.230>
<20100406124108.0631ce22.adeason@sinenomine.net>
<1270576610.396627792@192.168.2.230>
<20100406133749.7172ede8.adeason@sinenomine.net>
Message-ID: <1270580544.16875658@192.168.2.230>
first off thank you again! I definitely appreciate everyone taking time to answer.
I removed the NetInfo file from the new openafs server we are trying to add, and BAM quorum, the new afs server became the quorum, and you are definitely correct about it having to do with what is and is not getting registered (had to be since removing that file worked :) ) Each of our afs servers so far is a fileserver and dbserver.

As far as sourcing everything on each server so that it comes from the vip via heartbeat, that is absolutely possible and we will probably go that route, since our servers are both fileserver and dbservers and we will need some kind of vip to fail over. Good idea.

Thanks again for the help! I also finally understand how a quorum is elected much better after digging through the mailing list (found http://www.openafs.org/pipermail/openafs-devel/2001-January/005470.html).
From adeason@sinenomine.net Tue Apr 6 20:26:08 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Tue, 6 Apr 2010 14:26:08 -0500
Subject: [OpenAFS] Re: servers not establishing a quorum
References: <1270559495.93977471@192.168.2.230>
<20100406124108.0631ce22.adeason@sinenomine.net>
<1270576610.396627792@192.168.2.230>
<20100406133749.7172ede8.adeason@sinenomine.net>
<1270580544.16875658@192.168.2.230>
Message-ID: <20100406142608.8d507f77.adeason@sinenomine.net>
On Tue, 6 Apr 2010 15:02:24 -0400 (EDT)
lists@drewstud.com wrote:
> I removed the NetInfo file from the new openafs server we are trying
> to add, and BAM quorum, the new afs server became the quorum, and you
> are definitely correct about it having to do with what is and is not
> getting registered (had to be since removing that file worked :) )
Okay, but that may or may not be what you want in the longer term. If
you don't have a NetInfo on the fileserver and it advertises all of its
IPs, a client may choose to contact the fileserver on its local IP.
When it goes down and you failover, that client may try again to contact
the fileserver on that local IP, and it will hang until it times out on
the network. That should only happen once (at which point I believe the
client refreshes its information on where that fileserver is), but that
could still be annoying. But maybe it's not so bad, I don't remember
what people normally do.
> As far as sourcing everything on each server so that it comes from the
> vip via heartbeat, that is absolutely possible and we will probably go
> that route
Okay, that's good. I don't think there are any problems with doing that,
but it's a little annoying that it needs to be done.
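One way a heartbeat hook might do this on Linux is sketched below. It
assumes iproute2 is available and that a default route carrying a
preferred source address is acceptable on the host; the gateway and
interface names are hypothetical placeholders, and only the VIP comes
from this thread. It prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: make outgoing traffic prefer the floating VIP as its
# source address. GATEWAY and DEV are hypothetical; adjust to the site.
VIP=172.20.125.226
GATEWAY=172.20.125.1    # hypothetical default gateway
DEV=eth0                # hypothetical interface carrying the VIP

# Build the iproute2 command line. The "src" attribute sets the
# preferred source address for packets sent via this route.
# Arguments: <vip> <gateway> <device>
vip_route_cmd() {
    echo "ip route replace default via $2 dev $3 src $1"
}

cmd=$(vip_route_cmd "$VIP" "$GATEWAY" "$DEV")
if [ "${APPLY:-0}" = 1 ]; then
    # Needs root; intended to be called when heartbeat acquires the VIP.
    $cmd
else
    echo "would run: $cmd"
fi
```

The standby node would need the reverse step when it releases the VIP,
which is not shown here.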
--
Andrew Deason
adeason@sinenomine.net
From pontius@btv.ibm.com Wed Apr 7 19:24:05 2010
From: pontius@btv.ibm.com (Dale Pontius)
Date: Wed, 07 Apr 2010 14:24:05 -0400
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk>
Message-ID: <4BBCCDC5.9090805@btv.ibm.com>
At what point should the 1.5.xx series be considered "usable" on Linux?
I'm thinking usable as 1.4.xx is, not trying disconnected mode, at the
moment.
I've tried Linux clients from 1.5.71-73 with varying amounts of
success, but in no case has it been sufficient. A quick look at my
domain, and you'll see that I'm likely connecting to a stunning (or
annoying) variety of servers, which might explain my results. In every
case, I've had "holes" in my data as viewed in /afs. Though those holes
have gotten smaller with each release, some still appear to be there
with 1.5.73. Is there something I can run that will furnish debug
information to be of some help? A quick glance through "pts help" or
"fs help" doesn't really suggest anything to me. I'm presuming I want
to identify missing data, and run something against the mount point or
against the server. With 1.5.72 I found that the mount points
themselves were missing for some data. I guess I also need to know how
to map a mount point to a server. By the way, this stuff is both
same-cell and cross-cell. For the testing I've done so far, any
cross-cell data is "system:anyuser rl" - I haven't gotten to getting
extra tokens.
Thanks,
Dale Pontius
On 03/26/10 20:16, Simon Wilkinson wrote:
>
> On 27 Mar 2010, at 00:11, Jason Edgecombe wrote:
>
>> Are there any RPM or debian packages for 1.5?
>
> When I get the time, I've been producing RPMs for 1.5 using the
> 'standard' tool chain that we use for the 1.4.x RPMs.
>
> If people would like to see these on a more regular basis, let me know
>
> Cheers,
>
> Simon.
--
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: pontius@us.ibm.com
This e-mail and its attachments, if any, may contain confidential and privileged material for the sole use of the intended recipient. Any review, use, distribution or disclosure by others is strictly prohibited. If you are not the intended recipient (or authorized to receive for the recipient), please contact the sender by reply e-mail and delete all copies of this message from your system without copying it and notify sender of the misdirection by reply e-mail.
From Ken@Elkabany.com Wed Apr 7 19:24:33 2010
From: Ken@Elkabany.com (Ken Elkabany)
Date: Wed, 7 Apr 2010 11:24:33 -0700
Subject: [OpenAFS] Can client's CellServDB file rely on DNS?
Message-ID:
Hello,
We have had to replace our master openafs fileserver several times. Each
time we have had to go through each client and update the CellServDB file to
reflect the IP address of the new replacement server. Since we always map a
domain name to the master openafs fileserver, is it possible to specify in
the CellServDB file to always use domain name x as opposed to an IP? This
feature would allow the system to reconfigure itself automatically once the
DNS information is updated. By any chance is this feature already present
and I've simply missed it?
Thanks,
Ken
From rra@stanford.edu Wed Apr 7 19:28:29 2010
From: rra@stanford.edu (Russ Allbery)
Date: Wed, 07 Apr 2010 11:28:29 -0700
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <4BBCCDC5.9090805@btv.ibm.com> (Dale Pontius's message of "Wed,
07 Apr 2010 14:24:05 -0400")
References: <4BAD4D29.6040409@rampaginggeek.com>
<5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk>
<4BBCCDC5.9090805@btv.ibm.com>
Message-ID: <87ljczt8vm.fsf@windlord.stanford.edu>
Dale Pontius writes:
> At what point should the 1.5.xx series be considered "usable" on Linux?
> I'm thinking usable as 1.4.xx is, not trying disconnected mode, at the
> moment.
I finally got a working build on Debian with 1.5.73.3 plus two additional
patches (one of which has been merged and one of which is still in
review), so that's a good sign. :) I agree with your general impression
that up until now we've not really been there on Linux, but we seem to be
stabilizing.
I would give it a while longer, though, before considering it as stable as
1.4.x. At least a couple more releases, I suspect.
> I've tried Linux clients from 1.5.71-73 with varying amounts of success,
> but in no case has it been sufficient. A quick look at my domain, and
> you'll see that I'm likely connecting to a stunning (or annoying)
> variety of servers, which might explain my results. In every case, I've
> had "holes" in my data as viewed in /afs. Though those holes have
> gotten smaller with each release, some still appear to be there with
> 1.5.73. Is there something I can run that will furnish debug
> information to be of some help?
The first thing I would try, based on my experience from yesterday, is to
stop your AFS client and completely purge your cache directory. Then
start it again and see if the holes have gone away.
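A minimal sketch of the purge step, assuming a disk cache whose location
is recorded in a cacheinfo file of the usual form
<mount point>:<cache dir>:<size in 1K blocks>. The cacheinfo path in the
example comment is an assumption (it differs between distributions), and
stopping and starting the client is left to the local init script.

```shell
#!/bin/sh
# Sketch only: empty an OpenAFS disk cache while the client is stopped.
# Function names are illustrative, not part of OpenAFS.

# Print the cache directory named in a cacheinfo file.
cache_dir_from_cacheinfo() {
    cut -d: -f2 "$1" | head -n 1
}

# Remove everything inside the cache directory but keep the directory.
purge_cache_dir() {
    dir=$1
    # Refuse empty, root, or relative paths.
    case "$dir" in
        /|"") echo "refusing to purge '$dir'" >&2; return 1 ;;
        /*) ;;
        *) echo "refusing relative path '$dir'" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -mindepth 1 -delete
}

# Example (run as root, with the AFS client stopped):
#   purge_cache_dir "$(cache_dir_from_cacheinfo /etc/openafs/cacheinfo)"
```

The client rebuilds the cache files the next time it starts, so the
first accesses afterwards will be slower than usual.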
When switching from 1.4 to 1.5, I got some weird cache artifacts. I
thought that was because I also had a kernel panic with the new version
that could have left the cache in an inconsistent state, but I'm wondering
if there may be some more basic upgrade problem.
Other people have been switching back and forth without encountering this
problem, so this may be a red herring, but it's worth a try.
--
Russ Allbery (rra@stanford.edu)
From rra@stanford.edu Wed Apr 7 19:29:55 2010
From: rra@stanford.edu (Russ Allbery)
Date: Wed, 07 Apr 2010 11:29:55 -0700
Subject: [OpenAFS] Can client's CellServDB file rely on DNS?
In-Reply-To:
(Ken Elkabany's message of "Wed, 7 Apr 2010 11:24:33 -0700")
References:
Message-ID: <87hbnnt8t8.fsf@windlord.stanford.edu>
Ken Elkabany writes:
> We have had to replace our master openafs fileserver several times. Each
> time we have had to go through each client and update the CellServDB
> file to reflect the IP address of the new replacement server. Since we
> always map a domain name to the master openafs fileserver, is it
> possible to specify in the CellServDB file to always use domain name x
> as opposed to an IP? This feature would allow the system to reconfigure
> itself automatically once the DNS information is updated.
No. The file semantics don't support that.
However, you can enable AFSDB records (add the -afsdb flag to afsd), and
then use a zero-length CellServDB file, and all VLDB location information
will be resolved via AFSDB records.
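For illustration, AFSDB records for a cell might look like the zone
fragment below. The cell name, host names, and addresses are all
hypothetical; subtype 1 is the value that marks an AFS volume location
server.

```
; Hypothetical zone data for a cell named example.com
example.com.         IN  AFSDB  1  afsdb1.example.com.
example.com.         IN  AFSDB  1  afsdb2.example.com.
afsdb1.example.com.  IN  A      192.0.2.10
afsdb2.example.com.  IN  A      192.0.2.11
```

With records like these, replacing a database server means updating DNS
rather than touching each client.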
--
Russ Allbery (rra@stanford.edu)
From sxw@inf.ed.ac.uk Wed Apr 7 19:38:59 2010
From: sxw@inf.ed.ac.uk (Simon Wilkinson)
Date: Wed, 7 Apr 2010 19:38:59 +0100
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <87ljczt8vm.fsf@windlord.stanford.edu>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu>
Message-ID: <6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk>
On 7 Apr 2010, at 19:28, Russ Allbery wrote:
> I agree with your general impression
> that up until now we've not really been there on Linux, but we seem
> to be
> stabilizing.
Those of us actively developing on Linux have been running the 1.5
series for ages. The fact that other people are seeing problems would
seem to indicate that testing across a wider variety of systems is
required. Unfortunately, we don't have the time, or the systems, to do
this by ourselves. If folk are interested in getting a stable 1.5 (and
1.6) for Linux any time this millennium, then we need more people
testing the builds.
This particularly applies to those running old, or non-standard
kernels and running on odd platforms and architectures. One of the
bugs I fixed for Russ surfaced exactly because he was running a kernel
with slightly out of the ordinary memory management.
If RPM packages would help with this please let me know. So far all I
have heard is silence.
Simon.
From shadow@gmail.com Wed Apr 7 19:39:04 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Wed, 7 Apr 2010 14:39:04 -0400
Subject: [OpenAFS] Can client's CellServDB file rely on DNS?
In-Reply-To: <87hbnnt8t8.fsf@windlord.stanford.edu>
References:
<87hbnnt8t8.fsf@windlord.stanford.edu>
Message-ID:
On Wed, Apr 7, 2010 at 2:29 PM, Russ Allbery wrote:
> Ken Elkabany writes:
>
>> We have had to replace our master openafs fileserver several times. Each
>> time we have had to go through each client and update the CellServDB
>> file to reflect the IP address of the new replacement server. Since we
>> always map a domain name to the master openafs fileserver, is it
>> possible to specify in the CellServDB file to always use domain name x
>> as opposed to an IP? This feature would allow the system to reconfigure
>> itself automatically once the DNS information is updated.
>
> No. The file semantics don't support that.
>
> However, you can enable AFSDB records (add the -afsdb flag to afsd), and
> then use a zero-length CellServDB file, and all VLDB location information
> will be resolved via AFSDB records.
Or list just the cells you want this behavior for, e.g.
    >cell.name #Whatever
instead of
    >cell.name #Whatever
    IP #host
--
Derrick
From pontius@btv.ibm.com Wed Apr 7 19:40:21 2010
From: pontius@btv.ibm.com (Dale Pontius)
Date: Wed, 07 Apr 2010 14:40:21 -0400
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <87ljczt8vm.fsf@windlord.stanford.edu>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu>
Message-ID: <4BBCD195.9000902@btv.ibm.com>
On 04/07/10 14:28, Russ Allbery wrote:
> Dale Pontius writes:
>
>
>> At what point should the 1.5.xx series be considered "usable" on Linux?
>> I'm thinking usable as 1.4.xx is, not trying disconnected mode, at the
>> moment.
>>
> I finally got a working build on Debian with 1.5.73.3 plus two additional
> patches (one of which has been merged and one of which is still in
> review), so that's a good sign. :) I agree with your general impression
> that up until now we've not really been there on Linux, but we seem to be
> stabilizing.
>
> I would give it a while longer, though, before considering it as stable as
> 1.4.x. At least a couple more releases, I suspect.
>
>
>> I've tried Linux clients from 1.5.71-73 with varying amounts of success,
>> but in no case has it been sufficient. A quick look at my domain, and
>> you'll see that I'm likely connecting to a stunning (or annoying)
>> variety of servers, which might explain my results. In every case, I've
>> had "holes" in my data as viewed in /afs. Though those holes have
>> gotten smaller with each release, some still appear to be there with
>> 1.5.73. Is there something I can run that will furnish debug
>> information to be of some help?
>>
> The first thing I would try, based on my experience from yesterday, is to
> stop your AFS client and completely purge your cache directory. Then
> start it again and see if the holes have gone away.
>
A year or two back I was having some odd cache issues with the stable
client. I hacked the init script (This is Gentoo, by the way.) to clear
the cache each time right before starting the client. That solved my
problems back then, and at some point the init script got overwritten by
an update. But by then my cache problems were gone, so I didn't worry
about it. It's certainly easy to put back in.
> When switching from 1.4 to 1.5, I got some weird cache artifacts. I
> thought that was because I also had a kernel panic with the new version
> that could have left the cache in an inconsistent state, but I'm wondering
> if there may be some more basic upgrade problem.
>
> Other people have been switching back and forth without encountering this
> problem, so this may be a red herring, but it's worth a try.
>
I'll certainly try clearing the cache. But I'm also guessing that I'm
also talking to a greater-than-average variety of servers, and wondering
if there could be a vintage problem here. Is there a way to query a
mount point to find the server version providing that data? (Or perhaps
there's other relevant information to query.)
Thanks,
--
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: pontius@us.ibm.com
From sxw@inf.ed.ac.uk Wed Apr 7 19:49:33 2010
From: sxw@inf.ed.ac.uk (Simon Wilkinson)
Date: Wed, 7 Apr 2010 19:49:33 +0100
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <4BBCD195.9000902@btv.ibm.com>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu> <4BBCD195.9000902@btv.ibm.com>
Message-ID: <2FC93788-8A6C-45FB-BC25-B6525F578FEF@inf.ed.ac.uk>
>>
> A year or two back I was having some odd cache issues with the stable
> client. I hacked the init script (This is Gentoo, by the way.) to
> clear
> the cache each time right before starting the client.
And, of course, you reported these problems, and we fixed them in a
later release?
> I'll certainly try clearing the cache. But I'm also guessing that I'm
> also talking to a greater-than-average variety of servers, and
> wondering
> if there could be a vintage problem here. Is there a way to query a
> mount point to find the server version providing that data? (Or
> perhaps
> there's other relevant information to query.)
fs whereis . will tell you.
Cheers,
Simon.
From rra@stanford.edu Wed Apr 7 19:56:02 2010
From: rra@stanford.edu (Russ Allbery)
Date: Wed, 07 Apr 2010 11:56:02 -0700
Subject: [OpenAFS] Can client's CellServDB file rely on DNS?
In-Reply-To:
(Derrick Brashear's message of "Wed, 7 Apr 2010 14:39:04 -0400")
References:
<87hbnnt8t8.fsf@windlord.stanford.edu>
Message-ID: <87bpdvt7lp.fsf@windlord.stanford.edu>
Derrick Brashear writes:
> Or name just cells you want with this behavior as e.g.
> >cell.name #Whatever
> instead of
> >cell.name #Whatever
> IP #host
I didn't know you could do that. Added to the man page in:
http://gerrit.openafs.org/1710
--
Russ Allbery (rra@stanford.edu)
From rra@stanford.edu Wed Apr 7 19:59:52 2010
From: rra@stanford.edu (Russ Allbery)
Date: Wed, 07 Apr 2010 11:59:52 -0700
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk> (Simon
Wilkinson's message of "Wed, 7 Apr 2010 19:38:59 +0100")
References: <4BAD4D29.6040409@rampaginggeek.com>
<5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk>
<4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu>
<6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk>
Message-ID: <877hojt7fb.fsf@windlord.stanford.edu>
Simon Wilkinson writes:
> Those of us actively developing on Linux have been running the 1.5
> series for ages. The fact that other people are seeing problems would
> seem to indicate that testing across a wider variety of systems is
> required. Unfortunately, we don't have the time, or the systems, to do
> this by ourselves. If folk are interested in getting a stable 1.5 (and
> 1.6) for Linux any time this millennium, then we need more people testing
> the builds.
Debian packages will be available from experimental as soon as the fixes
for the things I ran into yesterday are finalized, which will make it
easier for some folks to test hopefully. At that point, I'll start
running the new client on all of my systems, but they're all recent
kernels and roughly the same basic set of packages, and my usage pattern
is not heavy under normal circumstances. So I definitely echo Simon here:
please test more and report any problems you encounter.
--
Russ Allbery (rra@stanford.edu)
From pontius@btv.ibm.com Wed Apr 7 20:06:59 2010
From: pontius@btv.ibm.com (Dale Pontius)
Date: Wed, 07 Apr 2010 15:06:59 -0400
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu> <6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk>
Message-ID: <4BBCD7D3.7010506@btv.ibm.com>
On 04/07/10 14:38, Simon Wilkinson wrote:
>
> On 7 Apr 2010, at 19:28, Russ Allbery wrote:
>
>> I agree with your general impression
>> that up until now we've not really been there on Linux, but we seem
>> to be
>> stabilizing.
>
> Those of us actively developing on Linux have been running the 1.5
> series for ages. The fact that other people are seeing problems would
> seem to indicate that testing across a wider variety of systems is
> required. Unfortunately, we don't have the time, or the systems, to do
> this by ourselves. If folk are interested in getting a stable 1.5 (and
> 1.6) for Linux any time this millennium, then we need more people
> testing the builds.
I'll try putting more effort into testing 1.5.x releases. I suspect my
employer has a wider-than-average variety of hardware and servers available.
>
> This particularly applies to those running old, or non-standard
> kernels and running on odd platforms and architectures. One of the
> bugs I fixed for Russ surfaced exactly because he was running a kernel
> with slightly out of the ordinary memory management.
>
> If RPM packages would help with this please let me know. So far all I
> have heard is silence.
I generally run Gentoo, and it's very simple to move to a new release.
All I need is the source tarball and a few ebuild tweaks on my side. On
a T61p I can switch afs versions in 15 minutes or so, so it's more a
matter of not absolutely needing that machine during the testing interval.
Thanks,
--
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: pontius@us.ibm.com
From pontius@btv.ibm.com Wed Apr 7 20:14:31 2010
From: pontius@btv.ibm.com (Dale Pontius)
Date: Wed, 07 Apr 2010 15:14:31 -0400
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <2FC93788-8A6C-45FB-BC25-B6525F578FEF@inf.ed.ac.uk>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu> <4BBCD195.9000902@btv.ibm.com> <2FC93788-8A6C-45FB-BC25-B6525F578FEF@inf.ed.ac.uk>
Message-ID: <4BBCD997.2010905@btv.ibm.com>
On 04/07/10 14:49, Simon Wilkinson wrote:
>>>
>> A year or two back I was having some odd cache issues with the stable
>> client. I hacked the init script (This is Gentoo, by the way.) to clear
>> the cache each time right before starting the client.
>
> And, of course, you reported these problems, and we fixed them in a
> later release?
I believe I reported them to my distro, which is probably the annoying
thing for you to hear, because I suspect such things frequently don't
make it upstream.
>
>> I'll certainly try clearing the cache. But I'm also guessing that I'm
>> also talking to a greater-than-average variety of servers, and wondering
>> if there could be a vintage problem here. Is there a way to query a
>> mount point to find the server version providing that data? (Or perhaps
>> there's other relevant information to query.)
>
> fs whereis . will tell you.
Thanks, that does give the server name. Now is there a command that
will give meaningful and useful (to you) information about that server?
I doubt I have any sort of shell access to any of them.
Thanks,
--
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: pontius@us.ibm.com
From rra@stanford.edu Wed Apr 7 20:24:05 2010
From: rra@stanford.edu (Russ Allbery)
Date: Wed, 07 Apr 2010 12:24:05 -0700
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <4BBCD997.2010905@btv.ibm.com> (Dale Pontius's message of "Wed,
07 Apr 2010 15:14:31 -0400")
References: <4BAD4D29.6040409@rampaginggeek.com>
<5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk>
<4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu>
<4BBCD195.9000902@btv.ibm.com>
<2FC93788-8A6C-45FB-BC25-B6525F578FEF@inf.ed.ac.uk>
<4BBCD997.2010905@btv.ibm.com>
Message-ID: <87vdc3rrqi.fsf@windlord.stanford.edu>
Dale Pontius writes:
> On 04/07/10 14:49, Simon Wilkinson wrote:
>> fs whereis . will tell you.
> Thanks, that does give the server name. Now is there a command that
> will give meaningful and useful (to you) information about that server?
> I doubt I have any sort of shell access to any of them.
rxdebug <server> 7000 -version
will display the version of AFS the file server is running.
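To check several mount points at once, the two commands can be combined
as sketched below. The parsing assumes that `fs whereis` prints lines of
the form "File <path> is on host <name>" (or "hosts" followed by several
names); that output shape is an assumption here, and the function names
are illustrative.

```shell
#!/bin/sh
# Sketch only: report the file server version behind one or more paths.

# Read `fs whereis` output on stdin; print one host name per line.
# Assumed input: "File <path> is on host(s) <name> [<name> ...]"
hosts_from_whereis() {
    sed -n 's/^File .* is on hosts\{0,1\} //p' | tr -s ' ' '\n' | sed '/^$/d'
}

# For each path, query every server holding it on port 7000, the
# fileserver port used in this thread.
server_versions() {
    for path in "$@"; do
        fs whereis "$path" | hosts_from_whereis | while read -r host; do
            printf '%s %s: ' "$path" "$host"
            # Keep rxdebug from reading the host list on stdin.
            rxdebug "$host" 7000 -version < /dev/null
        done
    done
}

# Example (paths are hypothetical):
#   server_versions /afs/example.com/proj /afs/example.com/user
```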
--
Russ Allbery (rra@stanford.edu)
From adeason@sinenomine.net Wed Apr 7 20:24:50 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Wed, 7 Apr 2010 14:24:50 -0500
Subject: [OpenAFS] Re: Linux packages for 1.5?
References: <4BAD4D29.6040409@rampaginggeek.com>
<5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk>
<4BBCCDC5.9090805@btv.ibm.com>
<87ljczt8vm.fsf@windlord.stanford.edu>
<4BBCD195.9000902@btv.ibm.com>
<2FC93788-8A6C-45FB-BC25-B6525F578FEF@inf.ed.ac.uk>
<4BBCD997.2010905@btv.ibm.com>
Message-ID: <20100407142450.eb5f3778.adeason@sinenomine.net>
On Wed, 07 Apr 2010 15:14:31 -0400
Dale Pontius wrote:
> > fs whereis . will tell you.
>
> Thanks, that does give the server name. Now is there a command that
> will give meaningful and useful (to you) information about that
> server? I doubt I have any sort of shell access to any of them.
rxdebug <server> <port> -version
will give you a clue as to what code they are running. But everything
you've said so far sounds more like the client, to me.
--
Andrew Deason
adeason@sinenomine.net
From brandon.m.simmons@gmail.com Thu Apr 8 03:10:53 2010
From: brandon.m.simmons@gmail.com (Brandon Simmons)
Date: Wed, 7 Apr 2010 22:10:53 -0400
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an appropriate
application for AFS?
Message-ID:
I have a web application in which I would like many client web-servers
to be able to read and write to many separate and modestly-sized
sqlite databases, exported by a master server. Each database
corresponds to an account, so we might have several concurrent users
accessing an individual DB every few seconds.
I have been testing with NFS and haven't had any problems but I'm
concerned with issues of file-locking and caching problems, which the
whole internet seems to be warning about. Would AFS be appropriate for
this?
Thanks for any advice,
Brandon
From utoddl@email.unc.edu Thu Apr 8 12:22:09 2010
From: utoddl@email.unc.edu (Todd Lewis)
Date: Thu, 08 Apr 2010 07:22:09 -0400
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an
appropriate application for AFS?
In-Reply-To:
References:
Message-ID: <4BBDBC61.9000609@email.unc.edu>
On 04/07/2010 10:10 PM, Brandon Simmons sent:
> I have a web application in which I would like many client web-servers
> to be able to read and write to many separate and modestly-sized
> sqlite databases, exported by a master server. Each database
> corresponds to an account, so we might have several concurrent users
> accessing an individual DB every few seconds.
>
> I have been testing with NFS and haven't had any problems but I'm
> concerned with issues of file-locking and caching problems, which the
> whole internet seems to be warning about. Would AFS be appropriate for
> this?
In a word, no. If your multiple clients were on the same host, then that
host could enforce the locking sqlite attempts, but from multiple hosts
you lose.
I happen to be facing exactly that same problem at the moment, so I'm
hopeful (doubtful, but hopeful) someone will step up and prove me wrong.
> Thanks for any advice,
> Brandon
Same here, only not the Brandon part.
--
+--------------------------------------------------------------+
/ Todd_Lewis@unc.edu 919-445-9302 http://www.unc.edu/~utoddl /
/ "He had delusions of adequacy." - Walter Kerr /
+--------------------------------------------------------------+
From sxw@inf.ed.ac.uk Thu Apr 8 12:29:41 2010
From: sxw@inf.ed.ac.uk (Simon Wilkinson)
Date: Thu, 8 Apr 2010 12:29:41 +0100
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an appropriate application for AFS?
In-Reply-To: <4BBDBC61.9000609@email.unc.edu>
References: <4BBDBC61.9000609@email.unc.edu>
Message-ID:
On 8 Apr 2010, at 12:22, Todd Lewis wrote:
> In a word, no. If your multiple clients were on the same host, then
> that
> host could enforce the locking sqlite attempts, but from multiple
> hosts
> you lose.
This is actually only true on Linux. On other operating systems,
OpenAFS doesn't enforce byte range locks at all, so you lose even if
all of the lock requests are from the same host.
Someone (Matt, IIRC) was working on server support for byte range
locking, but I don't think we've seen any code yet.
Cheers,
Simon.
From pontius@btv.ibm.com Thu Apr 8 15:39:26 2010
From: pontius@btv.ibm.com (Dale Pontius)
Date: Thu, 08 Apr 2010 10:39:26 -0400
Subject: [OpenAFS] Re: Linux packages for 1.5?
In-Reply-To: <20100407142450.eb5f3778.adeason@sinenomine.net>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu> <4BBCD195.9000902@btv.ibm.com> <2FC93788-8A6C-45FB-BC25-B6525F578FEF@inf.ed.ac.uk> <4BBCD997.2010905@btv.ibm.com> <20100407142450.eb5f3778.adeason@sinenomine.net>
Message-ID: <4BBDEA9E.5090009@btv.ibm.com>
On 04/07/10 15:24, Andrew Deason wrote:
> On Wed, 07 Apr 2010 15:14:31 -0400
> Dale Pontius wrote:
>
>
>>> fs whereis . will tell you.
>>>
>> Thanks, that does give the server name. Now is there a command that
>> will give meaningful and useful (to you) information about that
>> server? I doubt I have any sort of shell access to any of them.
>>
> rxdebug -version
>
> will give you a clue as to what code they are running. But everything
> you've said so far sounds more like the client, to me
I'm rerunning my big jobstream using 1.5.73 now, except that I
have tweaked the init script to clear the cache before loading the
kernel module. So far things are looking better. The first checking
tool, which failed last time, has already passed. There are quite a few
more job steps to go, so it might not get done before my ride home is
ready to leave. But for a few more jollies, some of the distinct server
types I have available are:
AFS version: Base configuration afs3.6 2.64;Iavinesh-ID71117-afs3.6-AIX-large
AFS version: Base configuration afs3.6 2.67
AFS version: Base configuration afs3.6 2.64;Navinesh-ID71117-afs3.6-AIX-large
This list is most likely not exhaustive - it just comes from 16 mount
points I could think of off the top of my head and tuck into a quick script.
OK - I had problems, again. Clearly better than last time, but not
clean. In this case, I'm parked in a directory that looks really
weird. Most of the subdirectories are shown, but have no contents.
There is a symlink that says it points nowhere, (Symlink target is
blank) yet it has contents. UIDs and sizes appear to be sheer
gibberish. The server type is:
AFS version: Base configuration afs3.6
2.64;Iavinesh-ID71117-afs3.6-AIX-largeP
----------------------------------------
I didn't finish this note before I had to leave yesterday, so it stayed
in the drafts folder overnight. The laptop was powered off overnight,
and started up fresh this morning.
Further observations...
If I query the server type from my 32-bit machine running 1.4.12, server
types are as in the list above. If I run that same script from my
64-bit laptop running 1.5.73, I get the same server types, except the
"-large" at the end becomes "-largeP".
As mentioned, yesterday I had missing data on some mount points.
However I couldn't seem to correlate it to server type. This morning I
decided to first retry the jobstream experiment from yesterday, and
today everything runs clean. So at this point I'm not sure how to
proceed, since everything appears to be fully functional. There are
some additional tools I can fire up, which will most likely touch other
volumes. I suppose I can also continue running with 1.5.73, and keep
looking for errors.
Yesterday there were problems, today there are none, so far.
Intermittent problems are the MOST fun.
Thanks,
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: pontius@us.ibm.com
From phalenor@gmail.com Thu Apr 8 17:51:04 2010
From: phalenor@gmail.com (Andy Cobaugh)
Date: Thu, 8 Apr 2010 12:51:04 -0400 (EDT)
Subject: [OpenAFS] Linux packages for 1.5?
In-Reply-To: <6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk>
References: <4BAD4D29.6040409@rampaginggeek.com> <5D4F5B07-1054-4428-A77C-7BBD26A82196@inf.ed.ac.uk> <4BBCCDC5.9090805@btv.ibm.com> <87ljczt8vm.fsf@windlord.stanford.edu> <6DDDE213-0AC2-4BC2-8F88-DA24FA654364@inf.ed.ac.uk>
Message-ID:
On 2010-04-07 at 19:38, Simon Wilkinson ( sxw@inf.ed.ac.uk ) said:
>
> Those of us actively developing on Linux have been running the 1.5 series for
> ages. The fact that other people are seeing problems would seem to indicate
> that testing across a wider variety of systems is required. Unfortunately, we
> don't have the time, or the systems, to do this by ourselves. If folk are
> interested in getting a stable 1.5 (and 1.6) for Linux any time this
> millennium, then we need more people testing the builds.
>
> This particularly applies to those running old, or non-standard kernels and
> running on odd platforms and architectures. One of the bugs I fixed for Russ
> surfaced exactly because he was running a kernel with slightly out of the
> ordinary memory management.
>
> If RPM packages would help with this please let me know. So far all I have
> heard is silence.
Yes, please - if not for every 1.5.x release, at least for the ones you
want tested. I'm probably in a position here where I could start pushing
1.5 onto certain desktops/workstations around here, and having ready-made
RPMs for 1.5 will make that task that much easier.
--andy
From matt@linuxbox.com Thu Apr 8 19:18:13 2010
From: matt@linuxbox.com (Matt W. Benjamin)
Date: Thu, 8 Apr 2010 14:18:13 -0400 (EDT)
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an
appropriate application for AFS?
In-Reply-To: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com>
Message-ID: <1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
Hi,
Simon is correct. A byte-range locking implementation for OpenAFS is being funded by Your File System, Inc., under its DOE SBIR Phase II grant. As stated elsewhere by Jeff, there are (or will be) structures for making completed work available to the community during the course of the work.
However, my understanding is that shared r/w access to sqlite through AFS probably does work, provided you ensure sqlite uses the correct locking style (cf. sqlite's os_unix.c):
#define SQLITE_WHOLE_FILE_LOCKING 0x0001 /* Use whole-file locking */
This feature is apparently due to Adam Megacz, who posted briefly about it in 2006. See http://marc.info/?l=sqlite-users&m=116742195016159&w=2 .
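
For comparison, here is a rough application-level analogue of whole-file locking. It is not SQLite's internal SQLITE_WHOLE_FILE_LOCKING path. It simply serializes all access to the database behind an exclusive `flock()` on a sidecar file, and it only helps if whole-file advisory locks are honored across clients on your filesystem (an assumption you should test). The helper name `locked_db` and the ".lock" suffix are hypothetical.

```python
import contextlib
import fcntl
import sqlite3


@contextlib.contextmanager
def locked_db(path):
    """Open `path` with sqlite3 while holding an exclusive whole-file lock.

    The lock is taken on a sidecar file (path + ".lock"), not on the
    database itself, so SQLite's own locking is left untouched.
    """
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)   # blocks until we own the lock
        conn = sqlite3.connect(path)
        try:
            yield conn
            conn.commit()                       # only reached if the body succeeded
        finally:
            conn.close()
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Every process that touches the database has to go through the same helper; a writer that bypasses it is not held back by the lock.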
Regards,
Matt
----- "Simon Wilkinson" wrote:
> On 8 Apr 2010, at 12:22, Todd Lewis wrote:
>
>
> Someone (Matt, IIRC) was working on server support for byte range
> locking, but I don't think we've seen any code yet.
>
> Cheers,
>
> Simon.
--
Matt Benjamin
The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI 48104
http://linuxbox.com
tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309
From brandon.m.simmons@gmail.com Thu Apr 8 21:06:18 2010
From: brandon.m.simmons@gmail.com (Brandon Simmons)
Date: Thu, 8 Apr 2010 16:06:18 -0400
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an
appropriate application for AFS?
In-Reply-To: <1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
References: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com>
<1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
Message-ID:
On Thu, Apr 8, 2010 at 2:18 PM, Matt W. Benjamin wrote:
> Hi,
>
> Simon is correct. A byte-range locking implementation for OpenAFS is being funded by Your File System, Inc., under its DOE SBIR Phase II grant. As stated elsewhere by Jeff, there are (or will be) structures for making completed available to the community during the course of the work.
>
> However, my understanding that shared r/w access to sqlite through AFS probably does work, provided you ensure sqlite uses the correct locking style (cf. sqlite's os_unix.c):
>
> #define SQLITE_WHOLE_FILE_LOCKING 0x0001 /* Use whole-file locking */
>
> This feature is apparently due to Adam Megacz, who posted briefly about it in 2006. See http://marc.info/?l=sqlite-users&m=116742195016159&w=2 .
>
> Regards,
>
> Matt
>
Thanks for the response. It seems like whole-file locking in sqlite
would be a good choice for me in any case, and I can't imagine needing
that kind of writing concurrency.
Doing a little more research, this message describes a few more issues
with sqlite over NFS which I suppose might apply to AFS:
http://old.nabble.com/SQLite-on-NFS-cache-coherency-td15655701.html
In a situation where the whole-file locking scheme is used, would AFS
be an acceptable choice? Would it be better than NFS?
For instance I envision a handful of clients on different machines
each writing to a single sqlite DB every few seconds; would this
defeat AFS's caching scheme?
Thanks for the thoughtful responses.
> ----- "Simon Wilkinson" wrote:
>
>> On 8 Apr 2010, at 12:22, Todd Lewis wrote:
>>
>
>>
>> Someone (Matt, IIRC) was working on server support for byte range
>> locking, but I don't think we've seen any code yet.
>>
>> Cheers,
>>
>> Simon.
>
> --
>
> Matt Benjamin
>
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI 48104
>
> http://linuxbox.com
>
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309
>
From kula@tproa.net Thu Apr 8 21:38:16 2010
From: kula@tproa.net (Thomas Kula)
Date: Thu, 8 Apr 2010 16:38:16 -0400
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an
appropriate application for AFS?
In-Reply-To:
References: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com>
<1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
Message-ID: <20100408203816.GJ17256@mcketrick.tproa.net>
On Thu, Apr 08, 2010 at 04:06:18PM -0400, Brandon Simmons wrote:
>
> Thanks for the response. It seems like whole-file locking in sqlite
> would be a good choice for me in any case, and I can't imagine needing
> that kind of writing concurrency.
>
> Doing a little more research, this message describes a few more issues
> with sqlite over NFS which I suppose might apply to AFS:
>
> http://old.nabble.com/SQLite-on-NFS-cache-coherency-td15655701.html
>
> In a situation where the whole-file locking scheme is used, would AFS
> be an acceptable choice? Would it be better than NFS?
>
> For instance I envision a handful of clients on different machines
> each writing to a single sqlite DB every few seconds; would this
> defeat AFS's caching scheme?
>
Basically, every time that sqlite db file is changed the fileserver
will have to notify all of the clients that have callbacks on
that file, and then all the clients will have to go fetch that
file again (or the chunk that changed, maybe, I can never remember
that detail). It does kinda defeat caching; whether you still get
good enough performance for it to be useful is unknown.
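
To make the cost concrete, here is a toy model of that bookkeeping. It is not the real AFS protocol. It only counts how many callback breaks the server sends and how many re-fetches the clients do when every client reads and then writes the same file in turn. The class and function names are made up for illustration.

```python
from collections import defaultdict


class ToyFileserver:
    """Toy model of callback bookkeeping; not the real AFS protocol."""

    def __init__(self):
        self.callbacks = defaultdict(set)  # file name -> clients holding a callback
        self.breaks_sent = 0

    def fetch(self, client, name):
        # Fetching a file registers a callback promise for that client.
        self.callbacks[name].add(client)

    def store(self, client, name):
        # A write breaks the callback of every other client caching the file.
        others = self.callbacks[name] - {client}
        self.breaks_sent += len(others)
        self.callbacks[name] = {client}
        return others  # these clients must re-fetch before their next read


def simulate(num_clients, rounds, name="db.sqlite"):
    """Each round, every client reads the file (if stale) and writes it once."""
    server = ToyFileserver()
    refetches = 0
    for _ in range(rounds):
        for client in range(num_clients):
            if client not in server.callbacks[name]:
                server.fetch(client, name)
                refetches += 1
            server.store(client, name)
    return server.breaks_sent, refetches
```

In this model a single client fetches once and never has its callback broken, while with several writers nearly every access turns into a break plus a re-fetch, which is the "kinda defeats caching" effect described above.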
This kind of thing does make me twitchy, since it is starting to
sound close to something that happens around here: every so often
someone (usually a student) re-invents this notion of finding a
lab full of unused machines, logging on to all of them and turning
AFS into a rather horrid and ineffective message passing interface.
Now, they usually do this at a higher rate than you are anticipating
(dozens or hundreds of file creations/modifications/deletions per
second --- the really fun ones renew their credentials every time
as well, to the point where we've got a local unit of measurement
named after a user who did a rather impressive level of this...),
and at that level, breaking callbacks on a large number of vnodes
for a large number of clients at a high rate usually makes that
particular fileserver unhappy.
So, the question of concurrency and if sqlite will do the right
thing aside, you may want to try a good 200% expected load from
some number of clients on a volume on a fileserver you don't
particularly care about first and make sure you aren't going to
make things unhappy for yourself.
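
A load test along those lines might be sketched as follows. It starts several writers against one sqlite file, each with its own connection, and reports how many rows landed and how long it took. The table name, function names and defaults are all hypothetical. Threads in one process only approximate the real case, so for a meaningful test you would run it from several client machines at once against a path in a scratch volume.

```python
import sqlite3
import threading
import time


def writer(db_path, writer_id, writes, interval):
    """Insert `writes` rows, one transaction each, pausing `interval` seconds."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait up to 30 s on a locked db
    try:
        for seq in range(writes):
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO samples VALUES (?, ?, ?)",
                             (writer_id, seq, time.time()))
            time.sleep(interval)
    finally:
        conn.close()


def run_load(db_path, writers=4, writes=10, interval=0.0):
    """Run concurrent writers; return (rows written, elapsed seconds)."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS samples "
                 "(writer INTEGER, seq INTEGER, ts REAL)")
    conn.commit()
    conn.close()

    start = time.time()
    threads = [threading.Thread(target=writer,
                                args=(db_path, i, writes, interval))
               for i in range(writers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.time() - start

    conn = sqlite3.connect(db_path)
    count = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
    conn.close()
    return count, elapsed
```

If the returned row count is lower than `writers * writes`, some writer gave up on a locked database, which is itself a useful result. Doubling `writers` or halving `interval` relative to the expected load gives the 200% case.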
You may also want to make sure that other common AFS operations
don't cause problems. The only one that springs to mind off hand
is the brief period, when a volume is backed up, during which the
volume is locked so the backup clone can be created. There may be other
examples that other folks can think of. Of course, your application
should be making sure it handles these kinds of things already
anyways, because even local disk likes to be goofy on occasion.
I love sqlite, but I don't use sqlite databases that are located
in AFS outside of a single user (me) from a single machine (my
workstation) and a single process that happen to be looking at a
db file that just happens to be in AFS because, well, then I'll
know where that file is and know it's getting backed up. Your
mileage may vary, offer void except when it isn't, etc. etc.
--
Thomas L. Kula | kula@tproa.net | http://kula.tproa.net/
From Todd_Lewis@unc.edu Thu Apr 8 21:43:55 2010
From: Todd_Lewis@unc.edu (Todd Lewis)
Date: Thu, 08 Apr 2010 16:43:55 -0400
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an
appropriate application for AFS?
In-Reply-To:
References: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com> <1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
Message-ID: <4BBE400B.1020004@email.unc.edu>
On 04/08/2010 04:06 PM, Brandon Simmons wrote:
> For instance I envision a handful of clients on different machines
> each writing to a single sqlite DB every few seconds; would this
> defeat AFS's caching scheme?
>
> Thanks for the thoughtful responses.
Every few seconds your cached data is going to be invalidated, which will make
sure your server stays thoroughly utilized.
Sqlite shines when you don't need a data base daemon running on some central
server and access requirements fit file ACLs. You don't have that. You need to
distribute at a higher level. Bite the bullet and set up a real db you can
connect to and write from multiple clients, and let it do what it's designed
to do: arbitrate writes to maintain data integrity. AFS doesn't solve this
problem (nor does NFS). Sorry.
--
Todd_Lewis@unc.edu
From adeason@sinenomine.net Thu Apr 8 22:03:59 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Thu, 8 Apr 2010 16:03:59 -0500
Subject: [OpenAFS] Re: Shared r/w access to numerous sqlite databases: an appropriate
application for AFS?
References: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com>
<1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
<4BBE400B.1020004@email.unc.edu>
Message-ID: <20100408160359.ad073d63.adeason@sinenomine.net>
On Thu, 08 Apr 2010 16:43:55 -0400
Todd Lewis wrote:
>
>
> On 04/08/2010 04:06 PM, Brandon Simmons wrote:
> > For instance I envision a handful of clients on different machines
> > each writing to a single sqlite DB every few seconds; would this
> > defeat AFS's caching scheme?
As others have said, it'll be slow. But it should work; more likely to
work than NFS, anyway. Just don't turn off sqlite's fsync()ing behavior
(don't change the 'PRAGMA synchronous' setting from the default FULL).
As for the slowness in particular: if one of the clients that was
accessing the database goes down, the next write could take many, many
seconds to complete.
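
One way to guard against that setting drifting is to set and check it whenever a connection is opened. The sketch below uses Python's standard sqlite3 module. The helper name is hypothetical, and the number-to-name table follows SQLite's documented `PRAGMA synchronous` values.

```python
import sqlite3

# Values reported by "PRAGMA synchronous", per the SQLite documentation.
SYNCHRONOUS_NAMES = {0: "OFF", 1: "NORMAL", 2: "FULL", 3: "EXTRA"}


def ensure_full_sync(conn):
    """Set PRAGMA synchronous to FULL and return the resulting mode name."""
    conn.execute("PRAGMA synchronous = FULL")
    level = conn.execute("PRAGMA synchronous").fetchone()[0]
    return SYNCHRONOUS_NAMES.get(level, str(level))
```

An application could call this right after `sqlite3.connect(...)` and refuse to continue if the result is not "FULL".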
> arbitrate writes to maintain data integrity. AFS doesn't solve this
> problem (nor does NFS). Sorry.
I believe this would be a lot more efficient with XCB, if the written
sections are untouched by the other clients. It's not impossible for AFS
to do better.
--
Andrew Deason
adeason@sinenomine.net
From dirk.heinrichs@online.de Thu Apr 8 22:12:12 2010
From: dirk.heinrichs@online.de (Dirk Heinrichs)
Date: Thu, 08 Apr 2010 23:12:12 +0200
Subject: [OpenAFS] Shared r/w access to numerous sqlite databases: an appropriate application for AFS?
References: <4BBDBC61.9000609@email.unc.edu>
Message-ID: <201004082312.13198.dirk.heinrichs@online.de>
On Thursday 08 April 2010 13:22:09, Todd Lewis wrote:
> I happen to be facing exactly that same problem at the moment, so I'm
> hopeful (doubtful, but hopeful) someone will step up and prove me wrong.
Well, I won't. But why don't you both simply install a real Db server, like
PostgreSQL, for example?
Bye...
Dirk
From bampfamd@berkeley.edu Thu Apr 8 23:19:54 2010
From: bampfamd@berkeley.edu (bampfamd@berkeley.edu)
Date: Thu, 8 Apr 2010 15:19:54 -0700
Subject: [OpenAFS] openAFS 1.4.12 Kernel Panic on restart? (mac)
Message-ID: <532c8ee43e0c7a407fd39c5097b004f5.squirrel@calmail.berkeley.edu>
So I installed OpenAFS 1.4.12 on a Leopard Mac OS X computer earlier and
tried running it (I had a slow AFS loading issue earlier...but this
version fixed that). Even though this version fixed that slow loading
issue, I get the Kernel Panic message (the "You need to power down your
computer...") every time I try to power down the computer/restart it. I
assume that the kernel panic happens when the system is trying to unmount
the AFS server...but as of right now, it happens every single time I turn
off my computer.
If the kernel panic log is needed I will post one up. But does anyone have
any idea as of right now?
From shadow@gmail.com Fri Apr 9 04:51:59 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Thu, 8 Apr 2010 23:51:59 -0400
Subject: [OpenAFS] openAFS 1.4.12 Kernel Panic on restart? (mac)
In-Reply-To: <532c8ee43e0c7a407fd39c5097b004f5.squirrel@calmail.berkeley.edu>
References: <532c8ee43e0c7a407fd39c5097b004f5.squirrel@calmail.berkeley.edu>
Message-ID:
On Thu, Apr 8, 2010 at 6:19 PM, wrote:
> So i installed OpenAFS 1.4.12 on a Leopard Mac OS X computer earlier and
> tried running it (i had a slow afs loading issue earlier...but this
> version fixed that). Even though this version fixed that slow loading
> issue, I get the Kernel Panic message (the "You need to power down your
> computer...") every time I try to power down the computer/restart it. I
> assume that the kernel panic happens when the system is trying to unmount
> the AFS server...but as of right now, it happens every single time I turn
> off my computer.
>
> If the kernel panic log is needed I will post one up. But does anyone have
> any idea as of right now?
a decoded panic log would be ideal, if you could. the decode-panic
script should be able to help with that.
From boyland@cs.uwm.edu Fri Apr 9 17:08:47 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Fri, 09 Apr 2010 11:08:47 -0500
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
Message-ID: <29935.1270829327@pabst.cs.uwm.edu>
We get an occasional deadlock happening on Solaris 5.10 using
OpenAFS 1.4.11. After the problem starts, any attempt to use AFS
on the machine freezes: For example:
% truss -f touch /afs/not-here
15694: execve("/usr/bin/touch", 0x08047E20, 0x08047E2C) argc = 2
15694: resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
15694: resolvepath("/usr/bin/touch", "/usr/bin/touch", 1023) = 14
15694: sysconfig(_CONFIG_PAGESIZE) = 4096
15694: xstat(2, "/usr/bin/touch", 0x08047BF8) = 0
15694: open("/var/ld/ld.config", O_RDONLY) = 3
15694: fxstat(2, 3, 0x08047B38) = 0
15694: mmap(0x00000000, 128104, PROT_READ, MAP_SHARED, 3, 0) = 0xFEFA1000
15694: close(3) = 0
15694: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEF90000
15694: xstat(2, "/lib/libc.so.1", 0x08047440) = 0
15694: resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
15694: open("/lib/libc.so.1", O_RDONLY) = 3
15694: mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEF80000
15694: mmap(0x00010000, 880640, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEEA0000
15694: mmap(0xFEEA0000, 775469, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEEA0000
15694: mmap(0xFEF6E000, 26855, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 778240) = 0xFEF6E000
15694: mmap(0xFEF75000, 5016, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF75000
15694: munmap(0xFEF5E000, 65536) = 0
15694: memcntl(0xFEEA0000, 123376, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
15694: close(3) = 0
15694: munmap(0xFEF80000, 32768) = 0
15694: mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF80000
15694: getcontext(0x080479B0)
15694: getrlimit(RLIMIT_STACK, 0x080479A8) = 0
15694: getpid() = 15694 [15692]
15694: lwp_private(0, 1, 0xFEF82000) = 0x000001C3
15694: setustack(0xFEF82060)
15694: sysi86(SI86FPSTART, 0xFEF75A58, 0x0000133F, 0x00001F80) = 0x00000001
15694: brk(0x08062758) = 0
15694: brk(0x08064758) = 0
15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", 0x08046D08) = 0
15694: resolvepath("/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", "/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", 1023) = 44
15694: open("/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", O_RDONLY) = 3
15694: mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEE90000
15694: mmap(0x00010000, 2297856, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEC00000
15694: mmap(0xFEC00000, 2225278, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEC00000
15694: mmap(0xFEE2F000, 4234, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 2224128) = 0xFEE2F000
15694: munmap(0xFEE20000, 61440) = 0
15694: memcntl(0xFEC00000, 7188, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
15694: close(3) = 0
15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", 0x08046C60) = 0
15694: resolvepath("/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", "/usr/lib/locale/common/methods_unicode.so.3", 1023) = 43
15694: open("/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", O_RDONLY) = 3
15694: mmap(0xFEE90000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFEE90000
15694: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEE80000
15694: mmap(0x00010000, 122880, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE60000
15694: mmap(0xFEE60000, 55437, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE60000
15694: mmap(0xFEE7D000, 2524, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 53248) = 0xFEE7D000
15694: munmap(0xFEE6E000, 61440) = 0
15694: memcntl(0xFEE60000, 2532, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
15694: close(3) = 0
15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/libc.so.1", 0x08046C60) Err#2 ENOENT
15694: munmap(0xFEE90000, 32768) = 0
15694: sysconfig(_CONFIG_PAGESIZE) = 4096
FREEZE
On a machine which has not had the problem (yet...) the output continues
...
24608: sysconfig(_CONFIG_PAGESIZE) = 4096
24608: stat64("/afs/not-here", 0x08047C90) Err#2 ENOENT
24608: creat64("/afs/not-here", 0666) Err#30 EROFS
24608: open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES/SUNW_OST_OSCMD.mo", O_RDONLY) Err#2 ENOENT
24608: fstat64(2, 0x08046EA0) = 0
24608: write(2, " t o u c h", 5) = 5
24608: write(2, " : ", 2) = 2
24608: write(2, " / a f s / n o t - h e r".., 13) = 13
24608: write(2, " c a n n o t c r e a".., 15) = 15
24608: _exit(1)
In other words, the stat64 call accesses AFS and (on the machine
with the problem), the thread gets stuck in the AFS tarbaby.
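
A check like this can be automated without hanging the shell that runs it. The sketch below performs the stat() in a child process and gives up after a timeout. The function name and the three result strings are hypothetical, and it is only a probe for the symptom, not a diagnosis of the deadlock.

```python
import subprocess
import sys


def probe(path, timeout=10.0):
    """Return 'ok', 'error', or 'hung' for a stat() of `path` in a child process."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        status = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # A child stuck inside the kernel may ignore SIGKILL, so do not
        # wait for it again; just report the hang.
        child.kill()
        return "hung"
    return "ok" if status == 0 else "error"
```

On a healthy client `probe("/afs/not-here")` should come back as "error" (the path does not exist), while on an affected machine it would be expected to report "hung".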
I suspected it was due to logging, so I changed the configuration to
mount a dedicated partition for /usr/vice/cache, and rebooted. The
machine was fine for a month or two, but the problem has recurred.
The machine is used frequently (it's our main computer server for
undergraduate classes) but "fortunately" AFS is not very popular here
so most courses don't use it (partly because of nasty things like
this happening now and then) and so the machine is still being used
for non-AFS courses. Hence I hadn't tried to install a newer version
of OpenAFS. If this is a known bug with OpenAFS, I will indeed
ask them to take the machine offline long enough to fix this.
(Political capital and all; I hope people understand.)
I haven't tried to reproduce this bug (and wouldn't want to
on the computer server!): it only seems to happen on these
main compute servers -- never on my little research machines... :-(
Any help would be appreciated.
John
From shadow@gmail.com Fri Apr 9 17:26:26 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Fri, 9 Apr 2010 12:26:26 -0400
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: <29935.1270829327@pabst.cs.uwm.edu>
References: <29935.1270829327@pabst.cs.uwm.edu>
Message-ID:
cmdebug or it didn't happen.
On Fri, Apr 9, 2010 at 12:08 PM, John Tang Boyland
wrote:
> We get an occasional deadlock happening on Solaris 5.10 using
> OpenAFS 1.4.11. After the problem starts, any attempt to use AFS
> on the machine freezes: For example:
>
> % truss -f touch /afs/not-here
> 15694: execve("/usr/bin/touch", 0x08047E20, 0x08047E2C) argc = 2
> 15694: resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
> 15694: resolvepath("/usr/bin/touch", "/usr/bin/touch", 1023) = 14
> 15694: sysconfig(_CONFIG_PAGESIZE) = 4096
> 15694: xstat(2, "/usr/bin/touch", 0x08047BF8) = 0
> 15694: open("/var/ld/ld.config", O_RDONLY) = 3
> 15694: fxstat(2, 3, 0x08047B38) = 0
> 15694: mmap(0x00000000, 128104, PROT_READ, MAP_SHARED, 3, 0) = 0xFEFA1000
> 15694: close(3) = 0
> 15694: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEF90000
> 15694: xstat(2, "/lib/libc.so.1", 0x08047440) = 0
> 15694: resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
> 15694: open("/lib/libc.so.1", O_RDONLY) = 3
> 15694: mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEF80000
> 15694: mmap(0x00010000, 880640, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEEA0000
> 15694: mmap(0xFEEA0000, 775469, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEEA0000
> 15694: mmap(0xFEF6E000, 26855, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 778240) = 0xFEF6E000
> 15694: mmap(0xFEF75000, 5016, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF75000
> 15694: munmap(0xFEF5E000, 65536) = 0
> 15694: memcntl(0xFEEA0000, 123376, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
> 15694: close(3) = 0
> 15694: munmap(0xFEF80000, 32768) = 0
> 15694: mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF80000
> 15694: getcontext(0x080479B0)
> 15694: getrlimit(RLIMIT_STACK, 0x080479A8) = 0
> 15694: getpid() = 15694 [15692]
> 15694: lwp_private(0, 1, 0xFEF82000) = 0x000001C3
> 15694: setustack(0xFEF82060)
> 15694: sysi86(SI86FPSTART, 0xFEF75A58, 0x0000133F, 0x00001F80) = 0x00000001
> 15694: brk(0x08062758) = 0
> 15694: brk(0x08064758) = 0
> 15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", 0x08046D08) = 0
> 15694: resolvepath("/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", "/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", 1023) = 44
> 15694: open("/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", O_RDONLY) = 3
> 15694: mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEE90000
> 15694: mmap(0x00010000, 2297856, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEC00000
> 15694: mmap(0xFEC00000, 2225278, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEC00000
> 15694: mmap(0xFEE2F000, 4234, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 2224128) = 0xFEE2F000
> 15694: munmap(0xFEE20000, 61440) = 0
> 15694: memcntl(0xFEC00000, 7188, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
> 15694: close(3) = 0
> 15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", 0x08046C60) = 0
> 15694: resolvepath("/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", "/usr/lib/locale/common/methods_unicode.so.3", 1023) = 43
> 15694: open("/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", O_RDONLY) = 3
> 15694: mmap(0xFEE90000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFEE90000
> 15694: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEE80000
> 15694: mmap(0x00010000, 122880, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE60000
> 15694: mmap(0xFEE60000, 55437, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE60000
> 15694: mmap(0xFEE7D000, 2524, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 53248) = 0xFEE7D000
> 15694: munmap(0xFEE6E000, 61440) = 0
> 15694: memcntl(0xFEE60000, 2532, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
> 15694: close(3) = 0
> 15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/libc.so.1", 0x08046C60) Err#2 ENOENT
> 15694: munmap(0xFEE90000, 32768) = 0
> 15694: sysconfig(_CONFIG_PAGESIZE) = 4096
>
> FREEZE
>
> On a machine which has not had the problem (yet...) the output continues
> ...
> 24608: sysconfig(_CONFIG_PAGESIZE) = 4096
> 24608: stat64("/afs/not-here", 0x08047C90) Err#2 ENOENT
> 24608: creat64("/afs/not-here", 0666) Err#30 EROFS
> 24608: open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES/SUNW_OST_OSCMD.mo", O_RDONLY) Err#2 ENOENT
> 24608: fstat64(2, 0x08046EA0) = 0
> 24608: write(2, " t o u c h", 5) = 5
> 24608: write(2, " : ", 2) = 2
> 24608: write(2, " / a f s / n o t - h e r".., 13) = 13
> 24608: write(2, " c a n n o t c r e a".., 15) = 15
> 24608: _exit(1)
>
> In other words, the stat64 call accesses AFS and (on the machine
> with the problem), the thread gets stuck in the AFS tarbaby.
>
> I suspected it was due to logging, so I changed the configuration to
> mount a dedicated partition for /usr/vice/cache, and rebooted. The
> machine was fine for a month or two, but problem has re-occurred.
>
> The machine is used frequently (it's our main computer server for
> undergraduate classes) but "fortunately" AFS is not very popular here
> so most courses don't use it (partly because of nasty things like
> this happening now and then) and so the machine is still being used
> for non-AFS courses. Hence I hadn't tried to install a newer version
> of OpenAFS. If this is a known bug with OpenAFS, I will indeed
> ask them to take the machine offline long enough to fix this.
> (Political capital and all; I hope people understand.)
>
> I haven't tried to reproduce this bug (and wouldn't want to
> on the computer server!): it only seems to happen on these
> main compute servers -- never on my little research machines... :-(
>
> Any help would be appreciated.
>
> John
--
Derrick
From deengert@anl.gov Fri Apr 9 17:33:48 2010
From: deengert@anl.gov (Douglas E. Engert)
Date: Fri, 09 Apr 2010 11:33:48 -0500
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: <29935.1270829327@pabst.cs.uwm.edu>
References: <29935.1270829327@pabst.cs.uwm.edu>
Message-ID: <4BBF56EC.7090900@anl.gov>
What happens if you export LANG=C into your environment?
Based on your trace, the freeze comes before AFS does anything. Or
is it just that truss has not written all its output?
John Tang Boyland wrote:
> We get an occasional deadlock happening on Solaris 5.10 using
> OpenAFS 1.4.11. After the problem starts, any attempt to use AFS
> on the machine freezes: For example:
>
> % truss -f touch /afs/not-here
> 15694: execve("/usr/bin/touch", 0x08047E20, 0x08047E2C) argc = 2
> 15694: resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
> 15694: resolvepath("/usr/bin/touch", "/usr/bin/touch", 1023) = 14
> 15694: sysconfig(_CONFIG_PAGESIZE) = 4096
> 15694: xstat(2, "/usr/bin/touch", 0x08047BF8) = 0
> 15694: open("/var/ld/ld.config", O_RDONLY) = 3
> 15694: fxstat(2, 3, 0x08047B38) = 0
> 15694: mmap(0x00000000, 128104, PROT_READ, MAP_SHARED, 3, 0) = 0xFEFA1000
> 15694: close(3) = 0
> 15694: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEF90000
> 15694: xstat(2, "/lib/libc.so.1", 0x08047440) = 0
> 15694: resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
> 15694: open("/lib/libc.so.1", O_RDONLY) = 3
> 15694: mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEF80000
> 15694: mmap(0x00010000, 880640, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEEA0000
> 15694: mmap(0xFEEA0000, 775469, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEEA0000
> 15694: mmap(0xFEF6E000, 26855, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 778240) = 0xFEF6E000
> 15694: mmap(0xFEF75000, 5016, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF75000
> 15694: munmap(0xFEF5E000, 65536) = 0
> 15694: memcntl(0xFEEA0000, 123376, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
> 15694: close(3) = 0
> 15694: munmap(0xFEF80000, 32768) = 0
> 15694: mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF80000
> 15694: getcontext(0x080479B0)
> 15694: getrlimit(RLIMIT_STACK, 0x080479A8) = 0
> 15694: getpid() = 15694 [15692]
> 15694: lwp_private(0, 1, 0xFEF82000) = 0x000001C3
> 15694: setustack(0xFEF82060)
> 15694: sysi86(SI86FPSTART, 0xFEF75A58, 0x0000133F, 0x00001F80) = 0x00000001
> 15694: brk(0x08062758) = 0
> 15694: brk(0x08064758) = 0
> 15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", 0x08046D08) = 0
> 15694: resolvepath("/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", "/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", 1023) = 44
> 15694: open("/usr/lib/locale/en_US.UTF-8/en_US.UTF-8.so.3", O_RDONLY) = 3
> 15694: mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEE90000
> 15694: mmap(0x00010000, 2297856, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEC00000
> 15694: mmap(0xFEC00000, 2225278, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEC00000
> 15694: mmap(0xFEE2F000, 4234, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 2224128) = 0xFEE2F000
> 15694: munmap(0xFEE20000, 61440) = 0
> 15694: memcntl(0xFEC00000, 7188, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
> 15694: close(3) = 0
> 15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", 0x08046C60) = 0
> 15694: resolvepath("/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", "/usr/lib/locale/common/methods_unicode.so.3", 1023) = 43
> 15694: open("/usr/lib/locale/en_US.UTF-8/methods_en_US.UTF-8.so.3", O_RDONLY) = 3
> 15694: mmap(0xFEE90000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFEE90000
> 15694: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEE80000
> 15694: mmap(0x00010000, 122880, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE60000
> 15694: mmap(0xFEE60000, 55437, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE60000
> 15694: mmap(0xFEE7D000, 2524, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 53248) = 0xFEE7D000
> 15694: munmap(0xFEE6E000, 61440) = 0
> 15694: memcntl(0xFEE60000, 2532, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
> 15694: close(3) = 0
> 15694: xstat(2, "/usr/lib/locale/en_US.UTF-8/libc.so.1", 0x08046C60) Err#2 ENOENT
> 15694: munmap(0xFEE90000, 32768) = 0
> 15694: sysconfig(_CONFIG_PAGESIZE) = 4096
>
> FREEZE
>
> On a machine which has not had the problem (yet...) the output continues
> ...
> 24608: sysconfig(_CONFIG_PAGESIZE) = 4096
> 24608: stat64("/afs/not-here", 0x08047C90) Err#2 ENOENT
> 24608: creat64("/afs/not-here", 0666) Err#30 EROFS
> 24608: open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES/SUNW_OST_OSCMD.mo", O_RDONLY) Err#2 ENOENT
> 24608: fstat64(2, 0x08046EA0) = 0
> 24608: write(2, " t o u c h", 5) = 5
> 24608: write(2, " : ", 2) = 2
> 24608: write(2, " / a f s / n o t - h e r".., 13) = 13
> 24608: write(2, " c a n n o t c r e a".., 15) = 15
> 24608: _exit(1)
>
> In other words, the stat64 call accesses AFS and (on the machine
> with the problem), the thread gets stuck in the AFS tarbaby.
>
> I suspected it was due to logging, so I changed the configuration to
> mount a dedicated partition for /usr/vice/cache, and rebooted. The
> machine was fine for a month or two, but the problem has recurred.
>
> The machine is used frequently (it's our main computer server for
> undergraduate classes) but "fortunately" AFS is not very popular here
> so most courses don't use it (partly because of nasty things like
> this happening now and then) and so the machine is still being used
> for non-AFS courses. Hence I hadn't tried to install a newer version
> of OpenAFS. If this is a known bug with OpenAFS, I will indeed
> ask them to take the machine offline long enough to fix this.
> (Political capital and all; I hope people understand.)
>
> I haven't tried to reproduce this bug (and wouldn't want to
> on the computer server!): it only seems to happen on these
> main compute servers -- never on my little research machines... :-(
>
> Any help would be appreciated.
>
> John
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
>
--
Douglas E. Engert
Argonne National Laboratory
9700 South Cass Avenue
Argonne, Illinois 60439
(630) 252-5444
From boyland@cs.uwm.edu Fri Apr 9 18:36:05 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Fri, 09 Apr 2010 12:36:05 -0500
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: Your message of "Fri, 09 Apr 2010 12:26:26 EDT."
Message-ID: <595.1270834565@pabst.cs.uwm.edu>
] cmdebug or it didn't happen.
]
] On Fri, Apr 9, 2010 at 12:08 PM, John Tang Boyland
] wrote:
] > We get an occasional deadlock happening on Solaris 5.10 using
] > OpenAFS 1.4.11. After the problem starts, any attempt to use AFS
] > on the machine freezes: For example:
I foolishly thought that with every AFS access deadlocking, cmdebug
wouldn't work. But it does....
** Cache entry @ 0xa26da3f0 for 1.536875155.530.1418 [cs.uwm.edu]
locks: (reader_waiting, write_locked(pid:17732 at:250), 2 waiters)
26532 bytes DV 197 refcnt 3
callback 00000000 expires 1270781645
1 opens 0 writers
normal file
states (0x0)
** Cache entry @ 0xa27651d0 for 1.536875155.608.1458 [cs.uwm.edu]
locks: (upgrade_waiting, write_locked(pid:17679 at:66), 12 waiters)
19094744 bytes DV 109 refcnt 13
callback 00000000 expires 1270782798
0 opens 0 writers
normal file
states (0x0)
** Cache entry @ 0xa2710018 for 1.536875155.610.1499 [cs.uwm.edu]
locks: (none_waiting, write_locked(pid:17889 at:250))
23240 bytes DV 1 refcnt 1
callback 00000000 expires 1270782798
1 opens 0 writers
normal file
states (0x0)
** Cache entry @ 0xa26bf000 for 1.536873892.1.1 [cs.uwm.edu]
locks: (writer_waiting, 7 read_locks(pid:18421), 41 waiters)
2048 bytes DV 8 refcnt 49
callback 00000000 expires 1270774475
0 opens 0 writers
volume root
states (0x4), read-only
** Cache entry @ 0xa27c99a0 for 1.536874783.4234.9047 [cs.uwm.edu]
locks: (none_waiting, write_locked(pid:17834 at:250))
1641 bytes DV 1 refcnt 1
callback 00000000 expires 1270782797
1 opens 0 writers
normal file
states (0x0)
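
A minimal sketch, assuming cmdebug output in the format shown above, of
pulling the lock-holder PIDs and waiter counts out of the text so they can
be fed to a later debugging step. The function name parse_cmdebug and the
dictionary keys are made up for illustration; they are not part of any
OpenAFS tool.

```python
import re

# Two entries copied from the cmdebug output above.
SAMPLE = """\
** Cache entry @ 0xa27651d0 for 1.536875155.608.1458 [cs.uwm.edu]
    locks: (upgrade_waiting, write_locked(pid:17679 at:66), 12 waiters)
** Cache entry @ 0xa26bf000 for 1.536873892.1.1 [cs.uwm.edu]
    locks: (writer_waiting, 7 read_locks(pid:18421), 41 waiters)
"""

ENTRY_RE = re.compile(r"^\*\* Cache entry @ (0x[0-9a-f]+) for (\S+)")
PID_RE = re.compile(r"pid:(\d+)")
WAITERS_RE = re.compile(r"(\d+) waiters")


def parse_cmdebug(text):
    """Return one dict per cache entry: address, identifier, holder PIDs, waiters."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            entries.append(
                {"addr": m.group(1), "ident": m.group(2), "pids": [], "waiters": 0}
            )
            continue
        if entries and line.strip().startswith("locks:"):
            entries[-1]["pids"] = [int(p) for p in PID_RE.findall(line)]
            w = WAITERS_RE.search(line)
            # Entries with no "N waiters" text are treated as zero waiters.
            entries[-1]["waiters"] = int(w.group(1)) if w else 0
    return entries


if __name__ == "__main__":
    # Print the most contended entries first.
    for e in sorted(parse_cmdebug(SAMPLE), key=lambda e: -e["waiters"]):
        print(e["ident"], e["pids"], e["waiters"])
```

Sorting by waiter count puts the entry with 41 waiters at the top, which is
the one most worth identifying first.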
John
From boyland@cs.uwm.edu Fri Apr 9 18:39:43 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Fri, 09 Apr 2010 12:39:43 -0500
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: Your message of "Fri, 09 Apr 2010 11:33:48 CDT."
Message-ID: <634.1270834783@pabst.cs.uwm.edu>
] What happens if you have exported LANG=C into your environment.
...
19813: lwp_private(0, 1, 0xFEF82000) = 0x000001C3
19813: setustack(0xFEF82060)
19813: sysi86(SI86FPSTART, 0xFEF75A58, 0x0000133F, 0x00001F80) = 0x00000001
19813: brk(0x08062758) = 0
19813: brk(0x08064758) = 0
FREEZE
On the machine WITHOUT the problem, it continues:
27096: stat64("/afs/not-here", 0x08047CA0) Err#2 ENOENT
27096: creat64("/afs/not-here", 0666) Err#30 EROFS
27096: fstat64(2, 0x08046EB0) = 0
27096: write(2, " t o u c h", 5) = 5
...
] Based on you trace the freeze comes before AFS does anything. Or
] is just the truss has not written all its output.
The freeze comes EXACTLY when AFS is asked to do something. truss
writes output once the call to stat64 returns, which it never does.
John Boyland
From shadow@gmail.com Fri Apr 9 19:21:31 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Fri, 9 Apr 2010 14:21:31 -0400
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: <595.1270834565@pabst.cs.uwm.edu>
References: <595.1270834565@pabst.cs.uwm.edu>
Message-ID:
On Fri, Apr 9, 2010 at 1:36 PM, John Tang Boyland
wrote:
> ] cmdebug or it didn't happen.
> ]
> ] On Fri, Apr 9, 2010 at 12:08 PM, John Tang Boyland
> ] wrote:
> ] > We get an occasional deadlock happening on Solaris 5.10 using
> ] > OpenAFS 1.4.11. After the problem starts, any attempt to use AFS
> ] > on the machine freezes: For example:
>
> I foolishly thought that with every AFS access deadlocking, cmdebug
> wouldn't work. But it does....
That's the best time to try it.
>
> ** Cache entry @ 0xa26da3f0 for 1.536875155.530.1418 [cs.uwm.edu]
>    locks: (reader_waiting, write_locked(pid:17732 at:250), 2 waiters)
>    26532 bytes DV 197 refcnt 3
>    callback 00000000 expires 1270781645
>    1 opens 0 writers
>    normal file
>    states (0x0)
ok, that's the rdwr vnode op, so that makes some sense.
> ** Cache entry @ 0xa27651d0 for 1.536875155.608.1458 [cs.uwm.edu]
>    locks: (upgrade_waiting, write_locked(pid:17679 at:66), 12 waiters)
>    19094744 bytes DV 109 refcnt 13
>    callback 00000000 expires 1270782798
>    0 opens 0 writers
>    normal file
>    states (0x0)
and 66 is GetDCache
> ** Cache entry @ 0xa2710018 for 1.536875155.610.1499 [cs.uwm.edu]
>    locks: (none_waiting, write_locked(pid:17889 at:250))
>    23240 bytes DV 1 refcnt 1
>    callback 00000000 expires 1270782798
>    1 opens 0 writers
>    normal file
>    states (0x0)
> ** Cache entry @ 0xa26bf000 for 1.536873892.1.1 [cs.uwm.edu]
>    locks: (writer_waiting, 7 read_locks(pid:18421), 41 waiters)
>    2048 bytes DV 8 refcnt 49
>    callback 00000000 expires 1270774475
>    0 opens 0 writers
>    volume root
>    states (0x4), read-only
> ** Cache entry @ 0xa27c99a0 for 1.536874783.4234.9047 [cs.uwm.edu]
>    locks: (none_waiting, write_locked(pid:17834 at:250))
>    1641 bytes DV 1 refcnt 1
>    callback 00000000 expires 1270782797
>    1 opens 0 writers
>    normal file
>    states (0x0)
Can you get the fids of the files in question?
--
Derrick
From shadow@gmail.com Fri Apr 9 19:27:24 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Fri, 9 Apr 2010 14:27:24 -0400
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To:
References: <595.1270834565@pabst.cs.uwm.edu>
Message-ID:
On Fri, Apr 9, 2010 at 2:21 PM, Derrick Brashear wrote:
> On Fri, Apr 9, 2010 at 1:36 PM, John Tang Boyland
> wrote:
>> ] cmdebug or it didn't happen.
>> ]
>> ] On Fri, Apr 9, 2010 at 12:08 PM, John Tang Boyland
>> ] wrote:
>> ] > We get an occasional deadlock happening on Solaris 5.10 using
>> ] > OpenAFS 1.4.11. After the problem starts, any attempt to use AFS
>> ] > on the machine freezes: For example:
>>
>> I foolishly thought that with every AFS access deadlocking, cmdebug
>> wouldn't work. But it does....
>
> That's the best time to try it.
Almost certainly a dcache lock (which cmdebug doesn't show, alas)
involved here and causing this.
I suppose the other thing which may help: can you collect fstrace for this?
From adeason@sinenomine.net Fri Apr 9 19:38:23 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Fri, 9 Apr 2010 13:38:23 -0500
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
References: <595.1270834565@pabst.cs.uwm.edu>
Message-ID: <20100409133823.36f41ce3.adeason@sinenomine.net>
On Fri, 9 Apr 2010 14:27:24 -0400
Derrick Brashear wrote:
> Almost certainly a dcache lock (which cmdebug doesn't show, alas)
> involved here and causing this.
>
> I suppose the other thing which may help: can you collect fstrace for
> this?
What about the kernel stack trace for the proc(s) from mdb? Or do you
know where we're hanging?
--
Andrew Deason
adeason@sinenomine.net
From shadow@gmail.com Fri Apr 9 19:48:34 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Fri, 9 Apr 2010 14:48:34 -0400
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: <20100409133823.36f41ce3.adeason@sinenomine.net>
References: <595.1270834565@pabst.cs.uwm.edu>
<20100409133823.36f41ce3.adeason@sinenomine.net>
Message-ID:
On Fri, Apr 9, 2010 at 2:38 PM, Andrew Deason wrote:
> On Fri, 9 Apr 2010 14:27:24 -0400
> Derrick Brashear wrote:
>
>> Almost certainly a dcache lock (which cmdebug doesn't show, alas)
>> involved here and causing this.
>>
>> I suppose the other thing which may help: can you collect fstrace for
>> this?
>
> What about the kernel stack trace for the proc(s) from mdb? Or do you
> know where we're hanging?
nope. i figured fstrace would make it easier to guess that but jumping
directly to a stack trace is probably a fine course of action.
--
Derrick
From adeason@sinenomine.net Fri Apr 9 20:45:32 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Fri, 9 Apr 2010 14:45:32 -0500
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
References: <595.1270834565@pabst.cs.uwm.edu>
<20100409133823.36f41ce3.adeason@sinenomine.net>
Message-ID: <20100409144532.40a800cf.adeason@sinenomine.net>
On Fri, 9 Apr 2010 14:48:34 -0400
Derrick Brashear wrote:
> > What about the kernel stack trace for the proc(s) from mdb? Or do you
> > know where we're hanging?
>
> nope. i figured fstrace would make it easier to guess that but jumping
> directly to a stack trace is probably a fine course of action.
John, if you want to do this, do the following for each PID listed in
that cmdebug output:
("$pid", "ffffffffaddress1", and "ffffffffaddress2" etc are placeholders)
# mdb -k
> 0t$pid::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
ffffffffaddress1 ffffffffaddress2 0 XXX/YYY
> ffffffffaddress2::findstack
So, as an example, looking at process 674:
> 0t674::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
ffffffff83e74908 ffffffff832de020 0 /239
> ffffffff832de020::findstack
[stack trace]
(make sure you don't see anything sensitive in there, though I don't
think there would be)
fstrace may give more information on how we came to that point, but this
should tell us why someone is hanging with the lock we're waiting for...
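
With several PIDs to check, the mdb input lines from the steps above can be
generated rather than typed. This is only a sketch: the helper names are
hypothetical, it builds strings and does not run mdb, and it takes the
second column of the ::threadlist row for ::findstack exactly as the
placeholder example above does.

```python
def threadlist_cmds(pids):
    """Build one '0t<pid>::pid2proc | ::threadlist' line per PID.

    The 0t prefix tells mdb the number is decimal.
    """
    return ["0t%d::pid2proc | ::threadlist" % pid for pid in pids]


def findstack_cmd(threadlist_row):
    """Build the ::findstack line from a ::threadlist data row.

    Uses the second whitespace-separated column, following the
    placeholder example in the instructions above.
    """
    columns = threadlist_row.split()
    return columns[1] + "::findstack"


if __name__ == "__main__":
    # PIDs as reported by cmdebug earlier in the thread.
    for line in threadlist_cmds([17732, 17679, 17889, 18421, 17834]):
        print(line)
    # Row shape taken from the process 674 example above.
    print(findstack_cmd("ffffffff83e74908 ffffffff832de020 0 /239"))
```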
--
Andrew Deason
adeason@sinenomine.net
From boyland@cs.uwm.edu Fri Apr 9 22:57:37 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Fri, 09 Apr 2010 16:57:37 -0500
Subject: [OpenAFS] deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: Your message of "Fri, 09 Apr 2010 14:21:31 EDT."
Message-ID: <1640.1270850257@pabst.cs.uwm.edu>
Derrick Brashear writes:
] [...]
] John Boyland writes:
] <...>
] > ** Cache entry @ 0xa26da3f0 for 1.536875155.530.1418 [cs.uwm.edu]
] > locks: (reader_waiting, write_locked(pid:17732 at:250), 2 waiters)
] > 26532 bytes DV 197 refcnt 3
] > callback 00000000 expires 1270781645
] > 1 opens 0 writers
] > normal file
] > states (0x0)
]
] ok, that's the rdwr vnode op, so that makes some sense.
]
] > ** Cache entry @ 0xa27651d0 for 1.536875155.608.1458 [cs.uwm.edu]
] > locks: (upgrade_waiting, write_locked(pid:17679 at:66), 12 waiters)
] > 19094744 bytes DV 109 refcnt 13
] > callback 00000000 expires 1270782798
] > 0 opens 0 writers
] > normal file
] > states (0x0)
]
] and 66 is GetDCache
] > The person who generated this process says they were running a process
] > that was writing debugging output redirected into a file and there was a
] > loop so the file got very large.
]
] hm. did it really hang, or just get very slow?
I was told that at first things got slow, but then it stopped
altogether. They then moved to a different machine and read the file
from there. Meanwhile, on the bad machine, ANY afs file action hangs.
] > ** Cache entry @ 0xa2710018 for 1.536875155.610.1499 [cs.uwm.edu]
] > locks: (none_waiting, write_locked(pid:17889 at:250))
] > 23240 bytes DV 1 refcnt 1
] > callback 00000000 expires 1270782798
] > 1 opens 0 writers
] > normal file
] > states (0x0)
] > ** Cache entry @ 0xa26bf000 for 1.536873892.1.1 [cs.uwm.edu]
] > locks: (writer_waiting, 7 read_locks(pid:18421), 41 waiters)
] > 2048 bytes DV 8 refcnt 49
] > callback 00000000 expires 1270774475
] > 0 opens 0 writers
] > volume root
] > states (0x4), read-only
This last one (with 41 waiters) is /afs:
good% /usr/afsws/bin/fs getfid /afs
File /afs (536873892.1.1) contained in volume 536873892
(on the bad machine, I can't touch /usr/afsws of course since it is a
link into /afs)
cmdebug now shows 71 waiters -- of course it increases whenever someone
touches afs and joins the tarbaby.
] Almost certainly a dcache lock (which cmdebug doesn't show, alas)
] involved here and causing this.
]
] I suppose the other thing which may help: can you collect fstrace for this?
bad# ./fstrace setset cm -active
[ during this time I touch /afs/not-here and freeze ]
bad# ./fstrace dump cm
AFS Trace Dump -
Date: Fri Apr 9 16:50:14 2010
Found 1 logs.
Contents of log cmfx:
AFS Trace Dump - Completed
bad# ./fstrace setset cm -inactive
John Boyland
From shadow@gmail.com Sat Apr 10 05:51:29 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Sat, 10 Apr 2010 00:51:29 -0400
Subject: [OpenAFS] Reminder! Early Registration Discount for the OpenAFS & Kerberos Best
Practices Workshop 2010 ends April 14!
Message-ID:
Registration for the OpenAFS & Kerberos Best Practices Workshop is
available on the website, http://workshop.openafs.org/.
Register by April 14, 2010 to get the best prices. AFS and Kerberos
tutorials are $100 each, the Workshop itself is $150, or register for
all three for only $300.
After April 14, prices will go up, so register now and save.
A tentative schedule is available. Further details, including
social events, are still forthcoming.
Hotel and travel information is also available.
We'll be looking forward to meeting you at Illinois next month!
Derrick,
for the Workshop Organizers
http://workshop.openafs.org/
From boyland@cs.uwm.edu Sat Apr 10 18:54:30 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Sat, 10 Apr 2010 12:54:30 -0500
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: Your message of "Sat, 10 Apr 2010 12:01:03 EDT."
Message-ID: <2575.1270922070@pabst.cs.uwm.edu>
[BTW: I get openafs-info messages digested.]
Andrew Deason wrote:
] On Fri, 9 Apr 2010 14:48:34 -0400
] Derrick Brashear wrote:
]
] > > What about the kernel stack trace for the proc(s) from mdb? Or do you
] > > know where we're hanging?
] >
] > nope. i figured fstrace would make it easier to guess that but jumping
] > directly to a stack trace is probably a fine course of action.
]
] John, if you want to do this, do the following for each PID listed in
] that cmdebug output:
]
] ("$pid", "ffffffffaddress1", and "ffffffffaddress2" etc are placeholders)
]
] # mdb -k
] > 0t$pid::pid2proc | ::threadlist
] ADDR PROC LWP CMD/LWPID
] ffffffffaddress1 ffffffffaddress2 0 XXX/YYY
] > ffffffffaddress2::findstack
]
] So, as an example, looking at process 674:
]
] > 0t674::pid2proc | ::threadlist
] ADDR PROC LWP CMD/LWPID
] ffffffff83e74908 ffffffff832de020 0 /239
] > ffffffff832de020::findstack
] [stack trace]
Thanks for the detailed instructions. I never even knew mdb existed.
BTW:
process 17679 is the one writing the LONG file that seemed to
initiate the deadlock. I notice it is inside "FetchWholeEnchilada".
process 18421 is the one listed for the cmdebug entry for /afs:
the root directory of the whole AFS system and the cm entry with the
most waiters.
> 0t17732::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
fffffe84baca3a98 fffffe8925fa2500 0 /239
> fffffe8925fa2500::findstack
stack pointer for thread fffffe8925fa2500: fffffe8002fce5a0
[ fffffe8002fce5a0 _resume_from_idle+0xf8() ]
fffffe8002fce5d0 swtch+0x110()
fffffe8002fce5f0 cv_wait+0x68()
fffffe8002fce640 afs_osi_Sleep+0x99()
fffffe8002fce6c0 Afs_Lock_Obtain+0x1cb()
fffffe8002fce780 afs_putpage+0x14a()
fffffe8002fce7f0 osi_VM_GetDownD+0xe8()
fffffe8002fce9c0 afs_GetDownD+0x7ed()
fffffe8002fceb90 afs_GetDCache+0x713()
fffffe8002fcecc0 afs_nfsrdwr+0xd19()
fffffe8002fced30 afs_vmread+0x89()
fffffe8002fced80 fop_read+0x31()
fffffe8002fceeb0 read+0x188()
fffffe8002fceec0 read32+0xe()
fffffe8002fcef10 sys_syscall32+0x101()
> 0t17679::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
fffffe84c16f55a8 fffffe84bae85500 0 /239
> fffffe84bae85500::findstack
stack pointer for thread fffffe84bae85500: fffffe8003244640
[ fffffe8003244640 _resume_from_idle+0xf8() ]
fffffe8003244670 swtch+0x110()
fffffe8003244690 cv_wait+0x68()
fffffe80032446e0 afs_osi_Sleep+0x99()
fffffe8003244760 Afs_Lock_Obtain+0x1cb()
fffffe8003244820 afs_putpage+0x14a()
fffffe8003244890 osi_VM_GetDownD+0xe8()
fffffe8003244a60 afs_GetDownD+0x7ed()
fffffe8003244c30 afs_GetDCache+0x713()
fffffe8003244cb0 FetchWholeEnchilada+0xf4()
fffffe8003244d80 afs_remove+0x7eb()
fffffe8003244de0 gafs_remove+0x4f()
fffffe8003244e10 fop_remove+0x25()
fffffe8003244ea0 vn_removeat+0x228()
fffffe8003244eb0 vn_remove+0x12()
fffffe8003244ec0 unlink+0xd()
fffffe8003244f10 sys_syscall32+0x101()
> 0t17889::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
fffffe84b1c218d0 fffffe8926218840 0 /239
> fffffe8926218840::findstack
stack pointer for thread fffffe8926218840: fffffe8001cff5a0
[ fffffe8001cff5a0 _resume_from_idle+0xf8() ]
fffffe8001cff5d0 swtch+0x110()
fffffe8001cff5f0 cv_wait+0x68()
fffffe8001cff640 afs_osi_Sleep+0x99()
fffffe8001cff6c0 Afs_Lock_Obtain+0x1cb()
fffffe8001cff780 afs_putpage+0x14a()
fffffe8001cff7f0 osi_VM_GetDownD+0xe8()
fffffe8001cff9c0 afs_GetDownD+0x7ed()
fffffe8001cffb90 afs_GetDCache+0x713()
fffffe8001cffcc0 afs_nfsrdwr+0xd19()
fffffe8001cffd30 afs_vmread+0x89()
fffffe8001cffd80 fop_read+0x31()
fffffe8001cffeb0 read+0x188()
fffffe8001cffec0 read32+0xe()
fffffe8001cfff10 sys_syscall32+0x101()
> 0t18421::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
fffffe84b6b60de0 fffffe8905b2d280 0 /239
> fffffe8905b2d280::findstack
stack pointer for thread fffffe8905b2d280: fffffe8000365300
[ fffffe8000365300 _resume_from_idle+0xf8() ]
fffffe8000365330 swtch+0x110()
fffffe8000365350 cv_wait+0x68()
fffffe80003653a0 afs_osi_Sleep+0x99()
fffffe8000365420 Afs_Lock_Obtain+0x1cb()
fffffe80003654e0 afs_putpage+0x14a()
fffffe8000365550 osi_VM_GetDownD+0xe8()
fffffe8000365720 afs_GetDownD+0x7ed()
fffffe80003658f0 afs_GetDCache+0x6f8()
fffffe8000365a20 afs_lookup+0x700()
fffffe8000365aa0 gafs_lookup+0x70()
fffffe8000365af0 fop_lookup+0x35()
fffffe8000365cc0 lookuppnvp+0x1bf()
fffffe8000365d30 lookuppnat+0xf9()
fffffe8000365df0 lookupnameat+0x86()
fffffe8000365e50 cstatat_getvp+0x115()
fffffe8000365eb0 cstatat64_32+0x4c()
fffffe8000365ec0 stat64_32+0x22()
fffffe8000365f10 sys_syscall32+0x101()
> 0t17834::pid2proc | ::threadlist
ADDR PROC LWP CMD/LWPID
fffffe84c2cbc008 fffffe8516ab2720 0 ?????????K?/239
> fffffe8516ab2720::findstack
stack pointer for thread fffffe8516ab2720: fffffe8002fda5a0
[ fffffe8002fda5a0 _resume_from_idle+0xf8() ]
fffffe8002fda5d0 swtch+0x110()
fffffe8002fda5f0 cv_wait+0x68()
fffffe8002fda640 afs_osi_Sleep+0x99()
fffffe8002fda6c0 Afs_Lock_Obtain+0x1cb()
fffffe8002fda780 afs_putpage+0x14a()
fffffe8002fda7f0 osi_VM_GetDownD+0xe8()
fffffe8002fda9c0 afs_GetDownD+0x7ed()
fffffe8002fdab90 afs_GetDCache+0x713()
fffffe8002fdacc0 afs_nfsrdwr+0xd19()
fffffe8002fdad30 afs_vmread+0x89()
fffffe8002fdad80 fop_read+0x31()
fffffe8002fdaeb0 read+0x188()
fffffe8002fdaec0 read32+0xe()
fffffe8002fdaf10 sys_syscall32+0x101()
>
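
A small sketch for comparing ::findstack dumps like the ones above: it
extracts the function names from each stack and reports the frames that
the stacks share, starting from the innermost one. The sample text is
copied from the traces for PIDs 17679 and 18421 (trimmed after the first
differing frames); the helper names are made up for illustration.

```python
import re

STACK_17679 = """\
stack pointer for thread fffffe84bae85500: fffffe8003244640
[ fffffe8003244640 _resume_from_idle+0xf8() ]
fffffe8003244670 swtch+0x110()
fffffe8003244690 cv_wait+0x68()
fffffe80032446e0 afs_osi_Sleep+0x99()
fffffe8003244760 Afs_Lock_Obtain+0x1cb()
fffffe8003244820 afs_putpage+0x14a()
fffffe8003244890 osi_VM_GetDownD+0xe8()
fffffe8003244a60 afs_GetDownD+0x7ed()
fffffe8003244c30 afs_GetDCache+0x713()
fffffe8003244cb0 FetchWholeEnchilada+0xf4()
fffffe8003244d80 afs_remove+0x7eb()
"""

STACK_18421 = """\
stack pointer for thread fffffe8905b2d280: fffffe8000365300
[ fffffe8000365300 _resume_from_idle+0xf8() ]
fffffe8000365330 swtch+0x110()
fffffe8000365350 cv_wait+0x68()
fffffe80003653a0 afs_osi_Sleep+0x99()
fffffe8000365420 Afs_Lock_Obtain+0x1cb()
fffffe80003654e0 afs_putpage+0x14a()
fffffe8000365550 osi_VM_GetDownD+0xe8()
fffffe8000365720 afs_GetDownD+0x7ed()
fffffe80003658f0 afs_GetDCache+0x6f8()
fffffe8000365a20 afs_lookup+0x700()
fffffe8000365aa0 gafs_lookup+0x70()
"""

# "<hex address> <function>+0x<offset>()", optionally wrapped in [ ].
FRAME_RE = re.compile(r"^\[?\s*[0-9a-f]+\s+(\w+)\+0x[0-9a-f]+\(\)")


def frames(findstack_text):
    """Return function names from a findstack dump, innermost frame first."""
    names = []
    for line in findstack_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            names.append(m.group(1))
    return names


def common_inner_frames(stacks):
    """Return the leading (innermost) frames that every stack shares."""
    shared = []
    for names in zip(*stacks):
        if len(set(names)) != 1:
            break
        shared.append(names[0])
    return shared


if __name__ == "__main__":
    stacks = [frames(STACK_17679), frames(STACK_18421)]
    print(" <- ".join(common_inner_frames(stacks)))
```

On these two samples the shared frames run from _resume_from_idle up to
afs_GetDCache; the callers differ above that point (FetchWholeEnchilada
versus afs_lookup).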
From adam@megacz.com Mon Apr 12 04:13:41 2010
From: adam@megacz.com (Adam Megacz)
Date: Mon, 12 Apr 2010 03:13:41 +0000
Subject: [OpenAFS] sqlite on AFS will not work, even with whole-file locking
References: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com>
<1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
Message-ID:
Brandon Simmons writes:
> Thanks for the response. It seems like whole-file locking in sqlite
> would be a good choice for me in any case,
> In a situation where the whole-file locking scheme is used, would AFS
> be an acceptable choice? Would it be better than NFS?
I had the same idea, and tried it. It does not work. Your databases
will get corrupted. I never figured out why, although I did confirm
that sqlite was in fact requesting only whole-file locks.
It would be nice if it worked, though. There are a lot of applications
out there where writes to the database are extremely rare, so
invalidating all the clients' caches is not a problem.
- a
From adeason@sinenomine.net Mon Apr 12 05:14:18 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Sun, 11 Apr 2010 23:14:18 -0500
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
References: <2575.1270922070@pabst.cs.uwm.edu>
Message-ID: <20100411231418.acf4f0cf.adeason@sinenomine.net>
On Sat, 10 Apr 2010 12:54:30 -0500
John Tang Boyland wrote:
> [mdb]
Thanks for those. I'm not sure myself what's going on, but perhaps some
discussion will help...
You appear to be running out of cache files, though, by the way. If you
increase the size of your cache (or maybe even just the number of
files), it may make this less likely to occur.
> BTW:
> process 17679 is the one writing the LONG file that seemed to
> initiate the deadlock. I notice it is inside "FetchWholeEnchilada".
It appears to have unlinked the file while it was open; does that sound
correct?
> fffffe8003244cb0 FetchWholeEnchilada+0xf4()
> fffffe8003244d80 afs_remove+0x7eb()
Can someone explain this, by the way? If I'm reading this correctly, we
fetch/cache the entire file contents of a file if it's unlinked from
under a process... Why?
> fffffe8002fda5d0 swtch+0x110()
> fffffe8002fda5f0 cv_wait+0x68()
> fffffe8002fda640 afs_osi_Sleep+0x99()
> fffffe8002fda6c0 Afs_Lock_Obtain+0x1cb()
> fffffe8002fda780 afs_putpage+0x14a()
> fffffe8002fda7f0 osi_VM_GetDownD+0xe8()
> fffffe8002fda9c0 afs_GetDownD+0x7ed()
> fffffe8002fdab90 afs_GetDCache+0x713()
So, all of these are waiting to free up a dcache entry. I'm not in this
code very much, but here's a guess... someone tell me if this makes any
sense.
What looks possible is that some process locks vcache V1,
and tries to get a dcache entry for it; it tries to create a new dcache
entry and tries to free up a dcache entry (D1) because we're out. D1 has
mapped pages (or whatever IFAnyPages means), and we need to invalidate
the pages, so we need to lock D1's vcache. If D1's vcache is the same as
vcache V1, we have deadlock. This makes sense to me to see while
FetchWholeEnchilada is running, since fetching the later chunks may be
trying to free up the earlier chunks fetched in the same file...
If that is plausible, I think potential solutions include dropping the
V1 lock before GetDownD (I assume this isn't possible, or a lot of
things assume this doesn't happen and it would be a lot of work to make
right, etc)... or passing the avc into GetDownD and having GetDownD skip
dcaches that need page invalidation that have the same vcache as the one
passed in. That way we sleep and retry (although still while holding the
V1 lock...)
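
The scenario guessed at above can be modelled in a few lines. This is a toy
model only, not OpenAFS code: the classes VCache and DCache, the function
get_down_d, and the has_pages flag are stand-ins invented for illustration,
and a failed non-blocking lock attempt stands in for the point where the
real code would sleep forever.

```python
import threading


class VCache:
    def __init__(self, name):
        self.name = name
        # Non-reentrant lock, standing in for the vcache lock.
        self.lock = threading.Lock()


class DCache:
    def __init__(self, owner, has_pages):
        self.owner = owner          # the VCache this chunk belongs to
        self.has_pages = has_pages  # pages must be invalidated before reuse


def get_down_d(candidates, avc=None):
    """Pick a chunk to recycle and return (victim, would_block).

    If avc is given, chunks that need page invalidation and belong to avc
    are skipped, which models the second idea in the message above.
    """
    for d in candidates:
        if d.has_pages:
            if avc is not None and d.owner is avc:
                continue  # caller already holds this vcache lock; skip it
            # Invalidating pages requires the owner's vcache lock.
            if not d.owner.lock.acquire(blocking=False):
                # In this single-threaded model the caller itself holds
                # the lock, so a blocking acquire would never return.
                return d, True
            d.owner.lock.release()
        return d, False
    return None, False  # nothing usable; the caller would sleep and retry


if __name__ == "__main__":
    v1, v2 = VCache("V1"), VCache("V2")
    chunks = [DCache(v1, has_pages=True), DCache(v2, has_pages=False)]
    with v1.lock:  # the caller holds V1 while asking for a dcache
        victim, stuck = get_down_d(chunks)
        print(victim.owner.name, stuck)
        victim, stuck = get_down_d(chunks, avc=v1)
        print(victim.owner.name, stuck)
```

Without the avc argument the first candidate belongs to V1 and the lock
attempt fails while V1 is held; with it, that chunk is skipped and the
chunk owned by V2 is chosen instead.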
--
Andrew Deason
adeason@sinenomine.net
From shadow@gmail.com Mon Apr 12 05:34:23 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 12 Apr 2010 00:34:23 -0400
Subject: [OpenAFS] sqlite on AFS will not work, even with whole-file
locking
In-Reply-To:
References: <1931138644.1895.1270750370350.JavaMail.root@thunderbeast.private.linuxbox.com>
<1857482657.1897.1270750692978.JavaMail.root@thunderbeast.private.linuxbox.com>
Message-ID:
On Sun, Apr 11, 2010 at 11:13 PM, Adam Megacz wrote:
>
> Brandon Simmons writes:
>> Thanks for the response. It seems like whole-file locking in sqlite
>> would be a good choice for me in any case,
>
>> In a situation where the whole-file locking scheme is used, would AFS
>> be an acceptable choice? Would it be better than NFS?
>
> I had the same idea, and tried it. It does not work. Your databases
> will get corrupted. I never figured out why, although I did confirm
> that sqlite was in fact requesting only whole-file locks.
>
> It would be nice if it worked, though. There are a lot of applications
> out there where writes to the database are extremely rare, so
> invalidating all the clients' caches is not a problem.
do you happen to know what the corruption looked like (blocks of
zeroes, just not readable, something else)?
--
Derrick
From atro.tossavainen+openafs@helsinki.fi Mon Apr 12 13:29:11 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Mon, 12 Apr 2010 15:29:11 +0300 (EEST)
Subject: [OpenAFS] Ubik problem
Message-ID: <201004121229.o3CCTBgc010382@ruuvi.it.helsinki.fi>
I recently changed one of our cell's db servers from IBM AFS on
Solaris 8 / SPARC to OpenAFS on Solaris 10 / x64. The other one
remains on IBM AFS on Solaris 8 for what will hopefully be a very
short time until I migrate it over to S10x64 as well.
I've seen some strange database problems recently, and no wonder: the
OpenAFS server seems to be thinking its IP address is what it would be
if you reversed the octets.
The public db servers for biocenter.helsinki.fi are 128.214.58.174
and 128.214.88.114. The former seems to have some strange ideas about
174.58.214.128 being somehow involved. Please see udebug -long output
from both:
sun4x_58 # /usr/afs/bin/udebug 128.214.88.114 7002 -long
Host's addresses are: 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1
Host's 128.214.88.114 time is Mon Apr 12 15:17:30 2010
Local time is Mon Apr 12 15:17:34 2010 (time differential 4 secs)
Last yes vote for 128.214.88.114 was 1 secs ago (not sync site);
Last vote started 1 secs ago (at Mon Apr 12 15:17:33 2010)
Local db version is 1270721316.13
I am not sync site
Lowest host 128.214.88.114 was set 1 secs ago
Sync host 0.0.0.0 was set 1271074650 secs ago
Sync site's db version is 1270721316.13
0 locked pages, 0 of them for write
Server( 128.214.58.174 ): (db 0.0)
last vote rcvd 1 secs ago (at Mon Apr 12 15:17:33 2010),
last beacon sent 1 secs ago (at Mon Apr 12 15:17:33 2010), last vote was no
dbcurrent=0, up=1 beaconSince=1
sunx86_510 # /usr/afs/bin/udebug 128.214.58.174 7002 -long
Host's addresses are: 128.214.58.174
Host's 128.214.58.174 time is Mon Apr 12 15:17:27 2010
Local time is Mon Apr 12 15:17:28 2010 (time differential 1 secs)
Last yes vote for 174.58.214.128 was 2 secs ago (sync site);
Last vote started 2 secs ago (at Mon Apr 12 15:17:26 2010)
Local db version is 1270721316.13
I am sync site until 58 secs from now (at Mon Apr 12 15:18:26 2010) (2 servers)
Recovery state 1f
Sync site's db version is 1270721316.13
0 locked pages, 0 of them for write
Server (128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1): (db 1270721316.13)
last vote rcvd 2 secs ago (at Mon Apr 12 15:17:26 2010),
last beacon sent 2 secs ago (at Mon Apr 12 15:17:26 2010), last vote was no
dbcurrent=1, up=1 beaconSince=1
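
A side note on the first dump's "Sync host 0.0.0.0 was set 1271074650 secs
ago": subtracting that many seconds from the reported host time lands on
Unix time zero, which suggests the field was simply never set. The sketch
below checks the arithmetic; it assumes the host time is printed in EEST
(UTC+3), as the Date header of this message suggests, and the function name
is made up.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the udebug host time is local time in EEST (UTC+3).
EEST = timezone(timedelta(hours=3))


def set_time_from_secs_ago(host_time, secs_ago):
    """Return the absolute time implied by a 'was set N secs ago' field."""
    return host_time - timedelta(seconds=secs_ago)


if __name__ == "__main__":
    # "Host's 128.214.88.114 time is Mon Apr 12 15:17:30 2010"
    host_time = datetime(2010, 4, 12, 15, 17, 30, tzinfo=EEST)
    set_at = set_time_from_secs_ago(host_time, 1271074650)
    print(set_at.astimezone(timezone.utc).isoformat())
```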
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
< URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
From jaltman@secure-endpoints.com Mon Apr 12 13:36:44 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Mon, 12 Apr 2010 08:36:44 -0400
Subject: [OpenAFS] Ubik problem
In-Reply-To: <201004121229.o3CCTBgc010382@ruuvi.it.helsinki.fi>
References: <201004121229.o3CCTBgc010382@ruuvi.it.helsinki.fi>
Message-ID: <4BC313DC.3000401@secure-endpoints.com>
On 4/12/2010 8:29 AM, Atro Tossavainen wrote:
> I recently changed one of our cell's db servers from IBM AFS on
> Solaris 8 / SPARC to OpenAFS on Solaris 10 / x64. The other one
> remains on IBM AFS on Solaris 8 for what will hopefully be a very
> short time until I migrate it over to S10x64 as well.
>
> I've seen some strange database problems recently, and no wonder: the
> OpenAFS server seems to be thinking its IP address is what it would be
> if you reversed the octets.
>
> The public db servers for biocenter.helsinki.fi are 128.214.58.174
> and 128.214.88.114. The former seems to have some strange ideas about
> 174.58.214.128 being somehow involved.
What version of OpenAFS? What does the address report if you use
the udebug from the sparc system to query the x86 system? I believe
this is just a reporting problem with udebug client that was fixed
on master.
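
To illustrate how a purely cosmetic reversal like this could arise, here is
a minimal sketch. It assumes the display code reads the four address bytes
with the wrong byte order; that cause is an assumption for illustration and
is not confirmed in this thread. The helper name format_addr is hypothetical
and is not part of udebug.

```python
import socket


def format_addr(raw: bytes, byteorder: str) -> str:
    """Render 4 raw address bytes as a dotted quad, reading them as an
    integer with the given byte order ("big" is network order)."""
    value = int.from_bytes(raw, byteorder)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


# The address as it travels on the wire (network byte order).
raw = socket.inet_aton("128.214.58.174")

# Read in network order, the address prints as expected.
correct = format_addr(raw, "big")

# Read as a little-endian integer (as an x86 host would do without a
# byte-order conversion), the octets come out reversed.
swapped = format_addr(raw, "little")

print(correct)
print(swapped)
```

The second value matches the "Last yes vote for 174.58.214.128" line in the
udebug output, which is consistent with a display-only problem.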
--------------ms040206060403060600000404--
From atro.tossavainen+openafs@helsinki.fi Mon Apr 12 13:54:44 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Mon, 12 Apr 2010 15:54:44 +0300 (EEST)
Subject: [OpenAFS] Ubik problem
In-Reply-To: <4BC313DC.3000401@secure-endpoints.com>
Message-ID: <201004121254.o3CCsikk011778@ruuvi.it.helsinki.fi>
Jeffrey, thanks for the superfast response.
> What version of OpenAFS? What does the address report if you use
> the udebug from the sparc system to query the x86 system? I believe
> this is just a reporting problem with udebug client that was fixed
> on master.
OpenAFS 1.4.12.
The a.b.c.d becoming d.c.b.a seems to be just cosmetic, but there are
some other issues. Out of 128.214.58.174 and 128.214.88.114, the
lowest numbered host certainly isn't 128.214.88.114, and I'm slightly
worried that the report says the database version is 0.0 on the other
host.
sun4x_58 # /usr/afs/bin/udebug 128.214.58.174 7002 -long
Host's addresses are: 128.214.58.174
Host's 128.214.58.174 time is Mon Apr 12 15:50:20 2010
Local time is Mon Apr 12 15:50:24 2010 (time differential 4 secs)
Last yes vote for 128.214.58.174 was 9 secs ago (sync site); <----- indeed
Last vote started 9 secs ago (at Mon Apr 12 15:50:15 2010)
Local db version is 1270721316.13
I am sync site until 51 secs from now (at Mon Apr 12 15:51:15 2010) (2 servers)
Recovery state 1f
Sync site's db version is 1270721316.13
0 locked pages, 0 of them for write
Server( 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1 ): (db 1270721316.13)
last vote rcvd 9 secs ago (at Mon Apr 12 15:50:15 2010),
last beacon sent 9 secs ago (at Mon Apr 12 15:50:15 2010), last vote was no
dbcurrent=1, up=1 beaconSince=1
And indeed:
sunx86_510 # /usr/afs/bin/udebug 128.214.88.114 7002 -long
Host's addresses are: 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1
Host's 128.214.88.114 time is Mon Apr 12 15:51:10 2010
Local time is Mon Apr 12 15:51:10 2010 (time differential 0 secs)
Last yes vote for 114.88.214.128 was 11 secs ago (not sync site); <---
Last vote started 11 secs ago (at Mon Apr 12 15:50:59 2010)
Local db version is 1270721316.13
I am not sync site
Lowest host 128.214.88.114 was set 11 secs ago <---- this can't be right
Sync host 0.0.0.0 was set 1271076670 secs ago
Sync site's db version is 1270721316.13
0 locked pages, 0 of them for write
Server (128.214.58.174): (db 0.0) <---- and this must be wrong too
last vote rcvd 11 secs ago (at Mon Apr 12 15:50:59 2010),
last beacon sent 11 secs ago (at Mon Apr 12 15:50:59 2010), last vote was no
dbcurrent=0, up=1 beaconSince=1
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
< URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
From jaltman@secure-endpoints.com Mon Apr 12 14:21:30 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Mon, 12 Apr 2010 09:21:30 -0400
Subject: [OpenAFS] Ubik problem
In-Reply-To: <201004121254.o3CCsikk011778@ruuvi.it.helsinki.fi>
References: <201004121254.o3CCsikk011778@ruuvi.it.helsinki.fi>
Message-ID: <4BC31E5A.3030009@secure-endpoints.com>
This is a cryptographically signed message in MIME format.
--------------ms060702040808020809040203
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On 4/12/2010 8:54 AM, Atro Tossavainen wrote:
> Jeffrey, thanks for the superfast response.
>
>> What version of OpenAFS? What does the address report if you use
>> the udebug from the sparc system to query the x86 system? I believe
>> this is just a reporting problem with udebug client that was fixed
>> on master.
>
> OpenAFS 1.4.12.
>
> The a.b.c.d becoming d.c.b.a seems to be just cosmetic, but there are
> some other issues. Out of 128.214.58.174 and 128.214.88.114, the
> lowest numbered host certainly isn't 128.214.88.114,
actually it is because that server is reporting multiple addresses:
Server( 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1 )
several of which are lower than 128.214.58.174. What are these other
interface addresses, and do you expect them to be used for ubik
synchronization?
> and I'm slightly
> worried that the report says the database version is 0.0 on the other
> host.
The db version is only reported by the server that is currently
the sync site. This is not a concern.
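
The "lowest host" point can be sketched as follows. This is only an
illustration of the numeric comparison described above, assuming every
advertised address of a server takes part in it; it is not ubik's actual
election code, and the names servers, lowest_address and lowest_host are
made up for the example.

```python
import ipaddress

# Addresses each db server advertises, as shown in the udebug output.
servers = {
    "128.214.58.174": ["128.214.58.174"],
    "128.214.88.114": [
        "128.214.88.114",
        "10.0.0.3",
        "172.16.0.1",
        "172.17.0.1",
        "172.18.0.1",
    ],
}


def as_int(addr: str) -> int:
    """Numeric value of a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(addr))


def lowest_address(addrs):
    """Numerically lowest address in a list of dotted quads."""
    return min(addrs, key=as_int)


# Lowest advertised address per server.
lowest_per_server = {name: lowest_address(a) for name, a in servers.items()}

# The server whose lowest advertised address is smallest overall.
lowest_host = min(servers, key=lambda name: as_int(lowest_per_server[name]))

print(lowest_per_server)
print(lowest_host)
```

With the private addresses included, 128.214.88.114 wins because of
10.0.0.3; with only the public addresses, 128.214.58.174 would be lower.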
--------------ms060702040808020809040203--
From adeason@sinenomine.net Mon Apr 12 15:03:54 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Mon, 12 Apr 2010 09:03:54 -0500
Subject: [OpenAFS] Re: Ubik problem
References: <201004121254.o3CCsikk011778@ruuvi.it.helsinki.fi>
<4BC31E5A.3030009@secure-endpoints.com>
Message-ID: <20100412090354.c7f567e4.adeason@sinenomine.net>
On Mon, 12 Apr 2010 09:21:30 -0400
Jeffrey Altman wrote:
> On 4/12/2010 8:54 AM, Atro Tossavainen wrote:
> > Jeffrey, thanks for the superfast response.
> >
> >> What version of OpenAFS? What does the address report if you use
> >> the udebug from the sparc system to query the x86 system? I believe
> >> this is just a reporting problem with udebug client that was fixed
> >> on master.
Yeah, 63fe055ecd13c93a3a6070a15a745ace2e420817, not on 1.4 yet. It's
just cosmetic.
> > The a.b.c.d becoming d.c.b.a seems to be just cosmetic, but there are
> > some other issues. Out of 128.214.58.174 and 128.214.88.114, the
> > lowest numbered host certainly isn't 128.214.88.114,
>
> actually it is because that server is reporting multiple addresses:
>
> Server( 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1 )
>
> several of which are lower than 128.214.58.174. What are these other
> interface addresses, and do you expect them to be used for ubik
> synchronization?
If you want the other db server to be the lowest host, you can restrict
the addresses advertised with NetInfo/NetRestrict. But if that doesn't
matter to you, there doesn't seem to be a problem. Are you seeing other
issues with the database itself?
--
Andrew Deason
adeason@sinenomine.net
From boyland@cs.uwm.edu Mon Apr 12 15:07:58 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Mon, 12 Apr 2010 09:07:58 -0500
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: Your message of "Mon, 12 Apr 2010 09:22:00 EDT."
Message-ID: <4382.1271081278@pabst.cs.uwm.edu>
Andrew Deason writes
] John Tang Boyland wrote:
]
] > [mdb]
]
] Thanks for those. I'm not sure myself what's going on, but perhaps some
] discussion will help...
]
] You appear to be running out of cache files, though, by the way. If you
] increase the size of your cache (or maybe even just the number of
] files), it may make this less likely to occur.
OK. I'll do that the next time we reboot. The cacheinfo is
rather small (25000K).
(In fact, I guess that's why other people haven't noticed the problem.
Running with a 25MB disk cache is pretty ridiculous.)
] > BTW:
] > process 17679 is the one writing the LONG file that seemed to
] > initiate the deadlock. I notice it is inside "FetchWholeEnchilada".
]
] It appears to have unlinked the file while it was open; does that sound
] correct?
Possibly: process 17679 is listed as "make test". I'm guessing the user
noticed things were going slow and control-C'ed the make process, and
"make" decided to delete the output file. But I don't know for sure.
] > fffffe8003244cb0 FetchWholeEnchilada+0xf4()
] > fffffe8003244d80 afs_remove+0x7eb()
]
] Can someone explain this, by the way? If I'm reading this correctly, we
] fetch/cache the entire file contents of a file if it's unlinked from
] under a process... Why?
]
] > fffffe8002fda5d0 swtch+0x110()
] > fffffe8002fda5f0 cv_wait+0x68()
] > fffffe8002fda640 afs_osi_Sleep+0x99()
] > fffffe8002fda6c0 Afs_Lock_Obtain+0x1cb()
] > fffffe8002fda780 afs_putpage+0x14a()
] > fffffe8002fda7f0 osi_VM_GetDownD+0xe8()
] > fffffe8002fda9c0 afs_GetDownD+0x7ed()
] > fffffe8002fdab90 afs_GetDCache+0x713()
]
] So, all of these are waiting to free up a dcache entry. I'm not in this
] code very much, but here's a guess... someone tell me if this makes any
] sense.
]
] What looks like may be possible is that some process locks vcache V1,
] and tries to get a dcache entry for it; it tries to create a new dcache
] entry and tries to free up a dcache entry (D1) because we're out. D1 has
] mapped pages (or whatever IFAnyPages means), and we need to invalidate
] the pages, so we need to lock D1's vcache. If D1's vcache is the same as
] vcache V1, we have deadlock. This makes sense to me to see while
] FetchWholeEnchilada is running, since fetching the later chunks may be
] trying to free up the earlier chunks fetched in the same file...
]
] If that is plausible, I think potential solutions include dropping the
] V1 lock before GetDownD (I assume this isn't possible, or a lot of
] things assume this doesn't happen and is a lot of work to make right,
] etc)... or, passing the avc into GetDownD, and have GetDownD skip
] dcaches that need page invalidation that have the same vcache as the one
] passed in. That way we sleep and retry (although still while holding the
] V1 lock...)
]
] --
] Andrew Deason
] adeason@sinenomine.net
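
The second idea in the quoted text (pass the held vcache into the eviction
routine and skip dcache entries that would need that same lock) can be
sketched roughly like this. It is a toy model under stated assumptions, not
OpenAFS code: VCache, DCache and pick_victim are hypothetical names, and a
plain non-reentrant threading.Lock stands in for the vcache lock.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class VCache:
    name: str
    # Non-reentrant: a second acquire by the same holder would block.
    lock: threading.Lock = field(default_factory=threading.Lock)


@dataclass
class DCache:
    chunk: int
    owner: VCache
    has_mapped_pages: bool


def pick_victim(dcaches, held):
    """Return the first dcache that can be freed without re-taking the
    lock the caller already holds, or None if there is no such entry."""
    for d in dcaches:
        if d.has_mapped_pages and d.owner is held:
            # Invalidating these pages would need held.lock again,
            # which is the suspected self-deadlock; skip this entry.
            continue
        return d
    return None


v1 = VCache("V1")
v2 = VCache("V2")
entries = [DCache(0, v1, True), DCache(1, v2, True)]

with v1.lock:
    # Trying to take the same lock again does not succeed while it is held.
    relock_ok = v1.lock.acquire(blocking=False)
    victim = pick_victim(entries, held=v1)

print(relock_ok, victim.chunk)
```

When every candidate belongs to the held vcache, pick_victim returns None
and the caller would have to sleep and retry, as the quoted text notes.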
BTW: Is there any more useful information I could get from the machine
or can we reboot it? Please reply by email to boyland@cs.uwm.edu.
From shadow@gmail.com Mon Apr 12 15:14:40 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Mon, 12 Apr 2010 10:14:40 -0400
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: <4382.1271081278@pabst.cs.uwm.edu>
References: <4382.1271081278@pabst.cs.uwm.edu>
Message-ID:
You might as well reboot it. I suspect (and wondered before) that the
real issue was not a deadlock but that the machine simply went into a
loop, and with a cache that small it's likely it did. Not the best
behavior, of course, but not the most urgent thing to pursue at the
moment.
On Mon, Apr 12, 2010 at 10:07 AM, John Tang Boyland
wrote:
> Andrew Deason writes
> ] John Tang Boyland wrote:
> ]
> ] > [mdb]
> ]
> ] Thanks for those. I'm not sure myself what's going on, but perhaps some
> ] discussion will help...
> ]
> ] You appear to be running out of cache files, though, by the way. If you
> ] increase the size of your cache (or maybe even just the number of
> ] files), it may make this less likely to occur.
>
> OK. I'll do that the next time we reboot. The cacheinfo is
> rather small (25000K).
>
> (In fact, I guess that's why other people haven't noticed the problem.
> Running with a 25MB disk cache is pretty ridiculous.)
>
> ] > BTW:
> ] > process 17679 is the one writing the LONG file that seemed to
> ] > initiate the deadlock. I notice it is inside "FetchWholeEnchilada".
> ]
> ] It appears to have unlinked the file while it was open; does that sound
> ] correct?
>
> Possibly: process 17679 is listed as "make test".
> I'm guessing the user was noticing
> things were going slow and control-C'ed the make process, and "make"
> decided to delete the output file.
> But I don't know for sure.
>
> ] >   fffffe8003244cb0 FetchWholeEnchilada+0xf4()
> ] >   fffffe8003244d80 afs_remove+0x7eb()
> ]
> ] Can someone explain this, by the way? If I'm reading this correctly, we
> ] fetch/cache the entire file contents of a file if it's unlinked from
> ] under a process... Why?
> ]
> ] >   fffffe8002fda5d0 swtch+0x110()
> ] >   fffffe8002fda5f0 cv_wait+0x68()
> ] >   fffffe8002fda640 afs_osi_Sleep+0x99()
> ] >   fffffe8002fda6c0 Afs_Lock_Obtain+0x1cb()
> ] >   fffffe8002fda780 afs_putpage+0x14a()
> ] >   fffffe8002fda7f0 osi_VM_GetDownD+0xe8()
> ] >   fffffe8002fda9c0 afs_GetDownD+0x7ed()
> ] >   fffffe8002fdab90 afs_GetDCache+0x713()
> ]
> ] So, all of these are waiting to free up a dcache entry. I'm not in this
> ] code very much, but here's a guess... someone tell me if this makes any
> ] sense.
> ]
> ] What looks like may be possible is that some process locks vcache V1,
> ] and tries to get a dcache entry for it; it tries to create a new dcache
> ] entry and tries to free up a dcache entry (D1) because we're out. D1 has
> ] mapped pages (or whatever IFAnyPages means), and we need to invalidate
> ] the pages, so we need to lock D1's vcache. If D1's vcache is the same as
> ] vcache V1, we have deadlock. This makes sense to me to see while
> ] FetchWholeEnchilada is running, since fetching the later chunks may be
> ] trying to free up the earlier chunks fetched in the same file...
> ]
> ] If that is plausible, I think potential solutions include dropping the
> ] V1 lock before GetDownD (I assume this isn't possible, or a lot of
> ] things assume this doesn't happen and is a lot of work to make right,
> ] etc)... or, passing the avc into GetDownD, and have GetDownD skip
> ] dcaches that need page invalidation that have the same vcache as the one
> ] passed in. That way we sleep and retry (although still while holding the
> ] V1 lock...)
> ]
> ] --
> ] Andrew Deason
> ] adeason@sinenomine.net
>
> BTW: Is there any more useful information I could get from the machine
> or can we reboot it? Please reply by email to boyland@cs.uwm.edu.
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
--
Derrick
From atro.tossavainen+openafs@helsinki.fi Mon Apr 12 22:25:51 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Tue, 13 Apr 2010 00:25:51 +0300 (EEST)
Subject: [OpenAFS] Ubik problem
In-Reply-To: <4BC31E5A.3030009@secure-endpoints.com>
Message-ID: <201004122125.o3CLPprA004189@ruuvi.it.helsinki.fi>
Jeffrey,
> actually it is because that server is reporting multiple addresses:
>
> Server( 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1 )
>
> several of which are lower than 128.214.58.174. What are these other
> interface addresses, and do you expect them to be used for ubik
> synchronization?
Private addresses for purposes other than AFS.
I believe I am using NetRestrict to avoid the servers from picking up
these:
sun4x_58 # cat /usr/afs/local/NetRestrict
M 10.255.255.255
M 172.255.255.255
M 192.168.255.255
I don't expect to see the RFC1918 addresses anywhere in connection
with AFS. The NetRestrict file is unaltered as of Feb 13, 2009 and
has been essentially identical for years.
(I remember needing to raise a bug with IBM over wildcard support.
PMR-73077... late 2004/early 2005. Not that it says anything to anybody
outside IBM, probably. Merely using "255" wasn't enough, it needed
adding "M" in front for wildcards to work and I think this was and is
undocumented.)
I have not had any database issues for as long as the other sun4x_58
host was the other database server.
Andrew,
> Are you seeing other issues with the database itself?
Today, I could not open my screensaver, and I of course know my password.
When I used kas to see if I had some problem with my account, "examine
atossava" reported that the account did not exist. This was the case for
a few other accounts as well - they didn't exist. Querying one database
server at a time produced different results; correct on afsdb1 and
kaserver producing gobbledygook on afsdb2, if I remember correctly.
I needed to change the password for a user and bumped into an AFS error
message that I have not seen before. I don't think I wrote it down
anywhere, but basically I couldn't do anything with the account because
of a key version mismatch.
I rebooted the sun4x_58 server and it seemed to take a long time to
reach quorum. After that, everything has been all right. I'm worried
that I don't, despite having enabled logging, have much to report in the
AuthLog.
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
< URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
From sxw@inf.ed.ac.uk Mon Apr 12 22:50:59 2010
From: sxw@inf.ed.ac.uk (Simon Wilkinson)
Date: Mon, 12 Apr 2010 22:50:59 +0100
Subject: [OpenAFS] Ubik problem
In-Reply-To: <201004122125.o3CLPprA004189@ruuvi.it.helsinki.fi>
References: <201004122125.o3CLPprA004189@ruuvi.it.helsinki.fi>
Message-ID: <57FB6B5C-AF7D-402D-8AA0-8642FCD06733@inf.ed.ac.uk>
> Private addresses for purposes other than AFS.
>
> I believe I am using NetRestrict to avoid the servers from picking up
> these:
We had this problem with our DB servers here. It would appear that you
also need to specify -rxbind, to stop the servers from sending packets
from interfaces that are in NetRestrict.
Cheers,
Simon.
From adeason@sinenomine.net Mon Apr 12 22:58:45 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Mon, 12 Apr 2010 16:58:45 -0500
Subject: [OpenAFS] Re: Ubik problem
References: <201004122125.o3CLPprA004189@ruuvi.it.helsinki.fi>
<57FB6B5C-AF7D-402D-8AA0-8642FCD06733@inf.ed.ac.uk>
Message-ID: <20100412165845.c5f44f42.adeason@sinenomine.net>
On Mon, 12 Apr 2010 22:50:59 +0100
Simon Wilkinson wrote:
> > Private addresses for purposes other than AFS.
> >
> > I believe I am using NetRestrict to avoid the servers from picking up
> > these:
>
> We had this problem with our DB servers here. It would appear that you
> also need to specify -rxbind, to stop the servers from sending packets
> from interfaces that are in NetRestrict.
Atro's problem appears to be that they are advertising the extra
addresses, though, not that they're sending packets out over them.
I've got a bad feeling that NetRestrict handling isn't doing endianness
properly or something, but it's hard to imagine that going unnoticed...
I'm checking, anyway.
--
Andrew Deason
adeason@sinenomine.net
From jaltman@secure-endpoints.com Mon Apr 12 23:05:58 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Mon, 12 Apr 2010 18:05:58 -0400
Subject: [OpenAFS] Ubik problem
In-Reply-To: <201004122125.o3CLPprA004189@ruuvi.it.helsinki.fi>
References: <201004122125.o3CLPprA004189@ruuvi.it.helsinki.fi>
Message-ID: <4BC39946.8020409@secure-endpoints.com>
This is a cryptographically signed message in MIME format.
--------------ms030007010102020605000500
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On 4/12/2010 5:25 PM, Atro Tossavainen wrote:
> Jeffrey,
>
>> actually it is because that server is reporting multiple addresses:
>>
>> Server( 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1 )
>>
>> several of which are lower than 128.214.58.174. What are these other
>> interface addresses, and do you expect them to be used for ubik
>> synchronization?
>
> Private addresses for purposes other than AFS.
>
> I believe I am using NetRestrict to avoid the servers from picking up
> these:
>
> sun4x_58 # cat /usr/afs/local/NetRestrict
> M 10.255.255.255
> M 172.255.255.255
> M 192.168.255.255
>
> I don't expect to see the RFC1918 addresses anywhere in connection
> with AFS. The NetRestrict file is unaltered as of Feb 13, 2009 and
> has been essentially identical for years.
>
> (I remember needing to raise a bug with IBM over wildcard support.
> PMR-73077... late 2004/early 2005. Not that it says anything to anybody
> outside IBM, probably. Merely using "255" wasn't enough, it needed
> adding "M" in front for wildcards to work and I think this was and is
> undocumented.)
>
> I have not had any database issues for as long as the other sun4x_58
> host was the other database server.
The OpenAFS NetRestrict documentation does not mention the use of a
preceding 'M'.
http://docs.openafs.org/Reference/5/NetRestrict.html
I suspect the change IBM implemented was never passed on to OpenAFS
and as far as I can tell from the source its presence will invalidate
the address and prevent it from being used as a filter.
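
A simplified model of that behaviour, for illustration only: it assumes each
NetRestrict line must be a plain dotted quad, with 255 acting as a wildcard
for its field as the reference page describes, and that any line which does
not parse is dropped. The function names are hypothetical and this is not
the OpenAFS parsing code.

```python
def parse_entry(line):
    """Return the four octets of a dotted-quad line, or None if the line
    is not a plain dotted quad (for example, if it starts with 'M ')."""
    parts = line.strip().split(".")
    if len(parts) != 4:
        return None
    if not all(p.isdigit() and int(p) <= 255 for p in parts):
        return None
    return [int(p) for p in parts]


def load(lines):
    """Keep only the lines that parse as addresses."""
    return [e for e in map(parse_entry, lines) if e is not None]


def is_restricted(addr, entries):
    """True if addr matches any entry, treating 255 as a wildcard octet."""
    octets = [int(p) for p in addr.split(".")]
    return any(
        all(e == 255 or e == o for e, o in zip(entry, octets))
        for entry in entries
    )


with_prefix = load(["M 10.255.255.255", "M 172.255.255.255", "M 192.168.255.255"])
without_prefix = load(["10.255.255.255", "172.255.255.255", "192.168.255.255"])

for addr in ("10.0.0.3", "172.16.0.1", "128.214.88.114"):
    print(addr, is_restricted(addr, with_prefix), is_restricted(addr, without_prefix))
```

In this model the 'M'-prefixed file filters nothing, so the private
addresses are still advertised, while the same entries without the prefix
exclude them and leave 128.214.88.114 untouched.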
Jeffrey Altman
--------------ms030007010102020605000500--
From atro.tossavainen+openafs@helsinki.fi Tue Apr 13 06:43:47 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Tue, 13 Apr 2010 08:43:47 +0300 (EEST)
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <20100412165845.c5f44f42.adeason@sinenomine.net>
Message-ID: <201004130543.o3D5hlwL015540@ruuvi.it.helsinki.fi>
Simon,
> We had this problem with our DB servers here. It would appear that you
> also need to specify -rxbind, to stop the servers from sending packets
> from interfaces that are in NetRestrict.
Which command(s) should be started with this flag?
(Remember 128.214.88.114 is still on IBM AFS and probably does not
support -rxbind if it's something that OpenAFS implemented post IBM.)
Andrew,
> Atro's problem appears to be that they are advertising the extra
> addresses, though, not that they're sending packets out over them.
What I don't get is why this would have changed transparently on the
sun4x_58 server when I changed the *other* db server from sun4x_58 to
sunx86_510 and from IBM AFS to OpenAFS.
Jeffrey,
> The OpenAFS NetRestrict documentation does not mention the use of a
> preceding 'M'.
Neither does the IBM documentation, which is what I said, I think.
> I suspect the change IBM implemented was never passed on to OpenAFS
> and as far as I can tell from the source its presence will invalidate
> the address and prevent it from being used as a filter.
On the IBM AFS sun4x_58 server where it was explicitly recommended
by IBM AFS support in early 2005? :-) I'm not running the OpenAFS
db servers. In fact, I seem to be running a mishmash of IBM AFS
servers - the various components aren't even all the same version.
From my reading of IBM AFS patch READMEs, it appears that "M" appeared
in AFS 3.6 Patch 13 (aka Build Level 2.57) (APAR IY77101). This seems
to have been released in December 2005, but IBM support have indicated
its use in a support ticket conversation in January 2005 already. It
did not work, and in connection with another support ticket in September
2005, IBM made available a patched version of the 3.6 2.56 *fileserver*
only where it finally did work. So I don't think I've ever had any
database servers that are even supposed to obey "M" in NetRestrict,
but at the same time, I haven't had a problem with this so I haven't
noticed. What the...
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
<URL:http://www.helsinki.fi/%7Eatossava/> NO FILE ATTACHMENTS
From jaltman@secure-endpoints.com Tue Apr 13 13:31:18 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Tue, 13 Apr 2010 08:31:18 -0400
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <201004130543.o3D5hlwL015540@ruuvi.it.helsinki.fi>
References: <201004130543.o3D5hlwL015540@ruuvi.it.helsinki.fi>
Message-ID: <4BC46416.7090203@secure-endpoints.com>
This is a cryptographically signed message in MIME format.
--------------ms020406050807080209090301
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On 4/13/2010 1:43 AM, Atro Tossavainen wrote:
> Jeffrey,
>
>> The OpenAFS NetRestrict documentation does not mention the use of a
>> preceding 'M'.
>
> Neither does the IBM documentation, which is what I said, I think.
The OpenAFS Reference Manual has been thoroughly updated to document
how OpenAFS behaves. It is derived from the content of the IBM AFS
man pages but is not the same as them. There is no reference to 'M'
prefixing because OpenAFS does not support it.
>> I suspect the change IBM implemented was never passed on to OpenAFS
>> and as far as I can tell from the source its presence will invalidate
>> the address and prevent it from being used as a filter.
>
> On the IBM AFS sun4x_58 server where it was explicitly recommended
> by IBM AFS support in early 2005? :-) I'm not running the OpenAFS
> db servers. In fact, I seem to be running a mishmash of IBM AFS
> servers - the various components aren't even all the same version.
>
> From my reading of IBM AFS patch READMEs, it appears that "M" appeared
> in AFS 3.6 Patch 13 (aka Build Level 2.57) (APAR IY77101). This seems
> to have been released in December 2005, but IBM support have indicated
> its use in a support ticket conversation in January 2005 already. It
> did not work, and in connection with another support ticket in September
> 2005, IBM made available a patched version of the 3.6 2.56 *fileserver*
> only where it finally did work. So I don't think I've ever had any
> database servers that are even supposed to obey "M" in NetRestrict,
> but at the same time, I haven't had a problem with this so I haven't
> noticed. What the...
I can't speak to the IBM patches. I can speak to the OpenAFS sources.
You can read them to see how the NetRestrict file is parsed.
http://git.openafs.org/?p=openafs.git;a=blob;f=src/util/netutils.c;h=03a90e3b6d89e6f1c781b7324f0f661280ceeaa6;hb=HEAD
See parseNetRestrictFile(). There is no reference to 'M' anywhere.
In OpenAFS, "255" should be a mask by itself.
The question OpenAFS is now left with is whether to add support for the
IBM behavior change.
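As a rough illustration of the behavior described above (not the OpenAFS code itself, which lives in parseNetRestrictFile()), the following Python sketch treats 255 as a per-octet wildcard and silently drops any entry that is not a plain dotted quad, such as one carrying an IBM-style 'M' prefix. The function names parse_netrestrict and is_restricted are made up for this example, and the wildcard semantics are an assumption based on the NetRestrict documentation.

```python
def parse_netrestrict(text):
    """Return NetRestrict entries as 4-tuples of ints.

    Sketch only: entries that are not plain dotted quads (for example
    "M10.0.0.255") are skipped, mirroring the "invalidated" outcome
    described in the thread rather than the real parser's code path.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(".")
        valid = len(parts) == 4 and all(
            p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
        )
        if not valid:
            continue  # hypothetical handling: ignore unparseable entries
        entries.append(tuple(int(p) for p in parts))
    return entries


def is_restricted(address, entries):
    """True if the IPv4 address matches any entry, with 255 as a wildcard."""
    octets = tuple(int(p) for p in address.split("."))
    return any(
        all(e == 255 or e == o for e, o in zip(entry, octets))
        for entry in entries
    )
```

With entries "192.168.122.255" and "M10.0.0.255", only the first survives parsing, so 192.168.122.1 is filtered while 10.0.0.7 is not.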
Jeffrey Altman
--------------ms020406050807080209090301--
From jaltman@secure-endpoints.com Tue Apr 13 14:26:24 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Tue, 13 Apr 2010 09:26:24 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
Message-ID: <4BC47100.1030802@secure-endpoints.com>
This is a cryptographically signed message in MIME format.
--------------ms040009090001020707000000
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
In 2002, the OpenAFS version of the "vos listaddrs" command was updated
to include the Arla -printuuid and -noresolve options, which permit the
UUID and IP addresses of registered file servers to be displayed. For
example:
UUID: 006cab10-0e3e-1b20-a3-aa-2601a8c0aa77
24.193.47.88
192.168.122.1
192.168.1.38
In 2008, the -noresolve option was made generic so that it applies to
all vos commands, letting them show the actual IP addresses of servers
instead of DNS names. This change was made because DNS name resolution
often makes it appear that a file server is properly registered when in
fact it is not.
However, IP addresses are not the canonical method of identifying a file
server. For that the UUID is required, and at present there is no
mechanism when using vos listvldb or vos examine to identify the UUID
of the server on which a volume is located. This lack has come up
several times in the #openafs IRC channel when attempting to help users
set up new cells or add new file servers, most recently on March 30th.
Gerrit http://gerrit.openafs.org/#change,1742 is an attempt to add
-printuuid as a standard option to all vos commands. The only issue at
the moment is what the format of the output should look like. UUIDs and
DNS names are long. Extending the existing format to include the UUID
inline with each server produces output that will not fit in an 80
column terminal.
An example of "vos examine -printuuid" output:
root.cell 537870331 RW 42 K On-line
ASCLEPIUS.MIT.EDU [0037555a-be36-19a6-a2-4d-5e3c0912aa77] /vicepr
RWrite 537870331 ROnly 537870333 Backup 537870332
MaxQuota 500 K
Creation Fri Jun 06 12:24:21 2008
Copy Thu Feb 26 11:43:23 2009
Backup Tue Apr 13 02:00:17 2010
Last Update Thu Oct 18 12:44:23 2007
7647 accesses in the past day (i.e., vnode references)
RWrite: 537870331 ROnly: 537870333 Backup: 537870332
number of sites -> 4
server ASCLEPIUS.MIT.EDU [0037555a-be36-19a6-a2-4d-5e3c0912aa77] partition /vicepr RW Site
server ASCLEPIUS.MIT.EDU [0037555a-be36-19a6-a2-4d-5e3c0912aa77] partition /vicepr RO Site
server MNEMOSYNE.MIT.EDU [005d91e8-f824-19a6-aa-5c-613c0912aa77] partition /vicepr RO Site
server IXION.MIT.EDU [00086236-fa87-19a6-b4-de-ab015b12aa77] partition /vicepr RO Site
An example of "vos listvldb -printuuid" output:
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
server bethlehem.your-file-system.com [0008fa02-d48c-19b9-81-fc-419a1dccaa77] partition /vicepa RW Site
server bethlehem.your-file-system.com [0008fa02-d48c-19b9-81-fc-419a1dccaa77] partition /vicepa RO Site
server faultline.your-file-system.com [0007580a-7001-1aae-85-8e-2f9a1dccaa77] partition /vicepa RO Site
server cpe-24-193-47-88.nyc.res.rr.com [006cab10-0e3e-1b20-a3-aa-2601a8c0aa77] partition /vicepa RO Site
One alternative output format that could be used when the -printuuid
option is specified is found below.
vos examine -printuuid:
root.cell 537870331 RW 42 K On-line
UUID: 0037555a-be36-19a6-a2-4d-5e3c0912aa77
Server ASCLEPIUS.MIT.EDU
Partition /vicepr
RWrite 537870331 ROnly 537870333 Backup 537870332
MaxQuota 500 K
Creation Fri Jun 06 12:24:21 2008
Copy Thu Feb 26 11:43:23 2009
Backup Tue Apr 13 02:00:17 2010
Last Update Thu Oct 18 12:44:23 2007
7647 accesses in the past day (i.e., vnode references)
RWrite: 537870331 ROnly: 537870333 Backup: 537870332
number of sites -> 4
RW Site
server ASCLEPIUS.MIT.EDU
uuid 0037555a-be36-19a6-a2-4d-5e3c0912aa77
partition /vicepr
RO Site
server ASCLEPIUS.MIT.EDU
uuid 0037555a-be36-19a6-a2-4d-5e3c0912aa77
partition /vicepr
RO Site
server MNEMOSYNE.MIT.EDU
uuid 005d91e8-f824-19a6-aa-5c-613c0912aa77
partition /vicepr
RO Site
server IXION.MIT.EDU
uuid 00086236-fa87-19a6-b4-de-ab015b12aa77
partition /vicepr
vos listvldb -printuuid:
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
RW Site
server bethlehem.your-file-system.com
uuid 0008fa02-d48c-19b9-81-fc-419a1dccaa77
partition /vicepa
RO Site
server bethlehem.your-file-system.com
uuid 0008fa02-d48c-19b9-81-fc-419a1dccaa77
partition /vicepa
RO Site
server faultline.your-file-system.com
uuid 0007580a-7001-1aae-85-8e-2f9a1dccaa77
partition /vicepa
RO Site
server cpe-24-193-47-88.nyc.res.rr.com
uuid 006cab10-0e3e-1b20-a3-aa-2601a8c0aa77
partition /vicepa
Please offer your opinions. As people have a variety of scripts that
parse the output of vos commands to automate behaviors, we would not be
changing the default output. Any format change would only be used when
the -printuuid option is specified.
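For anyone weighing the scripting impact, here is a small Python sketch of what consuming the alternative multi-line layout could look like. It assumes the format exactly as shown in the "vos listvldb -printuuid" proposal above, which is only a proposal and may change; parse_sites is a hypothetical helper, not part of any OpenAFS tool.

```python
SAMPLE = """\
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
RW Site
server bethlehem.your-file-system.com
uuid 0008fa02-d48c-19b9-81-fc-419a1dccaa77
partition /vicepa
RO Site
server bethlehem.your-file-system.com
uuid 0008fa02-d48c-19b9-81-fc-419a1dccaa77
partition /vicepa
RO Site
server faultline.your-file-system.com
uuid 0007580a-7001-1aae-85-8e-2f9a1dccaa77
partition /vicepa
RO Site
server cpe-24-193-47-88.nyc.res.rr.com
uuid 006cab10-0e3e-1b20-a3-aa-2601a8c0aa77
partition /vicepa
"""


def parse_sites(text):
    """Collect one dict per site from the proposed multi-line layout."""
    sites = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line in ("RW Site", "RO Site"):
            # A site header starts a new record.
            current = {"type": line.split()[0]}
            sites.append(current)
        elif current is not None:
            key, _, value = line.partition(" ")
            if key in ("server", "uuid", "partition"):
                current[key] = value.strip()
    return sites
```

The parser has to carry state from one line to the next, which is the extra cost of a multi-line record compared with a single-line one.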
Jeffrey Altman
--------------ms040009090001020707000000--
From utoddl@email.unc.edu Tue Apr 13 15:02:35 2010
From: utoddl@email.unc.edu (Todd Lewis)
Date: Tue, 13 Apr 2010 10:02:35 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC47100.1030802@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com>
Message-ID: <4BC4797B.3070202@email.unc.edu>
On 04/13/2010 09:26 AM, Jeffrey Altman sent:
> An example of "vos examine -printuuid" output:
> [...]
> An example of "vos listvldb -printuuid" output:
> [...]
> One alternative output format that could be used when the -printuuid
> option is specified is found below.
>
> vos examine -printuuid:
> [...]
> vos listvldb -printuuid:
>
> Please offer your opinions.
Clearly the multi-line form is easier for humans to read, and the
related-data-on-one-line form is far simpler for scripts to parse. By far.
In both cases.
Is there a place on the ballot to vote for... both, with a switch?
Otherwise, I don't care. I'm screwed sooner or later either way.
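To make the parsing point concrete, a sketch in Python: when each site sits on one physical line, a single regular expression recovers every field with no state carried between lines. The line layout is assumed from the inline examples in the proposal (server, bracketed UUID, partition, site type); SITE_RE and parse_inline_sites are names invented for this illustration.

```python
import re

# Assumed layout: "server <name> [<uuid>] partition <part> <RW|RO> Site"
SITE_RE = re.compile(
    r"server\s+(?P<server>\S+)\s+\[(?P<uuid>[0-9a-f-]+)\]\s+"
    r"partition\s+(?P<partition>\S+)\s+(?P<type>RW|RO) Site"
)

SAMPLE = """\
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 2
server bethlehem.your-file-system.com [0008fa02-d48c-19b9-81-fc-419a1dccaa77] partition /vicepa RW Site
server faultline.your-file-system.com [0007580a-7001-1aae-85-8e-2f9a1dccaa77] partition /vicepa RO Site
"""


def parse_inline_sites(text):
    """Return one dict per site line found in the text."""
    return [m.groupdict() for m in SITE_RE.finditer(text)]
```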
--
+--------------------------------------------------------------+
/ Todd_Lewis@unc.edu 919-445-9302 http://www.unc.edu/~utoddl /
/ In democracy it's your vote that counts; /
/ In feudalism it's your count that votes. /
+--------------------------------------------------------------+
From fbo2@gmx.net Tue Apr 13 15:40:25 2010
From: fbo2@gmx.net (Frank Burkhardt)
Date: Tue, 13 Apr 2010 16:40:25 +0200
Subject: [OpenAFS] Cache size limit?
In-Reply-To: <1e8734811003220738q2c97b6d2o18a4efa6d6214f65@mail.gmail.com>
References: <1e8734811003220738q2c97b6d2o18a4efa6d6214f65@mail.gmail.com>
Message-ID: <20100413144025.GA16782@postman.alpha>
Hi,
On Mon, Mar 22, 2010 at 02:38:50PM +0000, Stephen Quinney wrote:
> I was wondering if there are set limits on the AFS cache size for a
> client? Or are there any limiting factors which mean it is not worth
> going beyond a certain point? In this case, this is on a 32bit Linux
> machine but I am also interested in getting an answer for the same
> question for x86_64 Linux. The machine is being used by multiple users
> simultaneously to do big (i.e. large memory & cpu usage, lots of
> filesystem access, long running) computation jobs so I am trying to
> work out the best way to optimise the AFS access.
I've got about 100 Linux hosts (x86_64, Debian Lenny, OpenAFS 1.4.10) here
using a 30GB disk cache. However, I would be interested in some information
about cache limits, too. One of my users is very disappointed with our AFS
performance. So I put an additional 200GB HDD into his computer, set the
cache to 175GB ... and it just didn't work. I do not know exactly what the
symptoms were, but if anyone is interested, I can do it again and post what
happens.
OK - back to the most interesting question: What are the theoretical and
practical limits of the cache size on Linux? How do the practical limits
vary between machines accessing lots of small files and hosts accessing some
large files?
Thank you in advance for any information.
Regards,
Frank
From jblaine@kickflop.net Tue Apr 13 16:22:39 2010
From: jblaine@kickflop.net (Jeff Blaine)
Date: Tue, 13 Apr 2010 11:22:39 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC47100.1030802@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com>
Message-ID: <4BC48C3F.6030802@kickflop.net>
IMO, unless a "for parsing" output format is available (never
likely), the existing line-per-site format should be kept and
not altered just for a new command-line option's output.
This isn't a book we're reading for a half hour. It's vos
output. Let the lines pass 80 cols.
You can have the best of both worlds by noting the position
of 's' in server, then padding the end of "server foo" with
spaces to wrap the uuid around properly instead of a
newline... if one is concerned with the 80col thing that much :)
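One possible reading of the padding idea, sketched in Python: keep each site as a single logical line, but when it would overflow the terminal, pad after "server foo" so that the terminal's own wrapping places the UUID on the next row, in the same column as the 's' of "server". The function name wrap_aligned, the 80-column default and the 7-space indent in the usage note are assumptions for illustration, not anything vos does today.

```python
def wrap_aligned(indent, first, rest, width=80):
    """Join 'first' and 'rest' on one logical line.

    If the pair fits in 'width' columns it is joined with a single space.
    Otherwise 'first' is padded with spaces to the end of the terminal row
    plus 'indent' more, so a terminal that wraps at 'width' shows 'rest'
    starting in the same column as 'first' on the following row.
    """
    head = " " * indent + first
    if len(head) >= width or len(head) + 1 + len(rest) <= width:
        return head + " " + rest
    pad = (width - len(head)) + indent
    return head + " " * pad + rest
```

For example, wrap_aligned(7, "server ASCLEPIUS.MIT.EDU", "[0037555a-be36-19a6-a2-4d-5e3c0912aa77] partition /vicepr RW Site") contains no newline, so a script still sees one line per site, while an 80-column terminal shows the bracketed UUID indented by seven columns on the second row.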
From sxw@inf.ed.ac.uk Tue Apr 13 16:25:17 2010
From: sxw@inf.ed.ac.uk (Simon Wilkinson)
Date: Tue, 13 Apr 2010 16:25:17 +0100
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC4797B.3070202@email.unc.edu>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu>
Message-ID:
On 13 Apr 2010, at 15:02, Todd Lewis wrote:
>
> Clearly the multi-line form is easier for humans to read, and the
> related-data-on-one-line form is far simpler for scripts to parse. By far.
> In both cases.
I don't think we should be catering for people parsing command output
(beyond recognising that folk currently do it, and trying not to break
them). Instead, we should focus on making interfaces that provide
properly structured output available - things like the AFS-Perl module
are a great example here.
In terms of what our commands display, we should prioritise human users.
S.
From adeason@sinenomine.net Tue Apr 13 16:34:03 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Tue, 13 Apr 2010 10:34:03 -0500
Subject: [OpenAFS] Re: Modifying the output of vos commands to include server UUIDs
References: <4BC47100.1030802@secure-endpoints.com>
<4BC48C3F.6030802@kickflop.net>
Message-ID: <20100413103403.57b92945.adeason@sinenomine.net>
On Tue, 13 Apr 2010 11:22:39 -0400
Jeff Blaine wrote:
> IMO, unless a "for parsing" output format is available (never
> likely),
'vos ex -format', though I don't think it currently affects that part of
the output.
And it puts things on different lines, which is Todd Lewis'
"harder-to-parse" case anyway.
--
Andrew Deason
adeason@sinenomine.net
From adeason@sinenomine.net Tue Apr 13 19:11:29 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Tue, 13 Apr 2010 13:11:29 -0500
Subject: [OpenAFS] Re: Ubik problem
References: <20100412165845.c5f44f42.adeason@sinenomine.net>
<201004130543.o3D5hlwL015540@ruuvi.it.helsinki.fi>
Message-ID: <20100413131129.b790823c.adeason@sinenomine.net>
On Tue, 13 Apr 2010 08:43:47 +0300 (EEST)
Atro Tossavainen wrote:
> Andrew,
>
> > Atro's problem appears to be that they are advertising the extra
> > addresses, though, not that they're sending packets out over them.
>
> What I don't get is why this would have changed transparently on the
> sun4x_58 server when I changed the *other* db server from sun4x_58 to
> sunx86_510 and from IBM AFS to OpenAFS.
Isn't the sunx86_510 server the one that's reporting extra addresses?
From this:
>>> sunx86_510 # /usr/afs/bin/udebug 128.214.88.114 7002 -long
>>> Host's addresses are: 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1
It just looks like OpenAFS will (currently) ignore NetRestrict lines
with an 'M' in front as a parse error. So your upgraded sunx86_510
machine does not restrict those addresses, and advertises the private
ones. Some of those are lower than the IPs of the sun4x_58 box, so the
sunx86_510 box looks like the new "lowest IP" server.
--
Andrew Deason
adeason@sinenomine.net
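To illustrate the "lowest IP" effect described above, here is a rough sketch that picks the host owning the numerically lowest advertised address, with and without the private addresses being restricted. It is not Ubik's actual election code. The sunx86_510 addresses are the ones from the udebug output above; the sun4x_58 address is a made-up placeholder, since the thread does not give it.

```python
import ipaddress


def lowest_advertised(servers):
    """Return (host, address) for the numerically lowest advertised IPv4 address."""
    pairs = [
        (ipaddress.ip_address(addr), host)
        for host, addrs in servers.items()
        for addr in addrs
    ]
    addr, host = min(pairs)
    return host, str(addr)


# NetRestrict ignored: the private addresses are advertised as well.
unrestricted = {
    "sunx86_510": ["128.214.88.114", "10.0.0.3", "172.16.0.1",
                   "172.17.0.1", "172.18.0.1"],
    "sun4x_58": ["128.214.88.100"],  # hypothetical address, not from the thread
}
# NetRestrict honoured: only the public address remains.
restricted = dict(unrestricted, sunx86_510=["128.214.88.114"])

print(lowest_advertised(unrestricted))
print(lowest_advertised(restricted))
```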
From wayne_greene@unc.edu Tue Apr 13 20:21:29 2010
From: wayne_greene@unc.edu (Wayne Greene)
Date: Tue, 13 Apr 2010 15:21:29 -0400
Subject: [OpenAFS] Windows 7 Sleep behavior-Broken AFS Lock on wakeup
Message-ID: <4BC4C439.3060800@unc.edu>
Hello,
We are using Windows 7 Enterprise, MIT Kerberos for Windows 3.22, and
Open AFS 1-5.6.800 and have encountered a problem for our laptop users
and the sleep function. When the computers wake, the AFS lock symbol in
the system tray shows that it is broken and cannot connect to the AFS
service. Net ID manager shows they have AFS tokens and kerberos tickets
but AFS shares cannot be accessed until the computer is restarted. I
have tried destroying the Network ID credentials, stopping the AFS
service then sleeping the computer, regaining credentials and restarting
the service and after waking up it still doesn't work. I found a power
management option in the Lenovo Power Manager that will keep the network
connection for a specific amount of time during sleep, but that has not
worked either. These so far are Lenovo computers, my test machine is a
ThinkPad R400 but the symptom is the same on a few different models. Is
anyone else experiencing this and/or has anyone found a fix if so? Thanks.
Wayne Greene
Computer Science
UNC Chapel Hill
From jaltman@secure-endpoints.com Tue Apr 13 20:31:10 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Tue, 13 Apr 2010 15:31:10 -0400
Subject: [OpenAFS] Windows 7 Sleep behavior-Broken AFS Lock on wakeup
In-Reply-To: <4BC4C439.3060800@unc.edu>
References: <4BC4C439.3060800@unc.edu>
Message-ID: <4BC4C67E.5020105@secure-endpoints.com>
On 4/13/2010 3:21 PM, Wayne Greene wrote:
> Hello,
>
> We are using Windows 7 Enterprise, MIT Kerberos for Windows 3.22, and
> Open AFS 1-5.6.800 and have encountered a problem for our laptop users
> and the sleep function. When the computers wake, the AFS lock symbol in
> the system tray shows that it is broken and cannot connect to the AFS
> service. Net ID manager shows they have AFS tokens and kerberos tickets
> but AFS shares cannot be accessed until the computer is restarted. I
> have tried destroying the Network ID credentials, stopping the AFS
> service then sleeping the computer, regaining credentials and restarting
> the service and after waking up it still doesn't work. I found a power
> management option in the Lenovo Power Manager that will keep the network
> connection for a specific amount of time during sleep, but that has not
> worked either. These so far are Lenovo computers, my test machine is a
> ThinkPad R400 but the symptom is the same on a few different models. Is
> anyone else experiencing this and/or has anyone found a fix if so? Thanks.
>
> Wayne Greene
> Computer Science
> UNC Chapel Hill
Please file a bug report with Microsoft for this issue. The problem
is not with OpenAFS but with the Netbios Name Resolution. If you run
  nbtstat -n
you will see that the "AFS" <20> Netbios name has been successfully
registered on the Microsoft Loopback adapter (10.254.254.253) and yet
when you attempt to access \\AFS (net view \\afs), Windows reports
that the name cannot be resolved.
This is a bug in Microsoft Windows 7 and Server 2008 R2. Until
Microsoft receives enough reports from paying support customers,
it will not be fixed.
When you file your report, please indicate that Microsoft, when
investigating the problem, can contact openafs-gatekeepers@openafs.org
if they want to discuss the issue privately with the OpenAFS project.
Thank you and I am sorry your users are experiencing this problem.
Jeffrey Altman
From brandon.m.simmons@gmail.com Tue Apr 13 20:36:45 2010
From: brandon.m.simmons@gmail.com (Brandon Simmons)
Date: Tue, 13 Apr 2010 15:36:45 -0400
Subject: [OpenAFS] sqlite on AFS will not work, even with whole-file
Message-ID:
On Sun, Apr 11, 2010 at 11:13 PM, Adam Megacz wrote:
>
> Brandon Simmons writes:
>> Thanks for the response. It seems like whole-file locking in sqlite
>> would be a good choice for me in any case,
>
>> In a situation where the whole-file locking scheme is used, would AFS
>> be an acceptable choice? Would it be better than NFS?
>
> I had the same idea, and tried it. It does not work. Your databases
> will get corrupted. I never figured out why, although I did confirm
> that sqlite was in fact requesting only whole-file locks.
>
> It would be nice if it worked, though. There are a lot of applications
> out there where writes to the database are extremely rare, so
> invalidating all the clients' caches is not a problem.
A couple of questions: I assume you were on a Linux network? Also, how
exactly did you ensure that you were using whole-file locking? I'm still
not even clear, after reading the SQLite docs and the responses here, how
to do that.
Thanks,
Brandon
http://coder.bsimmons.name/blog/
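For readers unsure what a "whole-file" lock means at the system-call level, the sketch below requests an exclusive POSIX advisory lock that covers an entire file instead of a byte range. This is only an illustration of the locking primitive. It is not SQLite's code, it does not show how to configure SQLite, and it says nothing about whether AFS handles the result safely. The helper name `try_whole_file_lock` is made up, and the `fcntl` module is Unix-only.

```python
import fcntl
import os
import tempfile


def try_whole_file_lock(path):
    """Try to take, then release, an exclusive advisory lock on the whole file."""
    fd = os.open(path, os.O_RDWR)
    try:
        # len=0 with start=0 means "from offset 0 to end of file", i.e. the
        # entire file rather than a byte range inside it.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0, os.SEEK_SET)
        fcntl.lockf(fd, fcntl.LOCK_UN, 0, 0, os.SEEK_SET)
        return True
    except OSError:
        # Another process holds a conflicting lock, or locking is unsupported.
        return False
    finally:
        os.close(fd)


with tempfile.NamedTemporaryFile() as tmp:
    print(try_whole_file_lock(tmp.name))
```

A byte-range lock would pass a non-zero length and a specific start offset in the same call.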
From ela@cs.wisc.edu Tue Apr 13 21:59:17 2010
From: ela@cs.wisc.edu (Jacob Ela)
Date: Tue, 13 Apr 2010 15:59:17 -0500
Subject: [OpenAFS] OS X, AFS Home Directories and SSH/Unix Permissions
Message-ID: <393853E2-C3D1-4F6B-854C-ED0E1D06094D@cs.wisc.edu>
Greetings All,
I've been looking for some information on this because someone else has
probably run into a similar issue, but I haven't found much that is
recent or pointed towards solving the problem - though I've found some
old email that suggests where this originates from...
I've got a Mac Mini lab running OSX 10.6.2 and OpenAFS 1.4.11 (but also
have seen this on a MacBook running 10.6.3 and 1.5.73.3). Users' home
directories live in AFS, and users get Kerberos/AFS credentials at
login.
I'm seeing on the Macs that all the unix file permissions on files in
AFS are shown as 666, and from the old emails I've found I'm just
guessing that this is to make AFS ACLs play nicely with the Finder (or
rather the other way around).
This has the unfortunate side effect that my users can't use SSH on the
Macs, as the reported permissions on their ~/.ssh/config file suggest it
is group and world writable. This causes SSH to error out when a user
attempts to connect to another computer because of insecure config file
permissions. Trying to chmod the file from a Mac doesn't change the
unix permissions as they are reported to the Mac, though Linux hosts can
see these new permissions.
Has anyone run into something like this? Is there a way to change the
permissions AFS reports to OSX, or is there a workaround I'm failing to
see?
Thanks for any help,
--
Jacob Ela
Computer Systems Lab
University of Wisconsin-Madison
ela@cs.wisc.edu
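The reason a reported mode of 666 trips SSH can be shown with a small check on the mode bits. This is a sketch of the kind of test involved, assuming the client rejects a config file whose group or world write bit is set. It is not OpenSSH's own code, and the helper name `writable_by_others` is made up.

```python
import stat


def writable_by_others(mode):
    """True if the group or world write bit is set in a st_mode value."""
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


# 0o100666: a regular file with the 666 permissions the Macs report.
print(writable_by_others(0o100666))  # True
# 0o100600: the same file as a Linux host might see it after chmod 600.
print(writable_by_others(0o100600))  # False
```

For a real file, the mode would come from `os.stat(path).st_mode`, which on the affected Macs reflects what the AFS client reports, not what chmod stored.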
From scs@umich.edu Tue Apr 13 22:08:49 2010
From: scs@umich.edu (Steve Simmons)
Date: Tue, 13 Apr 2010 17:08:49 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC4797B.3070202@email.unc.edu>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu>
Message-ID: <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu>
On Apr 13, 2010, at 10:02 AM, Todd Lewis wrote:
> Clearly the multi-line form is easier for humans to read, and the
> related-data-on-one-line form is far simpler for scripts to parse. By far.
> In both cases.
>
> Is there a place on the ballot to vote for... both, with a switch?
I'm a long-time fan of having a switch that causes tools to dump their
data in an easy-to-machine-parse format. That isn't always doable, but
when it is, it's a big win.
From jaltman@secure-endpoints.com Tue Apr 13 22:28:31 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Tue, 13 Apr 2010 17:28:31 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server
UUIDs
In-Reply-To: <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu>
Message-ID: <4BC4E1FF.5050305@secure-endpoints.com>
On 4/13/2010 5:08 PM, Steve Simmons wrote:
>
> On Apr 13, 2010, at 10:02 AM, Todd Lewis wrote:
>
>> Clearly the multi-line form is easier for humans to read, and the
>> related-data-on-one-line form is far simpler for scripts to parse. By far.
>> In both cases.
>>
>> Is there a place on the ballot to vote for... both, with a switch?
>
> I'm a long-time fan of having a switch that causes tools to dump their
> data in an easy-to-machine-parse format. That isn't always doable, but
> when it is, it's a big win.
As Andrew pointed out in another reply in this thread, the -format
switch is supposed to provide that, but it fails to provide a consistent
(value - data) pair per line.
Currently that output looks like:
name root.cell
id 536870915
serv 204.29.154.37 bethlehem.your-file-system.com
part /vicepa
status OK
backupID 0
parentID 536870915
cloneID 536870916
inUse Y
needsSalvaged N
destroyMe N
type RW
creationDate 1242930289 Thu May 21 14:24:49 2009
accessDate 0 Wed Dec 31 19:00:00 1969
updateDate 1269892897 Mon Mar 29 16:01:37 2010
backupDate 0 Wed Dec 31 19:00:00 1969
copyDate 1242930289 Thu May 21 14:24:49 2009
flags 0 (Optional)
diskused 43
maxquota 5000
minquota 0 (Optional)
filecount 38
dayUse 0
weekUse 0 (Optional)
spare2 0 (Optional)
spare3 0 (Optional)
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
server bethlehem.your-file-system.com partition /vicepa RW Site
server bethlehem.your-file-system.com partition /vicepa RO Site
server faultline.your-file-system.com partition /vicepa RO Site
server cpe-24-193-47-88.nyc.res.rr.com partition /vicepa RO Site
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
server bethlehem.your-file-system.com partition /vicepa RW Site
server bethlehem.your-file-system.com partition /vicepa RO Site
server faultline.your-file-system.com partition /vicepa RO Site
server cpe-24-193-47-88.nyc.res.rr.com partition /vicepa RO Site
which reuses logic similar to the non-format output for the per-volume
section. I do notice a bug in the above output in that it lists the
same block twice.
Jeffrey Altman
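As a rough illustration of what a script has to do with the -format output shown above, the sketch below reads the leading "key value" lines into a dictionary, then switches to a regular expression for the site lines and drops the repeated block. It assumes the output looks exactly like the listing in this message; the function name `parse_format_output` and the parsing rules are illustrative only.

```python
import re

SITE_RE = re.compile(r"^server (\S+) partition (\S+) (RW|RO) Site$")


def parse_format_output(text):
    """Split 'vos examine -format'-style text into header fields and sites."""
    fields, sites = {}, []
    in_header = True
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_header:
            key, _, value = line.partition(" ")
            if value:
                fields[key] = value.strip()
            else:
                # A bare volume name ends the key/value section.
                in_header = False
            continue
        match = SITE_RE.match(line)
        if match and match.groups() not in sites:
            # Skip site lines already seen (the block is listed twice).
            sites.append(match.groups())
    return fields, sites


sample = """\
name root.cell
id 536870915
serv 204.29.154.37 bethlehem.your-file-system.com
part /vicepa
type RW
creationDate 1242930289 Thu May 21 14:24:49 2009
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
server bethlehem.your-file-system.com partition /vicepa RW Site
server bethlehem.your-file-system.com partition /vicepa RO Site
server faultline.your-file-system.com partition /vicepa RO Site
server cpe-24-193-47-88.nyc.res.rr.com partition /vicepa RO Site
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
server bethlehem.your-file-system.com partition /vicepa RW Site
server bethlehem.your-file-system.com partition /vicepa RO Site
server faultline.your-file-system.com partition /vicepa RO Site
server cpe-24-193-47-88.nyc.res.rr.com partition /vicepa RO Site
"""

fields, sites = parse_format_output(sample)
print(fields["name"], fields["id"], len(sites))
```

Note that values such as `serv` and `creationDate` still hold several space-separated items each, which is the inconsistency mentioned above.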
From shadow@gmail.com Wed Apr 14 00:02:54 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Tue, 13 Apr 2010 19:02:54 -0400
Subject: [OpenAFS] OS X, AFS Home Directories and SSH/Unix Permissions
In-Reply-To: <393853E2-C3D1-4F6B-854C-ED0E1D06094D@cs.wisc.edu>
References: <393853E2-C3D1-4F6B-854C-ED0E1D06094D@cs.wisc.edu>
Message-ID:
On Tue, Apr 13, 2010 at 4:59 PM, Jacob Ela wrote:
> Greetings All,
>
> I've been looking for some information on this because someone else has
> probably run into a similar issue, but I haven't found much that is
> recent or pointed towards solving the problem - though I've found some
> old email that suggests where this originates from...
>
> I've got a Mac Mini lab running OSX 10.6.2 and OpenAFS 1.4.11 (but also
> have seen this on a MacBook running 10.6.3 and 1.5.73.3). Users' home
> directories live in AFS, and users get Kerberos/AFS credentials at login.
>
> I'm seeing on the Macs that all the unix file permissions on files in AFS
> are shown as 666, and from the old emails I've found I'm just guessing
> that this is to make AFS ACLs play nicely with the Finder (or rather the
> other way around).
>
> This has the unfortunate side effect that my users can't use SSH on the
> Macs, as the reported permissions on their ~/.ssh/config file suggest it
> is group and world writable. This causes SSH to error out when a user
> attempts to connect to another computer because of insecure config file
> permissions. Trying to chmod the file from a Mac doesn't change the unix
> permissions as they are reported to the Mac, though Linux hosts can see
> these new permissions.
>
> Has anyone run into something like this? Is there a way to change the
> permissions AFS reports to OSX, or is there a workaround I'm failing to
> see?
Check out the RealModes setting. Edit
/var/db/openafs/etc/config/settings.plist, and rerun
/var/db/openafs/etc/config/afssettings as root.
--
Derrick
From ela@cs.wisc.edu Wed Apr 14 00:58:20 2010
From: ela@cs.wisc.edu (Jacob Ela)
Date: Tue, 13 Apr 2010 18:58:20 -0500
Subject: [OpenAFS] OS X, AFS Home Directories and SSH/Unix Permissions
In-Reply-To:
References: <393853E2-C3D1-4F6B-854C-ED0E1D06094D@cs.wisc.edu>
Message-ID:
That's what I missed. Looks like it did the trick - I'll try it on the
lab tomorrow.
Thanks!
Jacob Ela
Computer Systems Lab
University of Wisconsin-Madison
ela@cs.wisc.edu
On Apr 13, 2010, at 6:02 PM, Derrick Brashear wrote:
> On Tue, Apr 13, 2010 at 4:59 PM, Jacob Ela wrote:
>> Greetings All,
>>
>> I've been looking for some information on this because someone else
>> has probably run into a similar issue, but I haven't found much that is
>> recent or pointed towards solving the problem - though I've found some
>> old email that suggests where this originates from...
>>
>> I've got a Mac Mini lab running OSX 10.6.2 and OpenAFS 1.4.11 (but
>> also have seen this on a MacBook running 10.6.3 and 1.5.73.3). Users'
>> home directories live in AFS, and users get Kerberos/AFS credentials at
>> login.
>>
>> I'm seeing on the Macs that all the unix file permissions on files in
>> AFS are shown as 666, and from the old emails I've found I'm just
>> guessing that this is to make AFS ACLs play nicely with the Finder (or
>> rather the other way around).
>>
>> This has the unfortunate side effect that my users can't use SSH on
>> the Macs, as the reported permissions on their ~/.ssh/config file
>> suggest it is group and world writable. This causes SSH to error out
>> when a user attempts to connect to another computer because of insecure
>> config file permissions. Trying to chmod the file from a Mac doesn't
>> change the unix permissions as they are reported to the Mac, though
>> Linux hosts can see these new permissions.
>>
>> Has anyone run into something like this? Is there a way to change
>> the permissions AFS reports to OSX, or is there a workaround I'm
>> failing to see?
>
> Check out the RealModes setting. Edit
> /var/db/openafs/etc/config/settings.plist, and rerun
> /var/db/openafs/etc/config/afssettings as root.
>
>
> --
> Derrick
From jason@rampaginggeek.com Wed Apr 14 01:01:49 2010
From: jason@rampaginggeek.com (Jason Edgecombe)
Date: Tue, 13 Apr 2010 20:01:49 -0400
Subject: [OpenAFS] OpenAFS Newsletter, Volume 2, Issue 4, April 2010
Message-ID: <4BC505ED.9040902@rampaginggeek.com>
Here is the April 2010 issue of the OpenAFS Newsletter:
OpenAFS Newsletter, Volume 2, Issue 4, April 2010
Welcome to the twelfth issue of the OpenAFS newsletter. This newsletter
summarizes what is happening in the OpenAFS community.
As always, volunteers, patches, bug reports, or any other type of help
is greatly appreciated.
Feedback on this newsletter is welcome. The goal is to summarize the
various development efforts and news of OpenAFS for the community.
Please let Jason Edgecombe know what you would
like to see out of this newsletter. Any news about AFS-related projects
is welcome and may be submitted to Jason for inclusion in the next
newsletter.
The current and past issues of this newsletter are available at
General OpenAFS Progress
OpenAFS version 1.5.73 was released on March 24. It was followed by
three point releases to fix some issues. The latest version is 1.5.73.3.
The gatekeepers are asking for people to really start testing the 1.5.x
releases on Unix machines to help iron out bugs before 1.6. To help
people with the testing efforts, Russ Allbery has uploaded new Debian
packages:
I've uploaded Debian packages of 1.5.73.3 plus some additional
recent patches to Debian experimental. I should be able to keep the
packages up to date with subsequent 1.5.x releases going forward.
Due to the new libkopenafs1 package, the upload will have to go
through NEW, so it will be a little bit before they show up in
Debian.
--Russ Allbery
The instructions for contributing to OpenAFS have been revised for using
Git. These instructions are in the README.git file in each release.
The growl agent for Mac OS X is included in 1.5.73, but it's not well
integrated yet. Initial feedback is positive and testers are welcome.
Please send any feedback to the port-darwin@openafs.org mailing list.
A new version of the AFS PERL API was released. For more information and
downloads, go to
Events
Annual Best Practices Workshop
Plans are already underway for the seventh Workshop, to be held May
24-28, 2010, at the University of Illinois at Urbana-Champaign. We hope
to see you there.
Web site:
Register by April 14, 2010 to get the best prices. AFS and Kerberos
tutorials are $100 each, the Workshop itself is $150, or register for
all three for only $300.
After April 14, prices will go up, so register now and save.
A tentative schedule is available. Further details, including social
events, are still forthcoming.
Hotel and travel information is also available.
We'll be looking forward to meeting you at Illinois next month!
European AFS Conference
The date for the 3rd European AFS & Kerberos Conference has been set.
The conference will take place in Pilsen, Czech Republic, from September
13 to September 15, 2010. More details are forthcoming and will be
posted at . The conference is being hosted by
Centre for Information Technology, University of West Bohemia.
AFS Protocol Standardization
Informal drafts that haven't been uploaded to the IETF web site:
Rx Spec:
This draft is in the very early stages. Mike Meffie and Tom Keiser are
the current owners of this proposal. Nickolai Zeldovich wrote the
original draft. Mike and Tom have started updating the draft with
Nickolai's permission. A formal specification of Rx is needed as a
basis for other IETF proposals.
Discussion on these proposals is welcome and should be done on the
AFS3-standardization list at
PTS Alternate Authentication
Status: Active - Third call for review
Last Update: November 18, 2009
Expires: May 22, 2010
AFS Callback Extensions
Status: Active - Waiting on RPC refresh
This proposal will be rewritten with references to the RPC time refresh.
Last update: September 23, 2009
Expires: March 23, 2010
DNS SRV Resource Records for AFS
Status: Submitted to IETF
Still in the RFC Editor queue, waiting for them to have a chance to work
on it.
--Russ
RXGK
Status: Active
Rxgk is a security layer for AFS which will support strong encryption
and authentication through Kerberos v5, GSI and any other GSSAPI
security mechanism.
Changes which are considered suitable for the 1.5.x series are in git -
look for changes with author sxw@your-file-system.com. A development
tree, which will be frequently rebased, is at
http://github.com/your-file-system/openafs-rxgk
Last Update: Jan 9, 2010
AFS3 ACL Rights
Status: Second draft
Last update: Jan 13, 2010
See the Per-File ACLs section for more info.
Rx Security Object Providing Cleartext Peer Identity Assertions
Status: Second draft
Last Update: February 1, 2010
The Rx clear security class will improve on the rxnull security object
by eliminating certain race conditions related to IPv4 address changes.
A -02 revision of this internet draft will be forthcoming in the next
few days, which will update the introduction and security considerations
sections of the memo. Everyone is invited to review this document, and
comments should be sent to the afs3-standardization@grand.central.org
mailing list.
AFSVol Tag-Length-Value Remote Procedure Call Extensions
Status: Second Draft
Last Update: April 6, 2010
As new forms of metadata are added to AFS volumes, we are running into
limitations with the wire volume metadata structures used by the volume
server. This internet draft aims to standardize a tag-length-value (TLV)
encoding for arbitrary AFS volume metadata. A new version of this draft
was released on April 6th, 2010. Everyone is invited to review and
comment on this document. Comments should be sent to the
afs3-standardization@grand.central.org mailing list.
--Tom
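As a rough illustration of the tag-length-value idea described above,
here is a minimal sketch. It is not the wire format from the internet
draft: the 4-byte big-endian tag and length fields, and the function
names tlv_encode and tlv_decode, are assumptions made only for this
example.

```python
import struct

# Assumed layout for this sketch only: 4-byte big-endian tag,
# 4-byte big-endian length, then `length` bytes of value.
HEADER = struct.Struct(">II")


def tlv_encode(items):
    """Encode an iterable of (tag, value_bytes) pairs into one buffer."""
    out = bytearray()
    for tag, value in items:
        out += HEADER.pack(tag, len(value))
        out += value
    return bytes(out)


def tlv_decode(buf):
    """Decode a buffer produced by tlv_encode into (tag, value) pairs."""
    items = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < HEADER.size:
            raise ValueError("truncated TLV header")
        tag, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        if len(buf) - offset < length:
            raise ValueError("truncated TLV value")
        items.append((tag, buf[offset:offset + length]))
        offset += length
    return items
```

Because every record carries its own length, a decoder can step over
tags it does not recognize, which is what makes this kind of encoding
attractive for metadata that grows over time.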
Projects
Demand-Attach FileServer (DAFS)
Project Contacts:
* Andrew Deason
* Tom Keiser
* Mike Meffie
Gerrit 1406 (per-volume locks) has been merged along with the related
changes, and thus we can now salvage at the same time as volume
operations as well as other salvages. Gerrit 1092 is unfortunately still
not merged; we encourage additional review from anyone who can. Current
DAFS development involves adding background I/O threads to the salvager
code, and later making demand-salvages spawn as threads instead of
processes. Code for that should be available shortly.
--Andrew
Better Documentation
Project Contacts:
* Russ Allbery
* Jason Edgecombe
Davor Ocelic is working on writing man pages for the new demand-attach
binaries that aren't yet documented. The man page for state_analyzer has
been committed.
Pthreaded Ubik
Project Contacts:
* Steven Jenkins
* Andrew Deason
* Alistair Ferguson
Gerrit 1546 (add locks for addresses and cheader) has been submitted for
review, and with that, we now believe the vlserver to be thread-safe.
Since the vlserver thread-safety issues were the only known pthreaded
ubik issues, after 1546 we are aware of no further issues with pthreaded
ubik.
--Steven
We're still discussing what solution to use in gerrit 1546 (add locks
for the vldb ubik cache). Additional problems affecting pthreaded ubik
were found and fixed in gerrit 1680 (kill afs_inet_ntoa) and 1681
(correct use of flags_cond and version_cond).
--Andrew
Once this work is fully merged into the mainline OpenAFS code, it will
likely be dropped as a separate project from the newsletter within a
couple of issues.
Kerberos v5 and multiple encryption types
Project Contacts:
* Matt Benjamin
* Marcus Watts
I was hoping to have some time this week - got distracted by other
matters. I do have one change of interest: the "tokens expired" message
which formerly looked like this:
Feb 22 10:23:08 lancashire kernel: afs: Tokens for user of AFS id
555 for cell cats.umich.edu expired now
(where 555 was a fixed constant because the cache manager doesn't know
what viceid the user has), now looks like this:
Apr 3 04:55:37 lancashire kernel: afs: Tokens for
mdw@CATS.UMICH.EDU for cell cats.umich.edu expired now
I think that's an improvement.
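For anyone who scans syslog for these messages, the following sketch
matches both the old and the new form. The regular expression is
inferred only from the two sample lines above, and it assumes each
message arrives on a single line; parse_token_expiry is a hypothetical
helper name, not part of OpenAFS.

```python
import re

# Pattern inferred from the two sample messages; other message
# variants may exist and would need their own handling.
TOKEN_EXPIRED = re.compile(
    r"afs: Tokens for "
    r"(?:user of AFS id (?P<viceid>\d+)|(?P<principal>\S+)) "
    r"for cell (?P<cell>\S+) expired"
)


def parse_token_expiry(line):
    """Return the fields found in a token-expiry message, or None."""
    match = TOKEN_EXPIRED.search(line)
    if match is None:
        return None
    # Keep only the groups that matched (viceid or principal, plus cell).
    return {k: v for k, v in match.groupdict().items() if v is not None}
```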
Heimdal has some annoying weirdness with key types and checksums. In
1.3.1 (at least), the verify-checksum logic insists that only one
checksum algorithm is acceptable per key type. The standards documents
do not forbid this interpretation, but don't exactly require it either.
Sorting out acceptable checksum algorithms between various Kerberos
distributions continues to be a problem.
I'm hoping to do a code drop soon. It will probably be a complete copy
of the source, not just diffs. I was overly aggressive about fixing
tabs and now have to "undo" some fixes to patches. Blech.
--Marcus
Per-File ACLs
Project Contacts:
* Marc Dionne
Current status:
* I plan to give a talk on the topic at the upcoming workshop
--Marc
Mac OS X OpenAFS Preference Pane
Project Contact:
* Claudio Bisegni
The preference pane has been updated to allow the renewal of Kerberos
tickets. The GUI and afs backgrounder were updated to accommodate this
change.
--Claudio
*BSD Support
Project Contacts:
* Matt Benjamin
Commit 028240329c09b6a311cb85736f41d75f7ee7a01f deals with some updated
VFS calls in FreeBSD.
Userspace cache manager
Project Contact:
* Andrew Deason
I've finally managed to find time to work on this again, resulting in
gerrit changes 1714-1726 which give a FUSE OpenAFS/libuafs client.
Functionality improvements over previous libuafs code include fixed
support for AFSDB and fakestat. Support will be forthcoming for some
pioctl operations (lsmount, rmmount, getacl, setacl, checkservers) and
perl SWIG bindings.
I will be presenting at AFSBPW 2010 about libuafs, on its potential uses
and how to use it.
--Andrew
S3 Front-end for AFS
Project Contacts:
* Fabrizio Manfredi
* Claudio Bisegni
We have a generic implementation in alpha test, without authentication
and specific AFS ACL support. We hope to release a first public beta at
the end of April (without authentication).
--Fabrizio
Virtual Machine Images
Project Contact:
* Fabrizio Manfredi
The Virtual Machine Images have been updated: the operating system is
now CentOS 5.4 with OpenAFS 1.4.12, and the new distribution also
includes the AFS Perl API with example scripts. If you want, you can
downgrade to OpenAFS 1.4.11 with a simple snapshot rollback. The images
are in VMware format only; you can download them from:
http://sourceforge.net/projects/s3afs/files/openafs-1.4.12-vm
--Fabrizio
Google Summer of Code 2010
Google will be doing their Summer of Code again in 2010. We're proud to
announce that for the third year, OpenAFS will be participating as a
mentoring organization.
--Simon
Accepted student proposals will be announced on April 26.
Projects with no progress or no update
Each project without progress this month is listed along with the month
of the last update.
* Rx OSD integration & Raw Vicep Access in Clients - August 2009
* Active Directory Backend for Ptserver - November 2009
* SetAG - December 2009
* Extended Callback Information - January 2010
* Disconnected AFS support - February 2010
Gerrit Activity
To review a change, go to http://gerrit.openafs.org/#change,NUM where
NUM is the Change# shown in the lists below.
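The URL rule above can be applied mechanically to rows of the lists
below. This is a small sketch; the helper names are hypothetical, and
it assumes the first parenthesized number in a row is the Change#.

```python
import re

GERRIT_BASE = "http://gerrit.openafs.org/#change,"

# Assumption: the first "(digits)" group in a row is the Change#.
CHANGE_RE = re.compile(r"\((\d+)\)")


def gerrit_change_url(change):
    """Build the review URL for a Change# given as an int or string."""
    number = int(change)
    if number <= 0:
        raise ValueError("Change# must be positive")
    return f"{GERRIT_BASE}{number}"


def change_url_from_row(row):
    """Return the review URL for one list row, or None if no Change#."""
    match = CHANGE_RE.search(row)
    return gerrit_change_url(match.group(1)) if match else None
```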
Statistics
Number of patches waiting for review: 35 (last month: 50)
Patches merged into the master branch:
Month      Number of Commits
2010-04    53 (Partial month)
2010-03    140
2010-02    156
2010-01    103
2009-12    72
2009-11    85
2009-10    154
2009-09    142
2009-08    78
2009-07    181
Patches merged into the stable branch:
Month      Number of Commits
2010-04    2 (Partial month)
2010-03    28
2010-02    35
2010-01    11
2009-12    92
2009-11    21
2009-10    7
2009-09    8
2009-08    17
2009-07    5
Patches waiting for review
Date Author Change# Description
2010-04-11 Jonathan A. Kollasch (1738) NetBSD 5.0 support.
2010-04-11 Tharidu Fernando (1736) Windows: Secure C String usage
in src\WINNT\afsd\fs.c
2010-04-10 Andrew Deason (1614) Add the Jabber MUC to the
support page
2010-04-10 Andrew Deason (1723) Split afsd into afsd.c and
afsd_kernel.c
2010-04-09 Marc Dionne (1640) Fileserver capabilities support
for the UNIX client
2010-04-09 Andrew Deason (1725) Add a FUSE implementation for afsd
2010-04-09 Andrew Deason (1724) Make libuafs usable with afsd.o
2010-04-09 Andrew Deason (1726) Allow afsd.fuse to build on
darwin / amd64 linux
2010-04-05 Benjamin Kaduk (1691) Add entries for FBSD 8.1 and 9.0
2010-03-31 Andrew Deason (1546) Add locks around updating the
VLDB ubik cache
2010-03-28 Derrick Brashear (1333) byte-range lock warning should
include pid
2010-03-26 Rainer Toebbicke (1311) Lockless path through
afs_linux_dentry_revalidate
2010-03-23 Derrick Brashear (1625) preliminary support for pinned
vcaches
2010-03-19 Michael Meffie (215) rxdebug: show delayed abort
packet count for rx peers
2010-03-17 Michael Meffie (1562) ihandle positional read and write
2010-03-17 Simon Wilkinson (1581) Linux Keyrings: don't ignore
error code from session keyring creation
2010-03-17 Derrick Brashear (1553) dynamic volume allocation
2010-02-25 Michael Meffie (1092) DAFS: avoid volume lock
contention during initialization
2010-02-24 Simon Wilkinson (1392) More warnings cleanup
2010-02-24 Jacob Thebault-Spieker (433) Add throughput framework to
cm_RankServer()
2010-02-23 Anders Kaseorg (1373) Adjust afs_lockctl to compensate
for byte-range lock fixes
2010-02-15 Michael Meffie (1001) return an error from afs_readdir
when out of buffers
2010-02-06 Dan Hyde (1212) VTRANS_LOCK not needed in TryUnlock
2010-02-03 Dan Hyde (1191) runningCalls: VOL_COUNT_LOCK vs
VTRANS_LOCK
2010-02-03 Derrick Brashear (1172) linux mmap anti-deadlock
shouldn't break StoreAllSegments
2010-02-03 Derrick Brashear (1201) basic kernel event system for afs cm
2010-02-02 Simon Wilkinson (1072) Unix CM: Conflate
rxfs_[store,fetch]Variables
2010-01-20 Simon Wilkinson (1074) Unix CM: Include memcache's tiov
in rxfs_context
2009-11-29 Andrew Deason (875) Make ubik use unsigned addresses
2009-11-18 Andrew Deason (709) Break origin's callback for
RXAFS_Rename target
2009-11-04 Andrew Deason (436) Avoid unnecessarily updating ..
in SAFSS_Rename
2009-09-09 Matt Benjamin (435) clear stat flag on renamed
directories
2009-08-29 Matt Benjamin (376) K5SSL by Marcus Watts
2009-07-29 Michael Meffie (147) Fix bosserver directory creation
2009-07-24 Hartmut Reuter (70) preparing rxosd integration:
change in AFSFetchStatus
Patches merged into the master branch
Date Author Change# Description
2010-04-10 Matt Smith (1737) Fix problems from afs_osi_gcpags
reorganization
2010-04-10 Michael Meffie (1735) afsmonitor: fix segv on exit
2010-04-10 Michael Meffie (1734) afsmonitor: show busy counts
2010-04-10 Marc Dionne (1733) Fix UKERNEL build error -
include afs/afs_osi.h
2010-04-09 Matt Smith (1727) Move contents of afs_osi_gcpags
to per-OS files
2010-04-09 Andrew Deason (1679) Correct incorrect type-punning fixes
2010-04-09 Michael Meffie (1731) afsmonitor: add fs callback
xstats collection
2010-04-09 Michael Meffie (1730) afsmonitor: avoid showing full
perf stats garbage
2010-04-09 Derrick Brashear (1729) ukernel osi prototypes header
2010-04-09 Andrew Deason (1722) UKERNEL: allow creation of
non-detached threads
2010-04-09 Andrew Deason (1721) Use AFS_CACHE_VNODE_PATH for UKERNEL
2010-04-09 Andrew Deason (1714) Make osi_GetTime work on 64-bit
libuafs
2010-04-09 Andrew Deason (1720) afsd: squash inode format warning
2010-04-09 Andrew Deason (1719) UKERNEL: prototype uafs_Shutdown
2010-04-09 Andrew Deason (1718) UKERNEL: Use real vnode type
constants
2010-04-09 Andrew Deason (1717) UKERNEL: check for null
afs_CurrentDir on shutdown
2010-04-09 Andrew Deason (1716) UKERNEL: add uafs_statvfs
2010-04-09 Andrew Deason (1715) Prevent uafs_readdir/closedir
segfault
2010-04-09 Russ Allbery (1713) Update Debian packaging files
2010-04-09 Russ Allbery (1712) Add OpenAFS-debug.*.plist to
.gitignore
2010-04-08 Michael Meffie (1601) pts mem -expandgroups option
2010-04-08 Michael Meffie (1600) pts mem -supergroup option
2010-04-07 Russ Allbery (1710) Explain in CellServDB man page
that server lines can be omitted
2010-04-07 Simon Wilkinson (1705) Linux: kmap() not page_address()
2010-04-07 Andrew Deason (1709) Fix typo in bos_create manpage
2010-04-07 Rod Widdowson (1708) Make tests/afcp compile cleanly
2010-04-07 Russ Allbery (1706) Reallocate memory in aklog for
the AFS ID string
2010-04-07 Russ Allbery (1704) Make src/rx/rx.c not executable
2010-04-07 Russ Allbery (1707) Improve demand-attach fileserver
bos documentation
2010-04-06 Jeffrey Altman (1702) Windows: Support new Cygwin
docbook stylesheet location
2010-04-06 Jeffrey Altman (1696) Windows: WinTorture Verbose mode
display all logged messages
2010-04-06 Jeffrey Altman (1701) Windows: permit documentation to
be built without binaries
2010-04-06 Jeffrey Altman (1699) Windows: tag is listitem not
llstitem
2010-04-06 Derrick Brashear (1700) make openafs 1.5.73.3
2010-04-06 Derrick Brashear (1698) macos bulkstat avoid reclaiming
vnodes
2010-04-06 Derrick Brashear (1690) avoid macos bulkstat vlru when
no non-dead vnodes exist
2010-04-06 Derrick Brashear (1693) panic generation update
2010-04-06 Jeffrey Altman (1695) Windows: cm_UpdateVolumeLocation
!append exts to num vol names
2010-04-06 Jeffrey Altman (1697) Rx: Remove conn_call_lock
contention between rx_NewCall and rx_EndCall
2010-04-05 Aditya Sarawgi (1694) Replace kmodstat by kldstat
2010-04-05 Jeffrey Altman (1685) Fix usage of RX_CALL_TQ_WAIT flag
2010-04-05 Derrick Brashear (1682) rx_ClearTransmitQueue should
signal waiters when flushing
2010-04-05 Derrick Brashear (1692) macos panic decoder update
2010-04-02 Derrick Brashear (1687) macos 32 bit platform user
address transform
2010-04-02 Derrick Brashear (1688) make 1.5.73.2
2010-04-02 Derrick Brashear (1684) freebsd switch back to
condvar-based sleep
2010-04-02 Derrick Brashear (1686) macos installer pane warning fix
2010-04-02 Andrew Deason (1681) tubik: Correct use of flags_cond
and version_cond
2010-04-02 Andrew Deason (1680) Kill afs_inet_ntoa
2010-04-02 Derrick Brashear (1683) freebsd glock assertions
2010-04-01 Andrew Deason (1678) fssync-debug: fix
strict-aliasing problems
2010-04-01 Simon Wilkinson (1645) Fix formatting issues in src/afs
2010-04-01 Benjamin Kaduk (1677) Set a storeOps storeproc for the
memcache case
2010-03-31 Benjamin Kaduk (1676) Fix build for FBSD80
2010-03-31 Benjamin Kaduk (1675) Update to the new thread world
order for FBSD
2010-03-31 Benjamin Kaduk (1674) Include limits.h for FBSD
2010-03-31 Derrick Brashear (1670) openafs 1.5.73.1
2010-03-31 Benjamin Kaduk (1672) Make GCPAGs_perproc_func cleaner
for FBSD case
2010-03-31 Jonathan Billings (1671) Updated RedHat RPM spec file to
include unreferenced files
2010-03-30 Jonathan Billings (1669) Move restorevol to bin from
sbin in make dest
2010-03-30 Derrick Brashear (1668) darwin notify avoid reentrant
vfs context panic
2010-03-30 Russ Allbery (1667) Update VCS instructions for Git
2010-03-30 Benjamin Kaduk (1665) Catch up to dynamically-sized
cr_groups in FBSD80
2010-03-29 Davor Ocelic (1666) Minor state_analyzer manpage
corrections
2010-03-29 Rod Widdowson (1649) Render the IP address for the
"Ubik: Lost contact with sync-site" log message in the same way that all
other IP addresses are (via afs_inet_ntoa, rather than stripping the
buytes out in a manner which assumes a specific endianism).
2010-03-29 Davor Ocelic (1655) Initial; add state_analyzer manpage
2010-03-28 Simon Wilkinson (1042) Linux: Replace
invalidate_inode_pages
2010-03-28 Jeffrey Altman (1664) Windows: buffers whose offsets
are beyond EOF should be zero filled and locally allocated
2010-03-27 Claudio Bisegni (1656) GUI Update for Kerberos Ticket Renew
2010-03-27 Derrick Brashear (1663) aklog pt error table warning fix
2010-03-27 Derrick Brashear (1661) aklog more error tables
2010-03-27 Chas Williams - CONTRACTOR (1080) LINUX: you dont need
to memset() after allocating credentials
2010-03-25 Jeffrey Altman (1660) Windows: afslogon.dll vs windows 7
2010-03-25 Jeffrey Altman (1659) Windows: aklog must reset viceId
to 0 before pr_CreateUser call
2010-03-25 Jeffrey Altman (1658) Windows: output pt error
messages as strings
2010-03-24 Derrick Brashear (1651) growl agent should handle port busy
2010-03-24 Derrick Brashear (1654) avoid double-free cell name
canonicalization
2010-03-24 Derrick Brashear (1648) afsdump warning killing
2010-03-24 Simon Wilkinson (1647) Linux : Apply more dget_parent()
pixie dust
2010-03-24 Derrick Brashear (1642) make 1.5.73 relnotes
2010-03-24 Derrick Brashear (1620) openafs 1.5.73 version strings
2010-03-24 Jeffrey Altman (1521) Updating UserGuide with Kerberos
v5 authentication
2010-03-24 Asanka Herath (1633) Windows: Use a timestamp for the
minidump filename
2010-03-24 Asanka Herath (1632) Windows: Monitor requests and
gather diagnostics before a timeout
2010-03-24 Derrick Brashear (1641) add missed release notes
2010-03-24 Jeffrey Altman (1636) Windows: changelog for 1.5.73
2010-03-23 Jeffrey Altman (1639) Windows: cm_attrs_t requires
inclusion of cm_vnodeops.h
2010-03-23 Jeffrey Altman (1638) Windows LWP and UNIX LWP do not
have the same lwp_cpptr structure
2010-03-23 Marc Dionne (1637) Warning fix: print burstWait fields
2010-03-23 Marc Dionne (1635) Fix #ifdef typo
2010-03-23 Marc Dionne (1634) Define __USE_XOPEN conditionally
2010-03-23 Asanka Herath (1602) Windows: Make default mode bits
configurable
2010-03-23 Derrick Brashear (1629) remove vnop needs discon lock
2010-03-23 Claudio Bisegni (1606) Develop Kerberos renew system
for ticket
2010-03-23 Derrick Brashear (1631) kill MultiBreakVolumeCallBack too
2010-03-23 Andrew Deason (1628) Remove BreakVolumeCallBacks
prototype
2010-03-23 Russ Allbery (1589) vldb_check man page should say
-vheader, not -pheader
2010-03-23 Andrew Deason (1550) vos: correct syncvldb -verbose
server byte order
2010-03-23 Derrick Brashear (1547) make tryevalfakestat really not
block
2010-03-23 Derrick Brashear (1315) viced remove dead
BreakVolumeCallBacks function
2010-03-23 Andrew Deason (1559) vos: Avoid LWP stack overflow
error on SIGINT
2010-03-23 Andrew Deason (1558) vos: Use IOMGR_SoftSig for signals
2010-03-23 Andrew Deason (1557) vos: Mark longjmp-used variables
as 'volatile'
2010-03-23 Russ Allbery (1617) Fix strict aliasing problems or
add -fno-strict-aliasing
2010-03-22 Andrew Deason (1582) Use AC_USE_SYSTEM_EXTENSIONS
2010-03-22 Derrick Brashear (1590) aix mount failure unlock and
seterror
2010-03-22 Derrick Brashear (1386) update link order
2010-03-22 Simon Wilkinson (1340) XDR: Stop the madness
2010-03-22 Russ Allbery (1616) Use sigset_t and sigfillset
instead of memset
2010-03-22 Russ Allbery (1615) Move non-executable stack
assembly code to end of file
2010-03-22 Derrick Brashear (1599) multibreak callbacks add host
marking
2010-03-21 Derrick Brashear (1611) aix vfs table entry in rc script
2010-03-21 Derrick Brashear (1610) salvage variable initialization
2010-03-21 Derrick Brashear (1598) comment assumptions in lih0_r
2010-03-21 Andrew Deason (1235) Create missing root directory
when ORPH_ATTACH
2010-03-21 Derrick Brashear (1609) aix krb5 error message handling
2010-03-21 Derrick Brashear (1608) panic prototype for aix 6
2010-03-21 Simon Wilkinson (1577) Don't count root session
keyrings against quota
2010-03-20 Derrick Brashear (451) macos fsevents hinting
2010-03-19 Jeffrey Altman (1531) afsadminutil: translate krb5
error messages on Windows
2010-03-19 Andrew Deason (1596) volume_inline.h does not need
sys/file.h
2010-03-19 Andrew Deason (1406) DAFS: Replace partition locks
with volume locks
2010-03-19 Derrick Brashear (1594) macos uninstall redux
2010-03-19 Derrick Brashear (1593) update macos uninstaller
2010-03-19 Dan Hyde (1213) VOL_LOCK needed when traversing
DiskPartitionList
2010-03-18 Benjamin Kaduk (1587) Catch up with FBSD80's removal
of thread argument to VFS calls
2010-03-18 Derrick Brashear (1572) aix vnode hold simplification
2010-03-18 Derrick Brashear (1584) regain glock on storedata error exit
2010-03-18 Derrick Brashear (1583) kill apsl afssettings and fstab
2010-03-18 Evan Broder (778) Increase the maximum number of
sysnames
2010-03-17 Andrew Deason (1405) Add code for locking individual
volumes on disk
2010-03-17 Benjamin Kaduk (1576) Avoid panic on shutdown with
memcache and INVARIANTS
2010-03-17 Benjamin Kaduk (1560) Allocate and free backing store
for event mutices
2010-03-16 Derrick Brashear (1573) rx nat event connection reference
2010-03-16 Marc Dionne (1575) growlagent: remove generated
Makefile with make distclean
2010-03-15 Antoine Verheijen (1574) Remove return of value for
afs_MarinerLogFetch()
2010-03-15 Derrick Brashear (1554) freebsd per-event mutexes
2010-03-15 Andrew Deason (1545) vlserver: make rxinfo threadsafe
2010-03-15 Derrick Brashear (1538) afsdb lookup shouldn't leak
memory on realname lookup
2010-03-15 Michael Meffie (1570) squash warning in db_verify
2010-03-13 Jeffrey Altman (1567) Windows: warnings removal for
afskfw.c
2010-03-13 Jeffrey Altman (1529) Windows: afskfw - conditionalize
use of krb5_get_error_message for KFW 3.1 and earlier
2010-03-13 Jeffrey Altman (1530) Windows: netidmgr -
conditionalize use of krb5_get_error_message for KFW 3.1 and earlier
2010-03-11 Derrick Brashear (1561) macos dropbox fix for finder
2010-03-10 Derrick Brashear (1551) dkms configure correctly
2010-03-10 Andrew Deason (1556) Squash pthreaded vos warnings
2010-03-10 Simon Wilkinson (1555) Don't always use the local cell
for db clients
2010-03-09 Simon Wilkinson (1552) Update RPM CellServDB
2010-03-09 Andrew Deason (1549) udebug: Fix byte ordering of
last yes host
2010-03-09 Andrew Deason (1548) vldb_check: do not ntohl u_chars
2010-03-09 Andrew Deason (1404) Add FSYNC_VerifyCheckout
2010-03-09 Andrew Deason (1376) Add DAFS documentation overview
for developers
2010-03-09 Andrew Deason (1358) Add VLockFileReinit
2010-03-09 Andrew Deason (1357) VLockFile: add a couple of asserts
2010-03-09 Andrew Deason (1356) Schedule all salvages via
VScheduleSalvage_r
2010-03-09 Andrew Deason (1349) Add FSSYNC debug logging
2010-03-09 Andrew Deason (1390) Move *SYNC string translation
out of fssync-debug
2010-03-09 Andrew Deason (1348) Do not rely on vol header for
V*VolumeHandles_r
2010-03-09 Derrick Brashear (1544) darwin report kext load address
2010-03-09 Benjamin Kaduk (1541) Export prototypes for
osi_fbsd_{alloc,free} for use in rx
2010-03-09 Benjamin Kaduk (1540) Use correct types for UFS devices
2010-03-09 Benjamin Kaduk (1539) Use the correct API for msleep()
in FBSD's afs_osi_TimedSleep()
2010-03-09 Benjamin Kaduk (1526) FBSD build finishes for me
2010-03-08 Derrick Brashear (1537) afsconf srv lookup fill cellname
2010-03-07 Derrick Brashear (1533) Begin support for OpenBSD 4.7
2010-03-07 Derrick Brashear (1532) OpenBSD: eliminate use of VREF()
macro
2010-03-07 Benjamin Kaduk (1528) Be type correct in
osi_ThreadUnique() for FBSD
2010-03-07 Benjamin Kaduk (1527) FBSD module loads now
2010-03-06 Jeffrey Altman (1520) Windows: use
krb5_get_error_message instead of error_message
2010-03-06 Simon Wilkinson (1524) Linux: Make keyring destructor
remove all tokens
2010-03-06 Simon Wilkinson (1525) Linux: Fix builds on RHEL4
2010-03-06 Marc Dionne (1523) Linux: replace
invalidate_inode_pages
2010-03-06 Jeffrey Altman (1519) Windows: use
krb5_get_error_message to translate krb5 errors in afskfw library
2010-03-06 Jeffrey Altman (1518) Windows: use
krb5_get_error_message in netidmgr_plugin
2010-03-06 Jeffrey Altman (1517) Windows: Add krb5 error message
functions to loadfuncs header
2010-03-05 Jeffrey Altman (1514) Windows: reset local mount point
count during freelance re-initialization
2010-03-05 Simon Wilkinson (1522) Linux : Don't leak GLOCK when
writing CellServDB
2010-03-05 Derrick Brashear (1513) add growl agent for macos
2010-03-05 Derrick Brashear (1511) darwin afshelper fix startup check
2010-03-04 Derrick Brashear (1512) evalmount copy out volid for sure
2010-03-03 Derrick Brashear (1507) macos shutdown consistent behavior
2010-03-03 Derrick Brashear (1505) add user warning facility via
mariner for macos
2010-03-03 Derrick Brashear (1504) support mariner messages sans vcache
2010-03-03 Derrick Brashear (1503) rewrite marinerlogfetching
2010-03-03 Derrick Brashear (1502) restore mariner storing message
2010-03-03 Derrick Brashear (1501) de-printf the cache manager
2010-03-03 Jason Edgecombe (1308) Add a section on how to tune the
AFS cache using xstat_cm_test
2010-03-03 Marc Dionne (1506) Remove duplicate make targets in
tubik, cleanup dependencies
2010-03-02 Derrick Brashear (1500) darwin vfsops ansification
2010-03-02 Derrick Brashear (1499) afs_util don't use printf
2010-03-02 Derrick Brashear (1371) BOP_MOVE and userspace move
EXDEV helper
2010-03-01 Claudio Bisegni (1492) OSXPreferencePane
checkAfsStatusForStartup method modification for search /afs volume for
determinate if afs is on has been transfered into checkAfsStatus.
checkAfsStatusForStartup method is used to check when afs start axitn
system startup. Anyway these
2010-03-01 Marc Dionne (1489) Don't pass NULL to strcmp
Patches merged into the stable branch
Date Author Change# Description
2010-04-09 Hans-Werner Paulsen (1711) Build and install PIC
versions of libafsrpc and libafsauthent
2010-04-01 Dan Hyde (1595) VOL_LOCK needed when traversing
DiskPartitionList
2010-03-30 Simon Wilkinson (1580) Linux: don't count pag keys
against root's keyring quotas
2010-03-25 Marc Dionne (1657) Print rxdebug statistics as
unsigned values
2010-03-24 Evan Broder (1650) Increase the maximum number of
sysnames
2010-03-23 Dan Hyde (1588) volmonitor keep vtrans lock
2010-03-23 Andrew Deason (1627) Add support for amd64_obsd46
2010-03-23 Andrew Deason (1626) Add amd64 subtarget for OpenBSD
2010-03-23 Andrew Deason (1624) libafs: WRITEPAGE_ACTIVATE is
2.6-only
2010-03-23 Andrew Deason (1613) Create missing root directory
when ORPH_ATTACH
2010-03-23 Andrew Deason (1623) libafs: afs_backing_dev_info is
2.6-only
2010-03-23 Andrew Deason (1622) libafs: Remove some unused functions
2010-03-22 Russ Allbery (1618) Move non-executable stack
assembly code to end of file
2010-03-18 Dan Hyde (1586) volmonitor copy link before
calling free
2010-03-17 Andrew Deason (1368) h_TossStuff_r: make sure host
does not go away
2010-03-17 Andrew Deason (1367) h_TossStuff_r: check held-ness
after lock
2010-03-15 Michael Meffie (1571) Avoid 'static __inline' on HPUX
2010-03-10 Andrew Deason (1370) Allow GetSomeSpace_r to select
an optimal host
2010-03-10 Andrew Deason (1369) Remove lih_r
2010-03-08 Derrick Brashear (1536) remove fc_test from normal build
2010-03-08 Derrick Brashear (1535) openafs 1.4.12
2010-03-07 Antoine Verheijen (1510) Begin support for OpenBSD 4.7
2010-03-07 Antoine Verheijen (1509) OpenBSD: eliminate use of
VREF() macro
2010-03-05 Derrick Brashear (1516) darwin afshelper fix startup check
2010-03-05 Derrick Brashear (1515) correct cred mgmt typo
2010-03-03 Derrick Brashear (1508) remove the force.. comments
2010-03-03 Derrick Brashear (1498) Linux: bdi doesn't always have a
name
2010-03-03 Derrick Brashear (1497) linux bdi allocate memory
2010-03-01 Derrick Brashear (1494) OSXPreferencePane
checkAfsStatusForStartup method modification for search /afs volume for
determinate if afs is on has been transfered into checkAfsStatus.
checkAfsStatusForStartup method is used to check when afs start axitn
system startup. Anyway these
2010-03-01 Derrick Brashear (1493) macos prefs pane more reliable
running indicator
Resolved Tickets
Here is a list of tickets that have been resolved since March 1, 2010:
ticket # state created title
21423: resolved Sep 07, 2005 Enhanced auto CACHESIZE code for
afs.rc.linux
22608: resolved Oct 24, 2005 redhat-4 afs.rc smp startup broke
23229: resolved Nov 16, 2005 Openafs 1.4.0 cache on ext3
volume causes massive filesystem corruption
36725: resolved Aug 01, 2006 Relicense afssettings.m under
saner license
37658: resolved Aug 14, 2006 openafs-1.4.1 libafsrpc missing
symbols x86_64
41823: resolved Oct 05, 2006 reliability bug in caching seeked
files ...
49718: resolved Dec 19, 2006 vos dump fails on amd64_linux26
client
53759: resolved Feb 11, 2007 1.4.2 Debian sarge afsd crash in
afs_checkrootvolume
54062: resolved Feb 14, 2007 openafs 1.4.3rc2 still coredumps
on openbsd40
54299: resolved Feb 17, 2007 problem after volume moves with
1.5.15
92653: resolved Apr 01, 2008 1.4.7pre2 - Dependency on
/boot/config-{kernel-version}
106150: resolved Jul 06, 2008 Kernel memory leak with 1.4.7 on
Linux 2.6.25
107089: resolved Jul 12, 2008 Kernel panic -- 1.5.39
110696: resolved Aug 06, 2008 OpenAFS 1.4.7 on Mac OS X
10.5.x: intermittent access failures
112681: resolved Aug 19, 2008 RHEL4 Kernel Panic (firefox 3
related?)
117415: resolved Sep 24, 2008 Is there any script to uninstall
OpenAFS on Mac OS X 10.5
123798: resolved Dec 01, 2008 freebsd client panics if no root.afs
123806: resolved Dec 03, 2008 nbsd-44 pass 1
123820: resolved Dec 05, 2008 Request for multiple realm
support in 1.4.x
124083: resolved Jan 03, 2009 afssettings does not use build
system correctly
124591: resolved Apr 04, 2009 freebsd 8 wip
124761: resolved May 11, 2009 Issues blocked for inclusion in
future stable releases
124877: resolved May 27, 2009 obsd 1_5_x fixups
125634: resolved Nov 12, 2009 OpenAFS bug: Uninstall.command
for Mac OS X does not work (fix attached)
126067: resolved Jan 04, 2010 rewrite afs_MemWriteBlk() using
afs_MemWritevBlk()
126107: resolved Jan 08, 2010 emacs out of AFS server space
locks compute node in D wait state
126514: resolved Feb 16, 2010 Linux: Ooops in __wake_up_common
(from bdi_start_fn)
126678: resolved Mar 05, 2010 Linux: 1.5.x doesn't build on RHEL4
126716: resolved Mar 10, 2010 1.5.72's vos examine doesn't use
cell provided by -cell argument
126794: resolved Mar 22, 2010 Wince when foreign bull-dogs sent
out their threate
126812: resolved Mar 24, 2010 OpenAFS 1.5.73 MacOSX 10.6.2
growlagent-openafs crashes
126813: resolved Mar 24, 2010 OpenAFS 1.5.73 aklog crash
From jason@rampaginggeek.com Wed Apr 14 02:10:58 2010
From: jason@rampaginggeek.com (Jason Edgecombe)
Date: Tue, 13 Apr 2010 21:10:58 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server
UUIDs
In-Reply-To: <4BC47100.1030802@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com>
Message-ID: <4BC51622.8000903@rampaginggeek.com>
Jeffrey Altman wrote:
> In 2002, the OpenAFS version of the "vos listaddrs" command was updated
> to include the Arla -printuuid and -noresolve options, which permit the
> UUID and IP addresses of registered file servers to be displayed. For
> example:
>
> UUID: 006cab10-0e3e-1b20-a3-aa-2601a8c0aa77
> 24.193.47.88
> 192.168.122.1
> 192.168.1.38
>
> In 2008, the -noresolve option was made generic so that it could apply
> to all vos commands, letting the actual IP addresses of servers be
> viewed instead of DNS names. This change was made because DNS name
> resolution often makes it appear that a file server is properly
> registered when in fact it is not.
>
> However, IP addresses are not the canonical method of identifying a file
> server. For that the UUID is required and at the present time there is
> no mechanism when using vos listvldb or vos examine to identify the UUID
> of the server on which a volume is located. This lack has come up
> several times in the #openafs IRC channel when attempting to help users
> set up new cells or add new file servers. The most recent time was
> March 30th.
>
> Gerrit http://gerrit.openafs.org/#change,1742 is an attempt to add
> -printuuid as a standard option to all vos commands. The only issue at
> the moment is what the format of the output should look like. UUIDs and
> DNS names are long. Extending the existing format to include the UUID
> inline with each server produces output that will not fit in an 80
> column terminal.
>
> An example of "vos examine -printuuid" output:
>
> root.cell 537870331 RW 42 K On-line
> ASCLEPIUS.MIT.EDU [0037555a-be36-19a6-a2-4d-5e3c0912aa77] /vicepr
> RWrite 537870331 ROnly 537870333 Backup 537870332
> MaxQuota 500 K
> Creation Fri Jun 06 12:24:21 2008
> Copy Thu Feb 26 11:43:23 2009
> Backup Tue Apr 13 02:00:17 2010
> Last Update Thu Oct 18 12:44:23 2007
> 7647 accesses in the past day (i.e., vnode references)
>
> RWrite: 537870331 ROnly: 537870333 Backup: 537870332
> number of sites -> 4
> server ASCLEPIUS.MIT.EDU [0037555a-be36-19a6-a2-4d-5e3c0912aa77]
> partition /vicepr RW Site
> server ASCLEPIUS.MIT.EDU [0037555a-be36-19a6-a2-4d-5e3c0912aa77]
> partition /vicepr RO Site
> server MNEMOSYNE.MIT.EDU [005d91e8-f824-19a6-aa-5c-613c0912aa77]
> partition /vicepr RO Site
> server IXION.MIT.EDU [00086236-fa87-19a6-b4-de-ab015b12aa77]
> partition /vicepr RO Site
>
> An example of "vos listvldb -printuuid" output:
>
> root.cell
> RWrite: 536870915 ROnly: 536870916
> number of sites -> 4
> server bethlehem.your-file-system.com
> [0008fa02-d48c-19b9-81-fc-419a1dccaa77] partition /vicepa RW Site
> server bethlehem.your-file-system.com
> [0008fa02-d48c-19b9-81-fc-419a1dccaa77] partition /vicepa RO Site
> server faultline.your-file-system.com
> [0007580a-7001-1aae-85-8e-2f9a1dccaa77] partition /vicepa RO Site
> server cpe-24-193-47-88.nyc.res.rr.com
> [006cab10-0e3e-1b20-a3-aa-2601a8c0aa77] partition /vicepa RO Site
>
> One alternative output format that could be used when the -printuuid
> option is specified is found below.
>
> vos examine -printuuid:
>
> root.cell 537870331 RW 42 K On-line
> UUID: 0037555a-be36-19a6-a2-4d-5e3c0912aa77
> Server ASCLEPIUS.MIT.EDU
> Partition /vicepr
> RWrite 537870331 ROnly 537870333 Backup 537870332
> MaxQuota 500 K
> Creation Fri Jun 06 12:24:21 2008
> Copy Thu Feb 26 11:43:23 2009
> Backup Tue Apr 13 02:00:17 2010
> Last Update Thu Oct 18 12:44:23 2007
> 7647 accesses in the past day (i.e., vnode references)
>
> RWrite: 537870331 ROnly: 537870333 Backup: 537870332
> number of sites -> 4
> RW Site
> server ASCLEPIUS.MIT.EDU
> uuid 0037555a-be36-19a6-a2-4d-5e3c0912aa77
> partition /vicepr
> RO Site
> server ASCLEPIUS.MIT.EDU
> uuid 0037555a-be36-19a6-a2-4d-5e3c0912aa77
> partition /vicepr
> RO Site
> server MNEMOSYNE.MIT.EDU
> uuid 005d91e8-f824-19a6-aa-5c-613c0912aa77
> partition /vicepr
> RO Site
> server IXION.MIT.EDU
> uuid 00086236-fa87-19a6-b4-de-ab015b12aa77
> partition /vicepr
>
> vos listvldb -printuuid:
>
> root.cell
> RWrite: 536870915 ROnly: 536870916
> number of sites -> 4
> RW Site
> server bethlehem.your-file-system.com
> uuid 0008fa02-d48c-19b9-81-fc-419a1dccaa77
> partition /vicepa
> RO Site
> server bethlehem.your-file-system.com
> uuid 0008fa02-d48c-19b9-81-fc-419a1dccaa77
> partition /vicepa
> RO Site
> server faultline.your-file-system.com
> uuid 0007580a-7001-1aae-85-8e-2f9a1dccaa77
> partition /vicepa
> RO Site
> server cpe-24-193-47-88.nyc.res.rr.com
> uuid 006cab10-0e3e-1b20-a3-aa-2601a8c0aa77
> partition /vicepa
>
> Please offer your opinions. As people have a variety of scripts that
> parse the output of vos commands to automate behaviors, we would not be
> changing the default output. Any format change would only be used when
> the -printuuid option is specified.
>
I like this output style best:
RO Site
server cpe-24-193-47-88.nyc.res.rr.com
uuid 006cab10-0e3e-1b20-a3-aa-2601a8c0aa77
partition /vicepa
If everything is put on one line, I would prefer the uuid column first.
That way, the columns line up better.
Would it be possible to use a shortened form of the uuid like git uses
short commit hash strings?
Jason
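On the question of git-style short UUIDs, here is a minimal Python sketch of how a shortest unique prefix could be computed for a set of server UUIDs. The function name and the min_len parameter are hypothetical and not part of vos; the UUID strings are copied from the listvldb example quoted above.

```python
def shortest_unique_prefixes(uuids, min_len=4):
    """Map each UUID string to the shortest prefix no other UUID shares."""
    unique = sorted(set(uuids))
    result = {}
    for u in unique:
        others = [o for o in unique if o != u]
        n = min_len
        # Grow the prefix until no other UUID starts with it.
        while n < len(u) and any(o.startswith(u[:n]) for o in others):
            n += 1
        result[u] = u[:n]
    return result


# UUIDs taken from the "vos listvldb -printuuid" example in this thread.
SITES = [
    "0008fa02-d48c-19b9-81-fc-419a1dccaa77",
    "0007580a-7001-1aae-85-8e-2f9a1dccaa77",
    "006cab10-0e3e-1b20-a3-aa-2601a8c0aa77",
]
short = shortest_unique_prefixes(SITES)
```

As with git, a prefix that is unique today can become ambiguous once another server is registered, so a short form is probably better suited to display than to scripts.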
From atro.tossavainen+openafs@helsinki.fi Wed Apr 14 11:10:24 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Wed, 14 Apr 2010 13:10:24 +0300 (EEST)
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <20100413131129.b790823c.adeason@sinenomine.net>
Message-ID: <201004141010.o3EAAOZr014266@ruuvi.it.helsinki.fi>
OK, I have it again. User reports inability to log in.
$ kas -a dsakfksda
Administrator's (dsakfksda) Password:
ka> exa useraccount
examine: ticket contained unknown key version number getting information for useraccount.
So:
$ udebug 128.214.88.114 7004 -long
Host's addresses are: 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1
172.18.0.1
Host's 128.214.88.114 time is Wed Apr 14 13:01:35 2010
Local time is Wed Apr 14 13:01:35 2010 (time differential 0 secs)
Last yes vote for 128.214.58.174 was 14 secs ago (sync site);
Last vote started 13 secs ago (at Wed Apr 14 13:01:22 2010)
Local db version is 1271074432.4
I am not sync site
Lowest host 128.214.58.174 was set 14 secs ago
Sync host 128.214.58.174 was set 14 secs ago
Sync site's db version is 1271074432.4
0 locked pages, 0 of them for write
Server (128.214.58.174): (db 0.0)
last vote rcvd 165054 secs ago (at Mon Apr 12 15:10:41 2010),
last beacon sent 165054 secs ago (at Mon Apr 12 15:10:41 2010), last
vote was no
dbcurrent=0, up=1 beaconSince=1
$ udebug 128.214.58.174 7004 -long
Host's addresses are: 128.214.58.174
Host's 128.214.58.174 time is Wed Apr 14 13:01:50 2010
Local time is Wed Apr 14 13:01:54 2010 (time differential 4 secs)
Last yes vote for 128.214.58.174 was 12 secs ago (sync site);
Last vote started 12 secs ago (at Wed Apr 14 13:01:42 2010)
Local db version is 1271074432.4
I am sync site until 47 secs from now (at Wed Apr 14 13:02:41 2010) (2
servers)
Recovery state 1f
Sync site's db version is 1271074432.4
0 locked pages, 0 of them for write
Last time a new db version was labelled was:
164878 secs ago (at Mon Apr 12 15:13:56 2010)
Server (128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1): (db
1271074432.4)
last vote rcvd -3 secs ago (at Wed Apr 14 13:01:57 2010),
last beacon sent -3 secs ago (at Wed Apr 14 13:01:57 2010), last
vote was yes
dbcurrent=1, up=1 beaconSince=1
Nothing obvious - at least to my untrained eye.
Using kas specifically against 128.214.58.174 (the sunx86_510 OpenAFS
host) reproduces the error with this account.
Using kas specifically against 128.214.88.114 (the sun4x_58 IBM AFS
host) shows correct information, however.
bos restart 128.214.58.174 kaserver made the problem go away.
There is no indication of anything unusual in the AuthLog of 128.214.58.174.
All pointers welcome.
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
< URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
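For comparing udebug output from several database servers, a small Python sketch along these lines may help. It only pulls out a few fields that appear in the output above; parse_udebug and the dictionary keys are hypothetical names, and the regular expressions assume the exact wording shown in this message.

```python
import re


def parse_udebug(text):
    """Extract a few fields from `udebug <host> <port> -long` output."""
    flat = " ".join(text.split())  # undo line wrapping added by mail clients
    local = re.search(r"Local db version is (\d+\.\d+)", flat)
    sync = re.search(r"Sync site's db version is (\d+\.\d+)", flat)
    return {
        "local_db": local.group(1) if local else None,
        "sync_db": sync.group(1) if sync else None,
        # "I am not sync site" does not contain this substring.
        "is_sync_site": "I am sync site" in flat,
        # (addresses, db version) for each peer the host reports on.
        "peers": re.findall(r"Server \(([^)]*)\): \(db (\d+\.\d+)\)", flat),
    }


# Abbreviated copies of the two outputs above.
NON_SYNC = """\
Local db version is 1271074432.4
I am not sync site
Sync site's db version is 1271074432.4
Server (128.214.58.174): (db 0.0)
"""

SYNC = """\
Local db version is 1271074432.4
I am sync site until 47 secs from now (at Wed Apr 14 13:02:41 2010) (2
servers)
Sync site's db version is 1271074432.4
Server (128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1): (db
1271074432.4)
"""

a, b = parse_udebug(NON_SYNC), parse_udebug(SYNC)
same_version = a["local_db"] == b["local_db"]
```

Here both hosts report the same local database version, which matches the observation that nothing obvious stands out in the udebug output.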
From rtb@pclella.cern.ch Wed Apr 14 13:52:32 2010
From: rtb@pclella.cern.ch (Rainer Toebbicke)
Date: Wed, 14 Apr 2010 14:52:32 +0200
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <201004141010.o3EAAOZr014266@ruuvi.it.helsinki.fi>
References: <201004141010.o3EAAOZr014266@ruuvi.it.helsinki.fi>
Message-ID: <4BC5BA90.8020608@pclella.cern.ch>
Atro Tossavainen wrote:
> OK, I have it again. User reports inability to log in.
>
> $ kas -a dsakfksda
> Administrator's (dsakfksda) Password:
> ka> exa useraccount
> examine: ticket contained unknown key version number getting information for useraccount.
>
One of your DB servers has an incorrect /usr/afs/etc/KeyFile (in
Transarc-naming, if you use the /var/openafs style it's accordingly).
Very likely you added a new key but did not propagate the KeyFile everywhere.
"bos listkey " should help.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155
From atro.tossavainen+openafs@helsinki.fi Wed Apr 14 14:06:23 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Wed, 14 Apr 2010 16:06:23 +0300 (EEST)
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <4BC5BA90.8020608@pclella.cern.ch>
Message-ID: <201004141306.o3ED6NOW027593@ruuvi.it.helsinki.fi>
Rainer,
> One of your DB servers has an incorrect /usr/afs/etc/KeyFile (in
> Transarc-naming, if you use the /var/openafs style it's accordingly).
>
> Very likely you added a new key but did not propagate the KeyFile everywhere.
> "bos listkey " should help.
According to bos listkey of the various servers (the two db/file
ones and the new file-only-so-far) the key has the same cksum on
all servers and has been last changed on the same date (which is
a while ago). Which isn't that surprising given that I copied
/usr/afs/etc from the old server verbatim.
This also does not explain why the problem goes away with a kaserver
restart - only to reappear in a bit?
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
< URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
From scs@umich.edu Wed Apr 14 15:47:26 2010
From: scs@umich.edu (Steve Simmons)
Date: Wed, 14 Apr 2010 10:47:26 -0400
Subject: [OpenAFS] OS X, AFS Home Directories and SSH/Unix Permissions
In-Reply-To:
References: <393853E2-C3D1-4F6B-854C-ED0E1D06094D@cs.wisc.edu>
Message-ID:
On Apr 13, 2010, at 7:02 PM, Derrick Brashear wrote:
>> Has anyone run into something like this? Is there a way to change the permissions AFS reports to OSX, or is there a work around I'm failing to see?
>
> Check out the RealModes setting. Edit
> /var/db/openafs/etc/config/settings.plist, and rerun
> /var/db/openafs/etc/config/afssettings as root.
Wow. How the hell did I miss that? I'm passing this on to our OSX support guys.
Steve
From scs@umich.edu Wed Apr 14 15:51:47 2010
From: scs@umich.edu (Steve Simmons)
Date: Wed, 14 Apr 2010 10:51:47 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC4E1FF.5050305@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com>
Message-ID:
On Apr 13, 2010, at 5:28 PM, Jeffrey Altman wrote:
>>
>> I'm a long-time fan of having a switch that causes tools to dump their data in an easy-to-machine-parse format. That isn't always doable, but when it is, it's a big win.
>
> As Andrew pointed out in another reply in this thread, the -format
> switch is supposed to provide that but it fails to provide a consistent
> (value - data) pair per line.
Exactly my point. We currently snarf off all that data nightly via script that parses the output from vos e -format. It works but was a pain.
Note, tho, that some data doesn't adapt well to single-line output. For example,
...
root.cell
RWrite: 536870915 ROnly: 536870916
number of sites -> 4
server bethlehem.your-file-system.com partition /vicepa RW Site
server bethlehem.your-file-system.com partition /vicepa RO Site
server faultline.your-file-system.com partition /vicepa RO Site
server cpe-24-193-47-88.nyc.res.rr.com partition /vicepa RO Site
...
just doesn't map well to single-line. We currently deal with this by creating four records, each with all the data from the rest of the output and the specifics of the four entries above.
Steve
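The "one record per site" approach described above can be sketched roughly as follows in Python. This is not the script used at umich; site_records and the field names are hypothetical, and the regular expressions assume the plain layout shown in the example (real output may carry extra markers on a site line).

```python
import re

SITE_RE = re.compile(r"server (\S+) partition (\S+) (RW|RO) Site")
ID_RE = re.compile(r"(RWrite|ROnly|Backup): (\d+)")


def site_records(entry_text):
    """Flatten one VLDB entry into one record per site.

    Each record repeats the shared data (volume name and IDs) and adds
    the server, partition and site type of a single site line.
    """
    lines = [ln.strip() for ln in entry_text.strip().splitlines()]
    volume = lines[0]
    ids = dict(ID_RE.findall(entry_text))
    records = []
    for ln in lines:
        m = SITE_RE.match(ln)
        if m:
            server, partition, kind = m.groups()
            records.append({"volume": volume, **ids, "server": server,
                            "partition": partition, "type": kind})
    return records


ENTRY = """
root.cell
    RWrite: 536870915 ROnly: 536870916
    number of sites -> 4
       server bethlehem.your-file-system.com partition /vicepa RW Site
       server bethlehem.your-file-system.com partition /vicepa RO Site
       server faultline.your-file-system.com partition /vicepa RO Site
       server cpe-24-193-47-88.nyc.res.rr.com partition /vicepa RO Site
"""
records = site_records(ENTRY)
```

Repeating the shared fields in every record costs some space but keeps each record self-describing, which is what makes a single-line format workable.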
From shadow@gmail.com Wed Apr 14 16:04:40 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Wed, 14 Apr 2010 11:04:40 -0400
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <201004141010.o3EAAOZr014266@ruuvi.it.helsinki.fi>
References: <20100413131129.b790823c.adeason@sinenomine.net>
<201004141010.o3EAAOZr014266@ruuvi.it.helsinki.fi>
Message-ID:
I'd suggest just using the IBM binary for the kaserver (and only the
kaserver) in your OpenAFS installation (or better yet switching to
krb5)
On Wed, Apr 14, 2010 at 6:10 AM, Atro Tossavainen
wrote:
> OK, I have it again. User reports inability to log in.
>
> $ kas -a dsakfksda
> Administrator's (dsakfksda) Password:
> ka> exa useraccount
> examine: ticket contained unknown key version number getting information for useraccount.
>
> So:
>
> $ udebug 128.214.88.114 7004 -long
> Host's addresses are: 128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1
> 172.18.0.1
> Host's 128.214.88.114 time is Wed Apr 14 13:01:35 2010
> Local time is Wed Apr 14 13:01:35 2010 (time differential 0 secs)
> Last yes vote for 128.214.58.174 was 14 secs ago (sync site);
> Last vote started 13 secs ago (at Wed Apr 14 13:01:22 2010)
> Local db version is 1271074432.4
> I am not sync site
> Lowest host 128.214.58.174 was set 14 secs ago
> Sync host 128.214.58.174 was set 14 secs ago
> Sync site's db version is 1271074432.4
> 0 locked pages, 0 of them for write
>
> Server (128.214.58.174): (db 0.0)
> last vote rcvd 165054 secs ago (at Mon Apr 12 15:10:41 2010),
> last beacon sent 165054 secs ago (at Mon Apr 12 15:10:41 2010), last
> vote was no
> dbcurrent=0, up=1 beaconSince=1
>
>
> $ udebug 128.214.58.174 7004 -long
> Host's addresses are: 128.214.58.174
> Host's 128.214.58.174 time is Wed Apr 14 13:01:50 2010
> Local time is Wed Apr 14 13:01:54 2010 (time differential 4 secs)
> Last yes vote for 128.214.58.174 was 12 secs ago (sync site);
> Last vote started 12 secs ago (at Wed Apr 14 13:01:42 2010)
> Local db version is 1271074432.4
> I am sync site until 47 secs from now (at Wed Apr 14 13:02:41 2010) (2
> servers)
> Recovery state 1f
> Sync site's db version is 1271074432.4
> 0 locked pages, 0 of them for write
> Last time a new db version was labelled was:
> 164878 secs ago (at Mon Apr 12 15:13:56 2010)
>
> Server (128.214.88.114 10.0.0.3 172.16.0.1 172.17.0.1 172.18.0.1): (db
> 1271074432.4)
> last vote rcvd -3 secs ago (at Wed Apr 14 13:01:57 2010),
> last beacon sent -3 secs ago (at Wed Apr 14 13:01:57 2010), last
> vote was yes
> dbcurrent=1, up=1 beaconSince=1
>
>
> Nothing obvious - at least to my untrained eye.
>
> Using kas specifically against 128.214.58.174 (the sunx86_510 OpenAFS
> host) reproduces the error with this account.
>
> Using kas specifically against 128.214.88.114 (the sun4x_58 IBM AFS
> host) shows correct information, however.
>
> bos restart 128.214.58.174 kaserver made the problem go away.
>
> There is no indication of anything unusual in the AuthLog of 128.214.58.174.
>
> All pointers welcome.
>
> --
> Atro Tossavainen (Mr.) / The Institute of Biotechnology at
> Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
> +358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
> < URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
--
Derrick
From jaltman@secure-endpoints.com Wed Apr 14 16:23:17 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Wed, 14 Apr 2010 11:23:17 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server
UUIDs
In-Reply-To:
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com>
Message-ID: <4BC5DDE5.3020703@secure-endpoints.com>
On 4/14/2010 10:51 AM, Steve Simmons wrote:
>
> On Apr 13, 2010, at 5:28 PM, Jeffrey Altman wrote:
>
>>>
>>> I'm a long-time fan of having a switch that causes tools to dump their data in an easy-to-machine-parse format. That isn't always doable, but when it is, it's a big win.
>>
>> As Andrew pointed out in another reply in this thread, the -format
>> switch is supposed to provide that but it fails to provide a consistent
>> (value - data) pair per line.
>
> Exactly my point. We currently snarf off all that data nightly via script that parses the output from vos e -format. It works but was a pain.
>
> Note, tho, that some data doesn't adapt well to single-line output. For example,
>
> just doesn't map well to single-line. We currently deal with this by creating four records, each with all the data from the rest of the output and the specifics of the four entries above.
>
> Steve
Anyone want a -xml option?
From shadow@gmail.com Wed Apr 14 16:37:33 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Wed, 14 Apr 2010 11:37:33 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server
UUIDs
In-Reply-To: <4BC5DDE5.3020703@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com>
<4BC4797B.3070202@email.unc.edu>
<6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu>
<4BC4E1FF.5050305@secure-endpoints.com>
<4BC5DDE5.3020703@secure-endpoints.com>
Message-ID:
On Wed, Apr 14, 2010 at 11:23 AM, Jeffrey Altman
wrote:
> On 4/14/2010 10:51 AM, Steve Simmons wrote:
>>
>> On Apr 13, 2010, at 5:28 PM, Jeffrey Altman wrote:
>>
>>>>
>>>> I'm a long-time fan of having a switch that causes tools to dump their data in an easy-to-machine-parse format. That isn't always doable, but when it is, it's a big win.
>>>
>>> As Andrew pointed out in another reply in this thread, the -format
>>> switch is supposed to provide that but it fails to provide a consistent
>>> (value - data) pair per line.
>>
>> Exactly my point. We currently snarf off all that data nightly via script that parses the output from vos e -format. It works but was a pain.
>>
>> Note, tho, that some data doesn't adapt well to single-line output. For example,
>>
>> just doesn't map well to single-line. We currently deal with this by creating four records, each with all the data from the rest of the output and the specifics of the four entries above.
>>
>> Steve
>
> Anyone want a -xml option?
didn't we already have a patch for that in RT?
--
Derrick
From phalenor@gmail.com Wed Apr 14 16:50:53 2010
From: phalenor@gmail.com (Andy Cobaugh)
Date: Wed, 14 Apr 2010 11:50:53 -0400 (EDT)
Subject: [OpenAFS] Modifying the output of vos commands to include server
UUIDs
In-Reply-To: <4BC5DDE5.3020703@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com>
<4BC5DDE5.3020703@secure-endpoints.com>
Message-ID:
On 2010-04-14 at 11:23, Jeffrey Altman ( jaltman@secure-endpoints.com ) said:
> On 4/14/2010 10:51 AM, Steve Simmons wrote:
>>
>> On Apr 13, 2010, at 5:28 PM, Jeffrey Altman wrote:
>>
>>>>
>>>> I'm a long-time fan of having a switch that causes tools to dump their data in an easy-to-machine-parse format. That isn't always doable, but when it is, it's a big win.
>>>
>>> As Andrew pointed out in another reply in this thread, the -format
>>> switch is supposed to provide that but it fails to provide a consistent
>>> (value - data) pair per line.
>>
>> Exactly my point. We currently snarf off all that data nightly via script that parses the output from vos e -format. It works but was a pain.
>>
>> Note, tho, that some data doesn't adapt well to single-line output. For example,
>>
>> just doesn't map well to single-line. We currently deal with this by creating four records, each with all the data from the rest of the output and the specifics of the four entries above.
>>
>> Steve
>
> Anyone want a -xml option?
Yes, please. As much as I am not a fan of XML, it would make some of our
lives easier for those of us using languages that include xml parsers.
--andy
From shadow@gmail.com Wed Apr 14 17:01:02 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Wed, 14 Apr 2010 12:01:02 -0400
Subject: [OpenAFS] A call to potential GSOC students: Modifying the output of vos
commands to include server UUIDs
Message-ID:
Hey GSoC candidates! Here's a simple project for you. vos examine
already includes a "-format" switch to display a fixed, parseable
format. Add a "-xml" switch, and output in XML! Everything you need to
change is in src/volser (vos.c and vsprocs.c). Ideally you won't
require an external library to do this, but instead can just output
XML directly, as we are unlikely to have any given XML library on all
supported platforms.
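To illustrate what "output XML directly" could look like, here is a small sketch. It is written in Python only to keep it short; the real change would be C in src/volser. The element and attribute names (volume, site, rwid and so on) are made up for this example and are not a proposed schema.

```python
def xml_escape(value):
    """Escape the five characters that are special in XML text and attributes."""
    return (str(value)
            .replace("&", "&amp;")   # must come first
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace('"', "&quot;")
            .replace("'", "&apos;"))


def volume_to_xml(name, rw_id, sites):
    """Build XML by plain string formatting, with no XML library involved."""
    out = ['<volume name="%s" rwid="%s">' % (xml_escape(name), xml_escape(rw_id))]
    for server, partition, kind in sites:
        out.append('  <site type="%s" server="%s" partition="%s"/>'
                   % (xml_escape(kind), xml_escape(server), xml_escape(partition)))
    out.append("</volume>")
    return "\n".join(out)


doc = volume_to_xml(
    "root.cell",
    536870915,
    [
        ("bethlehem.your-file-system.com", "/vicepa", "RW"),
        ("faultline.your-file-system.com", "/vicepa", "RO"),
    ],
)
print(doc)
```

The only part that needs care without a library is escaping, which is why it is isolated in one small function here.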
On Wed, Apr 14, 2010 at 11:50 AM, Andy Cobaugh wrote:
> On 2010-04-14 at 11:23, Jeffrey Altman ( jaltman@secure-endpoints.com )
> said:
>>
>> On 4/14/2010 10:51 AM, Steve Simmons wrote:
>>>
>>> On Apr 13, 2010, at 5:28 PM, Jeffrey Altman wrote:
>>>
>>>>>
>>>>> I'm a long-time fan of having a switch that causes tools to dump their
>>>>> data in an easy-to-machine-parse format. That isn't always doable, but when
>>>>> it is, it's a big win.
>>>>
>>>> As Andrew pointed out in another reply in this thread, the -format
>>>> switch is supposed to provide that but it fails to provide a consistent
>>>> (value - data) pair per line.
>>>
>>> Exactly my point. We currently snarf off all that data nightly via script
>>> that parses the output from vos e -format. It works but was a pain.
>>>
>>> Note, tho, that some data doesn't adapt well to single-line output. For
>>> example,
>>>
>>> just doesn't map well to single-line. We currently deal with this by
>>> creating four records, each with all the data from the rest of the output
>>> and the specifics of the four entries above.
>>>
>>> Steve
>>
>> Anyone want a -xml option?
>
> Yes, please. As much as I am not a fan of XML, it would make some of our
> lives easier for those of us using languages that include xml parsers.
>
> --andy
>
--
Derrick
From fbo2@gmx.net Wed Apr 14 18:21:27 2010
From: fbo2@gmx.net (Frank Burkhardt)
Date: Wed, 14 Apr 2010 19:21:27 +0200
Subject: [OpenAFS] Modifying the output of vos commands to include
server UUIDs
In-Reply-To: <4BC5DDE5.3020703@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com> <4BC5DDE5.3020703@secure-endpoints.com>
Message-ID: <20100414172127.GA18471@postman.alpha>
Hi,
On Wed, Apr 14, 2010 at 11:23:17AM -0400, Jeffrey Altman wrote:
[snip]
> >>> I'm a long-time fan of having a switch that causes tools to dump their
> >>> data in an easy-to-machine-parse format. That isn't always doable, but
> >>> when it is, it's a big win.
[snip]
> Anyone want a -xml option?
print "Yes - me." x $very_often;
Especially for listvol it would be very helpful.
Best,
Frank
From scs@umich.edu Wed Apr 14 18:52:17 2010
From: scs@umich.edu (Steve Simmons)
Date: Wed, 14 Apr 2010 13:52:17 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC5DDE5.3020703@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com> <4BC5DDE5.3020703@secure-endpoints.com>
Message-ID: <0BBD85D4-1918-450C-9226-4E162143424C@umich.edu>
On Apr 14, 2010, at 11:23 AM, Jeffrey Altman wrote:
> On 4/14/2010 10:51 AM, Steve Simmons wrote:
>>
>> On Apr 13, 2010, at 5:28 PM, Jeffrey Altman wrote:
>>
>>>>
>>>> I'm a long-time fan of having a switch that causes tools to dump their data in an easy-to-machine-parse format. That isn't always doable, but when it is, it's a big win.
>>>
>>> As Andrew pointed out in another reply in this thread, the -format
>>> switch is supposed to provide that but it fails to provide a consistent
>>> (value - data) pair per line.
>>
>> Exactly my point. We currently snarf off all that data nightly via script that parses the output from vos e -format. It works but was a pain.
>>
>> Note, tho, that some data doesn't adapt well to single-line output. For example,
>>
>> just doesn't map well to single-line. We currently deal with this by creating four records, each with all the data from the rest of the output and the specifics of the four entries above.
>>
>> Steve
>
> Anyone want a -xml option?
Parsing xml in ad-hoc shell scripts is hell. On the other hand, all our nightly data gathering scripts are in languages that have appropriate xml parsers available. So yes, it would be useful. Or to be more precise, it would *also* be useful. But comma-separated still has big wins for some things.
Steve
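As a rough illustration of the comma-separated case, per-site records can be written and read back with Python's standard csv module. The column names in FIELDS are hypothetical and do not correspond to any existing vos output.

```python
import csv
import io

# Hypothetical column layout for one record per site.
FIELDS = ["volume", "rw_id", "server", "partition", "type"]


def to_csv(records):
    """Serialize a list of per-site dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


rows = [
    {"volume": "root.cell", "rw_id": "536870915",
     "server": "bethlehem.your-file-system.com",
     "partition": "/vicepa", "type": "RW"},
    {"volume": "root.cell", "rw_id": "536870915",
     "server": "faultline.your-file-system.com",
     "partition": "/vicepa", "type": "RO"},
]
text = to_csv(rows)

# Reading it back gives the same records, all values as strings.
parsed = list(csv.DictReader(io.StringIO(text)))
```

One line per record is also what makes this form easy to handle with cut, sort and awk in shell scripts, where XML is painful.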
From atro.tossavainen+openafs@helsinki.fi Wed Apr 14 19:55:41 2010
From: atro.tossavainen+openafs@helsinki.fi (Atro Tossavainen)
Date: Wed, 14 Apr 2010 21:55:41 +0300 (EEST)
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To:
Message-ID: <201004141855.o3EItfu4012245@ruuvi.it.helsinki.fi>
Derrick,
> I'd suggest just using the IBM binary for the kaserver (and only the
> kaserver) in your OpenAFS installation
That's an interesting thought, but unfortunately it's nowhere near
an option. sunx86_ is quite simply not a supported platform for
IBM AFS at all, even at 3.6 Patch 19 (August 2009).
> (or better yet switching to krb5)
Yes, I plan to do that. Eventually. At the moment, I have a number of
things that I need to change and I'd like to minimise the amount of
simultaneous changes.
--
Atro Tossavainen (Mr.) / The Institute of Biotechnology at
Systems Analyst, Techno-Amish & / the University of Helsinki, Finland,
+358-9-19158939 UNIX Dinosaur / employs me, but my opinions are my own.
< URL : http : / / www . helsinki . fi / %7E atossava / > NO FILE ATTACHMENTS
From jason@rampaginggeek.com Wed Apr 14 23:30:42 2010
From: jason@rampaginggeek.com (Jason Edgecombe)
Date: Wed, 14 Apr 2010 18:30:42 -0400
Subject: [OpenAFS] OS X, AFS Home Directories and SSH/Unix Permissions
In-Reply-To:
References: <393853E2-C3D1-4F6B-854C-ED0E1D06094D@cs.wisc.edu>
Message-ID: <4BC64212.2010505@rampaginggeek.com>
Derrick Brashear wrote:
> On Tue, Apr 13, 2010 at 4:59 PM, Jacob Ela wrote:
>
>> Greetings All,
>>
>> I've been looking for some information on this because someone else has probably run into a similar issue, but I haven't found much that is recent or pointed towards solving the problem - though I've found some old email that suggests where this originates from...
>>
>> I've got a Mac Mini lab running OSX 10.6.2 and OpenAFS 1.4.11 (but also have seen this on a MacBook running 10.6.3 and 1.5.73.3). User's home directories live in AFS, and users get Kerberos/AFS credentials at login.
>>
>> I'm seeing on the Macs that all the unix file permissions on files in AFS are shown as 666, and from the old emails I've found I'm just guessing that this is to make AFS ACL's play nicely with the Finder (or rather the other way around).
>>
>> This has the unfortunate side effect that my users can't use SSH on the Macs, as the reported permissions on their ~/.ssh/config file suggest it is group and world writable. This causes SSH to error out when a user attempts to connect to another computer because of insecure config file permissions. Trying to chmod the file from a Mac doesn't change the unix permissions as they are reported to the Mac, though Linux hosts can see these new permissions.
>>
>> Has anyone run into something like this? Is there a way to change the permissions AFS reports to OSX, or is there a work around I'm failing to see?
>>
>
> Check out the RealModes setting. Edit
> /var/db/openafs/etc/config/settings.plist, and rerun
> /var/db/openafs/etc/config/afssettings as root.
>
>
>
Is this documented somewhere?
Jason
From haba@kth.se Thu Apr 15 05:07:12 2010
From: haba@kth.se (Harald Barth)
Date: Thu, 15 Apr 2010 06:07:12 +0200 (CEST)
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <201004141855.o3EItfu4012245@ruuvi.it.helsinki.fi>
References:
<201004141855.o3EItfu4012245@ruuvi.it.helsinki.fi>
Message-ID: <20100415.060712.10863420.haba@habanero.pdc.kth.se>
> > I'd suggest just using the IBM binary for the kaserver (and only the
> > kaserver) in your OpenAFS installation
>
> That's an interesting thought, but unfortunately it's nowhere near
> an option. sunx86_ is quite simply not a supported platform for
> IBM AFS at all, even at 3.6 Patch 19 (August 2009).
"It's dead, Jim."
If you really want to run kaserver in 2010, you'll have to do it on
hardware matching the age of the software, say a Sun SS10. It's not
as if there hasn't been an upgrade path to krb5 for years.
Harald.
From fcombernous@kezia.com Thu Apr 15 08:24:38 2010
From: fcombernous@kezia.com (Fabien COMBERNOUS)
Date: Thu, 15 Apr 2010 09:24:38 +0200
Subject: [OpenAFS] fix about info.numServers ?
Message-ID: <4BC6BF36.5060106@kezia.com>
Hi,
I'm setting up an AFS cell on Mac OS X. The first server of the cell is
up and running, and I started to set up an additional host. Unfortunately,
during this setup I killed the fileserver process with the kill
command, and I think the bosserver now believes that a fileserver is
still running. I get this in FileLog:
Thu Apr 15 09:18:57 2010 File server starting
Thu Apr 15 09:18:57 2010 Failed to increase open file limit, using default
Thu Apr 15 09:18:57 2010 vl_Initialize: info.numServers=26052 (> MAXSERVERS=20)
Can you help me fix this issue?
Best regards,
--
*Fabien COMBERNOUS*
/unix system engineer/
www.kezia.com
*Tel: +33 (0) 467 992 986*
Kezia Group
From rtb@pclella.cern.ch Thu Apr 15 09:13:53 2010
From: rtb@pclella.cern.ch (Rainer Toebbicke)
Date: Thu, 15 Apr 2010 10:13:53 +0200
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To:
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com> <4BC5DDE5.3020703@secure-endpoints.com>
Message-ID: <4BC6CAC1.3000105@pclella.cern.ch>
Andy Cobaugh wrote:
>>
>> Anyone want a -xml option?
>
I don't care as long as that is taken as a "cherry on the icing" option, not
as a pretext to abandon improving normal vos output.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155
From fcombernous@kezia.com Thu Apr 15 09:27:10 2010
From: fcombernous@kezia.com (Fabien COMBERNOUS)
Date: Thu, 15 Apr 2010 10:27:10 +0200
Subject: [OpenAFS] fix about info.numServers ?
In-Reply-To: <4BC6BF36.5060106@kezia.com>
References: <4BC6BF36.5060106@kezia.com>
Message-ID: <4BC6CDDE.4010109@kezia.com>
Fabien COMBERNOUS wrote:
> Hi,
>
> I'm setting up an AFS cell on Mac OS X. The first server of the cell is
> up and running, and I started to set up an additional host. Unfortunately,
> during this setup I killed the fileserver process with the kill
> command, and I think the bosserver now believes that a fileserver is
> still running. I get this in FileLog:
>
> Thu Apr 15 09:18:57 2010 File server starting
> Thu Apr 15 09:18:57 2010 Failed to increase open file limit, using default
> Thu Apr 15 09:18:57 2010 vl_Initialize: info.numServers=26052 (> MAXSERVERS=20)
>
> Can you help me fix this issue?
>
> Best regards,
I solved the issue by reinstalling the AFS package.
For the archive: be careful, the uninstall script does not work. I
simply moved /Library/OpenAFS out of the way and rebooted.
--
*Fabien COMBERNOUS*
/unix system engineer/
www.kezia.com
*Tel: +33 (0) 467 992 986*
Kezia Group
From jaltman@secure-endpoints.com Thu Apr 15 13:50:45 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Thu, 15 Apr 2010 08:50:45 -0400
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <0BBD85D4-1918-450C-9226-4E162143424C@umich.edu>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com> <4BC5DDE5.3020703@secure-endpoints.com> <0BBD85D4-1918-450C-9226-4E162143424C@umich.edu>
Message-ID: <4BC70BA5.2020103@secure-endpoints.com>
This is a cryptographically signed message in MIME format.
--------------ms050000030000060507030409
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On 4/14/2010 1:52 PM, Steve Simmons wrote:
> Parsing xml in ad-hoc shell scripts is hell. On the other hand, all our
> nightly data gathering scripts are in languages that have appropriate xml
> parsers available. So yes, it would be useful. Or to be more precise, it
> would *also* be useful. But comma-separated still has big wins for some
> things.
Since you have a good idea of how you would use a comma separated list,
I would encourage you to propose a -csv option as well. Either provide
a patch or at least detail a proposal of how the data should be represented.
--------------ms050000030000060507030409--
From boyland@cs.uwm.edu Thu Apr 15 14:03:52 2010
From: boyland@cs.uwm.edu (John Tang Boyland)
Date: Thu, 15 Apr 2010 08:03:52 -0500
Subject: [OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)
In-Reply-To: Your message of "Mon, 12 Apr 2010 10:14:40 EDT."
Message-ID: <12345.1271336632@pabst.cs.uwm.edu>
Update:
After a week, I got up early enough to reboot the compute
server when few people were on it. As part of this process
I noticed that it had been set up with a memcache; I changed
this back to a disk cache.
The reason why I think this is a deadlock issue is that the
processes make no progress after a week, and indeed are
resistant to "kill -9" etc. Even shutting down the machine
gets stuck -- it has to be power-cycled.
But with a larger cache, it seems likely we won't see this
behavior again. Thanks for the help, everyone.
John
Derrick Brashear wrote:
] you might as well reboot it. i suspect (and wondered before) if the
] real issue was not deadlock but that the machine simply went into a
] loop, and with a cache that small it's likely it did. not the best
] behavior, of course but not the most urgent thing to pursue at the
] moment.
From adeason@sinenomine.net Thu Apr 15 15:08:44 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Thu, 15 Apr 2010 09:08:44 -0500
Subject: [OpenAFS] Re: Modifying the output of vos commands to include server UUIDs
References: <4BC47100.1030802@secure-endpoints.com>
<4BC4797B.3070202@email.unc.edu>
<6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu>
<4BC4E1FF.5050305@secure-endpoints.com>
<4BC5DDE5.3020703@secure-endpoints.com>
<4BC6CAC1.3000105@pclella.cern.ch>
Message-ID: <20100415090844.b642d734.adeason@sinenomine.net>
On Thu, 15 Apr 2010 10:13:53 +0200
Rainer Toebbicke wrote:
> Andy Cobaugh wrote:
>
> >>
> >> Anyone want a -xml option?
> >
>
> I don't care as long as that is taken as a "cherry on the icing"
> option, not as a pretext to abandon improving normal vos output.
...nor abandoning abstracting vsprocs-like functionality enough to just
use it as a library.
--
Andrew Deason
adeason@sinenomine.net
From bbense@slac.stanford.edu Thu Apr 15 16:20:17 2010
From: bbense@slac.stanford.edu (Booker Bense)
Date: Thu, 15 Apr 2010 08:20:17 -0700 (PDT)
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To: <4BC70BA5.2020103@secure-endpoints.com>
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com> <4BC5DDE5.3020703@secure-endpoints.com>
<0BBD85D4-1918-450C-9226-4E162143424C@umich.edu> <4BC70BA5.2020103@secure-endpoints.com>
Message-ID:
On Thu, 15 Apr 2010, Jeffrey Altman wrote:
> On 4/14/2010 1:52 PM, Steve Simmons wrote:
>
>> Parsing xml in ad-hoc shell scripts is hell. On the other
>> hand, all our nightly data gathering scripts are in languages
>> that have appropriate xml parsers available. So yes, it would
>> be useful. Or to be more precise, it would *also* be useful.
>> But comma-separated still has big wins for some things.
>
YAML might be a reasonable compromise if there are only the tuits
to get one new option.
_ Booker C. Bense
From sxw@inf.ed.ac.uk Thu Apr 15 16:25:16 2010
From: sxw@inf.ed.ac.uk (Simon Wilkinson)
Date: Thu, 15 Apr 2010 16:25:16 +0100
Subject: [OpenAFS] Modifying the output of vos commands to include server UUIDs
In-Reply-To:
References: <4BC47100.1030802@secure-endpoints.com> <4BC4797B.3070202@email.unc.edu> <6A5BB6C1-955E-4740-8CDE-FB9CD7663BD0@umich.edu> <4BC4E1FF.5050305@secure-endpoints.com> <4BC5DDE5.3020703@secure-endpoints.com> <0BBD85D4-1918-450C-9226-4E162143424C@umich.edu> <4BC70BA5.2020103@secure-endpoints.com>
Message-ID: <0C07D74C-61F9-4E98-9817-E57F7363A193@inf.ed.ac.uk>
On 15 Apr 2010, at 16:20, Booker Bense wrote:
>
> YAML might be a reasonable compromise if there are only the tuits
> to get one new option.
I suspect that there aren't the tuits for any new options. If those
expressing a desire for csv, xml, yaml, etc. want any of them, I'm sure
patches would be welcome...
S.
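To make the csv-versus-xml trade-off in this thread concrete, here is a small Python sketch. Neither a -csv nor an -xml option for vos is described anywhere in this discussion, so both record layouts below (column names, element names, sample values) are invented purely for illustration and should not be read as a proposal anyone in the thread made.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical comma-separated output: one volume site per line.
CSV_SAMPLE = "name,id,server,partition\nuser.alice,536870915,fs1.example.org,a\n"

# Hypothetical XML output carrying the same record.
XML_SAMPLE = """<volumes>
  <volume name="user.alice" id="536870915">
    <site server="fs1.example.org" partition="a"/>
  </volume>
</volumes>"""


def volumes_from_csv(text):
    """Parse the flat form into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def volumes_from_xml(text):
    """Flatten the nested form into the same list-of-dicts shape."""
    rows = []
    for vol in ET.fromstring(text).iter("volume"):
        for site in vol.iter("site"):
            rows.append({
                "name": vol.get("name"),
                "id": vol.get("id"),
                "server": site.get("server"),
                "partition": site.get("partition"),
            })
    return rows


# Both forms yield the same records once parsed.
print(volumes_from_csv(CSV_SAMPLE) == volumes_from_xml(XML_SAMPLE))
```

The flat form is easy to cut up in a shell pipeline, while the nested form can list several sites per volume without repeating the volume columns; that is roughly the tension the posters describe.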
From stephen@physics.unc.edu Thu Apr 15 20:37:02 2010
From: stephen@physics.unc.edu (Stephen Joyce)
Date: Thu, 15 Apr 2010 15:37:02 -0400 (EDT)
Subject: [OpenAFS] bos -localauth not working
Message-ID:
I just added a new key to the KeyFile on my db and file servers. This key
is for my campus's central krb5 realm.
Everything seems to be functioning normally regarding tickets and tokens. I
can kinit and aklog using tickets from the foreign krb5 realm and
manipulate files and folders in my cell.
However when I tried to use the -localauth flag to bos to restart server
processes, it no longer works. It does work if I have tokens rather than
using -localauth.
Everything else appears to be working fine, but I'd like to recover the
ability to use -localauth if at all possible. Errors I get:
(no tokens, but I am root):
# bos restart fs5 -all -localauth
bos: failed to restart servers (ticket contained unknown key version number)
# kinit user/admin
(valid password entered)
# aklog
# bos restart fs5 -all
(success)
I've double-checked the new kvno is as expected, and have no problems on
the clients. So far the only symptom is bos.
What could I have missed?
Servers are OpenAFS 1.4.5 on Linux (yes, I know it's old. Upgrades are
planned, but not *right now*).
Cheers, Stephen
From shadow@gmail.com Thu Apr 15 20:39:43 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Thu, 15 Apr 2010 15:39:43 -0400
Subject: [OpenAFS] bos -localauth not working
In-Reply-To:
References:
Message-ID:
does localauth work after a bosserver restart?
On Thu, Apr 15, 2010 at 3:37 PM, Stephen Joyce wrote:
> I just added a new key to the KeyFile on my db and file servers. This key is
> for my campus's central krb5 realm.
>
> Everything seems to be functioning normally regarding tickets and tokens. I
> can kinit and aklog using tickets from the foreign krb5 realm and manipulate
> files and folders in my cell.
>
> However when I tried to use the -localauth flag to bos to restart server
> processes, it no longer works. It does work if I have tokens rather than
> using -localauth.
>
> Everything else appears to be working fine, but I'd like to recover the
> ability to use -localauth if at all possible. Errors I get:
>
> (no tokens, but I am root):
> # bos restart fs5 -all -localauth
> bos: failed to restart servers (ticket contained unknown key version number)
>
> # kinit user/admin
> (valid password entered)
> # aklog
> # bos restart fs5 -all
> (success)
>
> I've double-checked the new kvno is as expected, and have no problems on the
> clients. So far the only symptom is bos.
>
> What could I have missed?
>
> Servers are OpenAFS 1.4.5 on Linux (yes, I know it's old. Upgrades are
> planned, but not *right now*).
>
> Cheers, Stephen
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
--
Derrick
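One way to picture why a restart is the thing to try here: if a long-running server only knows the keys it read at startup, a ticket made locally with a newer key version number will be unknown to it until it re-reads the key file. The toy model below is only a guess at the mechanism that is consistent with the error text in this thread; it is not a description of bosserver internals, and the class, key version numbers and key material are all made up.

```python
class ToyServer:
    """Toy model: keys are read once at startup and then served from memory."""

    def __init__(self, keyfile):
        self.keys = dict(keyfile)  # snapshot of {kvno: key} taken at startup

    def accepts(self, kvno):
        """A ticket is only usable if the server knows its key version."""
        return kvno in self.keys

    def restart(self, keyfile):
        self.keys = dict(keyfile)  # re-read the key file


keyfile = {3: b"old-key"}      # made-up kvno and key material
server = ToyServer(keyfile)

keyfile[4] = b"new-key"        # admin adds a key while the server keeps running
newest = max(keyfile)          # assumed: a local-auth client uses the newest kvno

print(server.accepts(newest))  # False
server.restart(keyfile)
print(server.accepts(newest))  # True
```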
From stephen@physics.unc.edu Thu Apr 15 20:46:07 2010
From: stephen@physics.unc.edu (Stephen Joyce)
Date: Thu, 15 Apr 2010 15:46:07 -0400 (EDT)
Subject: [OpenAFS] bos -localauth not working
In-Reply-To:
References:
Message-ID:
On Thu, 15 Apr 2010, Derrick Brashear wrote:
> does localauth work after a bosserver restart?
Yes...
Glad it was something simple!
> On Thu, Apr 15, 2010 at 3:37 PM, Stephen Joyce wrote:
>> I just added a new key to the KeyFile on my db and file servers. This key is
>> for my campus's central krb5 realm.
>>
>> Everything seems to be functioning normally regarding tickets and tokens. I
>> can kinit and aklog using tickets from the foreign krb5 realm and manipulate
>> files and folders in my cell.
>>
>> However when I tried to use the -localauth flag to bos to restart server
>> processes, it no longer works. It does work if I have tokens rather than
>> using -localauth.
>>
>> Everything else appears to be working fine, but I'd like to recover the
>> ability to use -localauth if at all possible. Errors I get:
>>
>> (no tokens, but I am root):
>> # bos restart fs5 -all -localauth
>> bos: failed to restart servers (ticket contained unknown key version number)
>>
>> # kinit user/admin
>> (valid password entered)
>> # aklog
>> # bos restart fs5 -all
>> (success)
>>
>> I've double-checked the new kvno is as expected, and have no problems on the
>> clients. So far the only symptom is bos.
>>
>> What could I have missed?
>>
>> Servers are OpenAFS 1.4.5 on Linux (yes, I know it's old. Upgrades are
>> planned, but not *right now*).
>>
>> Cheers, Stephen
From bampfamd@berkeley.edu Thu Apr 15 22:18:56 2010
From: bampfamd@berkeley.edu (bampfamd@berkeley.edu)
Date: Thu, 15 Apr 2010 14:18:56 -0700
Subject: [OpenAFS] openAFS 1.4.12 Kernel Panic on restart? (mac)
Message-ID:
------=_20100415141856_60636
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Sorry, I am unfamiliar with using these scripts to decode the panic logs,
but I have included both an attachment and a text copy of the log here. If
you could help it would be greatly appreciated! thanks!
---------------------------------------------------------------------
Thu Apr 15 13:48:03 2010
panic(cpu 1 caller 0x001AB0FE): Kernel trap at 0x3177b411, type 14=page fault, registers:
CR0: 0x8001003b, CR2: 0x3177b411, CR3: 0x010e4000, CR4: 0x000006e0
EAX: 0x00000023, EBX: 0x00000000, ECX: 0x039cb000, EDX: 0x02f5b100
CR2: 0x3177b411, EBP: 0x21143f4c, ESI: 0x039cb000, EDI: 0x210dbb94
EFL: 0x00010216, EIP: 0x3177b411, CS: 0x00000004, DS: 0x0000000c
Error code: 0x00000010
Backtrace (CPU 1), Frame : Return Address (4 potential args on stack)
0x21143d88 : 0x12b4c6 (0x45f91c 0x21143dbc 0x13355c 0x0)
0x21143dd8 : 0x1ab0fe (0x469a98 0x3177b411 0xe 0x469248)
0x21143eb8 : 0x1a1713 (0x21143ecc 0x21143f4c 0x3177b411 0xe)
0x21143ec4 : 0x3177b411 (0xe 0x380048 0x8e62000c 0xc)
0x21143f4c : 0x3177aec0 (0x3179d7c0 0x1f4 0x0 0x10624dd3)
0x21143f88 : 0x3176acd8 (0x1f4 0x0 0x0 0x0)
0x21143fac : 0x3178007e (0x317978d0 0x3177fb18 0x2b25018 0x210dbb94)
0x21143fc8 : 0x1a14fc (0x210dbb94 0x0 0x1a40b5 0x39cce40)
Backtrace terminated-invalid frame pointer 0
BSD process name corresponding to current thread: kernel_task
Mac OS version:
9L31a
Kernel version:
Darwin Kernel Version 9.8.0: Wed Jul 15 16:55:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_I386
System model name: iMac4,2 (Mac-F4218EC8)
System uptime in nanoseconds: 10331785353733
unloaded kexts:
org.openafs.filesystems.afs 1.4.12 - last unloaded 10331531078921
loaded kexts:
org.openafs.filesystems.afs 1.4.12 - last loaded 20825984843
com.apple.filesystems.autofs 2.0.2
com.apple.driver.AppleHDAPlatformDriver 1.7.1a2
com.apple.driver.AppleHDAHardwareConfigDriver 1.7.1a2
com.apple.driver.AppleHDA 1.7.1a2
com.apple.driver.AppleUpstreamUserClient 2.7.5
com.apple.driver.AppleIntelGMA950 5.4.8
com.apple.driver.AppleGraphicsControl 2.8.15
com.apple.driver.AppleIntelGMAX3100 5.4.8
com.apple.Dont_Steal_Mac_OS_X 6.0.3
com.apple.driver.AppleIntelIntegratedFramebuffer 5.4.8
com.apple.driver.AppleUSBOpticalMouse 3.2.0
com.apple.driver.AppleHDAController 1.7.1a2
com.apple.iokit.IOFireWireIP 1.7.7
com.apple.driver.AppleIRController 113
com.apple.driver.AudioIPCDriver 1.0.6
com.apple.driver.ACPI_SMC_PlatformPlugin 3.4.0a17
com.apple.driver.AppleLPC 1.3.1
com.apple.driver.AppleBacklight 1.6.0
com.apple.driver.AppleTyMCEDriver 1.0.0d28
com.apple.driver.AppleUSBMergeNub 3.5.2
com.apple.driver.USBCameraFirmwareLoader 1.0.9
com.apple.iokit.IOSCSIMultimediaCommandsDevice 2.1.1
com.apple.iokit.SCSITaskUserClient 2.1.1
com.apple.driver.XsanFilter 2.7.91
com.apple.iokit.IOATAPIProtocolTransport 1.5.3
com.apple.iokit.IOAHCIBlockStorage 1.2.2
com.apple.driver.AppleUSBHub 3.4.9
com.apple.iokit.IOUSBUserClient 3.5.2
com.apple.driver.AppleFWOHCI 3.9.7
com.apple.iokit.AppleYukon2 3.1.13b2
com.apple.driver.AirPortBrcm43xx 366.91.21
com.apple.driver.AppleAHCIPort 1.7.0
com.apple.driver.AppleIntelPIIXATA 2.0.1
com.apple.driver.AppleFileSystemDriver 1.1.0
com.apple.driver.AppleUSBEHCI 3.4.6
com.apple.driver.AppleUSBUHCI 3.5.2
com.apple.driver.AppleEFINVRAM 1.2.0
com.apple.driver.AppleRTC 1.2.3
com.apple.driver.AppleHPET 1.4
com.apple.driver.AppleACPIPCI 1.2.5
com.apple.driver.AppleACPIButtons 1.2.5
com.apple.driver.AppleSMBIOS 1.4
com.apple.driver.AppleACPIEC 1.2.5
com.apple.driver.AppleAPIC 1.4
com.apple.security.seatbelt 107.12
com.apple.nke.applicationfirewall 1.8.77
com.apple.security.TMSafetyNet 3
com.apple.driver.AppleIntelCPUPowerManagement 76.2.0
com.apple.driver.DiskImages 199
com.apple.BootCache 30.4
com.apple.driver.DspFuncLib 1.7.1a2
com.apple.iokit.IOHDAFamily 1.7.1a2
com.apple.iokit.IOAudioFamily 1.6.9fc5
com.apple.kext.OSvKernDSPLib 1.1
com.apple.driver.IOPlatformPluginFamily 3.4.0a17
com.apple.iokit.IONDRVSupport 1.7.3
com.apple.iokit.IOGraphicsFamily 1.7.3
com.apple.driver.AppleSMC 2.3.1d1
com.apple.iokit.IOUSBHIDDriver 3.4.6
com.apple.driver.AppleUSBComposite 3.2.0
com.apple.iokit.IOSCSIBlockCommandsDevice 2.1.1
com.apple.iokit.IOBDStorageFamily 1.5
com.apple.iokit.IODVDStorageFamily 1.5
com.apple.iokit.IOCDStorageFamily 1.5
com.apple.iokit.IOSCSIArchitectureModelFamily 2.1.1
com.apple.iokit.IOFireWireFamily 3.4.9
com.apple.iokit.IO80211Family 216.1
com.apple.iokit.IOAHCIFamily 1.5.0
com.apple.iokit.IOATAFamily 2.0.1
com.apple.iokit.IOUSBFamily 3.5.2
com.apple.iokit.IONetworkingFamily 1.6.1
com.apple.driver.AppleEFIRuntime 1.2.0
com.apple.iokit.IOSMBusFamily 1.1
com.apple.iokit.IOHIDFamily 1.5.5
com.apple.iokit.IOStorageFamily 1.5.6
com.apple.driver.AppleACPIPlatform 1.2.5
com.apple.iokit.IOACPIFamily 1.2.0
com.apple.iokit.IOPCIFamily 2.6
---------------------------------------------------------------------
> On Thu, Apr 8, 2010 at 6:19 PM, wrote:
>> So I installed OpenAFS 1.4.12 on a Leopard Mac OS X computer earlier and
>> tried running it (I had a slow AFS loading issue earlier... but this
>> version fixed that). Even though this version fixed that slow loading
>> issue, I get the kernel panic message (the "You need to power down your
>> computer...") every time I try to power down or restart the computer. I
>> assume that the kernel panic happens when the system is trying to unmount
>> the AFS server... but as of right now, it happens every single time I
>> turn off my computer.
>>
>> If the kernel panic log is needed I will post one up. But does anyone
>> have any idea as of right now?
>
> a decoded panic log would be ideal, if you could. the decode-panic
> script should be able to help with that.
>
------=_20100415141856_60636--
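Even without the decode-panic script, a few things can be read straight off the raw log. The Python sketch below pulls the return addresses out of three backtrace lines copied from the log above, checks whether the faulting EIP is among them, and works out how long after the "last unloaded" timestamp of the afs kext the panic occurred. It assumes both timestamps are on the same uptime clock, and it is in no way a substitute for a properly decoded trace.

```python
import re

# Three frames copied from the panic log above.
BACKTRACE = """\
0x21143d88 : 0x12b4c6 (0x45f91c 0x21143dbc 0x13355c 0x0)
0x21143ec4 : 0x3177b411 (0xe 0x380048 0x8e62000c 0xc)
0x21143fc8 : 0x1a14fc (0x210dbb94 0x0 0x1a40b5 0x39cce40)
"""

FRAME_RE = re.compile(r"^(0x[0-9a-f]+) : (0x[0-9a-f]+) \(")


def return_addresses(text):
    """Return the return address of each backtrace frame as an integer."""
    addrs = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            addrs.append(int(m.group(2), 16))
    return addrs


FAULT_EIP = 0x3177B411        # EIP (and CR2) from the register dump
PANIC_NS = 10331785353733     # "System uptime in nanoseconds"
UNLOAD_NS = 10331531078921    # afs kext "last unloaded" timestamp

addrs = return_addresses(BACKTRACE)
print([hex(a) for a in addrs])
print(FAULT_EIP in addrs)

# Seconds between the kext unload and the panic, then uptime in hours.
gap = (PANIC_NS - UNLOAD_NS) / 1e9
print(round(gap, 3))
print(round(PANIC_NS / 1e9 / 3600, 2))
```

The panic comes roughly a quarter of a second after the afs kext is listed as unloaded, which at least fits the poster's suspicion that it happens during shutdown of AFS; anything beyond that needs the decoded symbols.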
From tcreedon@easystreet.net Fri Apr 16 00:37:09 2010
From: tcreedon@easystreet.net (Ted Creedon)
Date: Thu, 15 Apr 2010 16:37:09 -0700
Subject: [OpenAFS] Missing AFS Client tabs?
Message-ID:
--001636b2bc5532b73604844ef935
Content-Type: text/plain; charset=ISO-8859-1
OpenAFS for Windows 1.5.73 on Windows Server 2003:
Two lock symbols appear in the system tray.
The "Drive Letters" and "Advanced" tabs are missing from the AFS Client
popup.
Thanks
Tedc
--001636b2bc5532b73604844ef935--
From jaltman@secure-endpoints.com Fri Apr 16 01:42:06 2010
From: jaltman@secure-endpoints.com (Jeffrey Altman)
Date: Thu, 15 Apr 2010 20:42:06 -0400
Subject: [OpenAFS] Missing AFS Client tabs?
In-Reply-To:
References:
Message-ID: <4BC7B25E.6000705@secure-endpoints.com>
This is a cryptographically signed message in MIME format.
--------------ms090101000609090803000108
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On 4/15/2010 7:37 PM, Ted Creedon wrote:
> OpenAFS for Windows 1.5.73:on Windows Server 2003
>
> 2 lock symbols appear in the system tray
>
> the "Drive Letters" and "Advanced" tabs are missing from the AFS Client
> popup.
>
> Thanks
>
> Tedc
>
>
The Drive Letters and Advanced tabs are not compatible with Vista and
above. They have been removed. This is documented in the release notes
and was discussed on this mailing list before the change was made.
The two icons are the result of running both the AFS Authentication tool
and Network Identity Manager. Please choose one. Network Identity
Manager has an option to automatically disable the AFS Authentication
Tool if that is your choice.
Jeffrey Altman
--------------ms090101000609090803000108--
From adeason@sinenomine.net Fri Apr 16 01:44:22 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Thu, 15 Apr 2010 19:44:22 -0500
Subject: [OpenAFS] Re: Ubik problem
References:
<201004141855.o3EItfu4012245@ruuvi.it.helsinki.fi>
Message-ID: <20100415194422.b34491ec.adeason@sinenomine.net>
On Wed, 14 Apr 2010 21:55:41 +0300 (EEST)
Atro Tossavainen wrote:
> Derrick,
>
> > I'd suggest just using the IBM binary for the kaserver (and only the
> > kaserver) in your OpenAFS installation
>
> That's an interesting thought, but unfortunately it's nowhere near
> an option. sunx86_ is quite simply not a supported platform for
> IBM AFS at all, even at 3.6 Patch 19 (August 2009).
Older OpenAFS releases could be another option, but I don't know how
useful of an answer that is. I'm not sure what could have caused that,
so I don't have a particular range in mind; maybe just earlier 1.4...
1.4.9? 1.4.2?
That's not a good solution, of course, but I don't know how much
enthusiasm you're going to get for debugging kas issues in one's spare
time...
--
Andrew Deason
adeason@sinenomine.net
From rra@stanford.edu Fri Apr 16 02:13:28 2010
From: rra@stanford.edu (Russ Allbery)
Date: Thu, 15 Apr 2010 18:13:28 -0700
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <20100415194422.b34491ec.adeason@sinenomine.net> (Andrew Deason's
message of "Thu, 15 Apr 2010 19:44:22 -0500")
References:
<201004141855.o3EItfu4012245@ruuvi.it.helsinki.fi>
<20100415194422.b34491ec.adeason@sinenomine.net>
Message-ID: <87pr20nqrr.fsf@windlord.stanford.edu>
Andrew Deason writes:
> Atro Tossavainen wrote:
>> Derrick,
>>
>> > I'd suggest just using the IBM binary for the kaserver (and only the
>> > kaserver) in your OpenAFS installation
>>
>> That's an interesting thought, but unfortunately it's nowhere near
>> an option. sunx86_ is quite simply not a supported platform for
>> IBM AFS at all, even at 3.6 Patch 19 (August 2009).
> Older OpenAFS releases could be another option, but I don't know how
> useful of an answer that is. I'm not sure what could have caused that,
> so I don't have a particular range in mind; maybe just earlier 1.4...
> 1.4.9? 1.4.2?
We were successfully running a 1.2.x version of kaserver on SPARC Solaris,
and upgrading to 1.4.2 on Linux failed (albeit with different symptoms; it
would just stop successfully giving out tickets for a while and then come
back, regularly), so we stuck with 1.2.x on SPARC until we turned it off
entirely.
--
Russ Allbery (rra@stanford.edu)
From shadow@gmail.com Fri Apr 16 04:02:33 2010
From: shadow@gmail.com (Derrick Brashear)
Date: Thu, 15 Apr 2010 23:02:33 -0400
Subject: [OpenAFS] Re: Ubik problem
In-Reply-To: <87pr20nqrr.fsf@windlord.stanford.edu>
References:
<201004141855.o3EItfu4012245@ruuvi.it.helsinki.fi>
<20100415194422.b34491ec.adeason@sinenomine.net>
<87pr20nqrr.fsf@windlord.stanford.edu>
Message-ID:
On Thu, Apr 15, 2010 at 9:13 PM, Russ Allbery wrote:
> Andrew Deason writes:
>> Atro Tossavainen wrote:
>
>>> Derrick,
>>>
>>> > I'd suggest just using the IBM binary for the kaserver (and only the
>>> > kaserver) in your OpenAFS installation
>>>
>>> That's an interesting thought, but unfortunately it's nowhere near
>>> an option. sunx86_ is quite simply not a supported platform for
>>> IBM AFS at all, even at 3.6 Patch 19 (August 2009).
>
>> Older OpenAFS releases could be another option, but I don't know how
>> useful of an answer that is. I'm not sure what could have caused that,
>> so I don't have a particular range in mind; maybe just earlier 1.4...
>> 1.4.9? 1.4.2?
>
> We were successfully running a 1.2.x version of kaserver on SPARC Solaris,
> and upgrading to 1.4.2 on Linux failed (albeit with different symptoms; it
> would just stop successfully giving out tickets for a while and then come
> back, regularly), so we stuck with 1.2.x on SPARC until we turned it off
> entirely.
I'm pretty sure it "broke" between 1.2.11 and 1.4.1.
--
Derrick
From fcombernous@kezia.com Fri Apr 16 08:08:24 2010
From: fcombernous@kezia.com (Fabien COMBERNOUS)
Date: Fri, 16 Apr 2010 09:08:24 +0200
Subject: [OpenAFS] released volume, deleted
Message-ID: <4BC80CE8.80100@kezia.com>
Hi list,
I'm testing OpenAFS 1.5.73 on Mac OS X 10.5.8.
Here is how my cell is set up:
I started by configuring my first server. It now hosts several volumes
with some data.
Everything appeared to run fine.
Then I configured a second host. This host was added to the cell, and only
the fs service was created on it.
I successfully created a volume on the second host, and I was able
to get information about the second host from the first host.
Everything appeared to run fine.
Then I added a site so that I could have an RO copy of the data that my
first server hosts in two volumes.
Then I used the vos release command to sync the data.
Everything appeared to run fine. The vos release command returned output
like the following:
Released volume 536870918 successfully
But since then, all the volumes I released are broken and offline. The
vos listvol command gives this message about the released volumes:
Could not attach volume 536870918
I can't access the data. The VolserLog file contains the following:
Thu Apr 15 10:54:30 2010 Starting AFS Volserver 2.0 (/usr/afs/bin/volserver)
Thu Apr 15 11:24:32 2010 1 Volser: Clone: Cloning volume 536870918 to new volume 536870919
Thu Apr 15 11:24:32 2010 Clone 536870918: filecount 28 -> 29 diskused 5338 -> 5338
Thu Apr 15 11:24:32 2010 SYNC_ask: negative response on circuit 'FSSYNC'
Thu Apr 15 11:24:32 2010 FSYNC_askfs: FSSYNC request denied for reason=101
Thu Apr 15 11:26:24 2010 1 Volser: Delete: volume 536870919 deleted
Thu Apr 15 11:48:45 2010 1 Volser: Clone: Cloning volume 536870933 to new volume 536870934
Thu Apr 15 11:48:45 2010 Clone 536870933: filecount 4 -> 5 diskused 5839 -> 5839
Thu Apr 15 11:48:45 2010 SYNC_ask: negative response on circuit 'FSSYNC'
Thu Apr 15 11:48:45 2010 FSYNC_askfs: FSSYNC request denied for reason=101
Thu Apr 15 11:50:43 2010 1 Volser: Delete: volume 536870934 deleted
After a vos salvage, the data is back. I'm not sure all of it is, but at least some of it. It is just a volume used for tests.
What happened?
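For anyone reading this thread in the archive, the FSSYNC denials in a log like the one above can be picked out with a short script. The following Python sketch is only an illustration: the function name `fssync_denials` is made up here, and the regular expression assumes the exact line layout shown in this message, with one log entry per line. It does not try to interpret the reason code.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Thu Apr 15 11:24:32 2010 FSYNC_askfs: FSSYNC request denied for reason=101
# The layout is assumed from the VolserLog excerpt quoted in this message.
DENIED = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) "
    r"FSYNC_askfs: FSSYNC request denied for reason=(?P<reason>\d+)$"
)


def fssync_denials(log_text):
    """Return a list of (timestamp, reason_code) for each denied FSSYNC request."""
    denials = []
    for line in log_text.splitlines():
        match = DENIED.match(line.strip())
        if match:
            when = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
            denials.append((when, int(match.group("reason"))))
    return denials


# Sample lines copied from the log excerpt in this message.
sample = """\
Thu Apr 15 11:24:32 2010 SYNC_ask: negative response on circuit 'FSSYNC'
Thu Apr 15 11:24:32 2010 FSYNC_askfs: FSSYNC request denied for reason=101
Thu Apr 15 11:26:24 2010 1 Volser: Delete: volume 536870919 deleted
"""

for when, reason in fssync_denials(sample):
    print(when.isoformat(), "reason", reason)
```

Listing the denials next to the Clone and Delete entries with the same timestamps makes it easier to see which release each denial belongs to.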
Best regards,
--
*Fabien COMBERNOUS*
/unix system engineer/
www.kezia.com
*Tel: +33 (0) 467 992 986*
Kezia Group
From fcombernous@kezia.com Fri Apr 16 14:00:46 2010
From: fcombernous@kezia.com (Fabien COMBERNOUS)
Date: Fri, 16 Apr 2010 15:00:46 +0200
Subject: [OpenAFS] released volume, deleted
In-Reply-To:
References: <4BC80CE8.80100@kezia.com>
Message-ID: <4BC85F7E.7040603@kezia.com>
Steven Jenkins wrote:
> Could you provide your fileserver configuration? (ie, the output of
> bos status $servername -long) Is it possible that you've used the
> non-Demand Attach Fileserver configuration with 1.5, which has Demand
> Attach? e.g., http://blog.endpoint.com/2009/06/getting-started-with-demand-attach.html
> and http://www.dementia.org/twiki/bin/view/AFSLore/DemandAttach
>
Thank you for your answer. The links you provided are very interesting.
We are using Mac OS X as the server. Since I noticed that the
production release of OpenAFS for this OS is 1.4, I reinstalled my cell with
that version. I used the same configuration, and now the vos release command
runs like a charm. Unfortunately, I don't have any more time to continue
testing OpenAFS 1.5.
Best regards,
--
*Fabien COMBERNOUS*
/unix system engineer/
www.kezia.com
*Tel: +33 (0) 467 992 986*
Kezia Group
From fcombernous@kezia.com Fri Apr 16 14:58:05 2010
From: fcombernous@kezia.com (Fabien COMBERNOUS)
Date: Fri, 16 Apr 2010 15:58:05 +0200
Subject: [OpenAFS] OpenAFS algorithm
Message-ID: <4BC86CED.8020001@kezia.com>
Hi,
We are testing OpenAFS. We have a new cell hosted by two servers
located in two different cities, A and B.
Only one server (in A) has the RW volume, and both servers (in A and B)
have an RO copy of the volume.
From an OpenAFS client located in B, I tried to access data in the
cell. The time it took to display a picture suggests that the
client contacted the remote server in A.
My question is: how can I make a client in B use the server in B? I haven't
found any document explaining the algorithm OpenAFS uses to decide
which server the client contacts.
Best regards,
--
*Fabien COMBERNOUS*
/unix system engineer/
www.kezia.com
*Tel: +33 (0) 467 992 986*
Kezia Group
From adeason@sinenomine.net Fri Apr 16 15:34:43 2010
From: adeason@sinenomine.net (Andrew Deason)
Date: Fri, 16 Apr 2010 09:34:43 -0500
Subject: [OpenAFS] Re: OpenAFS algorithm
References: <4BC86CED.8020001@kezia.com>
Message-ID: <20100416093443.a7772b50.adeason@sinenomine.net>
On Fri, 16 Apr 2010 15:58:05 +0200
Fabien COMBERNOUS wrote:
> My question is how to permit client in B to use server in B ? I didn't
> found any document explaining the algorithm used by OpenAFS to decide
> the server contacted by the client.
'fs setserverprefs' / 'fs getserverprefs'. You can get information on
either via their manpages (fs_setserverprefs(1)) or here:
The default preferences tend to not be very good (there is a project to
improve them), so for such a distributed setup you probably want to set
them yourself. Possibly in some init script on the client so you don't
lose the preferences on reboot.
Also, you want to make sure that you are accessing the RO data. Running
'fs whereis /afs/path/to/file' should mention that the file is on
multiple hosts, if you are accessing the RO.
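For readers of the archive, here is a rough sketch of how such preferences might be generated for an init script. It is written in Python and only builds the `fs setserverprefs -servers` command line; it does not run it. The host names and rank values are placeholders, and the helper name `build_setserverprefs` is hypothetical. Lower ranks are preferred by the client. The 0 to 65534 bound is what fs_setserverprefs(1) gives as far as I know, so treat it as an assumption and check the manpage for your release.

```python
import shlex

# Assumed upper bound for a rank, per fs_setserverprefs(1); verify locally.
MAX_RANK = 65534


def build_setserverprefs(ranks):
    """Build the argv for `fs setserverprefs -servers <host> <rank> ...`.

    `ranks` maps a file server name or address to an integer rank.
    Lower ranks are preferred by the Cache Manager.
    """
    argv = ["fs", "setserverprefs", "-servers"]
    # Sort by rank so the most preferred server is listed first.
    for host, rank in sorted(ranks.items(), key=lambda item: item[1]):
        if not 0 <= rank <= MAX_RANK:
            raise ValueError(f"rank for {host} out of range: {rank}")
        argv += [host, str(rank)]
    return argv


# Placeholder names: a client in city B prefers the server in B.
prefs = {
    "afs-b.example.com": 1000,
    "afs-a.example.com": 40000,
}

# Print the command instead of running it, so it can be pasted into an init script.
print(shlex.join(build_setserverprefs(prefs)))
```

After the command has been run on the client, 'fs getserverprefs' should show the new ranks.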
--
Andrew Deason
adeason@sinenomine.net
From tcreedon@easystreet.net Fri Apr 16 15:36:08 2010
From: tcreedon@easystreet.net (Ted Creedon)
Date: Fri, 16 Apr 2010 07:36:08 -0700
Subject: [OpenAFS] Missing AFS Client tabs?
In-Reply-To: <4BC7B25E.6000705@secure-endpoints.com>
References:
<4BC7B25E.6000705@secure-endpoints.com>
Message-ID:
--000e0cd4ce4a37580604845b88bf
Content-Type: text/plain; charset=ISO-8859-1
The documentation needs updating.
On Thu, Apr 15, 2010 at 5:42 PM, Jeffrey Altman <
jaltman@secure-endpoints.com> wrote:
> On 4/15/2010 7:37 PM, Ted Creedon wrote:
> > OpenAFS for Windows 1.5.73:on Windows Server 2003
> >
> > 2 lock symbols appear in the system tray
> >
> > the "Drive Letters" and "Advanced" tabs are missing from the AFS Client
> > popup.
> >
> > Thanks
> >
> > Tedc
> >
> >
>
> The Drive Letters and Advanced tabs are not compatible with Vista and
> above. They have been removed. This is documented in the release notes
> and was discussed on this mailing list before the change was made.
>
> The two icons are the result of running both the AFS Authentication tool
> and Network Identity Manager. Please choose one. Network Identity
> Manager has an option to automatically disable the AFS Authentication
> Tool if that is your choice.
>
> Jeffrey Altman
>
>
>
--000e0cd4ce4a37580604845b88bf
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
The documentation needs updating.