[OpenAFS] will OpenAFS serve my needs?

Jeffrey Altman jaltman@secure-endpoints.com
Sun, 30 Mar 2008 02:21:06 -0400


F. Even wrote:
> On Sat, Mar 29, 2008 at 11:17 PM, F. Even <openafslists@elitists.org> wrote:
>> On Sat, Mar 29, 2008 at 10:26 PM, Jason Edgecombe
>>  <jason@rampaginggeek.com> wrote:
>>  > Answers inline.
>>  >
>>  >
>>  >  F. Even wrote:
>>  >  > I'm trying to figure out if OpenAFS can accomplish what I need it to.
>>  >  >
>>  >  > Here are my requirements:
>>  >  >
>>  >  > All servers are generally AIX unless specified.
>>  >  >
>>  >  I have read of people running openafs on AIX, but I'm not sure how many
>>  >  people are running OpenAFS on AIX.
>>  >
>>  >
>>  >  > 01.  3 file servers in distinct geographic parts of the country (while
>>  >  > of course not the same subnet, all the networks are connected).
>>  >  > 02.  Each file server will have files that will be unique (I'm
>>  >  > guessing could be mapped back to unique cells).
>>  >  >
>>  >  You could have all three servers in the same cell. Different paths would
>>  >  seamlessly map to different servers.
>
>  Each of these servers would be in a different geographic section of
>  the country...but it's feasible that depending on what server a client
>  is connected to...they'll need access to all files identified (or
>  copied to a common place) as a result of a job run that identifies
>  files with related topics.  So...would that cell be read/write across
>  geographic/subnet boundaries on 3 separate servers?  Would there be
>  some kind of sync of all of the cell data on each also so availability
>  would not be interrupted, or would I need to create a separate
>  read-only cell for that function also on each of these servers?

An AFS cell is an administrative boundary.  Within a cell, volumes may be
moved and replicated, users may be added to groups, and users and groups
may be placed on access control lists.  Servers within a cell can be
distributed across geographic boundaries.
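
For example, group membership and access control are managed with the
pts and fs commands.  A rough sketch (the cell, user, group and path
names are only placeholders):

    # create a group and add a member to it
    pts creategroup -name project:writers
    pts adduser -user someuser -group project:writers

    # grant the group read/write access to a directory
    fs setacl -dir /afs/example.com/project/incoming -acl project:writers write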

There is no need to have a separate server or a separate cell for 
readonly data.  Each volume name defines a volume set that consists of
some combination of read/write, readonly and backup volumes.  A readonly
volume is a public snapshot of a read/write volume that can be 
replicated onto multiple servers within the same cell.
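
For example, a read/write volume could be created on one server and
replicated to the other two roughly like this (server, partition and
volume names are only placeholders):

    vos create  -server afs1 -partition /vicepa -name project.docs
    vos addsite -server afs2 -partition /vicepa -id project.docs
    vos addsite -server afs3 -partition /vicepa -id project.docs
    vos release -id project.docs   # publishes project.docs.readonly to both sites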

Clients will communicate with any file server in the cell that holds
the volumes that must be accessed.  File servers that do not hold any
volumes of interest will not be contacted.

For servers containing replicated volumes, the client's order of
preference can be configured.
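
For example, a client in one office could be told to prefer its local
file server over the remote replicas roughly as follows (host names and
ranks are only placeholders; a lower rank is preferred):

    fs setserverprefs -servers afs1 20000 afs2 40000 afs3 40000
    fs getserverprefs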

AFS clients are quite efficient when reading data across a WAN.
The AFS Cache Manager is designed explicitly for that purpose.
Once data is read by the client there is no need to read the
data a second time from the file server unless either the data
is flushed from the cache or the data is altered on the file
server.  This is true regardless of the type of volume the data
is read from.
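
If you want to inspect or influence the cache behavior on a client,
something along these lines works (the cache size and path are only
examples):

    fs getcacheparms                  # current cache usage and size
    fs setcachesize 500000            # resize the disk cache, in 1K blocks
    fs flushvolume /afs/example.com/project/docs   # discard cached data for one volume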

At the present time, failover is only provided between readonly
volumes because only readonly volume instances can be replicated
across multiple servers.

There are plans to add read/write replication and, if all goes
well, it may be completed by the end of this calendar year.  Note
however that different people have different requirements for
read/write replication.  What we propose to implement will have
the property of a single master for writes and lock acquisition,
with lazy replication to the replicas.  This will provide
automated replication without the risk that two clients communicating
with independent servers could simultaneously update the same
files.

Until such time as read/write replication is available, other
methods must be devised if replication to each office is required.

>>  > > 03.  Each server will need some kind of failover capability (either
>>  >  > duplicating to a read-only volume or some kind of failover service on
>>  >  > server side).  Needs to be invisible to the client.
>>  >  >
>>  >  Currently, seamless failover is only possible between read-only copies
>>  >  of volumes hosted on different servers/partitions. Read-only copies are
>>  >  only updated when the  "vos release" command is run as an AFS admin.
>
>  There may be situations where read-only seamless failover is acceptable.

It is important to be aware that failover is not from a read/write
volume to a readonly volume.  The choice is either to access readonly
copies and obtain failover, or to access the single read/write instance
and have no failover.

>>  >  That said, there are other options that are not as seamless or have a
>>  >  short, but noticeable failover time.
>
>  Yup.  Currently HACMP is being looked at as the solution for current
>  failover requirements...but if we can better distribute the data, have
>  failover be more seamless, and have the same "view" of at least a
>  portion of the data accessible in all locations....we'd be going a
>  long way to accomplishing our goal and having less waste in
>  resources...(hopefully).
>
>
>>  > > All of these file servers will be a repository of files for
>>  >  > applications running on other servers (files will be written from
>>  >  > other servers and read from others).  Original idea was to use NFS
>>  >  > where possible and FTP/SAMBA for clients that do not support NFS
>>  >  > (Windows).
>>  >  >
>>  >  > 04.  Each server could have files/images/documents/audio/whatever
>>  >  > related to a particular topic.  All these files will need to be
>>  >  > brought together in a shared hierarchy from all of these servers into
>>  >  > one hierarchy.
>>  >  >
>>  >  >
>>  >  AFS offers one globally unified filesystem/hierarchy for all files on
>>  >  all servers. Download the OpenAFS client to see how this works. No need
>>  >  to set up a server or get an account.
>>  >
>>  > > Initial design was to have one of the file servers act as a
>>  >  > centralized connection point, synchronize all files back to it, and
>>  >  > have all the processing done on it.
>>  >  >
>>  >  >
>>  >  > I guess I'm curious if I can configure OpenAFS to have space shared
>>  >  > and synchronized, fully writeable, across all 3 of the servers so it
>>  >  > could be mounted as one filesystem or one drive letter (windows
>>  >  > clients).  OR...if this is not possible...how quickly does
>>  >  > synchronization happen?  If a job were run to pull files together on
>>  >  > one server....would the replicated copies get updated fairly quickly?
>>  >  >
>>  >  >
>>  >  The data could be split across the three servers according to the
>>  >  directory structure. All files appear in one filesystem as one drive
>>  >  letter no matter what server they are located on.
>
>  Well.....OK...for the job that needs to be kicked off to gather files
>  to a central location (via symlinks or whatever, matters not) then
>  shares them back out to Windows clients, I need ALL that data to be
>  accessible from any of the regions the clients are located in from a
>  local server.  I'm still somewhat confused after reading up on some of
>  the docs exactly how things are organized or how I can achieve the
>  functionality I want.  I expect that applications that talk to each of
>  these servers in each of these areas will need to drop data to all of
>  these systems simultaneously (but not the same data...but maybe need
>  to dump the data in the same hierarchies).  Logical cell distinctions
>  would be based on projects, tests, or application implementations (at
>  least that's how the main export/mount points would be defined).

I believe you are confusing cells and volumes.  It is unlikely that you
will want a separate cell for each project, test or application. 
Instead you want one or more volumes dedicated to each project, test
or application.

The directory hierarchy is created by linking volumes together using
mount points.  Mount points are either normal or read/write.  A normal
mount point when crossed will select a readonly volume instance if it
exists or the read/write volume if no readonly exists.  A read/write
mount point will always select the read/write volume instance.
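
For example, assuming a cell named example.com and a volume named
project.docs (both placeholders), the two kinds of mount point are
created like this:

    fs mkmount -dir /afs/example.com/project/docs  -vol project.docs
    fs mkmount -dir /afs/.example.com/project/docs -vol project.docs -rw
    fs lsmount /afs/example.com/project/docs       # show where the mount point leads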

Mount points can cross cell boundaries but unless the projects require
separate administrative access controls it is not worth creating 
additional cells.  A file server cannot be in more than one cell.

>>  >  Synchronization of read-write to read-only copies is done manually and
>>  >  depends on the amount of data that has changed since the last sync. That
>>  >  said, you could script it to happen upon job completion.
>
>  Oh?  I thought there was a daemon (from my reading) that kept the
>  copies in sync w/ the master copies.  Not correct?  Hmm...that was
>  part of the appeal, getting away from a bunch of timed jobs and other
>  watchdog scripts....

AFS readonly volumes are used to publish a snapshot of a read/write
volume.  The volume release copies all directories and the full contents
of any file that changed since the last snapshot was taken.  There is no
automated release process.
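
If the data is written by a batch job, the release can simply be the
last step of that job.  A minimal sketch, assuming the script runs as
root on a server machine (hence -localauth) and that the volume name
and log path are placeholders:

    #!/bin/sh
    # publish the updated read/write volume to its readonly sites
    vos release -id project.docs -localauth \
        && echo "project.docs released at `date`" >> /var/log/afs-release.log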

>>  > > Right now the plan is:
>>  >  >
>>  >  > Application servers NFS mount from the file server in each area, write
>>  >  > out files to it.  Other servers will mount those same spots to take
>>  >  > files and do things w/ them.  Some Windows servers will need to write
>>  >  > files to the servers.  Where they need to write, it was going to be
>>  >  > via FTP.  Windows clients will need to retrieve files after jobs are
>>  >  > run (the job that will pull files from all the servers).  As of right
>>  >  > now, one server will be chosen as the box that all the files are
>>  >  > copied to for the jobs to run and the server will run SAMBA.  The
>>  >  > windows clients will connect to a directory to pull those files that
>>  >  > have resulted from the job run.  ALSO, Active Directory authentication
>>  >  > needs to be supported (preferably seamlessly).
>>  >  >
>>  >  > Would OpenAFS make any of this easier?
>>  >  >
>>  >
>>  >  With openafs, the different servers would mount the global AFS
>>  >  filesystem and just read or write to certain directories. The openafs
>>  >  client seamlessly locates the files on the correct server. You can move
>>  >  the files (volumes) between servers without needing to reconfigure the
>>  >  clients.
>
>  ..but if a file was called that was needed, but not on a volume on a
>  local server, it would be pulled anyway (as long as it was part of the
>  cell that was mounted up), correct?

As long as the server holding the volume in which the file resides is
accessible, the file can be accessed by the AFS client.  AFS is a
location independent file system.  The location of volumes within the
cell may be changed at any time without adverse impact on the clients
that access them.
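
For example (server, partition and volume names are only placeholders),
a volume can be moved while clients are using it:

    vos move -id project.docs -fromserver afs1 -frompartition /vicepa \
             -toserver afs3 -topartition /vicepb
    vos examine -id project.docs   # the VLDB now records the new site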

>>  >  OpenAFS has clients for AIX, Windows, Mac, Linux and many others. The
>>  >  server portion runs on AIX, Linux, Unix, Mac OS X and others. The
>>  >  windows server is NOT recommended for production, but the windows client
>>  >  works just fine.
>
>  Nope...no Windows servers for this function anyway.  The file servers
>  will all be AIX currently (possibly Linux or HP-UX, but most likely
>  AIX as most of the rest of the application environment is already
>  running on AIX).

I recommend Solaris as the best server platform for OpenAFS.  The 
debugging tools are far superior to any of the other platforms on which
OpenAFS runs.

>>  >  Active Directory has been successfully used as the AFS Kerberos server.
>>  >  Check out these slides:
>>  >  http://workshop.openafs.org/afsbpw06/talks/shadow.html
>
>  Thanks, I'll take a look at them.

From the perspective of OpenAFS, Active Directory is just a Kerberos v5
KDC.  You create a user account with the SPN "afs/<cell>@<DOMAIN>",
export a keytab, and then import the key into the AFS KeyFile.
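
A rough sketch of that procedure (the account, cell, realm and key
version number are only examples, and the exact ktpass options may
vary with your AD version; the exported key must be DES for the AFS
KeyFile):

    # on the domain controller, map the SPN to a service account
    # and export its key to a keytab:
    ktpass -princ afs/example.com@EXAMPLE.COM -mapuser afs-svc ^
           -crypto DES-CBC-CRC +DesOnly -ptype KRB5_NT_PRINCIPAL -pass * -out afs.keytab

    # on each AFS server, import the key into the KeyFile:
    asetkey add 3 /tmp/afs.keytab afs/example.com@EXAMPLE.COM
    asetkey list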

Jeffrey Altman

