[OpenAFS] will OpenAFS serve my needs?

Jeffrey Altman jaltman@secure-endpoints.com
Sun, 30 Mar 2008 02:21:06 -0400

F. Even wrote:
> On Sat, Mar 29, 2008 at 11:17 PM, F. Even <openafslists@elitists.org> wrote:
>> On Sat, Mar 29, 2008 at 10:26 PM, Jason Edgecombe
>>  <jason@rampaginggeek.com> wrote:
>>  > Answers inline.
>>  >
>>  >
>>  >  F. Even wrote:
>>  >  > I'm trying to figure out if OpenAFS can accomplish what I need it to.
>>  >  >
>>  >  > Here are my requirements:
>>  >  >
>>  >  > All servers are generally AIX unless specified.
>>  >  >
>>  >  I have read of people running openafs on AIX, but I'm not sure how many
>>  >  people are running OpenAFS on AIX.
>>  >
>>  >
>>  >  > 01.  3 file servers in distinct geographic parts of the country (while
>>  >  > of course not the same subnet, all the networks are connected).
>>  >  > 02.  Each file server will have files that will be unique (I'm
>>  >  > guessing could be mapped back to unique cells).
>>  >  >
>>  >  You could have all three servers in the same cell. Different paths would
>>  >  seamlessly map to difference servers.
>  Each of these servers would be in a different geographic section of
>  the country...but it's feasible that depending on what server a client
>  is connected to...they'll need access to all files identified (or
>  copied to a common place) as a result of a job run that identifies
>  files with related topics.  So...would that cell be read/write across
>  geographic/subnet boundaries on 3 separate servers?  Would there be
>  some kind of sync of all of the cell data on each also so availability
>  would not be interrupted, or would I need to create a separate
>  read-only cell for that function also on each of these servers?

An AFS cell is an administrative boundary.  Within a cell, volumes may
be moved and replicated, users may be added to groups, and users and
groups may be placed on access control lists.  Servers within a cell
can be distributed across geographic boundaries.
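Concretely, those administrative operations are performed with the
pts and fs tools.  A minimal sketch, assuming administrator tokens;
the user, group, and path names are examples, not anything from your
environment:

```shell
# Create a protection group and add a user to it.
pts creategroup engineering
pts adduser alice engineering

# Grant the group read (r) and lookup (l) rights on a directory.
fs setacl -dir /afs/example.com/project -acl engineering rl
```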

There is no need to have a separate server or a separate cell for
readonly data.  Each volume name defines a volume set that consists of
some combination of read/write, readonly and backup volumes.  A readonly
volume is a public snapshot of a read/write volume that can be
replicated onto multiple servers within the same cell.
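For example, replication is set up with the vos suite.  A sketch,
assuming a running cell and administrator tokens; the server,
partition, and volume names are placeholders:

```shell
# Create a read/write volume on the first server.
vos create fs1.example.com /vicepa project.data

# Define readonly replication sites on all three servers.
vos addsite fs1.example.com /vicepa project.data
vos addsite fs2.example.com /vicepa project.data
vos addsite fs3.example.com /vicepa project.data

# Publish a snapshot of the read/write volume to the readonly sites.
vos release project.data
```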

Clients will communicate with any file server in the cell that holds
the volumes that must be accessed.  File servers that do not hold any
volumes of interest will not be contacted.

For servers containing replicated volumes, the client's order of
preference can be configured.
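On the client, those preference ranks can be inspected and adjusted
with fs; lower ranks are preferred.  The host names and rank values
below are illustrative only:

```shell
# Show the ranks the Cache Manager has assigned to known file servers.
fs getserverprefs

# Prefer the local server by giving it a lower rank than a remote one.
fs setserverprefs -servers fs1.example.com 10000 fs2.example.com 40000
```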

AFS clients are quite efficient when reading data across a WAN.
The AFS Cache Manager is designed explicitly for that purpose.
Once data is read by the client there is no need to read the
data a second time from the file server unless either the data
is flushed from the cache or the data is altered on the file
server.  This is true regardless of the type of volume the data
is read from.

At the present time, failover is only provided between readonly
volumes because only readonly volume instances can be replicated
across multiple servers.

There are plans to add read/write replication and, if all goes
well, it may be completed by the end of this calendar year.  Note
however that different people have different requirements for
read/write replication.  What we propose to implement will have
the property of a single master for writes and lock acquisition
with lazy replication to the replicas.  This will provide
automated replication without the risk that two clients
communicating with independent servers will simultaneously update
the same files.

Until such time as read/write replication is available, other
methods must be devised if replication to each office is required.

>>  > > 03.  Each server will need some kind of failover capability (either
>>  >  > duplicating to a read-only volume or some kind of failover service on
>>  >  > server side).  Needs to be invisible to the client.
>>  >  >
>>  >  Currently, seamless failover is only possible between read-only copies
>>  >  of volumes hosted on different servers/partitions. Read-only copies are
>>  >  only updated when the  "vos release" command is run as an AFS admin.
>  There may be situations where read-only seamless failover is acceptable.

It is important to be aware that failover is not from a read/write
volume to a readonly volume.  The choice is either to access readonly
copies and obtain failover, or to access the single read/write instance
and forgo failover.

>>  >  That said, there are other options that are not as seamless or have a
>>  >  short, but noticeable failover time.
>  Yup.  Currently HACMP is being looked at as the solution for current
>  failover requirements...but if we can better distribute the data, have
>  failover be more seamless, and have the same "view" of at least a
>  portion of the data accessible in all locations....we'd be going a
>  long way to accomplishing our goal and having less waste in
>  resources...(hopefully).
>>  > > All of these file servers will be a repository of files for
>>  >  > applications running on other servers (files will be written from
>>  >  > other servers and read from others).  Original idea was to use NFS
>>  >  > where possible and FTP/SAMBA for clients that do not support NFS
>>  >  > (Windows).
>>  >  >
>>  >  > 04.  Each server could have files/images/documents/audio/whatever
>>  >  > related to a particular topic.  All these files will need to be
>>  >  > brought together in a shared hierarchy from all of these servers into
>>  >  > one hierarchy.
>>  >  >
>>  >  >
>>  >  AFS offers one globally unified filesystem/hierarchy for all files on
>>  >  all server. Download the OpenAFS client to see how this works. No need
>>  >  to set up a server or get an account.
>>  >
>>  > > Initial design was to have one of the file servers act as a
>>  >  > centralized connection point, synchronize all files back to it, and
>>  >  > have all the processing done on it.
>>  >  >
>>  >  >
>>  >  > I guess I'm curious if I can configure OpenAFS to have space shared
>>  >  > and synchronized, fully writeable, across all 3 of the servers so it
>>  >  > could be mounted as one filesystem or one drive letter (windows
>>  >  > clients).  OR...if this is not possible...how quickly does
>>  >  > synchronization happen?  If a job were run to pull files together on
>>  >  > one server....would the replicated copies get updated fairly quickly?
>>  >  >
>>  >  >
>>  >  The data could be split across the three servers according to the
>>  >  directory structure. All files appear in one filesystem as one drive
>>  >  letter no matter what server they are located on.
>  Well.....OK...for the job that needs to be kicked off to gather files
>  to a central location (via symlinks or whatever, matters not) then
>  shares them back out to Windows clients, I need ALL that data to be
>  accessible from any of the regions the clients are located in from a
>  local server.  I'm still somewhat confused after reading up on some of
>  the docs exactly how things are organized or how I can achieve the
>  functionality I want.  I expect that applications that talk to each of
>  these servers in each of these areas will need to drop data to all of
>  these systems simultaneously (but not the same data...but maybe need
>  to dump the data in the same hierarchies).  Logical cell distinctions
>  would be based on projects, tests, or application implementations (at
>  least that's how the main export/mount points would be defined).

I believe you are confusing cells and volumes.  It is unlikely that you
will want a separate cell for each project, test or application. 
Instead you want one or more volumes dedicated to each project, test
or application.

The directory hierarchy is created by linking volumes together using
mount points.  Mount points are either normal or read/write.  When
crossed, a normal mount point will select a readonly volume instance if
one exists, or the read/write volume if no readonly exists.  A
read/write mount point will always select the read/write volume
instance.
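Mount points are created with fs mkmount.  A sketch; the cell,
directory, and volume names are examples:

```shell
# Normal mount point: clients crossing it prefer a readonly instance.
fs mkmount -dir /afs/example.com/project -vol project.data

# Read/write mount point: always resolves to the read/write instance.
# By convention the read/write path uses a leading dot on the cell name.
fs mkmount -dir /afs/.example.com/project -vol project.data -rw
```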

Mount points can cross cell boundaries but unless the projects require
separate administrative access controls it is not worth creating 
additional cells.  A file server cannot be in more than one cell.

>>  >  Synchronization of read-write to read-only copies is done manually and
>>  >  depends on the amount of data that has changed since the last sync. That
>>  >  said, you could script it to happen upon job completion.
>  Oh?  I thought there was a daemon (from my reading) that kept the
>  copies in sync w/ the master copies.  Not correct?  Hmm...that was
>  part of the appeal, getting away from a bunch of timed jobs and other
>  watchdog scripts....

AFS readonly volumes are used to publish a snapshot of a read/write
volume.  The volume release copies all directories and the full contents
of any file that changed since the last snapshot was taken.  There is no
automated release process.
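Since there is no automated release daemon, the job itself can run
vos release when it finishes.  A minimal sketch of such a post-job
hook; the job command and volume name are hypothetical:

```shell
#!/bin/sh
# Publish the updated data to the readonly replicas only if the
# gathering job succeeded.
if /usr/local/bin/gather-files; then
    vos release project.data
else
    echo "job failed; readonly replicas left unchanged" >&2
    exit 1
fi
```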

>>  > > Right now the plan is:
>>  >  >
>>  >  > Application servers NFS mount from the file server in each area, write
>>  >  > files to it.  Other servers will mount those same spots to take
>>  >  > files and do things w/ them.  Some Windows servers will need to write
>>  >  > files to the servers.  Where they need to write, it was going to be
>>  >  > via FTP.  Windows clients will need to retrieve files after jobs are
>>  >  > run (the job that will pull files from all the servers).  As of right
>>  >  > now, one server will be chosen as the box that all the files are
>>  >  > copied to for the jobs to run and the server will run SAMBA.  The
>>  >  > windows clients will connect to a directory to pull those files that
>>  >  > have resulted from the job run.  ALSO, Active Directory authentication
>>  >  > needs to be supported (preferably seamlessly).
>>  >  >
>>  >  > Would OpenAFS make any of this easier?
>>  >  >
>>  >
>>  >  With openafs, the different servers would mount the global AFS
>>  >  filesystem and just read or write to certain directories. The openafs
>>  >  client seamlessly locates the files on the correct server. You can move
>>  >  the files (volumes) between servers without needing to reconfigure the
>>  >  clients.
>  ..but if a file was called that was needed, but not on a volume on a
>  local server, it would be pulled anyway (as long as it was part of the
>  cell that was mounted up), correct?

As long as the server holding the volume in which the file resides is
accessible, the file can be accessed by the AFS client.  AFS is a
location independent file system.  The location of volumes within the
cell may be changed at any time without adverse impact on the clients
that access them.
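Moving a volume between servers, for instance, is a single
administrative command; clients discover the new location through the
Volume Location Database.  Server and volume names here are examples:

```shell
# Relocate the read/write volume; no client reconfiguration needed.
vos move -id project.data \
    -fromserver fs1.example.com -frompartition /vicepa \
    -toserver fs2.example.com -topartition /vicepb
```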

>>  >  OpenAFS has clients for AIX, Windows, Mac, Linux and many others. The
>>  >  server portion runs on AIX, Linux, Unix, Mac OS X and others. The
>>  >  windows server is NOT recommended for production, but the windows client
>>  >  works just fine.
>  Nope...no Windows servers for this function anyway.  The file servers
>  will all be AIX currently (possibly Linux or HP-UX, but most likely
>  AIX as most of the rest of the application environment is already
>  running on AIX).

I recommend Solaris as the best server platform for OpenAFS.  The 
debugging tools are far superior to any of the other platforms on which
OpenAFS runs.

>>  >  Active Directory has been successfully used as the AFS Kerberos server.
>>  >  Check out these slides:
>>  >  http://workshop.openafs.org/afsbpw06/talks/shadow.html
>  Thanks, I'll take a look at them.

From the perspective of OpenAFS, Active Directory is just a Kerberos v5
KDC.  You create a user account with the SPN "afs/<cell>@<DOMAIN>",
export a keytab, and then import the key into the AFS keyfile.
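A sketch of that procedure; the account, cell, realm, kvno, and file
names are placeholders, and the enctype must be one both sides support:

```shell
# On the Active Directory side (Windows), map the SPN to a user
# account and export a keytab, e.g.:
#   ktpass -princ afs/cell.example.com@AD.EXAMPLE.COM -mapuser afssvc \
#          -crypto DES-CBC-CRC -pass * -out afs.keytab

# On an AFS server, import the key into the KeyFile.  The kvno (3
# here) must match the kvno of the keytab entry.
asetkey add 3 /path/to/afs.keytab afs/cell.example.com@AD.EXAMPLE.COM
```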

Jeffrey Altman
