[OpenAFS] will OpenAFS serve my needs?

F. Even openafslists@elitists.org
Sat, 29 Mar 2008 23:19:10 -0500


On Sat, Mar 29, 2008 at 11:17 PM, F. Even <openafslists@elitists.org> wrote:
> On Sat, Mar 29, 2008 at 10:26 PM, Jason Edgecombe
>  <jason@rampaginggeek.com> wrote:
>  > Answers inline.
>  >
>  >
>  >  F. Even wrote:
>  >  > I'm trying to figure out if OpenAFS can accomplish what I need it to.
>  >  >
>  >  > Here are my requirements:
>  >  >
>  >  > All servers are generally AIX unless specified.
>  >  >
>  >  I have read of people running OpenAFS on AIX, but I'm not sure how many
>  >  sites are doing so.
>  >
>  >
>  >  > 01.  3 file servers in distinct geographic parts of the country (while
>  >  > of course not the same subnet, all the networks are connected).
>  >  > 02.  Each file server will have files that will be unique (I'm
>  >  > guessing could be mapped back to unique cells).
>  >  >
>  >  You could have all three servers in the same cell. Different paths would
>  >  seamlessly map to different servers.
>
>  Each of these servers would be in a different geographic section of
>  the country...but it's feasible that, depending on what server a client
>  is connected to, they'll need access to all files identified (for
>  copying to a common place) as a result of a job run that identifies
>  files with related topics.  So...would that cell be read/write across
>  geographic/subnet boundaries on 3 separate servers?  Would there be
>  some kind of sync of all of the cell data on each also so availability
>  would not be interrupted, or would I need to create a separate
>  read-only cell for that function also on each of these servers?
>
>
>  > > 03.  Each server will need some kind of failover capability (either
>  >  > duplicating to a read-only volume or some kind of failover service on
>  >  > server side).  Needs to be invisible to the client.
>  >  >
>  >  >
>  >  Currently, seamless failover is only possible between read-only copies
>  >  of volumes hosted on different servers/partitions. Read-only copies are
>  >  only updated when the  "vos release" command is run as an AFS admin.
>
>  There may be situations where read-only seamless failover is acceptable.
>
>
>  >  That said, there are other options that are not as seamless or have a
>  >  short, but noticeable failover time.
>
>  Yup.  Currently HACMP is being looked at as the solution for current
>  failover requirements...but if we can better distribute the data, have
>  failover be more seamless, and have the same "view" of at least a
>  portion of the data accessible in all locations....we'd be going a
>  long way to accomplishing our goal and having less waste in
>  resources...(hopefully).
>
>
>  > > All of these file servers will be a repository of files for
>  >  > applications running on other servers (files will be written from
>  >  > other servers and read from others).  Original idea was to use NFS
>  >  > where possible and FTP/SAMBA for clients that do not support NFS
>  >  > (Windows).
>  >  >
>  >  > 04.  Each server could have files/images/documents/audio/whatever
>  >  > related to a particular topic.  All these files will need to be
>  >  > brought together in a shared hierarchy from all of these servers into
>  >  > one hierarchy.
>  >  >
>  >  >
>  >  AFS offers one globally unified filesystem/hierarchy for all files on
>  >  all servers. Download the OpenAFS client to see how this works. No need
>  >  to set up a server or get an account.
>  >
>  > > Initial design was to have one of the file servers act as a
>  >  > centralized connection point, synchronize all files back to it, and
>  >  > have all the processing done on it.
>  >  >
>  >  >
>  >  > I guess I'm curious if I can configure OpenAFS to have space shared
>  >  > and synchronized, fully writeable, across all 3 of the servers so it
>  >  > could be mounted as one filesystem or one drive letter (windows
>  >  > clients).  OR...if this is not possible...how quickly does
>  >  > synchronization happen?  If a job were run to pull files together on
>  >  > one server....would the replicated copies get updated fairly quickly?
>  >  >
>  >  >
>  >  The data could be split across the three servers according to the
>  >  directory structure. All files appear in one filesystem as one drive
>  >  letter no matter what server they are located on.
>
>  Well.....OK...for the job that needs to be kicked off to gather files
>  to a central location (via symlinks or whatever, matters not) then
>  shares them back out to Windows clients, I need ALL that data to be
>  accessible from any of the regions the clients are located in from a
>  local server.  I'm still somewhat confused after reading up on some of
>  the docs exactly how things are organized or how I can achieve the
>  functionality I want.  I expect that applications that talk to each of
>  these servers in each of these areas will need to drop data to all of
>  these systems simultaneously (but not the same data...but maybe need
>  to dump the data in the same hierarchies).  Logical cell distinctions
>  would be based on projects, tests, or application implementations (at
>  least that's how the main export/mount points would be defined).
>
>
>  >  Synchronization of read-write to read-only copies is done manually and
>  >  depends on the amount of data that has changed since the last sync. That
>  >  said, you could script it to happen upon job completion.
>
>  Oh?  I thought there was a daemon (from my reading) that kept the
>  copies in sync w/ the master copies.  Not correct?  Hmm...that was
>  part of the appeal, getting away from a bunch of timed jobs and other
>  watchdog scripts....
>
>
>  > > Right now the plan is:
>  >  >
>  >  > Application servers NFS mount from the file server in each area, write
>  >  > out files to it.  Other servers will mount those same spots to take
>  >  > files and do things w/ them.  Some Windows servers will need to write
>  >  > files to the servers.  Where they need to write, it was going to be
>  >  > via FTP.  Windows clients will need to retrieve files after jobs are
>  >  > run (the job that will pull files from all the servers).  As of right
>  >  > now, one server will be chosen as the box that all the files are
>  >  > copied to for the jobs to run and the server will run SAMBA.  The
>  >  > windows clients will connect to a directory to pull those files that
>  >  > have resulted from the job run.  ALSO, Active Directory authentication
>  >  > needs to be supported (preferably seamlessly).
>  >  >
>  >  > Would OpenAFS make any of this easier?
>  >  >
>  >
>  >  With openafs, the different servers would mount the global AFS
>  >  filesystem and just read or write to certain directories. The openafs
>  >  client seamlessly locates the files on the correct server. You can move
>  >  the files (volumes) between servers without needing to reconfigure the
>  >  clients.
>
>  ..but if a file was called that was needed, but not on a volume on a
>  local server, it would be pulled anyway (as long as it was part of the
>  cell that was mounted up), correct?
>
>
>  >  OpenAFS has clients for AIX, Windows, Mac, Linux and many others. The
>  >  server portion runs on AIX, Linux, Unix, Mac OS X and others. The
>  >  windows server is NOT recommended for production, but the windows client
>  >  works just fine.
>
>  Nope...no Windows servers for this function anyway.  The file servers
>  will all be AIX currently (possibly Linux or HP-UX, but most likely
>  AIX as most of the rest of the application environment is already
>  running on AIX).
>
>
>  >  Active Directory has been successfully used as the AFS Kerberos server.
>  >  Check out these slides:
>  >  http://workshop.openafs.org/afsbpw06/talks/shadow.html
>
>  Thanks, I'll take a look at them.
>
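
On the point above about scripting the sync upon job completion: a
minimal sketch of what that could look like, assuming the job knows which
read-write volumes it touched and runs with AFS admin credentials. The
volume names and the helper names (build_release_command,
release_volumes) are made up for illustration; only "vos release" itself
comes from the thread.

```python
import subprocess


def build_release_command(volume, cell=None, verbose=False):
    """Return the argv list for 'vos release' on one read-write volume."""
    cmd = ["vos", "release", "-id", volume]
    if cell:
        # Optional: name the cell explicitly instead of using the local default.
        cmd += ["-cell", cell]
    if verbose:
        cmd.append("-verbose")
    return cmd


def release_volumes(volumes, cell=None, runner=subprocess.run):
    """Release each volume in turn and return the names that failed.

    'runner' is injectable so the logic can be exercised without AFS;
    by default it is subprocess.run, which actually invokes vos.
    """
    failed = []
    for volume in volumes:
        result = runner(build_release_command(volume, cell))
        if result.returncode != 0:
            failed.append(volume)
    return failed
```

The job's last step would then be something like
release_volumes(["proj.results", "proj.images"]) (hypothetical volume
names), with a non-empty return value treated as a failed sync. This is
still a script tied to the job rather than a daemon, so it does not
remove the need for some scheduling glue.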