[OpenAFS] What filesystem?

Christopher D. Clausen cclausen@acm.org
Mon, 6 Mar 2006 09:17:23 -0600


Rodney M Dyer <rmdyer@uncc.edu> wrote:
> At 11:45 AM 3/4/2006, Christopher D. Clausen wrote:
>> Actually, that is pretty much all it is.  It's a namespace for CIFS
>> shares.  The replicas work with the File Replication Service (frs)
>> and can operate even without a domain controller, although most
>> places will want the data hosted in active directory.  The file
>> replication can be more advanced than one to many replication. Active 
>> Directory has a concept of "sites" and this can be utilized
>> to save data transmission time over WAN links between branch
>> offices.  Group Policy can be used to point different sets of
>> machines at different servers by default, failing over to other
>> sites when needed.
>
> And for these reasons you can forget common name space ACLs, groups,
> and such unless you have a single AD where all your shares are from
> AD domain member servers and you are a member of the domain.  And
> since it is CIFS based, there's no concept of a local cache to reduce
> network traffic.  As well the Dfs root node must be authenticated to,
> meaning clients at home (or remote on the Internet) must be members
> of the AD domain that the root node belongs to.  And btw, who in
> their right mind would share out CIFS over the Internet these days? A 
> strict comparison of AFS to Dfs would look something like...

Well, at UIUC we have a common campus-wide domain, so it's a FEATURE 
that all shares are under it: seamless Kerberos logons between 
disparately managed file servers.  I don't think it's possible to 
delegate control like that with AFS.

I'm going to have to disagree with your statement about caching as 
well, unless I totally misunderstand how offline files work.  And that 
is in fact another feature: offline access to remote files.  (AFS sort 
of has this on Windows, but only on Windows, and via the same 
offline-files mechanism.)

As to sharing CIFS over the Internet, I certainly would do so but 
unfortunately the matter has been decided and the campus firewall blocks 
MS RPC and CIFS traffic.  However, the common refrain I keep hearing is 
"use the VPN client" to gain access to campus resources.  I am not 
really a fan of the Cisco VPN that UIUC offers, but there is nothing 
preventing me from running my own "Routing and Remote Access" server 
allowing access with the much friendlier native Windows client.  (One 
could argue that VPNed access provides a higher level of encryption than 
AFS does, although one can also use the VPN with AFS.)
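
To be fair, AFS does have optional wire encryption of its own, 
switchable per client; it's just the weak fcrypt (DES-derived) kind, 
which is why the VPN's crypto rates "higher":

    fs setcrypt on    # turn on the client's fcrypt wire encryption
    fs getcrypt       # confirm the current setting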

> AFS:  Yes, Yes, Yes, Yes, Yes, Yes
> Dfs:  No, No, Sometimes, Maybe, Unsure, Not documented, etc. etc...

Again, it depends on your needs.  After working with NTFS ACLs I find 
AFS ones limiting.  There appears to be no way to let someone create 
mount points without also letting them change ACLs.  Why is that? 
There isn't an easy way to have shared scratch space that prevents 
users from nuking files that aren't their own.  And there is no way to 
grant access to a directory and NOT have that same access inherited by 
newly created sub-directories.
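
To make the scratch-space complaint concrete (cell name hypothetical): 
AFS ACLs are per-directory, the delete bit covers everything in the 
directory, and a new sub-directory starts life with a copy of its 
parent's ACL:

    # let all authenticated users create and delete files in scratch/
    fs setacl /afs/example.edu/scratch system:authuser rlidw
    # ...but 'd' is directory-wide: any user can now delete any other
    # user's files, since AFS has no per-file ownership check

    # and a new sub-directory copies the parent's ACL at creation:
    mkdir /afs/example.edu/scratch/mine
    fs listacl /afs/example.edu/scratch/mine  # same entries as scratch/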

Not needing to set up the equivalent of PTS is a feature IMHO as well. 
Why do I need a separate account just for the filesystem?  Or, for 
that matter, a token?  Not needing to have end-users worry about 
tokens can be a feature.  (Not being able to deny access to the 
filesystem by unlogging is a non-feature.)
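
For the non-AFS folks, the token dance I mean is roughly this (klog 
for the classic kaserver path; kinit plus aklog on a Kerberos 5 cell):

    klog cclausen    # authenticate and obtain an AFS token
    tokens           # list the tokens currently held
    unlog            # discard them -- the step CIFS has no equivalent of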

I'm going to have to say that AFS has some strange "undocumented 
features" as well.  Like how the owner of the root of a volume gets 
extra rights...  And the system:ptsviewers group... and the refresh time 
on IP-based ACLs...  (I could have missed all this in the pages of 
documentation though, and yes, I know it's being worked on.)
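
The IP-based ACL one, for anyone who hasn't hit it: you create pts 
entries named after the addresses, group them, and put the group on an 
ACL; the surprise is that the fileserver only refreshes its idea of 
host membership every couple of hours (sketch, names hypothetical):

    pts createuser 128.174.13.7               # a host (IP) entry in pts
    pts creategroup cclausen:lab-hosts
    pts adduser 128.174.13.7 cclausen:lab-hosts
    fs setacl /afs/example.edu/dropbox cclausen:lab-hosts rliw
    # membership changes can take ~2 hours to reach the fileserver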

> About the only thing Dfs has going for it -are- the automatic
> replication facilities, which most people in the AFS world find
> rather odd.  You see, in the AFS world, most of the tree is in the
> form of read-only replicated sites.  Only the user volumes and a few
> others are read-write.  And when a read-write volume goes down, you
> really don't want to "fail-over".  The business of read-write volumes
> for critical data should be in the form of RAID, not fail-over
> volumes.

Well, not everyone here at UIUC can afford RAID setups.  I've worked 
in several departments where the file server is just a desktop machine 
with a single hard drive.  Convincing them to spend money on an entire 
redundant server is easier than on expensive disk.  ("I got this 300GB 
USB drive for $200.  Why do you need $7000 for an array?")  We both 
know that RAID is good, and if you have the money, use RAID with a 
replication partner.

RAID doesn't help if a processor or a power supply goes bad, or an OS 
upgrade freaks out, or whatever.  As in, it's not protection against 
SERVER downtime; it protects only against disk failures.  (And being 
in a place that uses 3-year-old Dell desktops as fileservers, this is 
a concern.)

Only the user volumes are read-write?  All of my important data is in 
user volumes.  Read-only data is easy to keep online: just replicate 
it.  The important stuff is what people are working on RIGHT NOW, and 
if it's down, bad things happen.
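
And "just replicate it" really is a two-command affair (server, 
partition, and volume names hypothetical):

    vos addsite fs2.example.edu a www    # register a read-only site
    vos release www                      # push the RW data to the RO sites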

> It's kind of hard to explain, but Windows people and Unix people think
> differently about what they are serving out.  AFS grew up in a world
> where Unix admins wanted to distribute executable "applications" as
> well as data from a single name space tree.  In the enterprise, Unix
> workstations actually run their applications off of the network.

Warning: I am a Windows person.

I'm not at an enterprise.  I'm at a University.  My systems at home are 
more advanced than most of the "production" services I find in use 
throughout campus.  (Not at the campus IT level, but at a department 
level: single fileserver, 100 clients max.)

> Very few people using AFS actually install applications on their
> local workstations.  This makes application distribution to thousands
> of workstations into a single vos replicate command.  Try that with
> Windows!  Windows application installation and distribution is a
> royal pain in the ass for admins and is why Windows has such a high
> cost of ownership for the enterprise.  Ever try running applications
> off your Windows server Dfs tree?

Yes, actually.  Once you trick them into installing, it works just fine.

> For that matter, how many Windows
> applications have you come across that are actually made to install
> on the network?

Very few, and this was annoying.  I had to create a local partition 
"S:" (for software) and install things there, then copy into AFS and 
mount the path as S: on each client machine.  (Same trick for Dfs 
installs as well.)  Almost everything works using this technique.  The 
problem appears to be mainly the installers, not the apps themselves. 
You've got to watch out for things that install as system services or 
Explorer extensions though.  That causes PAIN.  Unless of course you 
ACL everything system:anyuser rl so that the SYSTEM account can read 
it.  Probably not legal to do in all instances... but that's another 
matter...
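
Concretely, the client-side half of the trick looks something like 
this (cell and path hypothetical; the OpenAFS Windows client exposes 
/afs through its SMB gateway, by default as \\AFS\all):

    REM map the software tree as S: on each client
    net use S: \\AFS\all\example.edu\software /persistent:yes
    REM let the SYSTEM account read the binaries (per directory;
    REM script it over the tree to "ACL everything")
    fs setacl S:\ system:anyuser rl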

Another problem is that 100BASE networks are at least 4 times slower 
than local hard drives.  Who wants to wait for that?  And I have 8GB 
of software installed on my Windows machines; I believe the Windows 
AFS cache size limit is about 1.3GB.  Say all you want about bloated 
software, but it's a fact of life.

Do you really run Visual Studio and AutoCAD directly out of AFS?

> AFS has the concept of the "dotted" cell path which
> gets you to the read-write tree of volumes.  At our site we release
> an application and test it using the "dotted" path before any of our
> real workstations see it in the "read-only" tree.  When it is tested
> and ready, we issue a vos release command, which updates all the
> read-only sites and makes the application available to all the
> workstations.  Through blood, sweat, and tears, we've rigged our
> Windows XP environment to work much like the way our Unix environment
> works and it has saved us a hell of a lot of admin time.

I only have about 20 machines that I directly maintain.  Again, most of 
the data is in user volumes that people access from their own computers.

And the vos release stuff is annoying as well.  Why should I need to 
write an app just to let users release their own replicated volumes? 
What about the user education issue?  "Oh, the website isn't up to 
date because no one has vos released it yet."

Yes, having separate RO and RW paths is good in some cases.  Sometimes 
it isn't.
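
For anyone following along, the RO/RW split works roughly like this 
(cell and volume names hypothetical):

    # edits go through the read-write ("dotted") path:
    vi /afs/.example.edu/www/index.html
    # nothing changes under /afs/example.edu/www until someone runs:
    vos release www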

> <on soapbox>My conspiracy theory is that in the Microsoft Windows
> [snip]
> Remember how slow the network used to be?<off soapbox>

Oh, I agree.  Your view of things and someone else's may well differ. 
I'm just supplying the reasons why someone might choose MS Dfs over 
OpenAFS.  (Having already done the research and picked AFS myself.)

> Windows admins on the other hand tend to put DATA, not applications,
> on the network.  Most Dfs trees lead only to user volumes and some
> shared database files, or common Office shares.  The auto-replication
> facilities of DFS would work fine in this circumstance.

Being a Windows admin, I concur with that.  Why waste the fileserver 
serving out the exact same app every day?  Why not use it for files 
that people actually need to share?  (I'm on a 10BASE network right 
now.  I can assure you that there would be much time wasted waiting 
for Office to start up out of a network filesystem.)

Also, I run some Debian and Mac OS machines as well.  It's still 
easier to install things locally.  Only Solaris appears to actually 
encourage remotely mounted applications.

> We've setup a Dfs tree along side our AFS tree, but haven't used it
> much at all.  Our primary intention was to have something to go to if
> the OpenAFS Windows client turned out to be unsupported when Transarc
> was broken up.  Companies like Network Appliance sell NAS gear that 
> can be used
> in Dfs environments too.  Now, the OpenAFS Windows client has great
> support through developers like Jeffrey Altman of Secure-Endpoints.

Well, we have no money for support of any kind, and for us both 
Microsoft products and OpenAFS are free, so it's easy to try both and 
compare.

> most Windows IT groups are
> interested in setting up big trees of data they actually end up
> turning to NAS or SAN.

Most departments here at UIUC can easily afford $2000 for an additional 
fileserver for replication but cannot afford SAN gear, even low-cost 
iSCSI stuff.  In the end, it comes down to what you can do with the 
money you have.  Dfs is still better than just plain CIFS on a single 
fileserver.

> Like you said, Dfs is primarily an all Windows technology anyway.  If
> you need anything heterogeneous then you must turn to AFS.

This is entirely true.  Again, some places are all-Windows shops where 
Dfs would make sense.  I believe this is why you see more universities 
using AFS: they tend to allow more freedom in IT.  Corporations pick a 
standard and that's it.

<<CDC
-- 
Christopher D. Clausen
ACM@UIUC SysAdmin