[OpenAFS] Re: [AFS3-std] Re: IBM will not re-license OpenAFS .xg files

Fri, 31 Aug 2012 15:38:43 -0400

I agree with most everything Russ says below.

When I am asked why we continue to use AFS in our local department
here at Cornell (the Cornell NanoScale Facility), I need only mention a
couple of features... "distributed file system, cross platform, security
model, built in snapshots" before the questioners realize that AFS does
things that, taken together, pretty much nothing else does well.

Unfortunately, at least for the time being, AFS lost the battle to be
the new Cornell-wide file system. I was not part of that, so I don't
know the specifics of why. The current file service provides CIFS for
Windows and NFS for Mac/Unix (not NFSv4) and itself has a lot of
shortcomings that AFS does not have.

While Cornell is also looking at data in the cloud (and in fact
subscribes to box.net through Internet2's Net+), the cloud is not seen
as a replacement for enterprise file systems. The cloud is seen more as
a file sharing tool. Part of that is because the cloud is "out there"
and not local, and part of that is data security and liability
requirements. For Net+, there are certain things an institution has to
agree to (although "escrow" is not yet one of those things). In
addition, Cornell requires many data security and data liability
"things" in the contracts it signs with cloud vendors, which makes
finding a vendor tough.

If AFS were to solve a lot of the issues Russ mentions, AFS would become
a real contender for a real Cornell-wide file system, which we still do
not have.

That said, we are hoping the auditors do not seek us out here any time
soon. Single DES is high on their list of "red flags", and they would
probably force us off of AFS unless there were a definitive road map and
timeframe for single DES to go away.

I find the idea of an OpenAFS Foundation very interesting and very
intriguing. Having something concrete with a concrete direction for
organizations to contribute to and to fund would be a benefit. My
question is, how does one go about investigating and starting up such a
foundation (has anyone talked to other OSS projects which have started up
similar foundations)? Perhaps whichever institutions are interested
should get together in some sort of forum (IRC, Jabber, WebEx, phone
conference, etc.) to talk about plans.

On Thu, Aug 30, 2012 at 06:44:37PM -0700, Russ Allbery wrote:
> Jeffrey Altman <jaltman@your-file-system.com> writes:
> 
> > For any given protocol standardization proposal, substantial expertise
> > and time is required to perform proper analysis and review let alone
> > write a document in the first place.  On top of that it is very
> > difficult to anticipate all of the needs of a protocol and its viability
> > without implementation experience.  To ensure that the documentation of
> > a protocol is correct, multiple independent implementations are
> > required.
> 
> I want to highlight this.  The primary purpose of protocol specifications
> is to enable interoperability between multiple implementations of the
> protocol.  There are other benefits, of course, such as documentation for
> what guarantees are being made, which help the future maintainability of
> the code, but those can be addressed in other ways than full
> standardization process and third-party review.  The primary benefit that
> third-party review and a full standards process adds is that it ensures
> that the protocol specification is sufficiently complete that someone can
> write an interoperating implementation from nothing but the protocol
> specification.
> 
> Given unlimited resources, I absolutely think we should write protocol
> specifications.  I love having firm protocol specifications.  I tend to
> write them even for software for which there's only one implementation
> (remctl, originally, or WebAuth).  They offer substantial benefits for
> development and aid in thinking through all the complexities of a problem.
> 
> But AFS is currently operating in a seriously resource-starved mode.  I
> completely understand why people have put those resources into things
> other than detailed protocol documentation and review, particularly since
> there is a huge backlog of work that would be required to achieve a
> fully-documented protocol.  It's very similar to the reason why we don't
> have a completely updated manual, except that protocol work is even
> harder than user documentation.
> 
> > The fact is that the first three categories have rarely provided the
> > resources necessary to produce substantial change.  Individuals unless
> > they are independently wealthy simply do not have the time or resources.
> > End user organizations are willing to fund small focused projects that
> > can be achieved on the order of months not years.  End user
> > organizations are unwilling to fund partial work that is reliant upon
> > other organizations to independently fund subsequent segments.
> 
> When this comes up, folks from time to time observe that other open source
> projects manage under those circumstances.  I want to be sure that people
> realize the level of complexity that's involved in OpenAFS, and why
> comparisons to other open source projects don't always work.
> 
> Maintaining OpenAFS involves, among other things:
> 
> * Kernel code as well as userspace code for various UNIXes.
> * Mac OS X development (with quite a bit of OS internals involvement).
> * Windows kernel file system development.
> * High-performance threaded code with a complex lock model.
> * A custom network protocol with substantial complexity.
> * Cryptographic network security models.
> 
> Some of those things (such as the Windows kernel file system work) no
> other project, open source *or* commercial, does at the level that OpenAFS
> does.  This is a level of complexity *far* beyond the typical open source
> project.  The only open source projects I can think of with equivalent
> complexity are primarily maintained by full-time, professional developers
> whose job is to work on that software, and whose salaries are paid by
> companies like Red Hat, Novell, IBM, Google, or Oracle.
> 
> I love volunteer open source, and open source that's produced as an
> incidental side effect of solving problems for one's regular job.  I
> maintain a bunch of it myself.  OpenAFS is not at the level of complexity
> that can be maintained in that model.  That degree of investment might be
> enough to keep it basically running (although I'm dubious about even that
> for Mac OS X and Windows), but not enough to do any substantial new
> development at any sort of reasonable rate.
> 
> > I am stating this here and now because it has a direct impact on the
> > availability of resources to the AFS3 standardization process.  Given
> > the status quo it will be very difficult for YFSI to make its resources
> > available to the AFS3 standardization process until after it has
> > established its product on the market.  As Harald pointed out in this
> > thread and others have at various times on openafs-info@openafs.org, if
> > certain critical features are not made available to the end user
> > community real soon now, large organizations that require that
> > functionality will have no choice but to move their IT services off of
> > AFS3 to another file system protocol.  Many already have and others are
> > known to be evaluating their options.
> 
> Let me give folks a bit of a snapshot of the situation at Stanford, for
> example.
> 
> The general perception among senior IT management, both centrally and in
> the schools, is that the future of user storage is the cloud, with
> services such as Google Drive, Box, Dropbox, and so forth.  The whole
> concept of an enterprise file system is widely considered to be obsolete.
> I'm quite confident that this position is shared by a lot of other people
> in higher education as well.  While that's not the niche that AFS fills
> everywhere, that is the most common niche that it fills in higher ed.
> 
> Back in the NFSv4 days, the discussion we had at a management level around
> AFS and alternatives was in the form of "is this thing sufficiently better
> than AFS that we should switch?"  That's not the conversation that we're
> having any more.  The conversation we're having now is "can we use this to
> get off of that dead-end AFS stuff?"
> 
> I happen to think this position is wrong, and that it would be easier to
> add the UI components to AFS than it will be to solve all the file system
> edge cases in cloud storage.  But I will get absolutely nowhere with that
> argument.  No one is going to believe it; the UI is perceived to be very
> hard, and the advantages offered by something like Google Docs to be
> overwhelming.  Management will believe it when they see it; arguing about
> what AFS *could* become is entirely pointless.
> 
> Where AFS *does* have a very compelling story is around security, because
> security is the huge weakness of these cloud storage platforms.  If you
> try to do any serious authorization management in Google Drive, for
> example, you will immediately discover that their concept of security is
> pure amateur hour.  It's a complete embarrassment.  And auditors and IT
> professionals who are responsible for storing data that has to remain
> secure, such as patient medical data or financial data, know that.  For
> example, the Stanford Medical School has flatly forbidden any of their
> staff to use any cloud storage to store anything related to patient data.
> None of the cloud storage providers will accept the sort of liability that
> would be required to make the university lawyers willing to permit such a
> thing (and rightfully so, given how bad their security is).
> 
> So the AFS *protocol* has a very compelling story here, one that I think I
> can sell.  However, the current OpenAFS *implementation* does not.
> 
> (For the record, NFSv4 isn't any better.  NFSv4 has nearly all of the
> problems of AFS, plus adds a bunch of new problems of its own.)
> 
> To tell this story, I need, *at least*:
> 
> * Full rxgk including mandatory data privacy and AES encryption.
> 
> * A coherent mobile story for how mobile devices and applications are
>   going to access data in AFS, including how they can authenticate without
>   using user passwords (which are increasingly a bad authentication story
>   anywhere but are particularly horrible on mobile devices).
> 
> * Client-side data encryption with untrusted servers, where I can store
>   data with someone I don't trust and be assured that they can't access it
>   in any meaningful way.
> 
> The last is how I get AFS into the cloud storage discussion.  This is
> already how cloud storage is addressing the problem that no one trusts
> cloud storage providers to maintain adequate security.  There are, for
> example, multiple appliances that you can buy that front-end Amazon S3 or
> other cloud storage services but have encryption keys and only store
> encrypted data in the cloud, re-exporting that file system to your data
> center via CIFS or NFS or iSCSI.  That's the sort of thing that AFS is
> competing against.  I think the AFS protocol offers considerable
> advantages over that model, but it needs to offer feature parity on the
> security capabilities.
> 
> I need all of this either deployable or pretty clearly in progress within
> about a year.  Beyond that, I think the conversation is already going to
> be over.  At the least, I'm not going to be particularly interested in
> continuing to fight it.
> 
> Other people will have different lists, of course.  That's just mine.
> There are things other people care about that I don't; for example, if I
> made a list of the top 50 things I want to see in AFS, IPv6 support would
> be #116.  Other people have different constraints.
> 
> Some sites, often due to lack of resources, are willing to stay with stuff
> that isn't really working because it's already there and it's expensive to
> replace it.  I know a lot of sites that are running AFS that way; some of
> them made a decision to get rid of it years and years ago, but still have
> it because it does do a bunch of cool stuff and no one has the heart to
> really make the push to get rid of it.
> 
> I am not one of those people.  I have a deep dislike of staying on zombie
> technologies.  I believe in the potential of AFS, and if AFS will go where
> I need it to go to make a compelling pitch to management, then I want to
> keep it.  But if AFS is not going to go there in a reasonable timeline,
> then I want Stanford to start its migration off of AFS in 2013, free up
> all those resources to invest in technologies that we can keep for the
> long run, and wish everyone the best of luck and go our separate way by
> 2014 or 2015.
> 
> The thing I want to really stress here is that we're right on the border
> of it being too late for AFS as a protocol to find the necessary
> investment.  It is already extremely difficult to get management here to
> consider funding anything related to AFS; it's perceived as a
> technological dead-end and something that is no longer strategic.  There
> is no longer an assumed default in favor of keeping AFS; if anything, the
> assumed default is that AFS will be retired in some foreseeable timeline in
> favor of cloud-based collaboration environments, Amazon S3, and some
> data-center-centric storage such as cluster file systems for HPC.  To get
> a place like Stanford to invest in AFS requires a *proactive case*, a
> justification for why it's better; the "you already have it so it's
> cheaper to invest in it than replace it" argument is no longer working.
> 
> With a pace of OpenAFS development where rxgk maybe wanders into a stable
> release in 2014 or so, and then possibly X.509 authentication and improved
> ACLs in 2015, and a new directory format in 2016 or 2017, and so forth,
> AFS is going to lose a lot of sites.  Almost definitely including
> Stanford.
> 
> -- 
> Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>

-- 
********************************
David William Botsch
Programmer/Analyst
CNF Computing
botsch@cnf.cornell.edu
********************************