[OpenAFS] Puzzler: lack of access to AFS files

Lars Schimmer l.schimmer@cgv.tugraz.at
Mon, 17 Dec 2007 12:06:28 +0100


Jeffrey Altman wrote:
> Rodney M. Dyer wrote:
>> I understand this; however, you need to realize where I'm coming from.
>> We support professors who have research projects that run into the
>> millions of dollars.  Many times these people don't know anything about
>> where their data files are being saved when they choose "File->Save"
>> from an application.  They expect it to work.  We need to be in a
>> position to provide the "works" part.  If they save a valuable data file
>> from an application one day, then return the next and the application
>> won't load it because some random network change updated a few bytes
>> here or there when the file was saved, what do we tell them?  "Oh btw,
>> maybe you should keep a local copy on your USB keychain in case the AFS
>> network fails?"  Most professors don't spend the extra time to run
>> checksums on their files after the save.  This kind of thing doesn't cut
>> it.  I'm the type of "professional" sysadmin who's willing to give up 10
>> percent of my speed for guaranteed delivery.  I'm not some young post
>> high school geek who's got a job running a smallish home network and
>> constantly boasts product x is faster than product y, and that's just
>> uber cool because product y sux'ors!
>
> The data corruption error that was discovered in January and reported by
> David Bolt to OpenAFS RT was fixed in the 15 February 2007 release.  At
> the time of the announcement I stressed the importance of upgrading
> because of the seriousness of the error.
>
> For those who are unfamiliar: if during a background write operation to
> the file server the network drops out for any reason, the daemon thread
> would drop all of the dirty buffers that were in progress on the floor
> and mark them as clean.  The end result would be a hole in the file on
> the file server, either leaving the previous data or a page full of
> zeros.  This error was present in the original OpenAFS 1.0 release.  IBM
> fixed the problem in the 3.6.2.59 release of IBM AFS for Windows.
> OpenAFS fixed it in 1.5.15.
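
To make that failure mode concrete, here is a minimal sketch of a
write-back loop with the bug and with the fix.  The types and helpers
(struct buf, net_write) are invented for illustration; this is not the
actual OpenAFS cache manager code.

    /* Illustrative only: struct buf and net_write() are invented. */
    struct buf {
        struct buf *next;
        int         dirty;        /* 1 = modified, must reach the server */
        char        data[4096];
    };

    int net_write(struct buf *b); /* 0 on success, -1 on network error */

    /* Buggy behavior (pre-1.5.15): a failed write still marked the
     * buffer clean, silently discarding the data and leaving a hole
     * (old contents or zeros) in the file on the server. */
    void writeback_buggy(struct buf *q)
    {
        for (struct buf *b = q; b != NULL; b = b->next) {
            if (!b->dirty)
                continue;
            (void) net_write(b);  /* result ignored */
            b->dirty = 0;         /* BUG: marked clean even on failure */
        }
    }

    /* Fixed behavior: only mark clean on confirmed success, so dirty
     * buffers are retried once the server is reachable again. */
    void writeback_fixed(struct buf *q)
    {
        for (struct buf *b = q; b != NULL; b = b->next) {
            if (b->dirty && net_write(b) == 0)
                b->dirty = 0;
        }
    }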
>
> As for the performance improvements, I'm not on a performance kick for
> the hell of it.  I'm on a performance kick because large OpenAFS users
> have repeatedly mentioned the performance of the Windows client as one
> reason why they are moving away from AFS to CIFS.  In addition, the file
> servers are experiencing serious scalability issues, and a large part of
> the problem is that the Windows clients have not been as smart as they
> could be and have re-requested data from the file servers that should
> have been accessed from the cache.
>
> Stupid things like re-using objects that were recently accessed because
> the queues did not track objects in the order of most recent use.  Being
> forced to read data or directory entries from the file server that were
> just written by the client, because data buffer version numbers weren't
> incremented when merging the updated status data received as a result of
> the write, or because of the failure to locally update the directory
> entries when possible.  Re-issuing FetchStatus calls on .readonly
> volumes prematurely because the volume callback expirations were not
> tracked by each object in the volume.  Some of the changes result in
> improved performance of the client when measured by throughput.  Other
> changes reduced the CPU time required by the client, but most of all,
> the improvements have reduced network traffic and load on the file
> servers.
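
The queue problem Jeffrey mentions first is classic LRU bookkeeping.  A
minimal sketch of the idea (invented types, not the OpenAFS queue code):
touch an object on every access so the tail of the queue is always the
least recently used object, and therefore the safe one to recycle.

    #include <stddef.h>

    struct obj {
        struct obj *prev, *next;
    };

    struct lru {
        struct obj *head, *tail;   /* head = most recent, tail = least */
    };

    static void lru_unlink(struct lru *q, struct obj *o)
    {
        if (o->prev) o->prev->next = o->next; else q->head = o->next;
        if (o->next) o->next->prev = o->prev; else q->tail = o->prev;
        o->prev = o->next = NULL;
    }

    /* Call on every cache hit: keeps the queue in most-recent-use order. */
    void lru_touch(struct lru *q, struct obj *o)
    {
        lru_unlink(q, o);
        o->next = q->head;
        if (q->head) q->head->prev = o;
        q->head = o;
        if (!q->tail) q->tail = o;
    }

    /* Call when a new slot is needed: recycle the least recently used. */
    struct obj *lru_evict(struct lru *q)
    {
        struct obj *victim = q->tail;
        if (victim)
            lru_unlink(q, victim);
        return victim;
    }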
>
> Some of the changes have unfortunately triggered bugs in the file servers
> that in turn have to be fixed.  That is the case with the
> GiveUpAllCallBacks RPC bug that exists in all file servers from 1.3.50
> to 1.4.5.  The attempt to be a good citizen by giving up callbacks, when
> we know that the server will be unable to contact us because we are
> suspended or shut down, resulted in corruption of the file server state
> data and the possibility of eventual file server crashes.
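
The intent of that "good citizen" behavior is roughly the following;
only the idea comes from the message above, and the helper names here
are invented, not real OpenAFS APIs:

    struct server;
    struct server *first_server(void);
    struct server *next_server(struct server *s);
    int  holds_callbacks_from(struct server *s);
    void give_up_all_callbacks(struct server *s);  /* wraps the RPC */

    /* Before suspend/shutdown, tell each file server we hold callbacks
     * from that it may forget them, since it will not be able to break
     * them while we are unreachable. */
    void prepare_for_suspend(void)
    {
        for (struct server *s = first_server(); s != NULL; s = next_server(s)) {
            if (holds_callbacks_from(s))
                /* On 1.3.50-1.4.5 file servers this RPC corrupted the
                 * server's callback state instead of releasing it. */
                give_up_all_callbacks(s);
        }
    }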
>
> I am very thankful for the efforts you put into helping track down the
> thread safety issues in 1.5.26, as well as the issues with the infinite
> loop detection code that was added to 1.5.21, which resulted in client
> crashes.  As you are well aware, the thread safety issues were
> particularly challenging to reproduce and identify.  It is both
> fortunate and unfortunate that your use case was the perfect use case to
> trigger the race condition.  The race condition was finally fixed,
> thanks to your efforts, in 1.5.27.
>
> 1.5.28 in turn fixes additional crash reports that were received by the
> Windows Error Reporting service.  Nothing significant.  The crash
> conditions are so rare that I doubt anyone who did experience them could
> reproduce them.
>
> As I said over the summer, I was truly embarrassed by the quality issues
> in the releases from 1.5.21 to 1.5.25.  I do my best to test things
> given the tools at my disposal.  Unfortunately, I do not have a test
> environment that can replicate all of the possible multiple client
> interactions.
>
>> I am happy with the speed improvements, and I hope we can continue to
>> use AFS.  However, I need to be able to look at people with a straight
>> face when they ask about how well AFS works.
>>
>>      Speed?  Check
>>      Scale?  Check
>>      Functionality?  Check
>>      Reliability?  hrm...
>
> You see, I would actually give us less credit than that:
>
> Speed?  Not so much, but you can get decent performance for specific
> classes of use cases.
>
> Scale?  Well, we have global access, but what Transarc advertised in the
> mid-90s as infinite scalability has not lived up to the claims.  The
> file servers are capable of handling approximately 100 simultaneous
> requests, and when those requests require network traffic to query the
> client's identity, obtain protection data, or communicate with the
> volume database server, the threads sit idle, blocked on the I/O.  The
> actual throughput of a given file server is far below what it needs to
> be if we are truly going to be serving petabytes of data to tens of
> thousands of clients from each file server.
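
As a back-of-the-envelope illustration (the 50 ms and 45 ms figures
below are assumed for the example, not taken from the message): with a
fixed pool of synchronous worker threads, Little's law caps throughput
at the thread count divided by the mean time per request.

    throughput <= threads / mean time per request
    e.g.  100 threads / 50 ms per request  ~=  2,000 requests/sec

    If 45 of those 50 ms are spent blocked on auxiliary I/O (identity
    lookups, protection data, vldb traffic), only 5 ms is real work,
    so the same box could in principle serve roughly 10x the load if
    the threads did not sit idle while waiting.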
>
> Functionality?  Hmm.  Much of the complexity that was added this summer
> for the directory searching was necessary because of the lack of
> functionality in the AFS3 protocols.  The locking issues that everyone
> runs into are also a lack of functionality.  Shall we discuss Unicode
> object names in profile directories and the data corruption that
> produces?  What about the inability to maintain data connectivity due to
> the CIFS client timeouts?  Do you like having Office apps crash on you?
> I sure don't.
>
> Reliability?  Given everything else I actually mark reliability on the
> high side.  At least when there is an issue, we get a fix out ASAP.
>
> The funny thing is that even with all of the negatives I have mentioned,
> I actually think that OpenAFS is the best it has ever been.  I am
> finally at the point where I am willing to say to people that I think
> you should consider OpenAFS for new deployments.  Do we still have
> issues?  Absolutely.  But we also have plans, and we have a growing
> number of skilled developers who are actively contributing to make
> OpenAFS better on a broad range of platforms.
>
> If you need a file system that is going to provide good WAN performance
> with federated authentication and high availability, you really can't
> find anything else out there.

I'd like to mention my/our point of view:
The OpenAFS Windows client has gotten better and faster from release to
release. I still remember the old 1.3.x days; compared to those, 1.5.28
is really far better and faster.
And I don't want to belittle the dev team and the work that has been
done.

We use OpenAFS on 10-20 Windows workstations with integrated logon and
user profiles in OpenAFS filespace.

Problems mentioned by users:
1. Speed: although we are in the 4-10 MB/sec range, we work in the
graphics area and handle models of 1-500 MB. Some folks wish for higher
rates; I'll wait for a native Windows drive instead of the SMB/CIFS
binding before I complain about speed problems ;-)
And the biggest speed problem is Windows itself: loading/saving a 1 GB
profile is damned slow, no matter whether it lives on Windows 2003
server local storage or in OpenAFS space. So no bad marks for OpenAFS;
when loaded daily, I assume OpenAFS is actually faster at loading than
Windows 2003 server local storage.


2. Load/save problems with some programs
E.g., I had a problem with Office on one PC that we tried to solve with
Jeffrey Altman, still with no luck. OpenAFS seems to do everything
right; Office does not. It just waits 30-120 seconds before it opens a
file, and Windows is blocked during that time. For that PC it was
"solved" with a complete system reinstall, working on a local copy of
the Office files and backing up local changes into OpenAFS filespace.
I just want to note that on my laptop and private PC I can work with
those Office files straight out of OpenAFS space with no problem at
all, but I don't use Office as much as the user of that PC.

And the latest example is DeepExploration 5. I don't know why, but
opening a file out of OpenAFS works, while saving that loaded file makes
DeepExploration go mad and do nothing more...
Sorry, I haven't had time to debug that issue yet, but I think it is DE
that goes mad; OpenAFS seems to do everything right.


3. Encoding of names, NTFS non-data streams, big-file support
I know these will be addressed with native Windows drive support, but I
just wanted to note that my users try to use OpenAFS like a normal
drive and sometimes hit these problems.

But one point stands: it IS reliable for me, as soon as the data is on
the AFS server (I use one RW copy and at least 2 RO copies).
So far, no problem with data loss, changed data, or anything else.
Even the "bad" Windows releases 1.5.22-1.5.25 worked flawlessly.
Once data gets onto the server, it is reliable and secure. Getting data
into OpenAFS is sometimes tricky (e.g. DeepExploration), but most of
the time it is as easy as a local drive.
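
For anyone curious what that RW + RO setup looks like administratively,
it is the standard vos replication sequence; the server, partition, and
volume names below are made up for the example:

    vos addsite fs1.example.org /vicepa users.lars   # register first RO site
    vos addsite fs2.example.org /vicepa users.lars   # register second RO site
    vos release users.lars                           # push RW contents to the ROs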



> Jeffrey Altman


Best regards,
Lars Schimmer
--
-------------------------------------------------------------
TU Graz, Institut für ComputerGraphik & WissensVisualisierung
Tel: +43 316 873-5405       E-Mail: l.schimmer@cgv.tugraz.at
Fax: +43 316 873-5402       PGP-Key-ID: 0x4A9B1723