[OpenAFS-devel] Concurrent file create bug

Harald Barth haba@pdc.kth.se
Thu, 01 Jun 2006 22:45:02 +0200 (MEST)


I think I have found a bug where AFS does not follow Unix semantics.
Let me explain. When I create a file with

open(testfile, O_APPEND|O_CREAT|O_RDWR|O_TRUNC, S_IRWXU)

I expect an empty file to be opened or created, whether or not a file
with that name already exists. When I run the operation above from
several hosts at the same time, open() returns EEXIST on some of them!
No, I did not specify O_EXCL. If you have several time-synchronized
hosts that you can rsh to, you can use a program I wrote to reproduce
the error. My program waits until the next "even 10 seconds" (the next
time the wall clock reaches a multiple of 10 seconds) and then runs
open() like this:

    gettimeofday(...);
    usleep(...); /* Wait for even 10 seconds */
    res = open(testfile, O_APPEND|O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);

For the whole program and a shell wrapper that starts up the
whole thing, see

/afs/pdc.kth.se/home/h/haba/src/simultanix/

You'll have to edit at least the shell wrapper for your environment.
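
For readers who cannot reach that AFS path, the wait-then-open step can
be sketched roughly as below. This is not the simultanix source: the
function names usec_until_boundary and open_at_boundary are made up for
illustration, and nanosleep() stands in for usleep() because some
systems reject usleep() arguments of one second or more. Only the
open() flags and mode are taken from the call shown above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Microseconds from `now` until the wall clock next reaches a multiple of
 * `period` seconds. The result is always in 1 .. period * 1000000.
 * (Hypothetical helper, not part of the original program.) */
long usec_until_boundary(const struct timeval *now, long period)
{
    long into;

    assert(period > 0);
    into = (long)(now->tv_sec % period) * 1000000L + (long)now->tv_usec;
    return period * 1000000L - into;
}

/* Sleep until the next boundary, then issue the open() call from the
 * report. Returns 0 on success, or the errno value of the failing call.
 * (Hypothetical helper, not part of the original program.) */
int open_at_boundary(const char *path, long period)
{
    struct timeval now;
    struct timespec ts;
    long left;
    int fd;

    if (gettimeofday(&now, NULL) != 0)
        return errno;
    left = usec_until_boundary(&now, period);
    ts.tv_sec = left / 1000000L;
    ts.tv_nsec = (left % 1000000L) * 1000L;
    nanosleep(&ts, NULL); /* sketch: an early wakeup from a signal is ignored */

    /* Same flags and mode as in the report; note the absence of O_EXCL. */
    fd = open(path, O_APPEND | O_CREAT | O_RDWR | O_TRUNC, S_IRWXU);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Each host would call open_at_boundary(testfile, 10) and print
strerror() of any non-zero result; strerror(EEXIST) is the "File
exists" text seen in the run below.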

Example run:

$ /afs/pdc.kth.se/home/h/haba/src/simultanix/simultanix.sh red-0{1,2,4,5,6}.nada.kth.se
1149186711 8111: red-05.nada.kth.se: OK
1149186711 9312: red-02.nada.kth.se: File exists
1149186711 10419: red-04.nada.kth.se: File exists
1149186711 14909: red-06.nada.kth.se: OK
1149186711 17077: red-01.nada.kth.se: OK

Tested on clients: Solaris 10 with OpenAFS 1.4.79
Tested on clients: Linux 2.6.x with OpenAFS 1.4.0
The server in all cases ran OpenAFS 1.4.1 on Linux.

Harald.