[OpenAFS] Swedish characters in AFS, files created in Windows cannot be used in Mac OS X

Sebastian Flothow flothow@gip.com
Mon, 21 May 2012 16:12:05 +0200

On 19.03.2012 20:16, Jeffrey Altman wrote:
> 2. Finder does display Unicode Strings in composed form but always sends
> strings back to the file system in decomposed form which results in the
> file names not being found.

We've got the same problem with German umlauts, and yes, this seems to
be what's happening. Opening files with an umlaut in the name results in
an error message, whereas directories with umlaut are simply displayed
as empty, without any message, which is particularly confusing for
unsuspecting users. (This is with OpenAFS 1.6.1 on Mac OS X 10.7.4.)

> The UNIX AFS clients do not perform UTF8 string detection and do not
> normalize strings for comparison as is performed on Windows.  In
> addition, on Windows, the Explorer Shell always reflects file names back
> to the file system in the encoding presented by the file system.  In my
> opinion this is what OS X should do but doesn't.
> A radar should be opened with Apple.

Did anyone log this issue with Apple? If so, what's the URL?

However, even if Apple were to change this behavior, it would only fix
one half of the problem, namely, allowing applications on Mac OS X to
access files created on Linux/Windows.

The other way around, though, there are problems too. As a simple test
I've created a file named "täst.txt" on Mac OS X. Now, in a terminal
window on Linux, the file name is displayed properly, but I can't access
the file by typing its name. Using tab completion works, but not if the
part before the first umlaut is ambiguous.
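
To illustrate why typing the name fails, here is a minimal Python sketch
(not AFS code). It assumes the Linux terminal produces the precomposed
form and that the name is matched byte for byte, which is consistent
with the client not normalizing strings for comparison. The variable and
function names are made up for this example.

```python
import unicodedata

# Precomposed form (NFC): "ä" is the single code point U+00E4.
typed_on_linux = "t\u00e4st.txt"

# Decomposed form (NFD): "a" followed by U+0308 COMBINING DIAERESIS,
# which is the form the file created on Mac OS X ends up with.
created_on_mac = unicodedata.normalize("NFD", typed_on_linux)


def utf8_hex(name: str) -> str:
    """Return the UTF-8 bytes of a name as space-separated hex."""
    return " ".join(f"{b:02x}" for b in name.encode("utf-8"))


def same_bytes(a: str, b: str) -> bool:
    """Byte-wise comparison with no normalization (assumed client behavior)."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfc(a: str, b: str) -> bool:
    """Comparison after normalizing both names to precomposed form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(utf8_hex(typed_on_linux))   # 74 c3 a4 73 74 2e 74 78 74
    print(utf8_hex(created_on_mac))   # 74 61 cc 88 73 74 2e 74 78 74
    print(same_bytes(typed_on_linux, created_on_mac))      # False
    print(same_after_nfc(typed_on_linux, created_on_mac))  # True
```

Both strings render identically in a terminal, which is why the listing
looks fine while the typed name does not match.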

On Windows XP, the file name is displayed incorrectly, in that the dots
above the a are shifted to the right by half a character (which sort of
makes sense, given that the combining diaeresis character is stored
after the a; but it's incorrect nevertheless). When opening the file
with notepad, the window title shows it as "ta▫st.txt" (i.e. instead of
the diaeresis, there's a small square between a and s).

In summary, while it is at least possible to access umlaut-y files from
Mac OS X on other platforms, it is rather kludgy. Furthermore, it
strikes me as inelegant to have files with different normalizations
within one filesystem, and given that there are applications which are
very picky about filenames (such as Tivoli Storage Manager), I'm
concerned this might become rather messy.

Therefore, I'm in favor of normalizing file names to precomposed form in
the Mac version of the AFS client, since it would fix both halves of the
problem, and we wouldn't have to wait for Apple to change the behavior
of Mac OS X (if they are going to do so at all).
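
A rough Python sketch of the idea, purely as an illustration: the
directory is modelled as a plain dict, and `create` and `lookup` are
hypothetical stand-ins for the client's name handling, not OpenAFS
functions. The assumption is that every name passing from Mac OS X to
the file server goes through NFC normalization first.

```python
import unicodedata


def to_precomposed(name: str) -> str:
    """Normalize a file name to precomposed form (NFC)."""
    return unicodedata.normalize("NFC", name)


def create(directory: dict, name: str, content: str) -> None:
    """Hypothetical create path: store the name in precomposed form only."""
    directory[to_precomposed(name)] = content


def lookup(directory: dict, name: str):
    """Hypothetical lookup path: normalize the requested name, then match."""
    return directory.get(to_precomposed(name))


if __name__ == "__main__":
    directory = {}
    # Finder hands over the decomposed form: "a" + U+0308.
    from_finder = "ta\u0308st.txt"
    create(directory, from_finder, "hello")

    # The stored name is precomposed, as Linux and Windows would create it.
    print(list(directory) == ["t\u00e4st.txt"])

    # Both spellings now resolve to the same file.
    print(lookup(directory, from_finder))
    print(lookup(directory, "t\u00e4st.txt"))
```

With this approach files created on Mac OS X would be stored under the
same names Linux and Windows would use, and Finder's decomposed requests
would still match names created elsewhere. It would not help with names
that already exist in decomposed form on the server.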

Sebastian Flothow