[OpenAFS-devel] AFS vs UNICODE

Roland Kuhn rkuhn@e18.physik.tu-muenchen.de
Sat, 10 May 2008 13:43:45 +0200



On 10 May 2008, at 11:40, Erik Dalén wrote:
> On Fri, May 9, 2008 at 11:27 PM, <u+openafsdev-t07O@chalmers.se> wrote:
>> Hello Mattias,
>>
>> On Fri, May 09, 2008 at 09:47:17PM +0200, Mattias Pantzare wrote:
>>>> File system drivers can not fully implement the stated change
>>>> unless each process has its own encoding-aware view of the file
>>>> system
>>>
>>> You are forgetting that the AFS client can translate between Unicode
>>> and the native charset on the client. If I run my Solaris client in
>>
>> Note that there is no "character set of the client", the character set
>> depends on the choice of each running process (in the first hand on
>> the locale used by the process).
>
> In Mac OS X there is a "character set of the client", all applications
> use UTF-8-NFD. In Windows the "character set of the client" is now
> UTF-8-NFC.
> But you're right, in other clients there is just a default charset,
> and no guarantee that all applications will use that. However, I think
> it would be useful to at least have the option to for example
> translate between UTF-8-NFC which is used in the cell to ISO-8859-1
> which happens to be the default charset on some clients. Or vice
> versa. Both samba and netatalk have this option, and it helps in some
> situations. But sure, it should be optional on Unix clients.
>
Where does this stupid idea come from that there is something like a
"default charset on some clients"? This whole mess with filesystems
and/or applications that insist on using this notion is nothing but
a major PITA as soon as you have users from different parts of the
world sharing e.g. one institute network. I never actually managed to
get TSM to back up _all_ files, because I could find no locale in which
simply all byte values are legal constituents of file names. That's
not what backup software is supposed to care about: it should take
files and their metadata to a safe place, from which they can be
retrieved IN THEIR ORIGINAL FORM later on. Nothing else.
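
To make the backup point concrete, here is a minimal Python sketch. It is only an illustration, not anything TSM or AFS actually does; the file name and the helper `decodable` are made up for the example. It shows one name that is legal in one encoding but not in another, and that carrying the raw bytes through unchanged loses nothing:

```python
def decodable(name: bytes, encoding: str) -> bool:
    """Return True if the raw file name is valid text in the given encoding."""
    try:
        name.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical file name, written by a process running in an ISO-8859-1 locale.
raw_name = b"Dal\xe9n.txt"

valid_as_latin1 = decodable(raw_name, "iso-8859-1")  # the bytes are fine here
valid_as_utf8 = decodable(raw_name, "utf-8")         # but not in a UTF-8 locale

# A tool that treats the name as opaque bytes never has to ask the question.
# Python's "surrogateescape" error handler is one way to carry undecodable
# bytes through a text API and get the identical bytes back.
as_text = raw_name.decode("utf-8", "surrogateescape")
restored = as_text.encode("utf-8", "surrogateescape")

print(valid_as_latin1, valid_as_utf8, restored == raw_name)
```

The design choice is simply never to interpret the name: whatever bytes went in are the bytes that come back out.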

Don't make that same mistake with a supposedly "global" filesystem.
Keep in mind that filesystem semantics are very clearly defined for
POSIX systems and users expect them to work in that way. And: users
WILL ALWAYS use your system in ways previously unimaginable!
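
As a rough sketch of why this matters (the example name is arbitrary, and the code assumes nothing about how AFS stores names): POSIX treats a file name as a sequence of bytes, so the "same" visible name in the encodings mentioned in this thread is really three different names:

```python
import unicodedata

# Arbitrary example name containing one accented character.
name = "Dal\u00e9n"

# The encodings discussed in this thread, as raw byte sequences.
utf8_nfc = unicodedata.normalize("NFC", name).encode("utf-8")
utf8_nfd = unicodedata.normalize("NFD", name).encode("utf-8")
latin1 = name.encode("iso-8859-1")

for label, raw in (("UTF-8 NFC", utf8_nfc),
                   ("UTF-8 NFD", utf8_nfd),
                   ("ISO-8859-1", latin1)):
    print(f"{label:11} {raw!r}")

# To a byte-oriented filesystem these are three distinct directory entries.
distinct_names = len({utf8_nfc, utf8_nfd, latin1})
```

Any translation layer between these forms therefore changes the name that a process sees compared to the one that was stored.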

Keep it simple.

Ciao,
                     Roland

--
Any society that would give up a little liberty to gain a little
security will deserve neither and lose both.  - Benjamin Franklin
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GS/CS/M/MU d-(++) s:+ a-> C+++ UL++++ P+++ L+++ E(+) W+ !N K- w--- M+ !V Y+
PGP++ t+(++) 5 R+ tv-- b+ DI++ e++++ h---- y+++
------END GEEK CODE BLOCK------



