[OpenAFS-devel] AFS vs UNICODE

Roland Kuhn rkuhn@e18.physik.tu-muenchen.de
Tue, 6 May 2008 22:48:32 +0200


Hi Jeffrey!

On 6 May 2008, at 20:22, Jeffrey Altman wrote:

> Garrett Wollman wrote:
>> <<On Tue, 06 May 2008 10:04:17 -0400, Jeffrey Altman <jaltman@secure-endpoints.com> said:
>>>   1. MacOS X and Linux clients begin to apply NFC to all UTF-8 strings
>>>      obtained from the operating system whether for directory lookup,
>>>      object creation, or symlink target creation.
>> How do they know it's a UTF-8 string?  Traditional Unix semantics
>> provide that a file name is a byte sequence, not a character
>> sequence.
>
> There are algorithms you can use to validate utf-8 sequences.
>

Well, certainly. But I find it very irritating that a filesystem  
should somehow interpret and _change_ a filename based on the  
assumption of UTF-8 encoding, even if the filename's byte sequence  
happens to conform to the UTF-8 rules. Why bother? It's much easier  
and much more portable to regard filenames as opaque byte sequences.
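
To make the concern concrete, here is a minimal sketch in Python. The
helper names is_valid_utf8 and nfc_bytes are hypothetical and are not
OpenAFS code; the sketch only assumes that a client validates a name
as UTF-8 and, if it passes, applies NFC as in the quoted proposal. It
shows that a name which happens to be valid UTF-8 comes back as a
different byte sequence, while a name that is not valid UTF-8 is left
alone.

```python
import unicodedata


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the byte sequence decodes as strict UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def nfc_bytes(name: bytes) -> bytes:
    """Hypothetical client behaviour: NFC-normalize only valid UTF-8 names."""
    if not is_valid_utf8(name):
        # Not UTF-8: treat the name as an opaque byte sequence.
        return name
    text = name.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("utf-8")


# "u" followed by U+0308 COMBINING DIAERESIS (decomposed form).
decomposed = "u\u0308".encode("utf-8")
stored = nfc_bytes(decomposed)

# The bytes handed to the filesystem differ from the bytes stored.
print(decomposed, stored, decomposed == stored)

# A Latin-1 encoded u-umlaut is not valid UTF-8 and passes through unchanged.
latin1 = b"\xfc"
print(is_valid_utf8(latin1), nfc_bytes(latin1) == latin1)
```

An application that creates a file under the decomposed name and then
compares it byte for byte against a directory listing would no longer
find a match, which is exactly the kind of surprise I mean.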

I appreciate that I haven't contributed much, so you are free to  
ignore my ranting, but if there's a technical problem with the above,  
I'd really like to hear the arguments, even in sketchy form.

Ciao,
                     Roland

--
Any society that would give up a little liberty to gain a little
security will deserve neither and lose both.  - Benjamin Franklin
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GS/CS/M/MU d-(++) s:+ a-> C+++ UL++++ P+++ L+++ E(+) W+ !N K- w--- M+ !V Y+
PGP++ t+(++) 5 R+ tv-- b+ DI++ e++++ h---- y+++
------END GEEK CODE BLOCK------



