> Reading this made me realise that our code won't spot case-insensitive
> clashes between non-ASCII characters (both before and after this
> patch). I have no idea how case-insensitive filesystems handle that either.
The most interesting candidate here would be Windows' NTFS. According to the
internet resources I could find, it uses a sequence of 16-bit values for file
names. It supports, but does NOT require or enforce, UTF-16 encoding. I could
not find any precise specification of the case-insensitive semantics. Perhaps
they case-normalize using the standard Unicode semantics /if/ the full name is
valid UTF-16 (and do not normalize otherwise). Or perhaps they use Unicode case
rules for those 16-bit values that form valid UTF-16 (and the literal values
for those that do not). I guess one would have to do experiments to find out
the details. I guess there are still these things they call "code pages"
around, which may or may not influence this.
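
For what it's worth, such an experiment could be sketched roughly as follows.
This is only a probe under my own assumptions, not anything from our code: the
helper name names_clash and the sample pairs are made up for illustration, and
it only tests the filesystem that happens to hold the temporary directory.

```python
import os
import tempfile


def names_clash(first: str, second: str) -> bool:
    """Create a file called `first` in a fresh temporary directory and report
    whether the filesystem also resolves the name `second` to an existing file.

    Hypothetical helper: it only probes the filesystem holding the temp dir.
    """
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, first), "w", encoding="utf-8") as handle:
            handle.write("probe")
        return os.path.exists(os.path.join(tmp, second))


# Sample pairs (my own choice): one ASCII pair, two non-ASCII pairs.
PAIRS = [
    ("readme", "README"),
    ("\u00e9t\u00e9", "\u00c9T\u00c9"),  # e-acute, lower vs upper case
    ("\u03c3", "\u03a3"),                # Greek sigma, lower vs upper case
]

if __name__ == "__main__":
    for first, second in PAIRS:
        print(ascii(first), ascii(second), names_clash(first, second))
```

Running this on an NTFS volume and on a case-sensitive filesystem and comparing
the output should at least show whether non-ASCII pairs are folded at all. It
says nothing about names that are not valid UTF-16, since Python strings cannot
easily express those.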