    udf: Fix leak of UTF-16 surrogates into encoded strings · 44f06ba8
    Jan Kara authored
    The OSTA UDF specification does not say whether the CS0 charset, when
    encoded with two bytes per character, should be treated as UTF-16 or
    UCS-2. The sample code in the standard does not treat UTF-16 surrogates
    in any special way, but on systems such as Windows, which work in UTF-16
    internally, filenames are effectively treated as UTF-16. In Linux it is
    more difficult to handle characters outside the Basic Multilingual
    Plane (beyond 0xffff), as the NLS framework works with 2-byte characters
    only. For now, just make sure we don't leak UTF-16 surrogates into the
    resulting string when loading names from the filesystem.
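
The idea of not leaking surrogates can be illustrated with a small userspace sketch. This is not the code from the patch: the names `is_surrogate` and `cs0_units_to_utf8` are hypothetical, and substituting `'?'` for each surrogate unit is an assumption made for illustration. The sketch assumes the name has already been split out as big-endian 16-bit units and that the target encoding is UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UTF-16 surrogates occupy 0xD800..0xDFFF, i.e. the top five bits are 11011. */
#define SURROGATE_MASK 0xf800u
#define SURROGATE_BASE 0xd800u

/* Hypothetical helper: nonzero if the 16-bit unit is a UTF-16 surrogate. */
static int is_surrogate(uint16_t unit)
{
	return (unit & SURROGATE_MASK) == SURROGATE_BASE;
}

/*
 * Hypothetical sketch: convert big-endian 16-bit units to UTF-8, writing
 * '?' in place of every surrogate unit (an assumed substitution policy).
 * A trailing odd byte in the input is ignored.
 * Returns the number of bytes written, excluding the terminating NUL,
 * or -1 if the output buffer is too small.
 */
static int cs0_units_to_utf8(const uint8_t *in, size_t in_len,
			     char *out, size_t out_size)
{
	size_t o = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i + 1 < in_len; i += 2) {
		uint16_t c = (uint16_t)((in[i] << 8) | in[i + 1]);
		size_t need;

		if (is_surrogate(c))
			c = '?';	/* never emit a surrogate */

		need = c < 0x80 ? 1 : c < 0x800 ? 2 : 3;
		if (o + need >= out_size)	/* keep room for the NUL */
			return -1;

		if (c < 0x80) {
			out[o++] = (char)c;
		} else if (c < 0x800) {
			out[o++] = (char)(0xc0 | (c >> 6));
			out[o++] = (char)(0x80 | (c & 0x3f));
		} else {
			out[o++] = (char)(0xe0 | (c >> 12));
			out[o++] = (char)(0x80 | ((c >> 6) & 0x3f));
			out[o++] = (char)(0x80 | (c & 0x3f));
		}
	}

	if (o >= out_size)
		return -1;
	out[o] = '\0';
	return (int)o;
}
```

With this policy, a name containing a surrogate pair decodes to two `'?'` characters, and no bytes in the range a UTF-8 decoder would reject as an encoded surrogate ever reach the output. Combining a valid pair into a single code point above 0xffff would be the alternative, which the commit message explains is harder while the NLS framework works with 2-byte characters.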
    
    CC: stable@vger.kernel.org # >= v4.6
    Reported-by: Mingye Wang <arthur200126@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
unicode.c 8.89 KB