for_v4.17-rc2 - pub/scm/linux/kernel/git/jack/linux-fs

tag	dfe88746824162bd21b3faa6f58b2883c06d3ed0
tagger	Jan Kara <jack@suse.cz>	Wed Apr 18 17:20:14 2018 +0200
object	44f06ba8297c7e9dfd0e49b40cbe119113cca094


commit	44f06ba8297c7e9dfd0e49b40cbe119113cca094
author	Jan Kara <jack@suse.cz>	Thu Apr 12 17:22:23 2018 +0200
committer	Jan Kara <jack@suse.cz>	Wed Apr 18 16:34:55 2018 +0200
tree	376c1b3c48c949da90bd7cfebd254d2f72d76e98
parent	06856938112b84ff3c6b0594d017f59cfda2a43d

udf: Fix leak of UTF-16 surrogates into encoded strings

The OSTA UDF specification does not say whether the CS0 charset, when
encoded with two bytes per character, should be treated as UTF-16 or
UCS-2. The sample code in the standard does not treat UTF-16 surrogates
in any special way, but on systems such as Windows, which work in UTF-16
internally, filenames would effectively be treated as UTF-16. In Linux
it is more difficult to handle characters outside the Basic Multilingual
Plane (beyond 0xffff) because the NLS framework works with 2-byte
characters only. For now, just make sure we don't leak UTF-16 surrogates
into the resulting string when loading names from the filesystem.

CC: stable@vger.kernel.org # >= v4.6
Reported-by: Mingye Wang <arthur200126@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
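
The surrogate check described in the message can be illustrated with a
minimal standalone sketch. This is not the code from the patch: the helper
names is_utf16_surrogate() and strip_surrogates_be16() are hypothetical, the
'?' replacement character is an assumption, and the input is assumed to be
big-endian 16-bit units as in the two-bytes-per-character CS0 form.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true for any UTF-16 surrogate code unit.
 * Surrogates occupy 0xD800..0xDFFF, so the top five bits are 11011.
 */
static int is_utf16_surrogate(uint16_t unit)
{
	return (unit & 0xF800u) == 0xD800u;
}

/*
 * Hypothetical helper: read big-endian 16-bit units from 'in' and copy
 * them to 'out', replacing every surrogate with '?' (an assumed
 * placeholder) so that no surrogate reaches the resulting string.
 * A trailing odd byte is ignored. Returns the number of units written.
 */
static size_t strip_surrogates_be16(const uint8_t *in, size_t in_len,
				    uint16_t *out, size_t out_cap)
{
	size_t n = 0;

	for (size_t i = 0; i + 1 < in_len && n < out_cap; i += 2) {
		uint16_t unit = (uint16_t)((in[i] << 8) | in[i + 1]);

		out[n++] = is_utf16_surrogate(unit) ? (uint16_t)'?' : unit;
	}
	return n;
}
```

In this sketch each half of a surrogate pair is replaced separately, so one
character outside the Basic Multilingual Plane becomes two placeholders.
Decoding the pair into a single code point would be the alternative, which
the message notes is harder given the 2-byte NLS interface.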

fs/udf/unicode.c

1 file changed
