vt: merge ucs_is_zero_width()/ucs_is_double_width() into ucs_get_width()

The hot path in vc_process_ucs() asks two independent questions about the
same code point -- "is it double-width?" and "is it zero-width?" -- and
answered each with its own bsearch over its own table. Any code point
that got past the leading bounds check therefore paid for two binary
searches of the BMP width tables back to back, for what is logically a
single lookup.

Replace both with a single ucs_get_width(cp) that returns 0, 1, or 2
from one bsearch, while keeping the total table footprint unchanged at
2384 B.

To do so, merge the zero-width and double-width ranges of each region
(BMP and non-BMP) into one table sorted by `first`. BMP entries stay
4 bytes; the per-entry width lives in spare bits of the non-BMP table's
`last` field. Non-BMP code points use only 20 of 32 bits, so each u32
has 12 unused high bits. Store first/last shifted left by 12 and use the
low 12 bits of `last` for metadata: bit 11 is this entry's own width
flag, and bits 0..7 host an 8-bit chunk of the BMP double-width bitmap.
Because the metadata bits sit strictly below the lowest code-point bit,
the bsearch comparator remains a plain u32 compare on shifted keys with
no masking.
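
For illustration only, the packing described above could look roughly
like the userspace sketch below. Every name in it (CP_SHIFT, WIDE_FLAG,
CHUNK_MASK, PACK, demo_table, demo_width, demo_chunk) is hypothetical
and not taken from the patch. It assumes that non-BMP code points are
stored as offsets from 0x10000 so that they fit in 20 bits, and that a
set flag bit means double-width; the two ranges and the chunk values
are sample data, not the real table contents.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CP_SHIFT	12		/* code-point bits start at bit 12 */
#define WIDE_FLAG	(1u << 11)	/* assumed polarity: set = double-width */
#define CHUNK_MASK	0xffu		/* bits 0..7: borrowed bitmap chunk */
#define NONBMP_BASE	0x10000u	/* assumed offset, keeps keys in 20 bits */

struct range32 {
	uint32_t first;		/* (cp - NONBMP_BASE) << CP_SHIFT */
	uint32_t last;		/* same, OR'ed with metadata in the low 12 bits */
};

#define PACK(first, last, meta) \
	{ ((first) - NONBMP_BASE) << CP_SHIFT, \
	  (((last) - NONBMP_BASE) << CP_SHIFT) | (meta) }

/* Sample entries, sorted by `first`; chunk values are arbitrary. */
static const struct range32 demo_table[] = {
	PACK(0x1f300u, 0x1f64fu, WIDE_FLAG | 0x5au),	/* treated as wide */
	PACK(0xe0100u, 0xe01efu, 0x3cu),		/* treated as zero-width */
};

/*
 * Plain u32 compare. The key has its low 12 bits clear, so metadata in
 * `last` can only make `last` larger by less than one code point and
 * never changes the outcome of the range test.
 */
static int range_cmp(const void *key, const void *elt)
{
	uint32_t k = *(const uint32_t *)key;
	const struct range32 *r = elt;

	if (k < r->first)
		return -1;
	if (k > r->last)
		return 1;
	return 0;
}

/* Width of a non-BMP code point; anything not listed is single-width. */
static int demo_width(uint32_t cp)
{
	const struct range32 *r;
	uint32_t key;

	if (cp < NONBMP_BASE || cp > 0x10ffffu)
		return 1;	/* this sketch only covers the non-BMP table */

	key = (cp - NONBMP_BASE) << CP_SHIFT;
	r = bsearch(&key, demo_table,
		    sizeof(demo_table) / sizeof(demo_table[0]),
		    sizeof(demo_table[0]), range_cmp);
	if (!r)
		return 1;
	return (r->last & WIDE_FLAG) ? 2 : 0;
}

/* The 8-bit bitmap chunk hosted by entry i. */
static unsigned int demo_chunk(size_t i)
{
	return demo_table[i].last & CHUNK_MASK;
}
```

With this sample data, demo_width(0x1f600) yields 2 and
demo_width(0xe0100) yields 0, and neither the comparator nor the key
construction ever masks out the metadata.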

In vc_process_ucs() the overwhelmingly common single-width path now
collapses to a single predicted branch:

	if (likely(w == 1))
		return 1;
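
As a rough illustration of how a caller might dispatch on the
three-valued result, consider the toy sketch below. toy_get_width() and
toy_process() are hypothetical stand-ins, not the real ucs_get_width()
or vc_process_ucs(); the two hard-coded ranges are sample data, the
return values of toy_process() are invented for the example, and
likely() is redefined locally so the snippet builds outside the kernel.

```c
#include <stdint.h>

#if defined(__GNUC__)
#define likely(x)	__builtin_expect(!!(x), 1)
#else
#define likely(x)	(x)
#endif

/* Hypothetical stand-in for the merged lookup: returns 0, 1, or 2. */
static int toy_get_width(uint32_t cp)
{
	if (cp >= 0x0300 && cp <= 0x036f)	/* combining diacritical marks */
		return 0;
	if (cp >= 0x4e00 && cp <= 0x9fff)	/* CJK unified ideographs */
		return 2;
	return 1;
}

/* Returns the number of cells this toy caller would advance. */
static int toy_process(uint32_t cp)
{
	int w = toy_get_width(cp);

	if (likely(w == 1))
		return 1;	/* common case: one test, nothing else to classify */
	if (w == 0)
		return 0;	/* zero-width: would combine with the previous cell */
	return 2;		/* double-width: would occupy two cells */
}
```

The point of the shape is that the zero-width and double-width cases
are only distinguished after the single-width test has already failed.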

Note: scripts/checkpatch.pl complains about "Macros with complex values
      should be enclosed in parentheses" for the BMP_*WIDTH and
      RANGE_*WIDTH macros. They are deliberately defined to expand to a
      comma-separated (first, last) pair so they can populate the two
      adjacent fields of a struct initializer; wrapping them in
      parentheses would turn that into a comma-expression and defeat
      the whole construction. Please ignore.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Link: https://patch.msgid.link/20260515034857.2514225-1-nico@fluxnic.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 files changed