riscv: lib: optimize strlen loop efficiency

Optimize the generic strlen implementation by pre-decrementing the
pointer before the loop and incrementing it ahead of each load. This
reduces the loop body from 4 instructions to 3 and eliminates the
unconditional jump ('j').

Old loop (4 instructions, 2 branches):
  1: lbu t0, 0(t1); beqz t0, 2f; addi t1, t1, 1; j 1b

New loop (3 instructions, 1 branch):
  1: addi t1, t1, 1; lbu t0, 0(t1); bnez t0, 1b

Each byte scanned now executes one instruction fewer and one branch
instead of two, which reduces branch pressure on systems without the
Zbb extension.
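
For readers who prefer C, the two loop shapes can be modelled roughly as
below. This is only an illustrative sketch, not the kernel code: the
function names are hypothetical, and an index that starts at (size_t)-1
stands in for the pointer that starts one byte before the string, so the
model stays within defined C behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old shape: test at the top, unconditional jump back at the bottom. */
size_t strlen_old_shape(const char *s)
{
    size_t n = 0;

    for (;;) {
        if (s[n] == '\0')   /* lbu + beqz 2f */
            break;
        n++;                /* addi */
    }                       /* j 1b */
    return n;
}

/*
 * New shape: start one position early, then increment, load and
 * branch back while the byte is non-zero. The unsigned wrap from
 * (size_t)-1 to 0 on the first increment is well defined in C.
 */
size_t strlen_new_shape(const char *s)
{
    size_t n = (size_t)-1;  /* models "addi t1, a0, -1" before the loop */

    do {
        n++;                /* addi */
    } while (s[n] != '\0'); /* lbu + bnez 1b */
    return n;
}

/* Hypothetical self-check: both shapes must agree with strlen(). */
void strlen_shapes_selfcheck(void)
{
    static const char *const samples[] = { "", "a", "riscv", "Zbb\n" };
    size_t i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        assert(strlen_old_shape(samples[i]) == strlen(samples[i]));
        assert(strlen_new_shape(samples[i]) == strlen(samples[i]));
    }
}
```

Both shapes read the same bytes in the same order; the new one folds the
exit test and the back edge into a single conditional branch.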

Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn>
Link: https://patch.msgid.link/20251218032614.57356-1-jiangfeng@kylinos.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
1 file changed