| From 8bfda52ab633a222e753c6f1dca8a476876d37d9 Mon Sep 17 00:00:00 2001 |
| From: Daniel Borkmann <daniel@iogearbox.net> |
| Date: Mon, 28 Jan 2019 21:23:28 +0100 |
| Subject: bpf: prevent out of bounds speculation on pointer arithmetic |
| |
| [ commit 979d63d50c0c0f7bc537bf821e056cc9fe5abd38 upstream ] |
| |
| Jann reported that the original commit back in b2157399cc98 |
| ("bpf: prevent out-of-bounds speculation") was not sufficient |
| to stop CPU from speculating out of bounds memory access: |
| While b2157399cc98 only focussed on masking array map access |
| for unprivileged users for tail calls and data access such |
| that the user provided index gets sanitized from BPF program |
| and syscall side, there is still a more generic form affected |
| from BPF programs that applies to most maps that hold user |
| data in relation to dynamic map access when dealing with |
| unknown scalars or "slow" known scalars as access offset, for |
| example: |
| |
| - Load a map value pointer into R6 |
| - Load an index into R7 |
| - Do a slow computation (e.g. with a memory dependency) that |
| loads a limit into R8 (e.g. load the limit from a map for |
| high latency, then mask it to make the verifier happy) |
| - Exit if R7 >= R8 (mispredicted branch) |
| - Load R0 = R6[R7] |
| - Load R0 = R6[R0] |
| |
| For unknown scalars there are two options in the BPF verifier |
| where we could derive knowledge from in order to guarantee |
| safe access to the memory: i) While </>/<=/>= variants won't |
| allow to derive any lower or upper bounds from the unknown |
| scalar where it would be safe to add it to the map value |
| pointer, it is possible through ==/!= test however. ii) another |
| option is to transform the unknown scalar into a known scalar, |
| for example, through ALU ops combination such as R &= <imm> |
| followed by R |= <imm> or any similar combination where the |
| original information from the unknown scalar would be destroyed |
| entirely leaving R with a constant. The initial slow load still |
| precedes the latter ALU ops on that register, so the CPU |
| executes speculatively from that point. Once we have the known |
| scalar, any compare operation would work then. A third option |
| only involving registers with known scalars could be crafted |
| as described in [0] where a CPU port (e.g. Slow Int unit) |
| would be filled with many dependent computations such that |
| the subsequent condition depending on its outcome has to wait |
| for evaluation on its execution port and thereby executing |
| speculatively if the speculated code can be scheduled on a |
| different execution port, or any other form of mistraining |
| as described in [1], for example. Given this is not limited |
| to only unknown scalars, not only map but also stack access |
| is affected since both is accessible for unprivileged users |
| and could potentially be used for out of bounds access under |
| speculation. |
| |
| In order to prevent any of these cases, the verifier is now |
| sanitizing pointer arithmetic on the offset such that any |
| out of bounds speculation would be masked in a way where the |
| pointer arithmetic result in the destination register will |
| stay unchanged, meaning offset masked into zero similar as |
| in array_index_nospec() case. With regards to implementation, |
| there are three options that were considered: i) new insn |
| for sanitation, ii) push/pop insn and sanitation as inlined |
| BPF, iii) reuse of ax register and sanitation as inlined BPF. |
| |
| Option i) has the downside that we end up using from reserved |
| bits in the opcode space, but also that we would require |
| each JIT to emit masking as native arch opcodes meaning |
| mitigation would have slow adoption till everyone implements |
| it eventually which is counter-productive. Option ii) and iii) |
| have both in common that a temporary register is needed in |
| order to implement the sanitation as inlined BPF since we |
| are not allowed to modify the source register. While a push / |
| pop insn in ii) would be useful to have in any case, it |
| requires once again that every JIT needs to implement it |
| first. While possible, amount of changes needed would also |
| be unsuitable for a -stable patch. Therefore, the path which |
| has fewer changes, less BPF instructions for the mitigation |
| and does not require anything to be changed in the JITs is |
| option iii) which this work is pursuing. The ax register is |
| already mapped to a register in all JITs (modulo arm32 where |
| it's mapped to stack as various other BPF registers there) |
| and used in constant blinding for JITs-only so far. It can |
| be reused for verifier rewrites under certain constraints. |
| The interpreter's tmp "register" has therefore been remapped |
| into extending the register set with hidden ax register and |
| reusing that for a number of instructions that needed the |
| prior temporary variable internally (e.g. div, mod). This |
| allows for zero increase in stack space usage in the interpreter, |
| and enables (restricted) generic use in rewrites otherwise as |
| long as such a patchlet does not make use of these instructions. |
| The sanitation mask is dynamic and relative to the offset the |
| map value or stack pointer currently holds. |
| |
| There are various cases that need to be taken under consideration |
| for the masking, e.g. such operation could look as follows: |
| ptr += val or val += ptr or ptr -= val. Thus, the value to be |
| sanitized could reside either in source or in destination |
| register, and the limit is different depending on whether |
| the ALU op is addition or subtraction and depending on the |
| current known and bounded offset. The limit is derived as |
| follows: limit := max_value_size - (smin_value + off). For |
| subtraction: limit := umax_value + off. This holds because |
| we do not allow any pointer arithmetic that would |
| temporarily go out of bounds or would have an unknown |
| value with mixed signed bounds where it is unclear at |
| verification time whether the actual runtime value would |
| be either negative or positive. For example, we have a |
| derived map pointer value with constant offset and bounded |
| one, so limit based on smin_value works because the verifier |
| requires that statically analyzed arithmetic on the pointer |
| must be in bounds, and thus it checks if resulting |
| smin_value + off and umax_value + off is still within map |
| value bounds at time of arithmetic in addition to time of |
| access. Similarly, for the case of stack access we derive |
| the limit as follows: MAX_BPF_STACK + off for subtraction |
| and -off for the case of addition where off := ptr_reg->off + |
| ptr_reg->var_off.value. Subtraction is a special case for |
| the masking which can be in form of ptr += -val, ptr -= -val, |
| or ptr -= val. In the first two cases where we know that |
| the value is negative, we need to temporarily negate the |
| value in order to do the sanitation on a positive value |
| where we later swap the ALU op, and restore original source |
| register if the value was in source. |
| |
| The sanitation of pointer arithmetic alone is still not fully |
| sufficient as is, since a scenario like the following could |
| happen ... |
| |
| PTR += 0x1000 (e.g. K-based imm) |
| PTR -= BIG_NUMBER_WITH_SLOW_COMPARISON |
| PTR += 0x1000 |
| PTR -= BIG_NUMBER_WITH_SLOW_COMPARISON |
| [...] |
| |
| ... which under speculation could end up as ... |
| |
| PTR += 0x1000 |
| PTR -= 0 [ truncated by mitigation ] |
| PTR += 0x1000 |
| PTR -= 0 [ truncated by mitigation ] |
| [...] |
| |
| ... and therefore still access out of bounds. To prevent such |
| case, the verifier is also analyzing safety for potential out |
| of bounds access under speculative execution. Meaning, it is |
| also simulating pointer access under truncation. We therefore |
| "branch off" and push the current verification state after the |
| ALU operation with known 0 to the verification stack for later |
| analysis. Given the current path analysis succeeded it is |
| likely that the one under speculation can be pruned. In any |
| case, it is also subject to existing complexity limits and |
| therefore anything beyond this point will be rejected. In |
| terms of pruning, it needs to be ensured that the verification |
| state from speculative execution simulation must never prune |
| a non-speculative execution path, therefore, we mark verifier |
| state accordingly at the time of push_stack(). If verifier |
| detects out of bounds access under speculative execution from |
| one of the possible paths that includes a truncation, it will |
| reject such program. |
| |
| Given we mask every reg-based pointer arithmetic for |
| unprivileged programs, we've been looking into how it could |
| affect real-world programs in terms of size increase. As the |
| majority of programs are targeted for privileged-only use |
| case, we've unconditionally enabled masking (with its alu |
| restrictions on top of it) for privileged programs for the |
| sake of testing in order to check i) whether they get rejected |
| in its current form, and ii) by how much the number of |
| instructions and size will increase. We've tested this by |
| using Katran, Cilium and test_l4lb from the kernel selftests. |
| For Katran we've evaluated balancer_kern.o, Cilium bpf_lxc.o |
| and an older test object bpf_lxc_opt_-DUNKNOWN.o and l4lb |
| we've used test_l4lb.o as well as test_l4lb_noinline.o. We |
| found that none of the programs got rejected by the verifier |
| with this change, and that impact is rather minimal to none. |
| balancer_kern.o had 13,904 bytes (1,738 insns) xlated and |
| 7,797 bytes JITed before and after the change. Most complex |
| program in bpf_lxc.o had 30,544 bytes (3,817 insns) xlated |
| and 18,538 bytes JITed before and after and none of the other |
| tail call programs in bpf_lxc.o had any changes either. For |
| the older bpf_lxc_opt_-DUNKNOWN.o object we found a small |
| increase from 20,616 bytes (2,576 insns) and 12,536 bytes JITed |
| before to 20,664 bytes (2,582 insns) and 12,558 bytes JITed |
| after the change. Other programs from that object file had |
| similar small increase. Both test_l4lb.o had no change and |
| remained at 6,544 bytes (817 insns) xlated and 3,401 bytes |
| JITed and for test_l4lb_noinline.o constant at 5,080 bytes |
| (634 insns) xlated and 3,313 bytes JITed. This can be explained |
| in that LLVM typically optimizes stack based pointer arithmetic |
| by using K-based operations and that use of dynamic map access |
| is not overly frequent. However, in future we may decide to |
| optimize the algorithm further under known guarantees from |
| branch and value speculation. Latter seems also unclear in |
| terms of prediction heuristics that today's CPUs apply as well |
| as whether there could be collisions in e.g. the predictor's |
| Value History/Pattern Table for triggering out of bounds access, |
| thus masking is performed unconditionally at this point but could |
| be subject to relaxation later on. We were generally also |
| brainstorming various other approaches for mitigation, but the |
| blocker was always lack of available registers at runtime and/or |
| overhead for runtime tracking of limits belonging to a specific |
| pointer. Thus, we found this to be minimally intrusive under |
| given constraints. |
| |
| With that in place, a simple example with sanitized access on |
| unprivileged load at post-verification time looks as follows: |
| |
| # bpftool prog dump xlated id 282 |
| [...] |
| 28: (79) r1 = *(u64 *)(r7 +0) |
| 29: (79) r2 = *(u64 *)(r7 +8) |
| 30: (57) r1 &= 15 |
| 31: (79) r3 = *(u64 *)(r0 +4608) |
| 32: (57) r3 &= 1 |
| 33: (47) r3 |= 1 |
| 34: (2d) if r2 > r3 goto pc+19 |
| 35: (b4) (u32) r11 = (u32) 20479 | |
| 36: (1f) r11 -= r2 | Dynamic sanitation for pointer |
| 37: (4f) r11 |= r2 | arithmetic with registers |
| 38: (87) r11 = -r11 | containing bounded or known |
| 39: (c7) r11 s>>= 63 | scalars in order to prevent |
| 40: (5f) r11 &= r2 | out of bounds speculation. |
| 41: (0f) r4 += r11 | |
| 42: (71) r4 = *(u8 *)(r4 +0) |
| 43: (6f) r4 <<= r1 |
| [...] |
| |
| For the case where the scalar sits in the destination register |
| as opposed to the source register, the following code is emitted |
| for the above example: |
| |
| [...] |
| 16: (b4) (u32) r11 = (u32) 20479 |
| 17: (1f) r11 -= r2 |
| 18: (4f) r11 |= r2 |
| 19: (87) r11 = -r11 |
| 20: (c7) r11 s>>= 63 |
| 21: (5f) r2 &= r11 |
| 22: (0f) r2 += r0 |
| 23: (61) r0 = *(u32 *)(r2 +0) |
| [...] |
| |
| JIT blinding example with non-conflicting use of r10: |
| |
| [...] |
| d5: je 0x0000000000000106 _ |
| d7: mov 0x0(%rax),%edi | |
| da: mov $0xf153246,%r10d | Index load from map value and |
| e0: xor $0xf153259,%r10 | (const blinded) mask with 0x1f. |
| e7: and %r10,%rdi |_ |
| ea: mov $0x2f,%r10d | |
| f0: sub %rdi,%r10 | Sanitized addition. Both use r10 |
| f3: or %rdi,%r10 | but do not interfere with each |
| f6: neg %r10 | other. (Neither do these instructions |
| f9: sar $0x3f,%r10 | interfere with the use of ax as temp |
| fd: and %r10,%rdi | in interpreter.) |
| 100: add %rax,%rdi |_ |
| 103: mov 0x0(%rdi),%eax |
| [...] |
| |
| Tested that it fixes Jann's reproducer, and also checked that test_verifier |
| and test_progs suite with interpreter, JIT and JIT with hardening enabled |
| on x86-64 and arm64 runs successfully. |
| |
| [0] Speculose: Analyzing the Security Implications of Speculative |
| Execution in CPUs, Giorgi Maisuradze and Christian Rossow, |
| https://arxiv.org/pdf/1801.04084.pdf |
| |
| [1] A Systematic Evaluation of Transient Execution Attacks and |
| Defenses, Claudio Canella, Jo Van Bulck, Michael Schwarz, |
| Moritz Lipp, Benjamin von Berg, Philipp Ortner, Frank Piessens, |
| Dmitry Evtyushkin, Daniel Gruss, |
| https://arxiv.org/pdf/1811.05441.pdf |
| |
| Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation") |
| Reported-by: Jann Horn <jannh@google.com> |
| Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
| Acked-by: Alexei Starovoitov <ast@kernel.org> |
| Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
| Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| include/linux/bpf_verifier.h | 10 ++ |
| kernel/bpf/verifier.c | 185 +++++++++++++++++++++++++++++++++-- |
| 2 files changed, 189 insertions(+), 6 deletions(-) |
| |
| diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h |
| index b01edd2aaa7b..5435bba302ed 100644 |
| --- a/include/linux/bpf_verifier.h |
| +++ b/include/linux/bpf_verifier.h |
| @@ -147,6 +147,7 @@ struct bpf_verifier_state { |
| /* call stack tracking */ |
| struct bpf_func_state *frame[MAX_CALL_FRAMES]; |
| u32 curframe; |
| + bool speculative; |
| }; |
| |
| #define bpf_get_spilled_reg(slot, frame) \ |
| @@ -166,15 +167,24 @@ struct bpf_verifier_state_list { |
| struct bpf_verifier_state_list *next; |
| }; |
| |
| +/* Possible states for alu_state member. */ |
| +#define BPF_ALU_SANITIZE_SRC 1U |
| +#define BPF_ALU_SANITIZE_DST 2U |
| +#define BPF_ALU_NEG_VALUE (1U << 2) |
| +#define BPF_ALU_SANITIZE (BPF_ALU_SANITIZE_SRC | \ |
| + BPF_ALU_SANITIZE_DST) |
| + |
| struct bpf_insn_aux_data { |
| union { |
| enum bpf_reg_type ptr_type; /* pointer type for load/store insns */ |
| unsigned long map_state; /* pointer/poison value for maps */ |
| s32 call_imm; /* saved imm field of call insn */ |
| + u32 alu_limit; /* limit for add/sub register with pointer */ |
| }; |
| int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ |
| int sanitize_stack_off; /* stack slot to be cleared */ |
| bool seen; /* this insn was processed by the verifier */ |
| + u8 alu_state; /* used in combination with alu_limit */ |
| }; |
| |
| #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ |
| diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c |
| index 973ebab5b19d..70fa0eb6ce81 100644 |
| --- a/kernel/bpf/verifier.c |
| +++ b/kernel/bpf/verifier.c |
| @@ -648,6 +648,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, |
| free_func_state(dst_state->frame[i]); |
| dst_state->frame[i] = NULL; |
| } |
| + dst_state->speculative = src->speculative; |
| dst_state->curframe = src->curframe; |
| for (i = 0; i <= src->curframe; i++) { |
| dst = dst_state->frame[i]; |
| @@ -692,7 +693,8 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx, |
| } |
| |
| static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, |
| - int insn_idx, int prev_insn_idx) |
| + int insn_idx, int prev_insn_idx, |
| + bool speculative) |
| { |
| struct bpf_verifier_state *cur = env->cur_state; |
| struct bpf_verifier_stack_elem *elem; |
| @@ -710,6 +712,7 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, |
| err = copy_verifier_state(&elem->st, cur); |
| if (err) |
| goto err; |
| + elem->st.speculative |= speculative; |
| if (env->stack_size > BPF_COMPLEXITY_LIMIT_STACK) { |
| verbose(env, "BPF program is too complex\n"); |
| goto err; |
| @@ -2983,6 +2986,102 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env, |
| return true; |
| } |
| |
| +static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env) |
| +{ |
| + return &env->insn_aux_data[env->insn_idx]; |
| +} |
| + |
| +static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, |
| + u32 *ptr_limit, u8 opcode, bool off_is_neg) |
| +{ |
| + bool mask_to_left = (opcode == BPF_ADD && off_is_neg) || |
| + (opcode == BPF_SUB && !off_is_neg); |
| + u32 off; |
| + |
| + switch (ptr_reg->type) { |
| + case PTR_TO_STACK: |
| + off = ptr_reg->off + ptr_reg->var_off.value; |
| + if (mask_to_left) |
| + *ptr_limit = MAX_BPF_STACK + off; |
| + else |
| + *ptr_limit = -off; |
| + return 0; |
| + case PTR_TO_MAP_VALUE: |
| + if (mask_to_left) { |
| + *ptr_limit = ptr_reg->umax_value + ptr_reg->off; |
| + } else { |
| + off = ptr_reg->smin_value + ptr_reg->off; |
| + *ptr_limit = ptr_reg->map_ptr->value_size - off; |
| + } |
| + return 0; |
| + default: |
| + return -EINVAL; |
| + } |
| +} |
| + |
| +static int sanitize_ptr_alu(struct bpf_verifier_env *env, |
| + struct bpf_insn *insn, |
| + const struct bpf_reg_state *ptr_reg, |
| + struct bpf_reg_state *dst_reg, |
| + bool off_is_neg) |
| +{ |
| + struct bpf_verifier_state *vstate = env->cur_state; |
| + struct bpf_insn_aux_data *aux = cur_aux(env); |
| + bool ptr_is_dst_reg = ptr_reg == dst_reg; |
| + u8 opcode = BPF_OP(insn->code); |
| + u32 alu_state, alu_limit; |
| + struct bpf_reg_state tmp; |
| + bool ret; |
| + |
| + if (env->allow_ptr_leaks || BPF_SRC(insn->code) == BPF_K) |
| + return 0; |
| + |
| + /* We already marked aux for masking from non-speculative |
| + * paths, thus we got here in the first place. We only care |
| + * to explore bad access from here. |
| + */ |
| + if (vstate->speculative) |
| + goto do_sim; |
| + |
| + alu_state = off_is_neg ? BPF_ALU_NEG_VALUE : 0; |
| + alu_state |= ptr_is_dst_reg ? |
| + BPF_ALU_SANITIZE_SRC : BPF_ALU_SANITIZE_DST; |
| + |
| + if (retrieve_ptr_limit(ptr_reg, &alu_limit, opcode, off_is_neg)) |
| + return 0; |
| + |
| + /* If we arrived here from different branches with different |
| + * limits to sanitize, then this won't work. |
| + */ |
| + if (aux->alu_state && |
| + (aux->alu_state != alu_state || |
| + aux->alu_limit != alu_limit)) |
| + return -EACCES; |
| + |
| + /* Corresponding fixup done in fixup_bpf_calls(). */ |
| + aux->alu_state = alu_state; |
| + aux->alu_limit = alu_limit; |
| + |
| +do_sim: |
| + /* Simulate and find potential out-of-bounds access under |
| + * speculative execution from truncation as a result of |
| + * masking when off was not within expected range. If off |
| + * sits in dst, then we temporarily need to move ptr there |
| + * to simulate dst (== 0) +/-= ptr. Needed, for example, |
| + * for cases where we use K-based arithmetic in one direction |
| + * and truncated reg-based in the other in order to explore |
| + * bad access. |
| + */ |
| + if (!ptr_is_dst_reg) { |
| + tmp = *dst_reg; |
| + *dst_reg = *ptr_reg; |
| + } |
| + ret = push_stack(env, env->insn_idx + 1, env->insn_idx, true); |
| + if (!ptr_is_dst_reg) |
| + *dst_reg = tmp; |
| + return !ret ? -EFAULT : 0; |
| +} |
| + |
| /* Handles arithmetic on a pointer and a scalar: computes new min/max and var_off. |
| * Caller should also handle BPF_MOV case separately. |
| * If we return -EACCES, caller may want to try again treating pointer as a |
| @@ -3003,6 +3102,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, |
| umin_ptr = ptr_reg->umin_value, umax_ptr = ptr_reg->umax_value; |
| u32 dst = insn->dst_reg, src = insn->src_reg; |
| u8 opcode = BPF_OP(insn->code); |
| + int ret; |
| |
| dst_reg = ®s[dst]; |
| |
| @@ -3058,6 +3158,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, |
| |
| switch (opcode) { |
| case BPF_ADD: |
| + ret = sanitize_ptr_alu(env, insn, ptr_reg, dst_reg, smin_val < 0); |
| + if (ret < 0) { |
| + verbose(env, "R%d tried to add from different maps or paths\n", dst); |
| + return ret; |
| + } |
| /* We can take a fixed offset as long as it doesn't overflow |
| * the s32 'off' field |
| */ |
| @@ -3108,6 +3213,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, |
| } |
| break; |
| case BPF_SUB: |
| + ret = sanitize_ptr_alu(env, insn, ptr_reg, dst_reg, smin_val < 0); |
| + if (ret < 0) { |
| + verbose(env, "R%d tried to sub from different maps or paths\n", dst); |
| + return ret; |
| + } |
| if (dst_reg == off_reg) { |
| /* scalar -= pointer. Creates an unknown scalar */ |
| verbose(env, "R%d tried to subtract pointer from scalar\n", |
| @@ -4290,7 +4400,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, |
| } |
| } |
| |
| - other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx); |
| + other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx, |
| + false); |
| if (!other_branch) |
| return -EFAULT; |
| other_branch_regs = other_branch->frame[other_branch->curframe]->regs; |
| @@ -5031,6 +5142,12 @@ static bool states_equal(struct bpf_verifier_env *env, |
| if (old->curframe != cur->curframe) |
| return false; |
| |
| + /* Verification state from speculative execution simulation |
| + * must never prune a non-speculative execution one. |
| + */ |
| + if (old->speculative && !cur->speculative) |
| + return false; |
| + |
| /* for states to be equal callsites have to be the same |
| * and all frame states need to be equivalent |
| */ |
| @@ -5228,6 +5345,7 @@ static int do_check(struct bpf_verifier_env *env) |
| if (!state) |
| return -ENOMEM; |
| state->curframe = 0; |
| + state->speculative = false; |
| state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL); |
| if (!state->frame[0]) { |
| kfree(state); |
| @@ -5267,8 +5385,10 @@ static int do_check(struct bpf_verifier_env *env) |
| /* found equivalent state, can prune the search */ |
| if (env->log.level) { |
| if (do_print_state) |
| - verbose(env, "\nfrom %d to %d: safe\n", |
| - env->prev_insn_idx, env->insn_idx); |
| + verbose(env, "\nfrom %d to %d%s: safe\n", |
| + env->prev_insn_idx, env->insn_idx, |
| + env->cur_state->speculative ? |
| + " (speculative execution)" : ""); |
| else |
| verbose(env, "%d: safe\n", env->insn_idx); |
| } |
| @@ -5285,8 +5405,10 @@ static int do_check(struct bpf_verifier_env *env) |
| if (env->log.level > 1) |
| verbose(env, "%d:", env->insn_idx); |
| else |
| - verbose(env, "\nfrom %d to %d:", |
| - env->prev_insn_idx, env->insn_idx); |
| + verbose(env, "\nfrom %d to %d%s:", |
| + env->prev_insn_idx, env->insn_idx, |
| + env->cur_state->speculative ? |
| + " (speculative execution)" : ""); |
| print_verifier_state(env, state->frame[state->curframe]); |
| do_print_state = false; |
| } |
| @@ -6261,6 +6383,57 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) |
| continue; |
| } |
| |
| + if (insn->code == (BPF_ALU64 | BPF_ADD | BPF_X) || |
| + insn->code == (BPF_ALU64 | BPF_SUB | BPF_X)) { |
| + const u8 code_add = BPF_ALU64 | BPF_ADD | BPF_X; |
| + const u8 code_sub = BPF_ALU64 | BPF_SUB | BPF_X; |
| + struct bpf_insn insn_buf[16]; |
| + struct bpf_insn *patch = &insn_buf[0]; |
| + bool issrc, isneg; |
| + u32 off_reg; |
| + |
| + aux = &env->insn_aux_data[i + delta]; |
| + if (!aux->alu_state) |
| + continue; |
| + |
| + isneg = aux->alu_state & BPF_ALU_NEG_VALUE; |
| + issrc = (aux->alu_state & BPF_ALU_SANITIZE) == |
| + BPF_ALU_SANITIZE_SRC; |
| + |
| + off_reg = issrc ? insn->src_reg : insn->dst_reg; |
| + if (isneg) |
| + *patch++ = BPF_ALU64_IMM(BPF_MUL, off_reg, -1); |
| + *patch++ = BPF_MOV32_IMM(BPF_REG_AX, aux->alu_limit - 1); |
| + *patch++ = BPF_ALU64_REG(BPF_SUB, BPF_REG_AX, off_reg); |
| + *patch++ = BPF_ALU64_REG(BPF_OR, BPF_REG_AX, off_reg); |
| + *patch++ = BPF_ALU64_IMM(BPF_NEG, BPF_REG_AX, 0); |
| + *patch++ = BPF_ALU64_IMM(BPF_ARSH, BPF_REG_AX, 63); |
| + if (issrc) { |
| + *patch++ = BPF_ALU64_REG(BPF_AND, BPF_REG_AX, |
| + off_reg); |
| + insn->src_reg = BPF_REG_AX; |
| + } else { |
| + *patch++ = BPF_ALU64_REG(BPF_AND, off_reg, |
| + BPF_REG_AX); |
| + } |
| + if (isneg) |
| + insn->code = insn->code == code_add ? |
| + code_sub : code_add; |
| + *patch++ = *insn; |
| + if (issrc && isneg) |
| + *patch++ = BPF_ALU64_IMM(BPF_MUL, off_reg, -1); |
| + cnt = patch - insn_buf; |
| + |
| + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); |
| + if (!new_prog) |
| + return -ENOMEM; |
| + |
| + delta += cnt - 1; |
| + env->prog = prog = new_prog; |
| + insn = new_prog->insnsi + i + delta; |
| + continue; |
| + } |
| + |
| if (insn->code != (BPF_JMP | BPF_CALL)) |
| continue; |
| if (insn->src_reg == BPF_PSEUDO_CALL) |
| -- |
| 2.19.1 |
| |