init: define strong semantics
The x86 init path is critical: if anything fails in it, things can
go wrong fast. Since we are defining a small framework for how to run
routines, in what order, with which dependencies, and with what
required work, we need to define very strong semantics to avoid issues.
We start off by defining semantics through documentation, considering
all possible combinations of uses of the init table framework.
While at it, rename init_fn to x86_init_fn to reflect that this is
specific to x86. This means the sorting routine is specific to x86
as well, as are all other calls. This is intentional: we are customizing
the table work for our own needs on x86.
First a clarification of the x86 boot sequence:
Bare metal, KVM, Xen HVM                      Xen PV / dom0
      startup_64()                             startup_xen()
             \                                     /
     x86_64_start_kernel()                 xen_start_kernel()
                          \               /
                     x86_64_start_reservations()
                                  |
                             start_kernel()
                             [   ...        ]
                             [ setup_arch() ]
                             [   ...        ]
                                 init
A few pedantic semantic changes worth highlighting:
* We are meshing the gpxe and IOMMU specialized init solutions.
The gpxe solution works with a linker sort. The IOMMU init
solution has its own C sorter, which not only uses the "depends"
callback for sorting purposes but also memmove()'s routines
around depending on sorting heuristics. This makes sorting, and
the semantics defined, specific to the subsystem init structures.
Two possible paths we need to pick from:
a) We can make a generic library sort routine which takes as input
only the first and last entries of the subsystem specific init
table. Doing this generalizes the sorting algorithm but either
imposes a common shared set of semantics, or requires our sorting
routines to take more input: enough to understand not only the
semantics but also the size of the structure, as memmove() is
used to shuffle things around.
b) We just keep sorting routines specific to each subsystem. In this
x86 mockup we go with this solution.
Since I've opted to go with option b), I modified the sort routines and
the linker script to use subsystem specific table start and end
points.
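As a rough illustration of option b), the sketch below keeps the sort tied
to one subsystem's table, bounded by its own start and end pointers, and
swaps an entry with its dependency whenever the dependency sits later in
the table. struct demo_sort_fn, its members, and all demo_* names are
hypothetical stand-ins, not the actual x86_init_fn definitions, and the
sketch assumes there are no cyclic dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for a subsystem init entry. */
struct demo_sort_fn {
	int id;                 /* only used to observe the final order */
	bool (*detect)(void);   /* identifies this init sequence */
	bool (*depend)(void);   /* detect() of the dependency, or NULL */
};

static bool demo_detect_a(void) { return true; }
static bool demo_detect_b(void) { return true; }
static bool demo_detect_c(void) { return true; }

/* Find the entry that q depends on, or NULL if there is none. */
static struct demo_sort_fn *demo_find_dep(struct demo_sort_fn *start,
					  struct demo_sort_fn *finish,
					  const struct demo_sort_fn *q)
{
	struct demo_sort_fn *p;

	if (!q->depend)
		return NULL;
	for (p = start; p < finish; p++)
		if (p->detect == q->depend)
			return p;
	return NULL;
}

/*
 * Sort [start, finish) so every entry comes after the entry it depends on.
 * Assumes no cyclic dependencies; a cycle would make this loop forever.
 */
static void demo_sort(struct demo_sort_fn *start, struct demo_sort_fn *finish)
{
	struct demo_sort_fn *p, *q, tmp;

	assert(start <= finish);
	for (p = start; p < finish; p++) {
		/* Swap while the dependency of the entry at p sits later. */
		while ((q = demo_find_dep(start, finish, p)) && q > p) {
			tmp = *p;
			*p = *q;
			*q = tmp;
		}
	}
}

/* Example table: "a" depends on "c", which is initially placed last. */
static void demo_fill_table(struct demo_sort_fn tbl[3])
{
	tbl[0] = (struct demo_sort_fn){ 0, demo_detect_a, demo_detect_c };
	tbl[1] = (struct demo_sort_fn){ 1, demo_detect_b, NULL };
	tbl[2] = (struct demo_sort_fn){ 2, demo_detect_c, NULL };
}
```

Because the routine only sees one table and one struct type, it can copy
whole entries by value without being told the entry size, which is the
extra input a generic library routine would need.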
* Clarified the semantic gap caused by current hypervisors between
the x86 Linux entry point and pv_ops, and how the subarchitecture
fills the gap. Refer to init.h for details.
* Clarified order level use:
- Clarified that when two init sequences have the same order level,
the next thing that determines order is placement in the C file,
and after that order in the Makefile. Note that SORT() is still used
by the linker, but the sort applies only to the order level of the
subsystem specific structures, and the function names play no role in
sorting. Refer to custom.lds.S -- *(SORT(.tbl.init_fns.*)) is used.
The name of the struct is not considered for sorting. For instance:
$ readelf -S kasan.o| grep tbl
[ 8] .tbl.init_fns.01 PROGBITS 0000000000000000 000000a0
[ 9] .rela.tbl.init_fn RELA 0000000000000000 00000f00
The sort here is going to be performed on the .tbl.init_fns.* section
names, that is, on the order level attributes.
- The 2 digit order level is used to help sort init routines
by batched levels. Since an init sequence can also depend on
another init sequence, the order level of the subordinate
init sequence must be less than or equal to that of the init
sequence it depends on.
Since the ordering is part of the section name, we can't use it in
C code to validate the subordinate order level unless we use
macros to set the order level and, as part of that, add a
new member to the init sequence. Coccinelle cannot use
C annotations, as they are just not part of the abstract syntax tree
in Coccinelle, so we can't easily get access to them for validation.
However, Coccinelle does understand macro declarations
(e.g. DEFINE_MUTEX(), through declarer name DEFINE_MUTEX;), so another
prospect to enforce proper order level could be to annotate the order
level with respective macro declarations and an SmPL rule for
validation.
Similar enforcement checking at run time can only be accomplished
by pegging the order level as part of the init structure.
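A minimal sketch of such a run time check follows, assuming the order
level is stored as a member of the init structure. It reads the rule above
literally: an init sequence with a dependency must have an order level
less than or equal to that of the sequence it depends on. If the intended
rule is the reverse, only the comparison needs flipping. struct
demo_lvl_fn and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical init entry with the order level pegged as a member. */
struct demo_lvl_fn {
	unsigned int order_level;   /* mirrors the 2 digit section suffix */
	bool (*detect)(void);       /* identifies this init sequence */
	bool (*depend)(void);       /* detect() of the dependency, or NULL */
};

static bool demo_lvl_detect_x(void) { return true; }
static bool demo_lvl_detect_y(void) { return true; }

/*
 * Count entries whose order level is greater than the order level of
 * the init sequence they depend on. Entries without a dependency, or
 * whose dependency is not in the table, are not counted.
 */
static unsigned int demo_count_bad_levels(const struct demo_lvl_fn *start,
					  const struct demo_lvl_fn *finish)
{
	const struct demo_lvl_fn *p, *q;
	unsigned int bad = 0;

	assert(start <= finish);
	for (p = start; p < finish; p++) {
		if (!p->depend)
			continue;
		for (q = start; q < finish; q++) {
			if (q->detect != p->depend)
				continue;
			if (p->order_level > q->order_level)
				bad++;
			break;
		}
	}
	return bad;
}
```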
Changes will be made next to define and use macros to enable declarations
of x86 init sequences. This will enable semantic parsers to be used to
vet sorting semantics on order level, reduce code, and also make
it easier to modify the use of annotations as needed without affecting
users.
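To illustrate the order level sorting described in this item, the sketch
below mimics in plain C what a name-based linker SORT() over
.tbl.init_fns.* amounts to: entries are ordered by section name only, and
entries sharing a section keep their original order, standing in for
placement in the C file and Makefile. The demo_* names, the DEMO_SECTION()
macro, and the in-memory array used in place of real linker sections are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the section name from a 2 digit order level. */
#define DEMO_SECTION(level) ".tbl.init_fns." #level

struct demo_sect_entry {
	const char *section;   /* what the linker would sort on */
	const char *name;      /* struct/function name: ignored when sorting */
};

/*
 * Stable insertion sort by section name only. Entries with the same
 * section keep their input order, standing in for link order.
 */
static void demo_sort_by_section(struct demo_sect_entry *tbl, size_t n)
{
	size_t i, j;

	assert(tbl || n == 0);
	for (i = 1; i < n; i++) {
		struct demo_sect_entry cur = tbl[i];

		for (j = i; j > 0 &&
		     strcmp(tbl[j - 1].section, cur.section) > 0; j--)
			tbl[j] = tbl[j - 1];
		tbl[j] = cur;
	}
}
```

A declaration macro along the lines of DEMO_SECTION() is also the natural
place to store the same level in a struct member, which is what makes a
run time or SmPL based check possible.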
* The IOMMU init framework uses an int return type for the
detect and depends callbacks. This seems error-prone, so I've
changed this to simply be of return type bool.
* Although using an int return type for early_init() and late_init()
might seem appealing, it's not what we use for the Linux x86 init
boot code: all code is critical and cannot fail. As such I've removed
the critical declaration option from the struct x86_init_fn -- to
match Linux. In the future though, this commit could be referenced
for example code if one wanted to add routines which could
optionally fail, and still allow strong semantics for dependencies
between init sequences.
* If an init sequence has X86_SUBARCH_XEN only, it must then mean
only that the early_init() callback would be run on
Xen HVM currently. Note that this would mean that the calls are made
right before x86_64_start_reservations(). As it stands, typically you
would put this sort of routine within the xen_start_kernel() work
flow. This not only enables us to define Xen early init calls but also
ultimately would enable us to fold the two separate x86 entry points
into one if we so desired.
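The subarchitecture filtering described above can be sketched as a bitmask
test before each early_init() call. The enum values below mirror the
hardware_subarch numbers of the Linux x86 boot protocol, but the DEMO_*
and demo_* names, the supported_subarch member, and the walker itself are
hypothetical and not taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror the x86 boot protocol hardware_subarch numbers. */
enum demo_subarch {
	DEMO_SUBARCH_PC = 0,
	DEMO_SUBARCH_LGUEST = 1,
	DEMO_SUBARCH_XEN = 2,
};

#define DEMO_BIT(nr) (UINT32_C(1) << (nr))

/* Hypothetical init entry carrying a bitmask of supported subarchs. */
struct demo_subarch_fn {
	uint32_t supported_subarch;
	void (*early_init)(void);
};

static unsigned int demo_pc_calls, demo_xen_calls;

static void demo_pc_init(void)  { demo_pc_calls++; }
static void demo_xen_init(void) { demo_xen_calls++; }

/*
 * Run early_init() only for entries that support the running subarch.
 * Returns the number of callbacks that were run.
 */
static unsigned int demo_run_early_init(const struct demo_subarch_fn *start,
					const struct demo_subarch_fn *finish,
					enum demo_subarch subarch)
{
	const struct demo_subarch_fn *p;
	unsigned int ran = 0;

	assert(start <= finish);
	for (p = start; p < finish; p++) {
		if (!(p->supported_subarch & DEMO_BIT(subarch)))
			continue;
		p->early_init();
		ran++;
	}
	return ran;
}
```

With a single table walked this way, an entry flagged for one subarch only
is simply skipped everywhere else, which is what would let separate entry
points share one init sequence.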
* Use u32 for x86_init_fn flags and document what they are.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
19 files changed