[Tock Devel] Re: Looking for Insights on Tock's Context Switch Latency

Oct. 17, 2024

Hi Zhiyao,

"zhiyao.ma.98--- via Devel" <devel@lists.tockos.org> writes:
> ...
> Update to include some additional observation from my end: Compiler
> optimization significantly affects the speed. Currently Tock's kernel
> uses "z" which is the slowest.
> On STM32F412G Discovery board @ 96 MHz, performing 10,000
> ping-pong. When changing kernel compilation to use different
> optimization flags, the measured time and code size are shown below:

Thanks for posting this, this is really quite interesting. I did not
expect the optimization level to have such a drastic effect on overall
performance. Especially on the simpler microcontroller platforms,
conventional wisdom would suggest that fewer & smaller instructions
should correlate with better performance. I'd guess that the compiler
may choose more efficient algorithms for certain primitives and perform
more aggressive inlining at higher optimization levels, which may serve
as a gateway to enabling other optimizations. I'm curious whether we
could pinpoint these performance gains to a few subroutines.

I did a similar exploration a while back for Tock on RISC-V, which I've
posted to the old mailing list:
https://groups.google.com/g/tock-dev/c/FPTmNe4BAq0

This highlighted the `memcpy` intrinsic as being particularly
inefficient at the `-Oz` optimization level. Supposedly this has been
fixed in upstream Rust / LLVM for a couple of months or years now, but
it might be good to verify that.
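
To illustrate why a size-optimized `memcpy` can cost so many cycles,
here is a minimal sketch. It assumes the inefficiency comes from a
byte-at-a-time copy loop, which this message does not state. The
function names `copy_bytewise` and `copy_wordwise` are hypothetical and
not taken from Rust's or LLVM's actual implementation. Each function
returns the number of loop iterations it executed as a rough stand-in
for load/store pairs.

```rust
/// Hypothetical model of a size-optimized copy: one byte per iteration.
/// Returns the number of loop iterations executed.
fn copy_bytewise(dst: &mut [u8], src: &[u8]) -> usize {
    assert_eq!(dst.len(), src.len());
    let mut iterations = 0;
    for (d, s) in dst.iter_mut().zip(src.iter()) {
        *d = *s;
        iterations += 1;
    }
    iterations
}

/// Hypothetical model of a speed-optimized copy: four bytes per
/// iteration (standing in for one 32-bit load/store), followed by a
/// bytewise copy of the remaining tail.
/// Returns the number of loop iterations executed.
fn copy_wordwise(dst: &mut [u8], src: &[u8]) -> usize {
    assert_eq!(dst.len(), src.len());
    let mut iterations = 0;
    let mut d_chunks = dst.chunks_exact_mut(4);
    let mut s_chunks = src.chunks_exact(4);
    // Copy all complete 4-byte chunks.
    for (d, s) in (&mut d_chunks).zip(&mut s_chunks) {
        d.copy_from_slice(s);
        iterations += 1;
    }
    // Copy the tail (fewer than 4 bytes) one byte at a time.
    for (d, s) in d_chunks.into_remainder().iter_mut().zip(s_chunks.remainder()) {
        *d = *s;
        iterations += 1;
    }
    iterations
}
```

Under this model, copying a 64-byte buffer takes 64 iterations bytewise
but only 16 wordwise, so a smaller loop body can still execute far more
instructions in total.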

For RISC-V we can use the LiteX Sim target (compiling its HDL to a
Verilated simulation) to generate an instruction trace with
cycle-accurate information on the instructions executed by the CPU. I
have a set of patches for this which I can dig up if you're curious. I
don't know whether we have an equivalent for ARM Cortex-M (yet). I'm
CCing @Alex, who once had Tock running on Renode (emulating an STM ARM
chip), which may also be able to generate such a trace.

-Leon

Leon Schuermann