Looking for Insights on Tock's Context Switch Latency
Hello Tock developers,

I have recently been testing Tock's context switch speed on an STM32F412G-Discovery board, and I noticed that the latency is longer than expected. I hope to confirm that my settings are correct and to gain some insight into the data.

To test the context switch speed, I implemented two apps called "server" and "client." The server sets up an IPC service that simply notifies completion back to the client. The client invokes the IPC service 10,000 times in a loop and then calculates the average ping-pong time.

On my board, the 10,000 ping-pongs take 10,748 ms. That is more than 1 ms per ping-pong.

I believe the CPU clock is set to 96 MHz by the default kernel configuration for the board. At a similar clock speed, a ping-pong implemented with task notifications on FreeRTOS takes around 13 μs, which is almost two orders of magnitude faster than Tock.

Professor Campbell pointed out to me that there are multiple context switches for each ping-pong:

On the "client" side:
- calls command() -> context switch to the kernel
- command returns -> context switch back to the client app
- client app calls yield -> context switch back to the kernel

Next, on the "server" side:
- upcall scheduled -> context switch to the server app
- calls ipc_notify_client -> context switch to the kernel to schedule the upcall
- command returns -> context switch back to the server app
- yield -> context switch to the kernel

Finally, on the "client" side again:
- upcall scheduled -> context switch to the client app

However, even assuming 8 context switches per ping-pong, each context switch still takes around 135 μs on average, which is roughly 13k instructions per context switch on a 96 MHz CPU. That number still seems high.

Below is the source code for the "server" and "client" apps.

Server:

```c
#include <libtock/kernel/ipc.h>
#include <libtock/tock.h>

static void ipc_callback(int pid,
                         __attribute__((unused)) int len,
                         __attribute__((unused)) int buf,
                         __attribute__((unused)) void* ud) {
  ipc_notify_client(pid);
}

int main(void) {
  ipc_register_service_callback("server", ipc_callback, NULL);

  while (1) {
    yield();
  }
}
```

Client:

```c
#include <stdio.h>

#include <libtock-sync/services/alarm.h>
#include <libtock/kernel/ipc.h>

size_t service_id = -1;
char buf[64] __attribute__((aligned(64)));
bool done = false;

static void ipc_callback(__attribute__((unused)) int pid,
                         __attribute__((unused)) int len,
                         __attribute__((unused)) int arg2,
                         __attribute__((unused)) void* ud) {
  done = true;
}

static uint32_t get_time_ms(void) {
  struct timeval tv;
  libtock_alarm_gettimeasticks(&tv, NULL);
  return (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
}

int main(void) {
  // Retry discovery until the server app has registered its IPC service.
  while (ipc_discover("server", &service_id) != RETURNCODE_SUCCESS) {
    printf("No server, retry in 1 second.\n");
    libtocksync_alarm_delay_ms(1000);
  }
  printf("Server discovered.\n");

  ipc_register_client_callback(service_id, ipc_callback, NULL);
  ipc_share(service_id, buf, 64);

  uint32_t start_time = get_time_ms();
  for (int i = 0; i < 10000; ++i) {
    done = false;
    ipc_notify_service(service_id);
    yield_for(&done);
  }
  uint32_t end_time = get_time_ms();

  printf("Ping-pong 10000 times take %lu ms.\n", end_time - start_time);

  return 0;
}
```
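For context, the FreeRTOS number above comes from a direct task-notification ping-pong. This is not the exact code I ran, but the shape of that benchmark is roughly the following sketch (task names, priorities, stack sizes, and printf retargeting are illustrative; the server task runs at a higher priority than the client so that each notification hands control over immediately):

```c
#include <stdio.h>

#include "FreeRTOS.h"
#include "task.h"

static TaskHandle_t server_handle;
static TaskHandle_t client_handle;

static void server_task(void *arg) {
  (void) arg;
  for (;;) {
    // Wait for a "ping" from the client, then immediately "pong" back.
    ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
    xTaskNotifyGive(client_handle);
  }
}

static void client_task(void *arg) {
  (void) arg;
  TickType_t start = xTaskGetTickCount();
  for (int i = 0; i < 10000; ++i) {
    xTaskNotifyGive(server_handle);            // ping
    ulTaskNotifyTake(pdTRUE, portMAX_DELAY);   // wait for pong
  }
  TickType_t elapsed = xTaskGetTickCount() - start;
  // With a 1 kHz tick, elapsed is roughly the total time in milliseconds.
  printf("10000 ping-pongs took %lu ticks\n", (unsigned long) elapsed);
  vTaskDelete(NULL);
}

int main(void) {
  xTaskCreate(server_task, "server", configMINIMAL_STACK_SIZE, NULL, 2, &server_handle);
  xTaskCreate(client_task, "client", configMINIMAL_STACK_SIZE, NULL, 1, &client_handle);
  vTaskStartScheduler();
  for (;;) {}
}
```

With these priorities, the client's notify immediately hands control to the higher-priority server, and the server's notify wakes the client, which runs as soon as the server blocks again, so each ping-pong is essentially two direct task switches.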
Hi Zhiyao,

Thank you for including code. I haven't done so just yet, but I will replicate this soon to both validate and try to debug/explain in more detail. Meanwhile, there are a few things going on here, I think.

First, to confirm some general numbers. At 64 MHz (on the NRF52840DK), an almost no-op system call (a context switch back and forth that reads the current RTC counter) takes about 42 uS (10,000 in ~2.1 seconds), so about 21 uS each way, or 1680 instructions at 64 MHz. That's about 13x off of what you're seeing _if_ no work is being done between those context switches, though of course there is. So on average, there are something like 12k instructions left to explain, which is indeed a lot.

Second, it's worth noting that IPC in Tock is so unoptimized it's not even funny. I added IPC in 2016, originally in service of a paper deadline, when there was no real use case for IPC for any Tock users, but as a proof of concept that, yes, this kind of interaction could also be supported. Since then it's hardly been touched. Recently there are some actual use cases that use and/or require IPC, so it's a good time to design a half-way decent system...

Finally, IPC in Tock is quite different from FreeRTOS task notifications (as I understand them, anyway), and as a result does quite a bit more. FreeRTOS task notifications directly pass control from the sender to the receiver, rather than queue a notification to be handled asynchronously, which is what Tock IPC does. So there are a bunch of subsystems that get traversed in Tock (process scheduler, systick reset, hardware event loop, MPU reconfiguration, etc.) that shouldn't need to be traversed in FreeRTOS. MPU reconfiguration is relatively expensive (though not _that_ expensive), but should only be happening twice here. Because the IPC is asynchronous, other things _could_ be happening between yielding in the client and receiving the notification in the server (or vice versa), such as handling hardware interrupts, but that seems unlikely.

In summary, while I don't have an off-the-cuff explanation for the high latency, I'm not _that_ surprised that IPC is very slow. To be continued...

-Amit

"zhiyao.ma.98--- via Devel" <devel@lists.tockos.org> writes:
Hello Tock developers,
I have recently been testing Tock's context switch speed on an STM32F412G-Discovery board, and I noticed that the latency is longer than expected. I hope to confirm that my settings are correct and to gain some insight into the data.
To test the context switch speed, I implemented two apps called "server" and "client." The server sets up an IPC service that simply notifies completion back to the client. The client invokes the IPC service 10,000 times in a loop and then calculates the average ping-pong time.
On my board, the 10,000 ping-pongs take 10,748 ms. That is more than 1 ms per ping-pong.
I believe the CPU clock is set to 96 MHz by the default kernel configuration for the board. At a similar clock speed, a ping-pong implemented with task notifications on FreeRTOS takes around 13 μs, which is almost two orders of magnitude faster than Tock.
Professor Campbell pointed out to me that there are multiple context switches for each ping-pong:
On the "client" side: - calls command() -> context switch to the kernel - command returns -> context switch back to the client app - client app calls yield -> context switch back to the kernel
Next, on the “server” side: - upcall scheduled -> context switch to the server app - calls ipc_notify_client -> context switch to the kernel to schedule the upcall - command returns -> context switch back to the server app - yield -> context switch to the kernel
Finally, on the “client” side again: - upcall scheduled -> context switch to the client app
However, even assuming 8 context switches per ping-pong, each context switch still takes around 135 μs on average, which is roughly 13k instructions per context switch on a 96 MHz CPU. That number still seems high.
Below is the source code for the "server" and "client" apps.
Server:
```c
#include <libtock/kernel/ipc.h>
#include <libtock/tock.h>

static void ipc_callback(int pid,
                         __attribute__((unused)) int len,
                         __attribute__((unused)) int buf,
                         __attribute__((unused)) void* ud) {
  ipc_notify_client(pid);
}

int main(void) {
  ipc_register_service_callback("server", ipc_callback, NULL);

  while (1) {
    yield();
  }
}
```
Client:
```c
#include <stdio.h>

#include <libtock-sync/services/alarm.h>
#include <libtock/kernel/ipc.h>

size_t service_id = -1;
char buf[64] __attribute__((aligned(64)));
bool done = false;

static void ipc_callback(__attribute__((unused)) int pid,
                         __attribute__((unused)) int len,
                         __attribute__((unused)) int arg2,
                         __attribute__((unused)) void* ud) {
  done = true;
}

static uint32_t get_time_ms(void) {
  struct timeval tv;
  libtock_alarm_gettimeasticks(&tv, NULL);
  return (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
}

int main(void) {
  // Retry discovery until the server app has registered its IPC service.
  while (ipc_discover("server", &service_id) != RETURNCODE_SUCCESS) {
    printf("No server, retry in 1 second.\n");
    libtocksync_alarm_delay_ms(1000);
  }
  printf("Server discovered.\n");

  ipc_register_client_callback(service_id, ipc_callback, NULL);
  ipc_share(service_id, buf, 64);

  uint32_t start_time = get_time_ms();
  for (int i = 0; i < 10000; ++i) {
    done = false;
    ipc_notify_service(service_id);
    yield_for(&done);
  }
  uint32_t end_time = get_time_ms();

  printf("Ping-pong 10000 times take %lu ms.\n", end_time - start_time);

  return 0;
}
```
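P.S. If you want to separate the raw user/kernel round-trip cost from the IPC machinery on your board, one low-effort option is to time a near-no-op syscall in a tight loop from userspace. A rough sketch, reusing your get_time_ms() helper and assuming yield_no_wait() returns immediately when no upcall is pending (run it as the only app so the scheduler has nothing else to switch to):

```c
#include <stdio.h>

#include <libtock-sync/services/alarm.h>
#include <libtock/tock.h>

static uint32_t get_time_ms(void) {
  struct timeval tv;
  libtock_alarm_gettimeasticks(&tv, NULL);
  return (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
}

int main(void) {
  uint32_t start = get_time_ms();

  for (int i = 0; i < 10000; ++i) {
    // No upcall is ever pending here, so each call is (roughly) one trip
    // into the kernel and straight back, plus whatever bookkeeping the
    // kernel does around every entry.
    yield_no_wait();
  }

  uint32_t end = get_time_ms();
  printf("10000 near-no-op round trips took %lu ms.\n",
         (unsigned long) (end - start));
  return 0;
}
```

Comparing that against the per-switch estimate from the IPC benchmark should tell you how much of the 135 uS is the bare switch itself versus the IPC and scheduling work on top of it.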
Hi Professor Levy,

Thank you so much for looking into the problem. Please let me know if there is anything I can help with on my side.

Best regards,
Zhiyao
Hello Zhiyao,

Unfortunately, I also do not have a great explanation for you. At 64 MHz (on the NRF52840DK), your benchmark takes 4846 ms.

I would like to point out a few things. Your context switch diagram underestimates the number of context switches. The sequence you present is not quite right: if a syscall does not immediately error (which IPC does not in this case), the kernel triggers a context switch (ContextSwitchReason::SyscallFired). To be more precise, the kernel's process loop marks that a syscall has fired and asks the scheduler whether it should handle the syscall immediately. The default round robin scheduler will always choose to switch to the other process.

To revise your diagram, on the "client" side:
- calls command() -> context switch to the kernel
- (NEW) kernel defers processing the command() -> kernel round robin scheduler context switches to server's main()
- (NEW) server app main loop immediately yields -> server context switches to kernel
- kernel processes the client's command -> kernel context switches to the client app
- client immediately runs yield_for -> context switch to kernel

I empirically verified that the `while (1) { yield(); }` loop runs 10,000 times in the server.

It is possible there is more overhead from the server calling `ipc_notify_client(pid);`, but I did not trace further into your timeline. I think the best way to get more information about this would be to do some debugging inside the kernel scheduler (`round_robin.rs`).

Let me know if I got anything wrong, Professor Levy.

Samir
Fingerprint: DE65 F61B 7AD6 69C8 3972 1530 A81C A0CA 8957 AC94

On 2024-10-11 18:23, zhiyao.ma.98--- via Devel <devel@lists.tockos.org> wrote:
Hi Professor Levy,
Thank you so much for looking into the problem. Please let me know if there is anything I can help with on my side.
Best regards,
Zhiyao
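P.S. As a userspace-only sanity check (instead of instrumenting the kernel), the server app itself can count how often it is woken up. A small sketch that adds a counter to the server from the original post:

```c
#include <stdint.h>
#include <stdio.h>

#include <libtock/kernel/ipc.h>
#include <libtock/tock.h>

static void ipc_callback(int pid,
                         __attribute__((unused)) int len,
                         __attribute__((unused)) int buf,
                         __attribute__((unused)) void* ud) {
  ipc_notify_client(pid);
}

int main(void) {
  uint32_t wakeups = 0;

  ipc_register_service_callback("server", ipc_callback, NULL);

  while (1) {
    yield();
    // Each return from yield() means at least one upcall was delivered,
    // so this should advance by roughly 10,000 per client run.
    wakeups++;
    if (wakeups % 10000 == 0) {
      printf("server woken up %lu times\n", (unsigned long) wakeups);
    }
  }
}
```

The printf goes through the console syscalls, but only once per 10,000 iterations, so it should not noticeably skew the measurement.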
Hello Zhiyao and Amit,

Sorry for the misinformation; please ignore my previous message. The kernel indeed returns immediately to the process after a command().

I measured that, from board startup, the kernel regains control 100,014 times for 10,000 IPC ping-pongs (measurement methodology: I counted how many times do_process reached its match statement). Doesn't this mean that the user/kernel barrier is crossed 20 times per IPC loop iteration (10 round trips)?

@Amit: What is the benchmarking story for Tock? There are no code speed benchmarks in the Tock repo. I found tock/TockBenchmarking, but it never got upstreamed to Tock. I will be benchmarking Tock this quarter as part of the same class as Gabe (Pat's student, who made the aforementioned repo). It will be interesting to compare my results to his. What should I do if I want to get some benchmarks upstreamed? I can wrap them with an Action like the code size check (https://github.com/tock/tock/blob/fb33e5f672411318ce9ca4700d9fb3c4a08babe0/.github/workflows/benchmarks.yml).

Secondly, I am having a hard time explaining the cause of IPC's performance. I am not sure how to go about profiling within the kernel to figure out where all the overhead is going. Any tips?

Thanks,
Samir
Fingerprint: DE65 F61B 7AD6 69C8 3972 1530 A81C A0CA 8957 AC94

On 2024-10-14 07:12, Samir Rashid via Devel <devel@lists.tockos.org> wrote:
> Hello Zhiyao,
>
> Unfortunately, I also do not have a great explanation for you. At 64 MHz (on the NRF52840DK), your benchmark takes 4846 ms.
>
> I would like to point out a few things. Your context switch diagram underestimates the number of context switches. The sequence you present is not quite right: if a syscall does not immediately error (which IPC does not in this case), the kernel triggers a context switch (ContextSwitchReason::SyscallFired). To be more precise, the kernel's process loop marks that a syscall has fired and asks the scheduler whether it should handle the syscall immediately. The default round robin scheduler will always choose to switch to the other process.
>
> To revise your diagram, on the "client" side:
> - calls command() -> context switch to the kernel
> - (NEW) kernel defers processing the command() -> kernel round robin scheduler context switches to server's main()
> - (NEW) server app main loop immediately yields -> server context switches to kernel
> - kernel processes the client's command -> kernel context switches to the client app
> - client immediately runs yield_for -> context switch to kernel
>
> I empirically verified that the `while (1) { yield(); }` loop runs 10,000 times in the server.
>
> It is possible there is more overhead from the server calling `ipc_notify_client(pid);`, but I did not trace further into your timeline. I think the best way to get more information about this would be to do some debugging inside the kernel scheduler (`round_robin.rs`). Let me know if I got anything wrong, Professor Levy.
>
> Samir
> Fingerprint: DE65 F61B 7AD6 69C8 3972 1530 A81C A0CA 8957 AC94
>
> On 2024-10-11 18:23, zhiyao.ma.98--- via Devel <devel@lists.tockos.org> wrote:
> > Hi Professor Levy,
> >
> > Thank you so much for looking into the problem. Please let me know if there is anything I can help with on my side.
> >
> > Best regards,
> > Zhiyao
Update to include some additional observations from my end: compiler optimization level significantly affects the speed. Currently Tock's kernel is built with opt-level "z", which is the slowest of the levels I measured.

On the STM32F412G-Discovery board @ 96 MHz, performing 10,000 ping-pongs, changing the kernel's optimization flag gives the measured times and code sizes shown below:

| Opt level | Time (ms) | text (bytes) | data (bytes) | bss (bytes) |
|-----------|-----------|--------------|--------------|-------------|
| O3        | 6831      | 151042       | 32           | 76320       |
| O2        | 6954      | 135682       | 32           | 76324       |
| O1        | 8528      | 139778       | 32           | 76352       |
| Os        | 8528      | 106498       | 32           | 76324       |
| Oz        | 10748     | 100866       | 32           | 76320       |
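For anyone who wants to reproduce this: the exact mechanism I used to change the flag isn't shown here, but as a sketch, assuming the kernel build picks up the workspace Cargo release profile (where Tock sets opt-level = "z" by default), the edit looks roughly like:

```toml
# Sketch: workspace Cargo.toml release profile for the Tock kernel.
# Change opt-level and rebuild/flash the kernel to reproduce the rows above.
[profile.release]
opt-level = 3   # also tried: 2, 1, "s", "z"
```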
Hi Zhiyao, "zhiyao.ma.98--- via Devel" <devel@lists.tockos.org> writes:
Update to include some additional observations from my end: compiler optimization level significantly affects the speed. Currently Tock's kernel is built with opt-level "z", which is the slowest of the levels I measured.

On the STM32F412G-Discovery board @ 96 MHz, performing 10,000 ping-pongs, changing the kernel's optimization flag gives the measured times and code sizes shown below:
Thanks for posting this; it is really quite interesting. I did not expect there to be such a drastic difference between optimization level and overall performance. Especially on the simpler microcontroller platforms, conventional wisdom would suggest that fewer and smaller instructions should correlate with better performance. I'd guess that the compiler may choose more efficient algorithms for certain primitives and perform more aggressive inlining at higher optimization levels, which may serve as a gateway to enabling other optimizations.

I'm curious whether we could pinpoint these performance gains to a few subroutines. I did a similar exploration a while back for Tock on RISC-V, which I posted to the old mailing list: https://groups.google.com/g/tock-dev/c/FPTmNe4BAq0

That exploration highlighted the `memcpy` intrinsic as being particularly inefficient at the `-Oz` optimization level. Supposedly this has been fixed in upstream Rust / LLVM for a couple of months or years now, but it might be good to verify that.

For RISC-V we can use the LiteX Sim target (compiling its HDL to a Verilated simulation) to generate an instruction trace with cycle-accurate information on the instructions executed by the CPU. I have a set of patches for this which I can dig up if you're curious. I don't know whether we have an equivalent for ARM Cortex-M (yet). I'm Cc'ing @Alex, who had Tock running on Renode (emulating an STM ARM chip) once, which may also be able to generate such a trace.

-Leon
participants (4):
- Amit Levy
- Leon Schuermann
- Samir Rashid
- zhiyao.ma.98@gmail.com