Hi Zhiyao,

Thank you for including code. I haven't done so just yet, but will replicate this soon to both validate and try to debug/explain in more detail. Meanwhile, there are a few things going on here, I think.

First, to confirm some general numbers. At 64MHz (on the NRF52840DK), an almost no-op system call (a context switch back and forth that reads the current RTC counter) takes about 42uS (10,000 in ~0.42 seconds), so about 21uS each way, or roughly 1344 instructions at 64MHz. That's about 10x off of what you're seeing _if_ no work is being done between those context switches, though of course there is. So on average, there is something like 12k instructions left to explain, which is indeed a lot.

Second, it's worth noting that IPC in Tock is so unoptimized it's not even funny. I added IPC in 2016, originally in service of a paper deadline, when there was no real use case for IPC for any Tock users, but as a proof of concept that, yes, this kind of interaction could also be supported. Since then it's hardly been touched. Recently there are some actual use cases that use and/or require IPC, so it's a good time to design a half-way decent system...

Finally, IPC in Tock is quite different from FreeRTOS task notifications (as I understand them anyway), and as a result does quite a bit more. FreeRTOS task notifications directly pass control from the sender to the receiver rather than queue a notification to be handled asynchronously, which is what Tock IPC does. So there are a bunch of subsystems that get traversed in Tock (process scheduler, systick reset, hardware event loop, MPU reconfiguration, etc) that shouldn't need to be traversed in FreeRTOS. MPU reconfiguration is expensive (though not _that_ expensive), but should only be happening twice here. Because the IPC is asynchronous, other things _could_ be happening between yielding in the client and receiving the notification in the server (or vice versa), such as handling hardware interrupts, but that seems unlikely.
In summary, while I don't have an off-the-cuff explanation for the high latency, I'm not _that_ surprised that IPC is very slow. To be continued...

-Amit

"zhiyao.ma.98--- via Devel" <devel@lists.tockos.org> writes:
Hello Tock developers,
I have recently been testing Tock's context switch speed with an STM32F412G-Discovery board, and I noticed that the latency is longer than expected. I hope to confirm that my settings are correct and gain some insight into the data.
To test the context switch speed, I implemented two apps called "server" and "client." The server sets up an IPC service that simply notifies completion back to the client. The client invokes the IPC service 10,000 times in a loop and then calculates the average ping-pong time.
On my board, the 10,000 ping-pongs take 10,748 ms. That is more than 1 ms for each ping-pong.
I believe the CPU clock is set to 96 MHz by the default kernel configuration for the board. With a similar clock speed, a ping-pong implemented through task notification on FreeRTOS takes around 13 μs, which is almost two orders of magnitude faster than Tock.
Professor Campbell pointed out to me that there are multiple context switches for each ping-pong:
On the "client" side:
- calls command() -> context switch to the kernel
- command returns -> context switch back to the client app
- client app calls yield -> context switch back to the kernel

Next, on the “server” side:
- upcall scheduled -> context switch to the server app
- calls ipc_notify_client -> context switch to the kernel to schedule the upcall
- command returns -> context switch back to the server app
- yield -> context switch to the kernel

Finally, on the “client” side again:
- upcall scheduled -> context switch to the client app
However, even if there are 8 context switches per ping-pong, on average each context switch still takes around 135 μs, which is around 13k instructions executed per context switch on a 96 MHz CPU. The number still seems to be high.
Below is the source code for the “server” and “client” app.
Server:
```c
#include <libtock/kernel/ipc.h>
#include <libtock/tock.h>

static void ipc_callback(int pid,
                         __attribute__ ((unused)) int len,
                         __attribute__ ((unused)) int buf,
                         __attribute__ ((unused)) void* ud) {
  ipc_notify_client(pid);
}

int main(void) {
  ipc_register_service_callback("server", ipc_callback, NULL);

  while (1) {
    yield();
  }
}
```
Client:
```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>

#include <libtock-sync/services/alarm.h>
#include <libtock/kernel/ipc.h>
#include <libtock/tock.h>

size_t service_id = -1;

char buf[64] __attribute__((aligned(64)));

bool done = false;

// Runs when the server notifies the client back.
static void ipc_callback(__attribute__ ((unused)) int pid,
                         __attribute__ ((unused)) int len,
                         __attribute__ ((unused)) int arg2,
                         __attribute__ ((unused)) void* ud) {
  done = true;
}

static uint32_t get_time_ms(void) {
  struct timeval tv;
  libtock_alarm_gettimeasticks(&tv, NULL);
  return (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
}

int main(void) {
  int ret;

  // Retry discovery until the server is found.
  ret = ipc_discover("server", &service_id);
  while (ret != RETURNCODE_SUCCESS) {
    printf("No server, retry in 1 second.\n");
    libtocksync_alarm_delay_ms(1000);
    ret = ipc_discover("server", &service_id);
  }

  printf("Server discovered.\n");

  ipc_register_client_callback(service_id, ipc_callback, NULL);
  ipc_share(service_id, buf, 64);

  uint32_t start_time = get_time_ms();

  // One ping-pong: notify the server, then wait for its reply.
  for (int i = 0; i < 10000; ++i) {
    done = false;
    ipc_notify_service(service_id);
    yield_for(&done);
  }

  uint32_t end_time = get_time_ms();

  printf("Ping-pong 10000 times take %lu ms.\n",
         (unsigned long)(end_time - start_time));

  return 0;
}
```
_______________________________________________
Devel mailing list -- devel@lists.tockos.org
To unsubscribe send an email to devel-leave@lists.tockos.org