Feature #5751
io_uring support in libosmocore
Status: closed, 100% done
Description
Traditionally our I/O abstraction in libosmocore has been select(). In libosmocore 1.5.0 (2020) we migrated over to poll() to support more than 1024 FDs and to avoid the extreme amount of fd-set memcpy()ing involved in the venerable select interface.
Now of course both select and poll are ancient Unix interfaces for non-blocking I/O, and both come at a high cost for systems under high load.
Specifically, we are getting reports from osmo-bsc users that indicate a busy BSC with 100 BTS (400 TRX) is spending about 40% of its CPU cycles in the kernel-side functions sock_poll, tcp_poll and do_sys_poll.
There are other interfaces such as Linux AIO, POSIX AIO and epoll, but the brightest and shiniest new I/O interface on Linux is io_uring. Unlike any of its predecessors, io_uring can, in the extreme case, operate without any system calls at all. io_uring recognizes that each syscall is associated with a rather high context switch cost.
io_uring consists of memory-mapped (between kernel and userspace process) queues for requests and completions, as well as lockless primitives to enqueue/dequeue from these.
The requests in the queue are of the form "read N bytes from this file descriptor" or "write N bytes to that file descriptor". io_uring supports much more than that (equivalents of many other syscalls), but read/write is the part most relevant to us.
We already have two io_uring users in the Osmocom universe: the GTP and the UDP/RTP load generators I wrote some time ago. They manage their file descriptors internally.
This ticket is about introducing io_uring support into libosmocore itself, in a way that enables all Osmocom programs to use that shared infrastructure.
Conceptual differences
reading from a socket
Conceptually, the existing code typically works like this:
- register some socket file descriptor for read
- libosmocore includes it in the poll-set
- libosmocore calls poll()
- kernel returns from poll, indicating fd is readable
- libosmocore dispatches to the application call-back
- application allocates msgb, reads data from socket
- application processes data in msgb
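The steps above can be sketched in plain C as follows. This is only an illustration of the model, not libosmocore code: the names `read_cb_t`, `app_read_cb`, `poll_once_and_dispatch` and `poll_read_demo` are made up for this example, a plain `char` buffer stands in for the msgb, and a socketpair stands in for a real network socket.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical application call-back type: called once the fd is readable. */
typedef int (*read_cb_t)(int fd, char *buf, size_t buf_len);

/* The application itself performs the read() (buf is a stand-in for a msgb). */
static int app_read_cb(int fd, char *buf, size_t buf_len)
{
	return (int)read(fd, buf, buf_len);
}

/* One iteration of a poll()-based main loop for a single fd.
 * Returns the call-back result, 0 on timeout, -1 on error. */
static int poll_once_and_dispatch(int fd, read_cb_t cb, char *buf,
				  size_t buf_len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc < 0)
		return -1;
	if (rc == 0)
		return 0;
	if (pfd.revents & POLLIN)
		return cb(fd, buf, buf_len);	/* dispatch to the application */
	return -1;
}

/* Small self-check: returns the number of bytes read, or -1 on failure. */
int poll_read_demo(void)
{
	int sv[2], n;
	char buf[16] = { 0 };

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (write(sv[1], "hello", 5) != 5)
		n = -1;
	else
		n = poll_once_and_dispatch(sv[0], app_read_cb, buf, sizeof(buf), 1000);
	close(sv[0]);
	close(sv[1]);
	if (n != 5 || memcmp(buf, "hello", 5) != 0)
		return -1;
	return n;
}
```

Note the two kernel round trips per message: one poll() to learn that the fd is readable, then one read() to fetch the data.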
With io_uring, this model needs to change to something like this:
- application tells us it wants to read from a socket
- libosmocore or application pre-allocate the msgb
- libosmocore uses liburing to add a read request to the io_uring submission queue
- at some point, the kernel signals a completion event to us via io_uring / liburing
- libosmocore dispatches pre-filled msgb to application call-back
- application processes data in msgb
So as we can see, the responsibility for the actual reading moves from the application (or an intermediate library like libosmo-netif / libosmo-sigtran) into libosmocore.
writing to a socket
Conceptually, the existing code typically works like this:
- register some socket file descriptor for write
- libosmocore includes it in the poll-set
- libosmocore calls poll()
- kernel returns from poll, indicating fd is writeable
- libosmocore dispatches to the application call-back
- application writes the data from the msgb to the socket and frees the msgb
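The poll()-based write path can be sketched the same way. Again this is an illustration only: `app_write_cb` and `poll_write_demo` are hypothetical names, and a malloc()ed buffer released with free() stands in for a msgb released with msgb_free().

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical application call-back, called once the fd is writable:
 * the application itself writes the pending data and frees the buffer. */
static int app_write_cb(int fd, char *pending, size_t len)
{
	ssize_t n = write(fd, pending, len);

	free(pending);		/* stand-in for msgb_free() */
	return (int)n;
}

/* Small self-check: returns the number of bytes written, or -1 on failure. */
int poll_write_demo(void)
{
	struct pollfd pfd;
	char rx[16] = { 0 };
	char *pending;
	int sv[2], n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	pending = malloc(5);
	if (!pending)
		goto out;
	memcpy(pending, "hello", 5);

	/* register the fd for write and wait for the kernel to report POLLOUT */
	pfd.fd = sv[0];
	pfd.events = POLLOUT;
	pfd.revents = 0;
	if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLOUT))
		n = app_write_cb(sv[0], pending, 5);	/* dispatch to the application */
	else
		free(pending);

	/* verify that the peer received exactly what was written */
	if (n == 5 && (read(sv[1], rx, sizeof(rx)) != 5 || memcmp(rx, "hello", 5) != 0))
		n = -1;
out:
	close(sv[0]);
	close(sv[1]);
	return n;
}
```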
With io_uring, this model needs to change to something like this:
- application tells us it wants to write to a socket, including the msgb
- libosmocore uses liburing to add a write request to the io_uring submission queue
- at some point, the kernel signals a completion event to us via io_uring / liburing
- libosmocore releases the msgb with msgb_free()
Again, the actual writing moves into libosmocore and out of the scope of the application (or an intermediate library like libosmo-netif / libosmo-sigtran).