The io_uring interface works through two main data structures: the submission queue entry (sqe) and the completion queue entry (cqe). Instances of those structures live in a shared memory single-producer-single-consumer ring buffer between the kernel and the application.
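To make the single-producer-single-consumer idea concrete, here is a minimal sketch of such a ring in C11. It is a simplified model and not the kernel's actual memory layout: the names `spsc_ring`, `entry`, `ring_push`, and `ring_pop` are made up for illustration, and real sqes and cqes carry many more fields. The point is only that one side advances the tail, the other advances the head, and neither needs a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u /* must be a power of two so we can mask indices */

/* Hypothetical entry type; real sqes and cqes are much richer. */
struct entry {
    uint64_t user_data;
    int32_t  value;
};

struct spsc_ring {
    _Atomic uint32_t head; /* advanced only by the consumer */
    _Atomic uint32_t tail; /* advanced only by the producer */
    struct entry slots[RING_ENTRIES];
};

static void ring_init(struct spsc_ring *r)
{
    assert((RING_ENTRIES & (RING_ENTRIES - 1u)) == 0);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: returns false if the ring is full. */
static bool ring_push(struct spsc_ring *r, struct entry e)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_ENTRIES)
        return false;

    r->slots[tail & (RING_ENTRIES - 1u)] = e;
    /* Release store: the slot write above becomes visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_pop(struct spsc_ring *r, struct entry *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return false;

    *out = r->slots[head & (RING_ENTRIES - 1u)];
    /* Release store: hands the slot back to the producer. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}
```

In this model the application would be the producer on a submission ring and the consumer on a completion ring, with the kernel playing the opposite role on each.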
The application asynchronously adds sqes to the queue (potentially many) and then tells the kernel that there is work to do. The kernel does its thing, and when work is ready it posts the results in the cqe ring. This also has the added advantage that system calls are now batched. Remember Meltdown? At the time I wrote about how little it affected our Scylla NoSQL database, since we would batch our I/O system calls through aio. Except now we can batch much more than just the storage I/O system calls, and this power is also available to any application.
The application, whenever it wants to check whether work is ready or not, just looks at the cqe ring buffer and consumes entries if they are ready. There is no need to go to the kernel to consume those entries.
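The flow described above (queue several sqes, enter the kernel once, then reap cqes straight from the ring) can be sketched with a toy, single-threaded model. Everything here is an assumption made for illustration: `toy_uring`, `toy_queue_sqe`, `toy_enter`, and `toy_peek_cqe` are hypothetical names, not the real io_uring or liburing API, and the "kernel" just doubles a number instead of doing I/O. The `kernel_entries` counter stands in for the number of system calls.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_DEPTH 8u /* power of two */
#define QUEUE_MASK  (QUEUE_DEPTH - 1u)

/* Hypothetical, heavily reduced versions of the two entry types. */
struct toy_sqe { uint64_t user_data; int32_t operand; };
struct toy_cqe { uint64_t user_data; int32_t res; };

struct toy_uring {
    struct toy_sqe sq[QUEUE_DEPTH];
    struct toy_cqe cq[QUEUE_DEPTH];
    uint32_t sq_head, sq_tail; /* "kernel" consumes at head, app produces at tail */
    uint32_t cq_head, cq_tail; /* app consumes at head, "kernel" produces at tail */
    unsigned kernel_entries;   /* how many times we pretended to make a system call */
};

/* Application: queue one request. No kernel involvement. */
static int toy_queue_sqe(struct toy_uring *u, uint64_t user_data, int32_t operand)
{
    if (u->sq_tail - u->sq_head == QUEUE_DEPTH)
        return -1; /* submission ring is full */
    u->sq[u->sq_tail++ & QUEUE_MASK] = (struct toy_sqe){ user_data, operand };
    return 0;
}

/* Stand-in for the one call that tells the kernel there is work to do.
 * It drains every pending sqe and posts one cqe per request. */
static unsigned toy_enter(struct toy_uring *u)
{
    unsigned done = 0;

    u->kernel_entries++;
    while (u->sq_head != u->sq_tail &&
           u->cq_tail - u->cq_head < QUEUE_DEPTH) {
        struct toy_sqe s = u->sq[u->sq_head++ & QUEUE_MASK];
        u->cq[u->cq_tail++ & QUEUE_MASK] =
            (struct toy_cqe){ s.user_data, s.operand * 2 };
        done++;
    }
    return done;
}

/* Application: check for a completion by looking at the ring only.
 * Returns 1 and fills *out if a cqe was ready, 0 otherwise. */
static int toy_peek_cqe(struct toy_uring *u, struct toy_cqe *out)
{
    if (u->cq_head == u->cq_tail)
        return 0;
    *out = u->cq[u->cq_head++ & QUEUE_MASK];
    return 1;
}
```

Queuing four requests, calling `toy_enter` once, and then calling `toy_peek_cqe` until it returns 0 leaves `kernel_entries` at 1. That is the batching effect in miniature: four operations for the price of one kernel entry, with completions read directly from memory.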
Here are some of the operations that io_uring supports: read, write, send, recv, accept, openat, stat, and even way more specialized ones like fallocate.
How io_uring and eBPF Will Revolutionize Programming in Linux
from Glauber Costa