In Arrakis, applications send and receive network packets by directly communicating
with hardware. The data plane interface is therefore implemented in an application
library, allowing it to be codesigned with the application [Marinos et al. 2014]. The
Arrakis library provides two interfaces to applications. We describe the native Arrakis
interface, which departs slightly from the POSIX standard to support true zero-copy
I/O; Arrakis also provides a POSIX compatibility layer that supports unmodified
applications.
Applications send and receive packets on queues, which have previously been assigned
filters as described earlier. While filters can include IP, TCP, and UDP field
predicates, Arrakis does not require the hardware to perform protocol processing, only
multiplexing. In our implementation, Arrakis provides a user-space network stack
above the data plane interface. This stack is designed to minimize latency and maximize
throughput. We maintain a clean separation between three aspects of packet transmission
and reception.
First, packets are transferred asynchronously between the network and main memory
using conventional DMA techniques over rings of packet buffer descriptors.
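To make this concrete, the following C sketch shows a minimal descriptor ring. The structure layout, field names, and ring size are illustrative assumptions; real NICs each define their own descriptor formats.

/* Illustrative sketch of a DMA descriptor ring. Real NICs define
   their own descriptor layouts; all names here are assumptions. */
#include <stdint.h>

#define RING_SIZE 256              /* number of descriptors; power of two */

struct pkt_desc {
    uint64_t buf_addr;             /* DMA address of the packet buffer */
    uint16_t buf_len;              /* length of data in the buffer */
    uint16_t flags;                /* e.g., end-of-packet, done bits */
};

struct desc_ring {
    struct pkt_desc desc[RING_SIZE];
    uint32_t head;                 /* next descriptor the NIC consumes */
    uint32_t tail;                 /* next descriptor software fills */
};

Software produces descriptors at tail while the NIC consumes them at head, so transfers proceed asynchronously without further synchronization between the two sides.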
Second, the application transfers ownership of a transmit packet to the network
hardware by enqueuing a chain of buffers onto the hardware descriptor rings, and
acquires a received packet by the reverse process. This is performed by two VNIC
driver functions. send_packet(queue, packet_array) sends a packet on a queue; the
packet is specified by the scatter-gather array packet_array and must conform to a
filter already associated with the queue. receive_packet(queue) → packet receives
a packet from a queue and returns a pointer to it. Both operations are asynchronous.
packet_done(packet) returns ownership of a received packet to the VNIC.
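The following C sketch summarizes this interface together with a typical receive loop. Only send_packet, receive_packet, and packet_done are named above; the opaque handle types, the scatter-gather entry layout, and the return conventions are assumptions.

#include <stddef.h>

typedef struct vnic_queue vnic_queue_t;  /* opaque queue handle (assumed) */
typedef struct packet     packet_t;      /* opaque packet handle (assumed) */

struct sg_entry {                        /* one scatter-gather element (assumed) */
    void  *base;
    size_t len;
};

/* Enqueue a packet, given as a chain of buffers, on a transmit queue;
   ownership of the buffers passes to the hardware. Asynchronous. */
int send_packet(vnic_queue_t *queue, struct sg_entry *packet_array, size_t n);

/* Dequeue the next received packet, or NULL if none is pending;
   ownership of the packet passes to the application. Asynchronous. */
packet_t *receive_packet(vnic_queue_t *queue);

/* Return ownership of a received packet's buffers to the VNIC. */
void packet_done(packet_t *packet);

/* Typical receive loop: drain the queue, process, return buffers. */
static void drain_queue(vnic_queue_t *rxq)
{
    packet_t *p;
    while ((p = receive_packet(rxq)) != NULL) {
        /* ... process the packet payload ... */
        packet_done(p);
    }
}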
For optimal performance, the Arrakis stack would interact with the hardware queues
not through these calls but directly via compiler-generated, optimized code tailored to
the NIC descriptor format. However, the implementation we report on in this article
uses function calls to the driver.
Third, we handle asynchronous notification of events using doorbells associated with
queues. Doorbells are delivered directly from hardware to user programs via hardware
virtualized interrupts when applications are running and via the control plane
to invoke the scheduler when applications are not running. In the latter case, higher
latency is tolerable. Doorbells are exposed to Arrakis programs via regular event delivery
mechanisms (e.g., a file descriptor event) and are fully integrated with existing
I/O multiplexing interfaces (e.g., select). They are useful both to notify an application
of general availability of packets in receive queues and as a lightweight notification
mechanism for I/O completion and the reception of packets in high-priority queues.
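As an illustration, the sketch below blocks in select() on a queue's doorbell and then drains the receive queue. The doorbell_fd() accessor is a hypothetical helper, not part of the documented interface, and the VNIC types and calls are as assumed in the previous sketch.

#include <stddef.h>
#include <sys/select.h>

typedef struct vnic_queue vnic_queue_t;              /* as above (assumed) */
typedef struct packet     packet_t;

extern int       doorbell_fd(vnic_queue_t *queue);   /* hypothetical helper */
extern packet_t *receive_packet(vnic_queue_t *queue);
extern void      packet_done(packet_t *packet);

static void event_loop(vnic_queue_t *rxq)
{
    int fd = doorbell_fd(rxq);
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* Block until the hardware rings the doorbell for this queue. */
        if (select(fd + 1, &rfds, NULL, NULL, NULL) <= 0)
            continue;
        /* Drain every packet made available since the last doorbell. */
        packet_t *p;
        while ((p = receive_packet(rxq)) != NULL) {
            /* ... process the packet ... */
            packet_done(p);
        }
    }
}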
This design results in a protocol stack that decouples hardware from software as
much as possible, using the descriptor rings as a buffer. It maximizes throughput and
minimizes overhead under high packet rates, yielding low latency. On top of this native
interface, Arrakis provides POSIX-compatible sockets. This compatibility layer allows
Arrakis to support unmodified Linux applications. However, as we show, further
performance gains can be achieved by using the asynchronous native interface directly.