by Andy Grover
When a computer receives a packet, the NIC copies it into a kernel buffer, and the CPU then copies it from the kernel buffer to its final destination in the receiving process's address space. The same data crosses the memory bus THREE times (one write by the NIC, then a read and a write by the CPU), and the CPU must dumbly read and then write every single byte before the application even sees it.
RDMA (Remote Direct Memory Access) lets processes on different machines transfer data directly into each other's address spaces, skipping the intermediate kernel copy and greatly increasing efficiency. But using RDMA is very hard compared to BSD sockets. This talk will introduce my work on making RDMA usable by mere mortals, from Python!
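
The bus-traffic argument can be made concrete with a little arithmetic. The sketch below is only a back-of-the-envelope model, not a measurement: it assumes CPU caches are ignored, so every copy step touches main memory, and the function names are made up for illustration.

```python
# Rough model of memory-bus traffic for one received payload.
# Assumption: caches are ignored, so each step moves every byte over the bus.

def bus_bytes_conventional(payload_bytes: int) -> int:
    """Sockets-style receive: NIC -> kernel buffer -> process buffer."""
    nic_write = payload_bytes  # NIC writes the packet into a kernel buffer
    cpu_read = payload_bytes   # CPU reads it back out of the kernel buffer
    cpu_write = payload_bytes  # CPU writes it into the process's buffer
    return nic_write + cpu_read + cpu_write


def bus_bytes_direct(payload_bytes: int) -> int:
    """RDMA-style receive: the NIC writes straight into the process's buffer."""
    return payload_bytes


if __name__ == "__main__":
    payload = 1_000_000  # hypothetical 1 MB transfer
    conventional = bus_bytes_conventional(payload)
    direct = bus_bytes_direct(payload)
    print(f"conventional: {conventional} bytes over the memory bus")
    print(f"direct:       {direct} bytes over the memory bus")
    print(f"ratio:        {conventional // direct}x")
```

Under these assumptions the conventional path moves three times as many bytes over the memory bus, and two of those three passes also keep the CPU busy.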