Scaling out NUMA-Aware Applications with RDMA-Based Distributed Shared Memory
Yang Hong;Yang Zheng;Fan Yang;Bin-Yu Zang;Hai-Bing Guan;Hai-Bo Chen
[Journal] Journal of Computer Science and Technology (English Edition) [Year (Volume), Issue] 2019(034)001
[Abstract] The multicore evolution has stimulated renewed interest in scaling up applications on shared-memory multiprocessors,
significantly improving the scalability of many applications. However, this scalability is confined to a single node; programmers therefore still have to redesign applications to scale out over multiple nodes. This paper revisits the design and implementation of distributed shared memory (DSM) as a way to scale out applications optimized for non-uniform memory access (NUMA) architectures over a well-connected cluster. It presents MAGI, an efficient DSM system that provides a transparent shared address space with scalable performance on a cluster with fast network interfaces. MAGI is unique in that it presents a NUMA abstraction to fully harness the multicore resources in each node through hierarchical synchronization and memory management. MAGI also exploits the memory access patterns of big-data applications and leverages a set of optimizations for remote direct memory access (RDMA) to reduce the number of page faults and the cost of the coherence protocol. MAGI has been implemented as a user-