Srinivasa, Avinash and Sosonkina, Masha (2011) Non-uniform Memory Affinity Strategy in Multi-Threaded Sparse Matrix Computations. Technical Report 11-07, Computer Science, Iowa State University.
As core counts on modern multiprocessor systems increase, so does
memory contention, because all processes and threads try to access main memory
simultaneously. This is typical of UMA (Uniform Memory Access) architectures,
in which a single physical memory bank leads to
poor scalability of multi-threaded applications.
To alleviate this problem, modern systems
are moving increasingly towards Non-Uniform Memory Access (NUMA)
architectures, in which the physical memory is split into several
(typically two or four) banks. Each memory bank
is associated with a set of cores enabling threads to
operate from their own physical memory banks while
retaining the concept of a shared virtual address space.
However, accesses to shared data structures that reside in remote memory banks
may become increasingly slow.
This paper proposes a way to determine and pin certain parts of the shared
data to specific memory banks, thus minimizing remote accesses.
To achieve this, the existing application code has to be supplied with the proposed interface to set up and distribute the shared data appropriately among memory banks. Experiments with the NAS benchmark as well as with a
realistic large-scale application that computes ab initio nuclear structure have been performed. Speedups of up to 3.5 times were observed with the proposed approach compared with the default memory placement policy.
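
As an illustration of what distributing shared data among memory banks involves, the following minimal sketch splits a shared buffer into page-aligned, contiguous byte ranges, one per bank. It is not the interface proposed in the report: the function name `plan_bank_ranges`, the 4096-byte page size, and the even block distribution are all assumptions made for this example. The sketch only plans the placement; on Linux, each planned range could then be bound to its bank with a facility such as libnuma's `numa_tonode_memory` or the `mbind` system call.

```python
from typing import List, Tuple

PAGE_SIZE = 4096  # assumed page size in bytes; real systems may differ


def plan_bank_ranges(total_bytes: int, num_banks: int,
                     page_size: int = PAGE_SIZE) -> List[Tuple[int, int, int]]:
    """Split a shared buffer into page-aligned, contiguous ranges, one per bank.

    Hypothetical helper for illustration only. Returns a list of
    (bank, start_offset, length) tuples; banks that would receive
    no pages are omitted.
    """
    if total_bytes < 0 or num_banks <= 0 or page_size <= 0:
        raise ValueError("sizes must be non-negative and counts positive")

    num_pages = -(-total_bytes // page_size)  # ceiling division
    base, extra = divmod(num_pages, num_banks)

    ranges = []
    start_page = 0
    for bank in range(num_banks):
        # The first `extra` banks take one additional page each.
        pages = base + (1 if bank < extra else 0)
        if pages == 0:
            continue
        start = start_page * page_size
        # The last page may be only partially used by the buffer.
        end = min((start_page + pages) * page_size, total_bytes)
        ranges.append((bank, start, end - start))
        start_page += pages
    return ranges


if __name__ == "__main__":
    for bank, start, length in plan_bank_ranges(10 * PAGE_SIZE + 100, 4):
        print(f"bank {bank}: offset {start}, {length} bytes")
```

Page alignment matters here because memory placement is typically decided per page, so a range that straddles a page boundary cannot be split between two banks. A thread pinned to the cores of a given bank would then work mostly on the range assigned to that bank.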