
Datasets for MONet: Heterogeneous Memory over Optical Network for Large-Scale Data Centre Resource Disaggregation

Dataset posted on 2021-04-13, 08:45, authored by Joshua Benjamin, Vaibhawa Mishra and Georgios Zervas
Fig. 4: MONet Switch-Plane Characterization - Architecture, Power and Latency
Switch-Plane Characterization: Power and network latency comparison between non-parallel (fat-tree) and MONet architectures.

Fig. 5: MONet Remote Memory Access Round-Trip Latency
DDR4/HMC local/remote (8 m) memory read/write latency: 8 bonded transceivers, each at 10, 12.5 and 15 Gb/s.
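Part of the lane-rate dependence in Fig. 5 is serialization time, i.e. the time needed to clock a transaction's bits onto the bonded link. A minimal Python sketch, assuming 8 bonded lanes and an illustrative 64-byte payload (real DDR4/HMC transactions carry additional framing and protocol bits):

    # Serialization time of a payload over 8 bonded lanes.
    # The 64-byte payload is illustrative, not the exact DDR4/HMC
    # transaction size, which includes framing and protocol bits.
    LANES = 8

    def serialization_ns(payload_bytes: int, lane_rate_gbps: float) -> float:
        # bits / aggregate Gb/s gives nanoseconds directly
        return payload_bytes * 8 / (LANES * lane_rate_gbps)

    for rate in (10, 12.5, 15):
        print(f"{rate} Gb/s lanes: {serialization_ns(64, rate):.2f} ns per 64 B")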

Fig. 6: MONet DDR4/HMC Remote Memory Read/Write Latency Overhead
Remote memory read/write latency: impact of optical distance between CPU and remote memory on round-trip latency. Values are measured experimentally for 8, 18 and 36 m; only the 100 m value is based on Eq. 1.
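Eq. 1 itself is not reproduced in this listing, but to first order the distance-dependent term it extrapolates is fiber propagation delay. A minimal sketch, assuming a typical silica-fiber group index of about 1.468 and treating the quoted distances as total round-trip fiber lengths:

    # Distance-dependent part of round-trip latency, assuming it is
    # dominated by fiber propagation delay (~4.9 ns per metre of fiber).
    C = 299_792_458              # speed of light in vacuum, m/s
    N_GROUP = 1.468              # assumed group index of silica fiber
    NS_PER_M = N_GROUP / C * 1e9

    for d in (8, 18, 36, 100):   # distances from Fig. 6
        print(f"{d:>4} m round trip -> ~{d * NS_PER_M:5.1f} ns of fiber delay")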

Fig. 7: MONet DDR4: Achieved Bandwidth, Memory/Link Utilization
MONet DDR4 access: achieved bandwidth, and link and memory bandwidth utilization, for locally and remotely (8 m) attached DDR4, with transceiver lanes at 10, 12.5 and 15 Gb/s.

Fig. 8: MONet HMC: Achieved Bandwidth, Memory/Link Utilization
Achieved bandwidth, and link and memory bandwidth utilization, for locally and remotely (8 m) attached HMC, compared with the achieved maximum memory bandwidth for different lane rates (10, 12.5 and 15 Gb/s).
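The raw link capacity that the utilization figures in Figs. 7-8 are normalized against follows from simple arithmetic. A sketch assuming 8 bonded lanes and no line-coding overhead (the figures' own normalization may account for coding and protocol overhead differently; the 60 Gb/s example is a placeholder):

    # Raw capacity of an 8-lane bonded link and utilization relative
    # to it. Ignores line coding and protocol overhead (assumption).
    LANES = 8

    def raw_link_gbps(lane_rate_gbps: float) -> float:
        return LANES * lane_rate_gbps

    def utilization(achieved_gbps: float, lane_rate_gbps: float) -> float:
        return achieved_gbps / raw_link_gbps(lane_rate_gbps)

    for rate in (10, 12.5, 15):  # lane rates from Figs. 7-8
        print(f"{rate} Gb/s x {LANES} lanes = {raw_link_gbps(rate):.0f} Gb/s raw")
    print(f"e.g. 60 Gb/s achieved at 10 Gb/s lanes -> {utilization(60, 10):.0%}")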

Fig. 9: MONet: Power Consumption Distribution
Power consumption distribution between CPU and memory over an 8-metre round-trip optical data path. Round-trip net energy efficiency (with and without MONet's resources) and memory-to-link ratio as a function of the number of transceiver links and lane rate.
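Energy efficiency in Fig. 9 can be read as energy per transferred bit, i.e. power draw divided by achieved bandwidth. A sketch with placeholder numbers (the actual power and bandwidth values live in the dataset and are not reproduced here):

    # Energy per bit from power draw and achieved bandwidth.
    # The 10 W / 80 Gb/s example values are placeholders.
    def pj_per_bit(power_watts: float, bandwidth_gbps: float) -> float:
        return power_watts / (bandwidth_gbps * 1e9) * 1e12

    print(f"{pj_per_bit(10.0, 80.0):.0f} pJ/bit")  # -> 125 pJ/bit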

Fig. 10: MONet HMC Access: Physical Layer Performance
Physical layer performance of a single bi-directional channel between CPU and HMC: received optical power (dBm) vs log10(BER).

Fig. 11: MONet HMC Access: BER vs Bandwidth
Impact of bit error rate (BER) on memory bandwidth performance per HMC half-width link (8 transceivers).
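A rough model of why bandwidth in Fig. 11 degrades as BER rises: if every errored transaction of n bits is retried, the delivered fraction of raw bandwidth is (1 - BER)^n. A sketch under that assumption (the HMC link's actual CRC-and-retry behaviour may differ in detail):

    # Goodput fraction vs BER, assuming errored n-bit transactions are
    # retransmitted. Simplified model, not the exact HMC retry protocol.
    def goodput_fraction(ber: float, bits_per_transaction: int) -> float:
        return (1 - ber) ** bits_per_transaction

    FLIT_BITS = 128  # HMC FLITs are 128 bits; used here illustratively
    for ber in (1e-12, 1e-9, 1e-6, 1e-3):
        print(f"BER={ber:.0e}: {goodput_fraction(ber, FLIT_BITS):.6f} of raw")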

Fig. 12: MONet STREAM Benchmark DDR4
STREAM benchmark performance for DDR4 at 8 metres round-trip distance, using 8 channels and a single channel.

Fig. 13: MONet HMC STREAM Benchmark
Application-level performance using the STREAM benchmark for accessing serial memory (local, and remote at 8 metres round-trip) at 10, 12.5 and 15 Gb/s lane rates.
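Figs. 12-13 report STREAM's sustained-bandwidth kernels (Copy, Scale, Add, Triad). As an illustration of what Triad measures and how its bandwidth is accounted, a NumPy sketch (the dataset itself was produced with the STREAM benchmark; NumPy's temporary arrays make this approximation slightly pessimistic):

    # STREAM Triad (a = b + s*c) and its bandwidth accounting:
    # STREAM counts 3 arrays of traffic per pass (2 reads + 1 write).
    import time
    import numpy as np

    N = 20_000_000                    # ~160 MB per float64 array
    b, c = np.random.rand(N), np.random.rand(N)
    a = np.empty_like(b)
    s = 3.0

    t0 = time.perf_counter()
    np.add(b, s * c, out=a)           # triad; s*c makes a temporary
    dt = time.perf_counter() - t0

    print(f"Triad: {3 * N * a.itemsize / dt / 1e9:.1f} GB/s")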

Fig. 14: MONet DDR4 and HMC: STREAM and Baseline
Sustained STREAM and baseline bandwidth (8 links) over round-trip optical distance: 8, 18, 26 and 36 metres.

Fig. 15-16: MONet DDR4: Memcached Throughput
Achieved throughput in workloads A, B, C and F when DDR4 is locally/remotely attached. For DDR4: parallel accessed (MM), stream data-width size in bytes (8 to 64). Sustained throughput in workloads A, B, C and F when DDR4 is remotely attached at round-trip optical distances of 8, 16, 26 and 36 metres.

Fig. 15-16: MONet HMC: Memcached Throughput
Achieved throughput in workloads A, B, C and F when HMC is locally/remotely attached. For HMC: full-width (FW, 16-lane) and half-width (HW, 8-lane) at 10, 12.5 and 15 Gb/s bit rates. Sustained throughput in workloads A, B, C and F when HMC is remotely attached at round-trip optical distances of 8, 16, 26 and 36 metres.
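The workload labels A, B, C and F used across Figs. 15-18 match the standard YCSB mixes: A is 50/50 read/update, B is 95/5 read/update, C is read-only, and F is read-modify-write. A hedged pymemcache sketch of driving such mixes against memcached (assuming a YCSB-style driver was used, which the captions do not state; the server address, key space and 64-byte values are placeholders):

    # YCSB-style read/update mixes against memcached via pymemcache.
    # Placeholder server, key distribution and record size; the actual
    # experiments' driver and parameters are not reproduced here.
    import random
    from pymemcache.client.base import Client

    READ_FRACTION = {"A": 0.50, "B": 0.95, "C": 1.00}  # F is RMW
    client = Client(("127.0.0.1", 11211))              # placeholder

    def run(workload: str, ops: int = 10_000, keys: int = 1_000) -> None:
        for _ in range(ops):
            key = f"user{random.randrange(keys)}"
            if workload == "F":                        # read-modify-write
                client.set(key, client.get(key) or b"0")
            elif random.random() < READ_FRACTION[workload]:
                client.get(key)
            else:
                client.set(key, b"x" * 64)             # placeholder value

    run("A")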

Fig. 17-18: MONet DDR4 Memcached Latency
Achieved average latency in workloads A, B, C and F when DDR4 is locally and remotely attached. For local attachment in DDR4: parallel accessed (MM), stream data-width size in bytes (8 to 64). Sustained average latency in workloads A, B, C and F when DDR4 is remotely attached at round-trip optical distances of 8, 16, 26 and 36 metres.

Fig. 17-18: MONet HMC Memcached Latency
Achieved average latency in workloads A, B, C and F when HMC is locally and remotely attached. For local attachment in HMC: full-width (FW, 16-lane) and half-width (HW, 8-lane) at 10, 12.5 and 15 Gb/s bit rates. Sustained average latency in workloads A, B, C and F when HMC is remotely attached at round-trip optical distances of 8, 16, 26 and 36 metres.

Fig. 19: MONet DDR4 Memcached IPC
Impact of optical distance on IPC in workloads A, B, C and F for the whole CPU. For DDR4: using 8 transceiver links and a single link, each at 10, 12.5 and 15 Gb/s.

Fig. 19: MONet HMC Memcached IPC
Impact of optical distance on IPC in workloads A, B, C and F for the whole CPU. For HMC: half-width (8-lane) at 10, 12.5 and 15 Gb/s bit rates.
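IPC in Fig. 19 is instructions retired divided by CPU cycles. A sketch of deriving it from hardware counters with Linux perf stat (an assumption: the captions do not state which tool produced the dataset's IPC numbers; the profiled command is a placeholder):

    # Whole-process IPC from hardware counters via `perf stat`.
    # The profiled command ("sleep 1") is a placeholder.
    import re
    import subprocess

    out = subprocess.run(
        ["perf", "stat", "-e", "instructions,cycles", "--", "sleep", "1"],
        capture_output=True, text=True,
    ).stderr                               # perf stat prints to stderr

    def counter(name: str) -> int:
        return int(re.search(rf"([\d,]+)\s+{name}", out)
                   .group(1).replace(",", ""))

    print(f"IPC = {counter('instructions') / counter('cycles'):.2f}")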
