Find a broken RAM DIMM in a server

A broken RAM module is annoying enough in itself, but hunting for it by trial and error is even worse. This simple command lists the labels of the DIMMs the system currently sees. Run it and look for the slot that is missing from the list, in this case:

CPU#0Channel#0_DIMM#1

[root@machine ~]# cat /sys/devices/system/edac/mc/mc*/csrow*/ch*_dimm_label 
CPU#0Channel#0_DIMM#0
CPU#0Channel#1_DIMM#0
CPU#0Channel#2_DIMM#0
CPU#0Channel#1_DIMM#1
CPU#0Channel#2_DIMM#1
CPU#1Channel#0_DIMM#0
CPU#1Channel#1_DIMM#0
CPU#1Channel#2_DIMM#0
CPU#1Channel#0_DIMM#1
CPU#1Channel#1_DIMM#1
CPU#1Channel#2_DIMM#1
[root@machine ~]# 
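
Spotting the gap by eye gets tedious on a machine with many slots. The following is a minimal sketch of a helper that does the comparison for you. The function name missing_dimms is made up for this example, and it rests on two assumptions: the labels follow the CPU#x Channel#y _DIMM#z pattern shown above, and every CPU/channel/slot combination that appears anywhere in the list is supposed to be populated. Label formats differ between vendors and drivers, so adjust it to what your system prints.

```shell
#!/bin/sh
# Hypothetical helper: read DIMM labels on stdin and print the
# CPU/channel/slot combinations that are absent from the list.
# Assumes labels look like CPU#0Channel#0_DIMM#1.
missing_dimms() {
    # Split each label on runs of non-digits: $2=CPU, $3=channel, $4=slot.
    awk -F'[^0-9]+' '
        NF >= 4 {
            cpu[$2]; chan[$3]; slot[$4]      # remember every value seen
            seen[$2 "," $3 "," $4]           # remember each full label
        }
        END {
            # Walk the full grid and report combinations never seen.
            for (c in cpu) for (h in chan) for (s in slot)
                if (!((c "," h "," s) in seen))
                    printf "CPU#%sChannel#%s_DIMM#%s\n", c, h, s
        }
    ' | sort
}
```

Usage, with the same path as above:

cat /sys/devices/system/edac/mc/mc*/csrow*/ch*_dimm_label | missing_dimms

If the machine is deliberately only partly populated, the helper will also list those empty slots, so treat its output as a hint rather than a verdict.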

Or:

[root@machine ~]# cat /sys/devices/system/edac/mc/mc*/dimm*/dimm_label
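
When a module is failing rather than missing entirely, it can help to see more than the label. Here is a rough sketch that prints each label next to the size and corrected-error count from the same dimm* directory. The function name dimm_report is hypothetical, and it assumes your kernel's EDAC driver exposes the size and dimm_ce_count files there; if a file is absent, its field is simply left empty.

```shell
#!/bin/sh
# Hypothetical helper: print each DIMM label with its size and
# corrected-error count. The base directory can be overridden as $1.
dimm_report() {
    base="${1:-/sys/devices/system/edac/mc}"
    for d in "$base"/mc*/dimm*; do
        # Skip anything without a readable label (also covers an unmatched glob).
        [ -r "$d/dimm_label" ] || continue
        printf '%s size=%s ce=%s\n' \
            "$(cat "$d/dimm_label")" \
            "$(cat "$d/size" 2>/dev/null)" \
            "$(cat "$d/dimm_ce_count" 2>/dev/null)"
    done
}
```

Call it as dimm_report with no arguments to read from /sys. A module whose ce value keeps climbing between runs is a likely candidate for replacement.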