2026-07-03 09:51:42

Troubleshooting memory issues

Sun/Oracle SPARC Enterprise T5220 Memory Fault Recovery Summary

Initial symptoms

The ILOM service processor reported:

Unsupported memory configuration

Later, after changes/reseating, the system reported:

/SYS/MB/CMP0/MCU0 Forced fail (IBIST)
Operating with a degraded memory configuration.

At OpenBoot, POST eventually showed:

POST Passed all devices.
ERROR: The following devices are disabled:
    MB/CMP0/MCU0
Aborting auto-boot sequence.

This showed that the memory controller was no longer actively failing POST, but it had remained disabled by ASR/ILOM from a previous fault.
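As a rough illustration of reading that console output, the sketch below pulls the disabled device list out of captured POST text. The function name disabled_devices is hypothetical, and the parsing assumes the layout shown above (a banner line followed by indented device paths); it works on pasted text only and does not talk to OpenBoot or ILOM.

```python
def disabled_devices(console_text: str) -> list[str]:
    """Return device paths listed under the 'devices are disabled' banner."""
    devices = []
    in_list = False
    for line in console_text.splitlines():
        if "The following devices are disabled" in line:
            in_list = True
            continue
        if in_list:
            # Assumption: device entries are indented; the first
            # non-indented or blank line ends the list.
            if line.startswith((" ", "\t")) and line.strip():
                devices.append(line.strip())
            else:
                in_list = False
    return devices


sample = """POST Passed all devices.
ERROR: The following devices are disabled:
    MB/CMP0/MCU0
Aborting auto-boot sequence."""

print(disabled_devices(sample))
```

A non-empty result alongside "POST Passed all devices." is the combination described above: the hardware tests clean, but a component is still held disabled from an earlier fault.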

Root causes

There were two separate issues:

  1. DIMMs were not in a supported T5220 population pattern. The T5220 requires DIMMs to be installed in specific slots. For example, a valid 4-DIMM configuration is:

     J1001
     J1401
     J2001
     J2401

     Valid population counts are 4, 8, or 16 DIMMs, with correct slot order and compatible matching FB-DIMMs.

  2. DIMM slot/contact issue. Dirty or marginal DIMM contacts caused a memory-controller path fault during testing, which led ILOM/ASR to disable:

     /SYS/MB/CMP0/MCU0
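The population rule from the first root cause can be sketched as a small pre-check. This is a minimal illustration, not a replacement for the service manual: check_population is a hypothetical helper, only the 4-DIMM slot set comes from these notes, and the sketch assumes that set is the only supported 4-DIMM layout. The 8- and 16-DIMM slot orders are not listed here, so only their counts are checked.

```python
VALID_COUNTS = {4, 8, 16}

# Only the 4-DIMM pattern is given in these notes. Treating it as the
# sole valid 4-DIMM layout is an assumption of this sketch.
FOUR_DIMM_SLOTS = {"J1001", "J1401", "J2001", "J2401"}


def check_population(populated_slots):
    """Return a list of problems found; an empty list means no issue detected."""
    slots = set(populated_slots)
    problems = []
    if len(slots) not in VALID_COUNTS:
        problems.append(f"{len(slots)} DIMMs installed; expected 4, 8, or 16")
    elif len(slots) == 4 and slots != FOUR_DIMM_SLOTS:
        wrong = sorted(slots - FOUR_DIMM_SLOTS)
        problems.append("4-DIMM layout uses unexpected slots: " + ", ".join(wrong))
    return problems


print(check_population(["J1001", "J1401", "J2001", "J2401"]))
print(check_population(["J1001", "J1401", "J2001"]))
```

An empty list only means the count (and, for four DIMMs, the slot set) matches what is recorded here; it says nothing about DIMM compatibility or contact quality.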

Successful recovery steps

  1. Powered down the host.
  2. Corrected the DIMM placement according to the supported T5220 slot population rules.
  3. Cleaned the DIMM slots and reseated the memory modules carefully.
  4. Started the host and observed that POST passed.
  5. Checked the disabled component state:

     -> show /SYS/MB/CMP0/MCU0

     It showed:

     component_state = Disabled

  6. Re-enabled the memory controller manually:

     -> set /SYS/MB/CMP0/MCU0 component_state=Enabled

  7. Restarted the system.
  8. Verified the result:

     -> show faulty

Final result:

No faults found
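The check, re-enable, and verify part of the procedure can be sketched as text handling over captured ILOM output. The helper names below are hypothetical, and the code assumes the output contains lines exactly like those quoted above ("component_state = Disabled" and "No faults found"). It only builds the command string to type at the ILOM prompt and does not connect to the service processor.

```python
def component_state(show_output: str):
    """Extract the component_state value from pasted 'show' output, if present."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "component_state":
            return value.strip()
    return None


def reenable_command(target: str, show_output: str):
    """Return the ILOM 'set' command to issue, or None if nothing is disabled."""
    if component_state(show_output) == "Disabled":
        return f"set {target} component_state=Enabled"
    return None


def is_clean(show_faulty_output: str) -> bool:
    """True when the pasted 'show faulty' output reports no faults."""
    return "No faults found" in show_faulty_output


print(reenable_command("/SYS/MB/CMP0/MCU0", "component_state = Disabled"))
print(is_clean("No faults found"))
```

Returning None when the component is not disabled keeps the sketch from suggesting a set command that is not needed.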

Final conclusion

The successful fix was not only clearing an ILOM fault. The real recovery required:

  1. correct DIMM positions
  2. cleaned and reseated DIMM slots
  3. re-enabling the ASR-disabled MCU0 component
  4. reboot/retest

After this, show faulty reported no faults on all machines, confirming the hardware state was clean.