
Guide to Managing an HP Smart Array P420i/P420p RAID Controller

This guide provides instructions on how to check, manage, and maintain an HP Smart Array P420i/P420p RAID controller on a Linux system (specifically Ubuntu/Debian).

The HP Smart Array P420i (integrated) and P420p (PCIe card) are hardware RAID controllers that manage physical disks and present them to the operating system as “Logical Drives”. Proper management is critical for:

  • Data Integrity: Ensuring your data is not corrupted.
  • Performance: Optimizing read and write speeds.
  • Redundancy: Protecting against data loss from a single disk failure (depending on RAID level).

We will use the ssacli command-line tool for all management tasks.

NOTE: Initial creation of a RAID array (Logical Drive) is typically performed before the operating system boots, in the controller's configuration utility (Smart Storage Administrator), which is reached with a key press at system startup (e.g., F10 for Intelligent Provisioning, or F8). This guide focuses on management and verification from within the running operating system.

ssacli is not part of the default Ubuntu repositories. Add HPE's package repository to apt first, then install the ssacli package from it.

First, confirm the system sees the controller. The lsscsi command is useful for this.

$ lsscsi -g
[4:0:0:0]    storage HP       P420             4.68  -          /dev/sg0 
[4:1:0:0]    disk    HP       LOGICAL VOLUME   4.68  /dev/sda   /dev/sg1 
[5:0:0:0]    storage HP       P420i            8.32  -          /dev/sg2 
[5:1:0:0]    disk    HP       LOGICAL VOLUME   8.32  /dev/sdb   /dev/sg3 
[6:0:0:0]    storage HP       P420             8.00  -          /dev/sg4 

If you are unsure of which device corresponds to which controller, you can cross-check with the logical volumes:

$ lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                         8:0    0   8.2T  0 disk 
sdb                         8:16   0 953.8G  0 disk 
├─sdb1                      8:17   0     1M  0 part 
├─sdb2                      8:18   0     2G  0 part /boot
└─sdb3                      8:19   0 951.8G  0 part 
  ├─ubuntu--vg-ubuntu--lv 252:0    0   100G  0 lvm  /
  └─ubuntu--vg-lv--swap   252:1    0    32G  0 lvm  [SWAP]
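
The cross-check above can be sketched as a small helper that pairs each logical volume with the controller on the same SCSI host number. This is a minimal sketch, not a definitive tool: the function name map_volumes_to_controllers is hypothetical, and it assumes the lsscsi -g column layout shown earlier (a one-word controller model, and the block device in the second-to-last column).

```shell
# Pair each logical volume with the controller that shares its SCSI host number.
# Reads `lsscsi -g` output on stdin. Assumes the column layout shown in this guide.
# Usage: lsscsi -g | map_volumes_to_controllers
map_volumes_to_controllers() {
    awk '
        {
            # The first field looks like [4:0:0:0]; the host number is its first element.
            h = $1
            sub(/^\[/, "", h)
            split(h, a, ":")
            host = a[1]

            if ($2 == "storage") {
                ctrl[host] = $4                 # controller model, e.g. P420i
            } else if ($2 == "disk") {
                n++
                dev[n] = $(NF - 1)              # block device, e.g. /dev/sda
                dhost[n] = host
            }
        }
        END {
            for (i = 1; i <= n; i++) {
                model = (dhost[i] in ctrl) ? ctrl[dhost[i]] : "unknown"
                printf "%s -> %s (host %s)\n", dev[i], model, dhost[i]
            }
        }'
}
```

With the lsscsi output shown earlier, this would report /dev/sda on the P420 at host 4 and /dev/sdb on the P420i at host 5. The controller at host 6 presents no logical volume, so it does not appear.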

Next, use ssacli to get a system-wide overview. This will show you the controller's slot number. We will refer to this as slot=X in all subsequent commands (it is very often slot=0).

$ sudo ssacli ctrl all show status
 
Smart Array P420i in Slot 0 (Embedded)
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK

From the output above, we can see:

  • Controller: Smart Array P420i in Slot 0. This is our controller.
  • Controller Status: OK. The controller hardware is healthy.
  • Cache Status: OK. The write cache is operational.
  • Battery/Capacitor Status: OK. The Flash-Backed Write Cache (FBWC) power source is healthy.
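
For scripted checks, the same reading of the status output can be sketched as two helpers. Both function names (list_slots, check_ctrl_status) are hypothetical, and the parsing assumes the output format shown above; treat it as a starting point under those assumptions.

```shell
# Print the slot number of every controller named in
# `ssacli ctrl all show status` output (read from stdin).
list_slots() {
    sed -n 's/^.* in Slot \([0-9][0-9]*\).*$/\1/p'
}

# Print every "<something> Status: <value>" line whose value is not OK.
# Exits non-zero if at least one such line is found.
check_ctrl_status() {
    awk -F': *' '
        /Status:/ {
            value = $2
            sub(/[ \t\r]+$/, "", value)      # tolerate trailing whitespace
            if (value != "OK") { print; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }'
}

# Usage:
#   sudo ssacli ctrl all show status | list_slots
#   sudo ssacli ctrl all show status | check_ctrl_status || echo "controller needs attention"
```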

If you are inheriting a server or just want to check the health of an existing array, the main command is show config detail.

$ sudo ssacli ctrl slot=0 show config detail

This command provides a lot of information. The most important sections are the main controller block, the logicaldrive blocks, and the physicaldrive blocks.

Smart Array P420i in Slot 0 (Embedded)
   Bus Interface: PCI
   [...]
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK

   Array A (SAS, Unused Space: 0  MB)

      logicaldrive 1 (931.5 GB, RAID 5, OK)

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 300 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 300 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 300 GB, OK)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 300 GB, OK)

Key things to check:

  • Controller/Cache/Capacitor Status: These should all be OK. Any other state requires investigation.
  • logicaldrive Status: The status for each logical drive should be OK. A degraded state (ssacli typically reports it as Interim Recovery Mode) means a drive has failed and you are running without full redundancy.
  • physicaldrive Status: All physical drives should be OK. Any other state requires attention, and a drive reported as Failed or Predictive Failure should be replaced.
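
The drive checks in this list can be sketched as a filter that prints only the drives whose status is not OK. The function name find_unhealthy_drives is hypothetical, and the parsing assumes the one-line logicaldrive and physicaldrive summaries shown above, where the status is the last comma-separated item inside the parentheses.

```shell
# Read ssacli configuration output on stdin and print every logical or
# physical drive whose status (last item in the parentheses) is not OK.
# Exits non-zero if any such drive is found.
find_unhealthy_drives() {
    awk '
        $1 == "logicaldrive" || $1 == "physicaldrive" {
            status = $0
            sub(/^.*, */, "", status)        # keep the text after the last comma
            sub(/\).*$/, "", status)         # drop the closing parenthesis
            if (status != "OK") {
                print $1, $2 ": " status
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }'
}

# Usage:
#   sudo ssacli ctrl slot=0 show config | find_unhealthy_drives
```

The non-zero exit status makes the helper easy to use in a cron job or monitoring check that should alert only when something is wrong.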

HP Smart Array controllers proactively check for and fix issues. The two most important automated tasks are the Surface Scan and the Parity Scan (Consistency Check).

  • Surface Scan: Scans the physical disks for bad blocks (media errors) and remaps them before they cause an error during a read operation. This is HP's equivalent of a Patrol Read and is crucial for preventing a disk from failing during a critical array rebuild.
  • Parity Scan (Consistency Check): Verifies the RAID parity data. It reads stripes and checks if the parity matches the data, correcting any errors it finds.

All relevant health check settings are displayed in the detailed configuration output.

$ sudo ssacli ctrl slot=0 show config detail

In the output for the controller, look for the following lines:

   Surface Scan Delay: 15 secs
   Surface Scan Mode: Idle
   [...]
  • Surface Scan Mode should be Idle, meaning it runs when the controller is not busy.
  • Surface Scan Delay is how long the controller waits for I/O to be idle before starting a scan. The default is fine.
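
To avoid scrolling through the full output, the two surface scan settings can be pulled out with a short filter. This is a sketch assuming the "Surface Scan Mode:" and "Surface Scan Delay:" labels shown above; the function name surface_scan_settings is hypothetical.

```shell
# Read `ssacli ctrl slot=X show config detail` output on stdin and print the
# surface scan settings as key=value pairs, one per line.
surface_scan_settings() {
    awk -F': *' '
        /^ *Surface Scan (Mode|Delay):/ {
            key = $1
            sub(/^ +/, "", key)              # strip the leading indentation
            print key "=" $2
        }'
}

# Usage:
#   sudo ssacli ctrl slot=0 show config detail | surface_scan_settings
```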

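The consistency check is not a separately scheduled task as it is in other vendors' tools. It runs automatically in the background with low priority; on Smart Array controllers the surface scan also verifies parity consistency on RAID 5 and RAID 6 logical drives. In the same output, the Parity Initialization Status line shows whether the initial parity build for a logical drive has completed: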

      logicaldrive 1 (931.5 GB, RAID 5, OK)
         [...]
         Parity Initialization Status: Initialization Completed
         [...]
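
On a controller with several logical drives, the parity line can be listed per drive with a filter like the one below. It is a sketch that assumes each logical drive block starts with a "logicaldrive N" line as in the excerpt above; the function name parity_init_status is hypothetical.

```shell
# Read ssacli configuration output on stdin and print the
# Parity Initialization Status of each logical drive.
parity_init_status() {
    awk '
        $1 == "logicaldrive" { ld = $2 }     # remember the current logical drive
        /Parity Initialization Status:/ {
            value = $0
            sub(/^.*Parity Initialization Status: */, "", value)
            print "logicaldrive " ld ": " value
        }'
}

# Usage:
#   sudo ssacli ctrl slot=0 show config detail | parity_init_status
```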

By default, HP controllers have sane settings for these checks. You typically do not need to schedule them. However, you can modify their behavior.

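Surface Scan: The scan mode and delay are controller settings (in ssacli, the surfacescanmode and surfacescandelay options of modify; check ssacli help for the values your version accepts). Running the scan at a specific time of day is not a controller feature, so you would have to script that yourself. In general, the default “Idle” mode is sufficient for most use cases, as it automatically performs the scan during periods of low I/O.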

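Consistency Check (Parity Scan): While this is an automatic background process, you can modify its priority or trigger it manually. Confirm with ssacli help that your ssacli version accepts the subcommands in this section before relying on them. For example, if you suspect an issue, you can start a check on logical drive 1 in slot 0: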

$ sudo ssacli ctrl slot=0 logicaldrive 1 startconsistencycheck

You can also adjust the priority of the background consistency check. Setting it higher will complete the check faster but may have a greater performance impact on the server.

$ sudo ssacli ctrl slot=0 modify consistencycheckpriority=medium

(Options are low, medium, high. The default is low).

The P420i/P420p controller uses a Flash-Backed Write Cache (FBWC) module, which is powered by a super-capacitor rather than a battery (BBU). This module protects the data in the controller's write cache in case of a power failure. If the capacitor is dead or failing, the controller disables the write cache, which severely degrades write performance.

Check the cache and capacitor status regularly with the show status or show detail command.

$ sudo ssacli ctrl slot=0 show detail

Look for the Cache Status line and the capacitor status line (labelled Capacitor Status or Battery/Capacitor Status, depending on the output).

Smart Array P420i in Slot 0 (Embedded)
   [...]
   Controller Status: OK
   Cache Status: OK
   Cache Status Details: The cache is configured.
   [...]
   Capacitor Status: OK

Both values must be OK. A Failed or Degraded status indicates the cache module or its capacitor needs to be replaced.
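
Because a failed capacitor silently disables the write cache, this check is worth automating. The sketch below condenses the two lines into a one-line summary and a pass/fail exit status. The function name cache_summary is hypothetical, and it accepts either the "Capacitor Status" or the "Battery/Capacitor Status" label.

```shell
# Read `ssacli ctrl slot=X show detail` output on stdin, print a one-line
# summary, and exit non-zero unless both the cache and the capacitor are OK.
cache_summary() {
    awk -F': *' '
        /^ *Cache Status:/ { cache = $2 }
        /^ *(Battery\/)?Capacitor Status:/ { cap = $2 }
        END {
            sub(/[ \t\r]+$/, "", cache)      # tolerate trailing whitespace
            sub(/[ \t\r]+$/, "", cap)
            if (cache == "") cache = "unknown"
            if (cap == "") cap = "unknown"
            printf "cache=%s capacitor=%s\n", cache, cap
            exit ((cache == "OK" && cap == "OK") ? 0 : 1)
        }'
}

# Usage (e.g. from cron), logging to syslog only when something is wrong:
#   sudo ssacli ctrl slot=0 show detail | cache_summary \
#       || logger -t ssacli "write cache or capacitor not OK on slot 0"
```

A missing line is reported as unknown and treated as a failure, so a change in the output format shows up as an alert rather than a silent pass.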

  • Last modified: 2025/09/23 14:43
  • by fabricio