Guide to Managing a Dell PERC H710P RAID Controller

This guide provides instructions on how to check, manage, and maintain a Dell PERC H710P RAID controller on a Linux system (specifically Ubuntu/Debian).

The Dell PowerEdge RAID Controller (PERC) H710P is a hardware RAID controller that manages physical disks and presents them to the operating system as logical volumes, or “Virtual Drives”. Proper management is critical for:

  • Data Integrity: Ensuring your data is not corrupted.
  • Performance: Optimizing read and write speeds.
  • Redundancy: Protecting against data loss from a single disk failure (depending on RAID level).

We will use the perccli command-line tool for all management tasks.

NOTE: Initial creation of a RAID array (Virtual Drive) is typically performed in the controller's configuration utility, accessible via a key press (e.g., Ctrl+R) during system boot. This guide focuses on management and verification from within the running operating system.

The official Dell tool, perccli, is distributed as an .rpm package inside a .tar.gz archive. On Debian-based systems like Ubuntu, you must first convert the .rpm to a .deb package using alien.

NOTE: Download the perccli archive from the Dell Support website. Search for your server model or “PERC H710P” and pick the appropriate download for Linux.

First, install alien if you don't have it:

sudo apt-get update
sudo apt-get install alien

After downloading and extracting the perccli archive from Dell, navigate into its Linux subdirectory.

Convert the .rpm package to a .deb package:

sudo alien -k perccli-007.0127.0000.0000-1.noarch.rpm

Install the newly generated .deb package:

sudo dpkg -i perccli_007.0127.0000.0000-1_all.deb
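The file names above contain the perccli version, which changes between releases. As a minimal sketch, the helper below finds the package by glob pattern instead of by a hard-coded name. The function name first_match is hypothetical, and the patterns in the usage comment assume Dell keeps the naming scheme shown above.

```shell
# Print the first path in a directory (default: current directory) that
# matches a glob pattern; return non-zero if nothing matches.
first_match() {
    for candidate in "${2:-.}"/$1; do
        [ -e "$candidate" ] && { printf '%s\n' "$candidate"; return 0; }
    done
    return 1
}

# Usage, from the Linux subdirectory of the extracted download:
#   rpm=$(first_match 'perccli-*.noarch.rpm') && sudo alien -k "$rpm"
#   deb=$(first_match 'perccli_*_all.deb')    && sudo dpkg -i "$deb"
```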

The binary is installed in /opt/MegaRAID/perccli/. To make it easier to use, create a symbolic link into a standard path:

sudo ln -s /opt/MegaRAID/perccli/perccli64 /usr/sbin/perccli64

No shell reload is normally needed, because the link is found through PATH as soon as it exists. If your shell still reports that the command is not found, clear its cached command locations:

hash -r

First, confirm the system sees the controller. The lsscsi command is useful for this.

$ lsscsi -g
[0:2:0:0]    disk    DELL     PERC H710P       3.13  /dev/sda   /dev/sg0 
[1:0:0:0]    cd/dvd  PLDS     DVD+-RW DS-8A8SH KD11  /dev/sr0   /dev/sg1 
[2:0:0:0]    disk    ATA      Samsung SSD 850  2B6Q  /dev/sdb   /dev/sg2 
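To pick out the PERC-backed block device from that listing in a script, a small filter is enough. This is a sketch that assumes the column layout shown above, where the block device is the second-to-last field; the function name perc_block_devices is illustrative and not part of lsscsi.

```shell
# Print the block device node of every PERC virtual disk found in
# `lsscsi -g` output read from stdin.
# Assumes each line ends with: <block device> <sg device>
perc_block_devices() {
    awk '/PERC/ && $2 == "disk" { print $(NF - 1) }'
}

# Usage on a live system:
#   lsscsi -g | perc_block_devices
```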

Next, use perccli to get a system-wide overview. This will show you the controller's index number (usually 0). We will refer to this as /c0 in all subsequent commands.

$ sudo perccli64 show all
 
Status Code = 0
Status = Success
Description = None
 
Number of Controllers = 1
Host Name = cmpt3
Operating System  = Linux5.15.0-153-generic
 
System Overview :
===============
----------------------------------------------------------------------------
Ctl Model            Ports PDs DGs DNOpt VDs VNOpt BBU sPR DS EHS ASOs Hlth 
----------------------------------------------------------------------------
  0 PERCH710PAdapter     8   4   1     0   1     0 Opt On  3  N      0 Opt  
----------------------------------------------------------------------------

From the output above, we can see:

  • Ctl (Controller Index): 0. This is our /c0.
  • PDs (Physical Drives): 4 drives are attached.
  • VDs (Virtual Drives): 1 virtual drive is configured.
  • BBU (Battery Backup Unit): Opt (Optimal). This is good!
  • Hlth (Health): Opt (Optimal). The overall controller health is good.
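For monitoring scripts, the same fields can be pulled out of the System Overview table automatically. The sketch below assumes the 14-column row layout shown above; other perccli versions may print different columns. The function name overview_summary is hypothetical.

```shell
# Read `perccli64 show all` output on stdin and print one summary line
# per controller. Assumed column order:
# Ctl Model Ports PDs DGs DNOpt VDs VNOpt BBU sPR DS EHS ASOs Hlth
overview_summary() {
    awk '
        /^System Overview/ { in_table = 1; next }
        # Data rows start with the numeric controller index.
        in_table && NF == 14 && $1 ~ /^[0-9]+$/ {
            printf "ctl=%s pds=%s vds=%s bbu=%s health=%s\n", $1, $4, $7, $9, $14
        }'
}

# Usage on a live system:
#   sudo perccli64 show all | overview_summary
```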

If you are inheriting a server or just want to check the health of an existing array, the main command is show.

sudo perccli64 /c0 show

This command provides a lot of information. The most important sections are TOPOLOGY and VD LIST.

[...]
TOPOLOGY :
========

---------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type  State BT     Size PDC  PI SED DS3  FSpace TR 
---------------------------------------------------------------------------
 0 -   -   -        -   RAID5 Optl  N  8.185 TB dflt N  N   dflt N      N  
 0 0   -   -        -   RAID5 Optl  N  8.185 TB dflt N  N   dflt N      N  
 0 0   0    :4      4   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N  
 0 0   1    :5      5   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N  
 0 0   2    :7      7   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N  
 0 0   3    :6      6   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N  
---------------------------------------------------------------------------
[...]
Virtual Drives = 1

VD LIST :
=======

-------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name 
-------------------------------------------------------------
0/0   RAID5 Optl  RW     Yes     RWBD  -   ON  8.185 TB      
-------------------------------------------------------------

Key things to check:

  • DRIVE State: In the TOPOLOGY section, all drives should be Onln (Online). Any other state (e.g., Failed, Rbld) requires attention.
  • VD State: In the VD LIST, the state should be Optl (Optimal). A Dgrd (Degraded) state means a drive has failed; on a RAID 5 array like this one, that leaves no redundancy, so a second failure would lose data.
  • Cache: RWBD decodes as R (Read Ahead), WB (Write-Back), and D (Direct I/O). Write-Back gives the best write performance. If this shows WT (Write-Through) instead of WB, it may indicate a problem with the BBU.
  • sCC: ON means scheduled consistency checks are enabled, which is good practice.
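The first two checks in the list above can be automated with a short filter. This is a sketch that assumes the TOPOLOGY and VD LIST layouts shown earlier; check_array_state is a hypothetical name, not a perccli feature.

```shell
# Read `perccli64 /c0 show` output on stdin. Report physical drives that
# are not Onln and virtual drives that are not Optl.
# Returns non-zero if anything was reported.
check_array_state() {
    awk '
        # TOPOLOGY rows for physical disks have the type DRIVE; the state
        # follows it and the EID:Slot column sits two fields before it.
        {
            for (i = 3; i < NF; i++) {
                if ($i == "DRIVE" && $(i + 1) != "Onln") {
                    printf "drive %s: %s\n", $(i - 2), $(i + 1)
                    bad = 1
                }
            }
        }
        # VD LIST rows start with DG/VD (for example 0/0); state is column 3.
        $1 ~ /^[0-9]+\/[0-9]+$/ && $3 != "Optl" {
            printf "vd %s: %s\n", $1, $3
            bad = 1
        }
        END { exit bad }'
}

# Usage on a live system:
#   sudo perccli64 /c0 show | check_array_state || echo "array needs attention"
```

Because the function signals problems through its exit status, it can be called from cron or a monitoring agent without parsing its text output.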

RAID controllers can proactively check for and fix issues. The two most important automated tasks are the Patrol Read and the Consistency Check.

  • Patrol Read (PR): Scans the physical disks for bad blocks and remaps them before they cause an error during a read operation. This reduces the risk of hitting an unreadable sector during a critical array rebuild.
  • Consistency Check (CC): Verifies the RAID parity data. It reads stripes and checks if the parity matches the data, correcting any errors it finds.

These tasks should be enabled and scheduled to run automatically during periods of low server activity (e.g., weekends, overnight).

Check the Patrol Read settings:

$ sudo perccli64 /c0 show pr
Controller = 0
Status = Success
Description = None
 
Controller Properties :
=====================
---------------------------------------------
Ctrl_Prop               Value                
---------------------------------------------
PR Mode                 Auto                 
PR Execution Delay      168 hours   
PR Next Start time      10/23/2025, 02:00:00 
[...]

Here, PR Mode is Auto and it runs every 168 hours (1 week), which is excellent.

Check the Consistency Check settings:

$ sudo perccli64 /c0 show cc
Controller = 0
Status = Success
Description = None
 
Controller Properties :
=====================
-----------------------------------------------
Ctrl_Prop                 Value                
-----------------------------------------------
CC Operation Mode         Sequential           
CC Execution Delay        168                  
CC Next Starttime         10/15/2025, 02:00:00 
CC Current State          Stopped 
[...]

Here, CC Operation Mode is Sequential (scheduled checks are enabled and virtual drives are checked one at a time) and it runs every 168 hours, which is also excellent.
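To read these settings from a script, a single property can be extracted from either table. The sketch below assumes the two-column Ctrl_Prop/Value layout shown in the show pr and show cc output above; ctrl_prop is a hypothetical helper name.

```shell
# Print the value of one named property from a "Ctrl_Prop  Value" table
# read on stdin. The name must match the start of the line exactly.
ctrl_prop() {
    awk -v name="$1" '
        index($0, name) == 1 {
            value = substr($0, length(name) + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding blanks
            print value
            exit
        }'
}

# Usage on a live system:
#   sudo perccli64 /c0 show pr | ctrl_prop "PR Mode"
#   sudo perccli64 /c0 show cc | ctrl_prop "CC Operation Mode"
```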

If your checks are set to Manual or Disabled, you must enable them. A good strategy is to run them weekly at different times.

NOTE: The starttime parameter sets the next time the task will run. The delay (e.g., 168 hours) makes it recur from that point onward.

Example: Schedule Consistency Check for every Saturday at 2 AM and Patrol Read for every Sunday at 2 AM.

To enable the automatic Consistency Check, set it to sequential mode. The command below sets it to run on a specific future date/time and repeat every 168 hours (7 days).

sudo perccli64 /c0 set cc=seq starttime=YYYY/MM/DD 02 delay=168

To enable the automatic Patrol Read, set its mode to auto. The command below sets it to run on a specific future date/time and repeat every 168 hours.

sudo perccli64 /c0 set patrolread mode=auto starttime=YYYY/MM/DD 02 delay=168
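The YYYY/MM/DD placeholder has to be replaced with a real future date. As a sketch, the helper below computes the next Saturday or Sunday and prints the two commands from this section with the date filled in; it does not run them. It assumes GNU date (the default on Ubuntu/Debian), and next_weekday is a hypothetical name.

```shell
# Print the next occurrence, strictly after the reference date, of an
# ISO weekday (1 = Monday ... 6 = Saturday, 7 = Sunday) as YYYY/MM/DD.
# The reference date (YYYY-MM-DD) defaults to today. Requires GNU date.
next_weekday() {
    target=$1
    ref=${2:-$(date +%Y-%m-%d)}
    dow=$(TZ=UTC date -d "$ref" +%u)
    ahead=$(( (target - dow + 7) % 7 ))
    if [ "$ahead" -eq 0 ]; then
        ahead=7   # same weekday as the reference: use next week
    fi
    TZ=UTC date -d "$ref +$ahead days" +%Y/%m/%d
}

# Print (do not run) the scheduling commands with concrete dates.
echo "sudo perccli64 /c0 set cc=seq starttime=$(next_weekday 6) 02 delay=168"
echo "sudo perccli64 /c0 set patrolread mode=auto starttime=$(next_weekday 7) 02 delay=168"
```

Printing the commands first makes it easy to review the dates before applying them to the controller.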

The BBU protects the data in the controller's write-cache in case of a power failure. If the BBU is dead or failing, the controller will disable Write-Back cache (WB), severely degrading write performance.

Check the BBU status regularly:

sudo perccli64 /c0/bbu show all

Look for the battery state. It should be Optimal. The battery will periodically run a “Learn Cycle” to recalibrate its capacity, during which the controller may temporarily fall back to Write-Through and write performance may drop. If the battery is marked as Failed or Degraded, it needs to be replaced.

  • Last modified: 2025/09/23 12:54
  • by fabricio