====== Guide to Managing a Dell PERC H710P RAID Controller ======
This guide provides instructions on how to check, manage, and maintain a Dell PERC H710P RAID controller on a Linux system (specifically Ubuntu/Debian).
===== Overview =====
The Dell PowerEdge RAID Controller (PERC) H710P is a hardware RAID controller that manages physical disks and presents them to the operating system as logical volumes, or "Virtual Drives". Proper management is critical for:
* **Data Integrity:** Ensuring your data is not corrupted.
* **Performance:** Optimizing read and write speeds.
* **Redundancy:** Protecting against data loss from a single disk failure (depending on RAID level).
We will use the ''perccli'' command-line tool for all management tasks.
**NOTE:** Initial creation of a RAID array (Virtual Drive) is typically performed in the controller's configuration utility, accessible via a key press (e.g., Ctrl+R) during system boot. This guide focuses on management and verification from within the running operating system.
===== 1. Initial Setup: The perccli Tool =====
The official Dell tool, ''perccli'', is distributed as an ''.rpm'' package. On Debian-based systems like Ubuntu, you must first convert it to a ''.deb'' package using ''alien''.
**NOTE:** You will need to download the ''perccli'' tool from the [[https://www.dell.com/support/home/de-ch/drivers/driversdetails?driverid=wd0r5|Dell support website]]. Search for your server model or "PERC H710P" and select the Linux download. The ''.rpm'' is usually packaged inside a ''.tar.gz'' archive.
First, install ''alien'' if you don't have it:
sudo apt-get update
sudo apt-get install alien
After downloading and extracting the ''perccli'' archive from Dell, change into its ''Linux'' subdirectory.
Convert the ''.rpm'' package to a ''.deb'' package:
sudo alien -k perccli-007.0127.0000.0000-1.noarch.rpm
Install the newly generated ''.deb'' package:
sudo dpkg -i perccli_007.0127.0000.0000-1_all.deb
The binary is installed in ''/opt/MegaRAID/perccli/''. To make it easier to use, create a symbolic link into a standard path:
sudo ln -s /opt/MegaRAID/perccli/perccli64 /usr/sbin/perccli64
If your shell still reports the command as not found, clear its cached command lookups:
hash -r
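As a quick sanity check after installation, the sketch below looks for an executable ''perccli64'' in the locations used in this guide. The function name ''find_perccli'' is hypothetical, and any extra paths you pass as arguments are checked first.

```shell
# Print the first usable perccli64 binary, or fail if none is found.
# Optional arguments are extra candidate paths, checked before the defaults.
find_perccli() {
    for candidate in "$@" \
                     "$(command -v perccli64 2>/dev/null)" \
                     /usr/sbin/perccli64 \
                     /opt/MegaRAID/perccli/perccli64; do
        if [ -n "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "perccli64 not found; re-check the dpkg and ln steps" >&2
    return 1
}
```

Calling ''find_perccli'' with no arguments prints the path that will be used, or returns a non-zero status if the install or symlink step did not succeed.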
===== 2. Identifying the Controller =====
First, confirm the system sees the controller. The ''lsscsi'' command is useful for this.
$ lsscsi -g
[0:2:0:0] disk DELL PERC H710P 3.13 /dev/sda /dev/sg0
[1:0:0:0] cd/dvd PLDS DVD+-RW DS-8A8SH KD11 /dev/sr0 /dev/sg1
[2:0:0:0] disk ATA Samsung SSD 850 2B6Q /dev/sdb /dev/sg2
Next, use ''perccli'' to get a system-wide overview. This will show you the controller's index number (usually 0). We will refer to this as ''/c0'' in all subsequent commands.
$ sudo perccli64 show all
Status Code = 0
Status = Success
Description = None
Number of Controllers = 1
Host Name = cmpt3
Operating System = Linux5.15.0-153-generic
System Overview :
===============
----------------------------------------------------------------------------
Ctl Model            Ports PDs DGs DNOpt VDs VNOpt BBU sPR DS EHS ASOs Hlth
----------------------------------------------------------------------------
0   PERCH710PAdapter 8     4   1   0     1   0     Opt On  3  N   0    Opt
----------------------------------------------------------------------------
From the output above, we can see:
* **Ctl (Controller Index):** ''0''. This is our ''/c0''.
* **PDs (Physical Drives):** ''4'' drives are attached.
* **VDs (Virtual Drives):** ''1'' virtual drive is configured.
* **BBU (Battery Backup Unit):** ''Opt'' (Optimal). This is good!
* **Hlth (Health):** ''Opt'' (Optimal). The overall controller health is good.
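The same reading of the overview table can be scripted. The following is a minimal sketch, not an official interface: the function name ''overview_health'' is hypothetical, and it assumes the text layout shown above, including a model name without spaces. It finds the ''BBU'' and ''Hlth'' columns by their header names and prints them for each controller row.

```shell
# Read "perccli64 show all" output on stdin and print one summary line per controller.
overview_health() {
    awk '
        # Header row: remember the position of every column by name.
        $1 == "Ctl" && $2 == "Model" {
            for (i = 1; i <= NF; i++) col[$i] = i
            in_table = 1
            next
        }
        # A rule line after at least one data row closes the table.
        in_table && /^[ \t]*-+[ \t]*$/ { if (rows) in_table = 0; next }
        # Data rows start with the numeric controller index.
        in_table && $1 ~ /^[0-9]+$/ {
            rows++
            printf "c%s BBU=%s Hlth=%s\n", $1, $(col["BBU"]), $(col["Hlth"])
        }
    '
}
```

Typical use would be ''sudo perccli64 show all | overview_health''. Anything other than ''Opt'' in either field is worth a closer look.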
===== 3. Checking an Existing Setup =====
If you are inheriting a server or just want to check the health of an existing array, the main command is ''show''.
sudo perccli64 /c0 show
This command provides a lot of information. The most important sections are ''TOPOLOGY'' and ''VD LIST''.
[...]
TOPOLOGY :
========
---------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type  State BT Size     PDC  PI SED DS3  FSpace TR
---------------------------------------------------------------------------
0  -   -   -        -   RAID5 Optl  N  8.185 TB dflt N  N   dflt N      N
0  0   -   -        -   RAID5 Optl  N  8.185 TB dflt N  N   dflt N      N
0  0   0   :4       4   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N
0  0   1   :5       5   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N
0  0   2   :7       7   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N
0  0   3   :6       6   DRIVE Onln  Y  2.728 TB dflt N  N   dflt -      N
---------------------------------------------------------------------------
[...]
Virtual Drives = 1
VD LIST :
=======
-------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC Size     Name
-------------------------------------------------------------
0/0   RAID5 Optl  RW     Yes     RWBD  -   ON  8.185 TB
-------------------------------------------------------------
Key things to check:
* **DRIVE State:** In the ''TOPOLOGY'' section, all drives should be ''Onln'' (Online). Any other state (e.g., ''Failed'', ''Rbld'') requires attention.
* **VD State:** In the ''VD LIST'', the state should be ''Optl'' (Optimal). A ''Dgrd'' (Degraded) state means at least one drive has failed; on RAID 5 the array is then running without redundancy.
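* **Cache:** ''RWBD'' decodes as Read Ahead (''R''), Write-Back (''WB''), and Direct IO (''D''). Write-Back gives the best write performance. If the column shows ''WT'' (Write-Through) in place of ''WB'' (e.g., ''RWTD''), it may indicate a problem with the BBU.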
* **sCC:** ''ON'' means scheduled consistency checks are enabled, which is good practice.
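The drive and virtual drive checks from this list lend themselves to a small filter. This is a sketch under the assumption that the output keeps the column order shown above (''Type'' and ''State'' as the 6th and 7th fields of a ''TOPOLOGY'' row, ''State'' as the 3rd field of a ''VD LIST'' row). The function name ''raid_problems'' is hypothetical.

```shell
# Read "perccli64 /c0 show" output on stdin.
# Prints every drive that is not Onln and every virtual drive that is not Optl;
# the exit status is 1 if anything was printed, 0 otherwise.
raid_problems() {
    awk '
        # TOPOLOGY rows describing physical drives.
        $6 == "DRIVE" && $7 != "Onln" {
            print "drive " $4 " state " $7
            bad = 1
        }
        # VD LIST rows start with DG/VD, for example 0/0.
        $1 ~ /^[0-9]+\/[0-9]+$/ && $3 != "Optl" {
            print "vd " $1 " state " $3
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Run as ''sudo perccli64 /c0 show | raid_problems'', it stays silent on a healthy array, which makes it convenient to call from a cron job that only sends mail when there is output.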
===== 4. Essential Maintenance: Automatic Health Checks =====
RAID controllers can proactively check for and fix issues. The two most important automated tasks are the **Patrol Read** and the **Consistency Check**.
* **Patrol Read (PR):** Scans the physical disks for bad blocks and remaps them //before// they cause an error during a read operation. This reduces the risk of hitting an unreadable sector on a surviving disk during a critical array rebuild.
* **Consistency Check (CC):** Verifies the RAID parity data. It reads stripes and checks if the parity matches the data, correcting any errors it finds.
These tasks should be enabled and scheduled to run automatically during periods of low server activity (e.g., weekends, overnight).
==== 4.1. Checking Current Settings ====
Check the Patrol Read settings:
$ sudo perccli64 /c0 show pr
Controller = 0
Status = Success
Description = None
Controller Properties :
=====================
---------------------------------------------
Ctrl_Prop          Value
---------------------------------------------
PR Mode            Auto
PR Execution Delay 168 hours
PR Next Start time 10/23/2025, 02:00:00
[...]
Here, ''PR Mode'' is ''Auto'' and it runs every ''168 hours'' (1 week), which is excellent.
Check the Consistency Check settings:
$ sudo perccli64 /c0 show cc
Controller = 0
Status = Success
Description = None
Controller Properties :
=====================
-----------------------------------------------
Ctrl_Prop          Value
-----------------------------------------------
CC Operation Mode  Sequential
CC Execution Delay 168
CC Next Starttime  10/15/2025, 02:00:00
CC Current State   Stopped
[...]
Here, ''CC Operation Mode'' is ''Sequential'' (scheduled checks are enabled and process one virtual drive at a time) and it runs every ''168'' hours, which is also excellent.
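To read a single setting out of these property tables from a script, something like the helper below may be enough. It is a sketch that assumes the two-column ''Ctrl_Prop'' / ''Value'' layout shown above. The name ''prop_value'' is hypothetical.

```shell
# Read a perccli property table on stdin and print the value of the named property.
# Example property names: "PR Mode", "CC Operation Mode".
prop_value() {
    awk -v name="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # ignore any indentation
        }
        index(line, name) == 1 {                # row starts with the property name
            value = substr(line, length(name) + 1)
            sub(/^[ \t]+/, "", value)           # drop the column padding
            print value
            exit
        }
    '
}
```

For example, ''sudo perccli64 /c0 show pr | prop_value "PR Mode"'' should print ''Auto'' on the system shown here, and a script can compare that result against the expected mode.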
==== 4.2. Enabling and Scheduling Checks (Best Practice) ====
If your checks are set to ''Manual'' or ''Disabled'', you **must** enable them. A good strategy is to run them weekly at different times.
**NOTE:** The ''starttime'' parameter sets the **next** time the task will run. The ''delay'' (e.g., 168 hours) makes it recur from that point onward.
**Example:** Schedule Consistency Check for every Saturday at 2 AM and Patrol Read for every Sunday at 2 AM.
To enable the automatic **Consistency Check**, set it to sequential mode. The command below sets it to run on a specific future date/time and repeat every 168 hours (7 days).
sudo perccli64 /c0 set cc=seq starttime=YYYY/MM/DD 02 delay=168
To enable the automatic **Patrol Read**, set its mode to auto. The command below sets it to run on a specific future date/time and repeat every 168 hours.
sudo perccli64 /c0 set patrolread mode=auto starttime=YYYY/MM/DD 02 delay=168
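Filling in the ''YYYY/MM/DD'' placeholder by hand is easy to get wrong, so the date can be computed instead. The sketch below assumes GNU ''date'' (the default on Ubuntu/Debian) and uses a hypothetical helper named ''next_start''. It only prints the two commands from this section for review; it does not run them.

```shell
# Print a starttime value ("YYYY/MM/DD HH") for the next occurrence of a weekday.
# With GNU date, "next <weekday>" always resolves to a date 1 to 7 days ahead.
next_start() {
    day=$1     # weekday name, e.g. saturday
    hour=$2    # two-digit hour, e.g. 02
    printf '%s %s\n' "$(date -d "next $day" +%Y/%m/%d)" "$hour"
}

cc_start=$(next_start saturday 02)
pr_start=$(next_start sunday 02)

# Dry run: show the commands with the computed dates filled in.
echo "sudo perccli64 /c0 set cc=seq starttime=$cc_start delay=168"
echo "sudo perccli64 /c0 set patrolread mode=auto starttime=$pr_start delay=168"
```

Review the printed lines, then paste them into the shell to apply the schedule.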
===== 5. Essential Maintenance: Battery Backup Unit (BBU) =====
The BBU protects the data in the controller's write-cache in case of a power failure. If the BBU is dead or failing, the controller will disable Write-Back cache (''WB'') and fall back to Write-Through (''WT''), severely degrading write performance.
Check the BBU status regularly:
sudo perccli64 /c0/bbu show all
Look for the ''State''. It should be ''Optimal''. The battery will periodically run a "Learn Cycle" to recalibrate its capacity, during which performance may be temporarily reduced. If the battery is marked as ''Failed'' or ''Degraded'', it needs to be replaced.
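For unattended monitoring, the state check can be wrapped in a small test. This sketch assumes the report contains a row named ''Battery State'' with the state as its last word; that row name is an assumption about the output format, so compare it with what your controller actually prints and adjust the pattern if needed. The function name ''bbu_is_optimal'' is hypothetical.

```shell
# Read "perccli64 /c0/bbu show all" output on stdin.
# Succeeds only if a "Battery State" row exists and its value is Optimal.
# NOTE: the row name "Battery State" is an assumption about the output format.
bbu_is_optimal() {
    awk '
        /Battery State/ {
            found = 1
            if ($NF != "Optimal") bad = 1
        }
        END { exit ((found && !bad) ? 0 : 1) }
    '
}
```

A possible use is ''sudo perccli64 /c0/bbu show all | bbu_is_optimal || echo "BBU needs attention"''. A missing row is treated as a failure so that a changed output format does not pass silently.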