
Disk Health Check Script for Harvester OS

A comprehensive bash script for checking the health of various storage devices including SATA HDD/SSD, SAS HDD/SSD, NVMe drives, and RAID configurations. This script provides detailed SMART data analysis, lifespan estimation, and health status reporting.

Features

  • Multi-Device Support: Works with SATA HDD/SSD, SAS HDD/SSD, NVMe drives
  • RAID Controller Detection: Supports MegaRAID, 3ware, Areca, HP Smart Array, Adaptec, and software RAID (MDRAID)
  • Comprehensive Health Analysis:
    • SMART attribute reading and interpretation
    • Power-on hours tracking
    • Temperature monitoring
    • Reallocated and pending sector counts
  • Lifespan Estimation:
    • HDD lifespan based on usage and error metrics
    • SSD/NVMe TBW (Total Bytes Written) calculation and endurance estimation
    • Wear level analysis
  • Enterprise vs Consumer Classification: Automatically detects drive class and applies appropriate endurance standards
  • Auto-Detection: Automatically discovers all connected storage devices
  • Color-Coded Output: Easy-to-read color-coded health status

Requirements

  • smartmontools package must be installed
  • Root privileges (for accessing all disk information)
  • Bash shell
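
The requirements above can be verified before a run. The following is a minimal sketch, not taken from harvester-v3.8.sh; the function names `require_cmd`, `is_root`, and `preflight` are hypothetical.

```shell
# Hypothetical helpers; not part of harvester-v3.8.sh.

# Return 0 if the named command is on PATH, non-zero otherwise.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Return 0 when the effective user ID is 0 (root).
is_root() {
    [ "$(id -u)" -eq 0 ]
}

# Report each unmet requirement on stderr; return 1 if any is missing.
preflight() {
    local ok=0
    if ! require_cmd smartctl; then
        echo "smartctl is not installed (install smartmontools)" >&2
        ok=1
    fi
    if ! is_root; then
        echo "not running as root; some disk information may be unavailable" >&2
        ok=1
    fi
    return "$ok"
}
```

Calling `preflight || exit 1` at the top of a wrapper script stops early with a readable message instead of failing partway through a scan.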

Installation

  1. Ensure smartmontools is installed:
# Ubuntu/Debian
sudo apt-get install smartmontools

# CentOS/RHEL
sudo yum install smartmontools

# Harvester OS
# smartmontools should be available in the package manager
  2. Download the script:
wget https://raw.githubusercontent.com/yourusername/harvester-disk-health/main/harvester-v3.8.sh
chmod +x harvester-v3.8.sh

Usage

Basic Usage (Auto-detect all disks)

sudo ./harvester-v3.8.sh

Check Specific Disks

sudo ./harvester-v3.8.sh /dev/sda
sudo ./harvester-v3.8.sh /dev/nvme0n1
sudo ./harvester-v3.8.sh /dev/sda /dev/sdb /dev/nvme0n1

Help and Version Information

./harvester-v3.8.sh --help
./harvester-v3.8.sh --version

Output Interpretation

Health Status Colors

  • 🟢 GREEN: Healthy - No immediate concerns
  • 🟡 YELLOW: Warning - Monitor closely, some wear detected
  • 🔴 RED: Critical - Immediate attention required

Key Metrics Explained

For HDDs:

  • Power On Hours: Total operational time
  • Reallocated Sectors: Bad sectors that have been remapped
  • Pending Sectors: Sectors waiting to be remapped
  • Start/Stop Count: Mechanical cycle count
  • Load Cycle Count: Head load/unload cycles
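
For SATA drives, these metrics appear as rows in the attribute table printed by `smartctl -A`, with the raw value in the tenth column. The sketch below shows one way to pull a single raw value out of that table; the helper name `smart_raw_value` is hypothetical, and the script's own parsing may differ. Some drives append extra text to raw values (for example a power-on time with minutes), which this sketch does not normalize.

```shell
# Hypothetical helper; not part of harvester-v3.8.sh.
# Reads `smartctl -A` output on stdin and prints the raw value
# (column 10) of the first row whose attribute name matches $1.
# Prints nothing if the attribute is not present.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10; exit }'
}
```

Typical use would be `smartctl -A /dev/sda | smart_raw_value Reallocated_Sector_Ct`. Other standard attribute names include `Current_Pending_Sector`, `Power_On_Hours`, `Start_Stop_Count`, and `Load_Cycle_Count`.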

For SSDs/NVMe:

  • TBW Used: Total terabytes written to date
  • TBW Endurance: Manufacturer's estimated endurance
  • TBW Remaining: Estimated remaining write capacity
  • Media Wearout: SSD wear level indicator
  • Lifetime Used: Percentage of lifespan consumed
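
The arithmetic behind these metrics can be sketched as follows. NVMe drives report "Data Units Written", where one unit is 1000 blocks of 512 bytes (512,000 bytes). The function names are hypothetical and the script's actual calculation is not shown in this README; results are truncated to whole decimal terabytes.

```shell
# Hypothetical helpers; not part of harvester-v3.8.sh.

# Convert NVMe "Data Units Written" ($1) to whole terabytes written.
# One data unit = 1000 * 512 bytes; 1 TB = 10^12 bytes.
nvme_units_to_tb() {
    echo $(( $1 * 512000 / 1000000000000 ))
}

# TBW remaining: endurance ($2) minus used ($1), never below zero.
tbw_remaining_tb() {
    local r=$(( $2 - $1 ))
    if [ "$r" -lt 0 ]; then
        r=0
    fi
    echo "$r"
}

# Lifetime used: used ($1) as a whole percentage of endurance ($2).
lifetime_used_pct() {
    echo $(( $1 * 100 / $2 ))
}
```

For example, a drive that has written 150 TB against a 600 TB endurance rating has 450 TB remaining and 25% of its lifetime used.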

Supported Controllers & Protocols

  • Direct SATA/SAS: Drives attached directly through an AHCI (SATA) controller or a SAS HBA
  • NVMe: Native NVMe protocol
  • Hardware RAID:
    • MegaRAID (LSI/Broadcom)
    • 3ware
    • Areca
    • HP Smart Array
    • Adaptec
  • Software RAID: Linux MDRAID
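
Disks behind a hardware RAID controller are addressed in smartmontools through the `-d TYPE` option of `smartctl`. The sketch below maps a controller family to the corresponding `-d` argument. The helper name `smartctl_dev_type` and its short controller keywords are hypothetical, and whether the script builds its arguments this way is an assumption.

```shell
# Hypothetical helper; not part of harvester-v3.8.sh.
# $1 = controller family, $2 = disk address on that controller.
# Prints the value to pass to `smartctl -d`; returns 1 if unknown.
smartctl_dev_type() {
    case "$1" in
        megaraid) echo "megaraid,$2" ;;  # MegaRAID: disk number N
        3ware)    echo "3ware,$2" ;;     # 3ware: port number N
        areca)    echo "areca,$2" ;;     # Areca: disk number N
        hp)       echo "cciss,$2" ;;     # HP Smart Array: disk number N
        adaptec)  echo "aacraid,$2" ;;   # Adaptec: $2 given as H,L,ID
        *)        return 1 ;;
    esac
}
```

A manual query would then look like `smartctl -a -d "$(smartctl_dev_type megaraid 0)" /dev/sda`. Running `smartctl --scan` lists the devices and types that smartmontools detects on its own. MDRAID member disks are ordinary block devices and can be queried directly.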

Drive Classification

The script automatically classifies drives as Enterprise or Consumer based on:

  • Model name patterns (PRO, EP, DC, ENT, ENTERPRISE)
  • Interface type (SAS drives are always classified as Enterprise)
  • SMART feature detection
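
The first two rules above can be sketched as a small function. This is an illustration under assumptions, not the script's code: the name `classify_drive` is hypothetical, patterns are matched as whole words in the model string (the README does not say how the script matches them), and SMART feature detection is left out.

```shell
# Hypothetical helper; not part of harvester-v3.8.sh.
# $1 = model string, $2 = interface (e.g. SATA, SAS, NVMe).
# Prints "enterprise" or "consumer".
classify_drive() {
    local model iface token
    model=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    iface=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')

    # Rule: SAS drives are always treated as enterprise.
    if [ "$iface" = "SAS" ]; then
        echo enterprise
        return 0
    fi

    # Rule: model name patterns, compared word by word
    # (non-alphanumeric characters are treated as separators).
    for token in $(printf '%s' "$model" | tr -c 'A-Z0-9' ' '); do
        case "$token" in
            PRO|EP|DC|ENT|ENTERPRISE)
                echo enterprise
                return 0
                ;;
        esac
    done

    echo consumer
}
```

Matching whole words avoids false positives from substrings, such as "EP" inside a longer model code, at the cost of missing patterns embedded in a part number.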

TBW Endurance Standards

Consumer SSDs

Capacity    Minimum TBW
250 GB      150 TB
500 GB      300 TB
1 TB        600 TB
2 TB        1200 TB
4 TB        2400 TB
8 TB        4800 TB

Enterprise SSDs

Capacity    Minimum TBW
250 GB      450 TB
500 GB      900 TB
1 TB        1800 TB
2 TB        3600 TB
4 TB        7200 TB
8 TB        14400 TB

Note: Actual TBW endurance may vary depending on the specific drive model and manufacturer. SAS SSDs often do not expose write statistics through SMART, so TBW information may not be available for these drives.
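
Both tables scale linearly with capacity: 0.6 TB written per GB for consumer drives and 1.8 TB written per GB for enterprise drives, with 1 TB counted as 1000 GB. A minimal sketch of that rule follows; the function name `min_tbw_tb` is hypothetical, and the script itself may use a lookup table instead.

```shell
# Hypothetical helper; not part of harvester-v3.8.sh.
# $1 = capacity in GB (1 TB = 1000 GB), $2 = consumer | enterprise.
# Prints the minimum TBW in TB; returns 1 for an unknown class.
min_tbw_tb() {
    case "$2" in
        consumer)   echo $(( $1 * 6 / 10 )) ;;   # 0.6 TB per GB
        enterprise) echo $(( $1 * 18 / 10 )) ;;  # 1.8 TB per GB
        *)          return 1 ;;
    esac
}
```

For example, `min_tbw_tb 500 consumer` prints 300 and `min_tbw_tb 4000 enterprise` prints 7200, matching the table rows.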

Known Model Database

The script includes a database of known drive models with their capacities for accurate reporting. Currently supported models include:

  • Seagate: ST91000640NS, ST2000NM0033, ST4000NM0033, etc.
  • Hitachi: HUC101212CSS600, HUC103012CSS600, etc.
  • Samsung: MZILT series
  • And many more...

Troubleshooting

"Cannot read disk information" Error

  • Ensure the disk is not offline or in standby
  • Try running with root privileges
  • For RAID controllers, specify the correct controller type

"smartctl is not installed" Error

sudo apt-get install smartmontools  # Ubuntu/Debian
sudo yum install smartmontools      # CentOS/RHEL

SAS Drive Detection Issues

SAS drives may require specific controller parameters. The script attempts auto-detection, but manual controller specification may be needed for some configurations.

Version History

  • v3.8: Enhanced SAS SSD support, improved TBW calculations, added enterprise classification
  • v3.7: Added NVMe support, improved RAID detection
  • v3.6: Initial release with basic SATA/SAS support

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for:

  • New drive model additions
  • Additional RAID controller support
  • Bug fixes and improvements

License

This script is provided as-is under the MIT License.

Author

Adam T. Lau - Creator and maintainer

Disclaimer

This script is provided for informational purposes only. Always maintain proper backups and consult drive manufacturer documentation for specific health monitoring recommendations. The lifespan estimates are approximations based on standard industry metrics and actual drive performance may vary.
