How to know if your HDD has bad sectors or is in poor health?

Not long ago I talked to you about how to measure the performance of an HDD in Linux. It is logical that if writing is very slow (800 KB/s or something like that) the HDD definitely has a problem, but this is not the only way to find out.

SMART

What actually is SMART? Well, according to Wikipedia:

S.M.A.R.T. technology, an acronym for Self-Monitoring, Analysis and Reporting Technology, consists of the ability to detect hard disk failures. Early detection of surface failures allows the user to make a copy of the disk's contents, or replace the disk, before irrecoverable data loss occurs.

In other words, years ago we only found out that an HDD had problems when it stopped working, when it was too late and we had already lost information. Today, fortunately, we do not need to go that far: we can know when the disk begins to fail and save the information in time.

How to work with SMART on Linux?

Those of us who use Linux have a perfect tool for the terminal: smartmontools.

To install it on Arch Linux:

sudo pacman -S smartmontools

In distros like Debian, Ubuntu or derivatives:

sudo apt-get install smartmontools

Once it is installed, we must verify whether SMART is enabled on the HDD:

sudo smartctl -i /dev/sda

This checks the main or first HDD, that is, /dev/sda. If you have another HDD that you also want to verify, run the command again with sdb instead of sda.
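If you have several drives, a small loop saves retyping the command. This is only a sketch: check_drives is a made-up helper name, and the SMARTCTL variable is a hypothetical override that I added so the function can be tried without root.

```shell
#!/bin/sh
# check_drives: run "smartctl -i" once for every device given as an argument.
# SMARTCTL is a hypothetical override (not part of smartmontools); when it is
# unset, the function falls back to "sudo smartctl".
check_drives() {
    for dev in "$@"; do
        echo "== $dev =="
        # Left unquoted on purpose so "sudo smartctl" splits into two words.
        ${SMARTCTL:-sudo smartctl} -i "$dev"
    done
}

# Example (needs root and real drives):
#   check_drives /dev/sda /dev/sdb
```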

Near the end of the output you should see a line like this:

SMART support is: Enabled

This means that it is enabled.

If Enabled does NOT appear, that is, SMART is not enabled, you can enable it like this:

sudo smartctl -s on -d ata /dev/sda
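The check and the enable step can also be combined. The sketch below assumes that smartctl -i prints a line reading "SMART support is: Enabled" when SMART is on; smart_is_enabled and ensure_smart are hypothetical helper names, not part of smartmontools.

```shell
#!/bin/sh
# smart_is_enabled: reads "smartctl -i" output on stdin and succeeds only
# if it contains the "SMART support is: Enabled" line.
smart_is_enabled() {
    grep -Eq 'SMART support is:[[:space:]]*Enabled'
}

# ensure_smart (hypothetical helper): enable SMART on a device only when
# the info output does not already report it as enabled.
ensure_smart() {
    dev="$1"
    if sudo smartctl -i "$dev" | smart_is_enabled; then
        echo "SMART is already enabled on $dev"
    else
        sudo smartctl -s on "$dev"
    fi
}

# Example (needs root and a real drive):
#   ensure_smart /dev/sda
```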

How to check HDD health with data from SMART?

The idea is to run tests on the HDD (one short and one long) and then check the error log, so we will know if it has errors, what they are, and whether we should hurry to save the data.

To run a short test (it takes about 1 minute):

sudo smartctl -t short /dev/sda

To run the long test:

sudo smartctl -t long /dev/sda

I recommend checking the error log after each test. The tests run in the background on the drive itself, so wait for each one to finish first. To view the log:

sudo smartctl -l error /dev/sda

If the hard drive is completely healthy, you will see this message in the output:

No Errors Logged
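The test-then-check routine can be scripted by looking for the "No Errors Logged" line that smartctl prints when the error log is clean. This is a minimal sketch: the helper names are made up, and the 120-second default wait is my assumption for a short test, so adjust it to the estimate that smartctl prints when the test starts.

```shell
#!/bin/sh
# error_log_is_clean: reads "smartctl -l error" output on stdin and succeeds
# only if it contains the "No Errors Logged" line.
error_log_is_clean() {
    grep -q 'No Errors Logged'
}

# short_test_and_check (hypothetical helper): start a short self-test, wait,
# then report on the error log. The default wait of 120 seconds is an
# assumption; pass a second argument to change it.
short_test_and_check() {
    dev="$1"
    wait_seconds="${2:-120}"
    sudo smartctl -t short "$dev"
    sleep "$wait_seconds"
    if sudo smartctl -l error "$dev" | error_log_is_clean; then
        echo "$dev: no errors logged"
    else
        echo "$dev: errors found, back up your data"
        return 1
    fi
}

# Example (needs root and a real drive):
#   short_test_and_check /dev/sda
```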

How would it look if the HDD has problems?

If the hard disk has problems, then running the above command will produce output similar to this:

smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.18.5-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   044   033   045    Old_age   Always   FAILING_NOW 56 (96 110 58 25)
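Notice that the overall result in this example says PASSED even though one attribute is marked FAILING_NOW, so it is worth looking at the WHEN_FAILED column as well. A small sketch for that follows. It assumes the usual ten-column attribute layout, where WHEN_FAILED is the ninth column and shows "-" for attributes that never failed; failed_attrs is a hypothetical helper name.

```shell
#!/bin/sh
# failed_attrs: reads smartctl attribute lines on stdin and prints the ID,
# name and WHEN_FAILED status of every attribute whose WHEN_FAILED column
# (field 9) is anything other than "-".
failed_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $9 != "-" { print $1, $2, $9 }'
}

# Example (needs root and a real drive):
#   sudo smartctl --attributes /dev/sda | failed_attrs
```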

For more details you can use this other command:

sudo smartctl --attributes --log=selftest /dev/sda

This shows output similar to the following. I say similar and not identical because, obviously, two hard drives are unlikely to fail in exactly the same way, hehe:

smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.18.5-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   098   092   006    Pre-fail  Always       -       238320363
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       587
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       9
  7 Seek_Error_Rate         0x000f   077   060   030    Pre-fail  Always       -       51672328
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4805
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       586
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       417
188 Unknown_Attribute       0x0032   100   099   000    Old_age   Always       -       4295032833
189 High_Fly_Writes         0x003a   094   094   000    Old_age   Always       -       6
190 Airflow_Temperature_Cel 0x0022   044   033   045    Old_age   Always   FAILING_NOW 56 (96 122 58 25)
194 Temperature_Celsius     0x0022   056   067   000    Old_age   Always       -       56 (0 23 0 0)
195 Hardware_ECC_Recovered  0x001a   043   026   000    Old_age   Always       -       238320363
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       49
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       49
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       172082159686339
241 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -       2155546016
242 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -       3048586928

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      4789         1746972641
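Since this article is about bad sectors, the attributes most worth watching in a table like this are Reallocated_Sector_Ct (ID 5), Current_Pending_Sector (ID 197) and Offline_Uncorrectable (ID 198). Here is a rough sketch that pulls out just those when their raw value is above zero. It assumes the raw value is the tenth column, and bad_sector_attrs is a hypothetical helper name.

```shell
#!/bin/sh
# bad_sector_attrs: reads "smartctl --attributes" output on stdin and prints
# NAME=RAW_VALUE for the sector-related attributes (IDs 5, 197 and 198)
# whose raw value (field 10) is greater than zero.
bad_sector_attrs() {
    awk '$1 ~ /^(5|197|198)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Example (needs root and a real drive):
#   sudo smartctl --attributes /dev/sda | bad_sector_attrs
```

If the function prints nothing, none of the three counters has moved off zero.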

If you want even more information, this command shows the complete output, almost a detailed debug dump:

sudo smartctl -d ata -a /dev/sda

The end!

Well, that was all... another article about HDDs 😉

DesdeLinux