Anthony,

Thanks.  Yes to all.  We see the same thing on every host server, identical to the output below.  Note that smartctl appears to work normally from the command line inside the osd.1 shell.

On the baremetal:

# apt list smartmontools
Listing... Done
smartmontools/noble,now 7.4-2build1 amd64 [installed]
root@noc3

And then in the cephadm shell:

# cephadm shell --name osd.1

root@noc3:/# dnf list smartmontools

Installed Packages
smartmontools.x86_64  1:7.2-9.el9                                                                                                                     @System
root@noc3:/# ls -l /dev/sd?
brw-rw---- 1 root disk 8,   0 May 30 14:03 /dev/sda
brw-rw---- 1 root disk 8,  16 May 30 14:04 /dev/sdb
brw-rw---- 1 root disk 8,  32 May 30 14:04 /dev/sdc
brw-rw---- 1 root disk 8,  48 May 30 14:04 /dev/sdd
brw-rw---- 1 root disk 8,  64 May 30 14:04 /dev/sde
brw-rw---- 1 root disk 8,  80 May 30 14:04 /dev/sdf
brw-rw---- 1 root disk 8,  96 May 30 14:04 /dev/sdg
brw-rw---- 1 root disk 8, 112 May 30 14:04 /dev/sdh


root@noc3:/# smartctl -j /dev/sdg
{
 "json_format_version": [
   1,
   0
 ],
 "smartctl": {
   "version": [
     7,
     2
   ],
   "svn_revision": "5155",
   "platform_info": "x86_64-linux-6.8.0-60-generic",
   "build_info": "(local build)",
   "argv": [
     "smartctl",
     "-j",
     "/dev/sdg"
   ],
   "exit_status": 0
 },
 "device": {
   "name": "/dev/sdg",
   "info_name": "/dev/sdg [SAT]",
   "type": "sat",
   "protocol": "ATA"
 }
}
root@noc3:/#


So, it's a puzzle.
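
For anyone wanting to repeat the check on other hosts, here is a minimal sketch that reads `smartctl -j` output and tests it against the 7.0 minimum quoted in the dashboard warning. The function name `check_smartctl_json` and the trimmed `SAMPLE` string are illustrative only. It assumes the `smartctl.version` and `smartctl.exit_status` fields are present, as they are in the output above. It does not reproduce whatever invocation the Ceph daemons use, so a pass here only rules out the version explanation.

```python
import json

# Minimum smartmontools version quoted in the dashboard warning.
MIN_VERSION = (7, 0)


def check_smartctl_json(text):
    """Parse `smartctl -j` output and return (version, exit_status, ok).

    ok is True when the reported version is at least MIN_VERSION and
    smartctl itself exited with status 0.
    """
    data = json.loads(text)
    info = data["smartctl"]
    version = tuple(info["version"])
    exit_status = info["exit_status"]
    ok = version >= MIN_VERSION and exit_status == 0
    return version, exit_status, ok


# Trimmed copy of the /dev/sdg output shown above (illustrative only).
SAMPLE = """
{
  "json_format_version": [1, 0],
  "smartctl": {
    "version": [7, 2],
    "svn_revision": "5155",
    "argv": ["smartctl", "-j", "/dev/sdg"],
    "exit_status": 0
  },
  "device": {"name": "/dev/sdg", "type": "sat", "protocol": "ATA"}
}
"""

if __name__ == "__main__":
    print(check_smartctl_json(SAMPLE))
```

On the sample this reports version (7, 2), exit status 0, and a passing check, which matches what the shell shows and is why the dashboard warning is puzzling.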



On 5/30/25 19:12, Anthony D'Atri wrote:
Do you have 7.0+?  That’s when JSON output was added for Ceph.  Are your drives 
natively visible to the kernel, not hidden behind a RAID HBA?

On May 30, 2025, at 6:53 PM, Harry G Coin <hgc...@gmail.com> wrote:

Using 19.2.2, we notice under cluster/osds/'device health' on the dashboard, 
for all osds no matter the server:

     Warning
Smartctl has received an unknown argument (error code -22). You may be using an 
incompatible version of smartmontools. Version >= 7.0 of smartmontools is 
required to successfully retrieve data.

That error code resolves to 'unknown attribute' in the smartctl docs.  However, 
the same result occurs whether the drive is HGST, Seagate, or Western Digital.

"State of Health" is always "Stale"

"Life Expectancy" is always "n/a> 6 weeks"

Of course the 'diskprediction_local' module has been broken for over a year, as 
it requires a no-longer-distributed revision of a sub-package, but that shouldn't 
stop the smartctl command from operating normally.

Any ideas?

Thanks!

Harry Coin


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io