Subject: Re: errors using new Samsung 870 EVO SSD
From: Meyser (at) *nospam* xenet.de (Matthias Meyser)
Newsgroups: comp.unix.bsd.freebsd.misc
Date: 18. Sep 2024, 12:16:15
Organization: XeNET GmbH, 38678 Clausthal-Zellerfeld
Message-ID: <vcecpf$26jq$1@nntp.serx01.xenet.de>
References: 1
User-Agent: Mozilla Thunderbird
If not already done, install and enable the "cpu-microcode" pkg.
Just a try.
With best regards
Matthias Meyser
On 21.08.2024 at 10:47, Winston wrote:
This is the first time I've used a solid state drive. It looks like
there's some kind of compatibility or interface problem. The output
from smartctl -x also points to some kind of interface problem.
ZFS counted about 180 write errors while resilvering ~80GB. Most seemed
to be retryable and succeeded on the second try (see logs below).
The system is using AMD-AHCI, not IDE.
The SATA interface is running at 3.0Gb/s, half the SSD's 6.0Gb/s speed.
Temperature is fine (~29C).
So far, the errors only occur during heavy activity: write errors during
resilvering, and 2 read errors later during a brief burst of read
activity.
[Note: I swapped the SATA cables: that's why ada1 during resilvering
became ada0 later. Since I swapped cables at the drive end, not the
motherboard end, I think it unlikely to be a cable/bad connection problem.]
Any ideas what the problem might be? Thanks,
-WBE
----------
[read error log entries:] [mildly edited]
Aug 21 03:01:24: (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 78 ff 64 40 13 00 00 00 00 00
Aug 21 03:01:24: (ada0:ahcich0:0:0:0): CAM status: Auto-Sense Retrieval Failed
Aug 21 03:01:24: (ada0:ahcich0:0:0:0): Error 5, Unretryable error
Aug 21 03:01:25: ahcich0: Timeout on slot 9 port 0
Aug 21 03:01:25: ahcich0: is 04000000 cs 00000200 ss 00000000 rs 00000200 tfd 451 serr 00400000 cmd 0000e917
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 58 00 65 40 13 00 00 00 00 00
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): CAM status: Auto-Sense Retrieval Failed
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): Error 5, Unretryable error
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 c0 36 65 40 13 00 00 00 00 00
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): ATA status: 00 ()
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): RES: 00 00 00 00 00 00 00 00 00 00 00
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 c8 ff 64 40 13 00 00 00 00 00
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): ATA status: 00 ()
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): RES: 00 00 00 00 00 00 00 00 00 00 00
Aug 21 03:01:25: (ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain
Aug 21 03:01:25 crystal ZFS[1332]: vdev I/O failure, zpool=zp path=/dev/ada0p3 offset=149417648128 size=4096 error=5
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 38 88 e4 80 40 13 00 00 00 00 00
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 10 48 29 40 05 00 00 00 00 00
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 c0 e4 80 40 13 00 00 00 00 00
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 21 03:01:26: (ada0:ahcich0:0:0:0): Retrying command, 3 more tries remain.
[end of read error log entries]
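For anyone who wants to read the ACB bytes in the entries above, here is a minimal sketch of decoding them. It assumes the byte order FreeBSD's CAM layer uses when it prints an ATA command (command, features, LBA low/mid/high, device, the three LBA "exp" bytes, features exp, sector count, sector count exp) and 512-byte logical sectors. For NCQ commands the transfer length travels in the features bytes. The name decode_fpdma_acb is made up for this example.

```python
SECTOR_BYTES = 512  # assumption: the drive reports 512-byte logical sectors

# ATA opcodes for the two NCQ commands that appear in the log.
NCQ_COMMANDS = {0x60: "READ_FPDMA_QUEUED", 0x61: "WRITE_FPDMA_QUEUED"}


def decode_fpdma_acb(acb: str) -> dict:
    """Decode the 12 hex bytes printed after 'ACB:' for an NCQ read/write."""
    b = [int(x, 16) for x in acb.split()]
    if len(b) != 12:
        raise ValueError("expected 12 hex bytes, got %d" % len(b))
    (command, features, lba_low, lba_mid, lba_high, _device,
     lba_low_exp, lba_mid_exp, lba_high_exp,
     features_exp, _sector_count, _sector_count_exp) = b
    if command not in NCQ_COMMANDS:
        raise ValueError("not an FPDMA (NCQ) read/write command")
    # 48-bit LBA: three low bytes first, then the three "exp" bytes.
    lba = (lba_low | lba_mid << 8 | lba_high << 16
           | lba_low_exp << 24 | lba_mid_exp << 32 | lba_high_exp << 40)
    # NCQ carries the sector count in the features bytes; 0 means 65536.
    sectors = (features | features_exp << 8) or 65536
    return {
        "command": NCQ_COMMANDS[command],
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_BYTES,
    }


if __name__ == "__main__":
    # Second READ_FPDMA_QUEUED entry from the log above.
    print(decode_fpdma_acb("60 08 58 00 65 40 13 00 00 00 00 00"))
```

Under those assumptions the second read above decodes to 8 sectors, i.e. 4096 bytes, which agrees with the size=4096 in the ZFS vdev I/O failure line.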
----------
[typical write errors during resilvering:] [mildly edited]
Aug 21 00:33:01: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 e0 b2 e7 40 02 00 00 00 00 00
Aug 21 00:33:01: (ada1:ahcich1:0:0:0): CAM status: Auto-Sense Retrieval Failed
Aug 21 00:33:01: (ada1:ahcich1:0:0:0): Error 5, Unretryable error
Aug 21 00:33:02: ahcich1: Timeout on slot 19 port 0
Aug 21 00:33:02: ahcich1: is 04000000 cs 00080000 ss 00000000 rs 00080000 tfd 451 serr 00400000 cmd 0000f317
Aug 21 00:33:02: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 48 f0 b2 e7 40 02 00 00 00 00 00
Aug 21 00:33:02: (ada1:ahcich1:0:0:0): CAM status: Auto-Sense Retrieval Failed
Aug 21 00:33:02: (ada1:ahcich1:0:0:0): Error 5, Unretryable error
Aug 21 00:33:02 crystal ZFS[1322]: vdev I/O failure, zpool=zp path=/dev/ada1p3 offset=7774244864 size=36864 error=5
Aug 21 00:33:05: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 78 2e f4 40 02 00 00 00 00 00
Aug 21 00:33:05: (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 21 00:33:05: (ada1:ahcich1:0:0:0): Retrying command, 3 more tries remain
Aug 21 00:33:05: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 58 2e f4 40 02 00 00 00 00 00
Aug 21 00:33:05: (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 21 00:33:05: (ada1:ahcich1:0:0:0): Retrying command, 3 more tries remain
[end of write error log entries]
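The "ahcichN: is ... cs ... tfd ... serr ..." lines in both logs are raw register dumps. Below is a rough sketch of unpacking them, assuming the fields map to the AHCI port registers as I understand the driver prints them (is = PxIS, cs = PxCI, tfd = PxTFD, serr = PxSERR) and using the bit meanings from the AHCI specification. Only the error-related bits are listed, and decode_ahci_dump is a hypothetical helper, not part of any tool.

```python
import re

# Error-related bits of the port interrupt status register (PxIS).
IS_BITS = {
    26: "interface non-fatal error",
    27: "interface fatal error",
    28: "host bus data error",
    29: "host bus fatal error",
    30: "task file error",
}

# Diagnostic bits of the SATA error register (PxSERR), bits 16..26.
SERR_BITS = {
    16: "PhyRdy change",
    17: "phy internal error",
    18: "comm wake",
    19: "10b to 8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unknown FIS type",
    26: "exchanged",
}


def _set_bits(value: int, names: dict) -> list:
    """Return the names of all listed bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if value >> bit & 1]


def decode_ahci_dump(line: str) -> dict:
    """Unpack one 'is ... cs ... tfd ... serr ...' line from the kernel log."""
    pairs = re.findall(r"\b(is|cs|ss|rs|tfd|serr|cmd) ([0-9a-fA-F]+)\b", line)
    regs = {name: int(value, 16) for name, value in pairs}
    for needed in ("is", "cs", "tfd", "serr"):
        if needed not in regs:
            raise ValueError("missing field: " + needed)
    return {
        "is_flags": _set_bits(regs["is"], IS_BITS),
        # One bit per command slot that was still outstanding.
        "busy_slots": [i for i in range(32) if regs["cs"] >> i & 1],
        "ata_status": regs["tfd"] & 0xFF,        # low byte of the task file
        "ata_error": (regs["tfd"] >> 8) & 0xFF,  # next byte up
        "serr_flags": _set_bits(regs["serr"], SERR_BITS),
    }


if __name__ == "__main__":
    line = ("ahcich1: is 04000000 cs 00080000 ss 00000000 rs 00080000 "
            "tfd 451 serr 00400000 cmd 0000f317")
    print(decode_ahci_dump(line))
```

Read this way, the cs values in the two dumps correspond to slots 9 and 19, matching the "Timeout on slot" lines, and serr 00400000 is the handshake error bit, which points at the link rather than at the flash.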
----------
Here's smartctl -x output, keeping only what looked "interesting"/relevant:
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 3.0 Gb/s)
199 CRC_Error_Count -OSRCK 099 099 000 - 64
235 POR_Recovery_Count -O--C- 099 099 000 - 7
241 Total_LBAs_Written -O--CK 099 099 000 - 213396105
0x06 0x018 4 64 --- Number of Interface CRC Errors
SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            2  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2       65535+  R_ERR response for non-data FIS
0x0006  2       65535+  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            5  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            5  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2       65535+  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2       65535+  R_ERR response for host-to-device non-data FIS, non-CRC
SCT Error Recovery Control:
Read: Disabled
Write: Disabled
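To pick out the Phy event counters that matter in output like the above, a small sketch follows. It assumes the row layout shown here (ID, size, value, description) and that a trailing "+" on the value means the counter has hit its ceiling, which is how I read the smartctl output. The names parse_phy_counters and saturated are invented for this example.

```python
import re

# One counter row: hex ID, size in bytes, value (optionally with '+'), text.
ROW = re.compile(r"^\s*(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)(\+?)\s+(.+?)\s*$")


def parse_phy_counters(text: str) -> list:
    """Parse rows of the 'SATA Phy Event Counters' table from smartctl -x."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip the header line and anything unrelated
        ident, size, value, plus, desc = m.groups()
        rows.append({
            "id": ident,
            "size": int(size),
            "value": int(value),
            "saturated": plus == "+",  # assumption: '+' marks an overflowed counter
            "description": desc,
        })
    return rows


def saturated(rows: list) -> list:
    """Return only the counters that have reached their maximum."""
    return [r for r in rows if r["saturated"]]


if __name__ == "__main__":
    sample = """\
ID Size Value Description
0x0001 2 2 Command failed due to ICRC error
0x0005 2 65535+ R_ERR response for non-data FIS
0x000d 2 65535+ Non-CRC errors within host-to-device FIS
"""
    for row in saturated(parse_phy_counters(sample)):
        print(row["id"], row["description"])
```

Run over the full table above, this would single out IDs 0x0005, 0x0006, 0x000d and 0x0013, all of which concern non-data FISes or non-CRC errors rather than data CRC failures.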
----------
[END] [Thanks for reading.]